diff --git a/README.md b/README.md index b536c6a..bfb4953 100644 --- a/README.md +++ b/README.md @@ -4,10 +4,12 @@ Local-first codebase intelligence for CLI workflows. Sia Code indexes your repo and lets you: -- search code fast (lexical, semantic, or hybrid) -- trace architecture with multi-hop research +- search code fast via ChunkHound CLI (lexical or semantic) +- trace architecture with ChunkHound research - store/retrieve project decisions and timeline context +Search and research are delegated to the ChunkHound CLI. Sia keeps index orchestration and memory storage local. + ## Why teams use it - Works directly on local code (`.sia-code/` index per repo/worktree) @@ -31,6 +33,9 @@ sia-code --version ## Quick Start (2 minutes) ```bash +# install ChunkHound CLI once +uv tool install chunkhound + # in your project sia-code init sia-code index . @@ -53,20 +58,20 @@ sia-code status | `sia-code index .` | Build index | | `sia-code index --update` | Incremental re-index | | `sia-code index --clean` | Rebuild index from scratch | -| `sia-code search "query"` | Hybrid search (default) | -| `sia-code search --regex "pattern"` | Lexical search | -| `sia-code research "question"` | Multi-hop relationship discovery | +| `sia-code search "query"` | ChunkHound-backed search (default mode from config) | +| `sia-code search --regex "pattern"` | ChunkHound lexical search | +| `sia-code research "question"` | ChunkHound research | | `sia-code memory sync-git` | Import timeline/changelog from git | | `sia-code memory search "topic"` | Search stored project memory | | `sia-code config show` | Print active configuration | ## Search Modes (important) -- Default command is hybrid: `sia-code search "query"` +- Default search mode comes from `chunkhound.default_search_mode` (default: `regex`) - Lexical mode: `sia-code search --regex "pattern"` -- Semantic-only mode: `sia-code search --semantic-only "query"` +- Semantic-only mode: `sia-code search --semantic-only "query"` (requires 
ChunkHound semantic setup) -Use `--no-deps` when you want only your project code. +Dependency visibility flags (`--no-deps`, `--deps-only`) are currently compatibility no-ops with ChunkHound-backed search. ## Git Sync Memory + Semantic Changelog @@ -74,8 +79,10 @@ Use `--no-deps` when you want only your project code. - Scans tags into changelog entries - Scans merge commits into timeline events +- For merge commits whose subject matches `Merge branch '...'`, also creates changelog entries - Stores `files_changed` and diff stats (`insertions`, `deletions`, `files`) - Optionally enhances sparse summaries using a local summarization model +- `memory sync-git --limit 0` processes all eligible events How semantic summary generation works: @@ -111,8 +118,8 @@ Useful commands: ```bash sia-code config show -sia-code config get search.vector_weight -sia-code config set search.vector_weight 0.0 +sia-code config get chunkhound.default_search_mode +sia-code config set chunkhound.default_search_mode semantic ``` Note: backend selection is auto by default (`sqlite-vec` for new indexes, legacy `usearch` supported). diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md index f803e89..17a3bf8 100644 --- a/docs/ARCHITECTURE.md +++ b/docs/ARCHITECTURE.md @@ -11,9 +11,9 @@ Sia Code has two core pipelines: - write lexical/vector indexes 2. 
**Query pipeline** - - preprocess query - - run lexical and/or semantic search - - rank + return chunk matches + - resolve mode and build ChunkHound CLI command + - execute ChunkHound search/research + - parse and render results in Sia CLI formats ## Storage Model @@ -34,17 +34,18 @@ Backend selection: - `cli.py`: command entry and orchestration - `indexer/coordinator.py`: full/incremental indexing lifecycle - `parser/*`: language detection, concept extraction, chunk building -- `storage/*`: search execution and persistence +- `search/chunkhound_cli.py`: ChunkHound command bridge and output parsing +- `storage/*`: memory persistence plus legacy/local search paths - `memory/*`: git-to-memory sync and timeline/changelog tooling - `embed_server/*`: optional shared embed daemon ## Search Architecture -- **Hybrid (default):** lexical + semantic -- **Lexical (`--regex`):** exact token/symbol heavy queries -- **Semantic (`--semantic-only`):** concept similarity only +- **Default (`search`)**: mode from `chunkhound.default_search_mode` (default `regex`) +- **Lexical (`--regex`)**: exact token/symbol heavy queries +- **Semantic (`--semantic-only`)**: ChunkHound semantic mode -Flags like `--no-deps` and `--deps-only` control dependency-code visibility. +Flags like `--no-deps` and `--deps-only` are accepted for compatibility but currently no-op with ChunkHound-backed search. ## Design Goals diff --git a/docs/BENCHMARK_METHODOLOGY.md b/docs/BENCHMARK_METHODOLOGY.md index ef41ea4..a78eb5a 100644 --- a/docs/BENCHMARK_METHODOLOGY.md +++ b/docs/BENCHMARK_METHODOLOGY.md @@ -2,10 +2,12 @@ This project uses RepoEval-style retrieval evaluation for search quality checks. +> Note: The benchmark harness in `tests/benchmarks/` targets legacy in-process retrievers. ChunkHound-backed CLI search/research should be benchmarked separately as end-to-end CLI runs. 
+ ## Scope - Evaluate retrieval quality (not answer generation) -- Compare lexical, hybrid, and semantic settings +- Compare lexical, hybrid, and semantic settings in the legacy retriever stack - Use consistent query set and top-k metrics ## Minimal Reproduction Flow @@ -23,7 +25,7 @@ pkgx python tests/benchmarks/run_full_repoeval_benchmark.py - Recall@k (especially Recall@5) - indexing time and query latency -- configuration used (`vector_weight`, embedding settings) +- configuration used (`chunkhound.default_search_mode` for CLI runs, `vector_weight` for legacy runs, embedding settings) ## Fairness Rules diff --git a/docs/BENCHMARK_RESULTS.md b/docs/BENCHMARK_RESULTS.md index 62c3f74..182283d 100644 --- a/docs/BENCHMARK_RESULTS.md +++ b/docs/BENCHMARK_RESULTS.md @@ -5,19 +5,21 @@ - RepoEval Recall@5: **89.9%** (reported) - Improvement over cAST baseline: **+12.9 points** (reported) +> Note: These numbers are historical baselines from legacy in-process retrievers. Current CLI `search` and `research` are ChunkHound-backed. + ## Practical Takeaways - Lexical-heavy search performs strongly for code identifiers. -- Hybrid can still be useful for natural-language style queries. +- For legacy retriever experiments, hybrid can still help natural-language style queries. - For daily debugging, `--regex` is often the fastest path. ## Recommended Starting Config ```bash -sia-code config set search.vector_weight 0.0 +sia-code config set chunkhound.default_search_mode regex ``` -Then adjust only if your query style is mostly conceptual. +For legacy benchmark experiments, `search.vector_weight` remains available in the in-process retriever stack. 
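The Recall@k numbers above can be read against a minimal sketch of the metric. This is one common definition (fraction of ground-truth chunks appearing in the top-k retrieved results), not the project's actual benchmark harness; the `recall_at_k` helper and the chunk IDs are hypothetical.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int = 5) -> float:
    # Fraction of ground-truth chunks that appear in the top-k retrieved list.
    hits = sum(1 for chunk_id in retrieved[:k] if chunk_id in relevant)
    return hits / len(relevant) if relevant else 0.0

# Hypothetical example: 1 of 2 relevant chunks retrieved in the top 5.
print(recall_at_k(["a", "b", "c", "d", "e"], {"b", "z"}))
```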
## Where to find raw benchmark tooling diff --git a/docs/CLI_FEATURES.md b/docs/CLI_FEATURES.md index 8ed1921..d2e6dab 100644 --- a/docs/CLI_FEATURES.md +++ b/docs/CLI_FEATURES.md @@ -18,8 +18,8 @@ sia-code status | --- | --- | --- | | `init` | Create `.sia-code/` index workspace | `--path`, `--dry-run` | | `index [PATH]` | Build index | `--update`, `--clean`, `--parallel`, `--workers`, `--watch`, `--debounce`, `--no-git-sync` | -| `search QUERY` | Search code (default hybrid) | `--regex`, `--semantic-only`, `-k/--limit`, `--no-filter`, `--no-deps`, `--deps-only`, `--format`, `--output` | -| `research QUESTION` | Multi-hop architecture exploration | `--hops`, `--graph`, `-k/--limit`, `--no-filter` | +| `search QUERY` | Search code (ChunkHound-backed) | `--regex`, `--semantic-only`, `-k/--limit`, `--no-filter` (compat), `--no-deps` (compat), `--deps-only` (compat), `--format`, `--output` | +| `research QUESTION` | Architecture exploration (ChunkHound-backed) | `--hops` (compat), `--graph` (compat), `-k/--limit` (compat), `--no-filter` (compat) | | `status` | Index health and statistics | none | | `compact [PATH]` | Remove stale chunks | `--threshold`, `--force` | | `interactive` | Live query loop | `--regex`, `-k/--limit` | @@ -28,17 +28,17 @@ sia-code status | Command | Purpose | Key options | | --- | --- | --- | -| `memory sync-git` | Import timeline/changelog from git (with diff stats and optional local semantic summaries) | `--since`, `--limit`, `--dry-run`, `--tags-only`, `--merges-only`, `--min-importance` | +| `memory sync-git` | Import timeline/changelog from git (with diff stats and optional local semantic summaries) | `--since`, `--limit` (`0` means all), `--dry-run`, `--tags-only`, `--merges-only`, `--min-importance` | | `memory add-decision TITLE` | Add pending decision | `-d/--description` (required), `-r/--reasoning`, `-a/--alternatives` | -| `memory list` | List memory items | `--type`, `--status`, `--limit`, `--format` | +| `memory list` | List 
memory items | `--type`, `--status`, `--limit` (`0` means all), `--format` | | `memory approve ID` | Approve decision | `-c/--category` (required) | | `memory reject ID` | Reject decision | none | | `memory search QUERY` | Search memory | `--type`, `-k/--limit` | -| `memory timeline` | View timeline events | `--since`, `--event-type`, `--importance`, `--format` | -| `memory changelog [RANGE]` | Generate changelog | `--format`, `--output` | +| `memory timeline` | View timeline events | `--since`, `--event-type`, `--importance`, `--limit` (`0` means all), `--format` | +| `memory changelog [RANGE]` | Generate changelog | `--limit` (`0` means all), `--format`, `--output` | | `memory export` / `memory import` | Backup/restore memory | `-o/--output`, `-i/--input` | -`memory sync-git` is the entrypoint for semantic changelog generation: it extracts git context, then (if enabled) uses the local summarizer to enrich release and merge summaries stored in memory. +`memory sync-git` is the entrypoint for semantic changelog generation: it extracts git context, then (if enabled) uses the local summarizer to enrich tag releases and merge-derived changelog entries stored in memory. ## Embed Daemon @@ -48,15 +48,15 @@ sia-code status | `embed status` | Show daemon status | | `embed stop` | Stop daemon | -Use daemon when you rely heavily on hybrid/semantic search or memory embedding operations. +Use daemon when you rely heavily on memory embedding operations. 
## Config Commands ```bash sia-code config show sia-code config path -sia-code config get search.vector_weight -sia-code config set search.vector_weight 0.0 +sia-code config get chunkhound.default_search_mode +sia-code config set chunkhound.default_search_mode semantic ``` ## Output Formats @@ -71,8 +71,8 @@ sia-code config set search.vector_weight 0.0 - First index: `sia-code index .` - Ongoing work: `sia-code index --update` - Exact symbols: `sia-code search --regex "pattern"` -- Project-only focus: `--no-deps` -- Architecture questions: `sia-code research "..." --hops 3` +- If output is noisy: tighten regex terms or add path-like query terms +- Architecture questions: `sia-code research "..."` ## Related Docs diff --git a/docs/CODE_STRUCTURE.md b/docs/CODE_STRUCTURE.md index 699b9db..f9bc2b1 100644 --- a/docs/CODE_STRUCTURE.md +++ b/docs/CODE_STRUCTURE.md @@ -21,8 +21,8 @@ sia_code/ core/ # shared models and enums parser/ # AST concept extraction and chunking indexer/ # indexing orchestration, hash cache, metrics - search/ # query pre-processing and multi-hop logic - storage/ # sqlite-vec + legacy usearch backends + search/ # ChunkHound CLI bridge + query helpers + storage/ # memory persistence + legacy local search backends memory/ # git sync, timeline, changelog, decision flow embed_server/ # optional embedding daemon ``` @@ -35,7 +35,8 @@ sia_code/ | Change default behavior | `sia_code/config.py`, `sia_code/cli.py` | | Tune indexing | `sia_code/indexer/coordinator.py`, `sia_code/indexer/chunk_index.py` | | Tune chunking | `sia_code/parser/chunker.py`, `sia_code/parser/concepts.py` | -| Search ranking/filtering | `sia_code/storage/sqlite_vec_backend.py`, `sia_code/storage/usearch_backend.py` | +| ChunkHound search/research bridge | `sia_code/search/chunkhound_cli.py`, `sia_code/cli.py` | +| Legacy/local search ranking (interactive) | `sia_code/storage/sqlite_vec_backend.py`, `sia_code/storage/usearch_backend.py` | | Backend selection logic | 
`sia_code/storage/factory.py` | | Memory commands and sync | `sia_code/memory/git_sync.py`, `sia_code/memory/git_events.py`, `sia_code/cli.py` | diff --git a/docs/LLM_CLI_INTEGRATION.md b/docs/LLM_CLI_INTEGRATION.md index 1d999d8..ec51a7b 100644 --- a/docs/LLM_CLI_INTEGRATION.md +++ b/docs/LLM_CLI_INTEGRATION.md @@ -30,6 +30,7 @@ Load skill sia-code ## 3) Recommended agent workflow ```bash +uv tool install chunkhound uvx sia-code status uvx sia-code init uvx sia-code index . @@ -37,14 +38,21 @@ uvx sia-code search --regex "your symbol" uvx sia-code research "how does X work?" ``` +Notes: + +- `search` and `research` are ChunkHound-backed. +- Memory commands stay in Sia's local memory database. + ## 4) Optional memory workflow ```bash -uvx sia-code memory sync-git +uvx sia-code memory sync-git --limit 0 uvx sia-code memory search "topic" uvx sia-code memory add-decision "Decision title" -d "Context" -r "Reason" ``` +`memory sync-git` also derives changelog entries from merge commits whose subject matches `Merge branch '...'`. 
+ ## 5) Multiple worktrees / multiple Claude Code instances Use one of these index strategies per session: diff --git a/docs/MEMORY_FEATURES.md b/docs/MEMORY_FEATURES.md index f7aa19b..c81b807 100644 --- a/docs/MEMORY_FEATURES.md +++ b/docs/MEMORY_FEATURES.md @@ -31,6 +31,7 @@ sia-code memory search "Adopt X" --type decision - Tags become changelog memory entries - Merge commits become timeline memory events +- Merge commits whose subject matches `Merge branch '...'` also become changelog entries - Each event captures changed files and diff stats - Duplicate events are skipped automatically @@ -69,6 +70,11 @@ Notes: | `memory changelog` | render changelog text/json/markdown | | `memory export` / `memory import` | backup/restore memory data | +Limit behavior: + +- `memory sync-git --limit 0` processes all eligible events +- `memory list --limit 0`, `memory timeline --limit 0`, and `memory changelog --limit 0` return all rows + ## Good Practices - Add decisions with explicit `description` and `reasoning`. diff --git a/docs/PERFORMANCE_ANALYSIS.md b/docs/PERFORMANCE_ANALYSIS.md index ef32e5c..d6ac358 100644 --- a/docs/PERFORMANCE_ANALYSIS.md +++ b/docs/PERFORMANCE_ANALYSIS.md @@ -3,18 +3,18 @@ ## Typical Expectations - `search --regex`: usually lowest-latency mode -- hybrid `search`: additional semantic overhead +- `search --semantic-only`: usually higher latency than regex - `index --update`: much faster than full rebuild for small changes -Actual speed depends on repo size, hardware, and embedding configuration. +Actual speed depends on repo size, hardware, and ChunkHound semantic/provider setup. ## Quick Optimization Checklist 1. Use `sia-code index --update` for daily work 2. Use `--regex` for symbol/identifier lookup -3. Add `--no-deps` to reduce large dependency noise +3. Use tighter regex terms (or include path-like hints) to reduce noise 4. Use `--parallel` for large initial indexing runs -5. Start embed daemon when doing repeated semantic/hybrid queries +5. 
Start embed daemon when doing repeated memory embedding operations ## Useful Commands @@ -28,8 +28,8 @@ sia-code search --regex "pattern" ## Bottleneck Hints - Slow index build: reduce indexed scope or enable parallel workers -- Slow semantic/hybrid queries: ensure embed daemon is healthy -- Noisy result set: use dependency filters (`--no-deps` / `--deps-only`) +- Slow semantic queries: verify ChunkHound provider setup and model/network health +- Noisy result set: narrow regex terms and include path-like query hints ## Related Docs diff --git a/docs/QUERYING.md b/docs/QUERYING.md index 6c58264..f0f5e47 100644 --- a/docs/QUERYING.md +++ b/docs/QUERYING.md @@ -3,49 +3,50 @@ ## Search Commands ```bash -# default hybrid +# default mode from config (ChunkHound-backed; default is regex) sia-code search "authentication flow" # lexical / symbol-heavy sia-code search --regex "AuthService|token" -# semantic only +# semantic only (requires embedding setup) sia-code search --semantic-only "handle login failures" ``` ## Useful Flags - `-k, --limit `: number of results -- `--no-deps`: only project code -- `--deps-only`: only dependency code -- `--no-filter`: include stale chunks +- `--no-deps`: accepted for compatibility (currently no-op) +- `--deps-only`: accepted for compatibility (currently no-op) +- `--no-filter`: accepted for compatibility (currently no-op) - `--format text|json|table|csv` - `--output `: write results to file ## Multi-Hop Research ```bash -sia-code research "how does auth middleware work?" --hops 3 --graph +sia-code research "how does auth middleware work?" ``` Use this for architecture tracing, call-path discovery, and unfamiliar code. +Compatibility flags for `research` (`--hops`, `--graph`, `--limit`, `--no-filter`) are accepted by Sia and ignored by ChunkHound. 
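For scripts and agents, `--format json` output can be consumed along these lines. The `{"query", "mode", "results"}` shape and per-result `chunk` keys mirror the CLI's JSON/CSV rendering code; the `top_locations` helper and the sample payload here are illustrative, not real output.

```python
import json

def top_locations(payload: str, k: int = 3) -> list[str]:
    # Return "file:start-end" strings for the top-k search results.
    data = json.loads(payload)
    locations = []
    for item in data.get("results", [])[:k]:
        chunk = item.get("chunk", {})
        locations.append(
            f"{chunk.get('file_path')}:{chunk.get('start_line')}-{chunk.get('end_line')}"
        )
    return locations

# Hypothetical payload in the shape produced by `sia-code search --format json`.
sample = json.dumps({
    "query": "AuthService",
    "mode": "regex",
    "results": [
        {"chunk": {"file_path": "auth/service.py", "start_line": 10,
                   "end_line": 42, "symbol": "AuthService"}, "score": 0.91},
    ],
})
print(top_locations(sample))  # ['auth/service.py:10-42']
```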
+ ## Practical Tuning -- `search.vector_weight = 0.0` => lexical-heavy behavior -- `search.vector_weight = 1.0` => semantic-heavy behavior +- `chunkhound.default_search_mode = regex|semantic` - defaults come from `.sia-code/config.json` ```bash -sia-code config get search.vector_weight -sia-code config set search.vector_weight 0.0 +sia-code config get chunkhound.default_search_mode +sia-code config set chunkhound.default_search_mode semantic ``` ## Output Tips - Use `--format json` for scripts/agents. - Use `--format table` for quick terminal scanning. -- Use `--no-deps` in large repos to reduce noise. +- Use tighter regex terms or path-like query text when results are noisy. ## Related Docs diff --git a/sia_code/cli.py b/sia_code/cli.py index 528d9d9..0c64ed7 100644 --- a/sia_code/cli.py +++ b/sia_code/cli.py @@ -22,6 +22,16 @@ from . import __version__ from .config import Config from .indexer.coordinator import IndexingCoordinator +from .search.chunkhound_cli import ( + build_index_command, + build_research_command, + build_search_command, + chunkhound_db_path, + parse_search_output, + resolve_search_mode, + research_needs_llm_fallback, + run_chunkhound_command, +) console = Console() @@ -519,6 +529,29 @@ def update_progress(stage: str, current: int, total: int, desc: str): # Close backend to persist vectors to disk backend.close() + # Keep ChunkHound index in sync for search/research commands + chunkhound_command = build_index_command( + config=config, + project_path=directory, + db_path=chunkhound_db_path(sia_dir, config), + force_reindex=clean, + ) + chunkhound_result = run_chunkhound_command( + chunkhound_command, + cwd=Path("."), + capture_output=True, + ) + if chunkhound_result.returncode != 0: + console.print("[red]ChunkHound indexing failed[/red]") + if chunkhound_result.stdout: + print(chunkhound_result.stdout, end="") + if chunkhound_result.stderr: + print(chunkhound_result.stderr, end="", file=sys.stderr) + sys.exit(chunkhound_result.returncode) + 
console.print( + f"[dim]ChunkHound index synced at {chunkhound_db_path(sia_dir, config)}[/dim]" + ) + # Auto-sync git history (unless disabled or in watch mode) if not no_git_sync and not watch: try: @@ -624,6 +657,25 @@ def reindex(self): f"[green]✓[/green] Re-indexed {stats['files_indexed']} files, {stats['chunks_indexed']} chunks" ) + # Sync ChunkHound index for watch-mode updates + chunkhound_command = build_index_command( + config=config, + project_path=Path(path), + db_path=chunkhound_db_path(sia_dir, config), + force_reindex=False, + ) + chunkhound_result = run_chunkhound_command( + chunkhound_command, + cwd=Path("."), + capture_output=True, + ) + if chunkhound_result.returncode == 0: + console.print("[green]✓[/green] ChunkHound index synced") + else: + console.print("[yellow]ChunkHound sync failed during watch update[/yellow]") + if chunkhound_result.stderr: + console.print(f"[dim]{chunkhound_result.stderr.strip()}[/dim]") + except Exception as e: console.print(f"[red]Error during re-indexing: {e}[/red]") finally: @@ -677,181 +729,87 @@ def search( output_format: str, output: str | None, ): - """Search the codebase (default: hybrid BM25 + semantic).""" - from .indexer.chunk_index import ChunkIndex + """Search the codebase via ChunkHound CLI.""" + import csv + import io + import json sia_dir, config = require_initialized() - # Load chunk index for filtering (if available and not disabled) - valid_chunks = None - if not no_filter: - chunk_index_path = sia_dir / "chunk_index.json" - if chunk_index_path.exists(): - try: - chunk_index = ChunkIndex(chunk_index_path) - valid_chunks = chunk_index.get_valid_chunks() - except Exception: - pass # Silently fall back to no filtering - - # Handle mutually exclusive dependency flags - if no_deps and deps_only: - console.print("[red]Error: --no-deps and --deps-only are mutually exclusive[/red]") - sys.exit(1) - - backend = create_backend(sia_dir, config, valid_chunks=valid_chunks) - backend.open_index() - - # Determine 
dependency filtering - # Default: include deps (from config or True) - # --no-deps: exclude deps - # --deps-only: show only deps (include_deps=True, then filter results) - include_deps = not no_deps # Exclude deps if --no-deps is set - tier_boost = config.search.tier_boost if hasattr(config.search, "tier_boost") else None - - # Determine search mode (NEW: hybrid by default) - if regex: - mode = "lexical" - elif semantic_only: - mode = "semantic" - else: - mode = "hybrid" # NEW DEFAULT: BM25 + semantic - - filter_status = "" if no_filter or not valid_chunks else " [filtered]" - deps_status = " [no-deps]" if no_deps else " [deps-only]" if deps_only else "" - - # Suppress progress messages for structured output formats - if output_format not in ("json", "csv"): - console.print(f"[dim]Searching ({mode}{filter_status}{deps_status})...[/dim]") - - # Execute search based on mode - if regex: - results = backend.search_lexical( - query, k=limit, include_deps=include_deps, tier_boost=tier_boost - ) - elif semantic_only: - results = backend.search_semantic( - query, k=limit, include_deps=include_deps, tier_boost=tier_boost - ) - else: - # NEW: Hybrid search (BM25 + semantic) for best performance - results = backend.search_hybrid( - query, - k=limit, - vector_weight=config.search.vector_weight, - include_deps=include_deps, - tier_boost=tier_boost, - ) + if no_deps: + console.print("[yellow]Note:[/yellow] --no-deps is ignored by ChunkHound-backed search") + if deps_only: + console.print("[yellow]Note:[/yellow] --deps-only is ignored by ChunkHound-backed search") + if no_filter: + console.print("[dim]Note: --no-filter has no effect with ChunkHound-backed search[/dim]") + + mode = resolve_search_mode(config, regex=regex, semantic_only=semantic_only) + db_path = chunkhound_db_path(sia_dir, config) + + command = build_search_command( + config=config, + query=query, + project_path=Path("."), + db_path=db_path, + mode=mode, + limit=limit, + ) - # Filter for --deps-only after search - 
if deps_only and results: - results = [r for r in results if r.chunk.metadata.get("tier") == "dependency"] + result = run_chunkhound_command(command, cwd=Path("."), capture_output=True) + + # Graceful semantic->regex fallback for embedding-misconfigured repos + if result.returncode != 0 and mode == "semantic": + combined = f"{result.stdout}\n{result.stderr}".lower() + if "no embedding providers available" in combined: + console.print("[yellow]Semantic search unavailable; retrying with regex mode.[/yellow]") + mode = "regex" + command = build_search_command( + config=config, + query=query, + project_path=Path("."), + db_path=db_path, + mode=mode, + limit=limit, + ) + result = run_chunkhound_command(command, cwd=Path("."), capture_output=True) - if not results: - # Handle empty results based on output format - if output_format == "json": - import json + if result.returncode != 0: + if result.stdout: + print(result.stdout, end="") + if result.stderr: + print(result.stderr, end="", file=sys.stderr) + sys.exit(result.returncode) - empty_output = {"query": query, "mode": mode, "results": []} - print(json.dumps(empty_output, indent=2)) - elif output_format == "csv": - # CSV header only for empty results - print("File,Start Line,End Line,Symbol,Score,Preview") - else: - console.print("[yellow]No results found[/yellow]") - return + parsed = parse_search_output(result.stdout, query=query, mode=mode) - # Format results based on output_format if output_format == "json": - import json - - output_data = {"query": query, "mode": mode, "results": [r.to_dict() for r in results]} - formatted_output = json.dumps(output_data, indent=2) + rendered = json.dumps(parsed, indent=2) elif output_format == "csv": - import csv - import io - - csv_buffer = io.StringIO() - csv_writer = csv.writer(csv_buffer) - # Write header - csv_writer.writerow(["File", "Start Line", "End Line", "Symbol", "Score", "Preview"]) - # Write rows - for result in results: - chunk = result.chunk - preview = 
(result.snippet or chunk.code)[:100].replace("\n", " ").replace("\r", "") - csv_writer.writerow( + buffer = io.StringIO() + writer = csv.writer(buffer) + writer.writerow(["File", "Start Line", "End Line", "Symbol", "Score", "Preview"]) + for item in parsed["results"]: + chunk = item["chunk"] + snippet = (item.get("snippet") or chunk.get("code") or "").replace("\n", " ") + writer.writerow( [ - chunk.file_path, - chunk.start_line, - chunk.end_line, - chunk.symbol, - f"{result.score:.3f}", - preview, + chunk.get("file_path", ""), + chunk.get("start_line", ""), + chunk.get("end_line", ""), + chunk.get("symbol", ""), + f"{item.get('score', 0.0):.3f}", + snippet[:120], ] ) - formatted_output = csv_buffer.getvalue() - elif output_format == "table": - table = Table(title=f"Search Results: {query}") - table.add_column("File", style="cyan") - table.add_column("Line", style="dim") - table.add_column("Symbol", style="bold") - table.add_column("Score", justify="right") - table.add_column("Preview", style="dim") - - for result in results: - chunk = result.chunk - preview = (result.snippet or chunk.code)[:80].replace("\n", " ") - table.add_row( - str(chunk.file_path), - f"{chunk.start_line}-{chunk.end_line}", - chunk.symbol, - f"{result.score:.3f}", - preview + "..." if len(preview) == 80 else preview, - ) - formatted_output = table - else: # text format (default) - formatted_output = None - for i, result in enumerate(results, 1): - chunk = result.chunk - console.print(f"\n[bold cyan]{i}. 
{chunk.symbol}[/bold cyan]") - console.print(f"[dim]{chunk.file_path}:{chunk.start_line}-{chunk.end_line}[/dim]") - console.print(f"Score: {result.score:.3f}") - if result.snippet: - console.print(f"\n{result.snippet}\n") - - # Save to file or print to console + rendered = buffer.getvalue() + else: + rendered = result.stdout + if output: - try: - output_path = Path(output) - if output_format == "json" or output_format == "csv": - assert isinstance(formatted_output, str) - output_path.write_text(formatted_output) - elif output_format == "table": - from rich.console import Console as FileConsole - - with open(output_path, "w") as f: - file_console = FileConsole(file=f, width=120) - file_console.print(formatted_output) - else: # text format - # Re-format as plain text for file output - lines = [] - for i, result in enumerate(results, 1): - chunk = result.chunk - lines.append(f"{i}. {chunk.symbol}") - lines.append(f" {chunk.file_path}:{chunk.start_line}-{chunk.end_line}") - lines.append(f" Score: {result.score:.3f}") - if result.snippet: - lines.append(f"\n{result.snippet}\n") - output_path.write_text("\n".join(lines)) - console.print(f"[green]✓[/green] Results saved to {output}") - except Exception as e: - console.print(f"[red]Error saving to file: {e}[/red]") - sys.exit(1) - elif formatted_output is not None: - if output_format == "json" or output_format == "csv": - # Use print() for JSON/CSV to avoid rich console formatting - print(formatted_output) - else: # table - console.print(formatted_output) + Path(output).write_text(rendered) + console.print(f"[green]✓[/green] Results saved to {output}") + else: + print(rendered, end="" if rendered.endswith("\n") else "\n") @main.command() @@ -999,89 +957,51 @@ def interactive(regex: bool, limit: int): @click.option("-k", "--limit", type=int, default=5, help="Results per hop") @click.option("--no-filter", is_flag=True, help="Disable stale chunk filtering") def research(question: str, hops: int, graph: bool, limit: int, 
     no_filter: bool):
-    """Multi-hop code research for architectural questions.
-
-    Automatically discovers code relationships and builds a complete picture.
-
-    Examples:
-        sia-code research "How does authentication work?"
-        sia-code research "What calls the indexer?" --graph
-        sia-code research "How is configuration loaded?" --hops 3
-    """
-    from .indexer.chunk_index import ChunkIndex
-    from .search.multi_hop import MultiHopSearchStrategy
-
+    """Run architecture research via ChunkHound CLI."""
     sia_dir, config = require_initialized()

-    # Load chunk index for filtering (if available and not disabled)
-    valid_chunks = None
-    if not no_filter:
-        chunk_index_path = sia_dir / "chunk_index.json"
-        if chunk_index_path.exists():
-            try:
-                chunk_index = ChunkIndex(chunk_index_path)
-                valid_chunks = chunk_index.get_valid_chunks()
-            except Exception:
-                pass  # Silently fall back to no filtering
-
-    backend = create_backend(sia_dir, config, valid_chunks=valid_chunks)
-    backend.open_index()
-
-    strategy = MultiHopSearchStrategy(backend, max_hops=hops)
-
-    console.print(f"[dim]Researching: {question}[/dim]")
-    console.print(f"[dim]Max hops: {hops}, Results per hop: {limit}[/dim]\n")
-
-    with Progress(
-        SpinnerColumn(), TextColumn("[progress.description]{task.description}"), console=console
-    ) as progress:
-        task = progress.add_task("Analyzing code relationships...", total=None)
-        result = strategy.research(question, max_results_per_hop=limit)
-        progress.update(task, completed=True)
-
-    # Display results summary
-    console.print("\n[bold green]✓ Research Complete[/bold green]")
-    console.print(f"  Found: {len(result.chunks)} related code chunks")
-    console.print(f"  Relationships: {len(result.relationships)}")
-    console.print(f"  Entities discovered: {result.total_entities_found}")
-    console.print(f"  Hops executed: {result.hops_executed}/{hops}\n")
-
-    if not result.chunks:
-        console.print("[yellow]No relevant code found. Try rephrasing your question.[/yellow]")
-        return
-
-    # Display top chunks
-    console.print("[bold]Top Related Code:[/bold]\n")
-    for i, chunk in enumerate(result.chunks[:10], 1):
-        console.print(f"{i}. [cyan]{chunk.symbol}[/cyan]")
-        console.print(f"   {chunk.file_path}:{chunk.start_line}-{chunk.end_line}")
-        if i <= 3:  # Show code preview for top 3
-            preview = chunk.code[:200].replace("\n", "\n   ")
-            console.print(f"   [dim]{preview}...[/dim]")
-        console.print()
-
-    # Show call graph if requested
-    if graph and result.relationships:
-        call_graph = strategy.build_call_graph(result.relationships)
-        entry_points = strategy.get_entry_points(result.relationships)
-
-        console.print("\n[bold]Call Graph:[/bold]\n")
+    if hops != 2:
+        console.print("[dim]Note: --hops is accepted for compatibility but ignored.[/dim]")
+    if graph:
+        console.print("[dim]Note: --graph is accepted for compatibility but ignored.[/dim]")
+    if no_filter:
+        console.print("[dim]Note: --no-filter has no effect with ChunkHound-backed research.[/dim]")
+    if limit != 5:
+        console.print("[dim]Note: --limit is accepted for compatibility but ignored.[/dim]")
+
+    db_path = chunkhound_db_path(sia_dir, config)
+    command = build_research_command(
+        config=config,
+        question=question,
+        project_path=Path("."),
+        db_path=db_path,
+    )
+    result = run_chunkhound_command(command, cwd=Path("."), capture_output=True)

-        if entry_points:
-            console.print("[dim]Entry points:[/dim]")
-            for entry in entry_points[:5]:
-                console.print(f"  [green]→ {entry}[/green]")
-            console.print()
+    if result.returncode != 0:
+        combined = f"{result.stdout}\n{result.stderr}"
+        if config.chunkhound.research_fallback_to_regex and research_needs_llm_fallback(combined):
+            console.print(
+                "[yellow]ChunkHound research requires LLM config; falling back to regex search.[/yellow]"
+            )
+            fallback_command = build_search_command(
+                config=config,
+                query=question,
+                project_path=Path("."),
+                db_path=db_path,
+                mode="regex",
+                limit=limit,
+            )
+            result = run_chunkhound_command(fallback_command, cwd=Path("."), capture_output=True)

-        console.print("[dim]Relationships:[/dim]")
-        for entity, targets in list(call_graph.items())[:15]:
-            console.print(f"  {entity}")
-            for target in targets[:3]:
-                rel_type = target["type"].replace("_", " ")
-                console.print(f"    [dim]{rel_type}[/dim] → {target['target']}")
+    if result.returncode != 0:
+        if result.stdout:
+            print(result.stdout, end="")
+        if result.stderr:
+            print(result.stderr, end="", file=sys.stderr)
+        sys.exit(result.returncode)

-        if len(call_graph) > 15:
-            console.print(f"\n    [dim]... and {len(call_graph) - 15} more entities[/dim]")
+    print(result.stdout, end="" if result.stdout.endswith("\n") else "\n")


 @main.command()
@@ -1431,7 +1351,7 @@ def memory():
 @memory.command(name="sync-git")
 @click.option("--since", default="HEAD~100", help="Git ref to start from (e.g., v1.0.0, HEAD~50)")
-@click.option("--limit", type=int, default=50, help="Maximum events to process")
+@click.option("--limit", type=int, default=0, help="Maximum events to process (0 means all)")
 @click.option("--dry-run", is_flag=True, help="Preview without importing")
 @click.option("--tags-only", is_flag=True, help="Only scan tags, skip merge commits")
 @click.option("--merges-only", is_flag=True, help="Only scan merge commits, skip tags")
@@ -1464,9 +1384,10 @@ def memory_sync_git(since, limit, dry_run, tags_only, merges_only, min_importanc
     console.print(f"[cyan]Syncing git history from {since}...[/cyan]\n")

     sync_service = GitSyncService(backend, Path("."))
+    effective_limit = None if limit <= 0 else limit
     stats = sync_service.sync(
         since=since,
-        limit=limit,
+        limit=effective_limit,
         dry_run=dry_run,
         tags_only=tags_only,
         merges_only=merges_only,
@@ -1562,7 +1483,7 @@ def memory_add_decision(title, description, reasoning, alternatives):
     default="all",
     help="Filter decisions by status",
 )
-@click.option("--limit", type=int, default=20, help="Maximum items to show")
+@click.option("--limit", type=int, default=20,
help="Maximum items to show (0 means all)")
 @click.option(
     "--format",
     "output_format",
@@ -1578,24 +1499,26 @@ def memory_list(item_type, status, limit, output_format):
     try:
         results = {"decisions": [], "timeline": [], "changelogs": []}
+        effective_limit = None if limit <= 0 else limit

         # Fetch decisions
         if item_type in ("decision", "all"):
             if status == "pending":
-                results["decisions"] = backend.list_pending_decisions(limit=limit)
+                results["decisions"] = backend.list_pending_decisions(limit=effective_limit)
             else:
                 # Get all decisions (pending + approved)
-                results["decisions"] = backend.list_pending_decisions(limit=limit * 2)
+                expanded_limit = None if effective_limit is None else effective_limit * 2
+                results["decisions"] = backend.list_pending_decisions(limit=expanded_limit)
                 if status != "all":
                     results["decisions"] = [d for d in results["decisions"] if d.status == status]

         # Fetch timeline events
         if item_type in ("timeline", "all"):
-            results["timeline"] = backend.get_timeline_events(limit=limit)
+            results["timeline"] = backend.get_timeline_events(limit=effective_limit)

         # Fetch changelogs
         if item_type in ("changelog", "all"):
-            results["changelogs"] = backend.get_changelogs(limit=limit)
+            results["changelogs"] = backend.get_changelogs(limit=effective_limit)

         # Output
         if output_format == "json":
@@ -1783,7 +1706,8 @@ def memory_search(query, search_type, limit):
     default="text",
     help="Output format",
 )
-def memory_timeline(since, event_type, importance, output_format):
+@click.option("--limit", type=int, default=0, help="Maximum events to show (0 means all)")
+def memory_timeline(since, event_type, importance, output_format, limit):
     """Show project timeline events.

     Example: sia-code memory timeline --format markdown --importance high
@@ -1793,7 +1717,7 @@ def memory_timeline(since, event_type, importance, output_format):
     backend.open_index()

     try:
-        events = backend.get_timeline_events(limit=100)
+        events = backend.get_timeline_events(limit=None if limit <= 0 else limit)

         # Apply filters
         if event_type:
@@ -1864,8 +1788,9 @@ def memory_timeline(since, event_type, importance, output_format):
     default="markdown",
     help="Output format",
 )
+@click.option("--limit", type=int, default=0, help="Maximum changelog entries (0 means all)")
 @click.option("-o", "--output", type=click.Path(), help="Save to file")
-def memory_changelog(range, output_format, output):
+def memory_changelog(range, output_format, limit, output):
     """Generate changelog from memory.

     Example: sia-code memory changelog v1.0.0..v2.0.0 --format markdown -o CHANGELOG.md
@@ -1875,7 +1800,7 @@ def memory_changelog(range, output_format, output):
     backend.open_index()

     try:
-        changelogs = backend.get_changelogs(limit=100)
+        changelogs = backend.get_changelogs(limit=None if limit <= 0 else limit)

         # Filter by range if provided
         if range:
diff --git a/sia_code/config.py b/sia_code/config.py
index 67ff4ca..5aafae4 100644
--- a/sia_code/config.py
+++ b/sia_code/config.py
@@ -149,6 +149,18 @@ class SearchConfig(BaseModel):
     include_dependencies: bool = True  # Default: deps always included in search


+class ChunkHoundConfig(BaseModel):
+    """ChunkHound CLI integration settings."""
+
+    command: str = "uvx chunkhound"
+    db_filename: str = "chunkhound.db"
+    default_search_mode: Literal["regex", "semantic"] = "regex"
+    no_embeddings_for_index: bool = True
+    no_embeddings_for_regex_search: bool = True
+    research_prompt_prefix: str = ""
+    research_fallback_to_regex: bool = True
+
+
 class DependencyConfig(BaseModel):
     """Dependency indexing configuration."""

@@ -197,6 +209,7 @@ class Config(BaseModel):
     indexing: IndexingConfig = Field(default_factory=IndexingConfig)
     chunking: ChunkingConfig =
Field(default_factory=ChunkingConfig)
     search: SearchConfig = Field(default_factory=SearchConfig)
+    chunkhound: ChunkHoundConfig = Field(default_factory=ChunkHoundConfig)

     # New configuration sections
     dependencies: DependencyConfig = Field(default_factory=DependencyConfig)
     documentation: DocumentationConfig = Field(default_factory=DocumentationConfig)
diff --git a/sia_code/memory/git_events.py b/sia_code/memory/git_events.py
index b828a38..da82ad8 100644
--- a/sia_code/memory/git_events.py
+++ b/sia_code/memory/git_events.py
@@ -9,6 +9,13 @@
 from git.exc import GitCommandError, InvalidGitRepositoryError


+def _coerce_text(value: str | bytes | Any) -> str:
+    """Normalize git message-like values into text."""
+    if isinstance(value, bytes):
+        return value.decode("utf-8", errors="replace")
+    return str(value)
+
+
 class GitEventExtractor:
     """Extract timeline events and changelogs from git repository."""

@@ -74,12 +81,14 @@ def scan_git_tags(self) -> list[dict[str, Any]]:

         return changelogs

-    def scan_merge_events(self, since: str | None = None, limit: int = 50) -> list[dict[str, Any]]:
+    def scan_merge_events(
+        self, since: str | None = None, limit: int | None = 50
+    ) -> list[dict[str, Any]]:
         """Extract merge commits as timeline events.
         Args:
             since: Git ref to start from (e.g., 'HEAD~100' or 'v1.0.0')
-            limit: Maximum number of merge events to return
+            limit: Maximum number of merge events to return (None for all)

         Returns:
             List of timeline event dictionaries
@@ -93,21 +102,23 @@ def scan_merge_events(self, since: str | None = None, limit: int = 50) -> list[d
             commit_range = "HEAD"

         try:
-            commits = list(self.repo.iter_commits(commit_range, max_count=limit * 2))
+            max_count = limit * 2 if limit is not None and limit > 0 else None
+            commits = list(self.repo.iter_commits(commit_range, max_count=max_count))
         except GitCommandError:
             # If range is invalid, just get HEAD commits
-            commits = list(self.repo.iter_commits("HEAD", max_count=limit * 2))
+            max_count = limit * 2 if limit is not None and limit > 0 else None
+            commits = list(self.repo.iter_commits("HEAD", max_count=max_count))

         for commit in commits:
             # Check if it's a merge commit (has multiple parents)
             if len(commit.parents) > 1:
                 # Get branch names from commit message
-                from_branch, to_branch = self._extract_merge_branches(commit.message)
+                from_branch, to_branch = self._extract_merge_branches(_coerce_text(commit.message))

                 # Get files changed
                 files_changed = []
                 try:
-                    files_changed = [item.a_path for item in commit.stats.files.keys()]
+                    files_changed = [str(path) for path in commit.stats.files.keys()]
                 except Exception:
                     pass
@@ -122,7 +133,7 @@ def scan_merge_events(self, since: str | None = None, limit: int = 50) -> list[d
                     "event_type": "merge",
                     "from_ref": from_branch or commit.parents[1].hexsha[:7],
                     "to_ref": to_branch or commit.parents[0].hexsha[:7],
-                    "summary": commit.summary,
+                    "summary": _coerce_text(commit.summary),
                     "files_changed": files_changed[:20],  # Limit to avoid huge lists
                     "diff_stats": diff_stats,
                     "importance": self._determine_importance(diff_stats),
@@ -134,7 +145,7 @@ def scan_merge_events(self, since: str | None = None, limit: int = 50) -> list[d
                 events.append(event)

-                if len(events) >= limit:
+                if limit is not None and limit > 0 and len(events) >= limit:
                     break

         return events
@@ -264,6 +275,10 @@ def _extract_merge_branches(self, message: str) -> tuple[str | None, str | None]

         return (None, None)

+    def is_merge_branch_message(self, message: str) -> bool:
+        """Return True when commit message follows 'Merge branch ...' pattern."""
+        return bool(re.search(r"^Merge\s+branch\s+'[^']+'", (message or "").strip()))
+
     def _determine_importance(self, diff_stats: dict[str, Any]) -> str:
         """Determine importance based on diff statistics.
@@ -296,7 +311,7 @@ def get_commits_between_tags(self, from_tag: str, to_tag: str) -> list[str]:
         try:
             commits = list(self.repo.iter_commits(f"{from_tag}..{to_tag}"))
             # Return first line of each commit message
-            return [c.message.strip().split("\n")[0] for c in commits]
+            return [_coerce_text(c.message).strip().split("\n")[0] for c in commits]
         except Exception as e:
             logger = logging.getLogger(__name__)
             logger.debug(f"Could not get commits between {from_tag} and {to_tag}: {e}")
@@ -321,7 +336,7 @@ def get_commits_in_merge(self, merge_commit) -> list[str]:
             commits = list(
                 self.repo.iter_commits(f"{base[0].hexsha}..{merge_commit.parents[1].hexsha}")
             )
-            return [c.message.strip().split("\n")[0] for c in commits]
+            return [_coerce_text(c.message).strip().split("\n")[0] for c in commits]
         except Exception as e:
             logger = logging.getLogger(__name__)
             logger.debug(f"Could not get commits for merge {merge_commit.hexsha[:7]}: {e}")
@@ -347,14 +362,14 @@ def scan_git_tags(repo_path: str | Path) -> list[dict[str, Any]]:

 def scan_merge_events(
-    repo_path: str | Path, since: str | None = None, limit: int = 50
+    repo_path: str | Path, since: str | None = None, limit: int | None = 50
 ) -> list[dict[str, Any]]:
     """Extract merge commits as timeline events.
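The `is_merge_branch_message` check added in this hunk keys on git's default merge subject line. A minimal standalone sketch of the same regex (module-level function name chosen here for illustration; the patch defines it as a method):

```python
import re

def is_merge_branch_message(message: str) -> bool:
    """True when a commit subject follows git's default "Merge branch '...'" form."""
    # Anchored search on the stripped subject; single-quoted branch name required
    return bool(re.search(r"^Merge\s+branch\s+'[^']+'", (message or "").strip()))

print(is_merge_branch_message("Merge branch 'feature/login' into main"))  # → True
print(is_merge_branch_message("Merge pull request #42 from org/repo"))    # → False
```

Note that GitHub-style "Merge pull request" subjects intentionally do not match, so squash/PR merges are not turned into changelog entries by this path.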
     Args:
         repo_path: Path to git repository
         since: Git ref to start from
-        limit: Maximum number of events
+        limit: Maximum number of events (None for all)

     Returns:
         List of timeline event dictionaries
diff --git a/sia_code/memory/git_sync.py b/sia_code/memory/git_sync.py
index 599abca..0f1c9e8 100644
--- a/sia_code/memory/git_sync.py
+++ b/sia_code/memory/git_sync.py
@@ -63,7 +63,7 @@ def summarizer(self):
     def sync(
         self,
         since: str | None = None,
-        limit: int = 50,
+        limit: int | None = 50,
         dry_run: bool = False,
         tags_only: bool = False,
         merges_only: bool = False,
@@ -73,7 +73,7 @@ def sync(
         Args:
             since: Git ref to start from (e.g., 'v1.0.0', 'HEAD~50')
-            limit: Maximum number of events to process
+            limit: Maximum number of events to process (None/0 means no limit)
             dry_run: If True, don't write to backend
             tags_only: Only process tags, skip merges
             merges_only: Only process merges, skip tags
@@ -83,6 +83,7 @@ def sync(
             Dictionary with sync statistics
         """
         stats = GitSyncStats()
+        effective_limit = limit if limit is not None and limit > 0 else None

         # Process tags as changelogs (unless merges_only)
         if not merges_only:
@@ -127,7 +128,7 @@ def sync(
                     stats.changelogs_added += 1

                     # Early exit if hit limit
-                    if stats.changelogs_added >= limit:
+                    if effective_limit is not None and stats.changelogs_added >= effective_limit:
                         break
             except Exception as e:
                 stats.errors.append(f"Error processing tags: {e}")
@@ -135,7 +136,7 @@ def sync(
         # Process merge commits as timeline events (unless tags_only)
         if not tags_only:
             try:
-                merge_events = self.extractor.scan_merge_events(since=since, limit=limit)
+                merge_events = self.extractor.scan_merge_events(since=since, limit=effective_limit)
                 for event_data in merge_events:
                     # Filter by importance
                     event_importance = event_data.get("importance", "medium")
@@ -183,8 +184,56 @@ def sync(
                     )
                     stats.timeline_added += 1

+                    # Build changelog entries from merge commits with explicit
+                    # "Merge branch ..." subject lines.
+                    if self.extractor.is_merge_branch_message(event_data.get("summary", "")):
+                        if (
+                            effective_limit is not None
+                            and stats.changelogs_added >= effective_limit
+                        ):
+                            continue
+
+                        changelog_tag = self._merge_changelog_tag(event_data)
+                        if self._is_duplicate_changelog(changelog_tag):
+                            stats.changelogs_skipped += 1
+                            continue
+
+                        merged_commits: list[str] = []
+                        merge_commit = event_data.get("merge_commit")
+                        if merge_commit is not None:
+                            merged_commits = self.extractor.get_commits_in_merge(merge_commit)
+
+                        changelog_summary = event_data.get("summary", "")
+                        if self.summarizer and merged_commits:
+                            try:
+                                changelog_summary = self.summarizer.enhance_changelog(
+                                    changelog_tag,
+                                    changelog_summary,
+                                    merged_commits,
+                                )
+                            except Exception as e:
+                                logger.debug(f"Could not enhance merge changelog: {e}")
+
+                        commit_text = "\n".join(merged_commits)
+                        breaking_changes = self.extractor._extract_breaking_changes(commit_text)
+                        features = self.extractor._extract_features(commit_text)
+                        fixes = self.extractor._extract_fixes(commit_text)
+
+                        if not dry_run:
+                            self.backend.add_changelog(
+                                tag=changelog_tag,
+                                version=None,
+                                summary=changelog_summary,
+                                breaking_changes=breaking_changes,
+                                features=features,
+                                fixes=fixes,
+                                commit_hash=event_data.get("commit_hash"),
+                                commit_time=event_data.get("commit_time"),
+                            )
+                            stats.changelogs_added += 1
+
                     # Early exit if hit limit
-                    if stats.timeline_added >= limit:
+                    if effective_limit is not None and stats.timeline_added >= effective_limit:
                         break
             except Exception as e:
                 stats.errors.append(f"Error processing merges: {e}")
@@ -201,7 +250,7 @@ def _is_duplicate_changelog(self, tag: str) -> bool:
             True if changelog with this tag exists
         """
         try:
-            existing = self.backend.get_changelogs(limit=1000)
+            existing = self.backend.get_changelogs(limit=None)
             return any(c.tag == tag for c in existing)
         except Exception:
             # If check fails, assume not duplicate to avoid data loss
@@ -219,7 +268,7 @@ def _is_duplicate_event(self, event_type: str, from_ref: str, to_ref: str) -> bo
             True if event with these attributes exists
         """
         try:
-            existing = self.backend.get_timeline_events(limit=1000)
+            existing = self.backend.get_timeline_events(limit=None)
             return any(
                 e.event_type == event_type and e.from_ref == from_ref and e.to_ref == to_ref
                 for e in existing
@@ -242,3 +291,12 @@ def _meets_importance_threshold(self, event_importance: str, min_importance: str
         event_level = importance_order.get(event_importance, 0)
         min_level = importance_order.get(min_importance, 0)
         return event_level >= min_level
+
+    def _merge_changelog_tag(self, event_data: dict[str, Any]) -> str:
+        """Build stable synthetic changelog key for merge-derived entries."""
+        commit_hash = event_data.get("commit_hash")
+        if commit_hash:
+            return f"merge:{commit_hash}"
+        return (
+            f"merge:{event_data.get('from_ref', 'unknown')}->{event_data.get('to_ref', 'unknown')}"
+        )
diff --git a/sia_code/search/chunkhound_cli.py b/sia_code/search/chunkhound_cli.py
new file mode 100644
index 0000000..c288bfe
--- /dev/null
+++ b/sia_code/search/chunkhound_cli.py
@@ -0,0 +1,206 @@
+"""ChunkHound CLI bridge for Sia search/research commands."""
+
+from __future__ import annotations
+
+import re
+import shlex
+import subprocess
+from pathlib import Path
+from typing import Any, Literal
+
+from ..config import Config
+
+
+SearchMode = Literal["regex", "semantic"]
+
+
+def chunkhound_db_path(sia_dir: Path, config: Config) -> Path:
+    """Resolve ChunkHound database path from Sia config."""
+    return sia_dir / config.chunkhound.db_filename
+
+
+def split_chunkhound_command(command: str) -> list[str]:
+    """Split configured command string into executable argv."""
+    stripped = command.strip() if command else ""
+    if not stripped:
+        stripped = "uvx chunkhound"
+    return shlex.split(stripped)
+
+
+def resolve_search_mode(config: Config, regex: bool, semantic_only: bool) -> SearchMode:
+    """Resolve target search mode from CLI flags and config defaults."""
+    if regex:
+        return "regex"
+    if semantic_only:
+        return "semantic"
+    return config.chunkhound.default_search_mode
+
+
+def build_index_command(
+    config: Config,
+    project_path: Path,
+    db_path: Path,
+    force_reindex: bool = False,
+) -> list[str]:
+    """Build chunkhound indexing command."""
+    cmd = split_chunkhound_command(config.chunkhound.command)
+    cmd.extend(["index", str(project_path), "--db", str(db_path)])
+    if config.chunkhound.no_embeddings_for_index:
+        cmd.append("--no-embeddings")
+    if force_reindex:
+        cmd.append("--force-reindex")
+    return cmd
+
+
+def build_search_command(
+    config: Config,
+    query: str,
+    project_path: Path,
+    db_path: Path,
+    mode: SearchMode,
+    limit: int,
+) -> list[str]:
+    """Build chunkhound search command."""
+    cmd = split_chunkhound_command(config.chunkhound.command)
+    cmd.extend(
+        [
+            "search",
+            query,
+            str(project_path),
+            "--db",
+            str(db_path),
+            "--page-size",
+            str(limit),
+        ]
+    )
+
+    if mode == "regex":
+        cmd.append("--regex")
+        if config.chunkhound.no_embeddings_for_regex_search:
+            cmd.append("--no-embeddings")
+    elif mode != "semantic":
+        raise ValueError(f"Unsupported search mode: {mode}")
+
+    return cmd
+
+
+def build_research_command(
+    config: Config,
+    question: str,
+    project_path: Path,
+    db_path: Path,
+) -> list[str]:
+    """Build chunkhound research command."""
+    cmd = split_chunkhound_command(config.chunkhound.command)
+    cmd.extend(["research", build_research_query(config, question), str(project_path)])
+    cmd.extend(["--db", str(db_path)])
+    return cmd
+
+
+def build_research_query(config: Config, question: str) -> str:
+    """Apply optional prompt prefix before invoking chunkhound research."""
+    prefix = config.chunkhound.research_prompt_prefix.strip()
+    if not prefix:
+        return question
+    return f"{prefix}\n\n{question}"
+
+
+def run_chunkhound_command(
+    command: list[str],
+    cwd: Path,
+    capture_output: bool = False,
+) -> subprocess.CompletedProcess[str]:
+    """Run chunkhound command."""
+    return subprocess.run(
+        command,
+        cwd=cwd,
+        text=True,
+        capture_output=capture_output,
+    )
+
+
+def parse_search_output(output: str, query: str, mode: str) -> dict[str, Any]:
+    """Parse chunkhound text search output into Sia-compatible JSON structure."""
+    results: list[dict[str, Any]] = []
+    current: dict[str, Any] | None = None
+    in_code_block = False
+    code_lines: list[str] = []
+
+    def flush_current() -> None:
+        nonlocal current
+        if not current:
+            return
+
+        file_path = current.get("file_path") or "unknown"
+        start_line = int(current.get("start_line") or 1)
+        end_line = int(current.get("end_line") or start_line)
+        snippet = (current.get("snippet") or "").strip()
+        rank = int(current.get("rank") or (len(results) + 1))
+
+        results.append(
+            {
+                "chunk": {
+                    "symbol": Path(file_path).stem,
+                    "start_line": start_line,
+                    "end_line": end_line,
+                    "code": snippet,
+                    "chunk_type": "unknown",
+                    "language": "unknown",
+                    "file_path": file_path,
+                    "file_id": None,
+                    "id": None,
+                    "parent_header": None,
+                    "metadata": {"source": "chunkhound-cli"},
+                },
+                "score": max(0.0, 1.0 - (rank - 1) * 0.01),
+                "snippet": snippet or None,
+                "highlights": [],
+            }
+        )
+        current = None
+
+    for raw_line in output.splitlines():
+        line = raw_line.rstrip("\n")
+        stripped = line.strip()
+
+        if stripped.startswith("```"):
+            if in_code_block:
+                if current is not None:
+                    current["snippet"] = "\n".join(code_lines).strip()
+                code_lines = []
+                in_code_block = False
+            else:
+                in_code_block = True
+                code_lines = []
+            continue
+
+        if in_code_block:
+            code_lines.append(line)
+            continue
+
+        match = re.match(r"^\[(\d+)\]\s+(.+)$", stripped)
+        if match:
+            flush_current()
+            current = {
+                "rank": int(match.group(1)),
+                "file_path": match.group(2).strip(),
+                "start_line": None,
+                "end_line": None,
+                "snippet": "",
+            }
+            continue
+
+        if current is not None:
+            line_match = re.search(r"Lines\s+(\d+)(?:-(\d+))?", stripped)
+            if line_match:
+                current["start_line"] = int(line_match.group(1))
+                current["end_line"] = int(line_match.group(2) or line_match.group(1))
+
+    flush_current()
+    return {"query": query, "mode": mode, "results": results}
+
+
+def research_needs_llm_fallback(output_text: str) -> bool:
+    """Detect known chunkhound LLM setup errors for graceful fallback."""
+    lowered = output_text.lower()
+    return "configure an llm provider" in lowered or "llm provider setup failed" in lowered
diff --git a/sia_code/storage/base.py b/sia_code/storage/base.py
index b35f050..12e2eed 100644
--- a/sia_code/storage/base.py
+++ b/sia_code/storage/base.py
@@ -195,11 +195,11 @@ def reject_decision(self, decision_id: int) -> None:
         ...

     @abstractmethod
-    def list_pending_decisions(self, limit: int = 20) -> list[Decision]:
+    def list_pending_decisions(self, limit: int | None = 20) -> list[Decision]:
         """List oldest pending decisions for review.

         Args:
-            limit: Maximum number of decisions to return
+            limit: Maximum number of decisions to return (None for all)

         Returns:
             List of pending decisions, oldest first
@@ -284,14 +284,14 @@ def add_changelog(

     @abstractmethod
     def get_timeline_events(
-        self, from_ref: str | None = None, to_ref: str | None = None, limit: int = 20
+        self, from_ref: str | None = None, to_ref: str | None = None, limit: int | None = 20
     ) -> list[TimelineEvent]:
         """Get timeline events.

         Args:
             from_ref: Filter by starting ref
             to_ref: Filter by ending ref
-            limit: Maximum number of events to return
+            limit: Maximum number of events to return (None for all)

         Returns:
             List of timeline events
@@ -299,11 +299,11 @@ def get_timeline_events(
         ...

     @abstractmethod
-    def get_changelogs(self, limit: int = 20) -> list[ChangelogEntry]:
+    def get_changelogs(self, limit: int | None = 20) -> list[ChangelogEntry]:
         """Get changelog entries.
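The configured launcher string (`chunkhound.command`, default `uvx chunkhound`) is turned into an argv list with `shlex.split`, so a user can point it at any equivalent invocation. A standalone sketch of that splitting behavior (function name here is illustrative; the patch calls it `split_chunkhound_command`):

```python
import shlex

def split_command(command: str) -> list[str]:
    """Split a configured command string into argv, defaulting to "uvx chunkhound"."""
    stripped = command.strip() if command else ""
    # Empty/blank config falls back to the default launcher
    return shlex.split(stripped or "uvx chunkhound")

print(split_command("uvx chunkhound"))        # → ['uvx', 'chunkhound']
print(split_command(""))                      # → ['uvx', 'chunkhound']
print(split_command("python -m chunkhound"))  # → ['python', '-m', 'chunkhound']
```

Using `shlex.split` (rather than `str.split`) means quoted segments in the configured command survive as single argv entries.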
         Args:
-            limit: Maximum number of entries to return
+            limit: Maximum number of entries to return (None for all)

         Returns:
             List of changelog entries, newest first
diff --git a/sia_code/storage/sqlite_vec_backend.py b/sia_code/storage/sqlite_vec_backend.py
index fec85c9..0457282 100644
--- a/sia_code/storage/sqlite_vec_backend.py
+++ b/sia_code/storage/sqlite_vec_backend.py
@@ -1549,11 +1549,11 @@ def reject_decision(self, decision_id: int) -> None:
         )
         self.conn.commit()

-    def list_pending_decisions(self, limit: int = 20) -> list[Decision]:
+    def list_pending_decisions(self, limit: int | None = 20) -> list[Decision]:
         """List oldest pending decisions for review.

         Args:
-            limit: Maximum number to return
+            limit: Maximum number to return (None for all)

         Returns:
             List of pending decisions, oldest first
@@ -1562,17 +1562,18 @@ def list_pending_decisions(self, limit: int = 20) -> list[Decision]:
             raise RuntimeError("Index not initialized")

         cursor = self.conn.cursor()
-        cursor.execute(
-            """
-            SELECT id, session_id, title, description, reasoning, alternatives,
+        query = """
+            SELECT id, session_id, title, description, reasoning, alternatives,
                    status, category, commit_hash, commit_time, created_at, approved_at
             FROM decisions
             WHERE status = 'pending'
             ORDER BY created_at ASC
-            LIMIT ?
-            """,
-            (limit,),
-        )
+        """
+        params: list[Any] = []
+        if limit is not None and limit > 0:
+            query += " LIMIT ?"
+            params.append(limit)
+        cursor.execute(query, params)

         decisions = []
         for row in cursor.fetchall():
@@ -1766,14 +1767,14 @@ def add_changelog(
         return changelog_id

     def get_timeline_events(
-        self, from_ref: str | None = None, to_ref: str | None = None, limit: int = 20
+        self, from_ref: str | None = None, to_ref: str | None = None, limit: int | None = 20
     ) -> list[TimelineEvent]:
         """Get timeline events.

         Args:
             from_ref: Filter by starting ref
             to_ref: Filter by ending ref
-            limit: Maximum number to return
+            limit: Maximum number to return (None for all)

         Returns:
             List of timeline events
@@ -1795,19 +1796,18 @@ def get_timeline_events(
             params.append(to_ref)

         where_clause = f"WHERE {' AND '.join(conditions)}" if conditions else ""
-        params.append(limit)
-
-        cursor.execute(
-            f"""
+        query = f"""
             SELECT id, event_type, from_ref, to_ref, summary, files_changed,
                    diff_stats, importance, commit_hash, commit_time, created_at
             FROM timeline
             {where_clause}
             ORDER BY created_at DESC
-            LIMIT ?
-            """,
-            params,
-        )
+        """
+        if limit is not None and limit > 0:
+            query += " LIMIT ?"
+            params.append(limit)
+
+        cursor.execute(query, params)

         events = []
         for row in cursor.fetchall():
@@ -1833,11 +1833,11 @@ def get_timeline_events(
         return events

-    def get_changelogs(self, limit: int = 20) -> list[ChangelogEntry]:
+    def get_changelogs(self, limit: int | None = 20) -> list[ChangelogEntry]:
         """Get changelog entries.

         Args:
-            limit: Maximum number to return
+            limit: Maximum number to return (None for all)

         Returns:
             List of changelog entries, newest first
@@ -1846,16 +1846,17 @@ def get_changelogs(self, limit: int = 20) -> list[ChangelogEntry]:
             raise RuntimeError("Index not initialized")

         cursor = self.conn.cursor()
-        cursor.execute(
-            """
+        query = """
             SELECT id, tag, version, date, summary, breaking_changes,
                    features, fixes, commit_hash, commit_time, created_at
             FROM changelogs
             ORDER BY date DESC
-            LIMIT ?
-            """,
-            (limit,),
-        )
+        """
+        params: list[Any] = []
+        if limit is not None and limit > 0:
+            query += " LIMIT ?"
+            params.append(limit)
+        cursor.execute(query, params)

         changelogs = []
         for row in cursor.fetchall():
@@ -2087,12 +2088,12 @@ def export_memory(
         # Timeline events
         if include_timeline:
-            timeline = self.get_timeline_events(limit=100)
+            timeline = self.get_timeline_events(limit=None)
             memory["timeline"] = [t.to_dict() for t in timeline]

         # Changelogs
         if include_changelogs:
-            changelogs = self.get_changelogs(limit=100)
+            changelogs = self.get_changelogs(limit=None)
             memory["changelogs"] = [c.to_dict() for c in changelogs]

         # Approved decisions
@@ -2124,7 +2125,7 @@ def export_memory(
         # Pending decisions (optional)
         if include_pending:
-            pending = self.list_pending_decisions(limit=100)
+            pending = self.list_pending_decisions(limit=None)
             memory["pending_decisions"] = [
                 {
                     "id": f"decision:{d.id}",
diff --git a/sia_code/storage/usearch_backend.py b/sia_code/storage/usearch_backend.py
index 8114582..22d3757 100644
--- a/sia_code/storage/usearch_backend.py
+++ b/sia_code/storage/usearch_backend.py
@@ -1472,11 +1472,11 @@ def reject_decision(self, decision_id: int) -> None:
         )
         self.conn.commit()

-    def list_pending_decisions(self, limit: int = 20) -> list[Decision]:
+    def list_pending_decisions(self, limit: int | None = 20) -> list[Decision]:
         """List oldest pending decisions for review.

         Args:
-            limit: Maximum number to return
+            limit: Maximum number to return (None for all)

         Returns:
             List of pending decisions, oldest first
@@ -1485,17 +1485,18 @@ def list_pending_decisions(self, limit: int = 20) -> list[Decision]:
             raise RuntimeError("Index not initialized")

         cursor = self.conn.cursor()
-        cursor.execute(
-            """
+        query = """
             SELECT id, session_id, title, description, reasoning, alternatives,
                    status, category, commit_hash, commit_time, created_at, approved_at
             FROM decisions
             WHERE status = 'pending'
             ORDER BY created_at ASC
-            LIMIT ?
-            """,
-            (limit,),
-        )
+        """
+        params: list[Any] = []
+        if limit is not None and limit > 0:
+            query += " LIMIT ?"
+            params.append(limit)
+        cursor.execute(query, params)

         decisions = []
         for row in cursor.fetchall():
@@ -1705,14 +1706,14 @@ def add_changelog(
         return changelog_id

     def get_timeline_events(
-        self, from_ref: str | None = None, to_ref: str | None = None, limit: int = 20
+        self, from_ref: str | None = None, to_ref: str | None = None, limit: int | None = 20
     ) -> list[TimelineEvent]:
         """Get timeline events.

         Args:
             from_ref: Filter by starting ref
             to_ref: Filter by ending ref
-            limit: Maximum number to return
+            limit: Maximum number to return (None for all)

         Returns:
             List of timeline events
@@ -1734,19 +1735,18 @@ def get_timeline_events(
             params.append(to_ref)

         where_clause = f"WHERE {' AND '.join(conditions)}" if conditions else ""
-        params.append(limit)
-
-        cursor.execute(
-            f"""
+        query = f"""
             SELECT id, event_type, from_ref, to_ref, summary, files_changed,
                    diff_stats, importance, commit_hash, commit_time, created_at
             FROM timeline
             {where_clause}
             ORDER BY created_at DESC
-            LIMIT ?
-            """,
-            params,
-        )
+        """
+        if limit is not None and limit > 0:
+            query += " LIMIT ?"
+            params.append(limit)
+
+        cursor.execute(query, params)

         events = []
         for row in cursor.fetchall():
@@ -1772,11 +1772,11 @@ def get_timeline_events(
         return events

-    def get_changelogs(self, limit: int = 20) -> list[ChangelogEntry]:
+    def get_changelogs(self, limit: int | None = 20) -> list[ChangelogEntry]:
         """Get changelog entries.

         Args:
-            limit: Maximum number to return
+            limit: Maximum number to return (None for all)

         Returns:
             List of changelog entries, newest first
@@ -1785,16 +1785,17 @@ def get_changelogs(self, limit: int = 20) -> list[ChangelogEntry]:
             raise RuntimeError("Index not initialized")

         cursor = self.conn.cursor()
-        cursor.execute(
-            """
+        query = """
             SELECT id, tag, version, date, summary, breaking_changes,
                    features, fixes, commit_hash, commit_time, created_at
             FROM changelogs
             ORDER BY date DESC
-            LIMIT ?
-            """,
-            (limit,),
-        )
+        """
+        params: list[Any] = []
+        if limit is not None and limit > 0:
+            query += " LIMIT ?"
+            params.append(limit)
+        cursor.execute(query, params)

         changelogs = []
         for row in cursor.fetchall():
@@ -2026,12 +2027,12 @@ def export_memory(
         # Timeline events
         if include_timeline:
-            timeline = self.get_timeline_events(limit=100)
+            timeline = self.get_timeline_events(limit=None)
             memory["timeline"] = [t.to_dict() for t in timeline]

         # Changelogs
         if include_changelogs:
-            changelogs = self.get_changelogs(limit=100)
+            changelogs = self.get_changelogs(limit=None)
             memory["changelogs"] = [c.to_dict() for c in changelogs]

         # Approved decisions
@@ -2063,7 +2064,7 @@ def export_memory(
         # Pending decisions (optional)
         if include_pending:
-            pending = self.list_pending_decisions(limit=100)
+            pending = self.list_pending_decisions(limit=None)
             memory["pending_decisions"] = [
                 {
                     "id": f"decision:{d.id}",
diff --git a/skills/sia-code/SKILL.md b/skills/sia-code/SKILL.md
index bbf4c7c..129fa59 100644
--- a/skills/sia-code/SKILL.md
+++ b/skills/sia-code/SKILL.md
@@ -1,9 +1,9 @@
 ---
 name: sia-code
-description: Compact local-first code search skill for CLI agents using BM25, optional semantic search, multi-hop research, and project memory.
+description: Compact local-first code search skill for CLI agents using ChunkHound-backed search/research and Sia project memory.
 license: MIT
 compatibility: opencode
-version: 0.7.0
+version: 0.7.1
 ---

 # Sia-Code Skill (Compact)
@@ -19,42 +19,58 @@ This is a compact, repo-local variant intended for easy copy/paste into LLM CLI
 uvx sia-code init
 uvx sia-code index .

-# fast lexical search (great for identifiers)
+# fast lexical search (ChunkHound-backed)
 uvx sia-code search --regex "auth|login|token"

-# architecture exploration
+# architecture exploration (ChunkHound-backed)
 uvx sia-code research "how does authentication flow work?"
# health check uvx sia-code status ``` +## Search + Research Backend + +`sia-code search` and `sia-code research` are powered by ChunkHound CLI. +Sia's own memory/decision database remains unchanged. + +Install once: + +```bash +uv tool install chunkhound +``` + ## Search Modes -- `uvx sia-code search "query"`: default hybrid search (BM25 + semantic) -- `uvx sia-code search --regex "pattern"`: lexical search only (usually best for exact symbols) -- `uvx sia-code search --semantic-only "query"`: semantic-only search +- `uvx sia-code search "query"`: default mode from config (`chunkhound.default_search_mode`) +- `uvx sia-code search --regex "pattern"`: lexical search (recommended for exact symbols) +- `uvx sia-code search --semantic-only "query"`: semantic search (requires embedding setup) -Useful flags: +Supported flags: - `-k, --limit `: result count -- `--no-deps`: project code only -- `--deps-only`: dependency code only -- `--format json|table|csv`: structured output +- `--format json|table|csv`: output shaping in Sia wrapper + +Compatibility notes (currently no-op with ChunkHound): + +- `--no-deps` +- `--deps-only` +- `--no-filter` ## Multi-Hop Research ```bash -uvx sia-code research "how is config loaded?" --hops 3 --graph +uvx sia-code research "how is config loaded?" ``` - Use for dependency tracing, call flow mapping, and architecture questions. +- `--hops`, `--graph`, and `--limit` are accepted for compatibility in Sia but ignored by ChunkHound CLI. ## Memory Workflow ```bash # import timeline/changelogs from git -uvx sia-code memory sync-git +uvx sia-code memory sync-git --limit 0 # store a pending decision uvx sia-code memory add-decision "Adopt sqlite-vec by default" \ @@ -69,6 +85,11 @@ uvx sia-code memory approve 1 --category architecture uvx sia-code memory search "backend default" --type all ``` +Notes: + +- `memory sync-git` derives changelog entries from merge commits whose subject matches `Merge branch '...'`. 
+- Use `--limit 0` when you want to process all eligible git events.
+
 ## Agent-Friendly Session Pattern
 
 ```bash
@@ -90,12 +111,12 @@ uvx sia-code memory add-decision "..." -d "..." -r "..."
 
 ## Troubleshooting
 
 - If uninitialized: run `uvx sia-code init && uvx sia-code index .`
-- If results look stale: run `uvx sia-code index --update` (or `--clean` after major refactors)
+- If results look stale: run `uvx sia-code index --update` (this also syncs the ChunkHound index)
 - If memory add/search fails with embedding issues: run `uvx sia-code embed start`
-- If too much dependency noise: add `--no-deps`
+- If ChunkHound is missing: run `uv tool install chunkhound`
 
 ## Notes
 
 - Lexical search is often strong for code due to exact identifiers.
-- Hybrid/semantic search may require embedding setup depending on configuration.
+- Semantic research/search requires ChunkHound embedding/LLM provider setup.
 - Keep this file short and operational; move deep theory to project docs.
diff --git a/tests/e2e/test_cpp_e2e.py b/tests/e2e/test_cpp_e2e.py
index 921e4bf..8bf00f1 100644
--- a/tests/e2e/test_cpp_e2e.py
+++ b/tests/e2e/test_cpp_e2e.py
@@ -33,52 +33,6 @@ def test_init_creates_index_file(self, initialized_repo):
         index_path = initialized_repo / ".sia-code" / "index.db"
         assert index_path.exists()
 
-    # ===== INDEXING TESTS =====
-
-    def test_index_full_completes_successfully(self, indexed_repo):
-        """Test that full indexing completes without errors.
-
-        Note: Uses indexed_repo fixture which already performed full indexing.
-        This test verifies the index was created successfully rather than re-indexing.
- """ - # Verify index was created - index_path = indexed_repo / ".sia-code" / "index.db" - assert index_path.exists(), "Index database not created" - assert index_path.stat().st_size > 100000, "Index appears empty or incomplete" - - # Verify index contains data by checking status - result = self.run_cli(["status"], indexed_repo) - assert result.returncode == 0, f"Status check failed: {result.stderr}" - assert "index" in result.stdout.lower() - - def test_index_reports_file_and_chunk_counts(self, indexed_repo): - """Test that status shows index information after indexing.""" - result = self.run_cli(["status"], indexed_repo) - assert result.returncode == 0 - # Check for basic index info (chunk info only shown after --update) - assert "index" in result.stdout.lower() - - def test_index_skips_excluded_patterns(self, indexed_repo): - """Test that indexing skips excluded patterns.""" - results = self.search_json(".git", indexed_repo, regex=True, limit=10) - file_paths = self.get_result_file_paths(results) - git_files = [fp for fp in file_paths if ".git/" in fp] - assert len(git_files) == 0 - - def test_index_clean_rebuilds_from_scratch(self, indexed_repo): - """Test that --clean flag rebuilds index from scratch. - - Note: This test does a full rebuild with embeddings enabled. 
- """ - result = self.run_cli(["index", "--clean", "."], indexed_repo, timeout=600) - assert result.returncode == 0 - assert "clean" in result.stdout.lower() - - def test_index_update_only_processes_changes(self, indexed_repo): - """Test that --update flag only reindexes changed files.""" - result = self.run_cli(["index", "--update", "."], indexed_repo, timeout=600) - assert result.returncode == 0 - # ===== SEARCH - LEXICAL TESTS ===== def test_search_finds_language_keyword(self, indexed_repo): @@ -135,31 +89,6 @@ def test_search_csv_output_valid(self, indexed_repo): ) assert result.returncode == 0 - # ===== RESEARCH TESTS ===== - - def test_research_finds_related_code(self, indexed_repo): - """Test that research command finds related code chunks.""" - result = self.run_cli( - ["research", "How does JSON parsing work?", "--hops", "2"], indexed_repo, timeout=600 - ) - assert result.returncode == 0 - - def test_research_respects_hop_limit(self, indexed_repo): - """Test that research respects --hops parameter.""" - result = self.run_cli( - ["research", "How does this work?", "--hops", "1"], indexed_repo, timeout=600 - ) - assert result.returncode == 0 - - def test_research_graph_shows_relationships(self, indexed_repo): - """Test that --graph flag shows code relationships.""" - result = self.run_cli( - ["research", "How does JSON parsing work?", "--hops", "2", "--graph"], - indexed_repo, - timeout=600, - ) - assert result.returncode == 0 - # ===== STATUS & MAINTENANCE ===== def test_status_shows_index_info(self, indexed_repo): diff --git a/tests/e2e/test_csharp_e2e.py b/tests/e2e/test_csharp_e2e.py index 4e3f009..a7454ee 100644 --- a/tests/e2e/test_csharp_e2e.py +++ b/tests/e2e/test_csharp_e2e.py @@ -33,52 +33,6 @@ def test_init_creates_index_file(self, initialized_repo): index_path = initialized_repo / ".sia-code" / "index.db" assert index_path.exists() - # ===== INDEXING TESTS ===== - - def test_index_full_completes_successfully(self, indexed_repo): - """Test that 
full indexing completes without errors. - - Note: Uses indexed_repo fixture which already performed full indexing. - This test verifies the index was created successfully rather than re-indexing. - """ - # Verify index was created - index_path = indexed_repo / ".sia-code" / "index.db" - assert index_path.exists(), "Index database not created" - assert index_path.stat().st_size > 100000, "Index appears empty or incomplete" - - # Verify index contains data by checking status - result = self.run_cli(["status"], indexed_repo) - assert result.returncode == 0, f"Status check failed: {result.stderr}" - assert "index" in result.stdout.lower() - - def test_index_reports_file_and_chunk_counts(self, indexed_repo): - """Test that status shows index information after indexing.""" - result = self.run_cli(["status"], indexed_repo) - assert result.returncode == 0 - # Check for basic index info (chunk info only shown after --update) - assert "index" in result.stdout.lower() - - def test_index_skips_excluded_patterns(self, indexed_repo): - """Test that indexing skips excluded patterns.""" - results = self.search_json(".git", indexed_repo, regex=True, limit=10) - file_paths = self.get_result_file_paths(results) - git_files = [fp for fp in file_paths if ".git/" in fp] - assert len(git_files) == 0 - - def test_index_clean_rebuilds_from_scratch(self, indexed_repo): - """Test that --clean flag rebuilds index from scratch. - - Note: This test does a full rebuild with embeddings enabled. 
- """ - result = self.run_cli(["index", "--clean", "."], indexed_repo, timeout=600) - assert result.returncode == 0 - assert "clean" in result.stdout.lower() - - def test_index_update_only_processes_changes(self, indexed_repo): - """Test that --update flag only reindexes changed files.""" - result = self.run_cli(["index", "--update", "."], indexed_repo, timeout=600) - assert result.returncode == 0 - # ===== SEARCH - LEXICAL TESTS ===== def test_search_finds_language_keyword(self, indexed_repo): @@ -135,31 +89,6 @@ def test_search_csv_output_valid(self, indexed_repo): ) assert result.returncode == 0 - # ===== RESEARCH TESTS ===== - - def test_research_finds_related_code(self, indexed_repo): - """Test that research command finds related code chunks.""" - result = self.run_cli( - ["research", "How does HTTP context work?", "--hops", "2"], indexed_repo, timeout=600 - ) - assert result.returncode == 0 - - def test_research_respects_hop_limit(self, indexed_repo): - """Test that research respects --hops parameter.""" - result = self.run_cli( - ["research", "How does this work?", "--hops", "1"], indexed_repo, timeout=600 - ) - assert result.returncode == 0 - - def test_research_graph_shows_relationships(self, indexed_repo): - """Test that --graph flag shows code relationships.""" - result = self.run_cli( - ["research", "How does HTTP context work?", "--hops", "2", "--graph"], - indexed_repo, - timeout=600, - ) - assert result.returncode == 0 - # ===== STATUS & MAINTENANCE ===== def test_status_shows_index_info(self, indexed_repo): diff --git a/tests/e2e/test_go_e2e.py b/tests/e2e/test_go_e2e.py index 71636d2..35cd16c 100644 --- a/tests/e2e/test_go_e2e.py +++ b/tests/e2e/test_go_e2e.py @@ -33,52 +33,6 @@ def test_init_creates_index_file(self, initialized_repo): index_path = initialized_repo / ".sia-code" / "index.db" assert index_path.exists() - # ===== INDEXING TESTS ===== - - def test_index_full_completes_successfully(self, indexed_repo): - """Test that full indexing 
completes without errors. - - Note: Uses indexed_repo fixture which already performed full indexing. - This test verifies the index was created successfully rather than re-indexing. - """ - # Verify index was created - index_path = indexed_repo / ".sia-code" / "index.db" - assert index_path.exists(), "Index database not created" - assert index_path.stat().st_size > 100000, "Index appears empty or incomplete" - - # Verify index contains data by checking status - result = self.run_cli(["status"], indexed_repo) - assert result.returncode == 0, f"Status check failed: {result.stderr}" - assert "index" in result.stdout.lower() - - def test_index_reports_file_and_chunk_counts(self, indexed_repo): - """Test that status shows index information after indexing.""" - result = self.run_cli(["status"], indexed_repo) - assert result.returncode == 0 - # Check for basic index info (chunk info only shown after --update) - assert "index" in result.stdout.lower() - - def test_index_skips_excluded_patterns(self, indexed_repo): - """Test that indexing skips excluded patterns.""" - results = self.search_json(".git", indexed_repo, regex=True, limit=10) - file_paths = self.get_result_file_paths(results) - git_files = [fp for fp in file_paths if ".git/" in fp] - assert len(git_files) == 0 - - def test_index_clean_rebuilds_from_scratch(self, indexed_repo): - """Test that --clean flag rebuilds index from scratch. - - Note: This test does a full rebuild with embeddings enabled. 
- """ - result = self.run_cli(["index", "--clean", "."], indexed_repo, timeout=600) - assert result.returncode == 0 - assert "clean" in result.stdout.lower() - - def test_index_update_only_processes_changes(self, indexed_repo): - """Test that --update flag only reindexes changed files.""" - result = self.run_cli(["index", "--update", "."], indexed_repo, timeout=600) - assert result.returncode == 0 - # ===== SEARCH - LEXICAL TESTS ===== def test_search_finds_language_keyword(self, indexed_repo): @@ -132,31 +86,6 @@ def test_search_csv_output_valid(self, indexed_repo): ) assert result.returncode == 0 - # ===== RESEARCH TESTS ===== - - def test_research_finds_related_code(self, indexed_repo): - """Test that research command finds related code chunks.""" - result = self.run_cli( - ["research", "How does the HTTP engine work?", "--hops", "2"], indexed_repo, timeout=600 - ) - assert result.returncode == 0 - - def test_research_respects_hop_limit(self, indexed_repo): - """Test that research respects --hops parameter.""" - result = self.run_cli( - ["research", "How does this work?", "--hops", "1"], indexed_repo, timeout=600 - ) - assert result.returncode == 0 - - def test_research_graph_shows_relationships(self, indexed_repo): - """Test that --graph flag shows code relationships.""" - result = self.run_cli( - ["research", "How does the HTTP engine work?", "--hops", "2", "--graph"], - indexed_repo, - timeout=600, - ) - assert result.returncode == 0 - # ===== STATUS & MAINTENANCE ===== def test_status_shows_index_info(self, indexed_repo): diff --git a/tests/e2e/test_java_e2e.py b/tests/e2e/test_java_e2e.py index a8c5d8a..99650fe 100644 --- a/tests/e2e/test_java_e2e.py +++ b/tests/e2e/test_java_e2e.py @@ -43,61 +43,6 @@ def test_init_creates_index_file(self, initialized_repo): index_path = initialized_repo / ".sia-code" / "index.db" assert index_path.exists() - # ===== INDEXING TESTS ===== - - def test_index_full_completes_successfully(self, indexed_repo): - """Test that full 
indexing completes without errors. - - Note: Uses indexed_repo fixture which already performed full indexing. - This test verifies the index was created successfully rather than re-indexing. - """ - # Verify index was created - index_path = indexed_repo / ".sia-code" / "index.db" - assert index_path.exists(), "Index database not created" - assert index_path.stat().st_size > 100000, "Index appears empty or incomplete" - - # Verify index contains data by checking status - result = self.run_cli(["status"], indexed_repo) - assert result.returncode == 0, f"Status check failed: {result.stderr}" - assert "index" in result.stdout.lower() - - def test_index_reports_file_and_chunk_counts(self, indexed_repo): - """Test that status shows index information after indexing.""" - result = self.run_cli(["status"], indexed_repo) - assert result.returncode == 0 - # Check for basic index info (chunk info only shown after --update) - assert "index" in result.stdout.lower() - - def test_index_skips_excluded_patterns(self, indexed_repo): - """Test that indexing skips excluded patterns like .git, node_modules.""" - # Check that .git directory was not indexed by searching for git-specific files - results = self.search_json("HEAD", indexed_repo, regex=True, limit=20) - - # If any results found, ensure they're not from .git directory - file_paths = self.get_result_file_paths(results) - git_files = [fp for fp in file_paths if ".git/" in fp or "\\.git\\" in fp] - assert len(git_files) == 0, f"Indexed files from .git directory: {git_files}" - - def test_index_clean_rebuilds_from_scratch(self, indexed_repo): - """Test that --clean flag rebuilds index from scratch. - - Note: This test does a full rebuild with embeddings enabled. 
- """ - result = self.run_cli(["index", "--clean", "."], indexed_repo, timeout=600) - assert result.returncode == 0 - assert "clean" in result.stdout.lower() - - def test_index_update_only_processes_changes(self, indexed_repo): - """Test that --update flag only reindexes changed files.""" - result = self.run_cli(["index", "--update", "."], indexed_repo, timeout=600) - assert result.returncode == 0 - # Should mention incremental or update - assert ( - "incremental" in result.stdout.lower() - or "update" in result.stdout.lower() - or "unchanged" in result.stdout.lower() - ) - # ===== SEARCH - LEXICAL TESTS ===== def test_search_finds_language_keyword(self, indexed_repo): @@ -164,43 +109,6 @@ def test_search_csv_output_valid(self, indexed_repo): ) assert result.returncode == 0 - # ===== RESEARCH TESTS ===== - - def test_research_finds_related_code(self, indexed_repo): - """Test that research command finds related code chunks.""" - result = self.run_cli( - ["research", "How does mocking work?", "--hops", "2", "-k", "5"], - indexed_repo, - timeout=600, - ) - assert result.returncode == 0 - # Should report findings - assert ( - "found" in result.stdout.lower() - or "chunk" in result.stdout.lower() - or "complete" in result.stdout.lower() - ) - - def test_research_respects_hop_limit(self, indexed_repo): - """Test that research respects --hops parameter.""" - result = self.run_cli( - ["research", "What is verification?", "--hops", "1"], indexed_repo, timeout=600 - ) - assert result.returncode == 0 - # Should complete with specified hop limit - assert "hop" in result.stdout.lower() or "complete" in result.stdout.lower() - - def test_research_graph_shows_relationships(self, indexed_repo): - """Test that --graph flag shows code relationships.""" - result = self.run_cli( - ["research", "How are mocks created?", "--hops", "2", "--graph"], - indexed_repo, - timeout=600, - ) - assert result.returncode == 0 - # Graph output should mention relationships or call graph - # Even if no 
relationships found, command should succeed - # ===== STATUS & MAINTENANCE ===== def test_status_shows_index_info(self, indexed_repo): diff --git a/tests/e2e/test_javascript_e2e.py b/tests/e2e/test_javascript_e2e.py index 58780f6..1b0e281 100644 --- a/tests/e2e/test_javascript_e2e.py +++ b/tests/e2e/test_javascript_e2e.py @@ -33,52 +33,6 @@ def test_init_creates_index_file(self, initialized_repo): index_path = initialized_repo / ".sia-code" / "index.db" assert index_path.exists() - # ===== INDEXING TESTS ===== - - def test_index_full_completes_successfully(self, indexed_repo): - """Test that full indexing completes without errors. - - Note: Uses indexed_repo fixture which already performed full indexing. - This test verifies the index was created successfully rather than re-indexing. - """ - # Verify index was created - index_path = indexed_repo / ".sia-code" / "index.db" - assert index_path.exists(), "Index database not created" - assert index_path.stat().st_size > 100000, "Index appears empty or incomplete" - - # Verify index contains data by checking status - result = self.run_cli(["status"], indexed_repo) - assert result.returncode == 0, f"Status check failed: {result.stderr}" - assert "index" in result.stdout.lower() - - def test_index_reports_file_and_chunk_counts(self, indexed_repo): - """Test that status shows index information after indexing.""" - result = self.run_cli(["status"], indexed_repo) - assert result.returncode == 0 - # Check for basic index info (chunk info only shown after --update) - assert "index" in result.stdout.lower() - - def test_index_skips_excluded_patterns(self, indexed_repo): - """Test that indexing skips excluded patterns.""" - results = self.search_json(".git", indexed_repo, regex=True, limit=10) - file_paths = self.get_result_file_paths(results) - git_files = [fp for fp in file_paths if ".git/" in fp] - assert len(git_files) == 0 - - def test_index_clean_rebuilds_from_scratch(self, indexed_repo): - """Test that --clean flag 
rebuilds index from scratch. - - Note: This test does a full rebuild with embeddings enabled. - """ - result = self.run_cli(["index", "--clean", "."], indexed_repo, timeout=600) - assert result.returncode == 0 - assert "clean" in result.stdout.lower() - - def test_index_update_only_processes_changes(self, indexed_repo): - """Test that --update flag only reindexes changed files.""" - result = self.run_cli(["index", "--update", "."], indexed_repo, timeout=600) - assert result.returncode == 0 - # ===== SEARCH - LEXICAL TESTS ===== def test_search_finds_language_keyword(self, indexed_repo): @@ -135,31 +89,6 @@ def test_search_csv_output_valid(self, indexed_repo): ) assert result.returncode == 0 - # ===== RESEARCH TESTS ===== - - def test_research_finds_related_code(self, indexed_repo): - """Test that research command finds related code chunks.""" - result = self.run_cli( - ["research", "How does routing work?", "--hops", "2"], indexed_repo, timeout=600 - ) - assert result.returncode == 0 - - def test_research_respects_hop_limit(self, indexed_repo): - """Test that research respects --hops parameter.""" - result = self.run_cli( - ["research", "How does this work?", "--hops", "1"], indexed_repo, timeout=600 - ) - assert result.returncode == 0 - - def test_research_graph_shows_relationships(self, indexed_repo): - """Test that --graph flag shows code relationships.""" - result = self.run_cli( - ["research", "How does routing work?", "--hops", "2", "--graph"], - indexed_repo, - timeout=600, - ) - assert result.returncode == 0 - # ===== STATUS & MAINTENANCE ===== def test_status_shows_index_info(self, indexed_repo): diff --git a/tests/e2e/test_php_e2e.py b/tests/e2e/test_php_e2e.py index fadacac..c2a2d08 100644 --- a/tests/e2e/test_php_e2e.py +++ b/tests/e2e/test_php_e2e.py @@ -33,52 +33,6 @@ def test_init_creates_index_file(self, initialized_repo): index_path = initialized_repo / ".sia-code" / "index.db" assert index_path.exists() - # ===== INDEXING TESTS ===== - - def 
test_index_full_completes_successfully(self, indexed_repo): - """Test that full indexing completes without errors. - - Note: Uses indexed_repo fixture which already performed full indexing. - This test verifies the index was created successfully rather than re-indexing. - """ - # Verify index was created - index_path = indexed_repo / ".sia-code" / "index.db" - assert index_path.exists(), "Index database not created" - assert index_path.stat().st_size > 100000, "Index appears empty or incomplete" - - # Verify index contains data by checking status - result = self.run_cli(["status"], indexed_repo) - assert result.returncode == 0, f"Status check failed: {result.stderr}" - assert "index" in result.stdout.lower() - - def test_index_reports_file_and_chunk_counts(self, indexed_repo): - """Test that status shows index information after indexing.""" - result = self.run_cli(["status"], indexed_repo) - assert result.returncode == 0 - # Check for basic index info (chunk info only shown after --update) - assert "index" in result.stdout.lower() - - def test_index_skips_excluded_patterns(self, indexed_repo): - """Test that indexing skips excluded patterns.""" - results = self.search_json(".git", indexed_repo, regex=True, limit=10) - file_paths = self.get_result_file_paths(results) - git_files = [fp for fp in file_paths if ".git/" in fp] - assert len(git_files) == 0 - - def test_index_clean_rebuilds_from_scratch(self, indexed_repo): - """Test that --clean flag rebuilds index from scratch. - - Note: This test does a full rebuild with embeddings enabled. 
- """ - result = self.run_cli(["index", "--clean", "."], indexed_repo, timeout=600) - assert result.returncode == 0 - assert "clean" in result.stdout.lower() - - def test_index_update_only_processes_changes(self, indexed_repo): - """Test that --update flag only reindexes changed files.""" - result = self.run_cli(["index", "--update", "."], indexed_repo, timeout=600) - assert result.returncode == 0 - # ===== SEARCH - LEXICAL TESTS ===== def test_search_finds_language_keyword(self, indexed_repo): @@ -135,31 +89,6 @@ def test_search_csv_output_valid(self, indexed_repo): ) assert result.returncode == 0 - # ===== RESEARCH TESTS ===== - - def test_research_finds_related_code(self, indexed_repo): - """Test that research command finds related code chunks.""" - result = self.run_cli( - ["research", "How does the framework work?", "--hops", "2"], indexed_repo, timeout=600 - ) - assert result.returncode == 0 - - def test_research_respects_hop_limit(self, indexed_repo): - """Test that research respects --hops parameter.""" - result = self.run_cli( - ["research", "How does this work?", "--hops", "1"], indexed_repo, timeout=600 - ) - assert result.returncode == 0 - - def test_research_graph_shows_relationships(self, indexed_repo): - """Test that --graph flag shows code relationships.""" - result = self.run_cli( - ["research", "How does the framework work?", "--hops", "2", "--graph"], - indexed_repo, - timeout=600, - ) - assert result.returncode == 0 - # ===== STATUS & MAINTENANCE ===== def test_status_shows_index_info(self, indexed_repo): diff --git a/tests/e2e/test_python_e2e.py b/tests/e2e/test_python_e2e.py index fc39c52..d9e9a96 100644 --- a/tests/e2e/test_python_e2e.py +++ b/tests/e2e/test_python_e2e.py @@ -33,52 +33,6 @@ def test_init_creates_index_file(self, initialized_repo): index_path = initialized_repo / ".sia-code" / "index.db" assert index_path.exists() - # ===== INDEXING TESTS ===== - - def test_index_full_completes_successfully(self, indexed_repo): - """Test that 
full indexing completes without errors. - - Note: Uses indexed_repo fixture which already performed full indexing. - This test verifies the index was created successfully rather than re-indexing. - """ - # Verify index was created - index_path = indexed_repo / ".sia-code" / "index.db" - assert index_path.exists(), "Index database not created" - assert index_path.stat().st_size > 100000, "Index appears empty or incomplete" - - # Verify index contains data by checking status - result = self.run_cli(["status"], indexed_repo) - assert result.returncode == 0, f"Status check failed: {result.stderr}" - assert "index" in result.stdout.lower() - - def test_index_reports_file_and_chunk_counts(self, indexed_repo): - """Test that status shows index information after indexing.""" - result = self.run_cli(["status"], indexed_repo) - assert result.returncode == 0 - # Check for basic index info (chunk info only shown after --update) - assert "index" in result.stdout.lower() - - def test_index_skips_excluded_patterns(self, indexed_repo): - """Test that indexing skips excluded patterns.""" - results = self.search_json(".git", indexed_repo, regex=True, limit=10) - file_paths = self.get_result_file_paths(results) - git_files = [fp for fp in file_paths if ".git/" in fp] - assert len(git_files) == 0 - - def test_index_clean_rebuilds_from_scratch(self, indexed_repo): - """Test that --clean flag rebuilds index from scratch. - - Note: This test does a full rebuild with embeddings enabled. 
- """ - result = self.run_cli(["index", "--clean", "."], indexed_repo, timeout=600) - assert result.returncode == 0 - assert "clean" in result.stdout.lower() - - def test_index_update_only_processes_changes(self, indexed_repo): - """Test that --update flag only reindexes changed files.""" - result = self.run_cli(["index", "--update", "."], indexed_repo, timeout=600) - assert result.returncode == 0 - # ===== SEARCH - LEXICAL TESTS ===== def test_search_finds_language_keyword(self, indexed_repo): @@ -137,31 +91,6 @@ def test_search_csv_output_valid(self, indexed_repo): ) assert result.returncode == 0 - # ===== RESEARCH TESTS ===== - - def test_research_finds_related_code(self, indexed_repo): - """Test that research command finds related code chunks.""" - result = self.run_cli( - ["research", "How do HTTP requests work?", "--hops", "2"], indexed_repo, timeout=600 - ) - assert result.returncode == 0 - - def test_research_respects_hop_limit(self, indexed_repo): - """Test that research respects --hops parameter.""" - result = self.run_cli( - ["research", "What is a session?", "--hops", "1"], indexed_repo, timeout=600 - ) - assert result.returncode == 0 - - def test_research_graph_shows_relationships(self, indexed_repo): - """Test that --graph flag shows code relationships.""" - result = self.run_cli( - ["research", "How are requests sent?", "--hops", "2", "--graph"], - indexed_repo, - timeout=600, - ) - assert result.returncode == 0 - # ===== STATUS & MAINTENANCE ===== def test_status_shows_index_info(self, indexed_repo): diff --git a/tests/e2e/test_ruby_e2e.py b/tests/e2e/test_ruby_e2e.py index df06f76..492780d 100644 --- a/tests/e2e/test_ruby_e2e.py +++ b/tests/e2e/test_ruby_e2e.py @@ -33,52 +33,6 @@ def test_init_creates_index_file(self, initialized_repo): index_path = initialized_repo / ".sia-code" / "index.db" assert index_path.exists() - # ===== INDEXING TESTS ===== - - def test_index_full_completes_successfully(self, indexed_repo): - """Test that full indexing 
completes without errors. - - Note: Uses indexed_repo fixture which already performed full indexing. - This test verifies the index was created successfully rather than re-indexing. - """ - # Verify index was created - index_path = indexed_repo / ".sia-code" / "index.db" - assert index_path.exists(), "Index database not created" - assert index_path.stat().st_size > 100000, "Index appears empty or incomplete" - - # Verify index contains data by checking status - result = self.run_cli(["status"], indexed_repo) - assert result.returncode == 0, f"Status check failed: {result.stderr}" - assert "index" in result.stdout.lower() - - def test_index_reports_file_and_chunk_counts(self, indexed_repo): - """Test that status shows index information after indexing.""" - result = self.run_cli(["status"], indexed_repo) - assert result.returncode == 0 - # Check for basic index info (chunk info only shown after --update) - assert "index" in result.stdout.lower() - - def test_index_skips_excluded_patterns(self, indexed_repo): - """Test that indexing skips excluded patterns.""" - results = self.search_json(".git", indexed_repo, regex=True, limit=10) - file_paths = self.get_result_file_paths(results) - git_files = [fp for fp in file_paths if ".git/" in fp] - assert len(git_files) == 0 - - def test_index_clean_rebuilds_from_scratch(self, indexed_repo): - """Test that --clean flag rebuilds index from scratch. - - Note: This test does a full rebuild with embeddings enabled. 
- """ - result = self.run_cli(["index", "--clean", "."], indexed_repo, timeout=600) - assert result.returncode == 0 - assert "clean" in result.stdout.lower() - - def test_index_update_only_processes_changes(self, indexed_repo): - """Test that --update flag only reindexes changed files.""" - result = self.run_cli(["index", "--update", "."], indexed_repo, timeout=600) - assert result.returncode == 0 - # ===== SEARCH - LEXICAL TESTS ===== def test_search_finds_language_keyword(self, indexed_repo): @@ -132,31 +86,6 @@ def test_search_csv_output_valid(self, indexed_repo): ) assert result.returncode == 0 - # ===== RESEARCH TESTS ===== - - def test_research_finds_related_code(self, indexed_repo): - """Test that research command finds related code chunks.""" - result = self.run_cli( - ["research", "How does routing work?", "--hops", "2"], indexed_repo, timeout=600 - ) - assert result.returncode == 0 - - def test_research_respects_hop_limit(self, indexed_repo): - """Test that research respects --hops parameter.""" - result = self.run_cli( - ["research", "How does this work?", "--hops", "1"], indexed_repo, timeout=600 - ) - assert result.returncode == 0 - - def test_research_graph_shows_relationships(self, indexed_repo): - """Test that --graph flag shows code relationships.""" - result = self.run_cli( - ["research", "How does routing work?", "--hops", "2", "--graph"], - indexed_repo, - timeout=600, - ) - assert result.returncode == 0 - # ===== STATUS & MAINTENANCE ===== def test_status_shows_index_info(self, indexed_repo): diff --git a/tests/e2e/test_rust_e2e.py b/tests/e2e/test_rust_e2e.py index fd023c4..8d5ebad 100644 --- a/tests/e2e/test_rust_e2e.py +++ b/tests/e2e/test_rust_e2e.py @@ -33,52 +33,6 @@ def test_init_creates_index_file(self, initialized_repo): index_path = initialized_repo / ".sia-code" / "index.db" assert index_path.exists() - # ===== INDEXING TESTS ===== - - def test_index_full_completes_successfully(self, indexed_repo): - """Test that full indexing 
completes without errors. - - Note: Uses indexed_repo fixture which already performed full indexing. - This test verifies the index was created successfully rather than re-indexing. - """ - # Verify index was created - index_path = indexed_repo / ".sia-code" / "index.db" - assert index_path.exists(), "Index database not created" - assert index_path.stat().st_size > 100000, "Index appears empty or incomplete" - - # Verify index contains data by checking status - result = self.run_cli(["status"], indexed_repo) - assert result.returncode == 0, f"Status check failed: {result.stderr}" - assert "index" in result.stdout.lower() - - def test_index_reports_file_and_chunk_counts(self, indexed_repo): - """Test that status shows index information after indexing.""" - result = self.run_cli(["status"], indexed_repo) - assert result.returncode == 0 - # Check for basic index info (chunk info only shown after --update) - assert "index" in result.stdout.lower() - - def test_index_skips_excluded_patterns(self, indexed_repo): - """Test that indexing skips excluded patterns.""" - results = self.search_json(".git", indexed_repo, regex=True, limit=10) - file_paths = self.get_result_file_paths(results) - git_files = [fp for fp in file_paths if ".git/" in fp] - assert len(git_files) == 0 - - def test_index_clean_rebuilds_from_scratch(self, indexed_repo): - """Test that --clean flag rebuilds index from scratch. - - Note: This test does a full rebuild with embeddings enabled. 
- """ - result = self.run_cli(["index", "--clean", "."], indexed_repo, timeout=600) - assert result.returncode == 0 - assert "clean" in result.stdout.lower() - - def test_index_update_only_processes_changes(self, indexed_repo): - """Test that --update flag only reindexes changed files.""" - result = self.run_cli(["index", "--update", "."], indexed_repo, timeout=600) - assert result.returncode == 0 - # ===== SEARCH - LEXICAL TESTS ===== def test_search_finds_language_keyword(self, indexed_repo): @@ -132,31 +86,6 @@ def test_search_csv_output_valid(self, indexed_repo): ) assert result.returncode == 0 - # ===== RESEARCH TESTS ===== - - def test_research_finds_related_code(self, indexed_repo): - """Test that research command finds related code chunks.""" - result = self.run_cli( - ["research", "How does async runtime work?", "--hops", "2"], indexed_repo, timeout=600 - ) - assert result.returncode == 0 - - def test_research_respects_hop_limit(self, indexed_repo): - """Test that research respects --hops parameter.""" - result = self.run_cli( - ["research", "How does this work?", "--hops", "1"], indexed_repo, timeout=600 - ) - assert result.returncode == 0 - - def test_research_graph_shows_relationships(self, indexed_repo): - """Test that --graph flag shows code relationships.""" - result = self.run_cli( - ["research", "How does async runtime work?", "--hops", "2", "--graph"], - indexed_repo, - timeout=600, - ) - assert result.returncode == 0 - # ===== STATUS & MAINTENANCE ===== def test_status_shows_index_info(self, indexed_repo): diff --git a/tests/e2e/test_typescript_e2e.py b/tests/e2e/test_typescript_e2e.py index 90b9154..cd94fcf 100644 --- a/tests/e2e/test_typescript_e2e.py +++ b/tests/e2e/test_typescript_e2e.py @@ -33,52 +33,6 @@ def test_init_creates_index_file(self, initialized_repo): index_path = initialized_repo / ".sia-code" / "index.db" assert index_path.exists() - # ===== INDEXING TESTS ===== - - def test_index_full_completes_successfully(self, indexed_repo): 
- """Test that full indexing completes without errors. - - Note: Uses indexed_repo fixture which already performed full indexing. - This test verifies the index was created successfully rather than re-indexing. - """ - # Verify index was created - index_path = indexed_repo / ".sia-code" / "index.db" - assert index_path.exists(), "Index database not created" - assert index_path.stat().st_size > 100000, "Index appears empty or incomplete" - - # Verify index contains data by checking status - result = self.run_cli(["status"], indexed_repo) - assert result.returncode == 0, f"Status check failed: {result.stderr}" - assert "index" in result.stdout.lower() - - def test_index_reports_file_and_chunk_counts(self, indexed_repo): - """Test that status shows index information after indexing.""" - result = self.run_cli(["status"], indexed_repo) - assert result.returncode == 0 - # Check for basic index info (chunk info only shown after --update) - assert "index" in result.stdout.lower() - - def test_index_skips_excluded_patterns(self, indexed_repo): - """Test that indexing skips excluded patterns.""" - results = self.search_json(".git", indexed_repo, regex=True, limit=10) - file_paths = self.get_result_file_paths(results) - git_files = [fp for fp in file_paths if ".git/" in fp] - assert len(git_files) == 0 - - def test_index_clean_rebuilds_from_scratch(self, indexed_repo): - """Test that --clean flag rebuilds index from scratch. - - Note: This test does a full rebuild with embeddings enabled. 
- """ - result = self.run_cli(["index", "--clean", "."], indexed_repo, timeout=600) - assert result.returncode == 0 - assert "clean" in result.stdout.lower() - - def test_index_update_only_processes_changes(self, indexed_repo): - """Test that --update flag only reindexes changed files.""" - result = self.run_cli(["index", "--update", "."], indexed_repo, timeout=600) - assert result.returncode == 0 - # ===== SEARCH - LEXICAL TESTS ===== def test_search_finds_language_keyword(self, indexed_repo): @@ -135,31 +89,6 @@ def test_search_csv_output_valid(self, indexed_repo): ) assert result.returncode == 0 - # ===== RESEARCH TESTS ===== - - def test_research_finds_related_code(self, indexed_repo): - """Test that research command finds related code chunks.""" - result = self.run_cli( - ["research", "How does the runtime work?", "--hops", "2"], indexed_repo, timeout=600 - ) - assert result.returncode == 0 - - def test_research_respects_hop_limit(self, indexed_repo): - """Test that research respects --hops parameter.""" - result = self.run_cli( - ["research", "How does this work?", "--hops", "1"], indexed_repo, timeout=600 - ) - assert result.returncode == 0 - - def test_research_graph_shows_relationships(self, indexed_repo): - """Test that --graph flag shows code relationships.""" - result = self.run_cli( - ["research", "How does the runtime work?", "--hops", "2", "--graph"], - indexed_repo, - timeout=600, - ) - assert result.returncode == 0 - # ===== STATUS & MAINTENANCE ===== def test_status_shows_index_info(self, indexed_repo): diff --git a/tests/integration/test_batch_indexing_search.py b/tests/integration/test_batch_indexing_search.py deleted file mode 100644 index 9e68ecb..0000000 --- a/tests/integration/test_batch_indexing_search.py +++ /dev/null @@ -1,48 +0,0 @@ -"""Integration test for batched indexing and lexical search.""" - - -from sia_code.config import Config -from sia_code.indexer.coordinator import IndexingCoordinator -from sia_code.storage.usearch_backend 
import UsearchSqliteBackend - - -def test_batched_indexing_enables_search(tmp_path): - repo = tmp_path / "repo" - repo.mkdir() - - source = repo / "math_utils.py" - source.write_text( - "\n".join( - [ - "def add(a, b):", - " return a + b", - "", - "def multiply(a, b):", - " return a * b", - "", - ] - ) - ) - - config = Config() - config.indexing.chunk_batch_size = 2 - config.embedding.enabled = False - - backend = UsearchSqliteBackend( - path=tmp_path / ".sia-code", - embedding_enabled=False, - ndim=4, - dtype="f32", - ) - backend.create_index() - - coordinator = IndexingCoordinator(config, backend) - stats = coordinator.index_directory(repo) - - assert stats["total_chunks"] > 0 - - results = backend.search_lexical("multiply", k=1) - assert results - assert results[0].chunk.file_path.name == "math_utils.py" - - backend.close() diff --git a/tests/integration/test_v1_v2_equivalence.py b/tests/integration/test_v1_v2_equivalence.py deleted file mode 100644 index 4d706b0..0000000 --- a/tests/integration/test_v1_v2_equivalence.py +++ /dev/null @@ -1,328 +0,0 @@ -"""Test equivalence between v1 and v2 incremental indexing methods. - -NOTE: v1 has been REMOVED from the codebase after validation. -These tests remain as historical documentation that v2 was validated -to produce equivalent or better results than v1 before deletion. - -The tests now only execute against a mock v1 implementation. 
-""" - -import pytest -import time -from sia_code.indexer.coordinator import IndexingCoordinator -from sia_code.indexer.hash_cache import HashCache -from sia_code.indexer.chunk_index import ChunkIndex -from sia_code.storage.usearch_backend import UsearchSqliteBackend -from sia_code.config import Config, ChunkingConfig - - -@pytest.fixture -def test_workspace(tmp_path): - """Create a workspace with test files.""" - workspace = tmp_path / "workspace" - workspace.mkdir() - - # Create test files with different sizes - (workspace / "small.py").write_text(""" -def small_function(): - return "small" -""") - - (workspace / "medium.py").write_text(""" -def function_one(): - return 1 - -def function_two(): - return 2 - -class MediumClass: - def method(self): - return "method" -""") - - (workspace / "large.py").write_text(""" -class LargeClass: - def __init__(self): - self.data = [] - - def add(self, item): - self.data.append(item) - - def remove(self, item): - self.data.remove(item) - - def get_all(self): - return self.data - - def clear(self): - self.data.clear() -""") - - return workspace - - -@pytest.fixture -def backends(tmp_path): - """Create separate backends for v1 and v2.""" - backend_v1 = UsearchSqliteBackend(tmp_path / "v1.sia-code", embedding_enabled=False) - backend_v1.create_index() - - backend_v2 = UsearchSqliteBackend(tmp_path / "v2.sia-code", embedding_enabled=False) - backend_v2.create_index() - - yield {"v1": backend_v1, "v2": backend_v2} - - backend_v1.close() - backend_v2.close() - - -class TestV1V2Equivalence: - """Test that v2 produces equivalent results to v1. - - NOTE: v1 has been removed. These tests are skipped but kept for documentation. 
- """ - - @pytest.mark.skip(reason="v1 removed after validation - kept for historical documentation") - def test_initial_indexing_produces_same_chunk_count(self, test_workspace, backends, tmp_path): - """Test that v1 and v2 produce same chunk count on initial indexing.""" - # Setup for v1 - cache_v1 = HashCache(tmp_path / "cache_v1.json") - config = Config( - sia_dir=tmp_path / "v1_dir", - chunking=ChunkingConfig( - max_chunk_size=500, - min_chunk_size=50, - merge_threshold=100, - greedy_merge=True, - ), - ) - coordinator_v1 = IndexingCoordinator(backend=backends["v1"], config=config) - - # Setup for v2 - cache_v2 = HashCache(tmp_path / "cache_v2.json") - chunk_index = ChunkIndex(tmp_path / "chunk_index.json") - coordinator_v2 = IndexingCoordinator(backend=backends["v2"], config=config) - - # Run v1 - stats_v1 = coordinator_v1.index_directory_incremental(test_workspace, cache_v1) - - # Run v2 - stats_v2 = coordinator_v2.index_directory_incremental_v2( - test_workspace, cache_v2, chunk_index, progress_callback=None - ) - - # Compare results - # Both use same keys - assert stats_v1["changed_files"] == stats_v2["changed_files"] - assert stats_v1["total_chunks"] == stats_v2["total_chunks"] - - @pytest.mark.skip(reason="v1 removed after validation - kept for historical documentation") - def test_incremental_reindex_skips_same_files(self, test_workspace, backends, tmp_path): - """Test that both v1 and v2 skip unchanged files on re-index.""" - cache_v1 = HashCache(tmp_path / "cache_v1.json") - cache_v2 = HashCache(tmp_path / "cache_v2.json") - chunk_index = ChunkIndex(tmp_path / "chunk_index.json") - - config = Config( - sia_dir=tmp_path, - chunking=ChunkingConfig( - max_chunk_size=500, - min_chunk_size=50, - merge_threshold=100, - greedy_merge=True, - ), - ) - - coordinator_v1 = IndexingCoordinator(backend=backends["v1"], config=config) - coordinator_v2 = IndexingCoordinator(backend=backends["v2"], config=config) - - # Initial indexing - 
coordinator_v1.index_directory_incremental(test_workspace, cache_v1) - coordinator_v2.index_directory_incremental_v2( - test_workspace, cache_v2, chunk_index, progress_callback=None - ) - - # Save caches - cache_v1.save() - cache_v2.save() - chunk_index.save() - - # Re-index without changes - stats_v1_reindex = coordinator_v1.index_directory_incremental(test_workspace, cache_v1) - stats_v2_reindex = coordinator_v2.index_directory_incremental_v2( - test_workspace, cache_v2, chunk_index, progress_callback=None - ) - - # Both should skip all files - assert stats_v1_reindex["changed_files"] == 0 - assert stats_v2_reindex["changed_files"] == 0 - assert stats_v1_reindex["total_chunks"] == 0 - assert stats_v2_reindex["total_chunks"] == 0 - - @pytest.mark.skip(reason="v1 removed after validation - kept for historical documentation") - def test_file_change_detection_consistent(self, test_workspace, backends, tmp_path): - """Test that both v1 and v2 detect file changes consistently.""" - cache_v1 = HashCache(tmp_path / "cache_v1.json") - cache_v2 = HashCache(tmp_path / "cache_v2.json") - chunk_index = ChunkIndex(tmp_path / "chunk_index.json") - - config = Config( - sia_dir=tmp_path, - chunking=ChunkingConfig( - max_chunk_size=500, - min_chunk_size=50, - merge_threshold=100, - greedy_merge=True, - ), - ) - - coordinator_v1 = IndexingCoordinator(backend=backends["v1"], config=config) - coordinator_v2 = IndexingCoordinator(backend=backends["v2"], config=config) - - # Initial indexing - coordinator_v1.index_directory_incremental(test_workspace, cache_v1) - coordinator_v2.index_directory_incremental_v2( - test_workspace, cache_v2, chunk_index, progress_callback=None - ) - - cache_v1.save() - cache_v2.save() - chunk_index.save() - - # Modify one file - time.sleep(0.01) - (test_workspace / "small.py").write_text(""" -def small_function(): - return "modified" - -def new_function(): - return "new" -""") - - # Re-index - stats_v1 = 
coordinator_v1.index_directory_incremental(test_workspace, cache_v1) - stats_v2 = coordinator_v2.index_directory_incremental_v2( - test_workspace, cache_v2, chunk_index, progress_callback=None - ) - - # Both should detect 1 changed file - assert stats_v1["changed_files"] == 1 - assert stats_v2["changed_files"] == 1 - - # Both should have similar chunk counts (at least 2 functions) - assert stats_v1["total_chunks"] >= 2 - assert stats_v2["total_chunks"] >= 2 - - def test_v2_additional_features_work(self, test_workspace, backends, tmp_path): - """Test that v2's additional features (chunk tracking) work correctly.""" - cache = HashCache(tmp_path / "cache.json") - chunk_index = ChunkIndex(tmp_path / "chunk_index.json") - - config = Config( - sia_dir=tmp_path, - chunking=ChunkingConfig( - max_chunk_size=500, - min_chunk_size=50, - merge_threshold=100, - greedy_merge=True, - ), - ) - - coordinator = IndexingCoordinator(backend=backends["v2"], config=config) - - # Initial indexing - coordinator.index_directory_incremental_v2( - test_workspace, cache, chunk_index, progress_callback=None - ) - - # Chunk index should have valid chunks - valid_chunks = chunk_index.get_valid_chunks() - assert len(valid_chunks) > 0 - - # Modify a file - time.sleep(0.01) - (test_workspace / "medium.py").write_text("def new(): pass") - - # Re-index - coordinator.index_directory_incremental_v2( - test_workspace, cache, chunk_index, progress_callback=None - ) - - # Should now have stale chunks (from old medium.py) - stale_chunks = chunk_index.get_stale_chunks() - assert len(stale_chunks) > 0 - - -class TestV2Improvements: - """Test that v2 has improvements over v1.""" - - def test_v2_tracks_staleness(self, test_workspace, backends, tmp_path): - """Test that v2 tracks chunk staleness (v1 does not).""" - cache = HashCache(tmp_path / "cache.json") - chunk_index = ChunkIndex(tmp_path / "chunk_index.json") - - config = Config( - sia_dir=tmp_path, - chunking=ChunkingConfig( - max_chunk_size=500, - 
min_chunk_size=50, - merge_threshold=100, - greedy_merge=True, - ), - ) - - coordinator = IndexingCoordinator(backend=backends["v2"], config=config) - - # Index - coordinator.index_directory_incremental_v2( - test_workspace, cache, chunk_index, progress_callback=None - ) - - # Get summary - summary = chunk_index.get_staleness_summary() - - # Should have metrics - assert summary.total_chunks > 0 - assert summary.valid_chunks > 0 - assert summary.stale_chunks == 0 # No stale chunks yet - assert summary.staleness_ratio == 0.0 - - def test_v2_cleanup_deleted_files(self, test_workspace, backends, tmp_path): - """Test that v2 cleans up chunks from deleted files.""" - cache = HashCache(tmp_path / "cache.json") - chunk_index = ChunkIndex(tmp_path / "chunk_index.json") - - config = Config( - sia_dir=tmp_path, - chunking=ChunkingConfig( - max_chunk_size=500, - min_chunk_size=50, - merge_threshold=100, - greedy_merge=True, - ), - ) - - coordinator = IndexingCoordinator(backend=backends["v2"], config=config) - - # Initial index - stats1 = coordinator.index_directory_incremental_v2( - test_workspace, cache, chunk_index, progress_callback=None - ) - initial_file_count = stats1["changed_files"] - - # Delete a file - (test_workspace / "small.py").unlink() - - # Re-index - coordinator.index_directory_incremental_v2( - test_workspace, cache, chunk_index, progress_callback=None - ) - - # Chunk index should have cleaned up the deleted file - # (exact validation depends on internal state, but shouldn't crash) - summary = chunk_index.get_staleness_summary() - assert summary.total_files < initial_file_count - - -if __name__ == "__main__": - pytest.main([__file__, "-v"]) diff --git a/tests/integration/test_watch_mode.py b/tests/integration/test_watch_mode.py deleted file mode 100644 index 63e26b7..0000000 --- a/tests/integration/test_watch_mode.py +++ /dev/null @@ -1,261 +0,0 @@ -"""Integration tests for watch mode functionality.""" - -import pytest -import time -from 
sia_code.indexer.coordinator import IndexingCoordinator -from sia_code.indexer.hash_cache import HashCache -from sia_code.indexer.chunk_index import ChunkIndex -from sia_code.storage.usearch_backend import UsearchSqliteBackend -from sia_code.config import Config, ChunkingConfig - - -@pytest.fixture -def temp_workspace(tmp_path): - """Create a temporary workspace with test files.""" - workspace = tmp_path / "workspace" - workspace.mkdir() - - # Create initial test file - test_file = workspace / "test.py" - test_file.write_text(""" -def hello(): - return "Hello, World!" -""") - - return workspace - - -@pytest.fixture -def test_setup(tmp_path, temp_workspace): - """Set up test infrastructure (backend, cache, index).""" - # Create backend - backend_path = tmp_path / "test.sia-code" - backend = UsearchSqliteBackend(backend_path, embedding_enabled=False) - backend.create_index() - - # Create cache and chunk index - cache_path = tmp_path / "cache.json" - cache = HashCache(cache_path) - - chunk_index_path = tmp_path / "chunk_index.json" - chunk_index = ChunkIndex(chunk_index_path) - - # Create config - config = Config( - sia_dir=tmp_path, - chunking=ChunkingConfig( - max_chunk_size=500, - min_chunk_size=50, - merge_threshold=100, - greedy_merge=True, - ), - ) - - coordinator = IndexingCoordinator(backend=backend, config=config) - - yield { - "backend": backend, - "cache": cache, - "chunk_index": chunk_index, - "config": config, - "coordinator": coordinator, - "workspace": temp_workspace, - } - - backend.close() - - -class TestWatchModeIndexing: - """Test watch mode uses v2 incremental indexing correctly.""" - - def test_watch_uses_v2_method(self, test_setup): - """Test that watch mode reindex uses index_directory_incremental_v2.""" - setup = test_setup - - # Initial index - stats = setup["coordinator"].index_directory_incremental_v2( - setup["workspace"], - setup["cache"], - setup["chunk_index"], - progress_callback=None, - ) - - assert stats["changed_files"] >= 1 - assert 
stats["total_chunks"] >= 1 - - # Save state - setup["cache"].save() - setup["chunk_index"].save() - - # Verify chunk index was updated - valid_chunks = setup["chunk_index"].get_valid_chunks() - assert len(valid_chunks) >= 1 - - def test_watch_incremental_reuses_unchanged_chunks(self, test_setup): - """Test that incremental indexing reuses chunks from unchanged files.""" - setup = test_setup - - # Initial index - stats1 = setup["coordinator"].index_directory_incremental_v2( - setup["workspace"], - setup["cache"], - setup["chunk_index"], - progress_callback=None, - ) - - setup["cache"].save() - setup["chunk_index"].save() - initial_chunks = stats1["total_chunks"] - - # Re-index without changes (should skip unchanged files) - stats2 = setup["coordinator"].index_directory_incremental_v2( - setup["workspace"], - setup["cache"], - setup["chunk_index"], - progress_callback=None, - ) - - # Should index 0 new files (nothing changed) - assert stats2["changed_files"] == 0 - assert stats2["total_chunks"] == 0 - - # Chunk index should still have the original chunks - valid_chunks = setup["chunk_index"].get_valid_chunks() - assert len(valid_chunks) == initial_chunks - - def test_watch_detects_file_changes(self, test_setup): - """Test that watch mode detects and re-indexes changed files.""" - setup = test_setup - - # Initial index - setup["coordinator"].index_directory_incremental_v2( - setup["workspace"], - setup["cache"], - setup["chunk_index"], - progress_callback=None, - ) - - setup["cache"].save() - setup["chunk_index"].save() - - # Wait a moment to ensure mtime changes - time.sleep(0.01) - - # Modify the file - test_file = setup["workspace"] / "test.py" - test_file.write_text(""" -def hello(): - return "Hello, World!" - -def goodbye(): - return "Goodbye, World!" 
-""") - - # Re-index (should detect change) - stats2 = setup["coordinator"].index_directory_incremental_v2( - setup["workspace"], - setup["cache"], - setup["chunk_index"], - progress_callback=None, - ) - - # Should re-index the changed file - assert stats2["changed_files"] >= 1 - assert stats2["total_chunks"] >= 2 # Now has 2 functions - - def test_watch_does_not_reindex_whole_repo(self, test_setup): - """Test that watch mode doesn't re-index unchanged files.""" - setup = test_setup - - # Create multiple files - for i in range(5): - file_path = setup["workspace"] / f"module{i}.py" - file_path.write_text(f""" -def function_{i}(): - return {i} -""") - - # Initial index - stats1 = setup["coordinator"].index_directory_incremental_v2( - setup["workspace"], - setup["cache"], - setup["chunk_index"], - progress_callback=None, - ) - - setup["cache"].save() - setup["chunk_index"].save() - - # Should index 6 files (test.py + 5 new modules) - assert stats1["changed_files"] >= 6 - - # Wait and modify only one file - time.sleep(0.01) - changed_file = setup["workspace"] / "module2.py" - changed_file.write_text(""" -def function_2(): - return "modified" -""") - - # Re-index - stats2 = setup["coordinator"].index_directory_incremental_v2( - setup["workspace"], - setup["cache"], - setup["chunk_index"], - progress_callback=None, - ) - - # Should only re-index the 1 changed file, not all 6 - assert stats2["changed_files"] == 1 - assert stats2["skipped_files"] == 5 # Other 5 files skipped - - def test_chunk_index_tracks_stale_chunks(self, test_setup): - """Test that chunk index properly tracks stale chunks when files change.""" - setup = test_setup - - # Initial index - setup["coordinator"].index_directory_incremental_v2( - setup["workspace"], - setup["cache"], - setup["chunk_index"], - progress_callback=None, - ) - - setup["cache"].save() - setup["chunk_index"].save() - - initial_valid_chunks = list(setup["chunk_index"].get_valid_chunks()) - assert len(initial_valid_chunks) >= 1 - - # 
Modify file - time.sleep(0.01) - test_file = setup["workspace"] / "test.py" - test_file.write_text(""" -def modified_function(): - return "Modified" -""") - - # Re-index - setup["coordinator"].index_directory_incremental_v2( - setup["workspace"], - setup["cache"], - setup["chunk_index"], - progress_callback=None, - ) - - # Old chunks should be marked stale - stale_chunks = setup["chunk_index"].get_stale_chunks() - assert len(stale_chunks) >= 1 - - # Should have new valid chunks - new_valid_chunks = list(setup["chunk_index"].get_valid_chunks()) - assert len(new_valid_chunks) >= 1 - - # Upsert may preserve chunk IDs; ensure either IDs changed or previous IDs were stale-marked. - assert new_valid_chunks != initial_valid_chunks or any( - chunk_id in stale_chunks for chunk_id in initial_valid_chunks - ) - - -if __name__ == "__main__": - pytest.main([__file__, "-v"]) diff --git a/tests/test_cli_integration.py b/tests/test_cli_integration.py index 035c80c..54e5392 100644 --- a/tests/test_cli_integration.py +++ b/tests/test_cli_integration.py @@ -6,7 +6,6 @@ import subprocess import sys from pathlib import Path -import os import shutil @@ -122,48 +121,6 @@ def test_status_after_init(self, test_project): assert "index" in result.stdout.lower() -class TestCLIIndex: - """Test 'sia-code index' command.""" - - def test_index_not_initialized(self, test_project): - """Test index when not initialized.""" - result = run_cli(["index", "."], cwd=test_project) - - assert result.returncode != 0 - - def test_index_basic(self, test_project): - """Test basic indexing.""" - run_cli(["init"], cwd=test_project) - disable_embeddings(test_project) - result = run_cli(["index", "."], cwd=test_project) - - assert result.returncode == 0 - assert "indexing complete" in result.stdout.lower() - - def test_index_clean(self, test_project): - """Test clean indexing.""" - run_cli(["init"], cwd=test_project) - disable_embeddings(test_project) - run_cli(["index", "."], cwd=test_project) - - result = 
run_cli(["index", "--clean", "."], cwd=test_project) - - assert result.returncode == 0 - assert "clean" in result.stdout.lower() - - def test_index_clean_removes_legacy_usearch_file(self, test_project): - """Test clean indexing removes legacy vectors.usearch to allow sqlite-vec migration.""" - run_cli(["init"], cwd=test_project) - - legacy_vectors = test_project / ".sia-code" / "vectors.usearch" - legacy_vectors.write_text("legacy") - - result = run_cli(["index", "--clean", "."], cwd=test_project) - - assert result.returncode == 0 - assert not legacy_vectors.exists() - - class TestCLISearch: """Test 'sia-code search' command.""" diff --git a/tests/unit/test_chunkhound_cli.py b/tests/unit/test_chunkhound_cli.py new file mode 100644 index 0000000..ee8f8c4 --- /dev/null +++ b/tests/unit/test_chunkhound_cli.py @@ -0,0 +1,48 @@ +"""Unit tests for ChunkHound CLI bridge helpers.""" + +from pathlib import Path + +from sia_code.config import Config +from sia_code.search.chunkhound_cli import build_search_command, parse_search_output + + +def test_build_search_command_regex_uses_no_embeddings_by_default(): + config = Config() + + cmd = build_search_command( + config=config, + query="auth", + project_path=Path("."), + db_path=Path("/tmp/chunkhound.db"), + mode="regex", + limit=7, + ) + + assert cmd[:3] == ["uvx", "chunkhound", "search"] + assert "--regex" in cmd + assert "--no-embeddings" in cmd + assert "--page-size" in cmd + assert "7" in cmd + + +def test_parse_search_output_extracts_file_and_lines(): + output = """=== Regex Search Results === + +[1] src/auth/service.py +[INFO] [blue][INFO][/blue] Lines 12-18 +```python +def authenticate_user(token: str) -> bool: + return token != "" +``` +""" + + parsed = parse_search_output(output=output, query="authenticate", mode="regex") + + assert parsed["query"] == "authenticate" + assert parsed["mode"] == "regex" + assert len(parsed["results"]) == 1 + first = parsed["results"][0] + assert first["chunk"]["file_path"] == 
"src/auth/service.py" + assert first["chunk"]["start_line"] == 12 + assert first["chunk"]["end_line"] == 18 + assert "authenticate_user" in (first["snippet"] or "") diff --git a/tests/unit/test_git_sync.py b/tests/unit/test_git_sync.py index 35b73a8..fb6b6de 100644 --- a/tests/unit/test_git_sync.py +++ b/tests/unit/test_git_sync.py @@ -214,6 +214,117 @@ def test_meets_importance_threshold(self, sync_service): assert sync_service._meets_importance_threshold("medium", "medium") is True assert sync_service._meets_importance_threshold("low", "high") is False + def test_merge_branch_generates_commit_based_changelog(self, sync_service, mock_backend): + """Merge commits with 'Merge branch' message should create changelog entries.""" + merge_event = { + "event_type": "merge", + "from_ref": "feat/location-mailing-list", + "to_ref": "develop", + "summary": "Merge branch 'feat/location-mailing-list' into 'develop'", + "files_changed": ["src/a.ts"], + "diff_stats": {"files": 1, "insertions": 10, "deletions": 2}, + "importance": "medium", + "commit_hash": "abc123", + "commit_time": datetime(2026, 1, 1, 12, 0, 0), + "merge_commit": object(), + } + + with patch.object(sync_service.extractor, "scan_git_tags", return_value=[]): + with patch.object( + sync_service.extractor, "scan_merge_events", return_value=[merge_event] + ): + with patch.object( + sync_service.extractor, + "get_commits_in_merge", + return_value=[ + "feat: add mailing list support", + "fix: resolve location sorting", + "BREAKING CHANGE: rename location payload", + ], + ): + stats = sync_service.sync(merges_only=True) + + assert stats["changelogs_added"] == 1 + assert mock_backend.add_changelog.called + args = mock_backend.add_changelog.call_args.kwargs + assert args["tag"] == "merge:abc123" + assert "feat: add mailing list support" in args["features"] + assert "fix: resolve location sorting" in args["fixes"] + assert "BREAKING CHANGE: rename location payload" in args["breaking_changes"] + + def 
test_non_merge_branch_messages_do_not_generate_commit_changelog( + self, sync_service, mock_backend + ): + """Merge commits without 'Merge branch' message should skip commit changelogs.""" + merge_event = { + "event_type": "merge", + "from_ref": "feature-x", + "to_ref": "main", + "summary": "Merge pull request #123 from org/feature-x", + "files_changed": ["src/a.ts"], + "diff_stats": {"files": 1, "insertions": 10, "deletions": 2}, + "importance": "medium", + "commit_hash": "def456", + "commit_time": datetime(2026, 1, 1, 12, 0, 0), + "merge_commit": object(), + } + + with patch.object(sync_service.extractor, "scan_git_tags", return_value=[]): + with patch.object( + sync_service.extractor, "scan_merge_events", return_value=[merge_event] + ): + with patch.object( + sync_service.extractor, + "get_commits_in_merge", + return_value=["feat: should be ignored"], + ): + stats = sync_service.sync(merges_only=True) + + assert stats["changelogs_added"] == 0 + mock_backend.add_changelog.assert_not_called() + + def test_sync_limit_zero_means_unbounded(self, sync_service, mock_backend): + """A limit of 0 should process all available events.""" + merge_events = [ + { + "event_type": "merge", + "from_ref": "a", + "to_ref": "b", + "summary": "Merge branch 'a' into 'b'", + "files_changed": [], + "diff_stats": {}, + "importance": "medium", + "commit_hash": "aaa111", + "commit_time": datetime(2026, 1, 1, 12, 0, 0), + "merge_commit": object(), + }, + { + "event_type": "merge", + "from_ref": "c", + "to_ref": "d", + "summary": "Merge branch 'c' into 'd'", + "files_changed": [], + "diff_stats": {}, + "importance": "medium", + "commit_hash": "bbb222", + "commit_time": datetime(2026, 1, 1, 12, 0, 0), + "merge_commit": object(), + }, + ] + + with patch.object(sync_service.extractor, "scan_git_tags", return_value=[]): + with patch.object( + sync_service.extractor, "scan_merge_events", return_value=merge_events + ): + with patch.object( + sync_service.extractor, + "get_commits_in_merge", + 
return_value=["fix: keep all"], + ): + stats = sync_service.sync(limit=0, merges_only=True) + + assert stats["timeline_added"] == 2 + if __name__ == "__main__": pytest.main([__file__, "-v"]) diff --git a/tests/unit/test_multi_hop.py b/tests/unit/test_multi_hop.py deleted file mode 100644 index 70f8c73..0000000 --- a/tests/unit/test_multi_hop.py +++ /dev/null @@ -1,461 +0,0 @@ -"""Unit tests for multi-hop code research functionality.""" - -import pytest -from sia_code.core.models import Chunk -from sia_code.core.types import ChunkType, Language, FilePath, LineNumber, ChunkId -from sia_code.search.multi_hop import MultiHopSearchStrategy, CodeRelationship -from sia_code.storage.usearch_backend import UsearchSqliteBackend - - -@pytest.fixture -def backend(tmp_path): - """Create a temporary backend for testing.""" - test_path = tmp_path / ".sia-code" - backend = UsearchSqliteBackend(test_path, embedding_enabled=False) - backend.create_index() - yield backend - backend.close() - - -@pytest.fixture -def sample_chunks(): - """Create sample chunks with realistic code relationships.""" - return [ - # Main entry point - Chunk( - symbol="main", - start_line=LineNumber(1), - end_line=LineNumber(10), - code="""def main(): - config = load_config() - data = fetch_data() - result = process_data(data) - save_result(result) -""", - chunk_type=ChunkType.FUNCTION, - language=Language.PYTHON, - file_path=FilePath("app/main.py"), - ), - # Helper function 1 - Chunk( - symbol="load_config", - start_line=LineNumber(1), - end_line=LineNumber(5), - code="""def load_config(): - with open('config.json') as f: - return json.load(f) -""", - chunk_type=ChunkType.FUNCTION, - language=Language.PYTHON, - file_path=FilePath("app/config.py"), - ), - # Helper function 2 - Chunk( - symbol="fetch_data", - start_line=LineNumber(1), - end_line=LineNumber(5), - code="""def fetch_data(): - response = requests.get(API_URL) - return parse_response(response) -""", - chunk_type=ChunkType.FUNCTION, - 
-            language=Language.PYTHON,
-            file_path=FilePath("app/data.py"),
-        ),
-        # Helper function 3
-        Chunk(
-            symbol="process_data",
-            start_line=LineNumber(1),
-            end_line=LineNumber(5),
-            code="""def process_data(data):
-    cleaned = clean_data(data)
-    return transform_data(cleaned)
-""",
-            chunk_type=ChunkType.FUNCTION,
-            language=Language.PYTHON,
-            file_path=FilePath("app/processor.py"),
-        ),
-        # Deeply nested function
-        Chunk(
-            symbol="parse_response",
-            start_line=LineNumber(1),
-            end_line=LineNumber(3),
-            code="""def parse_response(response):
-    return response.json()
-""",
-            chunk_type=ChunkType.FUNCTION,
-            language=Language.PYTHON,
-            file_path=FilePath("app/parser.py"),
-        ),
-    ]
-
-
-class TestMultiHopResearch:
-    """Test multi-hop code research functionality."""
-
-    def test_research_returns_results(self, backend, sample_chunks):
-        """Test that research returns results for a valid query."""
-        # Store chunks
-        backend.store_chunks_batch(sample_chunks)
-
-        # Create multi-hop strategy
-        strategy = MultiHopSearchStrategy(backend, max_hops=1)
-
-        # Research for "main"
-        result = strategy.research("main", max_results_per_hop=5)
-
-        # Should find at least the main function
-        assert len(result.chunks) >= 1
-        assert result.question == "main"
-        assert result.hops_executed >= 0
-
-    def test_research_respects_max_hops(self, backend, sample_chunks):
-        """Test that research respects max_hops parameter."""
-        backend.store_chunks_batch(sample_chunks)
-
-        # Test with max_hops=0 (only initial search)
-        strategy_0 = MultiHopSearchStrategy(backend, max_hops=0)
-        result_0 = strategy_0.research("main", max_results_per_hop=5)
-        assert result_0.hops_executed == 0
-
-        # Test with max_hops=1 (one hop)
-        strategy_1 = MultiHopSearchStrategy(backend, max_hops=1)
-        result_1 = strategy_1.research("main", max_results_per_hop=5)
-        assert result_1.hops_executed <= 1
-
-        # Test with max_hops=2 (two hops)
-        strategy_2 = MultiHopSearchStrategy(backend, max_hops=2)
-        result_2 = strategy_2.research("main", max_results_per_hop=5)
-        assert result_2.hops_executed <= 2
-
-    def test_research_respects_max_total_chunks(self, backend, sample_chunks):
-        """Test that research respects max_total_chunks safety limit."""
-        backend.store_chunks_batch(sample_chunks)
-
-        strategy = MultiHopSearchStrategy(backend, max_hops=10)
-
-        # Set low limit
-        result = strategy.research("main", max_results_per_hop=5, max_total_chunks=3)
-
-        # Should not exceed the limit
-        assert len(result.chunks) <= 3
-
-    def test_research_discovers_relationships(self, backend, sample_chunks):
-        """Test that multi-hop research discovers code relationships."""
-        backend.store_chunks_batch(sample_chunks)
-
-        strategy = MultiHopSearchStrategy(backend, max_hops=2)
-        result = strategy.research("main", max_results_per_hop=5)
-
-        # Should discover some relationships
-        # (exact count depends on entity extraction success)
-        assert result.relationships is not None
-        assert isinstance(result.relationships, list)
-
-        # Each relationship should have valid structure
-        for rel in result.relationships:
-            assert rel.from_entity is not None
-            assert rel.to_entity is not None
-            assert rel.relationship_type is not None
-
-    def test_research_handles_empty_results(self, backend):
-        """Test that research handles queries with no results gracefully."""
-        strategy = MultiHopSearchStrategy(backend, max_hops=1)
-
-        # Search for something that doesn't exist
-        result = strategy.research("nonexistent_function_xyz")
-
-        # Should return empty result, not crash
-        assert result.question == "nonexistent_function_xyz"
-        assert len(result.chunks) == 0
-        assert len(result.relationships) == 0
-        assert result.hops_executed == 0
-
-    def test_research_tracks_entities_found(self, backend, sample_chunks):
-        """Test that research tracks total entities found."""
-        backend.store_chunks_batch(sample_chunks)
-
-        strategy = MultiHopSearchStrategy(backend, max_hops=1)
-        result = strategy.research("main", max_results_per_hop=5)
-
-        # Should track entities (even if 0 due to extraction limitations)
-        assert result.total_entities_found >= 0
-        assert isinstance(result.total_entities_found, int)
-
-
-class TestCallGraphBuilding:
-    """Test call graph construction from relationships."""
-
-    def test_build_call_graph(self, tmp_path):
-        """Test building call graph from relationships."""
-        relationships = [
-            CodeRelationship(
-                from_entity="main",
-                to_entity="load_config",
-                relationship_type="function_call",
-                from_chunk=ChunkId("chunk1"),
-                to_chunk=ChunkId("chunk2"),
-            ),
-            CodeRelationship(
-                from_entity="main",
-                to_entity="fetch_data",
-                relationship_type="function_call",
-                from_chunk=ChunkId("chunk1"),
-                to_chunk=ChunkId("chunk3"),
-            ),
-            CodeRelationship(
-                from_entity="fetch_data",
-                to_entity="parse_response",
-                relationship_type="function_call",
-                from_chunk=ChunkId("chunk3"),
-                to_chunk=ChunkId("chunk4"),
-            ),
-        ]
-
-        backend = UsearchSqliteBackend(tmp_path / ".sia-code", embedding_enabled=False)
-        strategy = MultiHopSearchStrategy(backend, max_hops=1)
-
-        graph = strategy.build_call_graph(relationships)
-
-        # Should have entries for calling entities
-        assert "main" in graph
-        assert "fetch_data" in graph
-
-        # main should call load_config and fetch_data
-        assert len(graph["main"]) == 2
-        targets = {edge["target"] for edge in graph["main"]}
-        assert "load_config" in targets
-        assert "fetch_data" in targets
-
-        # fetch_data should call parse_response
-        assert len(graph["fetch_data"]) == 1
-        assert graph["fetch_data"][0]["target"] == "parse_response"
-
-    def test_build_call_graph_empty(self, tmp_path):
-        """Test building call graph with no relationships."""
-        backend = UsearchSqliteBackend(tmp_path / ".sia-code", embedding_enabled=False)
-        strategy = MultiHopSearchStrategy(backend, max_hops=1)
-
-        graph = strategy.build_call_graph([])
-
-        # Should return empty graph
-        assert graph == {}
-
-    def test_build_call_graph_includes_metadata(self, tmp_path):
-        """Test that call graph includes relationship metadata."""
-        relationships = [
-            CodeRelationship(
-                from_entity="ClassA",
-                to_entity="ClassB",
-                relationship_type="inheritance",
-                from_chunk=ChunkId("chunk1"),
-                to_chunk=ChunkId("chunk2"),
-            ),
-        ]
-
-        backend = UsearchSqliteBackend(tmp_path / ".sia-code", embedding_enabled=False)
-        strategy = MultiHopSearchStrategy(backend, max_hops=1)
-
-        graph = strategy.build_call_graph(relationships)
-
-        # Should include relationship type
-        assert graph["ClassA"][0]["type"] == "inheritance"
-        assert graph["ClassA"][0]["chunk_id"] == ChunkId("chunk2")
-
-
-class TestEntryPointDetection:
-    """Test entry point identification in call graphs."""
-
-    def test_get_entry_points(self, tmp_path):
-        """Test identifying entry points (no incoming edges)."""
-        relationships = [
-            CodeRelationship("main", "load_config", "function_call"),
-            CodeRelationship("main", "fetch_data", "function_call"),
-            CodeRelationship("fetch_data", "parse_response", "function_call"),
-        ]
-
-        backend = UsearchSqliteBackend(tmp_path / ".sia-code", embedding_enabled=False)
-        strategy = MultiHopSearchStrategy(backend, max_hops=1)
-
-        entry_points = strategy.get_entry_points(relationships)
-
-        # Only "main" should be an entry point (never a target)
-        assert "main" in entry_points
-        assert "load_config" not in entry_points  # Called by main
-        assert "fetch_data" not in entry_points  # Called by main
-        assert "parse_response" not in entry_points  # Called by fetch_data
-
-    def test_get_entry_points_multiple(self, tmp_path):
-        """Test identifying multiple entry points."""
-        relationships = [
-            CodeRelationship("main", "helper", "function_call"),
-            CodeRelationship("test_main", "helper", "function_call"),
-            CodeRelationship("helper", "util", "function_call"),
-        ]
-
-        backend = UsearchSqliteBackend(tmp_path / ".sia-code", embedding_enabled=False)
-        strategy = MultiHopSearchStrategy(backend, max_hops=1)
-
-        entry_points = strategy.get_entry_points(relationships)
-
-        # Both main and test_main are entry points
-        assert len(entry_points) == 2
-        assert "main" in entry_points
-        assert "test_main" in entry_points
-
-    def test_get_entry_points_empty(self, tmp_path):
-        """Test entry point detection with no relationships."""
-        backend = UsearchSqliteBackend(tmp_path / ".sia-code", embedding_enabled=False)
-        strategy = MultiHopSearchStrategy(backend, max_hops=1)
-
-        entry_points = strategy.get_entry_points([])
-
-        # Should return empty list
-        assert entry_points == []
-
-    def test_get_entry_points_circular(self, tmp_path):
-        """Test entry point detection with circular relationships."""
-        relationships = [
-            CodeRelationship("A", "B", "calls"),
-            CodeRelationship("B", "C", "calls"),
-            CodeRelationship("C", "A", "calls"),  # Circular
-        ]
-
-        backend = UsearchSqliteBackend(tmp_path / ".sia-code", embedding_enabled=False)
-        strategy = MultiHopSearchStrategy(backend, max_hops=1)
-
-        entry_points = strategy.get_entry_points(relationships)
-
-        # In a circular graph, no entity is an entry point
-        assert len(entry_points) == 0
-
-
-class TestAdaptiveSearch:
-    """Test adaptive search strategy (semantic vs preprocessed lexical)."""
-
-    def test_uses_semantic_when_embeddings_enabled(self, backend, sample_chunks):
-        """Research should use semantic search when embeddings are available."""
-        backend.store_chunks_batch(sample_chunks)
-
-        # Enable embeddings
-        backend.embedding_enabled = True
-
-        # Mock search_semantic to track if it's called
-        original_search_semantic = backend.search_semantic
-        call_count = {"count": 0}
-
-        def mock_search_semantic(*args, **kwargs):
-            call_count["count"] += 1
-            return original_search_semantic(*args, **kwargs)
-
-        backend.search_semantic = mock_search_semantic
-
-        strategy = MultiHopSearchStrategy(backend, max_hops=1)
-        strategy.research("How does main work?", max_results_per_hop=5)
-
-        # Should have called semantic search
-        assert call_count["count"] >= 1
-
-    def test_uses_lexical_when_embeddings_disabled(self, backend, sample_chunks):
-        """Research should use preprocessed lexical search when embeddings disabled."""
-        backend.store_chunks_batch(sample_chunks)
-
-        # Disable embeddings
-        backend.embedding_enabled = False
-
-        # Mock search_lexical to track calls
-        original_search_lexical = backend.search_lexical
-        call_count = {"count": 0}
-        calls = []
-
-        def mock_search_lexical(query, *args, **kwargs):
-            call_count["count"] += 1
-            calls.append(query)
-            return original_search_lexical(query, *args, **kwargs)
-
-        backend.search_lexical = mock_search_lexical
-
-        strategy = MultiHopSearchStrategy(backend, max_hops=1)
-        strategy.research("How does main work?", max_results_per_hop=5)
-
-        # Should have called lexical search
-        assert call_count["count"] >= 1
-        # First call should be preprocessed (no "How", "does")
-        first_query = calls[0]
-        assert "how" not in first_query.lower() or "main" in first_query.lower()
-
-
-class TestNaturalLanguageQueries:
-    """Test that research handles natural language questions."""
-
-    def test_natural_language_question_with_embeddings(self, backend, sample_chunks):
-        """Natural language questions should attempt semantic search when enabled."""
-        backend.store_chunks_batch(sample_chunks)
-        backend.embedding_enabled = True
-
-        strategy = MultiHopSearchStrategy(backend, max_hops=1)
-        # This should not crash even if embeddings aren't available
-        result = strategy.research("How does the main function work?", max_results_per_hop=5)
-
-        # Should return a valid result object (may be empty if no API key)
-        assert isinstance(result.chunks, list)
-        assert result.question == "How does the main function work?"
-
-    def test_natural_language_question_without_embeddings(self, backend, sample_chunks):
-        """Natural language questions should work with preprocessing fallback."""
-        backend.store_chunks_batch(sample_chunks)
-        backend.embedding_enabled = False
-
-        strategy = MultiHopSearchStrategy(backend, max_hops=1)
-        result = strategy.research("How does main work", max_results_per_hop=5)
-
-        # With preprocessing, should find "main" after removing "How", "does"
-        # Result should be valid (may have results depending on lexical matching)
-        assert isinstance(result.chunks, list)
-        assert result.hops_executed >= 0
-
-    def test_question_with_code_identifiers(self, backend, sample_chunks):
-        """Questions with code identifiers should preserve them in preprocessing."""
-        backend.store_chunks_batch(sample_chunks)
-        backend.embedding_enabled = False
-
-        # Use simpler query that will match
-        strategy = MultiHopSearchStrategy(backend, max_hops=1)
-        result = strategy.research("load_config", max_results_per_hop=5)
-
-        # Should find the load_config function with keyword search
-        assert len(result.chunks) >= 1
-        symbols = [chunk.symbol for chunk in result.chunks]
-        assert "load_config" in symbols
-
-    def test_natural_language_preprocessing_removes_stop_words(self, backend, sample_chunks):
-        """Verify that preprocessing is applied for natural language questions."""
-        backend.store_chunks_batch(sample_chunks)
-        backend.embedding_enabled = False
-
-        # Track what query is actually used in lexical search
-        original_search_lexical = backend.search_lexical
-        actual_queries = []
-
-        def track_search_lexical(query, *args, **kwargs):
-            actual_queries.append(query)
-            return original_search_lexical(query, *args, **kwargs)
-
-        backend.search_lexical = track_search_lexical
-
-        strategy = MultiHopSearchStrategy(backend, max_hops=0)
-        strategy.research("How does the config work?", max_results_per_hop=5)
-
-        # Should have made at least one lexical search
-        assert len(actual_queries) >= 1
-
-        # First query should have stop words removed
-        first_query = actual_queries[0].lower()
-        # "how", "does", "the" should be removed, "config" should remain
-        assert "config" in first_query
-        # Stop words should ideally be removed (may not be perfect but should try)
-        # Just verify config is present - that's the key term
-
-
-if __name__ == "__main__":
-    pytest.main([__file__, "-v"])
diff --git a/uv.lock b/uv.lock
index 948915b..b4b6fd8 100644
--- a/uv.lock
+++ b/uv.lock
@@ -2720,7 +2720,7 @@ wheels = [
 
 [[package]]
 name = "sia-code"
-version = "0.6.0"
+version = "0.7.1"
 source = { editable = "." }
 dependencies = [
     { name = "click" },
@@ -3098,10 +3098,10 @@ dependencies = [
     { name = "typing-extensions" },
 ]
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/e3/ea/304cf7afb744aa626fa9855245526484ee55aba610d9973a0521c552a843/torch-2.10.0-1-cp310-none-macosx_11_0_arm64.whl", hash = "sha256:c37fc46eedd9175f9c81814cc47308f1b42cfe4987e532d4b423d23852f2bf63", size = 79411450, upload-time = "2026-02-06T17:37:35.75Z" },
-    { url = "https://files.pythonhosted.org/packages/25/d8/9e6b8e7df981a1e3ea3907fd5a74673e791da483e8c307f0b6ff012626d0/torch-2.10.0-1-cp311-none-macosx_11_0_arm64.whl", hash = "sha256:f699f31a236a677b3118bc0a3ef3d89c0c29b5ec0b20f4c4bf0b110378487464", size = 79423460, upload-time = "2026-02-06T17:37:39.657Z" },
-    { url = "https://files.pythonhosted.org/packages/c9/2f/0b295dd8d199ef71e6f176f576473d645d41357b7b8aa978cc6b042575df/torch-2.10.0-1-cp312-none-macosx_11_0_arm64.whl", hash = "sha256:6abb224c2b6e9e27b592a1c0015c33a504b00a0e0938f1499f7f514e9b7bfb5c", size = 79498197, upload-time = "2026-02-06T17:37:27.627Z" },
-    { url = "https://files.pythonhosted.org/packages/a4/1b/af5fccb50c341bd69dc016769503cb0857c1423fbe9343410dfeb65240f2/torch-2.10.0-1-cp313-none-macosx_11_0_arm64.whl", hash = "sha256:7350f6652dfd761f11f9ecb590bfe95b573e2961f7a242eccb3c8e78348d26fe", size = 79498248, upload-time = "2026-02-06T17:37:31.982Z" },
+    { url = "https://files.pythonhosted.org/packages/5b/30/bfebdd8ec77db9a79775121789992d6b3b75ee5494971294d7b4b7c999bc/torch-2.10.0-2-cp310-none-macosx_11_0_arm64.whl", hash = "sha256:2b980edd8d7c0a68c4e951ee1856334a43193f98730d97408fbd148c1a933313", size = 79411457, upload-time = "2026-02-10T21:44:59.189Z" },
+    { url = "https://files.pythonhosted.org/packages/0f/8b/4b61d6e13f7108f36910df9ab4b58fd389cc2520d54d81b88660804aad99/torch-2.10.0-2-cp311-none-macosx_11_0_arm64.whl", hash = "sha256:418997cb02d0a0f1497cf6a09f63166f9f5df9f3e16c8a716ab76a72127c714f", size = 79423467, upload-time = "2026-02-10T21:44:48.711Z" },
+    { url = "https://files.pythonhosted.org/packages/d3/54/a2ba279afcca44bbd320d4e73675b282fcee3d81400ea1b53934efca6462/torch-2.10.0-2-cp312-none-macosx_11_0_arm64.whl", hash = "sha256:13ec4add8c3faaed8d13e0574f5cd4a323c11655546f91fbe6afa77b57423574", size = 79498202, upload-time = "2026-02-10T21:44:52.603Z" },
+    { url = "https://files.pythonhosted.org/packages/ec/23/2c9fe0c9c27f7f6cb865abcea8a4568f29f00acaeadfc6a37f6801f84cb4/torch-2.10.0-2-cp313-none-macosx_11_0_arm64.whl", hash = "sha256:e521c9f030a3774ed770a9c011751fb47c4d12029a3d6522116e48431f2ff89e", size = 79498254, upload-time = "2026-02-10T21:44:44.095Z" },
     { url = "https://files.pythonhosted.org/packages/0c/1a/c61f36cfd446170ec27b3a4984f072fd06dab6b5d7ce27e11adb35d6c838/torch-2.10.0-cp310-cp310-manylinux_2_28_aarch64.whl", hash = "sha256:5276fa790a666ee8becaffff8acb711922252521b28fbce5db7db5cf9cb2026d", size = 145992962, upload-time = "2026-01-21T16:24:14.04Z" },
     { url = "https://files.pythonhosted.org/packages/b5/60/6662535354191e2d1555296045b63e4279e5a9dbad49acf55a5d38655a39/torch-2.10.0-cp310-cp310-manylinux_2_28_x86_64.whl", hash = "sha256:aaf663927bcd490ae971469a624c322202a2a1e68936eb952535ca4cd3b90444", size = 915599237, upload-time = "2026-01-21T16:23:25.497Z" },
     { url = "https://files.pythonhosted.org/packages/40/b8/66bbe96f0d79be2b5c697b2e0b187ed792a15c6c4b8904613454651db848/torch-2.10.0-cp310-cp310-win_amd64.whl", hash = "sha256:a4be6a2a190b32ff5c8002a0977a25ea60e64f7ba46b1be37093c141d9c49aeb", size = 113720931, upload-time = "2026-01-21T16:24:23.743Z" },