DxTa · DxTa · Feb 20, 2026 · Feb 20, 2026 · Feb 23, 2026
diff --git a/README.md b/README.md
@@ -4,10 +4,12 @@ Local-first codebase intelligence for CLI workflows.
 
 Sia Code indexes your repo and lets you:
 
-- search code fast (lexical, semantic, or hybrid)
-- trace architecture with multi-hop research
+- search code fast via ChunkHound CLI (lexical or semantic)
+- trace architecture with ChunkHound research
 - store/retrieve project decisions and timeline context
 
+Search and research now always run through the ChunkHound CLI. Sia keeps index orchestration and memory storage local.
+
 ## Why teams use it
 
 - Works directly on local code (`.sia-code/` index per repo/worktree)
@@ -31,6 +33,9 @@ sia-code --version
 ## Quick Start (2 minutes)
 
 ```bash
+# install ChunkHound CLI once
+uv tool install chunkhound
+
 # in your project
 sia-code init
 sia-code index .
@@ -53,29 +58,31 @@ sia-code status
 | `sia-code index .` | Build index |
 | `sia-code index --update` | Incremental re-index |
 | `sia-code index --clean` | Rebuild index from scratch |
-| `sia-code search "query"` | Hybrid search (default) |
-| `sia-code search --regex "pattern"` | Lexical search |
-| `sia-code research "question"` | Multi-hop relationship discovery |
+| `sia-code search "query"` | ChunkHound-backed search (default mode from config) |
+| `sia-code search --regex "pattern"` | ChunkHound lexical search |
+| `sia-code research "question"` | ChunkHound research |
 | `sia-code memory sync-git` | Import timeline/changelog from git |
 | `sia-code memory search "topic"` | Search stored project memory |
 | `sia-code config show` | Print active configuration |
 
 ## Search Modes (important)
 
-- Default command is hybrid: `sia-code search "query"`
+- Default search mode comes from `chunkhound.default_search_mode` (default: `regex`)
 - Lexical mode: `sia-code search --regex "pattern"`
-- Semantic-only mode: `sia-code search --semantic-only "query"`
+- Semantic-only mode: `sia-code search --semantic-only "query"` (requires ChunkHound semantic setup)
 
-Use `--no-deps` when you want only your project code.
+Dependency visibility flags (`--no-deps`, `--deps-only`) are still accepted for compatibility but currently have no effect with ChunkHound-backed search.
 
 ## Git Sync Memory + Semantic Changelog
 
 `sia-code memory sync-git` is the fastest way to build project memory from git history.
 
 - Scans tags into changelog entries
 - Scans merge commits into timeline events
+- For merge commits whose subject matches `Merge branch '...'`, also creates changelog entries
 - Stores `files_changed` and diff stats (`insertions`, `deletions`, `files`)
 - Optionally enhances sparse summaries using a local summarization model
+- `memory sync-git --limit 0` processes all eligible events
 
 How semantic summary generation works:
 
@@ -111,8 +118,8 @@ Useful commands:
 
 ```bash
 sia-code config show
-sia-code config get search.vector_weight
-sia-code config set search.vector_weight 0.0
+sia-code config get chunkhound.default_search_mode
+sia-code config set chunkhound.default_search_mode semantic
 ```
 
 Note: backend selection is auto by default (`sqlite-vec` for new indexes, legacy `usearch` supported).

diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md
@@ -11,9 +11,9 @@ Sia Code has two core pipelines:
    - write lexical/vector indexes
 
 2. **Query pipeline**
-   - preprocess query
-   - run lexical and/or semantic search
-   - rank + return chunk matches
+   - resolve mode and build ChunkHound CLI command
+   - execute ChunkHound search/research
+   - parse and render results in Sia CLI formats
 
 ## Storage Model
 
@@ -34,17 +34,18 @@ Backend selection:
 - `cli.py`: command entry and orchestration
 - `indexer/coordinator.py`: full/incremental indexing lifecycle
 - `parser/*`: language detection, concept extraction, chunk building
-- `storage/*`: search execution and persistence
+- `search/chunkhound_cli.py`: ChunkHound command bridge and output parsing
+- `storage/*`: memory persistence plus legacy/local search paths
 - `memory/*`: git-to-memory sync and timeline/changelog tooling
 - `embed_server/*`: optional shared embed daemon
 
 ## Search Architecture
 
-- **Hybrid (default):** lexical + semantic
-- **Lexical (`--regex`):** exact token/symbol heavy queries
-- **Semantic (`--semantic-only`):** concept similarity only
+- **Default (`search`)**: mode from `chunkhound.default_search_mode` (default `regex`)
+- **Lexical (`--regex`)**: exact token/symbol heavy queries
+- **Semantic (`--semantic-only`)**: ChunkHound semantic mode
 
-Flags like `--no-deps` and `--deps-only` control dependency-code visibility.
+Flags like `--no-deps` and `--deps-only` are accepted for compatibility but are currently no-ops with ChunkHound-backed search.
 
 ## Design Goals
 

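Review note on the ARCHITECTURE.md change: the "resolve mode" step of the query pipeline can be sketched as below. This is a minimal illustration, not the code in `search/chunkhound_cli.py`. The function name `resolve_search_mode`, its parameters, and the mutual-exclusion check are assumptions. Only the flag names, the config key `chunkhound.default_search_mode`, and its default `regex` come from the docs in this diff.

```python
from typing import Optional

# Modes documented for chunkhound.default_search_mode in this diff.
VALID_MODES = ("regex", "semantic")


def resolve_search_mode(
    regex: bool = False,
    semantic_only: bool = False,
    config_default: Optional[str] = None,
) -> str:
    """Pick the search mode: explicit CLI flags win, then the config default.

    Hypothetical helper: rejecting both flags at once is an assumption,
    not documented behavior.
    """
    if regex and semantic_only:
        raise ValueError("--regex and --semantic-only cannot be combined")
    if regex:
        return "regex"
    if semantic_only:
        return "semantic"
    # No flag given: fall back to the config value, which defaults to regex.
    mode = config_default or "regex"
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported default_search_mode: {mode!r}")
    return mode
```

Under these assumptions, a plain `sia-code search "query"` with no config override resolves to `regex`, and `--semantic-only` overrides a `regex` default.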
diff --git a/docs/BENCHMARK_METHODOLOGY.md b/docs/BENCHMARK_METHODOLOGY.md
@@ -2,10 +2,12 @@
 
 This project uses RepoEval-style retrieval evaluation for search quality checks.
 
+> Note: The benchmark harness in `tests/benchmarks/` targets legacy in-process retrievers. ChunkHound-backed CLI search/research should be benchmarked separately as end-to-end CLI runs.
+
 ## Scope
 
 - Evaluate retrieval quality (not answer generation)
-- Compare lexical, hybrid, and semantic settings
+- Compare lexical, hybrid, and semantic settings in the legacy retriever stack
 - Use consistent query set and top-k metrics
 
 ## Minimal Reproduction Flow
@@ -23,7 +25,7 @@ pkgx python tests/benchmarks/run_full_repoeval_benchmark.py
 
 - Recall@k (especially Recall@5)
 - indexing time and query latency
-- configuration used (`vector_weight`, embedding settings)
+- configuration used (`chunkhound.default_search_mode` for CLI runs, `vector_weight` for legacy runs, embedding settings)
 
 ## Fairness Rules
 

diff --git a/docs/BENCHMARK_RESULTS.md b/docs/BENCHMARK_RESULTS.md
@@ -5,19 +5,21 @@
 - RepoEval Recall@5: **89.9%** (reported)
 - Improvement over cAST baseline: **+12.9 points** (reported)
 
+> Note: These numbers are historical baselines from legacy in-process retrievers. Current CLI `search` and `research` are ChunkHound-backed.
+
 ## Practical Takeaways
 
 - Lexical-heavy search performs strongly for code identifiers.
-- Hybrid can still be useful for natural-language style queries.
+- For legacy retriever experiments, hybrid can still help natural-language style queries.
 - For daily debugging, `--regex` is often the fastest path.
 
 ## Recommended Starting Config
 
 ```bash
-sia-code config set search.vector_weight 0.0
+sia-code config set chunkhound.default_search_mode regex
 ```
 
-Then adjust only if your query style is mostly conceptual.
+For legacy benchmark experiments, `search.vector_weight` remains available in the in-process retriever stack.
 
 ## Where to find raw benchmark tooling
 

diff --git a/docs/CLI_FEATURES.md b/docs/CLI_FEATURES.md
@@ -18,8 +18,8 @@ sia-code status
 | --- | --- | --- |
 | `init` | Create `.sia-code/` index workspace | `--path`, `--dry-run` |
 | `index [PATH]` | Build index | `--update`, `--clean`, `--parallel`, `--workers`, `--watch`, `--debounce`, `--no-git-sync` |
-| `search QUERY` | Search code (default hybrid) | `--regex`, `--semantic-only`, `-k/--limit`, `--no-filter`, `--no-deps`, `--deps-only`, `--format`, `--output` |
-| `research QUESTION` | Multi-hop architecture exploration | `--hops`, `--graph`, `-k/--limit`, `--no-filter` |
+| `search QUERY` | Search code (ChunkHound-backed) | `--regex`, `--semantic-only`, `-k/--limit`, `--no-filter` (compat), `--no-deps` (compat), `--deps-only` (compat), `--format`, `--output` |
+| `research QUESTION` | Architecture exploration (ChunkHound-backed) | `--hops` (compat), `--graph` (compat), `-k/--limit` (compat), `--no-filter` (compat) |
 | `status` | Index health and statistics | none |
 | `compact [PATH]` | Remove stale chunks | `--threshold`, `--force` |
 | `interactive` | Live query loop | `--regex`, `-k/--limit` |
@@ -28,17 +28,17 @@ sia-code status
 
 | Command | Purpose | Key options |
 | --- | --- | --- |
-| `memory sync-git` | Import timeline/changelog from git (with diff stats and optional local semantic summaries) | `--since`, `--limit`, `--dry-run`, `--tags-only`, `--merges-only`, `--min-importance` |
+| `memory sync-git` | Import timeline/changelog from git (with diff stats and optional local semantic summaries) | `--since`, `--limit` (`0` means all), `--dry-run`, `--tags-only`, `--merges-only`, `--min-importance` |
 | `memory add-decision TITLE` | Add pending decision | `-d/--description` (required), `-r/--reasoning`, `-a/--alternatives` |
-| `memory list` | List memory items | `--type`, `--status`, `--limit`, `--format` |
+| `memory list` | List memory items | `--type`, `--status`, `--limit` (`0` means all), `--format` |
 | `memory approve ID` | Approve decision | `-c/--category` (required) |
 | `memory reject ID` | Reject decision | none |
 | `memory search QUERY` | Search memory | `--type`, `-k/--limit` |
-| `memory timeline` | View timeline events | `--since`, `--event-type`, `--importance`, `--format` |
-| `memory changelog [RANGE]` | Generate changelog | `--format`, `--output` |
+| `memory timeline` | View timeline events | `--since`, `--event-type`, `--importance`, `--limit` (`0` means all), `--format` |
+| `memory changelog [RANGE]` | Generate changelog | `--limit` (`0` means all), `--format`, `--output` |
 | `memory export` / `memory import` | Backup/restore memory | `-o/--output`, `-i/--input` |
 
-`memory sync-git` is the entrypoint for semantic changelog generation: it extracts git context, then (if enabled) uses the local summarizer to enrich release and merge summaries stored in memory.
+`memory sync-git` is the entrypoint for semantic changelog generation: it extracts git context, then (if enabled) uses the local summarizer to enrich tag releases and merge-derived changelog entries stored in memory.
 
 ## Embed Daemon
 
@@ -48,15 +48,15 @@ sia-code status
 | `embed status` | Show daemon status |
 | `embed stop` | Stop daemon |
 
-Use daemon when you rely heavily on hybrid/semantic search or memory embedding operations.
+Use the daemon when you rely heavily on memory embedding operations.
 
 ## Config Commands
 
 ```bash
 sia-code config show
 sia-code config path
-sia-code config get search.vector_weight
-sia-code config set search.vector_weight 0.0
+sia-code config get chunkhound.default_search_mode
+sia-code config set chunkhound.default_search_mode semantic
 ```
 
 ## Output Formats
@@ -71,8 +71,8 @@ sia-code config set search.vector_weight 0.0
 - First index: `sia-code index .`
 - Ongoing work: `sia-code index --update`
 - Exact symbols: `sia-code search --regex "pattern"`
-- Project-only focus: `--no-deps`
-- Architecture questions: `sia-code research "..." --hops 3`
+- If output is noisy: tighten regex terms or add path-like query terms
+- Architecture questions: `sia-code research "..."`
 
 ## Related Docs
 

diff --git a/docs/CODE_STRUCTURE.md b/docs/CODE_STRUCTURE.md
@@ -21,8 +21,8 @@ sia_code/
   core/                  # shared models and enums
   parser/                # AST concept extraction and chunking
   indexer/               # indexing orchestration, hash cache, metrics
-  search/                # query pre-processing and multi-hop logic
-  storage/               # sqlite-vec + legacy usearch backends
+  search/                # ChunkHound CLI bridge + query helpers
+  storage/               # memory persistence + legacy local search backends
   memory/                # git sync, timeline, changelog, decision flow
   embed_server/          # optional embedding daemon
 ```
@@ -35,7 +35,8 @@ sia_code/
 | Change default behavior | `sia_code/config.py`, `sia_code/cli.py` |
 | Tune indexing | `sia_code/indexer/coordinator.py`, `sia_code/indexer/chunk_index.py` |
 | Tune chunking | `sia_code/parser/chunker.py`, `sia_code/parser/concepts.py` |
-| Search ranking/filtering | `sia_code/storage/sqlite_vec_backend.py`, `sia_code/storage/usearch_backend.py` |
+| ChunkHound search/research bridge | `sia_code/search/chunkhound_cli.py`, `sia_code/cli.py` |
+| Legacy/local search ranking (interactive) | `sia_code/storage/sqlite_vec_backend.py`, `sia_code/storage/usearch_backend.py` |
 | Backend selection logic | `sia_code/storage/factory.py` |
 | Memory commands and sync | `sia_code/memory/git_sync.py`, `sia_code/memory/git_events.py`, `sia_code/cli.py` |
 

diff --git a/docs/LLM_CLI_INTEGRATION.md b/docs/LLM_CLI_INTEGRATION.md
@@ -30,21 +30,29 @@ Load skill sia-code
 ## 3) Recommended agent workflow
 
 ```bash
+uv tool install chunkhound
 uvx sia-code status
 uvx sia-code init
 uvx sia-code index .
 uvx sia-code search --regex "your symbol"
 uvx sia-code research "how does X work?"
 ```
 
+Notes:
+
+- `search` and `research` are ChunkHound-backed.
+- Memory commands stay in Sia's local memory database.
+
 ## 4) Optional memory workflow
 
 ```bash
-uvx sia-code memory sync-git
+uvx sia-code memory sync-git --limit 0
 uvx sia-code memory search "topic"
 uvx sia-code memory add-decision "Decision title" -d "Context" -r "Reason"
 ```
 
+`memory sync-git` also derives changelog entries from merge commits whose subject matches `Merge branch '...'`.
+
 ## 5) Multiple worktrees / multiple Claude Code instances
 
 Use one of these index strategies per session:

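Review note on the merge-derived changelog rule: the docs say changelog entries are created for merge commits whose subject matches `Merge branch '...'`. One way such a match could look is sketched below. The helper name `merged_branch` and the exact regular expression are assumptions; the pattern used in `sia_code/memory/` may differ.

```python
import re
from typing import Optional

# Assumed pattern: subject starts with "Merge branch '<name>'", which is the
# form git writes by default when merging a local branch.
MERGE_BRANCH_RE = re.compile(r"^Merge branch '(?P<branch>[^']+)'")


def merged_branch(subject: str) -> Optional[str]:
    """Return the merged branch name, or None if the subject does not match."""
    match = MERGE_BRANCH_RE.match(subject)
    return match.group("branch") if match else None
```

With this sketch, a subject such as `Merge branch 'feature/x' into main` yields `feature/x`, while `Merge pull request #12 from a/b` does not match and would remain a timeline event only.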
diff --git a/docs/MEMORY_FEATURES.md b/docs/MEMORY_FEATURES.md
@@ -31,6 +31,7 @@ sia-code memory search "Adopt X" --type decision
 
 - Tags become changelog memory entries
 - Merge commits become timeline memory events
+- Merge commits whose subject matches `Merge branch '...'` also become changelog entries
 - Each event captures changed files and diff stats
 - Duplicate events are skipped automatically
 
@@ -69,6 +70,11 @@ Notes:
 | `memory changelog` | render changelog text/json/markdown |
 | `memory export` / `memory import` | backup/restore memory data |
 
+Limit behavior:
+
+- `memory sync-git --limit 0` processes all eligible events
+- `memory list --limit 0`, `memory timeline --limit 0`, and `memory changelog --limit 0` return all rows
+
 ## Good Practices
 
 - Add decisions with explicit `description` and `reasoning`.

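Review note on the `--limit 0` convention: the memory commands treat `0` as "no limit". A minimal sketch of that convention follows. The helper name `apply_limit` is hypothetical, and rejecting negative values is an assumption rather than documented behavior.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def apply_limit(items: Iterable[T], limit: int) -> Iterator[T]:
    """Yield at most `limit` items; a limit of 0 means "return everything"."""
    if limit < 0:
        # Assumption: negative limits are treated as a usage error.
        raise ValueError("limit must be >= 0")
    if limit == 0:
        return iter(items)
    return islice(items, limit)
```

Treating `0` as "all" keeps a single integer option instead of adding a separate `--all` flag, at the cost of `0` no longer meaning "return nothing".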
diff --git a/docs/PERFORMANCE_ANALYSIS.md b/docs/PERFORMANCE_ANALYSIS.md
@@ -3,18 +3,18 @@
 ## Typical Expectations
 
 - `search --regex`: usually lowest-latency mode
-- hybrid `search`: additional semantic overhead
+- `search --semantic-only`: usually higher latency than regex
 - `index --update`: much faster than full rebuild for small changes
 
-Actual speed depends on repo size, hardware, and embedding configuration.
+Actual speed depends on repo size, hardware, and ChunkHound semantic/provider setup.
 
 ## Quick Optimization Checklist
 
 1. Use `sia-code index --update` for daily work
 2. Use `--regex` for symbol/identifier lookup
-3. Add `--no-deps` to reduce large dependency noise
+3. Use tighter regex terms (or include path-like hints) to reduce noise
 4. Use `--parallel` for large initial indexing runs
-5. Start embed daemon when doing repeated semantic/hybrid queries
+5. Start the embed daemon when doing repeated memory embedding operations
 
 ## Useful Commands
 
@@ -28,8 +28,8 @@ sia-code search --regex "pattern"
 ## Bottleneck Hints
 
 - Slow index build: reduce indexed scope or enable parallel workers
-- Slow semantic/hybrid queries: ensure embed daemon is healthy
-- Noisy result set: use dependency filters (`--no-deps` / `--deps-only`)
+- Slow semantic queries: verify ChunkHound provider setup and model/network health
+- Noisy result set: narrow regex terms and include path-like query hints
 
 ## Related Docs
 

diff --git a/docs/QUERYING.md b/docs/QUERYING.md
@@ -3,49 +3,50 @@
 ## Search Commands
 
 ```bash
-# default hybrid
+# default mode from config (ChunkHound-backed; default is regex)
 sia-code search "authentication flow"
 
 # lexical / symbol-heavy
 sia-code search --regex "AuthService|token"
 
-# semantic only
+# semantic only (requires ChunkHound semantic setup)
 sia-code search --semantic-only "handle login failures"
 ```
 
 ## Useful Flags
 
 - `-k, --limit <N>`: number of results
-- `--no-deps`: only project code
-- `--deps-only`: only dependency code
-- `--no-filter`: include stale chunks
+- `--no-deps`: accepted for compatibility (currently no-op)
+- `--deps-only`: accepted for compatibility (currently no-op)
+- `--no-filter`: accepted for compatibility (currently no-op)
 - `--format text|json|table|csv`
 - `--output <path>`: write results to file
 
 ## Multi-Hop Research
 
 ```bash
-sia-code research "how does auth middleware work?" --hops 3 --graph
+sia-code research "how does auth middleware work?"
 ```
 
 Use this for architecture tracing, call-path discovery, and unfamiliar code.
 
+Compatibility flags for `research` (`--hops`, `--graph`, `--limit`, `--no-filter`) are accepted by Sia but currently have no effect on ChunkHound-backed research.
+
 ## Practical Tuning
 
-- `search.vector_weight = 0.0` => lexical-heavy behavior
-- `search.vector_weight = 1.0` => semantic-heavy behavior
+- `chunkhound.default_search_mode = regex|semantic`
 - defaults come from `.sia-code/config.json`
 
 ```bash
-sia-code config get search.vector_weight
-sia-code config set search.vector_weight 0.0
+sia-code config get chunkhound.default_search_mode
+sia-code config set chunkhound.default_search_mode semantic
 ```
 
 ## Output Tips
 
 - Use `--format json` for scripts/agents.
 - Use `--format table` for quick terminal scanning.
-- Use `--no-deps` in large repos to reduce noise.
+- Use tighter regex terms or path-like query text when results are noisy.
 
 ## Related Docs