This repository was archived by the owner on Feb 14, 2026. It is now read-only.
v0.9.0: Consolidate into .refdocs/ directory, replace search with manifest
Move config, manifest, and downloaded docs into a single .refdocs/ folder
(.refdocs/config.json, .refdocs/manifest.json, .refdocs/docs/) following the
.git/ convention. Remove search indexer, chunker, and eval harness — replaced
by lightweight manifest-based discovery. Default download path changes from
ref-docs/ to docs/.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
-A local CLI tool that indexes markdown documentation and exposes fast fuzzy search with intelligent chunking. Designed to give LLM coding agents efficient, token-conscious access to project documentation without MCP servers, network calls, or full-file context dumps.
+A local CLI tool that fetches, organizes, and catalogs markdown documentation. Generates a compact manifest that gives LLM coding agents efficient, token-conscious access to project documentation without MCP servers, network calls, or full-file context dumps.
-- `chunkMinTokens` — minimum chunk size; merge small sections with their parent
-- `boostFields` — field relevance weights for search ranking
+- `paths` — array of directories to catalog (relative to `.refdocs/`)
+- `manifest` — where to persist the generated manifest (relative to `.refdocs/`)
 - `sources` — (managed by `refdocs add`) tracks GitHub repos added for future updates

-## CLI Commands
+## Manifest

-### `refdocs init`
+The manifest is a compact JSON file that summarizes all documented files. It replaces the old search index with a lightweight catalog that LLM agents can read directly.

-Create a `.refdocs.json` config file with full defaults. Errors if the file already exists. Also auto-runs when `refdocs add` is called without an existing config.
-- Small sections (below `chunkMinTokens`) merge into their parent heading's chunk
-- Large sections (above `chunkMaxTokens`) split at paragraph boundaries
-- Serialize index to `.refdocs-index.json`
-- Print summary: files indexed, chunks created, index size
+Target: entire manifest for 50 files should be ~500-800 tokens.

-### `refdocs search <query>`
+## CLI Commands

-Fuzzy search the index and return the top chunks.
+### `refdocs init`

-- Load persisted index (error if not built yet)
-- Run MiniSearch with fuzzy matching (fuzzy: 0.2), prefix search enabled
-- Return top 3 results by default
-- Output format: each chunk preceded by a comment with source file and line range
+Create a `.refdocs/config.json` config file with full defaults. Errors if the file already exists. Also auto-runs when `refdocs add` is called without an existing config.
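
As a rough illustration of the `refdocs init` behavior described above, the following Python sketch creates `.refdocs/config.json` and refuses to overwrite an existing file. It is not the tool's implementation: the function name `init_config` and the default values are assumptions, and only the key names `paths`, `manifest`, and `sources` come from the README.

```python
import json
from pathlib import Path

# Hypothetical defaults: only the key names are taken from the README.
DEFAULT_CONFIG = {"paths": ["docs"], "manifest": "manifest.json", "sources": []}


def init_config(project_root: Path) -> Path:
    """Create .refdocs/config.json, raising if the file already exists."""
    config_path = project_root / ".refdocs" / "config.json"
    if config_path.exists():
        raise FileExistsError(f"{config_path} already exists")
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(DEFAULT_CONFIG, indent=2) + "\n", encoding="utf-8")
    return config_path
```

Calling `init_config` twice on the same directory raises `FileExistsError` on the second call, which mirrors the "errors if the file already exists" rule.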

-**Flags:**
-- `-n, --results <count>` — number of results (default: 3, max: 10)
-- `-f, --file <pattern>` — filter results to files matching glob
-- `--json` — output results as JSON array instead of formatted text
-- `--raw` — output chunk body only, no metadata header (for piping)
+### `refdocs manifest`
+
+Walk all configured paths, extract headings and summaries from every markdown file, and generate the manifest.
+
+- Parse each markdown file for h1-h3 headings via regex
+- Extract frontmatter `description` or first paragraph as summary
+- Count lines per file
+- Write to `.refdocs/manifest.json`
+- Print summary: files cataloged, sources tracked
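
The per-file steps listed for `refdocs manifest` can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the tool's code: the function name `summarize_markdown`, the regular expressions, and the returned dictionary shape are all hypothetical. A plain regex scan also treats `#` lines inside fenced code blocks as headings, which a real implementation may want to avoid.

```python
import re

# h1-h3 ATX headings only; "####" and deeper do not match.
HEADING_RE = re.compile(r"^(#{1,3})[ \t]+(.+?)[ \t]*$", re.MULTILINE)
FRONTMATTER_RE = re.compile(r"\A---\n(.*?)\n---\n", re.DOTALL)
DESCRIPTION_RE = re.compile(r"^description:\s*(.+)$", re.MULTILINE)


def summarize_markdown(text: str) -> dict:
    """Return h1-h3 headings, a one-line summary, and the line count."""
    body, summary = text, ""
    frontmatter = FRONTMATTER_RE.match(text)
    if frontmatter:
        body = text[frontmatter.end():]
        description = DESCRIPTION_RE.search(frontmatter.group(1))
        if description:
            summary = description.group(1).strip().strip("\"'")
    if not summary:
        # Fall back to the first block that contains non-heading text.
        for block in re.split(r"\n\s*\n", body):
            lines = [ln for ln in block.splitlines() if not ln.lstrip().startswith("#")]
            candidate = " ".join(" ".join(lines).split())
            if candidate:
                summary = candidate
                break
    headings = [match.group(2) for match in HEADING_RE.finditer(body)]
    return {"headings": headings, "summary": summary, "lines": len(text.splitlines())}
```

Collecting one such record per file and serializing the list with `json.dumps` would give a catalog in the spirit of `.refdocs/manifest.json`; the actual manifest schema is not shown in this diff.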

 ### `refdocs add <source>`

 Add a local path or download markdown docs from a GitHub repository.

-- If source is a URL (`http://` or `https://`), download from GitHub as before
+- If source is a URL (`http://` or `https://`), download from GitHub
 - If source is a local path, verify it exists with `.md` files and add to `paths`
-- Update `.refdocs.json`: add path to `paths`, track source in `sources` (GitHub only)
-- Auto re-index unless `--no-index` is passed
+- Update `.refdocs/config.json`: add path to `paths`, track source in `sources` (GitHub only)
+- Auto regenerate manifest unless `--no-manifest` is passed
 - `--branch <branch>` — override branch detection from URL (GitHub only)
-- `--no-index` — skip auto re-indexing after adding
+- `--no-manifest` — skip auto manifest generation after adding

 Auth via `GITHUB_TOKEN` env var for private repos.
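
The source dispatch described for `refdocs add` (URL versus local path) might look something like the Python sketch below. The function names `classify_source` and `validate_local_source` are hypothetical, and searching subdirectories for `.md` files is an assumption; the README only says the path must exist "with `.md` files".

```python
from pathlib import Path


def classify_source(source: str) -> str:
    """Return "url" for http(s) sources and "local" for everything else."""
    return "url" if source.startswith(("http://", "https://")) else "local"


def validate_local_source(source: str) -> Path:
    """Check that a local source is a directory containing at least one .md file."""
    path = Path(source)
    if not path.is_dir():
        raise FileNotFoundError(f"{source} is not a directory")
    # Assumption: nested markdown files count, so search recursively.
    if next(path.rglob("*.md"), None) is None:
        raise ValueError(f"{source} contains no .md files")
    return path
```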

 ### `refdocs remove <path>`

-Remove a path from the index configuration.
+Remove a path from the configuration.

-- Remove path from `paths` in `.refdocs.json`
+- Remove path from `paths` in `.refdocs/config.json`
 - If path has an associated source, remove from `sources` too
-- Auto re-index unless `--no-index` is passed
+- Auto regenerate manifest unless `--no-manifest` is passed
 - Does not delete files on disk

 **Flags:**
-- `--no-index` — skip auto re-indexing after removal
+- `--no-manifest` — skip auto manifest generation after removal
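
A minimal Python sketch of the config update that `refdocs remove` describes is given below. The shape of a `sources` entry is not shown in this diff, so the `localPath` key used to link a source to a path is purely an assumption, as is the function name `remove_path`.

```python
def remove_path(config: dict, path: str) -> dict:
    """Return a copy of the config without `path` and without its source entry."""
    if path not in config.get("paths", []):
        raise KeyError(f"{path} is not a configured path")
    paths = [p for p in config["paths"] if p != path]
    # Assumption: each source entry records its download directory as "localPath".
    sources = [s for s in config.get("sources", []) if s.get("localPath") != path]
    # Only the configuration changes; nothing on disk is deleted.
    return {**config, "paths": paths, "sources": sources}
```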

 ### `refdocs list`

-List all indexed files and their chunk counts. Useful for verifying what's in the index.
-
-### `refdocs info <file>`
-
-Show all chunks for a specific file with their headings and token estimates.
+List all documented files and their heading counts. Loads from manifest if available, otherwise scans filesystem directly.
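
The manifest-first, filesystem-fallback behavior of `refdocs list` could be sketched in Python as follows. The manifest shape (`{"files": [{"path": ..., "headings": [...]}]}`) and the function name `list_files` are assumptions made for illustration; the real schema is not shown in this diff.

```python
import json
import re
from pathlib import Path

# Matches the start of an h1-h3 ATX heading line.
HEADING_LINE_RE = re.compile(r"^#{1,3}[ \t]+\S", re.MULTILINE)


def list_files(refdocs_dir: Path, paths: list[str]) -> dict[str, int]:
    """Map each documented file to its heading count."""
    manifest_path = refdocs_dir / "manifest.json"
    if manifest_path.exists():
        # Assumed manifest shape: {"files": [{"path": str, "headings": [str, ...]}]}
        manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
        return {entry["path"]: len(entry["headings"]) for entry in manifest["files"]}
    # No manifest yet: scan the configured paths directly.
    counts: dict[str, int] = {}
    for rel in paths:
        for md_file in sorted((refdocs_dir / rel).rglob("*.md")):
            text = md_file.read_text(encoding="utf-8")
            key = md_file.relative_to(refdocs_dir).as_posix()
            counts[key] = len(HEADING_LINE_RE.findall(text))
    return counts
```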

 ### `refdocs update`

-Re-pull all tracked sources from GitHub and re-index.
+Re-pull all tracked sources from GitHub and regenerate manifest.

-- Iterates over `sources` in `.refdocs.json`
+- Iterates over `sources` in `.refdocs/config.json`
 - Downloads each repo tarball and extracts `.md` files, overwriting local copies
-- Auto re-index unless `--no-index` is passed
+- Auto regenerate manifest unless `--no-manifest` is passed

 **Flags:**
-- `--no-index` — skip auto re-indexing after update
-
-## Chunking Strategy
-
-This is the core value of the tool. Chunks must be:
-
-1. **Semantically coherent** — never split mid-section. Heading boundaries are the primary split points.
-2. **Right-sized for LLM context** — 100-800 tokens. Big enough to be useful, small enough to not waste context.
-3. **Hierarchical** — each chunk carries its full heading breadcrumb (e.g. `Configuration > Database > Connections`) so the LLM understands where the chunk fits.
-
-Algorithm:
-1. Parse markdown into AST
-2. Walk AST and split at heading nodes (h1, h2, h3)
-3. Each section becomes a candidate chunk with its heading breadcrumb
-4. If chunk < minTokens, merge with previous sibling or parent
-5. If chunk > maxTokens, split at paragraph boundaries (double newline)
-6. Attach metadata: source file path, line range, heading trail
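
For readers comparing against the chunker removed in this commit, the six steps above can be approximated in Python as shown below. This is a simplified sketch, not the removed code: it swaps the markdown AST for a line-by-line regex scan, estimates tokens by counting whitespace-separated words (an assumption), merges a small section into the previous chunk rather than distinguishing sibling from parent, and reuses the section's line range for every split piece. Text before the first heading is ignored.

```python
import re

HEADING = re.compile(r"^(#{1,3})[ \t]+(.+?)[ \t]*$")


def estimate_tokens(text: str) -> int:
    # Assumption: one whitespace-separated word is roughly one token.
    return len(text.split())


def split_paragraphs(body: str, max_tokens: int) -> list[str]:
    """Greedily pack paragraphs into pieces of at most max_tokens where possible."""
    pieces, current = [], []
    for para in re.split(r"\n\s*\n", body):
        if current and estimate_tokens("\n\n".join(current + [para])) > max_tokens:
            pieces.append("\n\n".join(current))
            current = []
        current.append(para)
    if current:
        pieces.append("\n\n".join(current))
    return pieces


def chunk_markdown(text: str, min_tokens: int = 100, max_tokens: int = 800) -> list[dict]:
    sections, trail = [], []  # trail holds (level, title) pairs for the breadcrumb
    for line_no, line in enumerate(text.splitlines(), start=1):
        match = HEADING.match(line)
        if match:
            level = len(match.group(1))
            trail = [(lv, t) for lv, t in trail if lv < level] + [(level, match.group(2))]
            sections.append({"breadcrumb": " > ".join(t for _, t in trail),
                             "start": line_no, "end": line_no, "lines": [line]})
        elif sections:
            sections[-1]["lines"].append(line)
            sections[-1]["end"] = line_no
    chunks = []
    for sec in sections:
        body = "\n".join(sec["lines"]).strip()
        if chunks and estimate_tokens(body) < min_tokens:
            # Too small to stand alone: fold into the previous chunk.
            chunks[-1]["body"] = f"{chunks[-1]['body']}\n\n{body}"
            chunks[-1]["end"] = sec["end"]
            continue
        for piece in split_paragraphs(body, max_tokens):
            chunks.append({"breadcrumb": sec["breadcrumb"], "start": sec["start"],
                           "end": sec["end"], "body": piece})
    return chunks
```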
-
-## Output Format
-
-Default output for `refdocs search "data transformers"`:
-
-```
-# [1] spatie-laravel-data/transformers.md:15-48
-# Transformers > Built-in Transformers
-
-Transformers are used to convert data properties when...