fix(extract): resolve bare-name body wikilinks via resolver #1233
Open
rayers wants to merge 1 commit into
Body wikilinks in wiki/topic/learning content are silently dropped on every `gbrain extract` pass. Three layered issues:

1. WIKILINK_RE in src/core/link-extraction.ts is gated on DIR_PATTERN (people|companies|meetings|...). Wiki/topic/learning content uses bare-name wikilinks like `[[Fast-Weigh]]` or `[[2026-05-07-cost-plan]]`, which fall outside that allow-list. The regex never matched, so body refs were invisible to extract.
2. Body wikilinks that DID match were only resolved when `--include-frontmatter` was set, because extractPageLinks routed ALL refs (body + frontmatter) through `activeResolver`, which was set to nullResolver when frontmatter was off. Body refs are already free of the cost concern that gated frontmatter (they surface in markdown the user explicitly typed), so they should always resolve.
3. extract.ts called extractPageLinks with one resolver doing both jobs. Splitting via opts.skipFrontmatter lets the body pass keep the real resolver while frontmatter stays opt-in.

Fixes:

- link-extraction.ts adds BARE_WIKILINK_RE matching `[[<name>(#anchor)?(|display)?]]` shapes outside DIR_PATTERN, resolved via the new `resolveBareWikilink(name, resolver)` that walks fuzzy match + bare-name prefix expansion + exact-slug before giving up. Three new exports: BARE_WIKILINK_RE, resolveBareWikilink, isBareName (regex shape guard for the pre-extract candidate check). extractPageLinks gains an opts.skipFrontmatter parameter: when true, the frontmatter pass is skipped but body wikilinks still resolve through the passed resolver.
- extract.ts threads the always-on `resolver` (not the conditional nullResolver) into extractPageLinks for the body pass, with opts.skipFrontmatter wired off `--include-frontmatter`.
- test/link-extraction.test.ts: 75 lines covering BARE_WIKILINK_RE shape (anchor + display variants), resolveBareWikilink fuzzy + prefix + exact paths, isBareName negative cases (DIR_PATTERN prefixes still rejected), and extractPageLinks integration with opts.skipFrontmatter under both modes.

Scope note: this PR is the wikilink resolver portion of the original PR garrytan#768 wave. The doctor.ts hint fix that was also in that wave has been absorbed by upstream master independently (doctor.ts:2503 now correctly says `Run: gbrain extract all`). This PR carries only the wikilink resolver, with no overlap with upstream.

FS-source path (extractLinksFromDir) NOT updated. It uses a different codepath via extractMarkdownLinks + resolveSlug; bare-name wikilinks in FS mode still won't resolve. Most users are on --source db (autopilot uses it); FS is for offline Obsidian-vault mode. Separate concern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
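The resolver split described in issues 2 and 3 can be sketched roughly as follows. This is a minimal illustration, not the code in extract.ts or link-extraction.ts: every name ending in `Sketch` is hypothetical, and the page shape (pre-extracted body and frontmatter refs) is an assumption made to keep the example short.

```typescript
// Sketch only: all *Sketch names are hypothetical. The real extractPageLinks
// signature is not shown in this commit message.
type ResolverSketch = (ref: string) => string | null;

interface PageSketch {
  bodyRefs: string[];        // wikilink targets found in the markdown body
  frontmatterRefs: string[]; // references declared in frontmatter
}

interface ExtractOptsSketch {
  skipFrontmatter?: boolean;
}

// A resolver that never resolves anything, standing in for nullResolver.
const nullResolverSketch: ResolverSketch = () => null;

function extractPageLinksSketch(
  page: PageSketch,
  resolver: ResolverSketch,
  opts: ExtractOptsSketch = {},
): string[] {
  // The body pass always runs; only the frontmatter pass is skippable.
  const refs = opts.skipFrontmatter
    ? page.bodyRefs
    : page.bodyRefs.concat(page.frontmatterRefs);
  const resolved: string[] = [];
  for (const ref of refs) {
    const slug = resolver(ref);
    if (slug !== null) resolved.push(slug);
  }
  return resolved;
}

// Toy slug table used as the "real" resolver in this sketch.
const slugTableSketch = new Map<string, string>([
  ["Fast-Weigh", "companies/fast-weigh"],
  ["people/jane-doe", "people/jane-doe"],
]);
const realResolverSketch: ResolverSketch = (ref) => slugTableSketch.get(ref) ?? null;
```

Under these assumptions, passing `nullResolverSketch` reproduces the old behavior (no links at all when frontmatter is off), while passing `realResolverSketch` with `{ skipFrontmatter: true }` keeps body links and drops only the frontmatter pass.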
This was referenced May 20, 2026
Summary
Body wikilinks in wiki/topic/learning content are silently dropped on every `gbrain extract` pass. Three layered issues:

1. `WIKILINK_RE` in `src/core/link-extraction.ts` is gated on `DIR_PATTERN` (`people|companies|meetings|...`). Wiki/topic/learning content uses bare-name wikilinks like `[[Fast-Weigh]]` or `[[2026-05-07-cost-plan]]`, which fall outside that allow-list. The regex never matched, so body refs were invisible to extract.
2. Body wikilinks that DID match were only resolved when `--include-frontmatter` was set, because `extractPageLinks` routed ALL refs (body + frontmatter) through `activeResolver`, which was set to `nullResolver` when frontmatter was off. Body refs are already free of the cost concern that gated frontmatter (they surface in markdown the user explicitly typed), so they should always resolve.
3. `extract.ts` called `extractPageLinks` with one resolver doing both jobs. Splitting via `opts.skipFrontmatter` lets the body pass keep the real resolver while frontmatter stays opt-in.

What this PR is
This is the wikilink-resolver portion of the original PR #768 (which bundled #767 + #769 + extract polish). Two pieces of that bundle have either been absorbed or split off:
- […] `collectSyncableFiles` independently.
- The `doctor.ts` hint fix (`Run: gbrain extract all` instead of the gone-since-v0.16 `gbrain link-extract && gbrain timeline-extract`) has been absorbed by upstream master at `doctor.ts:2503`.

This PR carries only the wikilink resolver, with no scope overlap with upstream master.
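To make the first issue in the Summary concrete, the sketch below contrasts a directory-gated wikilink pattern with a bare-name shape. Both regexes are illustrative assumptions: the real `WIKILINK_RE`, `DIR_PATTERN`, and `BARE_WIKILINK_RE` are not reproduced in this description, and "bare" is simplified here to "contains no directory segment".

```typescript
// Hypothetical stand-in for DIR_PATTERN; the real allow-list is longer.
const DIR_PATTERN_SKETCH = "people|companies|meetings";

// Gated shape: [[<dir>/<slug>]] where <dir> must be on the allow-list.
const GATED_RE_SKETCH = new RegExp(
  `\\[\\[((?:${DIR_PATTERN_SKETCH})/[^\\]|#]+)(?:#[^\\]|]*)?(?:\\|[^\\]]*)?\\]\\]`,
);

// Bare shape: [[<name>]] with no directory segment, plus the same optional
// #anchor and |display parts.
const BARE_RE_SKETCH = /\[\[([^\]|#\/]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]/;

function collectTargets(shape: RegExp, body: string): string[] {
  // Build a fresh global regex so no lastIndex state leaks between calls.
  const scan = new RegExp(shape.source, "g");
  const targets: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = scan.exec(body)) !== null) {
    const target = m[1];
    if (target !== undefined) targets.push(target.trim());
  }
  return targets;
}

const sampleBody =
  "Met [[people/jane-doe]] about [[Fast-Weigh]]; see [[2026-05-07-cost-plan#budget|the plan]].";

// The gated pattern only sees the people/ link; the bare names need their own shape.
const gatedHits = collectTargets(GATED_RE_SKETCH, sampleBody);
const bareHits = collectTargets(BARE_RE_SKETCH, sampleBody);
```

With this sample, `gatedHits` contains only `people/jane-doe`, while `bareHits` contains `Fast-Weigh` and `2026-05-07-cost-plan`. The anchor and display text are matched but not captured, since only the name is handed to resolution.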
Fixes
- `link-extraction.ts` adds `BARE_WIKILINK_RE` matching `[[<name>(#anchor)?(|display)?]]` shapes outside `DIR_PATTERN`, resolved via the new `resolveBareWikilink(name, resolver)` that walks fuzzy match → bare-name prefix expansion → exact-slug before giving up. Three new exports: `BARE_WIKILINK_RE`, `resolveBareWikilink`, `isBareName` (regex shape guard for the pre-extract candidate check). `extractPageLinks` gains an `opts.skipFrontmatter` parameter: when true, the frontmatter pass is skipped but body wikilinks still resolve through the passed resolver.
- `extract.ts` threads the always-on `resolver` (not the conditional `nullResolver`) into `extractPageLinks` for the body pass, with `opts.skipFrontmatter` wired off `--include-frontmatter`.

Tests
- `test/link-extraction.test.ts`: 75 lines covering `BARE_WIKILINK_RE` shape (anchor + display variants), `resolveBareWikilink` fuzzy + prefix + exact paths, `isBareName` negative cases (`DIR_PATTERN` prefixes still rejected), and `extractPageLinks` integration with `opts.skipFrontmatter` under both modes.

Local: `bun run verify` clean, `bun test test/link-extraction.test.ts` → 103 pass / 0 fail.

Scope note
The FS-source path (`extractLinksFromDir`) is NOT updated. It uses a different codepath via `extractMarkdownLinks` + `resolveSlug`; bare-name wikilinks in FS mode still won't resolve. Most users are on `--source db` (autopilot uses it); FS is for offline Obsidian-vault mode. Separate concern.

Test plan
- `bun run verify` clean
- `bun test test/link-extraction.test.ts` → 103/0/0
- `bun run typecheck` clean
- `bun run test:e2e` (gated on DATABASE_URL)
- `gbrain extract links` produces non-zero link counts on pages using `[[Bare-Name]]` shapes

🤖 Generated with Claude Code
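For reviewers, the resolution order stated under Fixes (fuzzy match, then bare-name prefix expansion, then exact slug, then give up) can be sketched as a simple fallthrough chain. This is an illustration under assumptions, not the actual `resolveBareWikilink`: the resolver interface, the toy page table, the candidate directories, and the simplified meaning of "fuzzy" are all hypothetical.

```typescript
// Sketch only: the interface and the toy page table are hypothetical. The
// resolver API that resolveBareWikilink actually receives is not shown here.
interface BareResolverSketch {
  fuzzy: (name: string) => string | null;
  expandPrefix: (name: string) => string | null;
  exactSlug: (name: string) => string | null;
}

function resolveBareSketch(name: string, resolver: BareResolverSketch): string | null {
  // Order taken from the description: fuzzy, then prefix expansion, then exact slug.
  const steps = [resolver.fuzzy, resolver.expandPrefix, resolver.exactSlug];
  for (const step of steps) {
    const slug = step(name);
    if (slug !== null) return slug;
  }
  return null; // every strategy missed, so give up
}

interface PageRowSketch {
  slug: string;
  title: string;
}

// Toy in-memory page table.
const pagesSketch: PageRowSketch[] = [
  { slug: "companies/fast-weigh", title: "Fast-Weigh" },
  { slug: "wiki/2026-05-07-cost-plan", title: "Cost plan" },
  { slug: "roadmap", title: "Product roadmap" },
];
const prefixesSketch = ["wiki", "topics"]; // assumed candidate directories

function findSlug(pred: (p: PageRowSketch) => boolean): string | null {
  const page = pagesSketch.find(pred);
  return page ? page.slug : null;
}

const toyResolver: BareResolverSketch = {
  // "Fuzzy" is simplified to a case-insensitive title comparison.
  fuzzy: (name) => findSlug((p) => p.title.toLowerCase() === name.toLowerCase()),
  // Prefix expansion tries <dir>/<name> for each assumed directory.
  expandPrefix: (name) =>
    findSlug((p) => prefixesSketch.some((dir) => p.slug === `${dir}/${name.toLowerCase()}`)),
  // Exact slug requires the name to already be a full slug.
  exactSlug: (name) => findSlug((p) => p.slug === name),
};
```

In this toy setup, `Fast-Weigh` resolves at the fuzzy step, `2026-05-07-cost-plan` falls through to prefix expansion, `roadmap` is only caught by the exact-slug step, and an unknown name returns `null`. Keeping the strategies as an ordered list makes the fallthrough order easy to test in isolation.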