fix(extract): resolve bare-name body wikilinks via resolver #1233

Open
rayers wants to merge 1 commit into garrytan:master from rayers:fix/extract-bare-name-wikilinks

rayers commented May 20, 2026

Summary

Body wikilinks in wiki/topic/learning content are silently dropped on every gbrain extract pass. Three layered issues:

  1. WIKILINK_RE in src/core/link-extraction.ts is gated on DIR_PATTERN (people|companies|meetings|...). Wiki/topic/learning content uses bare-name wikilinks like [[Fast-Weigh]] or [[2026-05-07-cost-plan]] which fall outside that allow-list — the regex never matched, so body refs were invisible to extract.
  2. Body wikilinks that DID match were only resolved when --include-frontmatter was set, because extractPageLinks routed ALL refs (body + frontmatter) through activeResolver, which was set to nullResolver when frontmatter was off. The cost concern that gated frontmatter does not apply to body refs, which appear in markdown the user explicitly typed, so they should always resolve.
  3. extract.ts called extractPageLinks with one resolver doing both jobs. Splitting via opts.skipFrontmatter lets the body pass keep the real resolver while frontmatter stays opt-in.
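
To make the first issue concrete, here is a minimal sketch of how a directory-gated pattern skips bare-name links. The `DIR_PATTERN` value and the `GATED_WIKILINK_RE` shape below are simplified assumptions for illustration, not the actual definitions in src/core/link-extraction.ts.

```typescript
// Hypothetical stand-in for the real DIR_PATTERN; the actual allow-list is longer.
const DIR_PATTERN = "people|companies|meetings";

// Directory-gated matcher: the link target must start with an allow-listed directory.
const GATED_WIKILINK_RE = new RegExp(`\\[\\[((?:${DIR_PATTERN})/[^\\]|#]+)`, "g");

// Collect every target the gated pattern can see in a page body.
function gatedTargets(text: string): string[] {
  const targets: string[] = [];
  const re = new RegExp(GATED_WIKILINK_RE.source, "g"); // fresh lastIndex per call
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) targets.push(m[1]);
  return targets;
}

const sampleBody =
  "Met [[people/jane-doe]] about [[Fast-Weigh]] and [[2026-05-07-cost-plan]].";

// Only the directory-prefixed link is found; both bare-name links are skipped.
const gatedHits = gatedTargets(sampleBody);
console.log(gatedHits);
```

Under these assumptions, only `people/jane-doe` is returned, which matches the symptom described above: bare-name body refs never reach the resolver.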

What this PR is

This is the wikilink-resolver portion of the original PR #768 (which bundled #767 + #769 + extract polish). Two pieces of that bundle have either been absorbed or split off:

This PR carries only the wikilink resolver — no scope overlap with upstream master.

Fixes

link-extraction.ts adds BARE_WIKILINK_RE matching [[<name>(#anchor)?(|display)?]] shapes outside DIR_PATTERN, resolved via the new resolveBareWikilink(name, resolver) that walks fuzzy match → bare-name prefix expansion → exact-slug before giving up. Three new exports: BARE_WIKILINK_RE, resolveBareWikilink, isBareName (regex shape guard for the pre-extract candidate check). extractPageLinks gains an opts.skipFrontmatter parameter — when true, the frontmatter pass is skipped but body wikilinks still resolve through the passed resolver.
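
The fallback order described above can be sketched roughly as follows. This is not the PR's code: the `SlugResolver` interface, the `PREFIX_DIRS` list, the lowercasing rule, and the exact regex are all assumptions made for illustration; only the names `BARE_WIKILINK_RE` and `resolveBareWikilink` and the fuzzy, prefix, exact ordering come from the description.

```typescript
// Hypothetical resolver shape; the real resolver interface in gbrain may differ.
interface SlugResolver {
  fuzzy(name: string): string | null; // best-effort title/alias match
  has(slug: string): boolean;         // exact slug existence check
}

// Sketch of a bare-name matcher: [[name]], [[name#anchor]], [[name|display]].
// Excluding "/" from the name keeps directory-prefixed links out of this pass.
const BARE_WIKILINK_RE = /\[\[([^\[\]|#\/]+)(?:#([^\]|]*))?(?:\|([^\]]*))?\]\]/g;

// Assumed directories to try during prefix expansion (illustrative only).
const PREFIX_DIRS = ["wiki", "topic", "learning"];

// Walk fuzzy match, then prefix expansion, then exact slug, before giving up.
function resolveBareWikilink(name: string, resolver: SlugResolver): string | null {
  const trimmed = name.trim();
  const fuzzyHit = resolver.fuzzy(trimmed);
  if (fuzzyHit !== null) return fuzzyHit;
  const slug = trimmed.toLowerCase();
  for (const dir of PREFIX_DIRS) {
    const candidate = `${dir}/${slug}`;
    if (resolver.has(candidate)) return candidate;
  }
  return resolver.has(slug) ? slug : null;
}

// Pull the bare names (without anchor or display text) out of a page body.
function bareNames(text: string): string[] {
  const names: string[] = [];
  const re = new RegExp(BARE_WIKILINK_RE.source, "g");
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) names.push(m[1].trim());
  return names;
}

// In-memory resolver used only for this demonstration.
const knownSlugs = new Set(["topic/fast-weigh", "2026-05-07-cost-plan"]);
const aliases = new Map([["acme corp", "companies/acme"]]);
const demoResolver: SlugResolver = {
  fuzzy: (n) => aliases.get(n.toLowerCase()) ?? null,
  has: (s) => knownSlugs.has(s),
};

const demoNames = bareNames(
  "See [[Fast-Weigh#Pricing|FW]], [[people/jane-doe]] and [[Acme Corp]]."
);
console.log(demoNames.map((n) => resolveBareWikilink(n, demoResolver)));
```

In this sketch the directory-prefixed link is left for the existing gated pass, while each bare name gets three chances to resolve before it is dropped.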

extract.ts threads the always-on resolver (not the conditional nullResolver) into extractPageLinks for the body pass, with opts.skipFrontmatter wired off --include-frontmatter.
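
A rough sketch of that wiring, under stated assumptions: the `DemoPage` shape, the function-style `LinkResolver`, and the simplified body regex are hypothetical, and the real `extractPageLinks` signature may differ. The point is only that the body pass always uses the real resolver, while the frontmatter pass is toggled by `opts.skipFrontmatter`.

```typescript
// Hypothetical types for illustration; not the actual gbrain definitions.
type LinkResolver = (ref: string) => string | null;
interface ExtractOpts { skipFrontmatter?: boolean }
interface DemoPage { body: string; frontmatterRefs: string[] }

function extractPageLinks(
  page: DemoPage,
  resolver: LinkResolver,
  opts: ExtractOpts = {}
): string[] {
  const refs: string[] = [];

  // Body pass: always runs, always uses the real resolver.
  const re = /\[\[([^\]|#]+)/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(page.body)) !== null) refs.push(m[1].trim());

  // Frontmatter pass: opt-in, skipped when skipFrontmatter is true.
  if (!opts.skipFrontmatter) refs.push(...page.frontmatterRefs);

  const resolved: string[] = [];
  for (const ref of refs) {
    const slug = resolver(ref);
    if (slug !== null) resolved.push(slug);
  }
  return resolved;
}

// Caller side: the flag only controls frontmatter, never the resolver itself.
const includeFrontmatter = false; // stands in for the --include-frontmatter flag
const demoPage: DemoPage = { body: "See [[Fast-Weigh]].", frontmatterRefs: ["Acme"] };
const lowerCaseResolver: LinkResolver = (ref) => ref.toLowerCase();

const bodyOnly = extractPageLinks(demoPage, lowerCaseResolver, {
  skipFrontmatter: !includeFrontmatter,
});
const bodyAndFrontmatter = extractPageLinks(demoPage, lowerCaseResolver, {
  skipFrontmatter: false,
});
console.log(bodyOnly, bodyAndFrontmatter);
```

The design choice this illustrates is that a boolean option replaces the earlier resolver swap, so turning frontmatter off can no longer silence body links as a side effect.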

Tests

test/link-extraction.test.ts: 75 lines covering BARE_WIKILINK_RE shape (anchor + display variants), resolveBareWikilink fuzzy + prefix + exact paths, isBareName negative cases (DIR_PATTERN prefixes still rejected), and extractPageLinks integration with opts.skipFrontmatter under both modes.
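
For readers unfamiliar with the shape guard, the negative cases mentioned above might look something like this. The `isBareName` body here is a guess at the behavior described (reject DIR_PATTERN prefixes and wikilink syntax characters), with a shortened, hypothetical `DIR_PATTERN`; it is not copied from the PR.

```typescript
// Hypothetical, shortened allow-list; the real DIR_PATTERN has more entries.
const DIR_PATTERN = "people|companies|meetings";
const DIR_PREFIX_RE = new RegExp(`^(?:${DIR_PATTERN})/`);

// Shape guard sketch: true only for targets the bare-name pass should handle.
function isBareName(target: string): boolean {
  const t = target.trim();
  if (t.length === 0) return false;        // empty link target
  if (DIR_PREFIX_RE.test(t)) return false; // owned by the directory-gated pass
  return !/[\[\]|#]/.test(t);              // no leftover wikilink syntax
}

const guardResults = [
  isBareName("Fast-Weigh"),
  isBareName("2026-05-07-cost-plan"),
  isBareName("people/jane-doe"),
  isBareName(""),
];
console.log(guardResults);
```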

Local: bun run verify clean, bun test test/link-extraction.test.ts → 103 pass / 0 fail.

Scope note

The FS-source path (extractLinksFromDir) is NOT updated. It uses a different codepath via extractMarkdownLinks + resolveSlug; bare-name wikilinks in FS mode still won't resolve. Most users are on --source db (autopilot uses it); FS is for offline Obsidian-vault mode. Separate concern.

Test plan

  • bun run verify clean
  • bun test test/link-extraction.test.ts → 103/0/0
  • bun run typecheck clean
  • bun run test:e2e (gated on DATABASE_URL)
  • Manual verification on a real wiki corpus — gbrain extract links produces non-zero link counts on pages using [[Bare-Name]] shapes

🤖 Generated with Claude Code

Body wikilinks in wiki/topic/learning content are silently dropped
on every `gbrain extract` pass. Three layered issues:

1. WIKILINK_RE in src/core/link-extraction.ts is gated on
   DIR_PATTERN (people|companies|meetings|...). Wiki/topic/learning
   content uses bare-name wikilinks like `[[Fast-Weigh]]` or
   `[[2026-05-07-cost-plan]]` which fall outside that allow-list —
   the regex never matched, so body refs were invisible to extract.

2. Body wikilinks that DID match were only resolved when
   `--include-frontmatter` was set, because extractPageLinks routed
   ALL refs (body + frontmatter) through `activeResolver` which was
   set to nullResolver when frontmatter was off. Body refs are
   already free of the cost concern that gated frontmatter — they
   surface in markdown the user explicitly typed — so they should
   always resolve.

3. extract.ts called extractPageLinks with one resolver doing both
   jobs. Splitting via opts.skipFrontmatter lets the body pass keep
   the real resolver while frontmatter stays opt-in.

Fixes:

- link-extraction.ts adds BARE_WIKILINK_RE matching
  `[[<name>(#anchor)?(|display)?]]` shapes outside DIR_PATTERN,
  resolved via the new `resolveBareWikilink(name, resolver)` that
  walks fuzzy match + bare-name prefix expansion + exact-slug
  before giving up. Three new exports: BARE_WIKILINK_RE,
  resolveBareWikilink, isBareName (regex shape guard for the
  pre-extract candidate check). extractPageLinks gains an
  opts.skipFrontmatter parameter — when true, the frontmatter
  pass is skipped but body wikilinks still resolve through the
  passed resolver.

- extract.ts threads the always-on `resolver` (not the conditional
  nullResolver) into extractPageLinks for the body pass, with
  opts.skipFrontmatter wired off `--include-frontmatter`.

- test/link-extraction.test.ts: 75 lines covering BARE_WIKILINK_RE
  shape (anchor + display variants), resolveBareWikilink fuzzy +
  prefix + exact paths, isBareName negative cases (DIR_PATTERN
  prefixes still rejected), and extractPageLinks integration with
  opts.skipFrontmatter under both modes.

Scope note: this PR is the wikilink resolver portion of the
original PR garrytan#768 wave. The doctor.ts hint fix that was also in
that wave has been absorbed by upstream master independently
(doctor.ts:2503 now correctly says `Run: gbrain extract all`).
This PR carries only the wikilink resolver — no overlap with
upstream.

FS-source path (extractLinksFromDir) NOT updated. It uses a
different codepath via extractMarkdownLinks + resolveSlug; bare-
name wikilinks in FS mode still won't resolve. Most users are on
--source db (autopilot uses it); FS is for offline Obsidian-vault
mode. Separate concern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>