feat!: FTS5-based relevance search for bear-search-notes (v3.0.0) #114
Replace the LIKE-substring backend of bear-search-notes with a BM25-
ranked, in-memory SQLite FTS5 index that searches titles, body, and
OCR-extracted attachment text. Ship as v3.0.0 with related breaking
simplifications: bear-add-file is file-path-only (base64_content
removed); BEAR_DATABASE_SCHEMA.md and evals/generate-report.js are
removed; eval tasks documented as raw npx promptfoo invocations.
Includes Stage-1 and Stage-2 code-review fixes folded in:
- Tag-join tables resolved at runtime via Z_PRIMARYKEY
(src/infra/bear-schema.ts) instead of hardcoded Z_5TAGS / Z_13TAGS,
which silently break across Bear schema migrations.
- prepareFTS5Term three-regime tokenizer: quoted/grouped passthrough,
single-identifier phrase-quote, multi-word OR-join. Tokenization
uses [\p{L}\p{N}_]+\*? so Cyrillic / Greek / accented Latin survive.
- Tag normalization centralized in JS decodeTagName, called by both
index-build and query paths (SQLite LOWER() is ASCII-only).
- runWithFts5SyntaxRemap maps SQLite errors to plain-language
envelopes; no FTS5 operator vocabulary leaks to the caller.
- Drift detection via MAX(ZMODIFICATIONDATE) + COUNT(*) catches
pre-dated bulk imports MAX alone would miss.
- Snippet-per-result for term queries (FTS5 snippet() with [...]
brackets); 200-char body prefix for filter-only queries.
- Spec, CHANGELOG, manifest descriptions, eval prompts updated to
match the natural-language stance for user-facing surfaces.
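
The three tokenizer regimes listed above can be sketched roughly as follows. This is a minimal illustration, not the project's prepareFTS5Term: the name prepareFts5TermSketch, the quote/parenthesis passthrough test, and the wildcard guard are assumptions. Only the token pattern and the `bear-notes-mcp` example come from the PR text.

```typescript
// Hypothetical sketch; the real prepareFTS5Term may differ in detail.
// Token pattern quoted in the PR: Unicode letters, digits, underscore, optional trailing *.
const TOKEN_RE = /[\p{L}\p{N}_]+\*?/gu;

function prepareFts5TermSketch(raw: string): string {
  const term = raw.trim();

  // Regime 1 (assumed test): the caller already used quotes or grouping,
  // so the input is passed through untouched.
  if (/["()]/.test(term)) return term;

  const tokens = term.match(TOKEN_RE) ?? [];
  if (tokens.length === 0) return term;

  // Regime 2: a single whitespace-free identifier such as bear-notes-mcp
  // becomes one phrase, so its parts must appear consecutively.
  const isSingleWord = !/\s/.test(term);
  const hasWildcard = tokens.some((t) => t.endsWith('*'));
  if (isSingleWord && tokens.length > 1 && !hasWildcard) {
    return `"${tokens.join(' ')}"`;
  }

  // Regime 3: everything else is OR-joined so ranking can reward coverage.
  return tokens.join(' OR ');
}
```

Under these assumptions, `prepareFts5TermSketch('meeting notes')` yields an OR-joined query, while a pre-quoted input is returned unchanged.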
Six focused fixes from a second multi-agent code-review pass against
the FTS5 branch. Each finding was scored independently and survived
the >= 80 threshold of the pipeline:
- README and docs/user/NPM.md re-synced from manifest.json
(release-process step 1 was missed during v3.0.0 prep). The
bear-search-notes description now carries the trailing tag /
date / pinned filter sentence on all three surfaces.
- isFTS5SyntaxError now matches "unknown special query", so bare-
wildcard inputs (`*`, `**`, `* *`) trigger the operator-free
envelope instead of leaking raw SQLite text. Locked with a
parameterized error-shape test alongside the existing arms.
- CHANGELOG drops FTS5 operator vocabulary (`engin*` prefix
wildcard, "tokenize") to match the natural-language stance the
rest of the PR enforces; runWithFts5SyntaxRemap exists for
exactly this reason on the error path.
- searchNotes rejects whitespace-only term/tag inputs instead of
silently degrading to browse-all-recent-notes. Predicate and
spec assembly now agree by trimming up-front. Locked with a
parameterized unit test (no Bear interaction needed — pre-Bear
validation is unit-test scope by design).
- bear-search-notes `term` schema description rewritten in user-
observable behavior; tokenizer regime names ("OR-ranked", "split
into tokens", "consecutive token sequence", "relevance density")
removed from the most LLM-facing surface in the server.
- bear-add-file title-only path resolves the ID via findNotesByTitle
(mirroring bear-open-note's preflight) so the success response
always carries note title + ID + what changed, per the CLAUDE.md
mutation-response rule. Disambiguation: zero matches -> error;
multiple matches -> structured list response with IDs. Title
schema description updated to document the new behavior.
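
The error classification described in this pass can be illustrated with a small sketch. The names isFts5SyntaxErrorSketch and toPlainLanguageError are hypothetical, and the user-facing wording is invented for illustration. The two marker strings are the SQLite FTS5 message fragments the sketch assumes the project matches on.

```typescript
// Hypothetical sketch of the error classification described above.
const FTS5_SYNTAX_MARKERS = ['fts5: syntax error', 'unknown special query'];

function isFts5SyntaxErrorSketch(err: unknown): boolean {
  if (!(err instanceof Error)) return false;
  const message = err.message.toLowerCase();
  return FTS5_SYNTAX_MARKERS.some((marker) => message.includes(marker));
}

// Replaces FTS5 syntax errors with an operator-free message and leaves
// every other error untouched. The message text is an assumption.
function toPlainLanguageError(err: unknown): Error {
  if (isFts5SyntaxErrorSketch(err)) {
    return new Error('The search text could not be understood. Try plain words only.');
  }
  return err instanceof Error ? err : new Error(String(err));
}
```

The point of the second function is that no raw SQLite text reaches the caller on the syntax-error path, which matches the natural-language stance the PR enforces elsewhere.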
Apache 2.0 keeps the same broadly-permissive posture as MIT but
adds two things MIT does not:
1. An explicit patent grant from contributors (§3), terminating
if a downstream party initiates patent litigation against the
project — closes a gap MIT silently leaves open.
2. A NOTICE-file propagation requirement (§4d): any derivative
work must include a readable copy of NOTICE in its distribution,
documentation, or attribution surface. This is the closest
legitimate OSS mechanism for 'credit must travel with the code'
without going copyleft.
End users and direct consumers (npm, MCPB Bundle) see no functional
change; it's purely a redistribution-terms shift. Existing copies
under MIT keep their MIT grant — Apache 2.0 applies going forward.
Files updated:
- LICENSE.md → canonical Apache 2.0 text (unmodified per Apache guidance)
- NOTICE → new attribution file (Serhii Vasylenko, repo URL)
- package.json → SPDX 'MIT' → 'Apache-2.0'
- manifest.json → SPDX 'MIT' → 'Apache-2.0'
- website/{Footer,WhySection,BaseLayout}.astro → user-facing license labels
- CHANGELOG.md → [Unreleased] entry
Rename evals/fts5-promptfooconfig.yaml to evals/fts5-vs-like.yaml to disambiguate it from the new evals/native-vs-fts5.yaml — the A/B comparison of v3.0.0 against Bear's own MCP server (bearcli mcp-server). The standalone evals/promptfooconfig.yaml that the FTS5 work superseded is removed. Trim 9 lines from evals/README.md that referenced the removed generate-report.js workflow.
Seven fixes from the third multi-agent code-review pass against the FTS5 branch: one mid-tier survivor of the >= 80 scorer threshold, six borderline finds applied on user discretion.
- bear-add-file tool description rewritten to mirror bear-open-note's symmetric ID-or-title wording. The schema now offers `title` as an alternative entry point (added in the prior review-fix pass), but the tool description still steered the LLM back through bear-search-notes, directly contradicting the schema.
- fetchTagsForResults adds ORDER BY inside GROUP_CONCAT so the per-note tags array is byte-deterministic across calls. SQLite documents the order without ORDER BY as arbitrary; the existing test used arrayContaining to silently tolerate the non-determinism.
- bear-encoding.ts moved from src/operations/ to src/infra/. fts-index.ts was the only infra module reaching up into operations; bear-encoding is a pure encoding/decoding utility, not business logic. Now infra never imports from operations. SPECIFICATION.md and CLAUDE.md tree updated.
- runWithFts5SyntaxRemap takes (term, fn) instead of (hasTerm, term, fn). hasTerm was derivable from term, and passing both let callers represent impossible states (hasTerm=false, term="actual"). Project rule: no defensive code for impossible internal states.
- searchByQuery wraps non-FTS5 errors in the project's standard "Database error: Failed to ..." envelope, matching database.ts, tags.ts, notes.ts. Pre-remapped FTS5 syntax errors pass through so the LLM still gets the user-facing retry hint.
- searchNotes date-filter pipeline (parse -> snap to start/end of day -> convert to Core Data timestamp) collapses four near-identical branches into one toCoreDataTimestamp helper. Any future change to the pipeline lands in one place.
- bear-search-notes and bear-find-untagged-notes limit schemas tighten to z.number().int().min(1).optional(). Rejects limit=0 and floats at the boundary, removing the truthy-OR vs nullish-coalesce ambiguity in the operations layer.
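
The snap-and-convert half of the date-filter pipeline mentioned above can be sketched like this. It is an illustration only: the name toCoreDataTimestampSketch and its (date, boundary) signature are assumptions, and the real toCoreDataTimestamp helper may also own the parsing step. The one fixed fact is that Core Data timestamps count seconds from 2001-01-01T00:00:00Z.

```typescript
// Seconds between the Unix epoch (1970-01-01Z) and the Core Data epoch (2001-01-01Z).
const CORE_DATA_EPOCH_OFFSET_SECONDS = 978307200;

type DayBoundary = 'start' | 'end';

// Hypothetical helper: snaps an already-parsed date to the start or end of its
// local calendar day, then converts it to a Core Data timestamp.
function toCoreDataTimestampSketch(date: Date, boundary: DayBoundary): number {
  const snapped = new Date(date.getTime()); // copy so the caller's Date is not mutated
  if (boundary === 'start') {
    snapped.setHours(0, 0, 0, 0);
  } else {
    snapped.setHours(23, 59, 59, 999);
  }
  return snapped.getTime() / 1000 - CORE_DATA_EPOCH_OFFSET_SECONDS;
}
```

With a single helper of this shape, each of the four date filters reduces to one call with the matching boundary, which is the consolidation the commit describes.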
bear-rename-tag and bear-delete-tag already strip a leading # from tag inputs via Zod transform, but bear-search-notes and bear-add-tag did not — the schema descriptions said "without # symbol" while the runtime silently accepted "#career" and either returned zero results (bear-search-notes) or wrote a literal "##career" into the note body (bear-add-tag). Mirror the rename/delete-tag transform on both outliers so tag inputs behave consistently across all four tag-handling tools regardless of whether the agent prefixes the # or not.
The position parameter accepts 'beginning' or 'end' but the schema description didn't say what those mean when header is also set. Bear treats position as section-relative when header is set (start or end of the section's direct content) and note-relative otherwise. Without the clarification, agents had to guess whether position: 'beginning' with a header meant "start of the section" or "start of the whole note".
readAttachmentFile rejects symlinks, non-regular files, empty files, and files over 25 MB (MAX_ATTACHMENT_BYTES). The schema previously documented only the no-tilde-expansion rule, leaving agents to discover the other rejections at runtime. Adding the constraints to the description so an agent can decide whether the file is suitable before calling, rather than after the error.
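
The four rejections can be pictured as a check over file metadata. This is a sketch under stated assumptions, not the project's readAttachmentFile: checkAttachmentSketch, StatLike, and the error wording are hypothetical, and 25 MB is assumed to mean 25 * 1024 * 1024 bytes. The `ok`/`error` result shape follows the call site shown in the review hunks.

```typescript
// Assumption: 25 MB is taken as 25 * 1024 * 1024 bytes; the project may define it differently.
const MAX_ATTACHMENT_BYTES_SKETCH = 25 * 1024 * 1024;

// Minimal subset of node:fs Stats, so the check can run without touching disk.
interface StatLike {
  isSymbolicLink(): boolean;
  isFile(): boolean;
  size: number;
}

type AttachmentCheck = { ok: true } | { ok: false; error: string };

// Hypothetical guard covering the rejections listed in the schema description.
function checkAttachmentSketch(stat: StatLike): AttachmentCheck {
  if (stat.isSymbolicLink()) return { ok: false, error: 'Symlinks are not accepted.' };
  if (!stat.isFile()) return { ok: false, error: 'Only regular files are accepted.' };
  if (stat.size === 0) return { ok: false, error: 'The file is empty.' };
  if (stat.size > MAX_ATTACHMENT_BYTES_SKETCH) {
    return { ok: false, error: 'The file is larger than 25 MB.' };
  }
  return { ok: true };
}
```

In real use the argument would come from `lstatSync(path)` in node:fs rather than `statSync`, since lstat reports on the link itself instead of following it.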
bear-search-notes already states "Trashed and archived notes are not included." Two sibling read tools were silent on equivalent exclusions:
- bear-find-untagged-notes excludes trashed + archived from the underlying SQL but the description didn't mention it. Agents seeing fewer results than expected wouldn't know archived-but-untagged notes were filtered out.
- bear-list-tags excludes trashed + archived notes from the noteCount aggregate, and tags with zero active notes (HAVING noteCount > 0) are dropped entirely. Both invariants now stated.
Encrypted notes are also excluded from all three but stay undocumented on purpose: they're invisible to the LLM by design and should not appear in the agent-facing surface.
…phrase
Five mutation tools (bear-add-text, bear-replace-text, bear-add-file,
bear-archive-note, bear-add-tag) plus bear-open-note all named different
subsets of where a note ID could come from in their `id` schema
descriptions — three said "from bear-search-notes" only, one added
"or bear-open-note", another added "or bear-find-untagged-notes",
bear-add-file used "Exact note identifier (ID) obtained from
bear-search-notes". None mentioned bear-create-note, even though it
returns an ID when called with a title.
The asymmetry encouraged two LLM failure modes: wasted bear-search-notes
round-trips after bear-create-note (because bear-add-text's schema said
the ID should be "from bear-search-notes"), and false self-consistency
checking ("this ID came from bear-open-note but bear-add-text wants one
from bear-search-notes — better re-search").
Replacing the per-schema citations with one canonical phrase, "Note
identifier (ID) for an existing Bear note". Server-level instructions
now carry the system-wide ID-flow rule once; schemas stay focused on
per-call invocation. Future ID-returning tools auto-fit without
schema updates.
- Eval doc + comment drift: evals/README.md listed deleted/renamed eval configs (promptfooconfig.yaml, fts5-promptfooconfig.yaml); same dangling filenames appeared in nine inline comments across the surviving YAMLs. Updated all references to fts5-vs-like.yaml / native-vs-fts5.yaml / shared/default-test.yaml.
- Removed "(see Quick Start)" pointer in evals/README.md and the corresponding "See evals/README.md for the manual setup steps" comment in fts5-vs-like.yaml. The Quick Start section was deleted in 5244b97 but both pointers remained.
- Updated evals/README.md "Tool calls (gate ≤5)" wording to "(emitted as a named score, not gated)". The eval framework no longer gates; shared/default-test.yaml emits both metrics with pass: true.
- Added CHANGELOG entry documenting bear-add-file's new runtime restrictions (25 MB cap, symlink rejection, non-regular-file rejection, empty-file rejection). The previous CHANGELOG only recorded the base64_content removal, leaving these breaking input-policy changes undocumented in v3.0.0 release notes.
- Dropped two "See docs/dev/SECURITY.md" pointers in note-tools.ts whose target sections were never written. Local rationale comments retained.
- Strip all leading # symbols from tag inputs (/^#+/) instead of just one (/^#/). Aligns the five Zod transforms in tools/{note,tag}-tools.ts with the existing applyNoteConventions normalization in operations/note-conventions.ts (which uses /^#+|#+$/g). Closes a silent zero-hit where ##career normalized to #career and then failed to match the actual career tag.
- docs(spec): align prepareFTS5Term description with the four-branch implementation. Adds the single-identifier phrase-quote case (`bear-notes-mcp` → `"bear notes mcp"`) that the spec previously collapsed into the multi-word OR-rank branch, and notes the no-wildcard guard rationale.
- fix(notes): parseDateString interprets ISO YYYY-MM-DD inputs in local time. The default branch was calling new Date(dateString), which parses date-only ISO forms as UTC midnight per ECMA-262 §21.4.3.2. Callers snap bounds with local-time setHours, producing previous-day bounds for negative-UTC users (PDT user typing 2026-04-15 landed on 2026-04-14 07:00 UTC). Now matches the today/yesterday/last week branches that already build dates in local time. Component round-trip validation rejects invalid dates (Date(y, m-1, d) silently rolls Feb 30 to Mar 2) to preserve the pre-fix rejection contract. Adds 4 unit tests pinning local-time semantics and explicit-TZ fallthrough.
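
The local-time parsing and round-trip validation in the parseDateString fix can be sketched as below. The function name parseIsoDateLocalSketch is hypothetical and the sketch handles only the date-only ISO form; the relative-date branches and explicit-timezone inputs the commit mentions are out of scope here.

```typescript
// Hypothetical sketch: parse YYYY-MM-DD as local midnight instead of UTC midnight.
function parseIsoDateLocalSketch(input: string): Date | null {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(input.trim());
  if (!match) return null; // other formats are not handled by this sketch

  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);

  // The multi-argument Date constructor interprets its components in local time.
  const date = new Date(year, month - 1, day);

  // Round-trip check: the constructor silently rolls over out-of-range values
  // (for example Feb 30), so compare the components read back with the input.
  const roundTrips =
    date.getFullYear() === year &&
    date.getMonth() === month - 1 &&
    date.getDate() === day;
  return roundTrips ? date : null;
}
```

Because the result is already local midnight, a later local-time `setHours` snap stays on the calendar day the user typed, regardless of UTC offset.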
Pull request overview
This PR ships a breaking v3.0.0 upgrade centered on replacing the previous LIKE-based note search with an in-memory SQLite FTS5 index for relevance-ranked retrieval (BM25) with per-result context snippets, while keeping existing tag/date/pinned filters working (including resilience to Bear Core Data table renumbering). It also hardens bear-add-file against unsafe filesystem inputs, updates docs/evals, and switches licensing to Apache-2.0.
Changes:
- Replace LIKE substring search with an in-memory FTS5 index (BM25 ranking, snippet output, improved token handling) and update tool/docs messaging accordingly.
- Discover Bear Core Data join-table names at runtime (via `Z_PRIMARYKEY`) to avoid hardcoded `Z_5TAGS`-style schema coupling.
- Tighten `bear-add-file` to `file_path`-only with size/type/symlink guards; refresh system/unit tests, eval configs, and license artifacts.
Reviewed changes
Copilot reviewed 45 out of 48 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| website/src/layouts/BaseLayout.astro | Update structured-data license URL to Apache-2.0. |
| website/src/components/WhySection.astro | Update website copy to Apache 2.0 licensing. |
| website/src/components/Footer.astro | Update footer license label to Apache 2.0. |
| tests/system/trash-filtering.test.ts | Make trash-filtering assertions robust under relevance/OR-style matches. |
| tests/system/tag-search.test.ts | Reduce flakiness by avoiding title-term searches that can overmatch under FTS5. |
| tests/system/tag-management.test.ts | Simplify/adjust tag mutation tests to rely on returned IDs and broader assertions. |
| tests/system/replace-text.test.ts | Refactor to use returned IDs; reduce per-test cleanup logic. |
| tests/system/open-note-by-title.test.ts | Simplify tests to rely on create-note ID return and cleanup by prefix. |
| tests/system/note-conventions.test.ts | Narrow to convention-OFF integration scenario; rely on returned IDs. |
| tests/system/inspector.ts | Rework cleanup to query Bear DB directly (prefix LIKE) to avoid FTS token overmatch. |
| tests/system/create-note.test.ts | Keep ID-return verification for titled creation; remove no-title cleanup flow. |
| tests/system/attached-files.test.ts | Switch to file_path attachments and add fixture-backed non-OCR case. |
| tests/system/add-text.test.ts | Use create-note returned IDs; simplify test structure. |
| tests/system/add-tag.test.ts | Consolidate add-tag tests; use returned IDs and multi-tag coverage. |
| tests/fixtures/page.html | Add HTML fixture file for non-OCR attachment test coverage. |
| Taskfile.yml | Remove old eval tasks that referenced deleted promptfoo config/report generator. |
| src/tools/tag-tools.ts | Improve tool descriptions and strip leading # more robustly in schemas. |
| src/tools/note-tools.ts | Add FTS5-oriented search UX messaging/snippets; harden add-file; adjust schemas. |
| src/tools/note-tools.test.ts | Unit tests for new readAttachmentFile security guards and size cap. |
| src/operations/tags.ts | Use runtime schema discovery for tag joins; keep active-note-only semantics. |
| src/operations/notes.ts | Route searching through infra FTS index; fix ISO date-only parsing to local time. |
| src/operations/notes.test.ts | Add tests for ISO date-only parsing and trimmed-criteria validation. |
| src/main.ts | Clarify “always use note IDs from tool output” guidance. |
| src/infra/fts-index.ts | New in-memory FTS5 index builder/query engine (BM25 + snippet + drift rebuild). |
| src/infra/fts-index.test.ts | Comprehensive unit tests for indexing, drift, filtering, token rules, and snippets. |
| src/infra/bear-schema.ts | New runtime Core Data join-table discovery via Z_PRIMARYKEY. |
| src/infra/bear-schema.test.ts | Test schema discovery + CI guard for required FTS5 capabilities in node:sqlite. |
| src/infra/bear-encoding.ts | Centralize tag normalization (Unicode-aware) and remove base64 cleaner. |
| src/infra/bear-encoding.test.ts | Add unit test for Core Data timestamp conversion. |
| src/config.ts | Bump app version to 3.0.0. |
| README.md | Update tool descriptions for relevance search and file attachment behavior. |
| package.json | Bump version to 3.0.0; update promptfoo; change license to Apache-2.0. |
| NOTICE | Add Apache-2.0 NOTICE file for attribution requirements. |
| manifest.json | Bump version to 3.0.0; update tool descriptions; change license to Apache-2.0. |
| LICENSE.md | Replace MIT text with Apache License 2.0. |
| evals/shared/default-test.yaml | Add shared metric emitters for tool calls/turns across eval configs. |
| evals/README.md | Update eval documentation for new multi-config layout and non-gated metrics. |
| evals/promptfooconfig.yaml | Remove legacy single-config promptfoo file. |
| evals/native-vs-fts5.yaml | Add eval config comparing Bear native MCP vs this server’s FTS5 search. |
| evals/generate-report.js | Remove legacy report generator script. |
| evals/fts5-vs-like.yaml | Add eval config comparing v2.11.0 LIKE vs v3.0.0 FTS5 relevance search. |
| docs/user/NPM.md | Update published tool list descriptions for new search and add-file behavior. |
| docs/dev/SPECIFICATION.md | Document the FTS5 in-memory index design + schema discovery approach. |
| docs/dev/BEAR_DATABASE_SCHEMA.md | Remove outdated schema documentation file. |
| CLAUDE.md | Update repo structure notes and test guidance (system vs unit test intent). |
| CHANGELOG.md | Document breaking changes and new search/snippet behavior for v3.0.0. |
… cleanup
feat(sonar): add SonarQube configuration files for project analysis
fix(tools): improve snippet formatting and error handling in note tools
refactor(fts): enhance tag handling and query construction in search functions
try {
  const attachment = readAttachmentFile(file_path);
  if (!attachment.ok) return createErrorResponse(attachment.error);
  const fileData = attachment.data;
  const resolvedFilename = filename || basename(file_path);

name: z
  .string()
  .trim()
  .transform((v) => v.replace(/^#/, ''))
  .transform((v) => v.replace(/^#+/, ''))
  .pipe(z.string().min(1, 'Tag name is required'))
  .describe('Current tag name to rename (without # symbol)'),
new_name: z
  .string()
  .trim()
  .transform((v) => v.replace(/^#/, ''))
  .transform((v) => v.replace(/^#+/, ''))
  .pipe(z.string().min(1, 'New tag name is required'))
When awaitNoteCreation returns null — polling timeout or DB-lookup exception — the response now reads 'Note ID: unknown — the create request was sent, but the new note could not be confirmed. Check in Bear to verify.' instead of silently omitting the Note ID line. Following 1b5f713, info logs are toggle-only, so the previous stderr breadcrumb signaling the timeout is no longer visible by default. The silent omission was already ambiguous regardless: the same line is skipped when no title is provided, leaving callers unable to tell 'no title given' apart from 'lookup timed out' from the response alone. Restores the project's Mutation Response Metadata contract — every note-level mutation should return note ID + title + what changed.
The /^#+/ regex was duplicated across 5 Zod input schemas (tag-tools
and note-tools), and the system tests that exercised #tag / ##tag inputs
through bear-rename-tag and bear-delete-tag were removed earlier in this
PR — leaving the strip behavior unpinned and prone to silent drift.
Extract a single stripTagPrefix helper in src/operations/tags.ts and
swap all five .transform((v) => v.replace(/^#+/, '')) sites to use it.
Pin the helper with five unit cases: single # strip, multiple # strip,
bare name unchanged, lone '#' reduces to empty (downstream Zod min(1)
handles rejection), and an anchor sentinel ('foo#bar' unchanged) that
catches a regression dropping the ^ anchor.
This matches the CLAUDE.md test-strategy exception — input composition
exhaustively unit-tested + one system-test URL roundtrip is sufficient.
Bare-tag system tests for both rename and delete already exist
(tests/system/tag-management.test.ts:21-93), so additional system tests
for #tag input shapes would pay spawn-Inspector / open-Bear cost for
no integration risk.
Addresses Copilot review comment on PR #114.
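
The helper and its five pinned cases are small enough to show in full. This is a sketch built from the regex quoted in the commit message; the real stripTagPrefix in src/operations/tags.ts may differ, and the name stripTagPrefixSketch plus the case table are illustrative.

```typescript
// Sketch of the shared helper: remove every leading '#', leave the rest untouched.
function stripTagPrefixSketch(tag: string): string {
  return tag.replace(/^#+/, '');
}

// The five cases described above, as [input, expected] pairs.
const stripTagPrefixCases: Array<[string, string]> = [
  ['#career', 'career'],   // single # stripped
  ['##career', 'career'],  // multiple # stripped
  ['career', 'career'],    // bare name unchanged
  ['#', ''],               // lone # reduces to empty; a later min(1) check rejects it
  ['foo#bar', 'foo#bar'],  // anchor sentinel: an interior # must survive
];

for (const [input, expected] of stripTagPrefixCases) {
  const actual = stripTagPrefixSketch(input);
  if (actual !== expected) {
    throw new Error(`stripTagPrefixSketch(${JSON.stringify(input)}) returned ${JSON.stringify(actual)}`);
  }
}
```

The last case is the one that guards the `^` anchor: without it, the replace would still pass the first four cases while silently stripping the first run of `#` characters from the middle of a tag.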
Pull request overview
Copilot reviewed 49 out of 52 changed files in this pull request and generated 7 comments.
Comments suppressed due to low confidence (1)
tests/system/open-note-by-title.test.ts:69
- The suite no longer covers the validation path where `bear-open-note` is called with neither `id` nor `title`. Consider restoring a small test for `args: {}` to ensure this error contract stays stable for clients and doesn’t regress during refactors.
"sonarlint.connectedMode.project": {
  "connectionId": "vasylenko",
  "projectKey": "vasylenko_bear-notes-mcp"
}
  const createResult = callTool({
    toolName: 'bear-create-note',
    args: { title: TITLE, tags: TAG },
  }).content[0].text;
  noteId = tryExtractNoteId(createResult)!;
});

const createResult = callTool({
  toolName: 'bear-create-note',
  args: { title, text: 'Note body text here', tags: 'system-test' },
}).content[0].text;
const noteId = tryExtractNoteId(createResult)!;

const createResult = callTool({
  toolName: 'bear-create-note',
  args: { title, text: 'Original content', tags: 'system-test' },
}).content[0].text;
const noteId = tryExtractNoteId(createResult)!;

const createResult = callTool({
  toolName: 'bear-create-note',
  args: { title: noteTitle, text: 'Will be trashed' },
}).content[0].text;
const noteId = tryExtractNoteId(createResult)!;

try {
  const createResult = callTool({
    toolName: 'bear-create-note',
    args: { title: noteTitle, text: 'Will be trashed' },
  }).content[0].text;
  noteId = tryExtractNoteId(createResult) ?? undefined;
  expect(noteId).toBeDefined();

trashNote(noteId!);
trashNote(noteId);
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
description: 'bear-notes-mcp A/B — SVA-28 in-memory FTS5 search'
# Distinct outputPath from evals/native-vs-fts5.yaml so the two evals can run
# sequentially without clobbering each other's results.
outputPath: outputs/results.json

# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
description: 'bear-notes-mcp A/B — v3.0.0 FTS5 vs Bear native MCP'
# Distinct outputPath from evals/fts5-vs-like.yaml so the two evals can run
# sequentially without clobbering each other's results.
outputPath: outputs/results-native.json



Summary
- `bear-search-notes` now ranks results by relevance (BM25) instead of substring-matching note text. Multi-word queries surface notes that cover more of the query terms across title, body, and OCR text from attached images and PDFs.
- Each result carries a snippet with `[...]` markers around the matched terms, so the agent can judge relevance without opening the note.
- A single identifier such as `bear-notes-mcp` or `2026-04-15` is matched as a phrase; in multi-word queries those tokens are treated like any other word.
Why
The old LIKE-substring backend forced agents to issue many narrow follow-up queries to assemble enough context, and ranked nothing — the first matched note won regardless of how well it answered the question. The SVA-28 A/B eval (n=30 prompts/variant) measured 98.3% content correctness on FTS5 vs 76.7% on v2.11.0 LIKE, with roughly 3× fewer tool calls per task. That gap is large enough to ship as the default search backend and makes a v3.0.0 cut justified.
--
Closes SVA-28