
Harden @ottabase/ottasearch query parsing and remove partial reindex limits #133

Draft
Copilot wants to merge 4 commits into main from copilot/implement-ottabase-search


Copilot AI commented Feb 28, 2026

This PR addresses two valid issues in the new in-house search stack: fragile FTS query handling for malformed user input, and incomplete indexing caused by a hard 200-row reindex cap.
It also tightens helper coverage for query normalization edge cases.

  • Problem summary

    • FTS MATCH accepted raw user input, which could produce malformed queries (***, unbalanced symbols) and unstable search behavior.
    • Reindex processed only the first page of records (limit: 200), leaving larger models partially indexed.
  • Changes

    • Safe FTS query normalization
      • Added normalizeFtsQuery(query) in packages/ottasearch/src/search.ts.
      • Strips unsafe FTS tokens, preserves valid unicode alnum tokens, bounds term count, and emits prefix terms (term*) joined with OR.
      • Exported via packages/ottasearch/src/index.ts.
      • worker/routes/ottasearch.ts now passes the normalized query to MATCH and returns an empty result set when no valid terms remain.
    • Full-dataset reindexing
      • Replaced single-shot ModelClass.all({ limit: 200 }) with paged reads (limit + offset) until exhaustion.
      • Reindex now covers all rows for enabled models rather than only the first page.
    • Focused helper tests
      • Extended packages/ottasearch/src/__tests__/search.test.ts with normalization edge cases:
        • special-char-only input
        • whitespace-only input
        • unicode terms
        • term-limit truncation behavior
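
The paged reindex described under "Full-dataset reindexing" is not shown in this PR description. A minimal sketch of the limit + offset loop follows, assuming a reader with the shape `all({ limit, offset })`. The `PagedSource` interface, the `reindexAllSketch` name, and the `indexRow` callback are hypothetical and stand in for whatever `ModelClass.all` and the indexing step actually look like.

```typescript
// Hypothetical reader shape; the real ModelClass.all signature may differ.
interface PagedSource<T> {
  all(options: { limit: number; offset: number }): Promise<T[]>;
}

// Reads pages of `pageSize` rows until the source is exhausted and
// hands every row to `indexRow`. Returns the number of rows indexed.
async function reindexAllSketch<T>(
  source: PagedSource<T>,
  indexRow: (row: T) => Promise<void>,
  pageSize: number = 200,
): Promise<number> {
  let offset = 0;
  let indexed = 0;

  for (;;) {
    const page = await source.all({ limit: pageSize, offset });
    if (page.length === 0) break; // nothing left to read

    for (const row of page) {
      await indexRow(row);
      indexed += 1;
    }

    // A short page means the last rows were just read, so the
    // extra empty read can be skipped.
    if (page.length < pageSize) break;
    offset += page.length;
  }

  return indexed;
}
```

Offset paging only visits every row once if the underlying query returns rows in a stable order, so a sketch like this would typically rely on an explicit ordering (for example by primary key) in the reader.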
  • Illustrative snippet

// Before: raw query directly into FTS MATCH
const query = url.searchParams.get('q')?.trim() ?? '';

// After: normalized + bounded FTS expression
const query = url.searchParams.get('q')?.trim() ?? '';
const ftsQuery = normalizeFtsQuery(query);
if (!ftsQuery) return jsonResponse({ results: [] });

const ftsResult = await env.OBCF_D1
  .prepare(`... WHERE ${OTTASEARCH_FTS_TABLE} MATCH ? ...`)
  .bind(ftsQuery, limit)
  .all();
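
The snippet above calls `normalizeFtsQuery` without showing its body. The following is a rough sketch of the behavior listed under "Safe FTS query normalization", not the code from packages/ottasearch/src/search.ts. The name `normalizeFtsQuerySketch`, the `MAX_TERMS` value of 8, and the lowercasing step are assumptions made for illustration.

```typescript
// Hypothetical cap; the PR does not state the actual term limit.
const MAX_TERMS = 8;

// Turns arbitrary user input into a bounded FTS expression of the form
// `term1* OR term2* ...`, or an empty string when no usable term remains.
function normalizeFtsQuerySketch(
  query: string,
  maxTerms: number = MAX_TERMS,
): string {
  // Keep runs of Unicode letters and digits; quotes, asterisks,
  // parentheses and all other characters act as separators.
  const terms: string[] = query.match(/[\p{L}\p{N}]+/gu) ?? [];

  return terms
    .slice(0, maxTerms) // bound the number of terms
    // Assumption: lowercasing keeps user words such as "OR" or "NOT"
    // from being read as FTS5 operators, which are uppercase only.
    .map((term) => `${term.toLowerCase()}*`)
    .join(' OR ');
}
```

With this sketch, input made only of special characters or whitespace yields an empty string, which matches the early return on `!ftsQuery` in the route, while input such as `Hello, "world` yields `hello* OR world*`.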


Copilot AI and others added 2 commits February 28, 2026 21:42
Co-authored-by: thinkdj <688055+thinkdj@users.noreply.github.com>
Co-authored-by: thinkdj <688055+thinkdj@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Implement Ottabase Search with D1 FTS and vectorization" to "Implement in-house @ottabase/ottasearch (D1 FTS + optional Vectorize), add admin search management UI, and wire global spotlight search" Feb 28, 2026
Co-authored-by: thinkdj <688055+thinkdj@users.noreply.github.com>
Copilot AI changed the title from "Implement in-house @ottabase/ottasearch (D1 FTS + optional Vectorize), add admin search management UI, and wire global spotlight search" to "Harden @ottabase/ottasearch query parsing and remove partial reindex limits" Feb 28, 2026
2 participants