[codex] support TiDB Vector full-text and hybrid search#38112

Open
bkidy wants to merge 1 commit into
langgenius:mainfrom
bkidy:codex/tidb-vector-fulltext-hybrid-search
Conversation

bkidy commented Jun 28, 2026

Important

  1. Make sure you have read our contribution guidelines
  2. Ensure there is an associated issue and you have been assigned to it
  3. Use the correct syntax to link this PR: Fixes #<issue number>.

Summary

Fixes #37145

This PR adds optional full-text and hybrid search support for TiDB Vector.

  • Adds TIDB_VECTOR_ENABLE_FULLTEXT_SEARCH with a default of false so existing TiDB Vector deployments remain semantic-only by default.
  • Creates a TiDB FULLTEXT INDEX ... WITH PARSER MULTILINGUAL on the text column when the feature is enabled.
  • Implements search_by_full_text with FTS_MATCH_WORD, including document-id filtering and score propagation.
  • Exposes full-text and hybrid retrieval methods for TiDB Vector only when the feature flag is enabled.
  • Adds unit coverage for the provider and retrieval-method contract.
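The behavior listed above can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: the helper names (`supported_retrieval_methods`, `build_fulltext_index_sql`, `build_fulltext_query`), the retrieval method strings, the `text` and `meta` column names, and the `meta->>'$.document_id'` filter are all assumptions made for the example. It only builds SQL strings and bind parameters; whether TiDB accepts a bound parameter as the `fts_match_word` query argument should be checked against the TiDB documentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FullTextQuery:
    """A SQL string plus its bind parameters (hypothetical helper type)."""
    sql: str
    params: dict


def supported_retrieval_methods(fulltext_enabled: bool) -> list:
    # Semantic search is always available; full-text and hybrid search
    # are exposed only when the feature flag is on (assumed method names).
    methods = ["semantic_search"]
    if fulltext_enabled:
        methods += ["full_text_search", "hybrid_search"]
    return methods


def build_fulltext_index_sql(table: str) -> str:
    # Assumes the indexed column is named `text`.
    return (
        f"ALTER TABLE `{table}` "
        "ADD FULLTEXT INDEX (`text`) WITH PARSER MULTILINGUAL"
    )


def build_fulltext_query(
    table: str,
    query: str,
    top_k: int,
    document_ids: Optional[list] = None,
) -> FullTextQuery:
    params = {"query": query, "top_k": top_k}
    match_expr = "fts_match_word(:query, `text`)"
    where = [match_expr]

    if document_ids:
        # One named placeholder per document id keeps values out of the SQL text.
        placeholders = []
        for i, doc_id in enumerate(document_ids):
            key = f"doc_id_{i}"
            params[key] = doc_id
            placeholders.append(f":{key}")
        in_list = ", ".join(placeholders)
        where.append(f"meta->>'$.document_id' IN ({in_list})")

    where_clause = " AND ".join(where)
    sql = (
        f"SELECT meta, `text`, {match_expr} AS score "
        f"FROM `{table}` WHERE {where_clause} "
        f"ORDER BY {match_expr} DESC LIMIT :top_k"
    )
    return FullTextQuery(sql=sql, params=params)
```

With the flag disabled, `supported_retrieval_methods(False)` returns only the semantic method, which matches the stated goal of keeping existing deployments semantic-only by default.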

Note: I attempted to assign issue #37145 to myself, but GitHub returned "bkidy not found" when I tried the assignment on the upstream issue. The issue is still linked here.

Screenshots

Before | After
N/A | N/A

Checklist

  • This change requires a documentation update, included: Dify Document
  • I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.
  • I ran make lint && make type-check (backend) and cd web && pnpm exec vp staged (frontend) to appease the lint gods

Validation

  • PYTHONPATH=api/providers/vdb/vdb-tidb-vector/src api/.venv/bin/python -m pytest -o addopts='' api/providers/vdb/vdb-tidb-vector/tests/unit_tests/test_tidb_vector.py api/tests/unit_tests/controllers/console/datasets/test_datasets.py::TestDatasetRetrievalSettingApi::test_tidb_vector_returns_semantic_only_when_fulltext_disabled api/tests/unit_tests/controllers/console/datasets/test_datasets.py::TestDatasetRetrievalSettingApi::test_tidb_vector_returns_full_methods_when_fulltext_enabled -q
  • git diff --check
  • ruff format --check and ruff check on the touched Python files

From Codex

bkidy marked this pull request as ready for review June 28, 2026 16:02
dosubot (bot) added the size:M label (This PR changes 30-99 lines, ignoring generated files.) Jun 28, 2026
Development

Successfully merging this pull request may close these issues.

Support full-text and hybrid search for TiDB Vector (tidb_vector)

2 participants