Qdrant backend by jeremydoumeng · Pull Request #283 · EPFLiGHT/mmore

jeremydoumeng · 2026-04-29T19:44:13Z

Qdrant vector backend

An ARM64-safe alternative to milvus-lite for mmore's index and retrieval
paths. Opt-in via a single YAML field; the default Milvus path is unchanged.

Why

milvus-lite ships x86_64-only wheels, so the embedded mode that mmore
uses by default cannot run on ARM64 hosts (NVIDIA GH200, Apple Silicon,
some cloud ARM instances). Qdrant local mode works on any architecture and
provides the same on-disk, no-server-required experience.

Install

pip install mmore[qdrant]

This installs qdrant-client alongside the existing mmore[index] deps.
On x86_64 you can mix-and-match — both backends can be installed at the
same time.

Note: milvus-lite is gated to platform_machine == 'x86_64'. ARM64
users get the base pymilvus client (so the package still imports) but
cannot use db.backend: milvus.

Usage

Set db.backend: qdrant in your IndexerConfig / RetrieverConfig YAML. The
uri becomes a directory path (Qdrant local mode), not a .db file.

db:
  backend: qdrant          # default is "milvus"
  uri: ./my_qdrant_dir     # directory; created on first use
  name: my_db              # accepted for parity, has no Qdrant equivalent

That's it. Indexer.from_config(...) and Retriever.from_config(...) pick
up the field automatically — no other code changes are needed.

You can also pass an HTTP(S) URL to point at a remote Qdrant server:

db:
  backend: qdrant
  uri: http://qdrant.internal:6333
  name: my_db

What changes

* src/mmore/index/qdrant_client.py: a MilvusClient-shaped adapter
  backed by qdrant-client local mode. Implements only the methods mmore
  actually calls: has_collection, list_collections,
  get_collection_stats, prepare_index_params, create_collection,
  describe_index, insert, delete, flush, query, hybrid_search,
  close.
* Indexer.from_config / Retriever.from_config: one-line branch on
  db.backend to construct either MilvusClient (default, unchanged) or
  the Qdrant adapter.
* pyproject.toml: adds the [qdrant] extra and gates milvus-lite to
  x86_64.

No other call sites change. No public signatures change. No existing tests
modified.
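The branch on db.backend described above can be pictured roughly as follows. This is a minimal sketch, not the PR's code: `DBConfig`, `select_backend`, and the returned labels are hypothetical stand-ins, and using the URI scheme to tell a remote server from a local directory is an assumption.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class DBConfig:
    """Hypothetical stand-in for the `db:` section of the YAML config."""

    uri: str
    name: str = "my_db"
    backend: str = "milvus"  # the default stays Milvus, as in the PR description


def select_backend(db: DBConfig) -> str:
    """Return a label for the client that would be constructed.

    The real code would build MilvusClient or the Qdrant adapter here;
    strings are returned so the sketch runs without either dependency.
    """
    if db.backend == "qdrant":
        # Assumption: an http(s) scheme means a remote server,
        # anything else is treated as a local-mode directory.
        scheme = urlparse(db.uri).scheme
        return "qdrant-remote" if scheme in ("http", "https") else "qdrant-local"
    # Per the PR text, every other value takes the unchanged Milvus path.
    return "milvus"
```

For example, `select_backend(DBConfig(uri="./my_qdrant_dir", backend="qdrant"))` yields the local-mode label, while a config without a `backend` field keeps the Milvus path.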

Caveats

* Hybrid fusion uses RRF. Qdrant's local mode supports Reciprocal Rank
  Fusion; the WeightedRanker(w_dense, w_sparse) weights passed by mmore
  are accepted but ignored (RRF is weight-agnostic). Top-k overlap with
  Milvus is high in practice but rankings will differ slightly.
* String IDs → UUIDs. Qdrant requires unsigned-int or UUID point IDs.
  mmore's string chunk IDs are mapped via uuid5 deterministically; the
  original is preserved in payload under _str_id and surfaced as the id
  field in all results.
* Partitions are emulated. Qdrant has no native partition concept.
  partition_name=... is stored in the payload under _partition;
  partition_names=[...] on retrieval translates to a payload-field
  filter.
* No fallback. If the Qdrant adapter raises, the user gets a normal
  exception; there is no automatic switch to Milvus.
* Local-mode file lock. Qdrant local mode holds a file lock on the
  data directory. Close the adapter (indexer.client.close()) before
  opening another client on the same path.
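The ID and partition caveats above can be illustrated with plain dictionaries. This is a sketch under stated assumptions: the helper names are hypothetical, the UUID namespace is a guess (the PR only says uuid5 is used), and whether the adapter hides `_partition` from results is not stated in the source.

```python
import uuid
from typing import Optional

# Assumption: the namespace is arbitrary but must stay fixed, otherwise the
# same chunk ID would map to a different point ID on re-indexing.
_NAMESPACE = uuid.NAMESPACE_URL


def to_point_id(str_id: str) -> str:
    """Map a string chunk ID to a deterministic UUID string."""
    return str(uuid.uuid5(_NAMESPACE, str_id))


def build_point(str_id: str, text: str, partition_name: Optional[str] = None) -> dict:
    """Build a plain-dict stand-in for a Qdrant point."""
    payload = {"_str_id": str_id, "text": text}
    if partition_name is not None:
        # Partitions are emulated with an ordinary payload field.
        payload["_partition"] = partition_name
    return {"id": to_point_id(str_id), "payload": payload}


def to_result(point: dict) -> dict:
    """Surface the original string ID as `id` and drop the internal fields."""
    payload = dict(point["payload"])
    original_id = payload.pop("_str_id")
    payload.pop("_partition", None)
    return {"id": original_id, **payload}
```

Because uuid5 is a pure function of the namespace and the name, indexing the same chunk twice produces the same point ID, which is what makes re-indexing idempotent in this sketch.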

Smoke test

A standalone script at the repo root:

python test_qdrant_pipeline.py

Indexes 5 toy documents into each available backend, runs 3 retrieval
queries, and prints a side-by-side top-1 comparison if both backends are
installed. No model weights are downloaded — the script uses
FakeEmbeddings for the dense model and a stub sparse model so it runs
offline in a few seconds.

Expected output on ARM64 (Qdrant only):

[1/3] Indexing 5 documents...
      Inserted: 5 chunks
[2/3] Running retrieval for 3 queries...
  Q: When was Barack Obama born?
  A: Barack Obama was born on August 4, 1961, in Honolulu, Hawaii.…
  Q: Who founded Google?
  A: Google was founded by Larry Page and Sergey Brin in September 1998.…
  Q: Where is the Eiffel Tower located?
  A: The Eiffel Tower is located on the Champ de Mars in Paris, France.…
[3/3] Checking model metadata round-trip...
      dense  model: debug  ✓
      sparse model: naver/splade-cocondenser-selfdistil  ✓
  Backend QDRANT — ALL CHECKS PASSED ✓

Migrating an existing collection

Collections created by MilvusClient are not directly readable by
QdrantMilvusClient — they live in different on-disk formats. To switch
an existing index over to Qdrant:

1. Run the indexer pipeline that produced your Milvus collection.
2. Switch db.backend to qdrant and point db.uri at a fresh
   directory.
3. Re-index from the same source documents.

There is no automatic conversion path; the adapter exists to give ARM64
users a working backend, not to migrate Milvus data.

milvus-lite ships x86_64-only wheels, so the default mmore index/retrieval path cannot run on ARM64 hosts (NVIDIA GH200, Apple Silicon, etc.). This change adds an opt-in Qdrant backend selected via a new YAML field db.backend: qdrant. The default path is unchanged: existing configs continue to instantiate MilvusClient exactly as before, so this is a strict superset of upstream behaviour.

Implementation
--------------

* QdrantMilvusClient is a drop-in MilvusClient-shaped adapter backed by qdrant-client local mode. Only the methods mmore actually uses are implemented: has_collection, list_collections, get_collection_stats, prepare_index_params, create_collection, describe_index, insert, delete, flush, query, hybrid_search, close.
* Indexer.from_config and Retriever.from_config gain a one-line branch: if db.backend == "qdrant" they construct the adapter, otherwise they build a MilvusClient verbatim. No other call sites change.
* New extra: pip install mmore[qdrant] (sibling of mmore[index]).
* milvus-lite is gated to platform_machine == 'x86_64' so ARM64 users can still install mmore[index] for the base pymilvus client.

Caveats
-------

* Hybrid fusion uses Qdrant's RRF; the WeightedRanker weights passed by mmore are accepted but ignored (RRF is weight-agnostic). Top-k overlap with Milvus is high in practice.
* String chunk IDs are mapped to UUIDs via uuid5 deterministically and the original is stored in the payload.
* Logical partitions (partition_name=) are emulated via a payload field; Qdrant has no native partition concept.

A standalone smoke-test (test_qdrant_pipeline.py) indexes 5 toy documents through both backends with no model downloads and prints a top-1 comparison so reviewers can see the parity for themselves.
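The caveat about ignored WeightedRanker weights follows from how Reciprocal Rank Fusion scores results: only each item's rank in each list enters the formula. The sketch below shows the standard RRF formula in plain Python; it is not Qdrant's implementation, and `k = 60` is a commonly used constant assumed here, not necessarily the value Qdrant uses.

```python
from collections import defaultdict
from typing import Dict, List


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked ID lists with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) per ID, with rank starting at 1.
    Only ranks matter, so no per-list weight enters the score.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


dense = ["a", "b", "c"]   # hypothetical dense-search ranking
sparse = ["b", "c", "a"]  # hypothetical sparse-search ranking
fused = rrf_fuse([dense, sparse])  # ['b', 'a', 'c']
```

A weighted ranker combines raw similarity scores instead, which is why the two backends can agree on most of the top-k set while ordering it slightly differently.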

Retriever inherits from langchain's BaseRetriever, which is a pydantic BaseModel. The previous `client: MilvusClient` annotation made pydantic reject the QdrantMilvusClient adapter at instantiation:

    pydantic_core._pydantic_core.ValidationError: 1 validation error for Retriever
    client
      Input should be an instance of MilvusClient [type=is_instance_of, input_value=<...QdrantMilvusClient...>]

Relaxed the annotation to `client: Any` and added a comment pointing at qdrant_client.py for the shared surface. The Indexer class is not pydantic-validated and needs no change.

Move the post-`sys.path.insert` imports together at the top of the file and silence E402 with `# noqa`, since the path tweak must run before mmore is importable.

* scripts/build_qdrant_alps.sh: reproducible Qdrant compile for aarch64 64K-page systems (Alps GH200). Sets JEMALLOC_SYS_WITH_LG_PAGE=16, pins protoc 34.1, defaults to v1.17.1.
* docs/QDRANT_BUILD.md: why the prebuilt aarch64 binary crashes on Alps (jemalloc page-size mismatch) and what the script does about it.
* docs/qdrantcolpali_design.md: design + validation of QdrantColpaliManager (native multi-vector / MAX_SIM, deterministic IDs, gRPC timeout tuning, synthetic correctness test, real-PDF integration test).
* tests/test_qdrant_server.py: PR EPFLiGHT#283's QdrantMilvusClient against a running server (smoke).
* tests/test_qdrant_colpali.py: synthetic 5-page MAX_SIM correctness test.
* tests/test_colpali_real.py: real PDF retrieval (COVID/LLaVA/calendar).

fabnemEPFL · 2026-05-29T13:11:57Z

News?

jeremydoumeng · 2026-05-29T15:30:08Z

> News?

This PR implements Qdrant-lite. It works as is, though it needs more documentation. The issue is that the benchmarks were run on the qdrant-server version, which was a much heavier rework than this addition and may conflict substantially with the current master.

JCHAVEROT · 2026-05-29T19:53:38Z

Could you please provide a short step-by-step tutorial on how you use it with CSCS?

Ideally as a markdown file next to rcp_and_production.md in the folder docs/source/advanced_usage/

…ite pin Master added milvus-lite==2.5.1 as a separate dep; PR makes it ARM64-conditional via 'platform_machine == x86_64'. Resolution keeps both: master's 2.5.1 pin AND the ARM64 guard. ARM64 users install mmore[qdrant] and switch to db.backend: qdrant.

jeremydoumeng · 2026-06-01T13:39:36Z

The documentation has been added

JCHAVEROT

Hi @jeremydoumeng,

Good job with Qdrant. I tested it locally on my ARM64 computer and could successfully run RAG over the DB created by the Qdrant backend.

I made a few comments about the doc; they should be quick to handle. In any case, let me know.

Once you have detailed the CSCS part a bit more, I'll test it there.

JCHAVEROT · 2026-06-01T14:50:50Z

+```{important}
+The prebuilt Qdrant **server** binary for aarch64 ships a jemalloc compiled
+for 4 KB pages and crashes on GH200 with
+`<jemalloc>: Unsupported system page size`. Embedded mode bypasses this
+because it never loads the Rust binary. For server-mode workloads on Alps
+you need a custom Qdrant build (see
+[qdrant-alps](https://github.com/jeremydoumeng/qdrant-alps)).
+```


It's not clear what needs to be done to solve the problem. Clicking on the link just leads to a new fork repo, with no indication of what to do.

Ideally, list all the necessary commands in this file in a user-friendly way, so that we don't even have to leave the documentation.

…cscs link

- Use `mmore index` / `mmore rag --config-file` instead of calling run_index / run_rag directly, for consistency with the other docs.
- Fix the index config: wrap settings in the `indexer:` section with correct nesting (it was flat and would not load).
- Replace the qdrant-alps reference with qdrant-cscs and inline the server build/launch commands so the server-mode path is self-contained.

JCHAVEROT

Please remove all the # noqa comments you introduced; they are not used anywhere else in the codebase. This is a very bad practice that can end up being dangerous, as you're not solving the problems, just hiding them.

JCHAVEROT · 2026-06-04T15:46:07Z

+For the server-mode path on Alps, use
+[qdrant-cscs](https://github.com/jeremydoumeng/qdrant-cscs): it ships a build
+script that compiles a Qdrant binary patched for GH200's 64 KB pages, and a
+Slurm wrapper that starts the server. In short:
+
+```bash
+git clone https://github.com/jeremydoumeng/qdrant-cscs.git && cd qdrant-cscs
+./scripts/build_qdrant_alps.sh              # one-time build (~5 min)
+sbatch scripts/start_qdrant_server.sbatch   # serves on 127.0.0.1:6333
+```
+
+Then point this guide's `db.uri` at the server URL
+(`http://127.0.0.1:6333`) instead of a directory path; everything else in the
+index/RAG configs stays the same. See that repo's README for the full recipe.


These instructions should be given earlier, as we cannot run the aforementioned commands without them.

JCHAVEROT · 2026-06-04T15:47:46Z

+def _milvus_filter_to_qdrant(
+    expr: Optional[str],
+    partition_names: Optional[List[str]] = None,
+):
+    """Convert a Milvus filter expression to a ``qdrant_client.models.Filter``.
+
+    Supported patterns (the only ones mmore uses):
+
+    * ``field in ["a", "b", ...]``
+    * ``field == 'value'``
+    * ``field != 'value'`` (including ``field != ""``)
+
+    Anything else raises ``ValueError`` so unsupported patterns surface
+    loudly instead of silently returning the wrong rows.
+    """


This function is awfully ugly.

It uses regexes to parse Milvus filter strings back into Qdrant filter objects. We need it because QdrantMilvusClient is a replacement for MilvusClient (which takes filter strings), but this is an anti-pattern.

As Milvus remains our primary vector DB, we can keep this for now. Later, we can create a shared filter object that both clients use directly, so that there are no strings to convert from.
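To make the discussion concrete, here is a minimal sketch of the kind of regex-based conversion the docstring describes. It is not the PR's implementation: the function name and the plain-dict output are hypothetical (the real function returns a `qdrant_client.models.Filter`), and only the three documented patterns are handled.

```python
import ast
import re
from typing import Dict, List, Optional

_IN_RE = re.compile(r"^\s*(\w+)\s+in\s+(\[.*\])\s*$")
_CMP_RE = re.compile(r"""^\s*(\w+)\s*(==|!=)\s*(['"])([^'"]*)\3\s*$""")


def milvus_filter_to_dict(
    expr: Optional[str],
    partition_names: Optional[List[str]] = None,
) -> Dict[str, list]:
    """Translate a Milvus filter string into must / must_not condition lists."""
    must: list = []
    must_not: list = []
    if partition_names:
        # Emulated partitions become a filter on the `_partition` payload field.
        must.append({"key": "_partition", "any": list(partition_names)})
    if expr and expr.strip():
        m_in = _IN_RE.match(expr)
        m_cmp = _CMP_RE.match(expr)
        if m_in:
            try:
                values = ast.literal_eval(m_in.group(2))
            except (ValueError, SyntaxError) as exc:
                raise ValueError(f"unsupported filter expression: {expr!r}") from exc
            must.append({"key": m_in.group(1), "any": list(values)})
        elif m_cmp:
            field, op, _quote, value = m_cmp.groups()
            condition = {"key": field, "value": value}
            (must if op == "==" else must_not).append(condition)
        else:
            # Fail loudly rather than silently returning the wrong rows.
            raise ValueError(f"unsupported filter expression: {expr!r}")
    return {"must": must, "must_not": must_not}
```

The value pattern excludes quote characters on purpose, so a compound expression such as `a == 'x' and b == 'y'` is rejected instead of being misread as a single comparison. A shared filter object, as suggested in the comment, would remove the need for this parsing step entirely.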

Co-authored-by: Jérémy Chaverot <chaverotjrmy7@gmail.com>

…ild recipe Address review: full indexer: config section with correct indentation, python3 -m mmore commands matching other docs, env setup before any commands, and the aarch64 server build steps listed inline instead of linking out to a fork.

Drop the sys.path hack so imports sit at the top (no E402) and use importlib.util.find_spec for backend availability checks (no F401). The script now requires mmore to be installed, matching the docs.
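The availability check mentioned in this commit can be sketched as follows. `backend_available` is a hypothetical helper name, and the two module names probed at the end are assumptions about what the script looks for.

```python
import importlib.util


def backend_available(module_name: str) -> bool:
    """Return True if `module_name` is importable, without importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for dotted names whose parent package is
        # missing, or for modules with a broken __spec__.
        return False


# Assumed import names for the two backends.
available = {name: backend_available(name) for name in ("pymilvus", "qdrant_client")}
```

Since nothing is imported just to test for its presence, there is no unused import for the linter to flag, and no suppression comment is needed.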

# Conflicts: # pyproject.toml

jeremydoumeng · 2026-06-10T14:09:29Z

Hi @JCHAVEROT, all points addressed:

- milvus-lite pin reverted to unconditional, as suggested
- all noqa removed; imports are now legitimately at the top and the availability checks use importlib.util.find_spec, so nothing is suppressed
- docs reworked: python3 -m mmore CLI throughout, valid indexer: config example, env setup before any commands, and the full aarch64 Qdrant build recipe inlined (no external links)
- merged latest master (resolved the colvision extra conflict in pyproject.toml) and updated uv.lock for the qdrant extra

Smoke test passes end-to-end on a GH200 node.

JCHAVEROT

I updated the pyright.yml workflow to also include your "qdrant" extra. It seems that three of the 4 type-check issues come from your additions (the remaining one, concerning the processors, cannot be solved currently). Could you take a look, please?

Other than that, the doc cscs.md looks much cleaner, and I could run everything successfully on the CSCS cluster.

One last thing: could you please add one sentence to indexing.md mentioning that Qdrant is available as an alternative to Milvus, with a hyperlink to your doc?

JCHAVEROT · 2026-06-15T10:30:49Z

+class StubSparseEmbedding(BaseSparseEmbedding):
+    """Returns a deterministic sparse vector keyed by word hash."""
+
+    def embed_query(self, query: str) -> Dict[int, float]:
+        return {hash(w) % 512: 1.0 for w in query.split()}
+
+    def embed_documents(self, texts: List[str]) -> List[Dict[int, float]]:
+        return [self.embed_query(t) for t in texts]
+
+
+_orig_sparse_from_config = _sparse_base.SparseModel.from_config
+
+
+@classmethod  # type: ignore[misc]
+def _stub_sparse_from_config(cls, config):
+    return StubSparseEmbedding()
+
+
+_sparse_base.SparseModel.from_config = _stub_sparse_from_config
+
+
+# ── Toy corpus ────────────────────────────────────────────────────────────────
+DOCS = [
+    MultimodalSample(
+        text="Barack Obama was born on August 4, 1961, in Honolulu, Hawaii.",
+        modalities=[],
+        metadata={"source": "wikipedia"},
+    ),
+    MultimodalSample(
+        text="Google was founded by Larry Page and Sergey Brin in September 1998.",
+        modalities=[],
+        metadata={"source": "wikipedia"},
+    ),
+    MultimodalSample(
+        text="The Eiffel Tower is located on the Champ de Mars in Paris, France.",
+        modalities=[],
+        metadata={"source": "wikipedia"},
+    ),
+    MultimodalSample(
+        text="The Python programming language was created by Guido van Rossum.",
+        modalities=[],
+        metadata={"source": "wikipedia"},
+    ),
+    MultimodalSample(
+        text="Mount Everest is the world's highest mountain above sea level.",
+        modalities=[],
+        metadata={"source": "wikipedia"},
+    ),
+]


We already have a class creating fake embeddings and a collection of MultimodalSamples in conftest.py. Please reuse them, updating them if necessary (just be careful not to break other tests relying on them).
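One detail worth noting about the quoted stub: Python randomizes the built-in `hash()` of strings per process unless `PYTHONHASHSEED` is fixed, so its vectors are stable within one run but not across runs. If cross-run reproducibility matters, a checksum-based variant is one option. The sketch below is an illustration only; the function name is hypothetical and the 512-bucket size mirrors the stub.

```python
import zlib
from typing import Dict


def stable_sparse_vector(text: str, dim: int = 512) -> Dict[int, float]:
    """Map each whitespace-separated word to a bucket via CRC32.

    Unlike the built-in hash(), CRC32 gives the same value in every
    process, so the vector is reproducible across runs.
    """
    return {zlib.crc32(word.encode("utf-8")) % dim: 1.0 for word in text.split()}
```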

JCHAVEROT · 2026-06-15T10:30:53Z

Your test file is not in the tests/ folder, so it cannot run in the CI alongside the other tests (you may want to change the tests.yml workflow to also install the "qdrant" extra).

JCHAVEROT · 2026-06-15T10:53:57Z

Please take a look at the variables typed as Any in this file, and try to replace them with the correct types where possible.

jeremydoumeng added 3 commits April 29, 2026 17:52

Fix ruff I001 import order in test_qdrant_pipeline.py

a09b56c


JOMENGO added 3 commits May 31, 2026 19:40

docs: add CSCS Alps quickstart for embedded Qdrant

e487a3a

lint: apply ruff format + autofix (matches pre-commit CI hooks)

3e18208

JCHAVEROT added the documentation and enhancement labels Jun 1, 2026

JCHAVEROT assigned jeremydoumeng Jun 1, 2026

JCHAVEROT reviewed Jun 1, 2026


JCHAVEROT reviewed Jun 4, 2026


jeremydoumeng and others added 4 commits June 10, 2026 15:26

Update pyproject.toml

d918bef


Remove noqa suppressions from smoke test

6c2b493


Merge remote-tracking branch 'origin/master' into qdrant-backend

0b777fc


JCHAVEROT added 2 commits June 15, 2026 12:18

Merge remote-tracking branch 'origin/master' into feat/qdrant

31a83f5

workflow: update pyright.yml to include qdrant extra

c94e30d

JCHAVEROT reviewed Jun 15, 2026



4 participants
