Add Similarity Search optimization guides #20
Conversation
- Remove BIOS and hardware recommendation sections
- Add high-level LVQ and LeanVec description
- Replace benchmarks with data from Redis tech dive blog
- Mention SVS can be used directly, integrations in progress
- Remove contributing section
- Focus on compression selection and performance data

- Add FAISS optimization guide with SVS index types
- Cover IndexSVSVamana, IndexSVSVamanaLVQ, IndexSVSVamanaLeanVec
- Include factory string format and examples
- Add compression selection guide
- Include benchmark data from Intel SVS
- Update overview and main README

- Remove benchmark section (no official FAISS SVS performance data)
- Update overview to remove specific performance claims
- Clarify Intel-only requirement in FAQ
- Keep technical how-to guidance

- Add licensing warning for LVQ/LeanVec (AGPLv3/SSPLv1 incompatible)
- Add language spec to FAISS factory string code block

| - High-performance graph-based similarity search optimized for Intel CPUs |
| - Significant memory reduction with LVQ and LeanVec compression |
| - Best performance on Intel Xeon with AVX-512 support |
| ### IndexSVSVamanaLVQ |
| Combines Vamana with Locally-adaptive Vector Quantization (LVQ). LVQ applies per-vector normalization and scalar quantization, achieving up to 4x memory reduction while maintaining high accuracy. |
Mention https://arxiv.org/abs/2304.04759 ?
| ### IndexSVSVamanaLeanVec |
| Extends LVQ with dimensionality reduction. Best for high-dimensional vectors (768+ dimensions), achieving up to 8-16x memory reduction. Particularly effective for text embeddings from large language models. |
Mention https://arxiv.org/abs/2312.16335 ?
| ### From PyPI (with Intel optimizations) |
| ```bash |
| pip install faiss-cpu |
As of today, faiss-cpu@1.13.2 from pip does not include SVS indices.
With all the other FAISS-related questions and pending PRs, would it be worth just dropping the FAISS doc for now and adding it back later?
Sure. I suggest moving the FAISS doc to its own (draft) PR with all comments implemented and waiting until SVS is released with FAISS. Then we can go ahead and merge the rest.
Agreed. FAISS doc removed from this PR. Will create a separate draft PR with all your feedback addressed once SVS is released with FAISS.
| ### From Conda (Intel channel) |
| ```bash |
| conda install -c conda-forge faiss-cpu |
As of today, the combination of faiss packages does not include SVS indices (tested with python 3.12)
faiss conda-forge/linux-64::faiss-1.9.0-py312hf23773a_0_cpu
faiss-cpu conda-forge/linux-64::faiss-cpu-1.9.0-h718b53a_0
| ### Choosing Compression |
| **Rule of thumb:** |
| - Dimensions < 768: Use LVQ (LVQ8 or LVQ4x8) |
| - Dimensions < 768: Use LVQ (LVQ8 or LVQ4x8) |
| - Dimensions < 768: Use LVQ (LVQ8x0 or LVQ4x8) |
| index = faiss.index_factory(1536, "SVSVamana32,LeanVec4x8_384") |
| ``` |
| Lower reduced dimensions = faster search and less memory, but may reduce recall. |
| Lower reduced dimensions = faster search and less memory, but may reduce recall. |
| If no suffix is specified, `d/2` will be used as default. |
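For reference, a minimal end-to-end sketch of this factory string, assuming a FAISS build that actually ships the SVS indices (which, per the packaging comments above, today's pip and conda packages do not); the data and the train-before-add step are illustrative:

```python
import numpy as np
import faiss

d = 1536                                            # embedding dimensionality
xb = np.random.rand(10_000, d).astype("float32")    # database vectors
xq = np.random.rand(16, d).astype("float32")        # query vectors

# Vamana graph with max degree 32, LeanVec 4x8 compression, reduced to 384 dims.
# Omitting the "_384" suffix would fall back to the d/2 default discussed above.
index = faiss.index_factory(d, "SVSVamana32,LeanVec4x8_384")

# LeanVec learns its dimensionality-reduction transform from data,
# so train before adding if the index reports it is not trained yet.
if not index.is_trained:
    index.train(xb)

index.add(xb)
D, I = index.search(xq, 10)   # distances and ids of the 10 nearest neighbors
```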
| ``` |
| ### Multi-threaded Search |
Maybe add that we parallelize over queries? I.e., the queries passed to index.search() should contain at least n_threads elements.
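For illustration, a minimal sketch of that point on a standard FAISS build (an `IndexFlatL2` stands in for the SVS index here): search parallelism is over the query batch, so one batched `search()` call should carry at least as many queries as threads.

```python
import numpy as np
import faiss

d = 768
xb = np.random.rand(100_000, d).astype("float32")
index = faiss.IndexFlatL2(d)          # stand-in for an SVS index in this sketch
index.add(xb)

# FAISS parallelizes search over the queries in a batch, so a single
# index.search() call should contain at least n_threads queries to keep
# all cores busy; avoid looping over queries one at a time in Python.
n_threads = faiss.omp_get_max_threads()
faiss.omp_set_num_threads(n_threads)

xq = np.random.rand(max(1024, n_threads), d).astype("float32")  # batched queries
D, I = index.search(xq, 10)
```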
| ```python |
| import faiss |
| print(hasattr(faiss, 'IndexSVSVamana')) # True if SVS is available |
| print(hasattr(faiss, 'IndexSVSVamana')) # True if SVS is available |
| "SVS" in faiss.get_compile_options() # True if SVS is available |
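For illustration, a small runtime guard built on the suggested check; the dimensionality and the `HNSW32` fallback are placeholders, and the SVS factory string assumes a build that includes SVS:

```python
import faiss

d = 768  # placeholder dimensionality

# get_compile_options() reports build flags, so it works even before any SVS
# class is referenced; hasattr() is a reasonable secondary check.
if "SVS" in faiss.get_compile_options():
    index = faiss.index_factory(d, "SVSVamana32,LeanVec4x8_384")  # SVS-enabled build
else:
    index = faiss.index_factory(d, "HNSW32")  # fallback available in every build
```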
| **A:** Try these adjustments: |
| 1. Increase `search_window_size` (e.g., 50 or 100) |
| 2. Use higher-bit compression (LVQ4x8 → LVQ8) |
| 2. Use higher-bit compression (LVQ4x8 → LVQ8) |
| 2. Use higher-bit compression (LVQ4x8 → LVQ8x0) |
ahuber21 left a comment
Dropping my change request to not block anything while I'm OOO. Still would like to see the FAISS doc sorted out though.
@napetrov Do you plan to drop the FAISS document in this PR?
Yes, removed. That will give a couple of other people a chance to look at this PR.
| ### Creating an SVS-VAMANA Index |
| ```bash |
Fixed, removed bash markers from Redis command blocks.
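For illustration, the same index creation issued from a client; a minimal redis-py sketch with placeholder index, key-prefix, and field names, setting only the required TYPE/DIM/DISTANCE_METRIC attributes (requires a Redis build with SVS-VAMANA support, per the FAQ discussion below):

```python
import redis

r = redis.Redis(host="localhost", port=6379)

# SVS-VAMANA vector field with the three required attributes
# (6 = number of attribute tokens that follow).
r.execute_command(
    "FT.CREATE", "doc_idx",
    "ON", "HASH", "PREFIX", "1", "doc:",
    "SCHEMA", "embedding", "VECTOR", "SVS-VAMANA", "6",
    "TYPE", "FLOAT32",
    "DIM", "768",
    "DISTANCE_METRIC", "L2",
)
```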
software/similarity-search/README.md (outdated)
| ## Intel Scalable Vector Search (SVS) |
| [Intel Scalable Vector Search (SVS)](https://intel.github.io/ScalableVectorSearch/) is a high-performance library for vector similarity search, optimized for Intel hardware. SVS can be used directly as a standalone library, and we are working on integrating it into various popular solutions to bring these optimizations to a wider audience. |
| [Intel Scalable Vector Search (SVS)](https://intel.github.io/ScalableVectorSearch/) is a high-performance library for vector similarity search, optimized for Intel hardware. SVS can be used directly as a standalone library, and we are working on integrating it into various popular solutions to bring these optimizations to a wider audience. |
| [Intel Scalable Vector Search (SVS)](https://intel.github.io/ScalableVectorSearch/) is a high-performance library for vector similarity search, optimized for Intel hardware. SVS can be used directly as a standalone library, and is integrated into popular solutions to bring these optimizations to a wider audience. |
Good catch — SVS is already integrated into Redis, not just 'working on it'. Fixed.
software/similarity-search/README.md (outdated)
| Both LVQ and LeanVec support two-level compression schemes: |
| 1. **Level 1**: Fast candidate retrieval using compressed vectors |
| 2. **Level 2**: Re-ranking using residual encoding for accuracy |
| 2. **Level 2**: Re-ranking using residual encoding for accuracy |
| 2. **Level 2**: Re-ranking for accuracy (LVQ encodes residuals, LeanVec encodes the full dimensionality data) |
Applied — important distinction between LVQ residuals and LeanVec full dimensionality encoding.
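For illustration, a toy NumPy sketch of the two-level idea; this is not SVS's implementation, it only shows level-1 candidate retrieval on crude per-vector 8-bit codes followed by level-2 re-ranking on the full-precision vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
xb = rng.standard_normal((10_000, 128))   # database vectors
q = rng.standard_normal(128)              # one query

# Level 1: crude per-vector 8-bit scalar quantization for fast candidate retrieval
lo, hi = xb.min(axis=1, keepdims=True), xb.max(axis=1, keepdims=True)
codes = np.round((xb - lo) / (hi - lo) * 255).astype(np.uint8)
approx = codes.astype(np.float64) / 255 * (hi - lo) + lo   # decoded level-1 vectors

cand = np.argsort(((approx - q) ** 2).sum(axis=1))[:100]   # 100 candidates

# Level 2: re-rank the candidates against the full-precision data for accuracy
top10 = cand[np.argsort(((xb[cand] - q) ** 2).sum(axis=1))[:10]]
```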
software/similarity-search/README.md (outdated)
| The naming convention reflects bits per dimension at each level: |
| - `LVQ4x8`: 4 bits for Level 1, 8 bits for Level 2 (12 bits total per dimension) |
| - `LVQ8`: Single-level, 8 bits per dimension |
| - `LeanVec4x8`: Dimensionality reduction + 4-bit Level 1 + 8-bit Level 2 |
| - `LeanVec4x8`: Dimensionality reduction + 4-bit Level 1 + 8-bit Level 2 |
| - `LeanVec4x8`: 4-bit Level 1 encoding of reduced dimensionality data + 8-bit Level 2 encoding of full dimensionality data |
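For illustration, a back-of-envelope calculation from the bits-per-dimension naming above; it counts vector codes only and ignores graph links and per-vector metadata, which is why the real-world ratio for LVQ4x8 is quoted as roughly 2.5x rather than 2.7x:

```python
d = 768                      # dimensions, e.g. a typical text-embedding size
float32_bits = 32 * d        # uncompressed baseline

lvq8_bits = 8 * d            # LVQ8: single level, 8 bits per dimension
lvq4x8_bits = (4 + 8) * d    # LVQ4x8: 4-bit level 1 + 8-bit level 2

print(float32_bits / lvq8_bits)    # 4.0x reduction
print(float32_bits / lvq4x8_bits)  # ~2.7x reduction (codes only, no overhead)
```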
| |-----------|-------------|---------|-----------------| |
| | TYPE | Vector data type (FLOAT16, FLOAT32) | - | FLOAT32 for accuracy, FLOAT16 for memory | |
| | DIM | Vector dimensions | - | Must match your embeddings | |
| | DISTANCE_METRIC | L2, IP, or COSINE | - | COSINE for normalized embeddings | |
| | DISTANCE_METRIC | L2, IP, or COSINE | - | COSINE for normalized embeddings | |
| | DISTANCE_METRIC | L2, IP, or COSINE | - | L2 for normalized embeddings | |
Fixed — L2 for normalized embeddings.
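For illustration, a quick NumPy check of why L2 is the right call for normalized embeddings: on unit-norm vectors, squared L2 distance equals 2 - 2·cosine similarity, so both metrics produce the same ranking.

```python
import numpy as np

rng = np.random.default_rng(0)
xb = rng.standard_normal((1000, 128))
xb /= np.linalg.norm(xb, axis=1, keepdims=True)   # normalized embeddings
q = rng.standard_normal(128)
q /= np.linalg.norm(q)

l2_order = np.argsort(((xb - q) ** 2).sum(axis=1))   # ascending L2 distance
cos_order = np.argsort(-(xb @ q))                    # descending cosine similarity

assert (l2_order == cos_order).all()   # identical ranking on unit-norm vectors
```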
| - You need faster index construction |
| - Working with lower-dimensional vectors (<512) |
If neither is true, SVS has advantages.
| - You need faster index construction |
| - Working with lower-dimensional vectors (<512) |
| - Working with lower-dimensional vectors (<512) and needing faster index construction |
| **A:** The basic SVS-VAMANA algorithm with 8-bit scalar quantization (SQ8) is available in Redis Open Source on all platforms. Intel's LVQ and LeanVec optimizations require: |
| - Intel hardware with AVX-512 |
| - Redis Software (commercial) or building with `BUILD_INTEL_SVS_OPT=yes` |
| - Redis Software (commercial) or building with `BUILD_INTEL_SVS_OPT=yes` |
| - Redis Software (commercial) or [building Redis Open Source](https://github.com/redis/redis?tab=readme-ov-file#running-redis-with-the-query-engine-and-optional-proprietary-intel-svs-vamana-optimisations) with `BUILD_INTEL_SVS_OPT=yes` |
| ### Q: What if recall is too low with compression? |
| **A:** Try these steps in order: |
| 1. Increase `TRAINING_THRESHOLD` (e.g., 50000) |
| 1. Increase `TRAINING_THRESHOLD` (e.g., 50000) |
| 1. Increase `TRAINING_THRESHOLD` (e.g., 50000) if using LeanVec |
| 1. Increase `TRAINING_THRESHOLD` (e.g., 50000) |
| 2. Switch to higher-bit compression (LVQ4x8 → LVQ8, or LeanVec4x8 → LeanVec8x8) |
| 3. Increase `GRAPH_MAX_DEGREE` (e.g., 64 or 128) |
| 4. Increase `SEARCH_WINDOW_SIZE` at query time |
Increasing the search window size is definitely the first thing to try.
| 2. Switch to higher-bit compression (LVQ4x8 → LVQ8, or LeanVec4x8 → LeanVec8x8) |
| 3. Increase `GRAPH_MAX_DEGREE` (e.g., 64 or 128) |
| 4. Increase `SEARCH_WINDOW_SIZE` at query time |
| 5. For LeanVec, try a larger `REDUCE` value (closer to original dimensions) |
REDUCE should be after TRAINING_THRESHOLD.
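Putting several of these knobs together, a hedged redis-py sketch of index creation; the attribute names come from the guide under review, while the specific values (and the exact spelling of the COMPRESSION value) are placeholders to verify against the Redis docs:

```python
import redis

r = redis.Redis(host="localhost", port=6379)

# SVS-VAMANA with LeanVec compression and build-time tuning knobs from the
# list above (SEARCH_WINDOW_SIZE is adjusted at query time instead).
# 14 = number of attribute tokens that follow (seven name/value pairs).
r.execute_command(
    "FT.CREATE", "doc_idx_leanvec",
    "ON", "HASH", "PREFIX", "1", "doc:",
    "SCHEMA", "embedding", "VECTOR", "SVS-VAMANA", "14",
    "TYPE", "FLOAT32",
    "DIM", "768",
    "DISTANCE_METRIC", "L2",
    "COMPRESSION", "LeanVec4x8",
    "REDUCE", "384",                 # larger REDUCE (closer to DIM) favors recall
    "TRAINING_THRESHOLD", "50000",
    "GRAPH_MAX_DEGREE", "64",
)
```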
- Fix SVS intro: 'is integrated' instead of 'working on integrating'
- Clarify Level 2 compression: LVQ encodes residuals, LeanVec encodes full dimensionality
- Fix LeanVec4x8 description with reduced/full dimensionality detail
- Fix DISTANCE_METRIC guidance: L2 for normalized embeddings
- Add 'slower build' note to CONSTRUCTION_WINDOW_SIZE
- Fix compression ratios: LVQ4x8 ~2.5x, LeanVec notation with f factor
- Add LeanVec dimensionality reduction factor explanation
- Remove bash markers from Redis command blocks
- Fix SVS-VAMANA parameter count (14 → 12) in LeanVec example
- Remove EPSILON row (not SVS-specific, range search only)
- Fix attribution: Redis benchmarking (authors are Redis)
- Fix precision metric: remove '+' (calibrated value)
- Clarify SVS-VAMANA effectiveness: 'improving throughput'

Co-authored-by: mihaic <mihaic@users.noreply.github.com>
mihaic left a comment
A few of my earlier suggestions are unaddressed and there is still a reference to the deleted Faiss document.
Co-authored-by: Mihai Capotă <mihai@mihaic.ro>
Summary
This PR adds documentation for optimizing vector similarity search workloads on Intel Xeon processors.
New Content
- software/similarity-search/README.md — Overview of Intel's compression technologies (LVQ, LeanVec) for vector search
- software/similarity-search/redis/README.md — Complete guide for Redis SVS-VAMANA configuration and tuning

Key Topics Covered
Benchmarks Highlighted
Based on Redis and Intel benchmarking:
Related: Intel Scalable Vector Search library — https://intel.github.io/ScalableVectorSearch/