Add Similarity Search optimization guides by napetrov · Pull Request #20 · intel/optimization-zone

napetrov · 2026-02-10T01:44:54Z

Summary

This PR adds documentation for optimizing vector similarity search workloads on Intel Xeon processors.

New Content

software/similarity-search/README.md — Overview of Intel's compression technologies (LVQ, LeanVec) for vector search
software/similarity-search/redis/README.md — Complete guide for Redis SVS-VAMANA configuration and tuning

Key Topics Covered

SVS-VAMANA index configuration in Redis 8.2+
Vector compression options (LVQ4x4, LVQ4x8, LVQ8, LeanVec4x8, LeanVec8x8)
Performance tuning parameters
Benchmarks from Redis tech dive (memory, throughput, latency improvements)
Cross-platform compatibility (unified API, works on Intel/AMD/ARM)
FAQ addressing common questions

Benchmarks Highlighted

Based on Redis and Intel benchmarking:

Memory: 26-37% total reduction vs HNSW
Throughput: Up to 144% higher QPS
Latency: Up to 60% lower p50/p95

Related: Intel Scalable Vector Search library — https://intel.github.io/ScalableVectorSearch/

…entation

- Remove BIOS and hardware recommendation sections
- Add high-level LVQ and LeanVec description
- Replace benchmarks with data from Redis tech dive blog
- Mention SVS can be used directly, integrations in progress
- Remove contributing section
- Focus on compression selection and performance data

- Add FAISS optimization guide with SVS index types
- Cover IndexSVSVamana, IndexSVSVamanaLVQ, IndexSVSVamanaLeanVec
- Include factory string format and examples
- Add compression selection guide
- Include benchmark data from Intel SVS
- Update overview and main README

- Remove benchmark section (no official FAISS SVS performance data)
- Update overview to remove specific performance claims
- Clarify Intel-only requirement in FAQ
- Keep technical how-to guidance

- Add licensing warning for LVQ/LeanVec (AGPLv3/SSPLv1 incompatible)
- Add language spec to FAISS factory string code block

ahuber21 · 2026-02-10T10:29:38Z

software/similarity-search/faiss/README.md

+
+- High-performance graph-based similarity search optimized for Intel CPUs
+- Significant memory reduction with LVQ and LeanVec compression
+- Best performance on Intel Xeon with AVX-512 support


Should we distinguish AVX-512 VNNI?

ahuber21 · 2026-02-10T10:31:00Z

software/similarity-search/faiss/README.md

+
+### IndexSVSVamanaLVQ
+
+Combines Vamana with Locally-adaptive Vector Quantization (LVQ). LVQ applies per-vector normalization and scalar quantization, achieving up to 4x memory reduction while maintaining high accuracy.


Mention https://arxiv.org/abs/2304.04759 ?
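
For readers skimming this thread, the "per-vector normalization and scalar quantization" wording in the diff can be illustrated with a rough sketch. This is a simplified illustration only, not the SVS implementation: the function names `lvq_encode` and `lvq_decode` are hypothetical, and the sketch assumes plain per-vector min/max bounds with no dataset-level preprocessing.

```python
def lvq_encode(vec, bits=8):
    """Quantize one vector using its own min/max bounds (simplified LVQ-style sketch)."""
    lo, hi = min(vec), max(vec)
    levels = (1 << bits) - 1                       # highest code value, e.g. 255 for 8 bits
    step = (hi - lo) / levels if hi > lo else 1.0  # avoid division by zero for constant vectors
    # Clamp to [0, levels] to guard against floating-point overshoot.
    codes = [min(levels, max(0, round((x - lo) / step))) for x in vec]
    return codes, lo, step                         # lo and step are stored once per vector


def lvq_decode(codes, lo, step):
    """Reconstruct approximate values from the codes and the per-vector constants."""
    return [lo + c * step for c in codes]


vec = [0.12, -0.40, 0.33, 0.05]
codes, lo, step = lvq_encode(vec, bits=8)
approx = lvq_decode(codes, lo, step)
max_err = max(abs(a - b) for a, b in zip(vec, approx))
print(codes, max_err <= step / 2 + 1e-12)
```

Going from 32-bit floats to 8-bit codes is where a figure like "up to 4x" would come from, if the small per-vector constants are ignored.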

ahuber21 · 2026-02-10T10:31:12Z

software/similarity-search/faiss/README.md

+
+### IndexSVSVamanaLeanVec
+
+Extends LVQ with dimensionality reduction. Best for high-dimensional vectors (768+ dimensions), achieving up to 8-16x memory reduction. Particularly effective for text embeddings from large language models.


Mention https://arxiv.org/abs/2312.16335 ?

ahuber21 · 2026-02-10T10:34:39Z

software/similarity-search/faiss/README.md

+### From PyPI (with Intel optimizations)
+
+```bash
+pip install faiss-cpu


As of today, faiss-cpu@1.13.2 from pip does not include SVS indices.

Given all the other FAISS-related questions and pending PRs, would it be worth dropping the FAISS doc for now and adding it later?

Sure. I suggest moving the FAISS doc to its own (draft) PR with all comments implemented, and waiting until SVS is released with FAISS. Then we can go ahead and merge the rest.

Agreed. FAISS doc removed from this PR. Will create a separate draft PR with all your feedback addressed once SVS is released with FAISS.

ahuber21 · 2026-02-10T11:05:08Z

software/similarity-search/faiss/README.md

+### From Conda (Intel channel)
+
+```bash
+conda install -c conda-forge faiss-cpu


As of today, the following combination of faiss packages does not include SVS indices (tested with Python 3.12):

faiss      conda-forge/linux-64::faiss-1.9.0-py312hf23773a_0_cpu
faiss-cpu  conda-forge/linux-64::faiss-cpu-1.9.0-h718b53a_0

ahuber21 · 2026-02-10T11:14:13Z

software/similarity-search/faiss/README.md

+### Choosing Compression
+
+**Rule of thumb:**
+- Dimensions < 768: Use LVQ (LVQ8 or LVQ4x8)


Suggested change

- Dimensions < 768: Use LVQ (LVQ8 or LVQ4x8)

- Dimensions < 768: Use LVQ (LVQ8x0 or LVQ4x8)

ahuber21 · 2026-02-10T11:15:49Z

software/similarity-search/faiss/README.md

+index = faiss.index_factory(1536, "SVSVamana32,LeanVec4x8_384")
+```
+
+Lower reduced dimensions = faster search and less memory, but may reduce recall.


Suggested change

Lower reduced dimensions = faster search and less memory, but may reduce recall.

Lower reduced dimensions = faster search and less memory, but may reduce recall.

If no suffix is specified, `d/2` will be used as default.
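
The suffix rule in this suggestion can be sketched as a small helper. It is a hypothetical illustration of the naming convention discussed here, not FAISS code: the name `leanvec_reduced_dim` and the parsing approach are assumptions, and so is the use of integer division for `d/2`.

```python
def leanvec_reduced_dim(d, component):
    """Return the reduced dimensionality implied by a LeanVec factory-string component.

    `component` is a string such as "LeanVec4x8_384" or "LeanVec4x8".
    Without a "_<dim>" suffix, d // 2 is used (rounding for odd d is an assumption).
    """
    name, sep, suffix = component.partition("_")
    if not name.startswith("LeanVec"):
        raise ValueError(f"not a LeanVec component: {component!r}")
    return int(suffix) if sep else d // 2


print(leanvec_reduced_dim(1536, "LeanVec4x8_384"))
print(leanvec_reduced_dim(1536, "LeanVec4x8"))
```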

ahuber21 · 2026-02-10T11:17:15Z

software/similarity-search/faiss/README.md

+```
+
+### Multi-threaded Search
+


Maybe add that we parallelize over queries? I.e., the queries passed to index.search() should contain at least n_threads elements.
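
One way to act on this point when queries arrive in small groups is to merge them so that every search call carries at least `n_threads` queries. The helper below is a hypothetical sketch (the name `batches_of_at_least` is not part of FAISS or SVS), and it works on plain Python lists rather than on an index.

```python
def batches_of_at_least(queries, n_threads):
    """Split queries into batches that each hold at least n_threads items.

    If there are fewer queries than threads, one smaller batch is returned.
    """
    if n_threads < 1:
        raise ValueError("n_threads must be positive")
    n = len(queries)
    if n == 0:
        return []
    num_batches = max(1, n // n_threads)    # as many batches as fit at the minimum size
    base, extra = divmod(n, num_batches)    # spread the remainder over the first batches
    batches, start = [], 0
    for i in range(num_batches):
        size = base + (1 if i < extra else 0)
        batches.append(queries[start:start + size])
        start += size
    return batches


batches = batches_of_at_least(list(range(10)), n_threads=4)
print([len(b) for b in batches])
```

When all queries are already available, passing them in a single search call is the simplest option; the helper only matters for streaming-style workloads.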

ahuber21 · 2026-02-10T11:18:19Z

software/similarity-search/faiss/README.md

+
+```python
+import faiss
+print(hasattr(faiss, 'IndexSVSVamana'))  # True if SVS is available


Suggested change

print(hasattr(faiss, 'IndexSVSVamana')) # True if SVS is available

"SVS" in faiss.get_compile_options() # True if SVS is available

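The suggested check can be wrapped so that it also behaves sensibly when faiss is missing. This is a minimal sketch: `svs_in_compile_options` is a hypothetical helper name, and the substring test simply mirrors the expression proposed in this thread.

```python
def svs_in_compile_options(options):
    """Return True if the compile-options string mentions SVS."""
    return "SVS" in options


try:
    import faiss
    compile_options = faiss.get_compile_options()
except (ImportError, AttributeError):
    # faiss is not installed, or the build does not expose get_compile_options().
    compile_options = None

if compile_options is None:
    print("faiss compile options are not available")
else:
    print("SVS available:", svs_in_compile_options(compile_options))
```
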
ahuber21 · 2026-02-10T11:19:07Z

software/similarity-search/faiss/README.md

+
+**A:** Try these adjustments:
+1. Increase `search_window_size` (e.g., 50 or 100)
+2. Use higher-bit compression (LVQ4x8 → LVQ8)


Suggested change

2. Use higher-bit compression (LVQ4x8 → LVQ8)

2. Use higher-bit compression (LVQ4x8 → LVQ8x0)

ahuber21

Dropping my change request to not block anything while I'm OOO. Still would like to see the FAISS doc sorted out though.

rsiyer-intel · 2026-02-11T20:08:06Z

@napetrov Do you plan to drop FAISS document in this PR?

napetrov · 2026-02-11T21:43:17Z

> @napetrov Do you plan to drop FAISS document in this PR?

Yes, removed.

This also gives a couple of other people the opportunity to look at this PR.

mihaic

Great work, @napetrov! I have some suggestions for your consideration.

mihaic · 2026-02-10T18:24:39Z

software/similarity-search/redis/README.md

+
+### Creating an SVS-VAMANA Index
+
+```bash


Remove "bash" marker.

Fixed, removed bash markers from Redis command blocks.
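
Since the attribute count in an SVS-VAMANA definition is easy to get wrong (a later commit in this PR corrects one from 14 to 12), a small builder that derives the count from the attributes may help. This is a sketch under assumptions: the index name, field name, and prefix are hypothetical, and `COMPRESSION` is assumed to be the attribute name for selecting LVQ/LeanVec, since it does not appear in this thread.

```python
def svs_vamana_create_args(index, field, attrs, prefix="doc:"):
    """Build the FT.CREATE argument list for a hash index with one SVS-VAMANA vector field."""
    flat = []
    for key, value in attrs.items():
        flat += [key, str(value)]
    # The number after SVS-VAMANA counts every token that follows it (names and values).
    return [
        "FT.CREATE", index, "ON", "HASH", "PREFIX", "1", prefix,
        "SCHEMA", field, "VECTOR", "SVS-VAMANA", str(len(flat)), *flat,
    ]


args = svs_vamana_create_args(
    "idx:docs",      # hypothetical index name
    "embedding",     # hypothetical field name
    {
        "TYPE": "FLOAT32",
        "DIM": 1536,
        "DISTANCE_METRIC": "L2",
        "COMPRESSION": "LeanVec4x8",   # assumed attribute name
        "REDUCE": 384,
        "TRAINING_THRESHOLD": 50000,
    },
)
print(" ".join(args))
```

With redis-py, the resulting list could be sent as `client.execute_command(*args)`; that call is left out so the sketch runs without a server.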

mihaic · 2026-02-11T23:30:29Z

software/similarity-search/README.md

+
+## Intel Scalable Vector Search (SVS)
+
+[Intel Scalable Vector Search (SVS)](https://intel.github.io/ScalableVectorSearch/) is a high-performance library for vector similarity search, optimized for Intel hardware. SVS can be used directly as a standalone library, and we are working on integrating it into various popular solutions to bring these optimizations to a wider audience.


Suggested change

[Intel Scalable Vector Search (SVS)](https://intel.github.io/ScalableVectorSearch/) is a high-performance library for vector similarity search, optimized for Intel hardware. SVS can be used directly as a standalone library, and we are working on integrating it into various popular solutions to bring these optimizations to a wider audience.

[Intel Scalable Vector Search (SVS)](https://intel.github.io/ScalableVectorSearch/) is a high-performance library for vector similarity search, optimized for Intel hardware. SVS can be used directly as a standalone library, and is integrated into popular solutions to bring these optimizations to a wider audience.

Good catch — SVS is already integrated into Redis, not just 'working on it'. Fixed.

mihaic · 2026-02-11T23:39:39Z

software/similarity-search/README.md

+Both LVQ and LeanVec support two-level compression schemes:
+
+1. **Level 1**: Fast candidate retrieval using compressed vectors
+2. **Level 2**: Re-ranking using residual encoding for accuracy


Suggested change

2. **Level 2**: Re-ranking using residual encoding for accuracy

2. **Level 2**: Re-ranking for accuracy (LVQ encodes residuals, LeanVec encodes the full dimensionality data)

Applied — important distinction between LVQ residuals and LeanVec full dimensionality encoding.

mihaic · 2026-02-11T23:43:02Z

software/similarity-search/README.md

+The naming convention reflects bits per dimension at each level:
+- `LVQ4x8`: 4 bits for Level 1, 8 bits for Level 2 (12 bits total per dimension)
+- `LVQ8`: Single-level, 8 bits per dimension
+- `LeanVec4x8`: Dimensionality reduction + 4-bit Level 1 + 8-bit Level 2


Suggested change

- `LeanVec4x8`: Dimensionality reduction + 4-bit Level 1 + 8-bit Level 2

- `LeanVec4x8`: 4-bit Level 1 encoding of reduced dimensionality data + 8-bit Level 2 encoding of full dimensionality data

Applied, much clearer now.
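
The naming convention translates into a rough per-vector size estimate. The arithmetic below is a sketch that counts only the encoded payload and ignores per-vector constants, padding, and graph storage, so real ratios may come out lower; `bytes_per_vector` is a hypothetical helper, and the `dim // 2` reduction is just an example value.

```python
def bytes_per_vector(dim, bits_l1, bits_l2=0, reduced_dim=None):
    """Approximate payload size in bytes.

    Level 1 covers the reduced dimensions (LeanVec) or all dimensions (LVQ);
    Level 2, if present, covers all dimensions.
    """
    l1_dim = dim if reduced_dim is None else reduced_dim
    return (l1_dim * bits_l1 + dim * bits_l2) / 8


dim = 768
float32 = dim * 4                                               # uncompressed baseline
lvq8 = bytes_per_vector(dim, 8)                                 # single level, 8 bits
lvq4x8 = bytes_per_vector(dim, 4, 8)                            # 4 + 8 = 12 bits per dimension
leanvec4x8 = bytes_per_vector(dim, 4, 8, reduced_dim=dim // 2)  # Level 1 on reduced dims

for name, size in [("LVQ8", lvq8), ("LVQ4x8", lvq4x8), ("LeanVec4x8", leanvec4x8)]:
    print(f"{name}: {size:.0f} bytes, {float32 / size:.2f}x smaller than float32")
```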

mihaic · 2026-02-11T23:48:17Z

software/similarity-search/redis/README.md

+|-----------|-------------|---------|-----------------|
+| TYPE | Vector data type (FLOAT16, FLOAT32) | - | FLOAT32 for accuracy, FLOAT16 for memory |
+| DIM | Vector dimensions | - | Must match your embeddings |
+| DISTANCE_METRIC | L2, IP, or COSINE | - | COSINE for normalized embeddings |


Fixed — L2 for normalized embeddings.
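
The thread does not record the reasoning behind this change, but one relevant fact is that for unit-length vectors the squared L2 distance equals 2 − 2·cos(a, b), so L2 and cosine produce the same ranking. The sketch below checks that identity numerically with plain Python; the example vectors are arbitrary.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


a = normalize([1.0, 2.0, 3.0])
b = normalize([2.0, 0.5, 1.0])
lhs = squared_l2(a, b)
rhs = 2.0 - 2.0 * cosine_similarity(a, b)
print(abs(lhs - rhs) < 1e-12)
```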

mihaic · 2026-02-12T00:32:10Z

software/similarity-search/redis/README.md

+- You need faster index construction
+- Working with lower-dimensional vectors (<512)


If both are not true, SVS has advantages.

Suggested change

- You need faster index construction

- Working with lower-dimensional vectors (<512)

- Working with lower-dimensional vectors (<512) and needing faster index construction

mihaic · 2026-02-12T00:37:41Z

software/similarity-search/redis/README.md

+
+**A:** The basic SVS-VAMANA algorithm with 8-bit scalar quantization (SQ8) is available in Redis Open Source on all platforms. Intel's LVQ and LeanVec optimizations require:
+- Intel hardware with AVX-512
+- Redis Software (commercial) or building with `BUILD_INTEL_SVS_OPT=yes`


Suggested change

- Redis Software (commercial) or building with `BUILD_INTEL_SVS_OPT=yes`

- Redis Software (commercial) or [building Redis Open Source](https://github.com/redis/redis?tab=readme-ov-file#running-redis-with-the-query-engine-and-optional-proprietary-intel-svs-vamana-optimisations) with `BUILD_INTEL_SVS_OPT=yes`

mihaic · 2026-02-12T00:38:28Z

software/similarity-search/redis/README.md

+### Q: What if recall is too low with compression?
+
+**A:** Try these steps in order:
+1. Increase `TRAINING_THRESHOLD` (e.g., 50000)


Suggested change

1. Increase `TRAINING_THRESHOLD` (e.g., 50000)

1. Increase `TRAINING_THRESHOLD` (e.g., 50000) if using LeanVec

mihaic · 2026-02-12T00:39:36Z

software/similarity-search/redis/README.md

+1. Increase `TRAINING_THRESHOLD` (e.g., 50000)
+2. Switch to higher-bit compression (LVQ4x8 → LVQ8, or LeanVec4x8 → LeanVec8x8)
+3. Increase `GRAPH_MAX_DEGREE` (e.g., 64 or 128)
+4. Increase `SEARCH_WINDOW_SIZE` at query time


Increasing the search window size is definitely the first thing to try.

mihaic · 2026-02-12T00:41:00Z

software/similarity-search/redis/README.md

+2. Switch to higher-bit compression (LVQ4x8 → LVQ8, or LeanVec4x8 → LeanVec8x8)
+3. Increase `GRAPH_MAX_DEGREE` (e.g., 64 or 128)
+4. Increase `SEARCH_WINDOW_SIZE` at query time
+5. For LeanVec, try a larger `REDUCE` value (closer to original dimensions)


REDUCE should be after TRAINING_THRESHOLD.

- Fix SVS intro: 'is integrated' instead of 'working on integrating'
- Clarify Level 2 compression: LVQ encodes residuals, LeanVec encodes full dimensionality
- Fix LeanVec4x8 description with reduced/full dimensionality detail
- Fix DISTANCE_METRIC guidance: L2 for normalized embeddings
- Add 'slower build' note to CONSTRUCTION_WINDOW_SIZE
- Fix compression ratios: LVQ4x8 ~2.5x, LeanVec notation with f factor
- Add LeanVec dimensionality reduction factor explanation
- Remove bash markers from Redis command blocks
- Fix SVS-VAMANA parameter count (14 → 12) in LeanVec example
- Remove EPSILON row (not SVS-specific, range search only)
- Fix attribution: Redis benchmarking (authors are Redis)
- Fix precision metric: remove '+' (calibrated value)
- Clarify SVS-VAMANA effectiveness: 'improving throughput'

Co-authored-by: mihaic <mihaic@users.noreply.github.com>

mihaic

A few of my earlier suggestions are unaddressed and there is still a reference to the deleted Faiss document.

README.md

Co-authored-by: Mihai Capotă <mihai@mihaic.ro>

napetrov added 6 commits February 9, 2026 22:48

Add Similarity Search optimization guides with Redis SVS-VAMANA docum…

3af9601

…entation

Remove unverified benchmarks from FAISS guide

c9c3d46

Address CodeRabbit review comments

035891e

Remove license note, improve cross-platform messaging

60e74d5

ahuber21 suggested changes Feb 10, 2026

ahuber21 reviewed Feb 11, 2026

ahuber21 approved these changes Feb 11, 2026

Remove FAISS guide from PR, will add in separate PR after SVS release

83b9b97

mihaic reviewed Feb 12, 2026

napetrov requested review from ahuber21 and mihaic February 17, 2026 17:52

mihaic suggested changes Feb 17, 2026

Update README.md

deea507

Co-authored-by: Mihai Capotă <mihai@mihaic.ro>

