Merged
2 changes: 1 addition & 1 deletion README.md
@@ -1,4 +1,4 @@
# LanceDB Mintlify Documentation
# LanceDB Documentation

Home of the [LanceDB](https://lancedb.com/) documentation. Built using [Mintlify](https://www.mintlify.com/).

3 changes: 1 addition & 2 deletions docs/api-reference/index.mdx
@@ -11,7 +11,7 @@ refer to the API documentation linked below.
If you're looking for a REST API reference, visit the [REST API](/api-reference/rest) page.

If you're looking for conceptual and practical namespace guidance before diving into method signatures, see
[Namespaces and Catalog Model](/namespaces) and [Using Namespaces in SDKs](/tables/namespaces).
[Namespaces and Catalog Model](/namespaces) and [Using Namespaces in SDKs](/namespaces/usage).

## Supported SDKs

@@ -46,4 +46,3 @@ for users working in languages other than those listed above.
| <Icon icon="swift" /> [Swift](https://github.com/RyanLisse/LanceDbSwiftKit) | Community-contributed Swift SDK for LanceDB |
| <Icon icon="R" /> [R](https://github.com/CathalByrneGit/lancedb) | Community-contributed R package for LanceDB |
| <Icon icon="flutter" /> [Flutter](https://github.com/Alexcn/flutter_lancedb) | Community-contributed Flutter bindings for LanceDB |

8 changes: 4 additions & 4 deletions docs/cloud/get-started.mdx
Expand Up @@ -39,7 +39,7 @@ import numpy as np
import pyarrow as pa
import os

# Connect to LanceDB Cloud/Enterprise
# Connect to LanceDB Enterprise
uri = "db://your-database-uri"
api_key = "your-api-key"
region = "us-east-1"
@@ -59,7 +59,7 @@ db = lancedb.connect(
import { connect, Index, Table } from '@lancedb/lancedb';
import { FixedSizeList, Field, Float32, Schema, Utf8 } from 'apache-arrow';

// Connect to LanceDB Cloud/Enterprise
// Connect to LanceDB Enterprise
const dbUri = process.env.LANCEDB_URI || 'db://your-database-uri';
const apiKey = process.env.LANCEDB_API_KEY;
const region = process.env.LANCEDB_REGION;
@@ -273,7 +273,7 @@ console.log('Successfully created table');
After creating a table with vector data, you'll want to create an index to enable fast similarity searches. The index creation process optimizes the data structure for efficient vector similarity lookups, significantly improving query performance for large datasets.

<Check>
Unlike in LanceDB OSS, the `create_index`/`createIndex` operation executes **asynchronously** in LanceDB Cloud/Enterprise. To ensure the index is fully built, you can use the `wait_timeout` parameter or call `wait_for_index` on the table.
Unlike in LanceDB OSS, the `create_index`/`createIndex` operation executes **asynchronously** in LanceDB Enterprise. To ensure the index is fully built, you can use the `wait_timeout` parameter or call `wait_for_index` on the table.
</Check>
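Since the index builds asynchronously, callers either pass `wait_timeout` to the create call or poll until the index is ready. A minimal sketch of that polling pattern, with a stubbed status callable standing in for a real table's index status (the names here are illustrative, not the SDK's API):

```python
import time

def wait_until_ready(get_status, timeout=60.0, poll_interval=0.1):
    """Poll a status callable until it reports 'ready' or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_status() == "ready":
            return True
        time.sleep(poll_interval)
    raise TimeoutError("index build did not finish before the timeout")
```

In the real SDKs you would pass `wait_timeout` to `create_index`/`createIndex` or call `wait_for_index` on the table rather than hand-rolling this loop.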

<CodeGroup>
@@ -373,6 +373,6 @@ console.log(filteredResults);

## What's Next?

It's time to use LanceDB Cloud/Enterprise in your own projects!
It's time to use LanceDB Enterprise in your own projects!
We've prepared more [tutorials](/tutorials/) for you to continue learning. If you
have any questions, reach out via [Discord](https://discord.gg/AUEWnJ7Txb).
32 changes: 19 additions & 13 deletions docs/docs.json
@@ -40,17 +40,17 @@
"pages": [
"index",
"lance",
"namespaces"
"tables-and-namespaces"
]
},
{
"group": "LanceDB Enterprise",
"pages": [
"enterprise/index",
"enterprise/features",
"enterprise/quickstart",
"enterprise/architecture",
"enterprise/benchmarks",
"enterprise/security",
"enterprise/benchmarks",
{
"group": "Deployment",
"pages": [
@@ -70,21 +70,34 @@
]
},
{
"group": "User Guide",
"group": "Guides",
"pages": [
{
"group": "Working with tables",
"group": "Table operations",
"pages": [
"tables/index",
"tables/create",
"tables/multimodal",
"tables/schema",
"tables/update",
"tables/namespaces",
"tables/versioning",
"tables/consistency"
]
},
{
"group": "Namespaces",
"pages": [
"namespaces/index",
"namespaces/usage"
]
},
{
"group": "Embeddings",
"pages": [
"embedding/index",
"embedding/quickstart"
]
},
{
"group": "Indexing",
"pages": [
@@ -124,13 +137,6 @@
"reranking/eval"
]
},
{
"group": "Embeddings",
"pages": [
"embedding/index",
"embedding/quickstart"
]
},
{
"group": "Storage",
"pages": [
2 changes: 0 additions & 2 deletions docs/embedding/index.mdx
@@ -50,8 +50,6 @@ while Rust examples typically compute query embeddings explicitly before vector

### Using an embedding function

<Badge color="green">Python SDK</Badge>

In the Python SDK, the `.create()` method accepts several arguments to configure embedding function behavior. `max_retries` is a special argument that applies to all providers.

| Argument | Type | Description |
79 changes: 0 additions & 79 deletions docs/enterprise/features.mdx

This file was deleted.

77 changes: 71 additions & 6 deletions docs/enterprise/index.mdx
@@ -5,16 +5,16 @@
icon: "server"
---

**LanceDB Enterprise** is both a **private cloud or a BYOC solution** that transforms your data lake into
a high-performance vector database or lakehouse that can operate at extreme scale.
**LanceDB Enterprise** is a private cloud or a bring-your-own-cloud (BYOC) solution that transforms your data lake into
a high-performance **multimodal lakehouse** that can operate at extreme scale.

With a vector database built for [lakehouse architecture](/enterprise/architecture), you can serve millions of tables and tens
With its [lakehouse architecture](/enterprise/architecture), you can serve millions of tables and tens
of billions of rows in a single index, improve retrieval quality using hybrid search with blazing-fast
metadata filters, and reduce costs by up to 200x with object storage.

<Callout icon="key" color="#FFC107" iconType="regular">
For private deployments, high performance at extreme scale, or if you have strict security requirements,
[reach out to us](mailto:contact@lancedb.com) regarding LanceDB Enterprise.
If you need private deployments, high performance at extreme scale, or if you have strict security requirements,
[reach out to our team](mailto:contact@lancedb.com) to set up a LanceDB Enterprise cluster in your environment.
</Callout>

## Key benefits of LanceDB Enterprise
@@ -53,4 +53,69 @@ changes or data migration required!
| **Effortless Migration** | Migrate from Open Source LanceDB to LanceDB Enterprise by simply using a connection URL. |
| **Observability** | First-class integration with existing observability systems for logging, monitoring, and distributed traces using OpenTelemetry. |

Take a look at a more thorough [list of benefits of LanceDB Enterprise](/enterprise/features).
## How is LanceDB Enterprise different from OSS?

LanceDB Enterprise is a distributed cluster that spans many machines (unlike LanceDB OSS, which is an embedded database that runs inside your process). Both are built on top of the same Lance columnar file format, so moving data from one edition to the other requires no conversion.

| Dimension | LanceDB OSS | LanceDB Enterprise | What the difference means |
|:----------|:------------|:-------------------|:-------------------------|
| **Mode** | Single process | Distributed fleet | OSS lives on one host. Enterprise spreads work across nodes and keeps serving even if one node fails. |
| **Latency from object storage** | 500–1000 ms | 50–200 ms | Enterprise mitigates network delay with an SSD cache and parallel reads. |
| **Throughput** | 10–50 QPS | Up to 10,000 QPS | A cluster can serve thousands of concurrent users; a single process cannot. |
| **Cache** | None | Distributed NVMe cache | Enterprise keeps hot data near compute and avoids repeated S3 calls. |
| **Indexing & compaction** | Manual | Automatic | Enterprise runs background jobs that rebuild and compact data without downtime. |
| **Data format** | Lance open format | Lance open format | No vendor lock-in; data moves freely between editions. |
| **Deployment** | Embedded in your code | Self-managed or Managed Service | Enterprise meets uptime, compliance, and support goals that OSS cannot. |

### Architecture and scale

LanceDB OSS is directly embedded into your service. The process owns all CPU, memory, and storage, so scale is limited to what the host can provide.
LanceDB Enterprise separates work into routers, execution nodes, and background workers. New nodes join the cluster through a discovery service; they register, replicate metadata, and begin answering traffic without a restart. A distributed control plane watches node health, shifts load away from unhealthy nodes, and enforces consensus rules that prevent split-brain events.

Read More: [LanceDB Enterprise Architecture](/enterprise/architecture/)

### Latency of data retrieval

With LanceDB OSS, every query fetches data from S3, GCS, or Azure Blob Storage. Each round trip to an object store adds several hundred milliseconds, especially when data is cold.

LanceDB Enterprise uses NVMe SSDs as a hybrid cache that is checked before the object store is ever accessed. The first read fills the cache; subsequent reads come from local disk and return in tens of milliseconds. Parallel chunked reads further reduce tail latency. This gap matters when the application serves interactive dashboards or real-time recommendations.

Read More: [LanceDB Enterprise Performance](/enterprise/benchmarks/)

### Throughput of search queries

A single LanceDB OSS process shares one CPU pool with the rest of the application. Under concurrent load, retrieval and similarity computation compete for the same cores; once those are saturated, extra traffic waits in a queue, raising latency without increasing queries per second.

LanceDB Enterprise distributes queries across many execution nodes. Each node runs a dedicated vector search engine that exploits all cores and uses SIMD instructions. A load balancer assigns queries to the least-loaded node, so throughput grows roughly linearly as more nodes join the cluster.
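The least-loaded dispatch described above can be sketched with a min-heap. The `dispatch` function, the node names, and the unit cost per query are illustrative assumptions, not LanceDB internals:

```python
import heapq
from collections import Counter

def dispatch(queries, nodes):
    """Send each query to the node with the fewest outstanding queries."""
    heap = [(0, n) for n in nodes]        # (current load, node name)
    heapq.heapify(heap)
    assignment = {}
    for q in queries:
        load, node = heapq.heappop(heap)  # least-loaded node wins the query
        assignment[q] = node
        heapq.heappush(heap, (load + 1, node))
    return assignment

plan = dispatch([f"q{i}" for i in range(6)], ["node-a", "node-b", "node-c"])
print(Counter(plan.values()))  # with equal query costs, the load spreads evenly
```

Because each completed assignment feeds back into the heap, adding a node immediately gives the balancer more capacity to hand work to, which is why throughput scales roughly with cluster size.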

### Caching of commonly retrieved data

LanceDB OSS has no built-in cache. Every read repeats the same object-store round trip and pays the same latency penalty.

LanceDB Enterprise shards a cache across the fleet with consistent hashing. Popular vectors remain on local NVMe drives until they age out under a least-recently-used policy. Cache misses fall back to the object store, fill the local shard, and serve future reads faster. This design slashes both latency and egress cost for workloads with temporal locality.
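The two mechanisms this paragraph names, consistent hashing to pick a shard and LRU eviction within it, can be sketched as follows. The class names and the 64 virtual replicas per node are assumptions for illustration, not LanceDB's implementation:

```python
import hashlib
from bisect import bisect_right
from collections import OrderedDict

class ConsistentHashRing:
    """Map cache keys to nodes on a hash ring; each node owns arcs of key space."""
    def __init__(self, nodes, replicas=64):
        # Virtual replicas smooth out the distribution across nodes.
        self.ring = sorted(
            (int(hashlib.md5(f"{n}#{i}".encode()).hexdigest(), 16), n)
            for n in nodes for i in range(replicas)
        )
        self.points = [p for p, _ in self.ring]

    def node_for(self, key):
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        idx = bisect_right(self.points, h) % len(self.ring)
        return self.ring[idx][1]

class LRUCache:
    """Per-node cache shard with least-recently-used eviction."""
    def __init__(self, capacity):
        self.capacity, self.data = capacity, OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None                    # cache miss: fall back to object store
        self.data.move_to_end(key)         # mark as recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used entry
```

Consistent hashing keeps the key-to-node mapping stable when nodes join or leave, so only a small fraction of cached data has to move.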

### Maintenance of vector indexes

Vector indexes fragment when data is inserted, updated, or deleted. Fragmentation slows queries because the engine must scan more blocks. LanceDB OSS offers a manual call to compact or rebuild the index, but you must schedule it and stop queries while it runs.

LanceDB Enterprise runs compaction jobs in the background. It copies data to a scratch space, rebuilds the index, swaps the old files atomically, and frees disk space. Production traffic continues uninterrupted.
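The copy-rebuild-swap step can be illustrated with a tiny JSON "table". The `compact_atomically` helper and the tombstone field are hypothetical; the atomic swap itself comes from `os.replace`, which renames atomically on POSIX so readers see either the old file or the new one, never a partial write:

```python
import json
import os
import tempfile

def compact_atomically(path, rewrite):
    """Rebuild a data file in scratch space, then swap it in atomically."""
    with open(path) as f:
        records = json.load(f)
    compacted = rewrite(records)
    # Write the rebuilt data next to the original so the rename stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(compacted, f)
    os.replace(tmp, path)  # atomic swap: old file space is freed afterwards

# Example rewrite policy: drop tombstoned rows during compaction, e.g.
# compact_atomically("table.json", lambda rs: [r for r in rs if not r.get("deleted")])
```

A real engine swaps index and data files rather than JSON, but the invariant is the same: queries against the old version keep working until the rename lands.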

Read More: [Indexing in LanceDB](/indexing/)

### Deployment and governance

LanceDB OSS ships inside your binary, Docker image, or serverless function. The footprint is small, and no extra services run beside it.

LanceDB Enterprise comes in two flavors. The self-managed option deploys inside your VPC, so data never leaves your account. The managed SaaS option hands day-to-day operations to LanceDB, including patching, scaling, and 24×7 monitoring. Both modes support private networking, role-based access control, audit logs, and single sign-on.

Read More: [LanceDB Enterprise Deployment](/enterprise/deployment/)

## Which option is best?

LanceDB OSS makes sense when the entire dataset fits on one machine, traffic stays under fifty queries per second, and your team can run manual maintenance without affecting users.

[It's very simple to get started with OSS](/quickstart/): install with `pip install lancedb` and begin ingesting your data and vectors into LanceDB.

Move to LanceDB Enterprise when you have petabyte-scale data, need latency below 200 ms, need query throughput in the thousands of QPS, or your business requires high availability, compliance controls, and vendor support.

If these sound like your use cases, [reach out via this form](https://lancedb.com/contact/) and we can help you scope your workload and arrange an Enterprise proof of concept.