Commit f501c1e

Merge pull request #5 from gabisonia/release-v0.2.3-prep-mssql-removal
Remove MSSQL store and related functionality
2 parents a135c8f + f5e5877 commit f501c1e

22 files changed

Lines changed: 175 additions & 2730 deletions

Makefile

Lines changed: 1 addition & 9 deletions
@@ -1,4 +1,4 @@
-.PHONY: test test-no-cache test-stores test-stores-no-cache test-integration-all test-integration-all-no-cache test-integration-stores test-integration-stores-no-cache test-integration-postgres test-integration-postgres-no-cache test-integration-mssql test-integration-mssql-no-cache test-integration-msql
+.PHONY: test test-no-cache test-stores test-stores-no-cache test-integration-all test-integration-all-no-cache test-integration-stores test-integration-stores-no-cache test-integration-postgres test-integration-postgres-no-cache
 
 GO ?= go
 GOTOOLCHAIN ?= local
@@ -37,11 +37,3 @@ test-integration-postgres:
 
 test-integration-postgres-no-cache:
 	$(GO_ENV) $(GO) test -count=1 -mod=$(MOD_MODE) -tags=$(INTEGRATION_TAG) -timeout=$(TEST_TIMEOUT) $(TEST_FLAGS) ./stores/postgres
-
-test-integration-mssql:
-	$(GO_ENV) $(GO) test -mod=$(MOD_MODE) -tags=$(INTEGRATION_TAG) -timeout=$(TEST_TIMEOUT) $(TEST_FLAGS) ./stores/mssql
-
-test-integration-mssql-no-cache:
-	$(GO_ENV) $(GO) test -count=1 -mod=$(MOD_MODE) -tags=$(INTEGRATION_TAG) -timeout=$(TEST_TIMEOUT) $(TEST_FLAGS) ./stores/mssql
-
-test-integration-msql: test-integration-mssql

README.md

Lines changed: 5 additions & 16 deletions
@@ -10,7 +10,6 @@ A lightweight Go vector-store library inspired by `Microsoft.Extensions.VectorDa
 MVP scope:
 - Go 1.24.x
 - Postgres + `pgvector`
-- MSSQL connector
 - Record-based core API with optional typed codec wrapper
 
 This library can be used to build retrieval systems such as:
@@ -22,14 +21,12 @@ This library can be used to build retrieval systems such as:
 
 - Go 1.24.x
 - PostgreSQL 16+ with `pgvector` extension
-- SQL Server 2022+ (for `stores/mssql`)
 - Docker (optional, for integration tests and sample compose flows)
 
 ## Project layout
 
 - `vectordata`: backend-agnostic core interfaces, record model, filters, typed wrapper
 - `stores/postgres`: Postgres implementation with `pgxpool`
-- `stores/mssql`: SQL Server implementation with `database/sql`
 - `samples`: runnable demos (see `samples/README.md`)
 - `docs`: architecture and implementation notes
 
@@ -197,25 +194,15 @@ go test ./...
 ```
 
 Notes:
-- Integration tests start Postgres/pgvector and SQL Server automatically via Testcontainers
+- Integration tests start Postgres/pgvector automatically via Testcontainers
 - Docker daemon must be available when running integration tests
 - Optional override: set `PGVECTOR_TEST_DSN` to use an existing Postgres instance instead of starting a container
-- Optional override: set `MSSQL_TEST_DSN` to use an existing SQL Server instance instead of starting a container
-
-Run MSSQL integration tests against root compose service:
-
-```bash
-docker compose up -d mssql
-MSSQL_TEST_DSN="sqlserver://sa:YourStrong%21Passw0rd@localhost:14339?database=master&encrypt=disable" \
-go test -tags=integration ./stores/mssql
-```
 
 ## Docker Compose (optional)
 
 `docker-compose.yml` at the repository root is kept for manual local runs.
 
 - Use root `docker-compose.yml` when you want a persistent local Postgres+pgvector instance outside tests
-- Root compose also includes SQL Server (`mssql`) for local MSSQL connector validation
 - Use Testcontainers (`go test -tags=integration ./...`) for integration tests
 - Sample apps have their own compose files at `samples/semantic-search/docker-compose.yml` and `samples/ragrimosa/docker-compose.yml`
 - Sample Dockerfiles use `golang:1.24-alpine` to match the repo Go version (`1.24.x`)
@@ -233,8 +220,9 @@ Release options:
 2. Tag-driven: push a semver tag and the workflow publishes release notes automatically:
 
 ```bash
-git tag v0.2.3
-git push origin v0.2.3
+VERSION="$(cat VERSION)"
+git tag "v${VERSION}"
+git push origin "v${VERSION}"
 ```
 
 ## Samples
@@ -247,6 +235,7 @@ git push origin v0.2.3
 
 - Docs index: [`docs/README.md`](docs/README.md)
 - Internals and architecture: [`docs/architecture.md`](docs/architecture.md)
+- Connector development guide: [`docs/connector-development.md`](docs/connector-development.md)
 
 ## License

VERSION

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+0.2.3

docker-compose.yml

Lines changed: 0 additions & 10 deletions
@@ -13,13 +13,3 @@ services:
       interval: 2s
       timeout: 2s
       retries: 30
-
-  mssql:
-    image: mcr.microsoft.com/mssql/server:2022-latest
-    container_name: go-vectorstore-mssql
-    environment:
-      ACCEPT_EULA: "Y"
-      MSSQL_PID: "Developer"
-      MSSQL_SA_PASSWORD: "YourStrong!Passw0rd"
-    ports:
-      - "14339:1433"

docs/README.md

Lines changed: 1 addition & 0 deletions
@@ -4,3 +4,4 @@ This directory contains internal design and implementation notes for `go-vectors
 
 - [`architecture.md`](architecture.md): architecture, request flow, schema behavior, filter system, and index model
 - [`stores-implementation.md`](stores-implementation.md): detailed store-backend design, component responsibilities, flows, invariants, and backend differences
+- [`connector-development.md`](connector-development.md): step-by-step checklist for implementing a new backend connector

docs/architecture.md

Lines changed: 0 additions & 28 deletions
@@ -8,7 +8,6 @@ The project is split into two layers:
 
 - `vectordata`: backend-agnostic contracts and primitives
 - `stores/postgres`: PostgreSQL + pgvector implementation
-- `stores/mssql`: SQL Server implementation
 
 This keeps the public API stable while allowing additional storage engines later.
 
@@ -58,11 +57,6 @@ All methods require `context.Context`.
 - `EnsureExtension`: `true`
 - `StrictByDefault`: `true`
 
-`mssql.StoreOptions` defaults (`mssql.DefaultStoreOptions()`):
-
-- `Schema`: `dbo`
-- `StrictByDefault`: `true`
-
 Collection defaults:
 
 - Metric defaults to cosine when omitted
@@ -127,23 +121,6 @@ Each record is validated before sending:
 
 The pgvector operators are documented in the [pgvector README](https://github.com/pgvector/pgvector#querying).
 
-### MSSQL: Ensure / Search
-
-`MSSQLVectorStore.EnsureCollection`:
-
-1. Ensures target schema exists
-2. Ensures internal collection metadata table exists
-3. Creates or validates the collection table
-4. Persists and validates dimension/metric metadata
-
-`MSSQLCollection.SearchByVector`:
-
-1. Streams records from SQL Server
-2. Evaluates filters against records in-process
-3. Computes distance in-process (cosine/l2/inner product)
-4. Applies threshold and keeps a bounded in-memory top-k heap
-5. Returns top-k sorted by distance
-
 ## 6) Filter System (AST -> SQL)
 
 Filters are represented as an AST in `vectordata`:
@@ -170,8 +147,6 @@ Behavior details:
 
 JSON path extraction behavior comes from PostgreSQL [JSON/JSONB functions and operators](https://www.postgresql.org/docs/current/functions-json.html).
 
-For MSSQL in this MVP, the same AST is evaluated in-process against loaded records.
-
 ## 7) Schema Safety Modes
 
 `CollectionSpec.Mode` controls ensure behavior:
@@ -264,7 +239,6 @@ Errors are wrapped with context so callers can use `errors.Is(...)` against base
 Current MVP scope:
 
 - Postgres + pgvector backend
-- MSSQL backend (vectors stored as JSON payloads)
 - single-vector column per collection
 - metadata filtering through a focused AST
 
@@ -273,10 +247,8 @@ Future extension points:
 - additional backends under `stores/`
 - richer filter operators
 - optional reranking strategies
-- native MSSQL vector indexing/query pushdown
 
 ## 13) References
 
 - pgvector project docs: https://github.com/pgvector/pgvector
 - PostgreSQL docs: https://www.postgresql.org/docs/current/index.html
-- SQL Server docs: https://learn.microsoft.com/sql/sql-server/

docs/connector-development.md

Lines changed: 159 additions & 0 deletions
@@ -0,0 +1,159 @@
+# Connector Development Guide
+
+This guide explains how to add a new backend connector under `stores/<backend>`.
+
+## 1) Scope and Contract
+
+Every connector must implement the shared interfaces in `vectordata`:
+
+- `vectordata.VectorStore`
+- `vectordata.Collection`
+
+The connector must preserve shared behavior:
+
+- validate collection specs (`name`, `dimension`, `metric`, `mode`)
+- enforce vector dimension on write and search
+- return `vectordata.ErrNotFound` for missing records
+- return `vectordata.ErrDimensionMismatch`, `vectordata.ErrSchemaMismatch`, and `vectordata.ErrInvalidFilter` where applicable
+- compute score via `vectordata.ScoreFromDistance`
+
+## 2) Recommended Layout
+
+Create a new directory:
+
+- `stores/<backend>/doc.go`
+- `stores/<backend>/store.go`
+- `stores/<backend>/collection.go`
+- `stores/<backend>/schema.go`
+- `stores/<backend>/helpers.go`
+- `stores/<backend>/<backend>_integration_test.go`
+
+Add extra files only when needed (for example filter compiler/evaluator files).
+
+## 3) Store Implementation Checklist
+
+Implement a store type that owns connection/resources and options:
+
+1. Add `StoreOptions` and `DefaultStoreOptions()`.
+2. Add `NewVectorStore(...)` constructor with option normalization.
+3. Implement `EnsureCollection(ctx, spec)`:
+   - normalize and validate spec
+   - ensure schema/table/metadata structures
+   - create or validate collection storage
+4. Implement `Collection(name, dimension, metric)` as a lightweight handle constructor.
+
+## 4) Collection Implementation Checklist
+
+Implement collection operations with strict validation:
+
+1. `Insert` and `Upsert`
+   - validate ID and vector dimension
+   - normalize nil metadata to empty object
+   - use batching/chunking for bulk writes
+2. `Get` and `Delete`
+   - return `ErrNotFound` from `Get` when missing
+3. `Count`
+   - support nil filter and filter predicates
+4. `SearchByVector`
+   - validate `topK > 0`
+   - validate query vector dimension
+   - apply filter (SQL pushdown or in-process evaluation)
+   - apply optional threshold
+   - honor projection (`Metadata`, `Content`, `Vector`)
+   - return results ordered by best match first
+   - compute `Score` from `Distance`
+5. `EnsureIndexes`
+   - create backend-specific indexes when supported
+   - return explicit error if unsupported options are requested
+
+## 5) Filter Strategy
+
+Choose one strategy:
+
+- SQL pushdown: compile `vectordata.Filter` to parameterized backend SQL
+- In-process: load records and evaluate filter AST in Go
+
+Requirements:
+
+- never interpolate raw values into SQL
+- return `vectordata.ErrInvalidFilter` for invalid AST/input
+
+## 6) Schema Safety Modes
+
+Respect `CollectionSpec.Mode`:
+
+- `EnsureStrict`: fail on mismatches
+- `EnsureAutoMigrate`: add/fix optional schema parts where possible
+
+When mode is unset, use connector defaults (`StrictByDefault` behavior).
+
+## 7) Testing Requirements
+
+Add both unit and integration coverage.
+
+Unit tests:
+
+- spec validation
+- schema mismatch behavior
+- filter behavior (happy path + invalid filters)
+- dimension mismatch and error mapping
+- projection and threshold behavior in search
+
+Integration tests:
+
+- put in `stores/<backend>/<backend>_integration_test.go`
+- use `//go:build integration`
+- start backend with Testcontainers when DSN env var is absent
+- allow DSN override via `<BACKEND>_TEST_DSN`
+
+Run commands:
+
+```bash
+go test ./...
+go test -tags=integration ./stores/<backend>
+```
+
+## 8) Repository Wiring
+
+When adding a connector, also update:
+
+1. `README.md`:
+   - scope
+   - requirements
+   - project layout
+   - integration test notes
+2. `docs/architecture.md` and `docs/stores-implementation.md`
+3. `docs/README.md`
+4. `Makefile` integration targets (if per-backend targets are used)
+5. `docker-compose.yml` only if manual local backend compose is required
+6. `go.mod` and `go.sum` dependencies
+
+## 9) Minimal Skeleton
+
+```go
+package mybackend
+
+import (
+	"context"
+
+	"github.com/gabisonia/go-vectorstore/vectordata"
+)
+
+type StoreOptions struct{}
+
+func DefaultStoreOptions() StoreOptions { return StoreOptions{} }
+
+type VectorStore struct{}
+
+func NewVectorStore(opts StoreOptions) (*VectorStore, error) {
+	return &VectorStore{}, nil
+}
+
+func (s *VectorStore) EnsureCollection(ctx context.Context, spec vectordata.CollectionSpec) (vectordata.Collection, error) {
+	return nil, nil
+}
+
+func (s *VectorStore) Collection(name string, dimension int, metric vectordata.DistanceMetric) vectordata.Collection {
+	return nil
+}
+```
