Local Embeddings

floop supports local embedding-based retrieval for semantic behavior matching. This runs a small model on your machine — no API keys or network access required at runtime.

Overview

When enabled, floop_active embeds the current context (file, task, language) and finds the most semantically similar behaviors via vector cosine similarity. This complements the existing predicate matching and spreading activation pipeline by finding relevant behaviors even when their when conditions don't exactly match.

Setup

Interactive

floop init

The interactive setup prompts whether to download and enable local embeddings (option 4).

Non-interactive

# Enable embeddings
floop init --global --embeddings

# Skip embeddings
floop init --global --no-embeddings

What Gets Downloaded

Two runtime dependencies are downloaded to ~/.floop/ (~130 MB total):

Component	Size	Location
llama.cpp shared libraries	~50 MB	`~/.floop/lib/` (`%USERPROFILE%\.floop\lib\` on Windows)
nomic-embed-text-v1.5 (Q4_K_M)	~81 MB	`~/.floop/models/` (`%USERPROFILE%\.floop\models\` on Windows)

Downloads happen once and are cached. Subsequent floop init --embeddings calls detect existing installations and skip re-downloading.

Auto-detection

After initial setup, the MCP server auto-detects installed dependencies in ~/.floop/ on startup. No explicit configuration is needed — if the libraries and model are present, embeddings are enabled automatically.

Configuration

Config file (`~/.floop/config.yaml`)

llm:
  provider: local
  enabled: true
  local_lib_path: /home/user/.floop/lib
  local_embedding_model_path: /home/user/.floop/models/nomic-embed-text-v1.5.Q4_K_M.gguf

These values are set automatically by floop init --embeddings. You can also set them manually:

floop config set llm.provider local
floop config set llm.local_lib_path ~/.floop/lib
floop config set llm.local_embedding_model_path ~/.floop/models/nomic-embed-text-v1.5.Q4_K_M.gguf

Environment variables

Variable	Description
`FLOOP_LOCAL_LIB_PATH`	Directory containing llama.cpp shared libraries
`FLOOP_LOCAL_EMBEDDING_MODEL_PATH`	Path to GGUF embedding model
`FLOOP_LOCAL_GPU_LAYERS`	GPU layer offload count (0 = CPU only)
`FLOOP_LOCAL_CONTEXT_SIZE`	Context window size in tokens (default: 512)

How It Works

Embedding lifecycle

Learn-time: When floop_learn creates a new behavior, its canonical text is embedded in the background and stored alongside the behavior in SQLite
Startup backfill: On MCP server start, any behaviors without embeddings are backfilled in a background goroutine
Retrieval-time: floop_active composes the current context into a query, embeds it, and runs brute-force cosine similarity against all stored embeddings
Safety net: Behaviors without embeddings are always included in the candidate set — no behavior is silently dropped

Storage

Embeddings are stored as BLOB columns in the behaviors SQLite table (768 dimensions x 4 bytes = 3,072 bytes per behavior). The embedding model name is tracked alongside each embedding for staleness detection.

Vector Index

At startup, the MCP server loads all stored embeddings into a LanceDBIndex for fast approximate nearest neighbor (ANN) search. LanceDB is an embedded vector database that auto-persists to .floop/vectors/ — no separate server needed.

Backend	When Used	Notes
LanceDBIndex	Default (CGO enabled)	Embedded vector DB, auto-persists, scales to millions
BruteForceIndex	Fallback (no CGO)	O(n) exhaustive cosine similarity, in-memory only

Performance

At typical scales (~200 behaviors), search completes in microseconds. LanceDB scales efficiently to much larger stores without manual configuration. The brute-force fallback is used automatically when CGO is unavailable (e.g. cross-compiled binaries).

Building from Source

Pre-built release binaries use BruteForce vector search (no CGO dependency). For LanceDB persistence, build from source with CGO enabled.

Prerequisites

Go 1.26+
C/C++ compiler (gcc/g++ on Linux, clang/clang++ on macOS)
LanceDB native libraries (see download step below)

Steps

Run all commands from the project root directory:

# Download LanceDB native libraries for your platform
go mod download || { echo "Failed to download Go modules"; exit 1; }
LANCE_VERSION=$(go list -m -f '{{.Version}}' github.com/lancedb/lancedb-go) || { echo "lancedb-go not in go.mod"; exit 1; }

# Download native binaries from GitHub release (requires gh CLI)
gh release download "${LANCE_VERSION}" --repo lancedb/lancedb-go --pattern 'lancedb-go-native-binaries.tar.gz'
tar -xzf lancedb-go-native-binaries.tar.gz && rm -f lancedb-go-native-binaries.tar.gz
# This creates lib/ and include/ in the current directory

# Build with CGO (Linux)
make build-cgo

The build-cgo Makefile target uses Linux-specific linker flags. On macOS, override CGO_LDFLAGS:

# darwin_arm64 for Apple Silicon, darwin_amd64 for Intel Macs
CGO_LDFLAGS="-L$(pwd)/lib/$(go env GOOS)_$(go env GOARCH) -llancedb_go -framework Security -framework CoreFoundation -lc++" make build-cgo

Verify LanceDB is linked

Linux (static linking — LanceDB should NOT appear as a shared library):

ldd ./floop | grep lancedb
# No output = statically linked (correct)
# "liblancedb_go.so => not found" = linked dynamically but missing (rebuild needed)

macOS (dynamic linking):

otool -L ./floop | grep lancedb
# On macOS, LanceDB links dynamically against a .dylib.
# The library must be present at the path shown in the output.

Troubleshooting

Embeddings not working

Verify dependencies are installed:

# Linux/macOS
ls ~/.floop/lib/libllama.*    # Should show libllama.so or libllama.dylib
ls ~/.floop/models/*.gguf      # Should show the model file

# Windows (PowerShell)
dir $env:USERPROFILE\.floop\lib\libllama.*    # Should show libllama.dll
dir $env:USERPROFILE\.floop\models\*.gguf

If missing, re-run setup:

floop init --embeddings

Model not loading

Check that the library path matches your platform:

Linux: libllama.so in ~/.floop/lib/
macOS: libllama.dylib in ~/.floop/lib/
Windows: libllama.dll in %USERPROFILE%\.floop\lib\

High memory usage

The embedding model uses ~200 MB RAM when loaded. If this is a concern, disable embeddings:

floop init --no-embeddings

The system falls back to predicate matching and spreading activation (identical to pre-embedding behavior).

Technical Details

Model: nomic-embed-text-v1.5 (Q4_K_M quantization), 768-dimension output, 2048-token context
Runtime: llama.cpp via yzma purego bindings (no CGo)
Task prefixes: search_document: for behavior text, search_query: for context queries (required by nomic-embed-text)
Storage: SQLite BLOB column, ~3 KB per behavior

For the theory behind embedding-based retrieval, see SCIENCE.md.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Local Embeddings

Overview

Setup

Interactive

Non-interactive

What Gets Downloaded

Auto-detection

Configuration

Config file (`~/.floop/config.yaml`)

Environment variables

How It Works

Embedding lifecycle

Storage

Vector Index

Performance

Building from Source

Prerequisites

Steps

Verify LanceDB is linked

Troubleshooting

Embeddings not working

Model not loading

High memory usage

Technical Details

FilesExpand file tree

EMBEDDINGS.md

Latest commit

History

EMBEDDINGS.md

File metadata and controls

Local Embeddings

Overview

Setup

Interactive

Non-interactive

What Gets Downloaded

Auto-detection

Configuration

Config file (~/.floop/config.yaml)

Environment variables

How It Works

Embedding lifecycle

Storage

Vector Index

Performance

Building from Source

Prerequisites

Steps

Verify LanceDB is linked

Troubleshooting

Embeddings not working

Model not loading

High memory usage

Technical Details

Config file (`~/.floop/config.yaml`)