FireDB

An embedded key-value database library for Go, inspired by LevelDB/RocksDB/Badger.

Features

  • LSM-tree based storage engine
  • MVCC with nanosecond timestamp versioning
  • TTL support for automatic key expiration
  • Configurable HDD/SSD optimization
  • Write-ahead log (WAL) for durability
  • Bloom filters for fast lookups
  • ARC block cache (adaptive replacement)
  • Batch writes
  • Point-in-time snapshots
  • Configurable version retention
  • Split SSTable format (Cassandra-style)
  • Zstd compression support
  • Streaming compaction with O(1) memory
  • WiscKey-style value separation for reduced write amplification
  • Vector/embedding storage with HNSW indexing for AI data

Comparison

Feature             | FireDB              | Badger            | RocksDB                  | LevelDB
------------------- | ------------------- | ----------------- | ------------------------ | ------------
Language            | Go                  | Go                | C++                      | C++
Storage Model       | LSM-tree            | LSM-tree + Vlog   | LSM-tree                 | LSM-tree
Value Separation    | Yes (WiscKey)       | Yes               | Optional (BlobDB)        | No
MVCC                | Yes                 | Yes               | Yes                      | No
TTL                 | Yes                 | Yes               | Yes                      | No
Compression         | Zstd                | Zstd, Snappy      | Zstd, Snappy, LZ4, etc.  | Snappy
Transactions        | No                  | Yes (ACID)        | Yes                      | No
Column Families     | No                  | No                | Yes                      | No
Bloom Filter        | Yes                 | Yes               | Yes                      | Yes
Block Cache         | ARC                 | No (uses mmap)    | LRU, Clock               | LRU
Snapshots           | Yes                 | Yes               | Yes                      | Yes
Batch Writes        | Yes                 | Yes               | Yes                      | Yes
Vector Storage      | Yes (HNSW)          | No                | No (RocksDB-ML)          | No
SSTable Format      | Split files         | Single file       | Single file              | Single file
Compaction          | Leveled (streaming) | Leveled + Vlog GC | Leveled, Universal, FIFO | Leveled
Write Amplification | Low                 | Low               | Medium-High              | High
Read Amplification  | Low                 | Low               | Low                      | Medium
Space Amplification | Low                 | Medium            | Low                      | Low

Database File Structure

SSTable naming follows Cassandra 4.1+ convention with Time UUID identifiers:

  • Format: <version>-<sstable_id>-<format>-<component>.db
  • SSTable IDs are globally unique and lexically ordered by creation time

/path/to/db/
├── wal/                                           # Write-ahead log
│   └── 0000000001.wal
├── sst/                                           # SSTable files
│   ├── aa-0frv_0abc_123456_abcdef-big-data.db        # Data blocks
│   ├── aa-0frv_0abc_123456_abcdef-big-index.db       # Block index
│   ├── aa-0frv_0abc_123456_abcdef-big-filter.db      # Bloom filter
│   ├── aa-0frv_0abc_123456_abcdef-big-statistics.db  # Metadata
│   ├── aa-0frv_0abc_123456_abcdef-big-hnsw.db        # HNSW vector index (if vectors present)
│   └── ...
└── vlog/                                          # Value log (large values)
    ├── 00000000000000000001.vlog
    └── ...
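
For illustration, here is a hedged sketch of how a component file name could be assembled from the pieces described above. The helper and its name are hypothetical, not part of the firedb API; only the format string mirrors the documented convention:

import "fmt"

// sstableName composes <version>-<sstable_id>-<format>-<component>.db.
// Illustrative only; not a firedb function.
func sstableName(version, id, format, component string) string {
    return fmt.Sprintf("%s-%s-%s-%s.db", version, id, format, component)
}

// sstableName("aa", "0frv_0abc_123456_abcdef", "big", "data")
// => "aa-0frv_0abc_123456_abcdef-big-data.db"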

Installation

go get github.com/newsamples/firedb

Usage

Basic Operations

package main

import (
    "fmt"
    "log"

    "github.com/newsamples/firedb"
)

func main() {
    // Open database
    db, err := firedb.Open("/path/to/db", firedb.DefaultOptions())
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Put
    if err := db.Put([]byte("key"), []byte("value")); err != nil {
        log.Fatal(err)
    }

    // Get
    value, err := db.Get([]byte("key"))
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("key = %s\n", value)

    // Delete
    if err := db.Delete([]byte("key")); err != nil {
        log.Fatal(err)
    }
}

TTL Support

import "time"

// Key expires after 1 hour
err := db.PutWithTTL([]byte("key"), []byte("value"), time.Hour)

Batch Writes

batch := firedb.NewBatch()
batch.Put([]byte("key1"), []byte("value1"))
batch.Put([]byte("key2"), []byte("value2"))
batch.Delete([]byte("key3"))

err := db.Write(batch)

Iteration

import "fmt"

it := db.NewIterator()
defer it.Close()

for it.Next() {
    entry := it.Entry()
    fmt.Printf("%s: %s\n", entry.Key, entry.Value)
}

// Seek to specific key
it.Seek([]byte("prefix"))
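
A common pattern built on the iterator is a prefix scan: seek to the prefix, then stop once keys no longer match. A minimal sketch, assuming Seek positions the iterator so the following call to Next yields the first key at or after the prefix:

import "bytes"

it := db.NewIterator()
defer it.Close()

prefix := []byte("user:")
it.Seek(prefix)
for it.Next() {
    entry := it.Entry()
    // Keys are iterated in sorted order, so we can stop
    // as soon as we leave the prefix range.
    if !bytes.HasPrefix(entry.Key, prefix) {
        break
    }
    fmt.Printf("%s: %s\n", entry.Key, entry.Value)
}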

Snapshots

// Create snapshot
snap := db.Snapshot()

// Read at snapshot time
value, err := snap.Get([]byte("key"))

Vector/Embedding Storage

FireDB supports storing vector embeddings alongside key-value data for AI applications. HNSW (Hierarchical Navigable Small World) indices are automatically created for fast approximate nearest neighbor search at scale.

// Store vector by key
vector := []float32{0.1, 0.2, 0.3, 0.4}
err := db.PutVector([]byte("doc1"), vector)

// Get vector by key
v, err := db.GetVector([]byte("doc1"))

// Store key-value pair with associated vector
err = db.PutWithVector([]byte("doc2"), []byte("document content"), vector)

// Similarity search using HNSW index (returns top-K results)
query := []float32{0.15, 0.25, 0.35, 0.45}
results, err := db.SearchVectors(query, 10)
for _, r := range results {
    fmt.Printf("Key: %s, Similarity: %.4f\n", r.Key, r.Similarity)
}

Performance characteristics:

  • HNSW indices are built per-SSTable during flush
  • Search complexity: approximately O(log n), versus O(n) for a brute-force scan
  • Supports 1M+ vectors with ~1-5ms search latency
  • Index is automatically rebuilt during compaction

Vector utility functions:

// Cosine similarity (returns -1 to 1, higher is more similar)
sim := firedb.CosineSimilarity(vecA, vecB)

// Euclidean distance (L2)
dist := firedb.EuclideanDistance(vecA, vecB)

// Dot product
dot := firedb.DotProduct(vecA, vecB)

// Normalize vector to unit length
normalized := firedb.NormalizeVector(vec)
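
As a usage sketch: normalizing embeddings to unit length before insertion is common practice when ranking by cosine similarity (a general recommendation, not a documented firedb requirement):

// Unit-length vectors make cosine-similarity and dot-product rankings agree.
unit := firedb.NormalizeVector([]float32{0.1, 0.2, 0.3, 0.4})
err := db.PutVector([]byte("doc3"), unit)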

Configuration

opts := firedb.DefaultOptions().
    WithStorageMode(firedb.StorageModeSSD).      // or StorageModeHDD
    WithSyncMode(firedb.SyncModeSync).           // or SyncModeAsync, SyncModePeriodic
    WithMemTableSize(64 * 1024 * 1024).          // 64MB memtable
    WithBlockCacheSize(128 * 1024 * 1024).       // 128MB block cache
    WithBloomFilterBits(10).                      // 10 bits per key
    WithMaxVersions(5).                           // Keep 5 versions per key
    WithVersionRetention(24 * time.Hour).         // Retain versions for 24 hours
    WithValueThreshold(1024)                      // Values > 1KB go to vlog

db, err := firedb.Open("/path/to/db", opts)

Storage Modes

  • StorageModeSSD: 4KB block size, optimized for SSDs
  • StorageModeHDD: 64KB block size, optimized for HDDs

Sync Modes

  • SyncModeSync: Sync WAL on every write (safest, slowest)
  • SyncModeAsync: No sync, relies on OS (fastest, risk of data loss)
  • SyncModePeriodic: Sync at intervals (balanced)

Value Separation (WiscKey)

Large values are stored separately in a value log (vlog) to reduce write amplification during compaction. Only keys and value pointers are stored in the LSM-tree.

  • ValueThreshold: Values larger than this (default 1KB) go to vlog. Set to 0 to disable.
  • Benefits: Reduced write amplification, faster compaction
  • Trade-off: Extra read I/O for large values
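
Value separation is tuned through the option shown in Configuration. For example:

// Only values larger than 4KB are separated into the vlog.
opts := firedb.DefaultOptions().WithValueThreshold(4 * 1024)

// Set the threshold to 0 to disable value separation entirely,
// keeping all values inline in the LSM-tree.
inline := firedb.DefaultOptions().WithValueThreshold(0)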

License

MIT
