A relational database engine built from scratch in Rust. HozonDB implements core database internals: page-based storage, a hand-written SQL parser, query execution, slotted page layout, and B+ tree indexing.
HozonDB is a systems programming project aimed at exploring the internal architecture of relational databases.
Modern databases hide significant complexity behind simple SQL interfaces. This project focuses on implementing core components from scratch in order to understand how storage engines, query execution, and data layout interact in real systems.
Rather than relying on existing database libraries, HozonDB explicitly implements key subsystems such as page management, row serialization, SQL parsing, query execution, and indexing. The goal is to make the behavior of the database transparent and observable while experimenting with design trade-offs commonly found in production database engines.
- Rust (latest stable)
- `protoc`: Protocol Buffers compiler

On Debian/Ubuntu:
sudo apt-get install protobuf-compiler

On macOS:
brew install protobuf

hozondb/
├── core/    # database engine (library): storage, executor, parser, proto types
├── server/  # gRPC server binary
├── client/  # gRPC client library
├── hsql/    # interactive CLI (connects to server over gRPC)
└── tests/   # integration tests

- Page-based persistent storage with file-level locking
- Slotted page layout: stable row locations, in-place updates and deletes
- SQL: `CREATE TABLE`, `DROP TABLE`, `INSERT`, `SELECT`, `UPDATE`, `DELETE`
- `PRIMARY KEY` support with automatic B+ tree index creation
- `WHERE` clause filtering with comparison and range operators (`=`, `<`, `>`, `<=`, `>=`)
- B+ tree indexing: O(log n) point lookups and range scans on indexed columns
- Index-aware `INSERT`, `UPDATE`, `DELETE`: indexes stay consistent on every write
- Multi-page tables with automatic page allocation
- System catalog for schema and index persistence across restarts
- Benchmark suite with before/after index metrics
- gRPC client-server interface: server exposes SQL execution over gRPC
- `hsql` CLI: readline-powered interactive shell that connects to the server over gRPC
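
To make the slotted page idea from the list above concrete, here is a minimal in-memory sketch. It is not HozonDB's actual implementation: the names `SlottedPage` and `Slot` are hypothetical, and the slot directory is kept in a separate `Vec` rather than inside the 4KB page buffer to keep the example short. It shows why slot numbers stay stable across deletes and why an update can only happen in place when the new row fits the existing slot.

```rust
const PAGE_SIZE: usize = 4096;

/// One slot directory entry: where the row bytes live inside the page.
#[derive(Clone, Copy)]
struct Slot {
    offset: usize,
    len: usize,
    live: bool,
}

/// Hypothetical slotted page. Row bytes are written from the end of the
/// page backwards; the directory is a separate Vec (a simplification).
struct SlottedPage {
    data: Vec<u8>,
    slots: Vec<Slot>,
    free_end: usize, // offset of the lowest row written so far
}

impl SlottedPage {
    fn new() -> Self {
        Self { data: vec![0; PAGE_SIZE], slots: Vec::new(), free_end: PAGE_SIZE }
    }

    /// Stores a row and returns its slot number, or None if it does not fit.
    fn insert(&mut self, row: &[u8]) -> Option<usize> {
        if row.len() > self.free_end {
            return None;
        }
        let offset = self.free_end - row.len();
        self.data[offset..self.free_end].copy_from_slice(row);
        self.free_end = offset;
        self.slots.push(Slot { offset, len: row.len(), live: true });
        Some(self.slots.len() - 1)
    }

    /// Returns the row bytes for a live slot.
    fn get(&self, slot: usize) -> Option<&[u8]> {
        let s = self.slots.get(slot)?;
        if s.live { Some(&self.data[s.offset..s.offset + s.len]) } else { None }
    }

    /// In-place update: succeeds only if the new row fits the existing slot.
    fn update(&mut self, slot: usize, row: &[u8]) -> bool {
        match self.slots.get_mut(slot) {
            Some(s) if s.live && row.len() <= s.len => {
                self.data[s.offset..s.offset + row.len()].copy_from_slice(row);
                s.len = row.len();
                true
            }
            _ => false,
        }
    }

    /// Marks the slot dead; other slot numbers are unaffected.
    fn delete(&mut self, slot: usize) -> bool {
        match self.slots.get_mut(slot) {
            Some(s) if s.live => {
                s.live = false;
                true
            }
            _ => false,
        }
    }
}
```

In this sketch a failed `update` is the signal that the row has to be stored somewhere else, which corresponds to the "exceeds slot" case in the benchmark table; how HozonDB relocates such rows is not shown here.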
┌──────────────────────────────────────────┐
│            hsql (CLI client)             │  rustyline-based interactive shell
└────────────────────┬─────────────────────┘
                     │ gRPC
┌────────────────────▼─────────────────────┐
│               gRPC Server                │  tonic + tokio async transport
└────────────────────┬─────────────────────┘
                     │
┌────────────────────▼─────────────────────┐
│                SQL Parser                │  Hand-written lexer + recursive descent
│        Lexer → Token stream → AST        │
└────────────────────┬─────────────────────┘
                     │
┌────────────────────▼─────────────────────┐
│              Query Executor              │  AST → execution → row iteration
│   Index seek / Range scan / Full scan    │
└────────────────────┬─────────────────────┘
                     │
┌────────────────────▼─────────────────────┐
│           B+ Tree Index Layer            │  Per-column indexes, page-backed nodes
│  Point lookup / Range scan / Node cache  │
└────────────────────┬─────────────────────┘
                     │
┌────────────────────▼─────────────────────┐
│           Table Storage Layer            │  Slotted pages, schema-aware row I/O
│  Slot directory, in-place update/delete  │
└────────────────────┬─────────────────────┘
                     │
┌────────────────────▼─────────────────────┐
│               Page Manager               │  Fixed 4KB pages, file I/O, metadata
│  allocate_page / read_page / write_page  │
└────────────────────┬─────────────────────┘
                     │
                 .hdb file

Page layout:
Page 0: file header
Page 1: table catalog (schema persistence)
Page 2: index catalog (index metadata + root page IDs)
Page 3+: user data pages and B+ tree node pages (shared space)
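
The layout above implies a simple mapping from page ID to file offset and page role. The following is a sketch under the assumption that pages are stored back to back starting at byte 0 of the `.hdb` file; `PageKind`, `page_offset`, and `page_kind` are illustrative names, not HozonDB's API.

```rust
/// Page size taken from the architecture diagram ("Fixed 4KB pages").
const PAGE_SIZE: u64 = 4096;

/// Role of a page, following the layout listed above.
/// The enum and its variant names are hypothetical.
#[derive(Debug, PartialEq)]
enum PageKind {
    FileHeader,
    TableCatalog,
    IndexCatalog,
    /// User data pages and B+ tree node pages share this space.
    DataOrIndexNode,
}

/// Byte offset of a page, assuming contiguous fixed-size pages from offset 0.
fn page_offset(page_id: u32) -> u64 {
    u64::from(page_id) * PAGE_SIZE
}

/// Classifies a page ID according to the reserved pages 0, 1, and 2.
fn page_kind(page_id: u32) -> PageKind {
    match page_id {
        0 => PageKind::FileHeader,
        1 => PageKind::TableCatalog,
        2 => PageKind::IndexCatalog,
        _ => PageKind::DataOrIndexNode,
    }
}
```

Under these assumptions the first user-visible page (ID 3) starts at byte 12288, and a page ID alone cannot tell a data page from a B+ tree node, since both live in the shared space.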
When a table is created with a PRIMARY KEY column, HozonDB automatically creates a B+ tree index on that column. The index is stored as a set of pages within the same .hdb file.
-- auto-creates a B+ tree index on `id`
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);

On every INSERT, the indexed column value and the new row's RowLocation (page + slot) are inserted into the tree. On SELECT WHERE id = 5, the executor uses the tree to find the exact page and slot in O(log n) rather than scanning all pages.
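
The key-to-location mapping described above can be sketched with the standard library's `BTreeMap` standing in for the page-backed B+ tree. This is an assumption made for brevity: `BTreeMap` is an in-memory B-tree, not HozonDB's index, and the `PrimaryKeyIndex` type, its methods, and the fields of `RowLocation` are hypothetical.

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{Excluded, Unbounded};

/// Where a row lives: page plus slot, as described above.
/// Field names and widths are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
struct RowLocation {
    page_id: u32,
    slot: u16,
}

/// Stand-in for a primary key index, backed by BTreeMap instead of B+ tree pages.
struct PrimaryKeyIndex {
    entries: BTreeMap<i64, RowLocation>,
}

impl PrimaryKeyIndex {
    fn new() -> Self {
        Self { entries: BTreeMap::new() }
    }

    /// Called on INSERT. Returns false if the key already exists,
    /// which models PRIMARY KEY uniqueness enforcement.
    fn insert(&mut self, key: i64, loc: RowLocation) -> bool {
        if self.entries.contains_key(&key) {
            return false;
        }
        self.entries.insert(key, loc);
        true
    }

    /// Point lookup for `WHERE id = key`.
    fn seek(&self, key: i64) -> Option<RowLocation> {
        self.entries.get(&key).copied()
    }

    /// Range scan for `WHERE id > key`, returned in key order.
    fn scan_greater_than(&self, key: i64) -> Vec<(i64, RowLocation)> {
        self.entries
            .range((Excluded(key), Unbounded))
            .map(|(k, v)| (*k, *v))
            .collect()
    }
}
```

With a `RowLocation` in hand, the executor only needs to read the one data page it names, which is where the single page read for point lookups comes from.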
Index-eligible operators:
- `=`: point lookup, reads 1 data page
- `<`, `<=`, `>`, `>=`: range scan, walks the leaf linked list
- All other predicates (`!=`, `AND`, `OR`, compound): fall back to full scan
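
The operator rules above amount to a small decision function. The sketch below is one way to express them; `CompareOp`, `AccessPath`, and `choose_access_path` are hypothetical names, and treating every compound predicate as a full scan is taken directly from the list rather than from the executor's source.

```rust
/// Comparison operators that can appear in a WHERE clause.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CompareOp {
    Eq,
    Lt,
    Le,
    Gt,
    Ge,
    Ne,
}

/// The three execution strategies named in the architecture diagram.
#[derive(Debug, PartialEq)]
enum AccessPath {
    IndexSeek,
    IndexRangeScan,
    FullScan,
}

/// Picks an access path for a predicate on one column.
/// `is_compound` is true when the predicate is combined with AND/OR.
fn choose_access_path(op: CompareOp, column_is_indexed: bool, is_compound: bool) -> AccessPath {
    if !column_is_indexed || is_compound {
        return AccessPath::FullScan;
    }
    match op {
        CompareOp::Eq => AccessPath::IndexSeek,
        CompareOp::Lt | CompareOp::Le | CompareOp::Gt | CompareOp::Ge => {
            AccessPath::IndexRangeScan
        }
        CompareOp::Ne => AccessPath::FullScan,
    }
}
```

For example, `WHERE id = 1` on the primary key maps to `IndexSeek`, while the same comparison on an unindexed `name` column maps to `FullScan`.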
10,000 rows, 66 pages, with B+ tree index on primary key:
| Operation | Duration | Pages Read | Rows Scanned |
|---|---|---|---|
| SELECT full scan | 12.27ms | 66 | 10,000 |
| SELECT idx seek (point lookup) | 0.02ms | 1 | 1 |
| SELECT range scan | 10.81ms | 66 | 10,000 |

| Operation | Duration | Pages Read | Pages Written |
|---|---|---|---|
| INSERT (single row) | 7.34ms | 1 | 1 |
| UPDATE (fits slot) | 9.89ms | 1 | 1 |
| UPDATE (exceeds slot) | 29.10ms | 2 | 3 |
| DELETE (single row) | 7.68ms | 1 | 1 |
Point lookups on indexed columns read 1 data page instead of the 66 pages read by a full scan (66x fewer page reads in the table above).
Implemented:
- Page manager with file locking and slotted page layout
- In-place row updates and deletes (no page chain rewrite)
- System catalog with schema and index persistence
- Full SQL CRUD with WHERE filtering and range operators
- B+ tree indexing: point lookup and range scan
- Index-aware INSERT, UPDATE, DELETE
- PRIMARY KEY uniqueness enforcement
- Benchmark suite with index metrics
- gRPC client-server interface
- `hsql` interactive CLI over gRPC

Planned:
- `CREATE INDEX`: explicit index creation on any column
- Server-side streaming for SELECT results
- Write-ahead log (WAL) for crash recovery
- `BEGIN`/`COMMIT`/`ROLLBACK` transaction support
- Distributed replication (Raft consensus)

Start the server:
cargo run -p hozondb-server -- mydb

Connect with the CLI:
cargo run -p hsql -- http://[::1]:50051

hozondb> CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
hozondb> INSERT INTO users VALUES (1, 'Alice');
hozondb> INSERT INTO users VALUES (2, 'Bob');
hozondb> SELECT * FROM users WHERE id = 1;
hozondb> SELECT * FROM users WHERE id > 1;
hozondb> UPDATE users SET name = 'Alice Smith' WHERE id = 1;
hozondb> DELETE FROM users WHERE id = 2;
hozondb> .exit

cargo test --workspace # run all tests