Wait and hope for the best...
A C++ local database library with cross-language bindings, aiming to be a fast, lightweight, and easy-to-use data communication solution for RPC and coupled modeling in scientific computing.
- 2026-03-04 (Release 0.1.12): Fixed a critical issue where loading large database files (> 2GB) on Linux/Unix systems would fail to read the complete file, leading to missing tables or data corruption. The file reading logic has been improved to correctly handle partial reads for large files. (PR #23)
- 2026-03-04 (Memory Overflow Improvement): Enhanced the `MemoryStream` implementation to handle data sizes exceeding 4GB without overflowing the u32 `chunk_data_t.size` field. This allows more robust handling of large datasets in memory. (PR #22)
- 2026-02-28 (Release Improvement): Fixed bugs in the Windows build process. (PR #20)
- 2025-12-31 (Bug Fix): Fixed an issue where shared memory segments were not properly unregistered from the resource tracker upon closing, which could lead to resource leaks. (PR #17)
- 2025-12-15 (Release Improvement): Enabled distribution of pre-compiled binary wheels for macOS (Intel/Apple Silicon) and Linux (x86_64/aarch64), eliminating the need for local compilation tools during installation. (PR #15)
- 2025-12-10 (Bug Fix): Fixed the data type mapping for `U32` fields in the Python bindings so they are correctly represented as unsigned 32-bit integers in NumPy arrays. (PR #13)
- 2025-12-10 (Bug Fix): Fixed an out-of-bounds access in `FastVectorDbLayer::Impl::getFieldOffset()` when the field index equals the field count. (PR #12)
- 2025-12-10 (Performance Improvement): Modified `ORM.truncate()` to support directly allocating features without initializing them, for performance. Note that this change may have side effects; please test thoroughly. (PR #11)
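The partial-read issue fixed in 0.1.12 is a classic pitfall: on Linux/Unix, a single read call may return fewer bytes than requested, especially for very large (> 2GB) requests. Here is a minimal Python sketch of the loop-until-complete pattern (illustrative only; not fastdb's actual C++ implementation):

```python
import io

def read_exact(stream, n):
    """Read exactly n bytes, looping because a single read() may return less."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:  # EOF before n bytes arrived: the file is truncated
            raise EOFError(f"expected {n} bytes, got {n - remaining}")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

# Example with an in-memory stream
data = read_exact(io.BytesIO(b"abcdef"), 4)
print(data)  # b'abcd'
```

Code that does a single `read(n)` and assumes it got `n` bytes will silently drop data in exactly the situation the 0.1.12 fix addresses.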
You can install the Python package of fastdb via pip:
```shell
pip install fastdb4py
```

Note: Pre-compiled binary wheels are provided for major platforms (macOS Intel/Apple Silicon, Linux Ubuntu, Windows AMD64). For other systems, the package is built from source, which requires a C++ compiler and CMake.
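If you want to verify the installation before diving in, a quick check is to look the package up by name (a generic Python idiom, not part of the fastdb API):

```python
import importlib.util

# Look up the installed fastdb4py package without importing it.
spec = importlib.util.find_spec("fastdb4py")
if spec is None:
    print("fastdb4py is not installed")
else:
    print("fastdb4py found at", spec.origin)
```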
To use fastdb, you first need to define your data schema by subclassing fastdb4py.Feature.
Use type hints to define the fields of your feature.
```python
import fastdb4py

class Point(fastdb4py.Feature):
    x: fastdb4py.F64
    y: fastdb4py.F64
```

You can create a new database, or truncate an existing one, using `fastdb4py.ORM.truncate`.
This function takes a list of TableDefn objects, specifying the feature class and the initial capacity (number of rows).
```python
# Specify the path for the database
DB_PATH = "my_fastdb_data"

# Create a new database with a table for 'Point' features, capacity 1000.
# The name parameter is optional; if not provided, a default name is
# generated from the feature class name. Here we explicitly set the
# table name to 'points'.
db = fastdb4py.ORM.truncate([
    fastdb4py.TableDefn(Point, 1000, name='points'),
])
```

You can access the table using the feature class as a key. Features can be accessed by index or iterated over.
```python
# Access the table 'points', whose schema is defined by the Point feature class
points_table = db[Point]['points']

# If you did not specify a table name when creating the database,
# you can access it using the default name:
# points_table = db[Point][Point]
# or
# points_table = db[Point]['Point']

# Ensure we are in write mode (needed if the database was loaded from a
# file; a newly created in-memory database is already writable).
for i in range(10):
    # Access the feature at index i
    p = points_table[i]
    # Set field values
    p.x = i * 1.5
    p.y = i * 2.5

# Save the database to disk
db.save(DB_PATH)
```

fastdb supports high-performance columnar access using NumPy arrays.
This allows for vectorized operations on your data.
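To illustrate the kind of work this enables, here is a small sketch using plain NumPy arrays that stand in for the `x`/`y` columns of the `points` table above (stand-in data, not the fastdb API itself):

```python
import numpy as np

# Stand-in columns with the same values the write loop produced:
# x = i * 1.5, y = i * 2.5 for i in range(10).
xs = np.arange(10) * 1.5
ys = np.arange(10) * 2.5

# One vectorized expression computes the distance of every point
# from the origin, with no Python-level loop.
dists = np.hypot(xs, ys)
print(f"First 3 distances: {dists[:3]}")
```

With fastdb's columnar access, the same expressions operate directly on the table's memory instead of copies.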
```python
# Load the database from disk
db = fastdb4py.ORM.load(DB_PATH, from_file=True)
points_table = db[Point]['points']

# The length of the table (number of rows) can be obtained using len()
print(f"Number of points: {len(points_table)}")

# Access fields as NumPy arrays via the `.column` property
xs = points_table.column.x
ys = points_table.column.y
print(f"First 5 X values: {xs[:5]}")

# Modify data in bulk using NumPy operations.
# This modifies the data in memory directly!
xs += 10.0

# Verify the change via object access
print(f"Point 0 x: {points_table[0].x}")  # Should be 0 * 1.5 + 10.0 = 10.0
```

This project uses a DevContainer for the development environment. Refer to the .devcontainer/devcontainer.example.json file for configuration details.
To set up the development environment, make sure Docker or Podman and the VSCode DevContainer extension are installed. Open the project in VSCode and create the .devcontainer/devcontainer.json file based on the provided example.
After connecting to the DevContainer, you can develop and test the project within the containerized environment.
The py_utils.sh script is provided to facilitate common development tasks related to the Python bindings of fastdb. When first launching the DevContainer, py_utils.sh will automatically set up a Python virtual environment and install the necessary dependencies.
```shell
# Remove C++ build artifacts and the core Python bindings
# (fastdb.core, auto-generated by SWIG) from the Python package.
./py_utils.sh --clean

# Build the C++ core library and the Python bindings.
./py_utils.sh --build

# Run the Python unit tests for the fastdb package.
./py_utils.sh --test
```