Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
19 commits
Select commit Hold shift + click to select a range
47e7280
feat: add comprehensive test suite and thread safety analysis
shaia Nov 1, 2025
1d6a14b
feat: implement thread-safe bloom filter with lock-free atomic operat…
shaia Nov 1, 2025
4b27e48
fix: address critical pool storage bug and optimize thread-safe opera…
shaia Nov 1, 2025
720ead9
fix: critical defer-in-loop bug causing pool exhaustion
shaia Nov 1, 2025
098c089
docs: remove outdated THREAD_SAFETY_ANALYSIS.md
shaia Nov 1, 2025
2c50d46
perf: optimize batch operations to reuse pooled storage across items
shaia Nov 1, 2025
a0bd1a2
fix: eliminate nested pool operations in batch functions causing race…
shaia Nov 1, 2025
9d7e441
ci: simplify workflow by removing redundant race detector job
shaia Nov 1, 2025
778bed6
refactor: improve code quality with modern Go patterns and accurate d…
shaia Nov 1, 2025
22acf51
fix: apply nested pool operation fix to AddBatch function
shaia Nov 1, 2025
af8f8a3
perf: reduce test workload under race detector to prevent timeouts
shaia Nov 1, 2025
4bfe710
fix: scale down test pre-population for race detector and update docs
shaia Nov 1, 2025
55a6bb9
docs: update example to showcase thread-safety and batch operations
shaia Nov 1, 2025
bf345d0
refactor: add defensive copying of pooled storage slices in AddBatch …
shaia Nov 1, 2025
71568fc
fix: reduce race detector test workload and fix verification logic
shaia Nov 1, 2025
9d5b11d
fix: enable cross-platform testing in GitHub Actions workflow
shaia Nov 1, 2025
13b9aeb
refactor: optimize CI workflow to reduce redundancy
shaia Nov 1, 2025
65f529e
fix: prevent duplicate workflow runs on PR branches
shaia Nov 1, 2025
75e8edc
fix: update bloomfilter and tests
shaia Nov 2, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
74 changes: 74 additions & 0 deletions .github/workflows/test.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
name: Tests

on:
push:
branches: [ main ]
pull_request:
branches: [ main ]

jobs:
test:
name: Run Tests
runs-on: ubuntu-latest
strategy:
matrix:
go-version: ['1.23.x']

steps:
- name: Checkout code
uses: actions/checkout@v4

- name: Set up Go
uses: actions/setup-go@v5
with:
go-version: ${{ matrix.go-version }}
cache: true

- name: Verify dependencies
run: go mod verify

- name: Run go vet
run: go vet ./...

- name: Run tests
run: go test -v ./...

- name: Run tests with race detector
run: go test -race -short -timeout=10m -v ./...
env:
GORACE: "halt_on_error=1 log_path=race"

- name: Upload race detector logs
if: failure()
uses: actions/upload-artifact@v4
with:
name: race-logs
path: race.*
retention-days: 7

- name: Run benchmark tests (dry run)
run: go test -bench=. -benchtime=100ms -run=^$ ./tests/benchmark/...

build:
name: Build
runs-on: ${{ matrix.os }}
strategy:
matrix:
go-version: ['1.23.x']
os: [ubuntu-latest, windows-latest, macos-latest]
Copy link

Copilot AI Nov 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The os matrix variable is defined but the job is hardcoded to run on ubuntu-latest (line 80: runs-on: ubuntu-latest). Change line 80 to runs-on: ${{ matrix.os }} to actually test on all three platforms.

Copilot uses AI. Check for mistakes.

steps:
- name: Checkout code
uses: actions/checkout@v4

- name: Set up Go
uses: actions/setup-go@v5
with:
go-version: ${{ matrix.go-version }}
cache: true

- name: Build
run: go build -v ./...

- name: Test build with race detector enabled
run: go build -race -v ./...
53 changes: 45 additions & 8 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,26 +7,63 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added

- **Thread-Safety**: Full concurrent support with lock-free atomic operations
- Lock-free bit operations using atomic Compare-And-Swap (CAS)
- Bounded retry limits with exponential backoff under contention
- sync.Pool optimization for zero-allocation temporary storage reuse
- **Batch Operations**: High-throughput batch Add functions
- `AddBatch(items [][]byte)` - Batch byte slice operations
- `AddBatchString(items []string)` - Batch string operations with zero-copy conversion
- `AddBatchUint64(items []uint64)` - Batch uint64 operations
- Pooled resource reuse across batch items for optimal performance
- **Comprehensive Test Suite**: Thread-safety and performance validation
- Race detector integration with GitHub Actions CI/CD
- Concurrent read/write tests (100+ goroutines)
- Stress tests with millions of operations
- Edge case and boundary condition tests

### Changed

- Refactored codebase for better maintainability and readability
- Refactored codebase for better maintainability and thread-safety
- Split monolithic `bloomfilter.go` (660 lines) into focused modules:
- `bloomfilter.go` (394 lines): Core API and public interface
- `internal/hash/hash.go` (108 lines): Hash function implementations
- `internal/storage/storage.go` (186 lines): Hybrid storage abstraction
- Moved implementation details to `internal/` package following Go conventions
- `internal/storage/storage.go` (186 lines): Hybrid storage with sync.Pool
- Modernized unsafe string-to-byte conversion using Go 1.20+ stdlib (`unsafe.StringData`/`unsafe.Slice`)
- Optimized CI/CD workflow to use `-short` flag with race detector to prevent timeouts
- Fixed stack allocation comments based on escape analysis verification
- Eliminated 150+ lines of duplicate code between array and map modes
- Simplified complex functions by 59-65% (getHashPositionsOptimized, setBitCacheOptimized, getBitCacheOptimized)
- Simplified complex functions by 59-65% with proper resource pooling
- Added `IsArrayMode()` accessor method for better encapsulation
- Updated package structure documentation in README
- Updated all documentation to reflect thread-safety improvements

### Performance

- **Concurrent Writes**: 18-23M operations/second (50 goroutines)
- **Concurrent Reads**: 10M+ operations/second (100 goroutines)
- **Lock-Free Operations**: Zero mutex contention with atomic CAS
- **Resource Pooling**: Eliminates allocations in hot paths with sync.Pool
- **Race Detector Compatible**: Tests pass with race detector in <1 second (reduced workload)

### Fixed

- Critical nested pool operation bug in batch functions causing race detector timeouts
- Empty spin loop backoff properly documented (compiler optimization acceptable)
- Defer-in-loop bug that caused pool exhaustion under high concurrency
- Pool storage slice return bug that could cause data corruption
- CAS retry limit prevents indefinite spinning under extreme contention
- Defensive copying of pooled storage slices in AddBatch functions for consistency and future-proofing

### Quality Improvements

- Zero performance regression - all benchmarks unchanged
- All tests pass (18/18)
- Zero performance regression - improved concurrency performance
- All tests pass including race detector validation
- GitHub Actions CI/CD with automated race detection
- Better separation of concerns with clear module boundaries
- Internal packages cannot be imported by users, ensuring API stability
- Easier to maintain and extend codebase
- Production-ready thread-safety with comprehensive testing

## [0.2.0] - 2025-10-26

Expand Down
18 changes: 13 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,14 +6,16 @@ A high-performance, cache-line optimized bloom filter implementation in Go with

## Features

- **Thread-Safe**: Lock-free concurrent operations using atomic CAS with sync.Pool optimization
- **SIMD Acceleration**: Automatic detection and usage of AVX2, AVX512, and ARM NEON instructions
- **Cache-Optimized**: 64-byte aligned memory structures for optimal CPU cache performance
- **Hybrid Architecture**: Automatic array/map mode selection for optimal performance across all filter sizes
- **Batch Operations**: High-throughput batch Add functions with pooled resource reuse
- **Cross-Platform**: Supports x86_64 (Intel/AMD) and ARM64 architectures
- **High Performance**: 2.2x - 3.5x speedup with SIMD over scalar implementations
- **Memory Efficient**: 95% memory reduction for small filters, unlimited scalability for large filters
- **Zero Allocations**: Array mode operations with zero per-operation allocations for small filters
- **Production Ready**: Comprehensive test suite with 100% correctness validation
- **Production Ready**: Comprehensive test suite with race detection and 100% correctness validation

## Performance

Expand All @@ -31,8 +33,9 @@ A high-performance, cache-line optimized bloom filter implementation in Go with

### Throughput

- **Insertions**: ~2.1M operations/second
- **Lookups**: ~2.2M operations/second
- **Concurrent Writes**: 18-23M operations/second (50 goroutines)
- **Concurrent Reads**: 10M+ operations/second (100 goroutines)
- **Sequential Operations**: ~2M operations/second
- **False Positive Rate**: 0.05% (target: 1.0%)

### Hybrid Architecture Performance
Expand Down Expand Up @@ -314,12 +317,17 @@ func NewCacheOptimizedBloomFilter(
### Core Methods

```go
// Add operations
// Add operations (thread-safe, lock-free)
func (bf *CacheOptimizedBloomFilter) Add(data []byte)
func (bf *CacheOptimizedBloomFilter) AddString(s string)
func (bf *CacheOptimizedBloomFilter) AddUint64(n uint64)

// Contains operations
// Batch operations (optimized with pooled resources)
func (bf *CacheOptimizedBloomFilter) AddBatch(items [][]byte)
func (bf *CacheOptimizedBloomFilter) AddBatchString(items []string)
func (bf *CacheOptimizedBloomFilter) AddBatchUint64(items []uint64)

// Contains operations (thread-safe, lock-free)
func (bf *CacheOptimizedBloomFilter) Contains(data []byte) bool
func (bf *CacheOptimizedBloomFilter) ContainsString(s string) bool
func (bf *CacheOptimizedBloomFilter) ContainsUint64(n uint64) bool
Expand Down
55 changes: 36 additions & 19 deletions TESTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,8 +16,12 @@ BloomFilter/
│ ├── bloomfilter_benchmark_test.go # Performance benchmarks
│ └── bloomfilter_storage_mode_benchmark_test.go # Storage mode benchmarks
└── integration/
├── bloomfilter_concurrent_test.go # Thread-safety and concurrent operations tests
├── bloomfilter_edge_cases_test.go # Edge cases and boundary conditions tests
├── bloomfilter_race_test.go # Race detector tests (build tag: race)
├── bloomfilter_simd_comparison_test.go # SIMD comparison tests (build tag: simd_comparison)
└── bloomfilter_storage_mode_test.go # Storage mode selection tests
├── bloomfilter_storage_mode_test.go # Storage mode selection tests
└── bloomfilter_stress_test.go # Large-scale stress tests
```

## Test Categories
Expand Down Expand Up @@ -60,17 +64,30 @@ go test -bench=BenchmarkInsertion -cpuprofile=cpu.prof ./tests/benchmark

### 3. Integration Tests (tests/integration/)

Tests that verify interactions between components and cross-package functionality.
Tests that verify interactions between components, thread-safety, and cross-package functionality.

**Files:**
- `bloomfilter_concurrent_test.go` - Thread-safety tests with concurrent reads/writes
- `bloomfilter_edge_cases_test.go` - Edge cases, boundary conditions, and collision resistance
- `bloomfilter_race_test.go` - Race detector tests (build tag: `race`)
- `bloomfilter_simd_comparison_test.go` - SIMD vs fallback performance validation (build tag: `simd_comparison`)
- `bloomfilter_storage_mode_test.go` - Hybrid storage mode selection tests (array vs map)
- `bloomfilter_stress_test.go` - Large-scale stress tests (millions of operations)

**Running:**
```bash
# All integration tests (without build tags)
go test -v ./tests/integration

# Thread-safety tests
go test -v ./tests/integration -run=TestConcurrent

# With race detector (uses -short flag to reduce workload)
go test -race -short -v ./tests/integration

# Stress tests
go test -v ./tests/integration -run=TestLargeDataset

# Storage mode selection tests
go test -v ./tests/integration -run=TestHybridMode

Expand Down Expand Up @@ -230,23 +247,23 @@ func TestIntegrationScenario(t *testing.T) {

Tests are automatically run in GitHub Actions workflows:

### Pull Request Workflow
- Standard unit tests (`go test ./...`)
- Basic SIMD correctness tests
- Build validation

### Pre-Release Workflow
- All unit tests
- SIMD comparison tests (`-tags=simd_comparison`)
- Build validation
- Version validation

### Release Workflow
- Full test suite including integration tests
- SIMD performance validation
- Build for all platforms

See `.github/workflows/` for full workflow definitions.
### Tests Workflow (on push/PR)
- Standard unit tests (`go test -v ./...`)
- Race detector tests (`go test -race -short -timeout=10m -v ./...`)
- Uses `-short` flag to reduce workload for race detector (5-10x overhead)
- 10-minute timeout for comprehensive race detection
- Uploads race logs on failure
- Build validation for all platforms (Ubuntu, Windows, macOS)
- Build with race detector enabled
- Benchmark dry run

### Key Features
- **Race Detection**: Automated data race detection on every push/PR
- **Cross-Platform**: Tests on Ubuntu, Windows, and macOS
- **Comprehensive Coverage**: Unit, integration, and stress tests
- **Performance Validation**: Benchmark tests ensure no regressions

See [.github/workflows/test.yml](.github/workflows/test.yml) for full workflow definition.

## Test Coverage Goals

Expand Down
Loading
Loading