Generated: 2026-02-15
Repository: infinityabundance/ds-runtime
Status: Experimental DirectStorage-style runtime for Linux
This document provides a deep inspection of the ds-runtime codebase, comparing:
- What is documented vs what is implemented
- What is working vs what is incomplete/stub
- What is claimed vs what is missing
The current codebase does not compile. There are missing implementations that prevent successful builds:
- Missing field:
Request::bytes_transferredreferenced but not declared in header - Missing method:
Impl::take_completed()called but not implemented
ds-runtime/
├── include/ # Public API headers
│ ├── ds_runtime.hpp # Core API (Queue, Backend, Request)
│ ├── ds_runtime_c.h # C ABI wrapper for Wine/Proton
│ ├── ds_runtime_vulkan.hpp # Vulkan backend interface
│ └── ds_runtime_uring.hpp # io_uring backend interface
├── src/ # Implementation
│ ├── ds_runtime.cpp # CPU backend + Queue orchestration
│ ├── ds_runtime_c.cpp # C ABI implementation
│ ├── ds_runtime_logging.cpp # Error reporting
│ ├── ds_runtime_vulkan.cpp # Vulkan backend (experimental)
│ └── ds_runtime_uring.cpp # io_uring backend (experimental)
├── tests/ # Test suite (4 tests)
│ ├── basic_queue_test.cpp
│ ├── c_abi_stats_test.c
│ ├── compression_gdeflate_stub_test.cpp
│ └── io_uring_backend_test.cpp
├── examples/ # Demo programs
│ ├── ds_demo_main.cpp # Basic CPU demo
│ ├── asset_streaming_main.cpp # Asset streaming demo
│ └── vk-copy-test/ # Vulkan groundwork (unused)
│ ├── copy.comp # GLSL compute shader
│ ├── copy.comp.spv # Precompiled SPIR-V
│ └── vk_copy_test.cpp # GPU copy demo
└── docs/ # Design documentation
├── design.md
├── wine_proton.md
└── archlinux_vulkan_integration.md
src/ds_runtime.cpp:298:21: error: 'struct ds::Request' has no member named 'bytes_transferred'
src/ds_runtime.cpp:441:39: error: 'struct ds::Request' has no member named 'bytes_transferred'
src/ds_runtime.cpp:546:19: error: 'struct ds::Queue::Impl' has no member named 'take_completed'
Location: Referenced in src/ds_runtime.cpp lines 298, 441
Status: ❌ Field declared in API but not in header
Impact: Critical - prevents compilation
The implementation tracks bytes transferred in:
- Error path (line 298):
req.bytes_transferred = 0; - Stats aggregation (line 441):
total_bytes_transferred_.fetch_add(...)
But include/ds_runtime.hpp struct Request does NOT include this field.
Location: Called in src/ds_runtime.cpp line 546
Status: ❌ Method declared in public API but not implemented
Impact: Critical - prevents compilation
The public API declares:
std::vector<Request> Queue::take_completed();But Queue::Impl has no implementation of this method. The infrastructure exists:
completed_vector stores finished requests- Mutex
mtx_protects access - But no method to extract and clear the list
| Backend | Status | Working | Notes |
|---|---|---|---|
| CPU (pread/pwrite) | ✅ Implemented | Cannot verify (build broken) | ThreadPool + POSIX I/O |
| Vulkan | Unknown | Staging buffers only, no compute | |
| io_uring | Unknown | Host memory only |
- ThreadPool: ✅ Implemented (fixed-size, FIFO queue)
- pread/pwrite: ✅ Implemented
- Error handling: ✅ Implemented (rich context)
- Request validation: ✅ Implemented (fd, size, buffer checks)
- Completion callbacks: ✅ Implemented
- Stats tracking:
⚠️ Broken (references missingbytes_transferred)
What EXISTS:
- ✅ Vulkan instance/device/queue creation
- ✅ Staging buffer allocation (host-visible, transient)
- ✅
vkCmdCopyBuffer(staging ↔ GPU) - ✅ Command pool & command buffer management
- ✅ Memory type selection & allocation
- ✅ Synchronization (
vkDeviceWaitIdle)
What's MISSING:
- ❌ Compute pipelines (no
vkCreateComputePipelines) - ❌ GPU decompression (no shader invocation)
- ❌ Descriptor sets (needed for compute)
- ❌ Pipeline layout (needed for compute)
- ❌ Shader module loading (SPIR-V exists but unused)
Actual Capability: File → Staging → GPU copy (pure data transfer, no compute)
What EXISTS:
- ✅
io_uring_queue_init()with 256 entries - ✅ SQE submission (
io_uring_prep_read/write) - ✅ CQE completion handling
- ✅ Single worker thread
- ✅ Batch submission/completion
What's MISSING:
- ❌ GPU memory support (explicitly rejected with
EINVAL) - ❌ Decompression (no compression handling)
- ❌ Multi-queue (single worker only)
⚠️ worker_countfield exists but unused
Actual Capability: Kernel async I/O for host buffers only
| Feature | Status | Working | Implementation |
|---|---|---|---|
| None (pass-through) | ✅ Implemented | Yes | Default behavior |
| FakeUppercase (demo) | ✅ Implemented | Yes | ASCII uppercase transform |
| GDeflate | ❌ STUB | NO | Returns error via callback |
enum class Compression {
None, // ✅ Works
FakeUppercase, // ✅ Works (demo only)
GDeflate // ❌ NOT IMPLEMENTED
};- Declaration: Enum value exists in
ds_runtime.hpp - Implementation: NONE - no decompression logic
- Test:
tests/compression_gdeflate_stub_test.cppverifies it fails - Error behavior: Runtime calls error callback when requested
- Roadmap: Listed as "◻️ Real compression format (CPU GDeflate first)" - NOT STARTED
Critical Note: The test explicitly checks that GDeflate does not work - this is intentional stub behavior.
| Feature | Status | Notes |
|---|---|---|
| Request struct | Missing bytes_transferred field |
|
| Queue enqueue | ✅ Implemented | Thread-safe |
| Queue submit_all | ✅ Implemented | Batch submission |
| Queue wait_all | ✅ Implemented | Blocks on in-flight |
| Queue in_flight | ✅ Implemented | Atomic counter |
| Queue take_completed | ❌ STUB | Declared but not implemented |
| Completion callbacks | ✅ Implemented | Worker thread execution |
Declared in ds_runtime.hpp:
struct Request {
int fd;
std::uint64_t offset;
std::size_t size;
void* dst;
const void* src;
void* gpu_buffer;
std::uint64_t gpu_offset;
RequestOp op;
RequestMemory dst_memory;
RequestMemory src_memory;
Compression compression;
RequestStatus status;
int errno_value;
// ❌ MISSING: bytes_transferred
};Referenced in ds_runtime.cpp but NOT DECLARED:
- Line 298:
req.bytes_transferred = 0; - Line 441:
total_bytes_transferred_.fetch_add(completed_req.bytes_transferred, ...)
This is a critical oversight - the implementation assumes a field that doesn't exist.
Public API (ds_runtime.hpp line 219):
std::vector<Request> take_completed();Implementation (ds_runtime.cpp line 545-547):
std::vector<Request> Queue::take_completed() {
return impl_->take_completed(); // ❌ Method doesn't exist in Impl
}What EXISTS:
Queue::Impl::completed_vector stores finished requests- Mutex
mtx_protects access - Completion callback populates the vector (line 433)
What's MISSING:
- Actual implementation to extract and clear
completed_ - Should likely be:
std::vector<Request> take_completed() { std::lock_guard<std::mutex> lock(mtx_); std::vector<Request> result; result.swap(completed_); return result; }
| Feature | Status | Working |
|---|---|---|
| ErrorContext struct | ✅ Implemented | Yes |
| set_error_callback | ✅ Implemented | Yes |
| report_error | ✅ Implemented | Yes |
| report_request_error | ✅ Implemented | Yes |
| Rich context | ✅ Implemented | subsystem/operation/detail/file/line/errno |
Error reporting is fully implemented and includes:
- Process-wide callback registration
- Subsystem tagging (cpu, vulkan, io_uring)
- Source location tracking (
__FILE__,__LINE__,__func__) - Request context (fd, offset, size, memory type)
- Timestamp tracking
| Feature | Status | Notes |
|---|---|---|
| C header | ✅ Complete | ds_runtime_c.h |
| C implementation | ✅ Complete | Enum/struct conversions |
| C test | ✅ Exists | c_abi_stats_test.c |
| Wine shim | ❌ Not implemented | Documented but no code |
| Proton integration | ❌ Not implemented | Documented but no code |
The C ABI wrapper exists and provides:
- Type conversions (C enums ↔ C++ enums)
- Opaque handle types (
ds_queue_t,ds_backend_t) - Function wrappers for all core operations
But actual Wine/Proton integration code does not exist - only documentation.
| Claim | Reality | Evidence |
|---|---|---|
| "Status: Experimental" | ✅ Accurate | Build broken, features missing |
| "Backend: CPU (implemented)" | Build fails | |
| "GPU/Vulkan backend: Experimental (file ↔ GPU buffer transfers)" | Only copy, no compute | |
| "io_uring backend (experimental)" | Host-only, build unknown | |
| "Demo 'decompression' stage (uppercase transform); GDeflate is stubbed" | ✅ Accurate | Test confirms stub |
| "Roadmap: ✅ Vulkan backend" | Only copy works, compute missing | |
| "Roadmap: ✅ io_uring backend" | Build broken | |
| "Roadmap: ◻️ Real compression format (CPU GDeflate first)" | ✅ Accurate | Not implemented |
| Section | Claim | Reality |
|---|---|---|
| CPU backend | "current", "provides: POSIX I/O, thread pool, decompression stage" | |
| Vulkan backend | "Read file data into staging buffers, Issue Vulkan buffer copies, Dispatch compute workloads" | |
| io_uring backend | "Submit read/write via kernel ring, Invoke completion callbacks, Reject GPU requests" | |
| Compression | "Planned progression: 1. CPU-based GDeflate (currently stubbed with ENOTSUP)" | ✅ Accurate - explicitly stubbed |
| Claim | Reality |
|---|---|
"native Linux shared library (libds_runtime.so)" |
|
"C ABI (include/ds_runtime_c.h)" |
✅ Exists and complete |
| "can be used as backend for Wine/Proton-facing shim" | ❌ No shim code exists |
| "Recommended integration path" | ✅ Documented but not implemented |
| Test | Purpose | Status |
|---|---|---|
basic_queue_test.cpp |
Queue operations | |
c_abi_stats_test.c |
C ABI validation | |
compression_gdeflate_stub_test.cpp |
Verifies GDeflate FAILS | |
io_uring_backend_test.cpp |
io_uring backend |
- CMake option:
DS_BUILD_TESTS=ON - Test framework: None (plain executables with exit codes)
- Coverage: Minimal - only 4 tests
- Actual execution: IMPOSSIBLE - build broken
| Example | Purpose | Status |
|---|---|---|
ds_demo_main.cpp |
CPU backend demo (raw + uppercase) | |
asset_streaming_main.cpp |
Concurrent reads with offsets | |
vk-copy-test/vk_copy_test.cpp |
Vulkan GPU copy demo |
Precompiled artifacts:
vk-copy-test/copy.comp.spv- SPIR-V shader (256 bytes)- Not used by any code - Vulkan backend doesn't load/use shaders
- ❌ Request::bytes_transferred - field missing from struct definition
- ❌ Queue::Impl::take_completed() - method declared but not implemented
- ❌ GDeflate compression - enum exists, runtime fails (intentional stub)
- ❌ Vulkan compute pipelines - only data copy works, no GPU compute
- ❌ GPU decompression - no shader execution implemented
⚠️ io_uring GPU support - explicitly rejected (host-only)
- ❌ Wine/Proton shim - documented integration but no code
⚠️ Request cancellation - noCancelledstatus in enum⚠️ Shader loading - SPIR-V exists but unused⚠️ Multi-queue io_uring - single worker only
⚠️ Comprehensive tests - only 4 basic tests⚠️ Performance metrics - stats infrastructure exists but untested⚠️ Documentation examples - README examples cannot run
Given the build is broken, nothing can be verified as working. However, analyzing the code:
- ✅ CPU ThreadPool (standard pattern, complete implementation)
- ✅ Error reporting infrastructure (complete, well-structured)
- ✅ POSIX I/O calls (pread/pwrite - simple, direct)
- ✅ FakeUppercase transform (trivial ASCII uppercase)
- ✅ Request validation (comprehensive checks)
- ✅ Mutex/atomic synchronization (standard patterns)
- ❌ Anything requiring compilation - build broken
- ❌ GDeflate decompression - intentional stub
- ❌ GPU compute - not implemented
- ❌ take_completed() - not implemented
- ❌ bytes_transferred tracking - field missing
Priority: CRITICAL
Effort: 1-2 hours
-
Add
bytes_transferredto Request struct- File:
include/ds_runtime.hpp - Add:
std::size_t bytes_transferred = 0; - Location: After
errno_valuefield (line ~84)
- File:
-
Implement
Queue::Impl::take_completed()- File:
src/ds_runtime.cpp - Location: After
in_flight()method (~line 478) - Implementation:
std::vector<Request> take_completed() { std::lock_guard<std::mutex> lock(mtx_); std::vector<Request> result; result.swap(completed_); return result; }
- File:
-
Update bytes_transferred in success path
- File:
src/ds_runtime.cpp - Location: After successful I/O (~line 301)
- Add:
req.bytes_transferred = static_cast<std::size_t>(io_bytes);
- File:
-
Verify build succeeds
cmake --build build ctest --test-dir build --verbose
Success Criteria:
- ✅ Project compiles without errors
- ✅ Tests can be executed
- ✅ Examples can be built
Priority: HIGH
Effort: 2-4 hours
Depends on: Phase 1 complete
-
Run existing tests
- Execute all 4 tests
- Document results (pass/fail)
- Capture any runtime errors
-
Run CPU demo
- Execute
ds_demo - Verify raw read works
- Verify FakeUppercase works
- Document output
- Execute
-
Run asset streaming demo
- Execute
ds_asset_streaming - Verify concurrent reads work
- Check error reporting
- Document output
- Execute
-
Test take_completed() API
- Create simple test that:
- Enqueues requests
- Submits and waits
- Calls take_completed()
- Verifies list matches submissions
- Create simple test that:
-
Test bytes_transferred tracking
- Create test that:
- Submits various size requests
- Checks bytes_transferred matches size
- Verifies partial reads set correctly
- Create test that:
Success Criteria:
- ✅ All existing tests pass
- ✅ CPU backend demonstrably works
- ✅ New fields/methods work correctly
- ✅ Error paths tested
Priority: MEDIUM
Effort: 1-2 weeks
Depends on: Phase 2 complete
-
Shader module loading
- Load SPIR-V from
copy.comp.spv - Create
VkShaderModule - Validate shader compilation
- Load SPIR-V from
-
Compute pipeline creation
- Define pipeline layout
- Create descriptor set layout (input/output buffers)
- Build compute pipeline
-
Descriptor management
- Allocate descriptor pool
- Create descriptor sets
- Update bindings for buffer ops
-
Compute dispatch
- Record compute commands
- Bind pipeline + descriptors
- Dispatch workgroups
- Add compute barriers
-
GPU decompression infrastructure
- Define decompression interface
- Add compression format detection
- Implement placeholder GPU transform
-
Testing
- Test GPU copy via compute (not just vkCmdCopyBuffer)
- Test staging → compute → GPU path
- Benchmark vs CPU backend
Success Criteria:
- ✅ Compute pipeline executes on GPU
- ✅ Buffer transforms work (even if simple)
- ✅ Vulkan validation layers pass
- ✅ Performance metrics captured
Priority: MEDIUM
Effort: 2-4 weeks
Depends on: Phase 3 complete (for GPU path)
-
Research GDeflate spec
- Study Microsoft DirectStorage GDeflate format
- Identify decoder requirements
- Survey existing implementations
-
CPU GDeflate decoder
- Implement basic decoder
- Handle chunk boundaries
- Add error handling
- Unit test against known data
-
Integrate CPU path
- Wire decoder into CpuBackend
- Replace error with actual decompression
- Update
compression_gdeflate_stub_test.cppto expect success
-
GPU GDeflate decoder (optional)
- Port algorithm to GLSL compute shader
- Handle GPU buffer constraints
- Optimize for wavefront/warp sizes
-
Benchmark
- Compare CPU vs GPU performance
- Test on realistic game assets
- Document compression ratios
Success Criteria:
- ✅ CPU GDeflate decompression works
- ✅ Tests pass with real compressed data
- ✅ GPU path (if implemented) faster than CPU
- ✅ Roadmap updated to "✅ GDeflate"
Priority: LOW
Effort: 1 week
Depends on: Phase 2 complete
-
Multi-worker support
- Honor
worker_countconfig - Create multiple io_uring instances
- Load-balance requests
- Honor
-
Advanced features
- Implement linked operations
- Add timeout support
- Handle IORING_OP_READ_FIXED
-
Testing
- Stress test with concurrent submissions
- Verify completion ordering
- Benchmark vs CPU backend
Success Criteria:
- ✅ io_uring backend feature-complete
- ✅ Performance exceeds CPU backend
- ✅ Production-ready error handling
Note: GPU support likely impossible for io_uring (host-only by design)
Priority: LOW (requires external coordination)
Effort: 4-8 weeks
Depends on: Phases 1-4 complete
-
Create dstorage.dll shim
- Map DSTORAGE_REQUEST to ds::Request
- Implement queue forwarding
- Handle Windows file handles → fds
-
Integrate with Wine
- Build as Wine builtin DLL
- Configure override
- Test with DirectStorage apps
-
Integrate with Proton
- Link with vkd3d-proton
- Share Vulkan device/queue
- Test in Steam runtime
-
Real game testing
- Test with actual DirectStorage games
- Capture performance metrics
- Identify missing features
Success Criteria:
- ✅ Games run without DirectStorage crashes
- ✅ Asset loading works
- ✅ Performance acceptable
- ✅ Roadmap updated to "✅ Wine/Proton integration"
- Build system fragility - Currently broken, may have deeper issues
- Vulkan complexity - GPU compute is non-trivial, requires validation
- GDeflate spec - May be proprietary/undocumented
- Wine integration - Requires coordination with Wine maintainers
- Performance - No benchmarks, may not meet expectations
- Maintenance burden - Small codebase but complex domain
- Testing gap - Only 4 tests, needs comprehensive suite
- Documentation drift - Claims don't match reality already
- Dependency complexity - Vulkan, io_uring, Wine all optional but critical
- ✅ Fix build - Add missing field and method (Phase 1)
- ✅ Update README - Clarify build is broken in current state
- ✅ Add CI - GitHub Actions to catch build failures
- ✅ Document status - Create ANALYSIS.md (this document)
- ✅ Validate core - Run all tests, verify CPU backend (Phase 2)
- ✅ Add tests - Cover take_completed() and bytes_transferred
- ✅ Update docs - Reflect actual implementation state
⚠️ Roadmap honesty - Change ✅ to⚠️ for partial features
⚠️ Complete Vulkan - Add compute pipelines (Phase 3)⚠️ Implement GDeflate - Replace stub (Phase 4)⚠️ Wine integration - Actual shim code (Phase 6)⚠️ Performance testing - Benchmarks vs DirectStorage
The ds-runtime project is an excellent architectural foundation with:
- ✅ Clean API design
- ✅ Proper separation of concerns (Queue vs Backend)
- ✅ Extensible backend model
- ✅ Rich error reporting
- ✅ C ABI for FFI integration
However, it is not production-ready and cannot currently build:
- ❌ Critical compilation errors (missing field, missing method)
- ❌ No real compression support (intentional stub)
- ❌ Incomplete Vulkan backend (copy only, no compute)
- ❌ No Wine/Proton shim (documentation only)
⚠️ Minimal test coverage (4 tests, none runnable)
Status is accurately labeled "Experimental" but README claims need clarification:
- Change "✅ Vulkan backend" to "
⚠️ Vulkan backend (copy only)" - Change "✅ io_uring backend" to "
⚠️ io_uring backend (host-only, untested)" - Add note: "
⚠️ Current build is broken - see ANALYSIS.md"
Primary value: Reference implementation and learning resource, not drop-in solution.
Path forward: Complete Phase 1 (fix build), then Phase 2 (validate), then decide on Phase 3+ based on goals.
- Error reporting infrastructure (set_callback, report_error)
- C ABI wrapper (ds_runtime_c.h)
- ThreadPool implementation (standard pattern)
- CPU backend (pread/pwrite)
- FakeUppercase compression
- Queue orchestration
- Request validation
- Vulkan staging buffer copies
- Build system (compilation errors)
- take_completed() (not implemented)
- bytes_transferred (field missing)
- GDeflate (intentional stub)
- GPU compute (not implemented)
- Wine shim (no code)
| File | LOC | Status | Issues |
|---|---|---|---|
ds_runtime.hpp |
248 | Missing bytes_transferred in Request |
|
ds_runtime.cpp |
549 | ❌ Broken | References missing field/method |
ds_runtime_vulkan.hpp |
~150 | ✅ Complete | Interface only |
ds_runtime_vulkan.cpp |
~800 | No compute pipelines | |
ds_runtime_uring.hpp |
~100 | ✅ Complete | Interface only |
ds_runtime_uring.cpp |
~400 | Host-only, worker_count unused | |
ds_runtime_c.h |
~200 | ✅ Complete | C ABI wrapper |
ds_runtime_c.cpp |
~300 | ✅ Complete | C ABI implementation |
ds_runtime_logging.cpp |
~150 | ✅ Complete | Error reporting |
End of Analysis