Implement Write-Ahead Log with recovery and skip list concurrency improvements#22
Implement Write-Ahead Log with recovery and skip list concurrency improvements#22Souvik606 wants to merge 19 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds a framed WAL (CRC32 + size + opcode + key-length + key + value), exported errors and Record marshal/unmarshal, Replay recovery with truncation behavior, a batched rotating LogWriter with fsync and Close semantics, extensive WAL unit tests, and two SkipList concurrency stress tests. ChangesWrite-Ahead Log (WAL) Durability Implementation
SkipList Concurrency Testing
Sequence Diagram(s) sequenceDiagram
participant Client
participant LogWriter
participant BatchWorker
participant FileSystem
Client->>LogWriter: Append(record)
LogWriter->>BatchWorker: enqueue commitTicket
BatchWorker->>FileSystem: write batch of frames
FileSystem->>FileSystem: fsync()
FileSystem-->>BatchWorker: write/fsync result
BatchWorker-->>LogWriter: complete tickets (errors)
LogWriter-->>Client: Append result
sequenceDiagram
participant Startup
participant Replay
participant FileSystem
participant Unmarshal
participant MemTable
Startup->>Replay: Replay(directory, memtable)
Replay->>FileSystem: open & read segment header+payload
FileSystem->>Unmarshal: pass frame
Unmarshal->>Replay: validate CRC/unmarshal
Replay->>MemTable: Put/Delete applied
Replay-->>Startup: return highest segment id or error
Assessment against linked issues
Comment |
Signed-off-by: Souvik Mukherjee <115608850+Souvik606@users.noreply.github.com>
|
@coderabbitai set pr title |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: #!/bin/bash
# Fetch current PR details
gh pr view 22 --repo makeshift-engineering/penguin-db --json title,body,filesLength of output: 1921 🏁 Script executed: #!/bin/bash
gh pr edit 22 --repo makeshift-engineering/penguin-db --title "feat: implement Write-Ahead Log (WAL) with recovery and skip list concurrency improvements"Length of output: 464
This title reflects the key changes in the PR:
|
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (1)
internal/storage/wal/writer.go (1)
107-116: ⚡ Quick winConsider clearing batch slice references to avoid potential memory retention.
Lines 108–109 reuse
commitBatchandwriteBufferby truncating to[:0]and appending new data. While the slices are reused efficiently, the underlying array may still hold pointers to oldcommitTicketobjects if a previous batch was larger than the current one. Those tickets will not be garbage-collected until the array is reallocated or overwritten.For example, if one batch has 10,000 tickets and the next has 10, the array retains 9,990 stale ticket pointers until they are eventually overwritten.
♻️ Proposed fix
Nil out ticket references after signaling to allow garbage collection:
for _, ticket := range commitBatch { ticket.resultChan <- err } + for i := range commitBatch { + commitBatch[i] = nil + } } }This ensures that completed tickets can be collected promptly, preventing retention of large batches in memory.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 02fd8400-1f9e-40da-8a94-9c5f32793bd2
📒 Files selected for processing (9)
internal/storage/memtable/skiplist_test.gointernal/storage/wal/.gitkeepinternal/storage/wal/errors.gointernal/storage/wal/record.gointernal/storage/wal/record_test.gointernal/storage/wal/recovery.gointernal/storage/wal/recovery_test.gointernal/storage/wal/writer.gointernal/storage/wal/writer_test.go
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/storage/wal/recovery.go (1)
43-48:⚠️ Potential issue | 🟠 Major | ⚡ Quick winCheck
fmt.Sscanfreturn value to satisfy errcheck and handle malformed filenames.The return value is unchecked. If a file named e.g.
"abc.wal"exists,segmentIDremains 0 while the file is still replayed—silently corrupting segment-id tracking. Consider skipping files that don't match the expected pattern.🛡️ Proposed fix
for _, fileName := range walFiles { var segmentID int - fmt.Sscanf(fileName, "%d.wal", &segmentID) + if n, _ := fmt.Sscanf(fileName, "%d.wal", &segmentID); n != 1 { + slog.Debug("skipping WAL file with unexpected name format", "file", fileName) + continue + } if segmentID > highestSegmentID { highestSegmentID = segmentID }
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 96173958-2688-4048-946d-a169d99a40ca
📒 Files selected for processing (4)
internal/storage/memtable/skiplist_test.gointernal/storage/wal/recovery.gointernal/storage/wal/recovery_test.gointernal/storage/wal/writer.go
🚧 Files skipped from review as they are similar to previous changes (3)
- internal/storage/memtable/skiplist_test.go
- internal/storage/wal/writer.go
- internal/storage/wal/recovery_test.go
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
internal/storage/wal/recovery_test.go (2)
228-230: ⚡ Quick winHandle errors explicitly in test setup.
These test setup blocks ignore errors when opening files. If
os.OpenFilefails, the subsequentWriteorCloseoperations will panic, making test failures harder to diagnose.🛡️ Proposed fix for explicit error handling
For line 228-230:
- f, _ := os.OpenFile(path, os.O_APPEND|os.O_WRONLY, 0644) + f, err := os.OpenFile(path, os.O_APPEND|os.O_WRONLY, 0644) + if err != nil { + t.Fatal(err) + } f.Write(badFrame) f.Close()Apply similar changes to lines 262-264 and 290-293.
Also applies to: 262-264, 290-293
440-442: ⚡ Quick winHandle errors explicitly in test setup.
Similar to earlier tests, these blocks ignore errors when opening files. Explicit error handling improves test diagnostics.
🛡️ Proposed fix
For line 440-442:
- f, _ := os.OpenFile(path, os.O_APPEND|os.O_WRONLY, 0644) + f, err := os.OpenFile(path, os.O_APPEND|os.O_WRONLY, 0644) + if err != nil { + t.Fatal(err) + } f.Write(hdr) f.Close()Apply a similar change to lines 474-476.
Also applies to: 474-476
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 88bf4d59-f5d8-441b-8b2e-562b6f2a4fc9
📒 Files selected for processing (8)
internal/storage/memtable/skiplist_test.gointernal/storage/wal/errors.gointernal/storage/wal/record.gointernal/storage/wal/record_test.gointernal/storage/wal/recovery.gointernal/storage/wal/recovery_test.gointernal/storage/wal/writer.gointernal/storage/wal/writer_test.go
🚧 Files skipped from review as they are similar to previous changes (7)
- internal/storage/wal/errors.go
- internal/storage/memtable/skiplist_test.go
- internal/storage/wal/record.go
- internal/storage/wal/record_test.go
- internal/storage/wal/writer.go
- internal/storage/wal/recovery.go
- internal/storage/wal/writer_test.go
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
internal/storage/wal/record.go (1)
82-90: ⚡ Quick winValidate the stored frame size during decode.
UnmarshalRecordnever checks the encodedFrame Sizefield againstlen(frameData), so that header is effectively write-only. Adding that check would catch malformed frames earlier and make the framing contract self-consistent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 51e5d53d-ef64-444a-a210-d805ac12f993
📒 Files selected for processing (1)
internal/storage/wal/record.go
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/storage/wal/recovery.go (1)
129-138:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDon't truncate the segment on non-EOF payload read failures.
This branch turns any
io.ReadFullerror intoTruncate(validBytes)+Sync(). If the underlying read fails for an actual disk/I/O reason, recovery will mutate the WAL and discard data instead of surfacing the read failure. Onlyio.EOF/io.ErrUnexpectedEOFshould take the truncation path.Suggested fix
payloadBuffer := make([]byte, payloadSizeBytes) _, err = io.ReadFull(file, payloadBuffer) if err != nil { - slog.Debug("unexpected EOF in payload, truncating segment", - "file", filepath.Base(filePath), - "valid_bytes", validBytes) - - if truncErr := file.Truncate(validBytes); truncErr != nil { - return truncErr - } - return file.Sync() + if errors.Is(err, io.EOF) || errors.Is(err, io.ErrUnexpectedEOF) { + slog.Debug("unexpected EOF in payload, truncating segment", + "file", filepath.Base(filePath), + "valid_bytes", validBytes) + + if truncErr := file.Truncate(validBytes); truncErr != nil { + return truncErr + } + return file.Sync() + } + return fmt.Errorf("unexpected disk error reading frame payload: %w", err) }
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 0adeb482-0dc2-494b-be91-0ea2a0f91f38
📒 Files selected for processing (2)
internal/storage/wal/recovery.gointernal/storage/wal/writer.go
|
|
||
| info, err := file.Stat() | ||
| if err != nil { | ||
| file.Close() |
Issue Reference
Summary by CodeRabbit
New Features
Tests