Add optional snapshot compression defaults and standby integration #149
sjmiller609 merged 34 commits into main
Co-authored-by: Steven Miller <sjmiller609@gmail.com>
The compression integration tests were using zstd level 19 and lz4 level 9, which are very slow for compressing ~1GB memory files. After merging main (which added more integration tests to lib/instances), the total package test time exceeded the 20-minute CI timeout.
Reduce to level 3 for both zstd and lz4 high-level cases. The tests still exercise the full compression/decompression pipeline across both algorithms and multiple levels (1 and 3 for zstd, 0 and 3 for lz4).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Five compression cycles (each involving VM standby + compress + restore + boot + exec readiness) consistently exceed the 20-minute CI timeout after merging main.
Reduce to three cycles: one in-flight zstd, one completed zstd, and one completed lz4. This still exercises both algorithms and both the in-flight/completed code paths.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… error handling
- Add descriptions to snapshot_policy and compression fields in openapi.yaml per Steven's review comments
- Check dst.Close() errors in runGoCompression and runGoDecompression to prevent silently corrupt snapshot files on delayed write failures
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
masnwilliams left a comment
solid feature — async compression design is clean, cancellation at every lifecycle point is thorough, and the standby body middleware is a nice backward-compat solution. main concern is the missing server-side validation for compression config (algorithm + level). see inline comments.
…metadata errors
- Validate algorithm (zstd/lz4) and per-algorithm level ranges in toDomainSnapshotCompressionConfig instead of passing through unchecked
- Log metadata update errors in compression jobs instead of silently discarding them
- Normalize algorithm to lowercase in config struct after validation
- Fix misleading test name (OmitsLevel -> PreservesLevel)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
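The validation described in this commit could look roughly like the sketch below. It is illustrative only: CompressionConfig, normalizeCompression, and the level bounds (1 to 19 for zstd, 0 to 9 for lz4) are assumptions chosen to match the levels mentioned in the test commits, not the limits enforced by toDomainSnapshotCompressionConfig.

```go
package main

import (
	"fmt"
	"strings"
)

// CompressionConfig is a hypothetical domain type; field names are illustrative.
type CompressionConfig struct {
	Enabled   bool
	Algorithm string
	Level     *int // nil means "use the algorithm's default level"
}

// levelRanges holds ASSUMED inclusive bounds per algorithm.
// The real limits used by the PR are not stated in this thread.
var levelRanges = map[string][2]int{
	"zstd": {1, 19},
	"lz4":  {0, 9},
}

// normalizeCompression rejects unknown algorithms and out-of-range levels,
// then stores the algorithm in lowercase.
func normalizeCompression(cfg CompressionConfig) (CompressionConfig, error) {
	if !cfg.Enabled {
		// Assumption: a disabled config is passed through without checks.
		return cfg, nil
	}
	algo := strings.ToLower(strings.TrimSpace(cfg.Algorithm))
	bounds, ok := levelRanges[algo]
	if !ok {
		return CompressionConfig{}, fmt.Errorf("unsupported compression algorithm %q (want zstd or lz4)", cfg.Algorithm)
	}
	if cfg.Level != nil && (*cfg.Level < bounds[0] || *cfg.Level > bounds[1]) {
		return CompressionConfig{}, fmt.Errorf("level %d out of range [%d, %d] for %s", *cfg.Level, bounds[0], bounds[1], algo)
	}
	cfg.Algorithm = algo
	return cfg, nil
}

func main() {
	level := 3
	cfg, err := normalizeCompression(CompressionConfig{Enabled: true, Algorithm: "ZSTD", Level: &level})
	fmt.Println(cfg.Algorithm, err)

	_, err = normalizeCompression(CompressionConfig{Enabled: true, Algorithm: "gzip"})
	fmt.Println(err)
}
```

Validating at the API-to-domain boundary means a bad algorithm or level is rejected with a request error instead of failing later inside a background compression job.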
Cursor Bugbot has reviewed your changes and found 1 potential issue.

Summary
API / Config
- POST /instances/{id}/standby accepts optional compression
- POST /instances/{id}/snapshots accepts optional compression
- POST /instances accepts optional snapshot_policy
- snapshot.compression_default (enabled, algorithm, level)
Validation and behavior
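The three configuration sources listed above (server default, per-instance snapshot_policy, per-request compression) need a precedence rule. The PR describes the request override winning over the instance policy, which wins over the server default. A minimal sketch of that rule follows; resolveCompression and CompressionConfig are hypothetical names, and whole-object override (rather than field-by-field merging) is an assumption.

```go
package main

import "fmt"

// CompressionConfig is a hypothetical type used only for this sketch.
type CompressionConfig struct {
	Enabled   bool
	Algorithm string
	Level     int
}

// resolveCompression returns the most specific config that is set:
// request override, then per-instance policy, then the server default.
// Assumption: a more specific config replaces the less specific one entirely.
func resolveCompression(request, instancePolicy *CompressionConfig, serverDefault CompressionConfig) CompressionConfig {
	switch {
	case request != nil:
		return *request
	case instancePolicy != nil:
		return *instancePolicy
	default:
		return serverDefault
	}
}

func main() {
	serverDefault := CompressionConfig{Enabled: true, Algorithm: "zstd", Level: 1}
	instancePolicy := &CompressionConfig{Enabled: true, Algorithm: "lz4", Level: 0}

	// No request override: the instance policy applies.
	fmt.Printf("%+v\n", resolveCompression(nil, instancePolicy, serverDefault))

	// A request can also opt out of compression explicitly.
	optOut := &CompressionConfig{Enabled: false}
	fmt.Printf("%+v\n", resolveCompression(optOut, instancePolicy, serverDefault))
}
```

Using pointers for the two optional layers distinguishes "not specified" from "explicitly disabled", which a plain struct value cannot express.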
Tests
go test -run '^$' ./cmd/api/api ./lib/instances ./lib/providers ./integration
go test ./lib/instances -run 'TestNormalizeCompressionConfig|TestResolveSnapshotCompressionPolicyPrecedence|TestValidateCreateRequestSnapshotPolicy|TestValidateCreateSnapshotRequestRejectsStoppedCompression'
Note
High Risk
High risk because it changes core instance lifecycle paths (standby/restore/fork/snapshot/delete) to manage asynchronous compression jobs and on-demand decompression, which can affect VM correctness, performance, and snapshot data integrity.
Overview
Adds optional snapshot compression (zstd or lz4) across the API and instance domain model: POST /instances can set snapshot_policy, POST /instances/{id}/standby and POST /instances/{id}/snapshots accept per-request compression overrides, and instance/snapshot responses now include compression policy + compression state/size metadata.
Implements asynchronous snapshot-memory compression for standby instances and centrally stored standby snapshots with configurable defaults (server snapshot.compression_default, per-instance policy, then request override). Foreground operations (restore, fork, create snapshot, delete) now cancel/wait in-flight compression and ensure raw memory is available (including on-demand decompression) before proceeding.
Updates plumbing and tests accordingly: instances.NewManager takes snapshot defaults, StandbyInstance takes a request object, adds OpenTelemetry metrics for compression jobs/fallbacks/preemptions, skips copying .zst.tmp/.lz4.tmp artifacts during forks, and introduces middleware to normalize empty standby POST bodies to {} for strict request decoding.
Written by Cursor Bugbot for commit 8638cce.