Summary
The content-addressed upload path can permanently bind the wrong bytes to a content hash, after which the true bytes can never reach the destination, and the index/indexer can mint an immutable contents row whose size_bytes disagrees with the actual content.
Where
sync/content_addressed.go — uploadOneObject: the drift guard stats the file before rclone reads it (stat→copy race), the post-upload check compares size only, and HasRemoteObject suppresses any future re-upload of that hash.
index/index.go — size/mtime captured at walk time, hash computed later from a fresh open with no re-stat (store/files.go lookupContentTx then errors forever on the honest size).
Scenario
Wrong bytes under a hash: a file is edited in place (size+mtime preserved, or changed within the stat→read window) during a content-addressed push. rclone uploads bytes that don't match the recorded hash; InsertRemoteObject lands; HasRemoteObject now suppresses re-upload forever. If a clean duplicate of the true content exists elsewhere and is later offloaded, "recovery" yields the poisoned object.
Immutable wrong size: a file is appended to between the walker's d.Info() and the worker's hashFile. The contents row binds the new digest to the old size. Because contents is immutable, every later observation of those exact bytes fails the size cross-check — aborting the whole ApplyIndexBatch repeatedly and refusing the content-addressed push forever.
Fix shape
- In the upload path, hash the bytes as uploaded (stream-hash, or stage a copy and hash it) and refuse on mismatch with the indexed hash. Never record a remote_objects row for unverified bytes. (This also dovetails with the scan-back fingerprint work.)
- In the indexer, Stat the open handle after hashing and build the contents row from that size/mtime, so hash and metadata describe the same inode state. Optionally isolate a single poisoned entry instead of failing the whole batch.
Adversarial audit of offload-v1 (auditor D F4, auditor B MEDIUM-6).