Skip to content

fix(runtime): embedded runtime survives in-place rollout (split from #681)#839

Draft
G4614 wants to merge 1 commit into
boxlite-ai:mainfrom
G4614:feat/logs-tee-exec
Draft

fix(runtime): embedded runtime survives in-place rollout (split from #681)#839
G4614 wants to merge 1 commit into
boxlite-ai:mainfrom
G4614:feat/logs-tee-exec

Conversation

@G4614

@G4614 G4614 commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Make the embedded runtime survive an in-place binary rollout: invalidate the extracted-runtime cache when the embedded manifest changes at the same crate version, and enforce +x on extracted executables. Split from #681.

Test plan

  • cargo clippy --workspace --all-targets --all-features -- -D warnings + cargo fmt --check clean.
  • Two-side verified (revert the fix → test fails with the exact symptom → restore → passes):
    • stamp_matches_current_manifest_requires_hash — legacy/no-hash and different-hash stamps are refreshed.
    • runtime_executable_permissions_are_enforced — an extracted runtime exec gets 0o755.
    • copy_shim_to_box_marks_existing_copy_executable.

🤖 Generated with Claude Code

Split out of boxlite-ai#681. These fixes are about embedded-runtime robustness
under in-place binary rollout (the runner-rollout flow), independent of
the boxlite-ai#652 default-sandbox-on change — they affect boxlite-shim/boxlite-guest,
which every box needs regardless of whether the sandbox is enabled.

- embedded.rs: stamp the runtime cache with the embedded manifest hash so
  a same-version binary swap (dev-channel rollout) invalidates stale
  extracted runtime binaries instead of silently reusing them; enforce +x
  on extracted runtime executables (boxlite-shim/boxlite-guest/bwrap).
- build.rs: emit BOXLITE_MANIFEST_HASH and set executable mode on the
  embedded runtime binaries.
- shim_copy.rs: re-mark an already-present copied shim as executable.

Tests (two-side verified): cache-staleness predicate, runtime-exec perms
enforcement, shim-copy perms.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

The PR adds executable-permission enforcement for named runtime binaries (boxlite-shim, boxlite-guest, bwrap) at three sites: build-time manifest generation and prebuilt binary copying, post-copy shim installation, and cached runtime extraction. The .complete stamp format is extended to include an embedded manifest hash; mismatched or legacy stamps now trigger cache invalidation and re-extraction.

Changes

Runtime Executable Permissions and Cache Stamp Validation

Layer / File(s) Summary
Build-time executable permission helpers
src/boxlite/build.rs
scan_entries switches from file_mode to runtime_file_mode; new runtime_file_mode and is_runtime_executable helpers OR 0o755 into the manifest mode for named executables; copy_prebuilt_binary calls new set_binary_permissions after copy to chmod embedded binaries to 0755.
Shim copy permission enforcement
src/boxlite/src/jailer/shim_copy.rs
copy_shim_to_box calls new ensure_executable on the destination after copy; ensure_executable ORs current mode with 0o755 on Unix; unit test verifies a restrictive-mode shim is corrected to 0o755.
Cache stamp validation + runtime permission enforcement
src/boxlite/src/runtime/embedded.rs
RUNTIME_EXECUTABLES constant added; extraction fast path now validates the .complete stamp against the current manifest hash, enforces executable permissions and refreshes the stamp on match, or removes the cache dir on mismatch; stamp write adds manifest hash as a third line; stamp_matches_current_manifest requires all three lines; ensure_runtime_permissions + is_runtime_executable enforce bits for each listed executable; tests cover hash-absent and stale-hash stamp rejection and permission enforcement on boxlite-shim.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

Suggested reviewers

  • DorianZheng

🐇 Hop, hop, hooray!
The shim now wears its bits with pride,
0o755—it can't hide!
Old stamps get checked, then tossed aside,
Fresh permissions on every ride.
✨ Execute away! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix(runtime): embedded runtime survives in-place rollout (split from #681)' directly and specifically describes the main change—fixing embedded runtime robustness during in-place binary rollouts by ensuring executable permissions and cache invalidation.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@G4614 G4614 marked this pull request as ready for review June 23, 2026 09:52
@G4614 G4614 requested a review from a team as a code owner June 23, 2026 09:52
@G4614 G4614 enabled auto-merge June 23, 2026 09:52

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/boxlite/src/runtime/embedded.rs`:
- Around line 111-113: The filetime::set_file_mtime call result is being
discarded with let _ = which silently ignores errors that could indicate
filesystem issues or break cache staleness tracking. Instead of discarding the
error, capture the Result returned by set_file_mtime and explicitly handle the
failure case - either propagate the error by converting it to the appropriate
error type and returning it, or log it at an appropriate level while documenting
the decision. Ensure the error handling preserves the original cause information
from the filesystem operation.
- Around line 277-280: The metadata permission retrieval is silently masking
errors by using unwrap_or(0o644) when std::fs::metadata(&path) fails, which
prevents proper error diagnosis. Instead of falling back to a default mode
value, propagate the metadata error directly using the error handling operator
(?) after the map operation so that the root cause of stat failures is preserved
and visible to callers, allowing proper diagnostics when cache directories are
damaged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: ee5cb14f-f31e-4aee-bbc8-57d9ca5def36

📥 Commits

Reviewing files that changed from the base of the PR and between cad894b and d7f4821.

📒 Files selected for processing (3)
  • src/boxlite/build.rs
  • src/boxlite/src/jailer/shim_copy.rs
  • src/boxlite/src/runtime/embedded.rs

Comment on lines +111 to +113
let now = filetime::FileTime::now();
let _ = filetime::set_file_mtime(&stamp, now);
return Ok(Self { dir });

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Handle mtime refresh failures explicitly on cache hits.

set_file_mtime errors are currently discarded, which can silently break stale-cache accounting and hide filesystem issues.

As per coding guidelines, “Never swallow errors silently” and “Fail fast on missing config or invalid inputs … Preserve the original cause when wrapping errors.”

Suggested fix
-                let now = filetime::FileTime::now();
-                let _ = filetime::set_file_mtime(&stamp, now);
+                let now = filetime::FileTime::now();
+                filetime::set_file_mtime(&stamp, now).map_err(|e| {
+                    BoxliteError::Storage(format!("touch {}: {}", stamp.display(), e))
+                })?;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let now = filetime::FileTime::now();
let _ = filetime::set_file_mtime(&stamp, now);
return Ok(Self { dir });
let now = filetime::FileTime::now();
filetime::set_file_mtime(&stamp, now).map_err(|e| {
BoxliteError::Storage(format!("touch {}: {}", stamp.display(), e))
})?;
return Ok(Self { dir });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/boxlite/src/runtime/embedded.rs` around lines 111 - 113, The
filetime::set_file_mtime call result is being discarded with let _ = which
silently ignores errors that could indicate filesystem issues or break cache
staleness tracking. Instead of discarding the error, capture the Result returned
by set_file_mtime and explicitly handle the failure case - either propagate the
error by converting it to the appropriate error type and returning it, or log it
at an appropriate level while documenting the decision. Ensure the error
handling preserves the original cause information from the filesystem operation.

Source: Coding guidelines

Comment on lines +277 to +280
let mode = std::fs::metadata(&path)
.map(|metadata| metadata.permissions().mode() & 0o777)
.unwrap_or(0o644);
Self::set_permissions(&path, mode)?;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Do not mask metadata failures during permission repair.

Falling back to 0o644 on metadata error hides the root cause (stat failure) and makes diagnostics harder when cache dirs are damaged.

As per coding guidelines, “Fail fast … Preserve the original cause when wrapping errors” and “Never swallow errors silently.”

Suggested fix
-            let mode = std::fs::metadata(&path)
-                .map(|metadata| metadata.permissions().mode() & 0o777)
-                .unwrap_or(0o644);
+            let mode = std::fs::metadata(&path)
+                .map(|metadata| metadata.permissions().mode() & 0o777)
+                .map_err(|e| BoxliteError::Storage(format!("stat {}: {}", path.display(), e)))?;
             Self::set_permissions(&path, mode)?;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let mode = std::fs::metadata(&path)
.map(|metadata| metadata.permissions().mode() & 0o777)
.unwrap_or(0o644);
Self::set_permissions(&path, mode)?;
let mode = std::fs::metadata(&path)
.map(|metadata| metadata.permissions().mode() & 0o777)
.map_err(|e| BoxliteError::Storage(format!("stat {}: {}", path.display(), e)))?;
Self::set_permissions(&path, mode)?;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/boxlite/src/runtime/embedded.rs` around lines 277 - 280, The metadata
permission retrieval is silently masking errors by using unwrap_or(0o644) when
std::fs::metadata(&path) fails, which prevents proper error diagnosis. Instead
of falling back to a default mode value, propagate the metadata error directly
using the error handling operator (?) after the map operation so that the root
cause of stat failures is preserved and visible to callers, allowing proper
diagnostics when cache directories are damaged.

Source: Coding guidelines

.map_err(|e| BoxliteError::Storage(format!("stat {}: {}", path.display(), e)))?;
let executable_mode = mode | 0o755;
if executable_mode != mode {
std::fs::set_permissions(path, std::fs::Permissions::from_mode(executable_mode)).map_err(

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should not have this issue by design, must be a bug

@G4614 G4614 marked this pull request as draft June 23, 2026 13:57
auto-merge was automatically disabled June 23, 2026 13:57

Pull request was converted to draft

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants