Harden app-store broker + supervisor (grants, backoff, rotation, RLIMIT_AS, TOCTOU, extend DoS)#19
Merged
Conversation
Add a deny-by-default authorization gate to the app-store broker: - manifest.ExposesMethod / HasGrant helpers (cap+target matching with exact, prefix-wildcard, and blanket forms). - supervisor.callFrom enforces, before any dial: target installed, method in exposes (all callers), and a matching ipc.call grant for cross-app callers. New typed errors ErrMethodNotExposed/ErrGrantMissing. - Service.CallFrom exposes the caller-aware entry point; Call stays the trusted daemon/pilotctl path. Update existing call-path tests to declare exposed methods.
Replace the fixed 30s verify-fail retry sleep with a capped exponential ramp (1s→2s→…→30s) that resets on a successful verify, consistent with crash-loop restart handling. Extract the shared nextBackoff helper and unit-test that it grows and saturates at the cap.
Replace single-step supervisor.log->.1 rotation with an N-generation shift (.1..N), discarding the oldest. Add Config.AuditLogMaxBackups (default 3); worst-case footprint is (backups+1) x AuditLogMaxBytes per app. Cover generation count and shift semantics with tests.
Extend applyChildResourceLimits to set RLIMIT_AS alongside RLIMIT_NOFILE for spawned children, configurable via Config.ChildMemoryLimitBytes (default 4 GiB — generous enough for Go/Node/Python virtual reservations, still a runaway guard). No-op on non-Linux, documented. Tests read the limits back via prlimit64 to confirm enforcement.
Add verifyAtSpawn, run immediately before execve in spawn(): Lstat symlink rejection plus a sha256 re-check against the pinned hash. Closes the time-of-check/time-of-use gap between scanInstalled (runs once at discovery) and the actual launch — a binary swapped for a symlink or different bytes after the scan is now refused. Tests cover both swap cases and the valid-binary pass.
Add two DoS guards to pkg/extend: - Per-app token-bucket rate limiter on hook dispatch (Registry.Run), enabled via SetRateLimit; aborts with ErrRateLimited when an app's budget is spent. Off by default (back-compat); injectable clock for deterministic tests. - Cap of maxDynamicRegistrationsPerApp (32) on runtime hook registrations per app in DaemonHandler.Register, via Registry.CountForApp; refuses growth with ErrTooManyRegistrations.
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
Close the codecov patch gap: Service.CallFrom (started + not-started), SetRateLimit disable branch, and rotateGenerations keep<1 floor.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Hardens the app-store broker and supervisor against several abuse/DoS
vectors and closes a TOCTOU gap, plus tightens the dynamic-extension
surface. Six self-contained changes, each with tests.
Changes
Broker authorization (deny-by-default).
supervisor.callFromenforces, before any socket dial: target installed → method present in
the target's
exposes(every caller) → for cross-appipc.call, thecaller's manifest holds a matching
ipc.callgrant. NewService.CallFrom(callerID, …)is the caller-aware entry point;Callstays the trusted daemon/pilotctl path. New manifest helpers
ExposesMethod/HasGrant(exact,prefix.*, and*target forms).Errors:
ErrMethodNotExposed,ErrGrantMissing.Exponential verify-fail backoff. The fixed 30s verify-fail retry is
now a capped exponential ramp (1s→2s→…→30s) that resets on success,
consistent with crash-loop restart handling. Shared
nextBackoffhelper, unit-tested for growth + saturation.
Multi-generation audit log rotation. Single-step
supervisor.log → .1becomes an N-generation shift (.1..N, oldest discarded). NewConfig.AuditLogMaxBackups(default 3).Address-space cap (Linux).
applyChildResourceLimitsnow setsRLIMIT_ASalongsideRLIMIT_NOFILEfor spawned apps, configurable viaConfig.ChildMemoryLimitBytes(default 4 GiB — generous forGo/Node/Python virtual reservations, still a runaway guard). No-op on
non-Linux (documented). Tests read limits back via
prlimit64.Spawn-time TOCTOU re-verify.
verifyAtSpawnre-checks (symlinkrejection + sha256) immediately before
execve, closing the gapbetween the one-time scan and launch.
Extension DoS guards.
pkg/extendgains a per-app token-bucketrate limiter on hook dispatch (
Registry.SetRateLimit, off by default,ErrRateLimited) and a cap of 32 dynamic registrations per app inDaemonHandler.Register(ErrTooManyRegistrations).Testing
go build ./...,go vet ./...clean.go test -race -parallel 4 -count=1 ./...— all packages green.growth, N-generation rotation, RLIMIT_AS read-back, TOCTOU symlink +
content swaps, rate-limit refill + per-app isolation + registration cap.
Notes
exposesgate now applies to all callers (it is the app'sIPC surface); existing call-path tests were updated to declare the
methods they invoke.
gofmtdrift in untouched files is left as-is to keep thediff focused.