feat: implement static asset placement for sandbox executions #292
ArthurCRodrigues merged 13 commits into main
Conversation
This commit introduces the ability to inject grader-owned static assets (e.g., datasets, fixtures) from an S3 provider into the sandbox before setup commands and grading run. Key changes:

- Added `ministack` to `docker-compose.yml` for local S3 emulation.
- Created `SetupConfig` and `AssetConfig` Pydantic models for validation.
- Implemented `AssetSourceResolver` with disk/memory LRU caching of pre-compressed tar archives.
- Added `inject_assets` to `SandboxContainer` using Docker SDK's `put_archive`.
- Refactored `PreFlightStep` and `PreFlightService` to orchestrate asset injection.
- Updated documentation with the new `assets` configuration schema.

Co-authored-by: gemini-cli <218195315+gemini-cli@users.noreply.github.com>
Pull request overview
Adds a pipeline feature to fetch grader-owned static assets from an S3-compatible provider and inject them into the execution sandbox during preflight, before setup commands and grading run.
Changes:
- Introduces `SetupConfig`/`AssetConfig` Pydantic models and updates preflight config resolution.
- Implements an S3-backed asset resolver with memory/disk caching and tar-archive generation for Docker `put_archive`.
- Adds sandbox-side asset injection plus docs, docker-compose local S3 emulation, and tests.
Reviewed changes
Copilot reviewed 15 out of 16 changed files in this pull request and generated 18 comments.
| File | Description |
|---|---|
| `tests/unit/test_asset_injection.py` | Unit coverage for config parsing, resolver orchestration, and sandbox injection call shape. |
| `tests/integration/test_asset_preflight_integration.py` | Integration-style coverage that preflight triggers resolve + injection. |
| `sandbox_manager/sandbox_container.py` | Adds `inject_assets()` using Docker `exec_run` + `put_archive`. |
| `docs/features/setup_config_feature.md` | Documents new root-level `assets` schema and behavior. |
| `docker-compose.yml` | Adds `ministack` service for local S3 emulation. |
| `autograder/steps/pre_flight_step.py` | Orchestrates asset resolution + injection during preflight. |
| `autograder/services/pre_flight_service.py` | Refactors to accept `SetupConfig` and return a typed language config. |
| `autograder/services/assets/tar_utils.py` | Builds an in-memory tar archive for Docker `put_archive`. |
| `autograder/services/assets/s3_provider.py` | S3 provider to fetch objects and cache tar archives. |
| `autograder/services/assets/resolver.py` | Resolves `AssetConfig` → `ResolvedAsset` via provider. |
| `autograder/services/assets/provider.py` | Provider interface for asset backends. |
| `autograder/services/assets/cache_manager.py` | In-memory LRU + disk cache for tar bytes. |
| `autograder/models/dataclass/asset.py` | Adds `ResolvedAsset` dataclass. |
| `autograder/models/config/setup.py` | Defines the new setup/asset config models. |
| `.env.example` | Documents environment variables for the asset S3 provider. |
```python
for asset in assets_config:
    source = asset.source
    target = asset.target
    read_only = asset.read_only

    # Resolve asset
    tar_content = self.provider.get_asset_tar(source, target, read_only)

    if not tar_content:
        raise RuntimeError(f"Failed to resolve asset: source={source}, target={target}")

    resolved_assets.append(ResolvedAsset(
        target=target,
        tar_content=tar_content
    ))
```
The documented intent is that assets are placed under `/tmp` when the configured target doesn't start with `/tmp/`, but the resolver currently passes `asset.target` through unchanged when generating the tar. Since `SandboxContainer.inject_assets()` uploads the tar verbatim, the actual extraction path will follow whatever was used when creating the tar (not the later rewritten `target_path`). Normalize/enforce the final target path here (or in `AssetConfig` validation) before calling `get_asset_tar()`, and store that normalized path in `ResolvedAsset.target` so tar creation and injection agree.
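One possible shape for that normalization (a sketch only; the `/tmp` re-rooting rule and helper name are assumptions based on the behavior described above):

```python
from pathlib import PurePosixPath


def normalize_target(target: str) -> str:
    """Re-root targets outside /tmp/ under /tmp, per the documented placement rule."""
    path = PurePosixPath(target)
    if not str(path).startswith("/tmp/"):
        # Keep only the final component so tar creation and injection agree.
        path = PurePosixPath("/tmp") / path.name
    return str(path)
```

The normalized value would then be passed to `get_asset_tar()` and stored in `ResolvedAsset.target`.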
```python
if not sandbox:
    error_msg = t("preflight.error.setup_command_missing_sandbox", locale=pipeline_exec.locale)
    return pipeline_exec.add_step_result(StepResult.fail(
        step=self.step_name,
        error=error_msg
    ))

logger.info("Injecting assets into sandbox (external_user_id=%s)", pipeline_exec.submission.user_id)
try:
    resolved_assets = self._asset_resolver.resolve_assets(self._setup_config.assets)
    sandbox.inject_assets(resolved_assets)
except Exception as e:
    error_msg = f"Failed to inject assets: {str(e)}"
    logger.error("Asset injection failed (external_user_id=%s): %s", pipeline_exec.submission.user_id, error_msg)
    return pipeline_exec.add_step_result(StepResult.fail(
        step=self.step_name,
        error=error_msg
    ))
```
The injected-assets missing-sandbox path reuses `preflight.error.setup_command_missing_sandbox`, which is misleading when only assets are configured. Also, on injection failure the step returns a raw `str(e)` in the student-facing error, which can be noisy and inconsistent with other preflight failures that return structured `error_data`. Add a dedicated translated error key for asset injection failures/missing sandbox, and consider attaching structured details (e.g., source/target) instead of leaking raw exception text.
```python
# Environment variables
self.bucket_name = os.getenv("CRITERIA_ASSETS_BUCKET_NAME")
self.access_key = os.getenv("AWS_ACCESS_ID")
self.secret_key = os.getenv("AWS_SECRET_ACCESS_KEY")
self.region = os.getenv("AWS_REGION", "us-east-1")
self.endpoint_url = os.getenv("S3_ENDPOINT_URL")

# Initialize boto3 client
self.s3 = boto3.client(
    's3',
    endpoint_url=self.endpoint_url,
    aws_access_key_id=self.access_key,
    aws_secret_access_key=self.secret_key,
    region_name=self.region,
    config=Config(signature_version='s3v4')
)
```
`S3AssetProvider` will happily initialize with `bucket_name=None` / missing credentials, and then later `get_object()` will fail with a less clear boto3 error. Consider validating required env vars up-front (bucket name at minimum; and credentials depending on endpoint) and raising a clear, actionable exception so preflight failures are easier to diagnose.
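A fail-fast check along those lines (a sketch; the variable name mirrors the diff above, and the error wording and function name are illustrative):

```python
import os

REQUIRED_VARS = ("CRITERIA_ASSETS_BUCKET_NAME",)


def validate_asset_env() -> None:
    """Raise an actionable error instead of deferring to a cryptic boto3 failure."""
    missing = [name for name in REQUIRED_VARS if not os.getenv(name)]
    if missing:
        raise RuntimeError(
            "S3AssetProvider is misconfigured; set these environment "
            "variables: " + ", ".join(missing)
        )
```

Called at the top of `__init__`, this turns a late `get_object()` failure into an immediate configuration error; credential checks could be added conditionally depending on whether `S3_ENDPOINT_URL` points at local emulation or real AWS.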
```python
import io
import tarfile
import os
import time


def create_tar_archive(filename: str, content: bytes, target_path: str, read_only: bool = True) -> bytes:
    """
```
This module has unused imports/args that will fail CI linting (pylint enables unused-import/unused-argument on changed files): `os` is unused and the `filename` parameter is never referenced. Either remove them or use `filename` (e.g., to derive the tar entry name) and drop the unused import.
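One way to resolve both lints, sketched under the assumption that the archive holds a single entry named after `filename` (the real function's entry layout may differ):

```python
import io
import tarfile
import time


def create_tar_archive(filename: str, content: bytes, read_only: bool = True) -> bytes:
    """Build an in-memory tar with one entry; `filename` names the entry."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        info = tarfile.TarInfo(name=filename)
        info.size = len(content)
        info.mtime = int(time.time())
        info.mode = 0o444 if read_only else 0o644  # read-only vs writable entry
        tar.addfile(info, io.BytesIO(content))
    return buffer.getvalue()
```

This drops `os` entirely and gives `filename` a real job, satisfying both pylint checks.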
```python
def test_invalid_asset_target(self):
    """Test validation of absolute target path."""
    with pytest.raises(Exception):
        AssetConfig(source="s", target="relative/path")


def test_asset_path_traversal(self):
    """Test rejection of path traversal in source."""
    with pytest.raises(Exception):
        AssetConfig(source="../secret", target="/tmp/s")
```
These validation tests use `pytest.raises(Exception)`, which is overly broad and can mask unrelated failures (and makes intent unclear). Since `AssetConfig` validation errors are deterministic, assert on the specific exception type raised by Pydantic (e.g., `pydantic.ValidationError`) or `ValueError` as appropriate.
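A sketch of the narrower pattern. The stand-in model below raises `ValueError` directly for brevity; against the real Pydantic model the context manager would be `pytest.raises(pydantic.ValidationError)`:

```python
import pytest


class AssetConfig:
    """Minimal stand-in for the real Pydantic model (illustrative only)."""

    def __init__(self, source: str, target: str):
        if not target.startswith("/"):
            raise ValueError("target must be an absolute path")
        if ".." in source.split("/"):
            raise ValueError("source must not contain path traversal")
        self.source, self.target = source, target


def test_invalid_asset_target():
    # Assert the specific exception type instead of bare Exception.
    with pytest.raises(ValueError):
        AssetConfig(source="s", target="relative/path")


def test_asset_path_traversal():
    with pytest.raises(ValueError):
        AssetConfig(source="../secret", target="/tmp/s")
```

If a test ever starts passing because of an unrelated `TypeError` or `AttributeError`, the narrow assertion catches that regression where `Exception` would not.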
@jaoppb had to refactor some of your implementation: I discovered that using Docker's native `put_archive` method has compatibility issues with the gVisor (runsc) runtime. Under high isolation, `put_archive` would often bypass the sandboxed VFS layer, resulting in "file not found" errors. To maintain our security posture while ensuring reliable asset injection, we switched to a Base64-encoded `exec_run` approach. By encoding the file content and piping it through `base64 -d` inside an exec session, we ensure that writes go through the sandboxed filesystem, so injected files land where the runtime expects them.
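A minimal sketch of that base64/`exec_run` injection path (the helper name and exact shell command are illustrative; the real implementation may chunk large payloads or set ownership/permissions):

```python
import base64


def inject_file(container, target_path: str, content: bytes) -> None:
    """Write bytes into a container by piping base64 through an exec session."""
    encoded = base64.b64encode(content).decode("ascii")
    # Decoding inside the sandbox keeps the write on the sandboxed VFS,
    # avoiding the put_archive path that gVisor does not honor reliably.
    command = (
        f"sh -c 'mkdir -p \"$(dirname {target_path})\" "
        f"&& echo {encoded} | base64 -d > {target_path}'"
    )
    exit_code, output = container.exec_run(command)
    if exit_code != 0:
        raise RuntimeError(f"Asset injection failed for {target_path}: {output!r}")
```

With the Docker SDK, `exec_run` returns an `(exit_code, output)` pair, so a non-zero exit surfaces as a clear exception instead of a silent partial write.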
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Pull Request: Secure Sandbox Asset Injection and gVisor Compatibility
Objective
Resolve issues with sandbox asset injection when running under high-security isolation. The primary goal was to ensure that assets fetched from S3 are correctly injected into sandboxes using the gVisor (runsc) runtime, which previously failed due to filesystem driver incompatibilities with Docker's put_archive.
Changes
Sandbox Injection (gVisor Compatibility)
Asset Pipeline Refactor
Grader and Template Fixes
Infrastructure and Config
Testing and QA Results
Performed a full "Green Run" in the local stack to verify these changes:
Risks and Considerations