Skip to content

Feat/webCaseCenter#615

Open
liciazhu wants to merge 47 commits into
TencentCloud:masterfrom
liciazhu:feat/webCaseCenterV5
Open

Feat/webCaseCenter#615
liciazhu wants to merge 47 commits into
TencentCloud:masterfrom
liciazhu:feat/webCaseCenterV5

Conversation

@liciazhu

Copy link
Copy Markdown
Contributor

add cube web case center and mod web sandbox detail

@liciazhu liciazhu force-pushed the feat/webCaseCenterV5 branch 3 times, most recently from b9cb05d to 321e017 Compare June 25, 2026 01:52
Comment thread CubeAPI/src/services/examples.rs Outdated
.env("CUBE_API_ENVD_AUTH", &self.envd_auth)
.current_dir(&work_dir);

if std::env::var("CUBE_API_KEY").is_err() {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security: Hardcoded default CUBE_API_KEY

When CUBE_API_KEY is not set in the parent process environment, the example subprocess receives a well-known default key (cube_0000000000...). Any user who triggers an example run can extract this key from the process environment and use it to authenticate arbitrary API calls.

Either remove the fallback entirely (scripts should handle a missing key gracefully) or use an obviously non-functional sentinel (e.g., empty string) that scripts can detect and refuse to use.

Comment thread CubeAPI/src/services/examples.rs Outdated
}

// 3. Any healthy/ready template
if let Ok(tpls) = templates.list_templates().await {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance: Template validation iterates healthy candidates twice + makes 2N gRPC calls

Line 448 chains tpls.iter() (the full set) after the filtered subset, so every template is visited at least once and healthy/ready templates are validated twice. The inner templates.get_template() call is a gRPC round-trip to CubeMaster — in the worst case this makes 2N sequential gRPC calls.

The fix: replace .chain(tpls.iter()) with .chain(tpls.iter().filter(|t| t.status != "healthy" && t.status != "ready")), or better, inline the status check from list_templates and skip get_template entirely for templates already confirmed as healthy/ready.

Comment thread CubeAPI/src/services/examples.rs Outdated
}
};

let fingerprint = {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DefaultHasher uses random seeds per-process — cache misses after every restart

DefaultHasher::new() uses RandomState with per-process seeds, so the same requirements.txt content produces a different hash after every server restart. The fingerprint stamp file written by a previous process lifetime will never match, causing pip3 install to rerun unconditionally on the first example run after each restart.

Replace with a deterministic hash (e.g., SipHasher::new_with_keys(0, 0)) or simply compare the file content directly for these trivially small requirements files.

Comment thread CubeAPI/src/db.rs Outdated
.map_err(anyhow::Error::from)
}

#[allow(dead_code)]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dead code with explicit #[allow(dead_code)] — will drift from schema

This method is never called (explicitly acknowledged by the attribute). DB access methods that go unused will silently drift as columns are renamed/added/removed, then fail when someone finally calls them. Either wire it into a real caller or remove it.

Comment thread CubeAPI/src/config/mod.rs
fn default_sandbox_proxy_url() -> String {
std::env::var("AGENTHUB_SANDBOX_PROXY_URL").unwrap_or_else(|_| "http://127.0.0.1".to_string())
}
fn default_envd_auth() -> String {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security: Default envd_auth is a well-known base64 value

The default Basic cm9vdDo= decodes to root: with an empty password — the envd built-in default. This shared credential is sent to every sandbox's Jupyter/envd endpoint. Anyone who learns it (including by running code in their own sandbox and observing request headers) can access the in-sandbox services of other sandboxes. Use per-sandbox credentials (the sandbox detail already includes an envd_access_token field) instead of a cluster-wide shared secret.

// Pull all images in parallel, then inspect each.
let handles: Vec<_> = STORE_IMAGES
/// Pulls all store images from the registry then returns the updated metadata.
pub async fn refresh_store_meta(State(state): State<AppState>) -> impl IntoResponse {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security & performance: Concurrent Docker pulls with no rate limiting

This spawns one docker pull --quiet per catalog image concurrently via spawn_blocking. With 6+ images (some multi-GB), this can saturate Docker daemon capacity and network bandwidth. Also, POST /store/refresh should use with_auth_and_rate_limit in the router (like the node isolation endpoint) rather than plain with_auth. Consider adding a semaphore to cap concurrent pulls (e.g., 2-3 at a time).

Comment thread CubeAPI/src/services/examples.rs Outdated
.env("CUBE_API_ENVD_AUTH", &self.envd_auth)
.current_dir(&work_dir);

if std::env::var("CUBE_API_KEY").is_err() {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security: Hardcoded default CUBE_API_KEY

When CUBE_API_KEY is not set in the parent process environment, the example subprocess receives a well-known default key (cube_0000000000...). Any user who triggers an example run can extract this key from the process environment and use it to authenticate arbitrary API calls.

Either remove the fallback entirely (scripts should handle a missing key gracefully) or use an obviously non-functional sentinel (e.g., empty string) that scripts can detect and refuse to use.

Comment thread CubeAPI/src/services/examples.rs Outdated
}

// 3. Any healthy/ready template
if let Ok(tpls) = templates.list_templates().await {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance: Template validation iterates healthy candidates twice + makes 2N gRPC calls

Line 448 chains tpls.iter() (the full set) after the filtered subset, so every template is visited at least once and healthy/ready templates are validated twice. The inner templates.get_template() call is a gRPC round-trip to CubeMaster — in the worst case this makes 2N sequential gRPC calls.

The fix: replace .chain(tpls.iter()) with .chain(tpls.iter().filter(|t| t.status != "healthy" && t.status != "ready")), or better, inline the status check from list_templates and skip get_template entirely for templates already confirmed as healthy/ready.

Comment thread CubeAPI/src/services/examples.rs Outdated
}
};

let fingerprint = {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DefaultHasher uses random seeds per-process → cache misses after every restart

DefaultHasher::new() uses RandomState with per-process seeds, so the same requirements.txt content produces a different hash after every server restart. The fingerprint stamp file written by a previous process lifetime will never match, causing pip3 install to rerun unconditionally on the first example run after each restart.

Replace with a deterministic hash (e.g., SipHasher::new_with_keys(0, 0)) or simply compare the file content directly for these trivially small requirements files.

Comment thread CubeAPI/src/examples/registry.rs Outdated
/// calls return the same `&'static` slice, so the front-end gets a stable
/// ordering without the borrow checker complaining about temporaries.
pub fn scenario_registry() -> &'static [ScenarioSpec] {
Box::leak(Box::new(vec![

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scenario_registry() currently allocates and leaks a new Vec<ScenarioSpec> on every call, even though the comment says the registry is materialized once. Endpoints such as list, source, and run can call this repeatedly, so a long-lived API process will grow memory over time. Please store the vector behind OnceLock/LazyLock and return the initialized slice, ideally with a regression test that repeated calls reuse the same registry.

Comment thread CubeAPI/src/services/exec.rs Outdated
let start = Instant::now();

let output = match req.language.as_str() {
"python" => self.exec_python(sandbox_id, domain, &req.code).await?,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

timeout_secs is only used after the Jupyter/envd request has already returned; the service still awaits exec_python/exec_bash without a real deadline. A hung or slow sandbox request can keep the handler occupied, and larger user timeouts will still be cut off by the outer 30s route timeout. Please wrap the actual execution future in tokio::time::timeout and align the route timeout with the allowed request budget.


let start = Instant::now();
let max_secs = sc.timeout_secs.unwrap_or(120);
let run_result = timeout(Duration::from_secs(max_secs), cmd.output()).await;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Timing out cmd.output() drops the future but does not explicitly kill the spawned process or its process tree, so long-running Python/Go/Chromium children can continue after the API has returned. This path also installs requirements.txt during a request, mutating the server's Python environment and using a source-tree stamp file without concurrency control. Please spawn and kill/wait the child on timeout, and move dependency installation to build/deploy time or an isolated, locked cache that fails closed.

Comment thread CubeAPI/src/routes.rs Outdated
fn build_examples_run_routes(state: &AppState, auth_configured: bool) -> Router<AppState> {
let routes = Router::new().route("/examples/run", post(examples::run_example));
if auth_configured {
routes.layer(middleware::from_fn_with_state(state.clone(), unified_auth))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

POST /examples/run is moved to a long-timeout router with auth only, but it no longer gets the rate limiting used by other resource-creating or execution endpoints. Because this route starts local processes and can create sandboxes or browser sessions for up to several minutes, missing rate limits increases abuse and resource-exhaustion risk. Please apply the existing auth+rate-limit layer here, or add a dedicated concurrency/rate budget for example runs.

// `hidden: true` filter so even an attacker who knows the IDs cannot reach
// these scenarios through the HTTP API.

export const EXAMPLE_SCENARIOS: ExampleScenario[] = [

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a second hand-maintained example registry in the frontend while the backend registry also defines scenario IDs, files, categories, store items, visibility, and run metadata. The two sources can drift, producing UI entries that cannot run or backend examples with stale labels/topology in the UI. Please make the backend /examples response include the UI metadata, or generate both Rust and TS registries from a shared data source.

pub async fn exec_code(
State(state): State<AppState>,
Path(sandbox_id): Path<String>,
Json(body): Json<ExecCodeRequest>,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExecCodeRequest derives Validate, but this handler only extracts Json(body) and never calls body.validate() before executing the request. As a result, the documented code length constraint is not enforced and the API can accept payloads larger than the model contract suggests. Please validate here and map failures to a 400 response, or use a shared validated-JSON extractor.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR adds the compiled examples/cube-bench/cube-bench binary alongside the Go source. Committing the build artifact substantially increases PR size, is platform/architecture-specific, and makes supply-chain review noisier. Please remove the binary and add an ignore rule for this executable or for example build outputs.

Comment thread CubeAPI/src/services/examples.rs Outdated
exit_code,
success,
elapsed_ms,
steps: Vec::new(),

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The run response now exposes steps, and the frontend adds a StepTimeline, but the service always returns steps: Vec::new() and the page does not render the timeline from backend data. This leaves a partially implemented API/UI surface that users cannot observe and future maintainers may assume is complete. Please either generate and render real step logs, or remove the unused field/component/comment until the feature is wired through.

"""

with Sandbox.create(template=template_id) as sandbox:

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This added blank line contains trailing whitespace, so git diff --check still fails for the PR. Please remove the stray spaces (and any other diff-check formatting issues) to avoid CI or review noise.

Comment thread web/src/pages/SandboxCases.tsx Outdated
</div>
);
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git diff --check also reports this as a new blank line at EOF. Please remove the extra trailing blank line so the PR is clean for whitespace checks.

@liciazhu liciazhu force-pushed the feat/webCaseCenterV5 branch 2 times, most recently from 488bea9 to 2f8f47f Compare June 25, 2026 09:12
Comment thread CubeAPI/src/services/examples.rs Outdated
.env("CUBE_TEMPLATE_ID", &template_id)
.env("SSL_CERT_FILE", ssl_cert)
.env("AGENTHUB_SANDBOX_PROXY_URL", &self.sandbox_proxy_url)
.env("CUBE_API_ENVD_AUTH", &self.envd_auth)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security: envd_auth credential leaked into example subprocess environment

The full envd_auth header value (credential for the sandbox proxy) is injected as the CUBE_API_ENVD_AUTH environment variable into every example subprocess. Example scripts (including user-edited code in req.code) can read os.environ["CUBE_API_ENVD_AUTH"] and authenticate as root to the sandbox proxy, bypassing API-level authorization.

Impact: Escalation from "can run an example" to "can execute arbitrary code inside any sandbox."

Suggestion: Do not pass envd_auth to subprocesses. If example scripts need proxy access, create a scoped time-limited token instead.

liciazhu added 20 commits June 26, 2026 15:00
Signed-off-by: liciazhu <liciazhu@tencent.com>
Signed-off-by: liciazhu <liciazhu@tencent.com>
Signed-off-by: liciazhu <liciazhu@tencent.com>
Signed-off-by: liciazhu <liciazhu@tencent.com>
Signed-off-by: liciazhu <liciazhu@tencent.com>
…SandboxCases

Signed-off-by: liciazhu <liciazhu@tencent.com>
Signed-off-by: liciazhu <liciazhu@tencent.com>
Signed-off-by: liciazhu <liciazhu@tencent.com>
Signed-off-by: liciazhu <liciazhu@tencent.com>
… prefix

Signed-off-by: liciazhu <liciazhu@tencent.com>
…put, centralise credentials

Signed-off-by: liciazhu <liciazhu@tencent.com>
…stry

Signed-off-by: liciazhu <liciazhu@tencent.com>
Signed-off-by: liciazhu <liciazhu@tencent.com>
Signed-off-by: liciazhu <liciazhu@tencent.com>
…licy

Signed-off-by: liciazhu <liciazhu@tencent.com>
POST /store/refresh previously spawned an unbounded number of
concurrent blocking tasks — one docker-pull per catalog entry —
which could saturate Docker daemon and network bandwidth when the
catalog is large.

Introduce a tokio Semaphore (MAX_CONCURRENT_PULLS = 3) so that at
most 3 docker pulls run simultaneously. Each catalog image acquires
a permit before entering the spawn_blocking call and releases it
automatically on drop.

Assisted-by: CodeBuddy:claude-sonnet-4-6
Signed-off-by: liciazhu <liciazhu@tencent.com>
…ement Improvement in scenario_registry();Consistent Auth + Rate Limiting on /examples/run;Remove Committed Binary from Repository;Whitespace Formatting Fixes

Signed-off-by: liciazhu <liciazhu@tencent.com>
…xec-code

Signed-off-by: liciazhu <liciazhu@tencent.com>
Signed-off-by: liciazhu <liciazhu@tencent.com>
…bprocess environment;resolve N+1 HTTP cascade in template resolution;resolve Hardcoded fallback API key;resolve SVG XSS via DOMPurify's SVG profile and topology layout;resolve async runtie

Signed-off-by: liciazhu <liciazhu@tencent.com>
@liciazhu liciazhu force-pushed the feat/webCaseCenterV5 branch from db07d7d to b4457e7 Compare June 26, 2026 07:01
Comment thread web/src/components/CodeTerminal.tsx Outdated
return (
<div
className="jupyter-html-output max-h-[300px] overflow-auto rounded-md bg-white p-2 text-xs ring-1 ring-border/60 [&_table]:border-collapse [&_td]:border [&_td]:px-2 [&_td]:py-1 [&_th]:border [&_th]:px-2 [&_th]:py-1 [&_th]:bg-muted/40"
dangerouslySetInnerHTML={{ __html: DOMPurify.sanitize(result.html) }}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security: DOMPurify default config allows SVG elements, enabling potential mXSS attack surface

DOMPurify.sanitize(result.html) uses the library's default configuration, which allows SVG tags (e.g. <svg>, <use>, <animate>). SVG's complex namespace model has been the source of multiple mutation XSS bypasses across many DOMPurify versions. Since this content originates from sandbox code execution output (potentially user-controlled), this is a defense-in-depth concern.

Suggestion: Restrict to safe HTML tags:

DOMPurify.sanitize(result.html, {
  FORBID_TAGS: ['svg', 'use', 'animate', 'animateTransform', 'set'],
  SANITIZE_NAMED_PROPS: true,
})

<div className="flex items-center gap-1.5 text-xs font-medium text-foreground/80">
<Monitor size={12} />
Sandbox desktop preview (noVNC)
</div>

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i18n gap: Hardcoded English string — should use t() like the rest of this component

Every other user-facing string in this file uses t('...') from useTranslation('examples'), but this line renders "Sandbox desktop preview (noVNC)" as a raw JSX string. Also consider extracting the repeated isZh detection pattern ((i18n.language ?? 'en').toLowerCase().startsWith('zh') — used at least 3 times in this file) into a shared utility.

let _ = tokio::fs::write(&stamp_file, &req_content).await;
true
} else {
tracing::warn!(

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silent failure: pip install errors are swallowed with true return

When pip3 install -r requirements.txt fails (non-zero exit or spawn error), ensure_requirements returns true and the example script runs anyway. The user will see a confusing ModuleNotFoundError that looks like their code's fault rather than a failed dependency install.

Suggestion: Log at error level instead of warn so operators have visibility. Alternatively, include a warning header in the run response when dependencies were not installed.

Comment thread CubeAPI/src/config/mod.rs Outdated
pub cube_proxy_node_ip: Option<String>,

/// CubeProxy HTTP port (passed as CUBE_PROXY_PORT_HTTP).
/// Env var: CUBE_PROXY_PORT_HTTP (default "80")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc comment mismatch: claims default is "80", but default_cube_proxy_port_http() returns None

The doc comment on line 91 reads (default "80"), but the default function default_cube_proxy_port_http() returns Option<u16> and resolves to None when the env var is absent. There is no fallback to port 80 — when unset the variable is simply omitted from the subprocess environment.

Suggestion: Update the comment: (no default; omitted when unset)

Comment thread web/src/components/CodeTerminal.tsx Outdated
return (
<div
className="jupyter-html-output max-h-[300px] overflow-auto rounded-md bg-white p-2 text-xs ring-1 ring-border/60 [&_table]:border-collapse [&_td]:border [&_td]:px-2 [&_td]:py-1 [&_th]:border [&_th]:px-2 [&_th]:py-1 [&_th]:bg-muted/40"
dangerouslySetInnerHTML={{ __html: DOMPurify.sanitize(result.html) }}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security: DOMPurify default config allows SVG elements, enabling potential mXSS attack surface

DOMPurify.sanitize(result.html) uses the library's default configuration, which allows SVG tags (e.g. <svg>, <use>, <animate>). SVG's complex namespace model has been the source of multiple mutation XSS bypasses across many DOMPurify versions. Since this content originates from sandbox code execution output (potentially user-controlled), this is a defense-in-depth concern.

Suggestion: Restrict to safe HTML tags:

DOMPurify.sanitize(result.html, {
  FORBID_TAGS: ['svg', 'use', 'animate', 'animateTransform', 'set'],
  SANITIZE_NAMED_PROPS: true,
})

<div className="flex items-center gap-1.5 text-xs font-medium text-foreground/80">
<Monitor size={12} />
Sandbox desktop preview (noVNC)
</div>

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i18n gap: Hardcoded English string — should use t() like the rest of this component

Every other user-facing string in this file uses t('...') from useTranslation('examples'), but this line renders "Sandbox desktop preview (noVNC)" as a raw JSX string. Also consider extracting the repeated isZh detection pattern ((i18n.language ?? 'en').toLowerCase().startsWith('zh') — used at least 3 times in this file) into a shared utility.

let _ = tokio::fs::write(&stamp_file, &req_content).await;
true
} else {
tracing::warn!(

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silent failure: pip install errors are swallowed with true return

When pip3 install -r requirements.txt fails (non-zero exit or spawn error), ensure_requirements returns true and the example script runs anyway. The user will see a confusing ModuleNotFoundError that looks like their code's fault rather than a failed dependency install.

Suggestion: Log at error level instead of warn so operators have visibility. Alternatively, include a warning header in the run response when dependencies were not installed.

Comment thread CubeAPI/src/config/mod.rs Outdated
pub cube_proxy_node_ip: Option<String>,

/// CubeProxy HTTP port (passed as CUBE_PROXY_PORT_HTTP).
/// Env var: CUBE_PROXY_PORT_HTTP (default "80")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc comment mismatch: claims default is "80", but default_cube_proxy_port_http() returns None

The doc comment on line 91 reads (default "80"), but the default function default_cube_proxy_port_http() returns Option<u16> and resolves to None when the env var is absent. There is no fallback to port 80 — when unset the variable is simply omitted from the subprocess environment.

Suggestion: Update the comment: (no default; omitted when unset)

Signed-off-by: liciazhu <liciazhu@tencent.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants