2 changes: 1 addition & 1 deletion README.ko.md
@@ -75,7 +75,7 @@ Runtime optionally reads the Forge `agent_manifest.json` and the existing Lab-comp

This feature is the first Runtime-side contract on the path toward a reliable edge agent runtime. It records `agent_id`, `task_id`, `agent_type`, priority, latency budget, queue wait, fallback usage, and telemetry context, but it does not change the top-level compare/report fields of the existing `result.json`.

Runtime result JSON also records `runtime_health_snapshot`, `runtime_error_classification`, and `runtime_events` as additive evidence. The health snapshot now records backend availability, latency budget/deadline observation, and tegrastats evidence availability together, and runtime events are recorded as a lifecycle trace with sequential `event_index` values. `--timeout-ms` is an option that records the latency timeout observation threshold; it does not imply production request cancellation. When execution ends as `skipped`, Runtime records `runtime_execution_skipped`, `retryable: true`, and `retry_hint: check_backend_availability` so that Lab/Orchestrator can interpret them as failure handling evidence.
Runtime result JSON also records `runtime_health_snapshot`, `runtime_error_classification`, `runtime_events`, and `runtime_operation_summary` as additive evidence. The health snapshot now records backend availability, latency budget/deadline observation, and tegrastats evidence availability together with `health_reason`, and runtime events are recorded as a lifecycle trace with sequential `event_index` values. `runtime_operation_summary` is a compact index for Lab/Orchestrator/AIGuard handoff that records `risk_labels`, `evidence_gaps`, retryability, and a conservative `recommended_action`, while keeping `decision_owner: lab`, `scheduler_owner: orchestrator`, and `production_cancellation: false`. `--timeout-ms` is an option that records the latency timeout observation threshold; it does not imply production request cancellation. When execution ends as `skipped`, Runtime records `runtime_execution_skipped`, `retryable: true`, and `retry_hint: check_backend_availability` so that Lab/Orchestrator can interpret them as failure handling evidence.

Example:

1 change: 1 addition & 0 deletions README.md
@@ -490,6 +490,7 @@ Runtime result JSON also includes additive operation evidence blocks:
- `runtime_health_snapshot`: execution health, backend/device context, backend availability, run count, latency/FPS summary, latency-budget/deadline observation, tegrastats evidence availability, and explicit timeout observation status. `--timeout-ms` records an observation threshold; it does not claim production request cancellation.
- `runtime_error_classification`: structured success/error category, severity, retryability, retry hint, observed mean latency, and timeout budget for downstream report context. Skipped execution is recorded as `runtime_execution_skipped` with `retry_hint: check_backend_availability` so Lab/Orchestrator can explain runtime failure handling without treating Runtime as a worker daemon.
- `runtime_events`: compact indexed lifecycle event log for configuration, benchmark completion, error classification, optional agent context, and tegrastats parsing.
- `runtime_operation_summary`: compact handoff index for Lab/Orchestrator/AIGuard with `health_reason`, `risk_labels`, `evidence_gaps`, retryability, and a conservative `recommended_action`. It keeps `decision_owner: lab`, `scheduler_owner: orchestrator`, and `production_cancellation: false`.

These fields are evidence for Orchestrator/Lab analysis. Runtime still does not schedule tasks or own deployment decisions.
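
A downstream consumer might read the handoff index roughly as follows. This is a minimal sketch, not Lab or Orchestrator code: the helper name `read_operation_summary` is hypothetical, while the field names and the fixed owner values are taken from the list above.

```python
from typing import Any, Dict, Optional

# Fixed ownership values described for runtime_operation_summary.
EXPECTED_OWNERSHIP = {
    "decision_owner": "lab",
    "scheduler_owner": "orchestrator",
    "production_cancellation": False,
}


def read_operation_summary(result: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the handoff summary, or None when the additive block is absent.

    Hypothetical helper for illustration; it is not part of Runtime.
    """
    summary = result.get("runtime_operation_summary")
    if summary is None:
        # Results written before this block existed are still valid.
        return None
    for key, expected in EXPECTED_OWNERSHIP.items():
        if summary.get(key) != expected:
            raise ValueError(f"{key} must be {expected!r}, got {summary.get(key)!r}")
    return summary
```

Returning `None` for a missing block keeps the reader compatible with older results, and rejecting unexpected owner values makes it obvious if a producer ever starts claiming scheduling or deployment authority.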

36 changes: 36 additions & 0 deletions docs/agent_runtime_result_contract.md
@@ -9,6 +9,7 @@ Runtime may also append additive operation evidence blocks:
- `runtime_health_snapshot`
- `runtime_error_classification`
- `runtime_events`
- `runtime_operation_summary`

These blocks support downstream runtime operation reporting without turning Runtime into a scheduler or deployment decision owner.
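
Because the blocks are additive, a consumer can separate them from the rest of a result without depending on any of them being present. The sketch below assumes the result has already been parsed into a dict; `split_additive_blocks` is a hypothetical name, and only the four block names come from the contract.

```python
from typing import Any, Dict, Tuple

# Block names listed in the contract; every one is optional for consumers.
ADDITIVE_BLOCKS = (
    "runtime_health_snapshot",
    "runtime_error_classification",
    "runtime_events",
    "runtime_operation_summary",
)


def split_additive_blocks(
    result: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split a parsed result into (core fields, additive evidence blocks)."""
    evidence = {name: result[name] for name in ADDITIVE_BLOCKS if name in result}
    core = {key: value for key, value in result.items() if key not in ADDITIVE_BLOCKS}
    return core, evidence
```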

@@ -57,6 +58,7 @@ threshold. It records:
- `runtime_health_snapshot.timeout_observed: true`
- `runtime_error_classification.category: "runtime_timeout_observed"`
- `runtime_error_classification.retryable: true`
- `runtime_operation_summary.recommended_action: "review_latency_budget_or_degrade"`

Lab treats this as deployment review evidence. Runtime still only records the
observation; it does not cancel production requests or make deployment
@@ -86,6 +88,7 @@ When provided, Runtime appends:
"runs": 1,
"run_once": false,
"success": true,
"health_reason": "benchmark_completed",
"latency_mean_ms": 0.0,
"latency_p95_ms": 0.0,
"latency_p99_ms": 0.0,
@@ -150,14 +153,43 @@ When provided, Runtime appends:
"status": "none",
"category": "none",
"severity": "none",
"health_reason": "benchmark_completed",
"timeout_policy": "latency_threshold",
"timeout_budget_ms": 1,
"observed_mean_ms": 0.0,
"timeout_observed": false,
"retryable": false,
"retry_hint": "none"
},
{
"schema_version": "inferedge-runtime-event-v1",
"event_index": 3,
"type": "runtime_operation_summary_recorded",
"status": "ok",
"health_reason": "benchmark_completed",
"recommended_action": "none",
"risk_labels": [],
"evidence_gaps": ["thermal_memory_evidence_missing"]
}
],
"runtime_operation_summary": {
"schema_version": "inferedge-runtime-operation-summary-v1",
"observation_scope": "single_runtime_result",
"decision_owner": "lab",
"scheduler_owner": "orchestrator",
"production_cancellation": false,
"health_status": "ok",
"health_reason": "benchmark_completed",
"error_category": "none",
"retryable": false,
"recommended_action": "none",
"risk_labels": [],
"evidence_gaps": ["thermal_memory_evidence_missing"],
"timeout_observed": false,
"latency_budget_exceeded": false,
"deadline_missed": false,
"thermal_memory_evidence_available": false
},
"agent": {
"schema_version": "inferedge-runtime-agent-task-v1",
"source_contract": "inferedge-agent-manifest-v1",
@@ -220,7 +252,11 @@ When provided, Runtime appends:
- `execution_status` defaults to the Runtime benchmark status unless overridden.
- `runtime_health_snapshot`, `runtime_error_classification`, and `runtime_events` are additive and safe for existing consumers to ignore.
- `runtime_health_snapshot` includes backend availability, latency-budget/deadline observation, timeout observation, and tegrastats evidence availability when those values are known.
- `runtime_health_snapshot.health_reason` gives a compact reason such as `benchmark_completed`, `backend_unavailable_or_not_enabled`, `runtime_execution_skipped`, or `timeout_threshold_exceeded`.
- `runtime_events` uses additive `inferedge-runtime-event-v1` entries with sequential `event_index` values so Lab/Orchestrator reports can show a compact lifecycle trace.
- `runtime_operation_summary` is an additive handoff index for Lab/Orchestrator/AIGuard. It repeats the health reason, retryability, risk labels, evidence gaps, and a conservative `recommended_action` without making the deployment decision itself.
- `runtime_operation_summary.decision_owner` must remain `lab`, and `scheduler_owner` must remain `orchestrator`.
- `runtime_operation_summary.production_cancellation` is always `false`; Runtime records observations only.
- Runtime does not claim production request cancellation. `--timeout-ms` is an observation threshold: if a successful benchmark mean latency exceeds the configured threshold, Runtime records `timeout_observed: true`, `runtime_error_classification.category: runtime_timeout_observed`, and `retryable: true` for downstream reliability reporting.
- If execution is skipped because Runtime cannot complete the configured benchmark, Runtime records `runtime_error_classification.category: runtime_execution_skipped`, `severity: warning`, `retryable: true`, and `retry_hint: check_backend_availability`. This is failure-handling evidence for Lab/Orchestrator reporting, not a production worker retry loop.
- Without `--timeout-ms`, results record `timeout_policy: not_configured`, `timeout_budget_ms: null`, and `timeout_observed: false`.
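
The timeout rules above can be sketched as a small function. This is an illustration under stated assumptions rather than Runtime's implementation: `observe_timeout` is a hypothetical name, it covers only a successful benchmark, and it flattens fields that the contract places in `runtime_health_snapshot` and `runtime_error_classification` into one dict.

```python
from typing import Any, Dict, Optional


def observe_timeout(mean_latency_ms: float, timeout_ms: Optional[float]) -> Dict[str, Any]:
    """Derive timeout observation fields for a successful benchmark.

    Hypothetical sketch: it records an observation only and cancels nothing.
    """
    if timeout_ms is None:
        # No --timeout-ms given, so there is no threshold to compare against.
        return {
            "timeout_policy": "not_configured",
            "timeout_budget_ms": None,
            "timeout_observed": False,
        }
    observed = mean_latency_ms > timeout_ms
    fields: Dict[str, Any] = {
        "timeout_policy": "latency_threshold",
        "timeout_budget_ms": timeout_ms,
        "timeout_observed": observed,
    }
    if observed:
        # Mean latency exceeded the threshold: classify for downstream reporting.
        fields["category"] = "runtime_timeout_observed"
        fields["retryable"] = True
    return fields
```

The function never raises or interrupts work, which mirrors the point that `--timeout-ms` is an observation threshold and not a cancellation mechanism.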
26 changes: 26 additions & 0 deletions scripts/smoke_default.sh
@@ -62,6 +62,7 @@ assert data["jetson_evidence"]["tegrastats_summary"]["status"] == "not_provided"
health = data["runtime_health_snapshot"]
assert health["status"] == "degraded", health
assert health["success"] is False
assert health["health_reason"] == "backend_unavailable_or_not_enabled", health
assert health["timeout_policy"] == "not_configured"
assert health["timeout_observed"] is False
error = data["runtime_error_classification"]
@@ -73,6 +74,21 @@ assert error["retry_hint"] == "check_backend_availability", error
events = {event["type"]: event for event in data["runtime_events"]}
assert events["runtime_error_classified"]["category"] == "runtime_execution_skipped"
assert events["runtime_error_classified"]["retryable"] is True
assert events["runtime_error_classified"]["health_reason"] == health["health_reason"]
summary_event = events["runtime_operation_summary_recorded"]
assert summary_event["health_reason"] == health["health_reason"], summary_event
assert summary_event["recommended_action"] == "check_backend_availability", summary_event
operation = data["runtime_operation_summary"]
assert operation["schema_version"] == "inferedge-runtime-operation-summary-v1", operation
assert operation["decision_owner"] == "lab", operation
assert operation["scheduler_owner"] == "orchestrator", operation
assert operation["production_cancellation"] is False, operation
assert operation["health_status"] == "degraded", operation
assert operation["health_reason"] == health["health_reason"], operation
assert operation["recommended_action"] == "check_backend_availability", operation
assert "runtime_execution_skipped" in operation["risk_labels"], operation
assert "backend_unavailable" in operation["risk_labels"], operation
assert "timeout_policy_not_configured" in operation["evidence_gaps"], operation
PY

INFEREDGE_RUNTIME_RESULT_JSON="${OUTPUT_PATH}" python3 tests/test_lab_result_schema.py
@@ -108,6 +124,7 @@ health = data["runtime_health_snapshot"]
assert health["timeout_policy"] == "latency_threshold"
assert health["timeout_budget_ms"] == 1
assert health["timeout_observed"] is False
assert "health_reason" in health
assert health["latency_budget_ms"] == 33
assert "latency_budget_exceeded" in health
assert "deadline_missed" in health
@@ -120,8 +137,17 @@ assert "retry_hint" in error
events = {event["type"]: event for event in data["runtime_events"]}
assert events["runtime_error_classified"]["timeout_policy"] == "latency_threshold"
assert events["runtime_error_classified"]["timeout_budget_ms"] == 1
assert events["runtime_error_classified"]["health_reason"] == health["health_reason"]
assert events["benchmark_completed"]["latency_budget_ms"] == 33
assert events["runtime_operation_summary_recorded"]["health_reason"] == health["health_reason"]
assert [event["event_index"] for event in data["runtime_events"]] == list(range(len(data["runtime_events"])))
operation = data["runtime_operation_summary"]
assert operation["decision_owner"] == "lab", operation
assert operation["scheduler_owner"] == "orchestrator", operation
assert operation["production_cancellation"] is False, operation
assert operation["health_reason"] == health["health_reason"], operation
assert isinstance(operation["risk_labels"], list), operation
assert isinstance(operation["evidence_gaps"], list), operation
assert data["extra"]["agent_manifest_recorded"] is True
PY
