gwonxhj · hyeokjun32 · May 21, 2026 · May 21, 2026
diff --git a/README.ko.md b/README.ko.md
@@ -75,7 +75,7 @@ Runtime은 Forge `agent_manifest.json`을 선택적으로 읽어 기존 Lab-comp
 
 이 기능은 reliable edge agent runtime 방향의 첫 Runtime-side contract입니다. `agent_id`, `task_id`, `agent_type`, priority, latency budget, queue wait, fallback usage, telemetry context를 기록하지만 기존 `result.json`의 top-level compare/report 필드는 변경하지 않습니다.
 
-Runtime result JSON에는 `runtime_health_snapshot`, `runtime_error_classification`, `runtime_events`도 additive evidence로 기록됩니다. 이제 health snapshot은 backend availability, latency budget/deadline observation, tegrastats evidence availability를 함께 남기고, runtime events는 sequential `event_index`를 가진 lifecycle trace로 기록됩니다. `--timeout-ms`는 latency timeout 관측 기준을 남기는 옵션이며, production request cancellation을 의미하지 않습니다. 실행이 `skipped`로 끝나면 Runtime은 `runtime_execution_skipped`, `retryable: true`, `retry_hint: check_backend_availability`를 남겨 Lab/Orchestrator가 failure handling evidence로 해석할 수 있게 합니다.
+Runtime result JSON에는 `runtime_health_snapshot`, `runtime_error_classification`, `runtime_events`, `runtime_operation_summary`도 additive evidence로 기록됩니다. 이제 health snapshot은 backend availability, latency budget/deadline observation, tegrastats evidence availability와 `health_reason`을 함께 남기고, runtime events는 sequential `event_index`를 가진 lifecycle trace로 기록됩니다. `runtime_operation_summary`는 Lab/Orchestrator/AIGuard handoff용 compact index로 `risk_labels`, `evidence_gaps`, retryability, conservative `recommended_action`을 남기되 `decision_owner: lab`, `scheduler_owner: orchestrator`, `production_cancellation: false`를 유지합니다. `--timeout-ms`는 latency timeout 관측 기준을 남기는 옵션이며, production request cancellation을 의미하지 않습니다. 실행이 `skipped`로 끝나면 Runtime은 `runtime_execution_skipped`, `retryable: true`, `retry_hint: check_backend_availability`를 남겨 Lab/Orchestrator가 failure handling evidence로 해석할 수 있게 합니다.
 
 예시:
 

diff --git a/README.md b/README.md
@@ -490,6 +490,7 @@ Runtime result JSON also includes additive operation evidence blocks:
 - `runtime_health_snapshot`: execution health, backend/device context, backend availability, run count, latency/FPS summary, latency-budget/deadline observation, tegrastats evidence availability, and explicit timeout observation status. `--timeout-ms` records an observation threshold; it does not claim production request cancellation.
 - `runtime_error_classification`: structured success/error category, severity, retryability, retry hint, observed mean latency, and timeout budget for downstream report context. Skipped execution is recorded as `runtime_execution_skipped` with `retry_hint: check_backend_availability` so Lab/Orchestrator can explain runtime failure handling without treating Runtime as a worker daemon.
 - `runtime_events`: compact indexed lifecycle event log for configuration, benchmark completion, error classification, optional agent context, and tegrastats parsing.
+- `runtime_operation_summary`: compact handoff index for Lab/Orchestrator/AIGuard with `health_reason`, `risk_labels`, `evidence_gaps`, retryability, and a conservative `recommended_action`. It keeps `decision_owner: lab`, `scheduler_owner: orchestrator`, and `production_cancellation: false`.
 
 These fields are evidence for Orchestrator/Lab analysis. Runtime still does not schedule tasks or own deployment decisions.
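A downstream consumer could verify the ownership fields described in the `runtime_operation_summary` bullet before trusting the block. The following is a minimal sketch under that assumption: `check_operation_summary` and `EXPECTED_OWNERSHIP` are hypothetical names for illustration, not part of Runtime, Lab, or Orchestrator.

```python
from typing import Any

# Expected values come from the README bullet above. The helper name and the
# problem-string format are illustrative assumptions.
EXPECTED_OWNERSHIP = {
    "decision_owner": "lab",
    "scheduler_owner": "orchestrator",
    "production_cancellation": False,
}


def check_operation_summary(result: dict[str, Any]) -> list[str]:
    """Return a list of problems found in runtime_operation_summary."""
    summary = result.get("runtime_operation_summary")
    if summary is None:
        # The block is additive, so older results may not carry it.
        return []
    problems = []
    for key, expected in EXPECTED_OWNERSHIP.items():
        actual = summary.get(key)
        # Compare the type as well so that 0 or "false" is not accepted for False.
        if type(actual) is not type(expected) or actual != expected:
            problems.append(f"{key}: expected {expected!r}, got {actual!r}")
    for key in ("risk_labels", "evidence_gaps"):
        if not isinstance(summary.get(key), list):
            problems.append(f"{key}: expected a list")
    return problems
```

Returning an empty list for a missing block keeps the check compatible with results produced before this change, since the block is additive.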
 

diff --git a/docs/agent_runtime_result_contract.md b/docs/agent_runtime_result_contract.md
@@ -9,6 +9,7 @@ Runtime may also append additive operation evidence blocks:
 - `runtime_health_snapshot`
 - `runtime_error_classification`
 - `runtime_events`
+- `runtime_operation_summary`
 
 These blocks support downstream runtime operation reporting without turning Runtime into a scheduler or deployment decision owner.
 
@@ -57,6 +58,7 @@ threshold. It records:
 - `runtime_health_snapshot.timeout_observed: true`
 - `runtime_error_classification.category: "runtime_timeout_observed"`
 - `runtime_error_classification.retryable: true`
+- `runtime_operation_summary.recommended_action: "review_latency_budget_or_degrade"`
 
 Lab treats this as deployment review evidence. Runtime still only records the
 observation; it does not cancel production requests or make deployment
@@ -86,6 +88,7 @@ When provided, Runtime appends:
     "runs": 1,
     "run_once": false,
     "success": true,
+    "health_reason": "benchmark_completed",
     "latency_mean_ms": 0.0,
     "latency_p95_ms": 0.0,
     "latency_p99_ms": 0.0,
@@ -150,14 +153,43 @@ When provided, Runtime appends:
       "status": "none",
       "category": "none",
       "severity": "none",
+      "health_reason": "benchmark_completed",
       "timeout_policy": "latency_threshold",
       "timeout_budget_ms": 1,
       "observed_mean_ms": 0.0,
       "timeout_observed": false,
       "retryable": false,
       "retry_hint": "none"
+    },
+    {
+      "schema_version": "inferedge-runtime-event-v1",
+      "event_index": 3,
+      "type": "runtime_operation_summary_recorded",
+      "status": "ok",
+      "health_reason": "benchmark_completed",
+      "recommended_action": "none",
+      "risk_labels": [],
+      "evidence_gaps": ["thermal_memory_evidence_missing"]
     }
   ],
+  "runtime_operation_summary": {
+    "schema_version": "inferedge-runtime-operation-summary-v1",
+    "observation_scope": "single_runtime_result",
+    "decision_owner": "lab",
+    "scheduler_owner": "orchestrator",
+    "production_cancellation": false,
+    "health_status": "ok",
+    "health_reason": "benchmark_completed",
+    "error_category": "none",
+    "retryable": false,
+    "recommended_action": "none",
+    "risk_labels": [],
+    "evidence_gaps": ["thermal_memory_evidence_missing"],
+    "timeout_observed": false,
+    "latency_budget_exceeded": false,
+    "deadline_missed": false,
+    "thermal_memory_evidence_available": false
+  },
   "agent": {
     "schema_version": "inferedge-runtime-agent-task-v1",
     "source_contract": "inferedge-agent-manifest-v1",
@@ -220,7 +252,11 @@ When provided, Runtime appends:
 - `execution_status` defaults to the Runtime benchmark status unless overridden.
 - `runtime_health_snapshot`, `runtime_error_classification`, and `runtime_events` are additive and safe for existing consumers to ignore.
 - `runtime_health_snapshot` includes backend availability, latency-budget/deadline observation, timeout observation, and tegrastats evidence availability when those values are known.
+- `runtime_health_snapshot.health_reason` gives a compact reason such as `benchmark_completed`, `backend_unavailable_or_not_enabled`, `runtime_execution_skipped`, or `timeout_threshold_exceeded`.
 - `runtime_events` uses additive `inferedge-runtime-event-v1` entries with sequential `event_index` values so Lab/Orchestrator reports can show a compact lifecycle trace.
+- `runtime_operation_summary` is an additive handoff index for Lab/Orchestrator/AIGuard. It repeats the health reason, retryability, risk labels, evidence gaps, and a conservative `recommended_action` without making the deployment decision itself.
+- `runtime_operation_summary.decision_owner` must remain `lab`, and `scheduler_owner` must remain `orchestrator`.
+- `runtime_operation_summary.production_cancellation` is always `false`; Runtime records observations only.
 - Runtime does not claim production request cancellation. `--timeout-ms` is an observation threshold: if a successful benchmark mean latency exceeds the configured threshold, Runtime records `timeout_observed: true`, `runtime_error_classification.category: runtime_timeout_observed`, and `retryable: true` for downstream reliability reporting.
 - If execution is skipped because Runtime cannot complete the configured benchmark, Runtime records `runtime_error_classification.category: runtime_execution_skipped`, `severity: warning`, `retryable: true`, and `retry_hint: check_backend_availability`. This is failure-handling evidence for Lab/Orchestrator reporting, not a production worker retry loop.
 - Without `--timeout-ms`, results record `timeout_policy: not_configured`, `timeout_budget_ms: null`, and `timeout_observed: false`.
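The timeout rules in the notes above can be sketched as a single function. This is an illustration, not Runtime's implementation: `observe_timeout` is a hypothetical name, the fields are flattened into one dict although the real result spreads them across several blocks, and treating a mean latency equal to the threshold as not exceeded is an assumption.

```python
from typing import Optional


def observe_timeout(mean_latency_ms: float, timeout_ms: Optional[float]) -> dict:
    """Derive timeout observation fields for a successful benchmark run."""
    if timeout_ms is None:
        # No --timeout-ms given: nothing is observed against a threshold.
        return {
            "timeout_policy": "not_configured",
            "timeout_budget_ms": None,
            "timeout_observed": False,
        }
    # Observation only: nothing is cancelled when the threshold is exceeded.
    observed = mean_latency_ms > timeout_ms
    return {
        "timeout_policy": "latency_threshold",
        "timeout_budget_ms": timeout_ms,
        "observed_mean_ms": mean_latency_ms,
        "timeout_observed": observed,
        "category": "runtime_timeout_observed" if observed else "none",
        "retryable": observed,
        "recommended_action": (
            "review_latency_budget_or_degrade" if observed else "none"
        ),
    }
```

With the values from the example JSON (`timeout_budget_ms: 1`, `observed_mean_ms: 0.0`), this sketch yields `timeout_observed: False` and `retryable: False`, matching the documented block.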

diff --git a/scripts/smoke_default.sh b/scripts/smoke_default.sh
@@ -62,6 +62,7 @@ assert data["jetson_evidence"]["tegrastats_summary"]["status"] == "not_provided"
 health = data["runtime_health_snapshot"]
 assert health["status"] == "degraded", health
 assert health["success"] is False
+assert health["health_reason"] == "backend_unavailable_or_not_enabled", health
 assert health["timeout_policy"] == "not_configured"
 assert health["timeout_observed"] is False
 error = data["runtime_error_classification"]
@@ -73,6 +74,21 @@ assert error["retry_hint"] == "check_backend_availability", error
 events = {event["type"]: event for event in data["runtime_events"]}
 assert events["runtime_error_classified"]["category"] == "runtime_execution_skipped"
 assert events["runtime_error_classified"]["retryable"] is True
+assert events["runtime_error_classified"]["health_reason"] == health["health_reason"]
+summary_event = events["runtime_operation_summary_recorded"]
+assert summary_event["health_reason"] == health["health_reason"], summary_event
+assert summary_event["recommended_action"] == "check_backend_availability", summary_event
+operation = data["runtime_operation_summary"]
+assert operation["schema_version"] == "inferedge-runtime-operation-summary-v1", operation
+assert operation["decision_owner"] == "lab", operation
+assert operation["scheduler_owner"] == "orchestrator", operation
+assert operation["production_cancellation"] is False, operation
+assert operation["health_status"] == "degraded", operation
+assert operation["health_reason"] == health["health_reason"], operation
+assert operation["recommended_action"] == "check_backend_availability", operation
+assert "runtime_execution_skipped" in operation["risk_labels"], operation
+assert "backend_unavailable" in operation["risk_labels"], operation
+assert "timeout_policy_not_configured" in operation["evidence_gaps"], operation
 PY
 
 INFEREDGE_RUNTIME_RESULT_JSON="${OUTPUT_PATH}" python3 tests/test_lab_result_schema.py
@@ -108,6 +124,7 @@ health = data["runtime_health_snapshot"]
 assert health["timeout_policy"] == "latency_threshold"
 assert health["timeout_budget_ms"] == 1
 assert health["timeout_observed"] is False
+assert "health_reason" in health
 assert health["latency_budget_ms"] == 33
 assert "latency_budget_exceeded" in health
 assert "deadline_missed" in health
@@ -120,8 +137,17 @@ assert "retry_hint" in error
 events = {event["type"]: event for event in data["runtime_events"]}
 assert events["runtime_error_classified"]["timeout_policy"] == "latency_threshold"
 assert events["runtime_error_classified"]["timeout_budget_ms"] == 1
+assert events["runtime_error_classified"]["health_reason"] == health["health_reason"]
 assert events["benchmark_completed"]["latency_budget_ms"] == 33
+assert events["runtime_operation_summary_recorded"]["health_reason"] == health["health_reason"]
 assert [event["event_index"] for event in data["runtime_events"]] == list(range(len(data["runtime_events"])))
+operation = data["runtime_operation_summary"]
+assert operation["decision_owner"] == "lab", operation
+assert operation["scheduler_owner"] == "orchestrator", operation
+assert operation["production_cancellation"] is False, operation
+assert operation["health_reason"] == health["health_reason"], operation
+assert isinstance(operation["risk_labels"], list), operation
+assert isinstance(operation["evidence_gaps"], list), operation
 assert data["extra"]["agent_manifest_recorded"] is True
 PY
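Both smoke blocks repeat the same cross-block checks on `health_reason` and `event_index`. One possible refactor is a shared helper, sketched below; `check_cross_block_consistency` is a hypothetical name and is not part of this change.

```python
def check_cross_block_consistency(data: dict) -> None:
    """Assert that health_reason and event_index agree across result blocks."""
    health = data["runtime_health_snapshot"]
    events = data["runtime_events"]
    operation = data["runtime_operation_summary"]

    # event_index must count up from zero with no gaps or reordering.
    indexes = [event["event_index"] for event in events]
    assert indexes == list(range(len(events))), indexes

    # Every block that repeats health_reason must match the health snapshot.
    reason = health["health_reason"]
    by_type = {event["type"]: event for event in events}
    for event_type in ("runtime_error_classified", "runtime_operation_summary_recorded"):
        assert by_type[event_type]["health_reason"] == reason, by_type[event_type]
    assert operation["health_reason"] == reason, operation
```

Scenario-specific assertions, such as `recommended_action == "check_backend_availability"` for the skipped run, would stay in each smoke block.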