Merged
5 changes: 5 additions & 0 deletions README.ko.md
@@ -58,12 +58,17 @@ Recommended demo flow:
```bash
poetry run inferedgelab demo-evidence-summary
poetry run inferedgelab demo-evidence-summary --format json
poetry run inferedgelab agent-runtime-report \
  --orchestration-summary examples/agent_runtime/agent_3_orchestration_summary.json \
  --guard-analysis examples/agent_runtime/aiguard_runtime_guard_analysis.json
poetry run inferedgelab export-demo-evidence --output reports/studio_demo_evidence.md
```

Load Demo Evidence loads the bundled ONNX Runtime CPU / TensorRT Jetson result fixtures; Run / Import / Jetson Helper are auxiliary features that extend the existing CLI/API workflow into the local UI.
Studio evidence and jobs are held in memory and are reset when the local server process restarts.

`agent-runtime-report` is an additive report path that bundles Orchestrator scheduling evidence and AIGuard runtime reliability `guard_analysis` into a Lab-owned agent deployment decision context. It does not change the existing Runtime result or compare contracts.

## Role of this repo

- Reads Runtime benchmark/result JSON and generates compare/report output.
6 changes: 6 additions & 0 deletions README.md
@@ -113,6 +113,9 @@ poetry run inferedgelab demo-evidence-summary
poetry run inferedgelab demo-evidence-summary --format json
poetry run inferedgelab portfolio-demo-check
poetry run inferedgelab core4-conformance-check
poetry run inferedgelab agent-runtime-report \
  --orchestration-summary examples/agent_runtime/agent_3_orchestration_summary.json \
  --guard-analysis examples/agent_runtime/aiguard_runtime_guard_analysis.json
poetry run inferedgelab export-demo-evidence --output reports/studio_demo_evidence.md
```

@@ -122,6 +125,9 @@ It validates the committed Studio fixtures, expected README/PPT metrics, portfol
It validates the bundled Forge manifest/metadata fixture, Runtime result JSON, Lab compare/deployment decision surface, and AIGuard `guard_analysis` evidence without mutating existing schemas.
The Lab decision surface now also exposes `policy_version`, `triggered_rules`, and `policy_summary` so reviewers can see which local policy rules produced deploy/review/block/unknown outcomes.

`agent-runtime-report` is an additive report path for the reliable edge agent runtime.
It bundles Orchestrator scheduling evidence and AIGuard runtime reliability `guard_analysis` into a Lab-owned agent deployment decision context without changing existing Runtime result or compare contracts.

![InferEdge Local Studio demo evidence](assets/images/local-studio-demo-evidence.png)

Verified demo fixture values:
77 changes: 77 additions & 0 deletions docs/portfolio/agent_runtime_reliability_report.md
@@ -0,0 +1,77 @@
# Agent Runtime Reliability Report

## Scope

This report is the first Lab-side bundle view for the reliable edge agent
runtime path.

It connects:

- Forge `agent_manifest.json` metadata
- Runtime `result.agent` metadata
- Orchestrator `inferedge-orchestration-summary-v1`
- AIGuard `inferedge-aiguard-diagnosis-v1`
- Lab-owned agent deployment decision context

This is a local-first report path. It is not a production cloud orchestration
dashboard and does not add DB/queue/auth/billing behavior.

## Demo Bundle

Committed lightweight fixtures:

- `examples/agent_runtime/agent_3_orchestration_summary.json`
- `examples/agent_runtime/aiguard_runtime_guard_analysis.json`

Generate a Markdown report:

```bash
poetry run inferedgelab agent-runtime-report \
  --orchestration-summary examples/agent_runtime/agent_3_orchestration_summary.json \
  --guard-analysis examples/agent_runtime/aiguard_runtime_guard_analysis.json \
  --format markdown \
  --output reports/agent_runtime_reliability_report.md
```

## Evidence Summary

| Evidence | Value |
|---|---:|
| executed_count | 10 |
| dropped_count | 14 |
| deadline_missed_count | 1 |
| fallback_count | 14 |
| drop_rate | 0.583333 |
| fallback_rate | 0.583333 |
| deadline_miss_rate | 0.1 |
| queue_backlog_policy_decision_count | 1 |
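
The rates in this table can be reproduced from the counts. The sketch below assumes that `drop_rate` and `fallback_rate` are taken over executed plus dropped tasks and that `deadline_miss_rate` is taken over executed tasks only. These denominators are inferred from the fixture values, not confirmed by the Lab implementation, and `reliability_rates` is a hypothetical helper name.

```python
def reliability_rates(executed: int, dropped: int, deadline_missed: int, fallback: int) -> dict:
    """Derive rates from counts. The denominators are an assumption (see lead-in)."""
    scheduled = executed + dropped  # assumed: every task is either executed or dropped
    return {
        "drop_rate": dropped / scheduled if scheduled else 0.0,
        "fallback_rate": fallback / scheduled if scheduled else 0.0,
        "deadline_miss_rate": deadline_missed / executed if executed else 0.0,
    }


# Counts taken from the Evidence Summary table.
rates = reliability_rates(executed=10, dropped=14, deadline_missed=1, fallback=14)
for name, value in rates.items():
    print(f"{name}: {value:.6g}")
```

With the fixture counts, 14 / 24 rounds to the 0.583333 shown above and 1 / 10 gives 0.1.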

## Lab Decision Context

Expected decision:

```text
blocked
```

Primary reason:

```text
Agent runtime reliability evidence indicates blocked deployment risk.
```

Triggered rules:

- `guard_blocked_runtime_block`
- `drop_rate_block`
- `fallback_rate_block`
- `deadline_miss_review`
- `queue_backlog_review`
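
One way to picture how evidence could map onto the rule names above is sketched here. This is not the Lab policy: the threshold constant, the comparison operators, the `*_block` / `*_review` suffix convention, and both function names are assumptions made for illustration. Only the rule names and the `blocked` outcome come from this report.

```python
# Hypothetical threshold; the real Lab policy values are not shown in this report.
BLOCK_RATE = 0.5


def triggered_rules(guard_verdict, drop_rate, fallback_rate,
                    deadline_miss_rate, queue_backlog_decisions):
    """Return rule names in a fixed order (illustrative logic only)."""
    rules = []
    if guard_verdict == "blocked":
        rules.append("guard_blocked_runtime_block")
    if drop_rate >= BLOCK_RATE:
        rules.append("drop_rate_block")
    if fallback_rate >= BLOCK_RATE:
        rules.append("fallback_rate_block")
    if deadline_miss_rate > 0.0:
        rules.append("deadline_miss_review")
    if queue_backlog_decisions > 0:
        rules.append("queue_backlog_review")
    return rules


def decide(rules):
    """Assumed precedence: any block rule wins over any review rule."""
    if any(rule.endswith("_block") for rule in rules):
        return "blocked"
    if any(rule.endswith("_review") for rule in rules):
        return "review"
    return "deploy"


rules = triggered_rules("blocked", 0.583333, 0.583333, 0.1, 1)
print(decide(rules), rules)
```

Under these assumptions the demo bundle triggers all five rules, and the block rules take precedence over the review rules.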

## Boundary

- Orchestrator records scheduling and policy evidence.
- AIGuard explains runtime reliability risk.
- Lab remains the final deployment decision owner.
- This report is an additive agent-runtime path and does not change existing
Runtime result, compare output, or classic deployment decision contracts.
69 changes: 69 additions & 0 deletions examples/agent_runtime/agent_3_orchestration_summary.json
@@ -0,0 +1,69 @@
{
  "schema_version": "inferedge-orchestration-summary-v1",
  "agent_runtime_summary": {
    "schema_version": "inferedge-orchestration-summary-v1",
    "source_contracts": {
      "forge_agent_manifest": "inferedge-agent-manifest-v1",
      "runtime_agent_result": "inferedge-runtime-agent-task-v1"
    },
    "agents": {
      "safety_monitor_agent": {
        "agent_id": "safety_monitor_agent",
        "agent_type": "safety",
        "priority": 100,
        "latency_budget_ms": 20.0,
        "fallback_policy": "protect",
        "task_id": "task_safety_monitor_agent"
      },
      "vision_agent": {
        "agent_id": "vision_agent",
        "agent_type": "vision",
        "priority": 90,
        "latency_budget_ms": 33.0,
        "fallback_policy": "drop_stale",
        "task_id": "task_vision_agent"
      },
      "voice_command_agent": {
        "agent_id": "voice_command_agent",
        "agent_type": "voice",
        "priority": 50,
        "latency_budget_ms": 120.0,
        "fallback_policy": "defer",
        "task_id": "task_voice_command_agent"
      }
    },
    "totals": {
      "executed_count": 10,
      "dropped_count": 14,
      "deadline_missed_count": 1,
      "fallback_count": 14,
      "policy_decision_count": 14,
      "overload_event_count": 14
    }
  },
  "policy_decision_log": [
    {
      "agent_id": "vision_agent",
      "task_id": "task_vision_agent",
      "decision": "load_shedding",
      "reason": "queue_backlog_threshold_exceeded",
      "fallback_used": true,
      "protected_agent_id": "safety_monitor_agent"
    }
  ],
  "drop_events": [
    {
      "agent_id": "vision_agent",
      "task_id": "task_vision_agent",
      "reason": "load_shedding_backlog_threshold_exceeded"
    }
  ],
  "overload_events": [
    {
      "agent_id": "vision_agent",
      "task_id": "task_vision_agent",
      "fallback_used": true,
      "reason": "queue_backlog_threshold_exceeded"
    }
  ]
}
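
A minimal sketch of reading this fixture shape with the standard library is shown below. It inlines a trimmed excerpt of the JSON above instead of opening the file, and the two derived views (agents ordered by priority, policy decisions counted by reason) are illustrative choices, not what the Lab report service necessarily computes.

```python
import json
from collections import Counter

# Trimmed excerpt of agent_3_orchestration_summary.json (same keys, fewer fields).
SUMMARY_EXCERPT = """
{
  "schema_version": "inferedge-orchestration-summary-v1",
  "agent_runtime_summary": {
    "agents": {
      "safety_monitor_agent": {"agent_id": "safety_monitor_agent", "priority": 100},
      "vision_agent": {"agent_id": "vision_agent", "priority": 90},
      "voice_command_agent": {"agent_id": "voice_command_agent", "priority": 50}
    },
    "totals": {"executed_count": 10, "dropped_count": 14}
  },
  "policy_decision_log": [
    {"agent_id": "vision_agent", "decision": "load_shedding",
     "reason": "queue_backlog_threshold_exceeded",
     "protected_agent_id": "safety_monitor_agent"}
  ]
}
"""

summary = json.loads(SUMMARY_EXCERPT)

# Order agents from highest to lowest priority.
agents = summary["agent_runtime_summary"]["agents"]
by_priority = sorted(agents.values(), key=lambda agent: agent["priority"], reverse=True)

# Count logged policy decisions per reason string.
reason_counts = Counter(entry["reason"] for entry in summary["policy_decision_log"])

print([agent["agent_id"] for agent in by_priority])
print(dict(reason_counts))
```

The single logged `queue_backlog_threshold_exceeded` entry is consistent with the `queue_backlog_policy_decision_count` of 1 in the portfolio report, though that correspondence is an inference from the fixtures.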
57 changes: 57 additions & 0 deletions examples/agent_runtime/aiguard_runtime_guard_analysis.json
@@ -0,0 +1,57 @@
{
  "schema_version": "inferedge-aiguard-diagnosis-v1",
  "source": {
    "orchestration_summary_schema_version": "inferedge-orchestration-summary-v1"
  },
  "guard_verdict": "blocked",
  "severity": "high",
  "confidence": 0.88,
  "primary_reason": "drop_rate indicates runtime reliability risk under orchestrated multi-agent load.",
  "evidence": [
    {
      "type": "excessive_drop_rate",
      "metric_name": "drop_rate",
      "observed_value": 0.5833333333333334,
      "baseline_value": null,
      "threshold": 0.2,
      "delta": null,
      "delta_pct": null,
      "increase_factor": null,
      "severity": "high",
      "status": "failed",
      "explanation": "Drop rate crossed the configured review threshold under synthetic 3-agent load.",
      "why_it_matters": "High drop rate can make camera or command workloads stale even if selected high-priority tasks are protected.",
      "suspected_causes": [
        "queue_backlog",
        "overload_load_shedding",
        "producer_rate_exceeds_runtime_capacity"
      ],
      "recommendation": "Tune target FPS, queue size, drop policy, or fallback policy for affected agents.",
      "raw_context": {
        "executed_count": 10,
        "dropped_count": 14
      }
    }
  ],
  "suspected_causes": [
    "queue_backlog",
    "overload_load_shedding",
    "producer_rate_exceeds_runtime_capacity"
  ],
  "recommendations": [
    "Tune target FPS, queue size, drop policy, or fallback policy for affected agents."
  ],
  "thresholds": {
    "drop_rate_review": 0.2,
    "drop_rate_blocked": 0.5
  },
  "baseline_summary": {},
  "candidate_summary": {
    "runtime_reliability": {
      "drop_rate": 0.5833333333333334,
      "fallback_rate": 0.5833333333333334,
      "deadline_miss_rate": 0.1
    }
  },
  "created_at": "2026-05-17T00:00:00Z"
}
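
The `thresholds` block above suggests a two-level check on `drop_rate`. The sketch below shows one plausible reading, assuming inclusive (`>=`) comparisons and a `pass` label for values under the review threshold; neither is confirmed by this fixture, and `classify_metric` is a hypothetical helper, not an AIGuard function.

```python
def classify_metric(metric: str, observed: float, thresholds: dict) -> str:
    """Compare an observed value against '<metric>_blocked' / '<metric>_review' keys."""
    blocked = thresholds.get(f"{metric}_blocked")
    review = thresholds.get(f"{metric}_review")
    if blocked is not None and observed >= blocked:
        return "blocked"
    if review is not None and observed >= review:
        return "review"
    return "pass"  # assumed label for "below every threshold"


# Values copied from the fixture.
thresholds = {"drop_rate_review": 0.2, "drop_rate_blocked": 0.5}
verdict = classify_metric("drop_rate", 0.5833333333333334, thresholds)
print(verdict)
```

Read this way, the observed 0.583 exceeds both the 0.2 review threshold and the 0.5 blocked threshold, which matches the `blocked` value of `guard_verdict`.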
4 changes: 4 additions & 0 deletions inferedgelab/cli.py
@@ -16,6 +16,7 @@
from inferedgelab.commands.demo_evidence import export_demo_evidence_cmd
from inferedgelab.commands.demo_evidence import portfolio_demo_check_cmd
from inferedgelab.commands.core4_conformance import core4_conformance_check_cmd
from inferedgelab.commands.agent_runtime_report import agent_runtime_report_cmd
from inferedgelab.commands.list_results import list_results_cmd
from inferedgelab.commands.history_report import history_report_cmd
from inferedgelab.commands.serve import serve_cmd
@@ -52,6 +53,9 @@ def version_cmd() -> None:
app.command("core4-conformance-check", help="Validate Forge/Runtime/Lab/AIGuard contract conformance")(
    core4_conformance_check_cmd
)
app.command("agent-runtime-report", help="Generate Agent Runtime Reliability report from Orchestrator/AIGuard evidence")(
    agent_runtime_report_cmd
)
app.command("list-results", help="List recent structured benchmark results")(list_results_cmd)
app.command("history-report", help="Generate HTML history report from structured benchmark results")(history_report_cmd)
app.command("serve", help="Run InferEdgeLab FastAPI server")(serve_cmd)
70 changes: 70 additions & 0 deletions inferedgelab/commands/agent_runtime_report.py
@@ -0,0 +1,70 @@
from __future__ import annotations

from pathlib import Path

import typer
from rich import print as rprint

from inferedgelab.services.agent_runtime_report import (
    agent_runtime_reliability_json,
    build_agent_runtime_reliability_markdown,
    load_agent_runtime_reliability_bundle,
)


def agent_runtime_report_cmd(
    orchestration_summary: str = typer.Option(
        ...,
        "--orchestration-summary",
        help="Path to InferEdgeOrchestrator orchestration_summary JSON",
    ),
    guard_analysis: str = typer.Option(
        "",
        "--guard-analysis",
        help="Optional AIGuard runtime reliability guard_analysis JSON",
    ),
    format: str = typer.Option("text", "--format", "-f", help="text/json/markdown"),
    output: str = typer.Option("", "--output", "-o", help="Optional output path"),
) -> None:
    report = load_agent_runtime_reliability_bundle(
        orchestration_summary_path=orchestration_summary,
        guard_analysis_path=guard_analysis or None,
    )
    normalized_format = format.strip().lower()
    if normalized_format == "json":
        text = agent_runtime_reliability_json(report)
    elif normalized_format in {"markdown", "md"}:
        text = build_agent_runtime_reliability_markdown(report)
    elif normalized_format == "text":
        text = _text_summary(report)
    else:
        raise typer.BadParameter("--format must be one of: text, json, markdown")

    if output:
        path = Path(output)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text, encoding="utf-8")
        rprint(f"[green]Saved[/green]: {path}")
    else:
        print(text, end="")


def _text_summary(report: dict) -> str:
    metrics = report["agent_runtime_summary"]["metrics"]
    decision = report["agent_deployment_decision"]
    guard = report["guard_summary"]
    lines = [
        "InferEdge Agent Runtime Reliability Report",
        f"schema_version: {report['schema_version']}",
        f"decision: {decision['decision']}",
        f"policy_version: {decision['policy_version']}",
        f"reason: {decision['reason']}",
        f"guard_verdict: {guard.get('guard_verdict')}",
        f"drop_rate: {metrics['drop_rate']:.6g}",
        f"fallback_rate: {metrics['fallback_rate']:.6g}",
        f"deadline_miss_rate: {metrics['deadline_miss_rate']:.6g}",
        "triggered_rules:",
    ]
    lines.extend(f"- {rule}" for rule in decision["triggered_rules"])
    lines.append("")
    return "\n".join(lines)
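
Two details of the command above are easy to check in isolation: the `--format` normalization (case-insensitive, with `md` accepted as an alias for `markdown`) and the `.6g` formatting used for rates. The standalone sketch below mirrors that logic without importing `typer` or `inferedgelab`; `normalize_format` is a hypothetical helper that raises `ValueError` where the command raises `typer.BadParameter`.

```python
def normalize_format(value: str) -> str:
    """Mirror the command's format handling: trim, lowercase, accept 'md' as an alias."""
    normalized = value.strip().lower()
    if normalized in {"markdown", "md"}:
        return "markdown"
    if normalized in {"json", "text"}:
        return normalized
    raise ValueError("--format must be one of: text, json, markdown")


def format_rate(value: float) -> str:
    """Format a rate with six significant digits, as _text_summary does."""
    return f"{value:.6g}"


print(normalize_format(" MD "))
print(format_rate(0.5833333333333334))
print(format_rate(0.1))
```

The `.6g` specifier keeps six significant digits and drops trailing zeros, which is why the fixture's 0.5833333333333334 appears as 0.583333 while 0.1 stays 0.1.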