EdgeEnv

Language: English | Korean (한국어)

InferEdgeEnv is a local-first run evidence registry and comparability checker for Edge AI inference benchmark results. The user-facing CLI command is edgeenv.

Start Here for v0.1.5

v0.1.5 is the current v1-complete release baseline. InferEdgeEnv v1 is complete as a local-first run evidence registry and comparability checker; later work should be treated as v1.1+ extensions, not missing MVP scope. The first path is:

Install and run doctor.
Record a deterministic fake run.
Try a local command run.
Compare only after EdgeEnv checks comparability.
Use Jetson docs only when you are ready to run EdgeEnv locally on the Jetson shell.

Validated scope: fake/local benchmark recording, artifact storage, registry lookup, export/import, comparability reports, optional resource metrics, read-only bundle summaries, and Jetson tegrastats sampled evidence through local execution on Jetson.

Start with Quickstart. If install fails while pip is fetching build dependencies, check Install And Quickstart Resilience before treating it as an EdgeEnv runtime failure.

If the first path is confusing or blocked, open a README Quickstart feedback issue and use the first-user feedback backlog to classify the first blocked step.

After the first fake run, choose the next path:

Connect your command: Local Command Contract Guide
Compare two runs: Compare Workflow Guide
Repeat Jetson measurements: Jetson Measurement Operations Checklist

Problem

Edge inference results are easy to record but hard to compare honestly. A latency number is only meaningful when model identity, input shape, precision, batch size, warmup/repeat protocol, and preprocess/postprocess boundaries are known.

EdgeEnv focuses on recording benchmark evidence locally and judging whether two runs are directly comparable, conditionally comparable, or not comparable.

What EdgeEnv Is Not

EdgeEnv is not:

An OS, bootloader, GRUB, BCD, or Linux compatibility layer
A VM, Docker, WSL, or cloud target manager
A cloud database, login/auth system, web dashboard, or public leaderboard
A model upload server or dataset upload server
A single-score ranking system for all models

Quickstart

Install and confirm both entrypoints:

python -m pip install -e ".[dev]"
python -m inferedge_env.cli doctor
edgeenv doctor

1. Record a Fake Run

Run the deterministic fake benchmark first. This checks the CLI, config schema, artifact writer, and registry without executing a real model.

edgeenv profile validate examples/profiles/local_fake.yaml
edgeenv bench validate examples/benches/yolov8n_fire.yaml
edgeenv bench run --target examples/profiles/local_fake.yaml --config examples/benches/yolov8n_fire.yaml
edgeenv runs list
edgeenv runs show <run_id>

Use the Run ID printed by bench run, or copy it from edgeenv runs list, when replacing <run_id>.

2. Record a Local Command Run

Then try the local runner examples. These execute small deterministic Python commands on the current machine.

edgeenv bench run --target examples/profiles/local.yaml --config examples/benches/local_echo_metrics.yaml
edgeenv bench run --target examples/profiles/local.yaml --config examples/benches/local_resource_metrics.yaml
edgeenv bench run --target examples/profiles/local.yaml --config examples/benches/local_template.yaml
edgeenv bench run --target examples/profiles/local.yaml --config examples/benches/local_adapter_template.yaml
edgeenv bench run --target examples/profiles/local.yaml --config examples/benches/local_runtime_adapter.yaml

The local target executes the configured command on the current machine and reads an explicit EDGEENV_METRICS_JSON= line from stdout. Local commands may also emit an optional EDGEENV_RESOURCE_METRICS_JSON= line for memory, power, energy, or temperature evidence. bench run reports whether resource metrics were stored or omitted.
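As a rough illustration of the stdout contract described above, a local command might time a small workload and print one prefixed JSON line. This is a minimal sketch, not the shipped template: run_once, measure, and format_metrics_line are hypothetical helpers, and the metric key names (latency_mean_ms, throughput_fps) are assumptions; check the Local Command Contract Guide for the names EdgeEnv actually expects.

```python
import json
import time

# Prefix taken from the README; everything after it is one JSON object.
METRICS_PREFIX = "EDGEENV_METRICS_JSON="


def run_once() -> None:
    """Stand-in for one inference call; replace with real work."""
    sum(i * i for i in range(10_000))


def measure(warmup_runs: int, repeat_runs: int) -> dict:
    """Run warmup iterations, then time repeat_runs iterations."""
    for _ in range(warmup_runs):
        run_once()
    latencies_ms = []
    for _ in range(repeat_runs):
        start = time.perf_counter()
        run_once()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    mean_ms = sum(latencies_ms) / len(latencies_ms)
    # Key names are hypothetical; use the names the contract guide requires.
    return {
        "latency_mean_ms": mean_ms,
        "throughput_fps": 1000.0 / mean_ms if mean_ms > 0 else 0.0,
    }


def format_metrics_line(metrics: dict) -> str:
    """Serialize metrics as a single prefixed line for stdout."""
    return METRICS_PREFIX + json.dumps(metrics, sort_keys=True)


if __name__ == "__main__":
    print(format_metrics_line(measure(warmup_runs=3, repeat_runs=10)))
```

Keeping the metrics on one explicit line means any other stdout output from the command can stay in stdout.log without being mistaken for results.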

To connect your own benchmark command, start from examples/scripts/adapter_template.py when wrapping an existing command, or examples/scripts/local_benchmark_template.py when writing the benchmark loop directly. Then review the adapter pattern in Local Real Benchmark Example Guide.

3. Compare Two Runs

Compare two registered runs after you have at least two successful run IDs. EdgeEnv prints the comparability judgement before any metric delta.

edgeenv bench run --target examples/profiles/local.yaml --config examples/benches/local_compare_a.yaml
edgeenv bench run --target examples/profiles/local.yaml --config examples/benches/local_compare_b.yaml
edgeenv runs list
edgeenv report compare <run_id_a> <run_id_b>

For the full flow, see Compare Workflow Guide.

4. Optional Resource And Sampler Evidence

Sampler wrapper examples show the first integration boundary for optional resource evidence.

edgeenv bench run --target examples/profiles/local.yaml --config examples/benches/local_sampler_wrapper.yaml
edgeenv bench run --target examples/profiles/local.yaml --config examples/benches/local_sampler_unavailable.yaml

On Jetson, use the tegrastats wrapper path from the repo root:

edgeenv bench run --target examples/profiles/jetson_nano_local.yaml --config examples/benches/jetson_tegrastats_local.yaml

For the sampler adapter lifecycle path on Jetson, use the sampled local profile and inspect sampler metadata without opening artifact files manually:

edgeenv bench run --target examples/profiles/jetson_nano_sampled_local.yaml --config examples/benches/jetson_sampled_local.yaml
edgeenv runs sampler show <run_id>

If a sampler is unavailable, the wrapper should omit EDGEENV_RESOURCE_METRICS_JSON= and preserve the successful primary benchmark run. If a wrapper emits malformed resource metrics, EdgeEnv writes a failed-run artifact and does not update the registry:

edgeenv bench run --target examples/profiles/local.yaml --config examples/benches/local_sampler_malformed_resource.yaml
edgeenv failed-runs list
edgeenv failed-runs show <failed_run_id>
edgeenv failed-runs export <failed_run_id> --output edgeenv-failed-run-<failed_run_id>.zip
edgeenv failed-runs import edgeenv-failed-run-<failed_run_id>.zip
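The sampler-unavailable behaviour described above can be sketched as a wrapper that emits the resource line only when sampling succeeds. This is an assumption-laden illustration, not the repository's wrapper: resource_lines and the sample callable are hypothetical, and the metric keys are copied from the resource_metrics example in this README.

```python
import json
from typing import Callable, List, Optional

# Prefix taken from the README.
RESOURCE_PREFIX = "EDGEENV_RESOURCE_METRICS_JSON="


def resource_lines(sample: Callable[[], Optional[dict]]) -> List[str]:
    """Return zero or one stdout lines for optional resource evidence.

    `sample` is a hypothetical callable that returns a metrics dict, returns
    None when no sampler is present, or raises OSError when the sampler
    binary cannot be started.
    """
    try:
        metrics = sample()
    except OSError:
        metrics = None
    if not metrics:
        # Sampler unavailable: omit the line entirely instead of printing a
        # partial or malformed payload, so the primary run still succeeds.
        return []
    return [RESOURCE_PREFIX + json.dumps(metrics, sort_keys=True)]


if __name__ == "__main__":
    # Example sampler with fixed values; a real one would read the device.
    for line in resource_lines(lambda: {"memory_peak_mb": 512.0,
                                        "source": "example-script"}):
        print(line)
```

Omitting the line, rather than printing an empty or partial object, keeps "no evidence" distinct from "malformed evidence", which the README says is recorded as a failed run.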

5. Inspect Evidence

runs show reads the result artifact and includes resource evidence when the local command emits it:

edgeenv runs show <run_id>
edgeenv runs resources list --metric memory_peak_mb
edgeenv runs resources list --metric memory_peak_mb --json

{
  "resource_metrics": {
    "energy_j": 31.7,
    "memory_mean_mb": 420.5,
    "memory_peak_mb": 512.0,
    "power_mean_w": 8.2,
    "power_peak_w": 11.4,
    "source": "example-script",
    "temperature_peak_c": 72.0
  }
}

The fake target uses FakeRunner, so it does not execute a real model. Local benchmark configs may set timeout_seconds, working_directory, and uppercase extra_env keys for controlled command execution. The Python package is inferedge_env; the user-facing CLI command remains edgeenv.

Guide Map

English representative path:

InferEdgeEnv Portfolio Summary — 30-second role, boundary, and reviewer path for this repository
Documentation Language Guide — choose the English representative path or Korean entry path
EdgeEnv v0.1.5 Follow-up Note — current v1-complete release baseline and trusted starting point
Portfolio Demo Path — reviewer-facing fake/local/compare/export-import/bundle-summary demo path
Local Command Contract Guide — how to connect your own local benchmark command
Compare Workflow Guide — how to judge comparability before reading metric deltas
Export/Import Design — portable evidence bundle contract
Schema Versioning And Migration Policy — evidence compatibility and future-version rejection policy
Release Maintenance Checklist — repeatable local, clean-room, optional Jetson, tag, and GitHub Release gate

Operational records:

EdgeEnv v0.1.5 Release Rehearsal — clean-room source archive release gate and patch-candidate judgement
EdgeEnv v0.1.4 Follow-up Note — previous release quality baseline
EdgeEnv v0.1.4 Bilingual Docs Sanity Sweep — README, Korean README, and representative docs reading-path check
EdgeEnv v0.1.4 Release Rehearsal — release quality gate run before the v0.1.4 candidate
EdgeEnv v0.1.4 Post-release Sanity Sweep — post-release check of README, follow-up note, and GitHub Release wording
Release Quality Gate Refresh — local release smoke script and optional Jetson gate after the six-month quality roadmap
README Quickstart Clean-room Rehearsal — fresh source archive and venv validation of the README path
Jetson Measurement Operations Checklist — repeated hardware measurement procedure
Jetson Sampled Evidence Bundle Handoff — sampled bundle export/import and imported compare validation
EdgeEnv MVP v1 Handoff Status — current capability snapshot and future-work entry points
First-user Feedback Backlog — v0.1.5 candidate usability observations before new feature work

Design references:

Benchmark Config Example

name: yolov8n-fire-fake
command: python run_yolov8n.py --input fire.jpg
model_name: yolov8n-fire
model_version: "1.0"
model_format: onnx
model_path: models/yolov8n-fire.onnx
task: object-detection
input_shape: [1, 3, 640, 640]
input_dtype: float32
runtime: fake-runtime
execution_provider: fake-provider
precision: fp32
batch_size: 1
warmup_runs: 3
repeat_runs: 10
include_preprocess: true
include_postprocess: true
timeout_seconds: 30
working_directory: .
extra_env:
  LOCAL_DEMO_FLAG: enabled

Target Profile Example

target_name: local-fake
target_type: fake
board_name: local-dev-machine
os: macOS
runtime_tags:
  - fake
  - local

MVP v1 accepts fake and local target types. SSH is reserved for a later version.

Comparability Rules

Required same-condition fields:

model_hash
input_shape
input_dtype
task
precision
batch_size
warmup_runs
repeat_runs
include_preprocess
include_postprocess

If these fields match and runtime, execution provider, and target also match, EdgeEnv reports:

Comparable: Yes
Mode: same-condition

For same-condition comparisons only, report compare also prints supplemental latency and throughput deltas after the comparability judgement. Conditional and non-comparable reports do not print metric deltas, and EdgeEnv does not produce rankings or composite scores.

If required fields differ, EdgeEnv reports:

Comparable: No
Reason:
- Different model hash
- Different input shape

If required fields match but runtime, execution provider, or target differs, EdgeEnv reports:

Comparable: Conditional
Mode: runtime-comparison
Reason:
- Same model hash
- Same input shape
- Different runtime or execution provider

Local Registry Layout

.edgeenv/
  runs.db
  runs/
    <run_id>/
      result.json
      config.yaml
      target.yaml
      env.json
      stdout.log
      stderr.log
  failed-runs/
    <run_id>/
      failure.json
      config.yaml
      target.yaml
      env.json
      stdout.log
      stderr.log

runs.db is a local SQLite index. The run directory remains the evidence bundle. Failed local runs are stored under failed-runs/ for debugging and are not inserted into runs.db. Use edgeenv failed-runs list and edgeenv failed-runs show <run_id> to inspect failed-run artifacts safely.
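Because the run directory is the evidence bundle, a quick completeness check can be done with the file names from the layout above and nothing else. The following is a sketch of such a check; audit and missing_files are hypothetical helpers, not EdgeEnv commands, and they read only the filesystem, never runs.db.

```python
from pathlib import Path

# File names copied from the layout shown above.
RUN_FILES = ("result.json", "config.yaml", "target.yaml",
             "env.json", "stdout.log", "stderr.log")
FAILED_RUN_FILES = ("failure.json",) + RUN_FILES[1:]


def missing_files(run_dir: Path, expected) -> list:
    """Return the expected file names that are absent from run_dir."""
    return [name for name in expected if not (run_dir / name).is_file()]


def audit(root: Path) -> dict:
    """Map 'runs/<id>' and 'failed-runs/<id>' to their missing files."""
    report = {}
    for sub, expected in (("runs", RUN_FILES),
                          ("failed-runs", FAILED_RUN_FILES)):
        base = root / sub
        if not base.is_dir():
            continue
        for run_dir in sorted(p for p in base.iterdir() if p.is_dir()):
            report[sub + "/" + run_dir.name] = missing_files(run_dir, expected)
    return report


if __name__ == "__main__":
    for name, missing in audit(Path(".edgeenv")).items():
        print(name, "OK" if not missing else "missing: " + ", ".join(missing))
```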

Resource metrics remain canonical in result.json. runs.db also keeps a rebuildable resource_metric_index so edgeenv runs resources list --metric <name> can find runs by normalized memory, power, energy, or temperature evidence without turning those values into rankings or comparability gates. Add --json when scripts need the same supplemental lookup results with explicit filters, units, and source counts.
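To make "rebuildable" concrete, the sketch below drops and recreates an index table purely from result.json files, so losing the index loses no evidence. Only the table name resource_metric_index comes from this README; the column layout, the rebuild_index and runs_with_metric helpers, and the assumption that result.json holds a top-level resource_metrics object are all hypothetical.

```python
import json
import sqlite3
from pathlib import Path


def rebuild_index(conn: sqlite3.Connection, runs_root: Path) -> int:
    """Recreate the index from result.json files; return the row count."""
    conn.execute("DROP TABLE IF EXISTS resource_metric_index")
    # Column layout is an assumption made for this sketch.
    conn.execute(
        "CREATE TABLE resource_metric_index "
        "(run_id TEXT, metric TEXT, value REAL)"
    )
    rows = []
    for result_path in sorted(runs_root.glob("*/result.json")):
        data = json.loads(result_path.read_text(encoding="utf-8"))
        for metric, value in (data.get("resource_metrics") or {}).items():
            # Index numeric values only; fields such as "source" are skipped.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                rows.append((result_path.parent.name, metric, float(value)))
    conn.executemany(
        "INSERT INTO resource_metric_index VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


def runs_with_metric(conn: sqlite3.Connection, metric: str) -> list:
    """Look up runs by metric name, ordered by run ID rather than value."""
    cursor = conn.execute(
        "SELECT run_id, value FROM resource_metric_index "
        "WHERE metric = ? ORDER BY run_id",
        (metric,),
    )
    return cursor.fetchall()
```

The lookup orders by run ID on purpose: the index exists to find evidence, not to rank runs.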

Use edgeenv runs export <run_id> --output edgeenv-run-<run_id>.zip to create a portable successful-run evidence bundle. Use edgeenv runs import edgeenv-run-<run_id>.zip to validate the bundle, copy it into .edgeenv/runs/, and rebuild the local registry row.

Use edgeenv failed-runs export <run_id> --output edgeenv-failed-run-<run_id>.zip and edgeenv failed-runs import edgeenv-failed-run-<run_id>.zip for portable failed-run diagnostic evidence. Failed-run import copies files into .edgeenv/failed-runs/ and does not update runs.db. The artifact-first zip contract is described in Export/Import Design.

Use edgeenv report bundle-summary --scenario <label>:<run_id_a>:<run_id_b> to generate a read-only Markdown handoff summary from imported successful runs and normal compare judgement. The summary is for human review only; it does not replace result.json, sampler artifacts, manifests, or report compare.

Relation To InferEdge And EdgeBench

InferEdge validates whether a model is deployable across build provenance, runtime execution, evaluation, comparison, optional diagnosis, and deployment decision reports.

In portfolio terms, InferEdgeLab is the validation / decision layer. InferEdgeEnv is the v0.1.5 v1-complete experiment hygiene / comparability layer.

InferEdgeEnv records whether benchmark evidence can be trusted and compared. Its scope is narrower and separate: local run artifacts, SQLite registry rows, portable evidence bundles, and comparability judgement.

In the top-level InferEdge ecosystem map, InferEdgeEnv is the v0.1.5 v1-complete experiment hygiene / comparability layer. It is not part of the pinned Core 4 validation path, but it has a completed role: preserving benchmark evidence and judging same-condition, conditional, or non-comparable runs before any metric delta is discussed.

InferEdgeOrchestrator is also separate: it is the post-deployment operation-control layer for scheduling, load shedding, telemetry, and runtime coordination after a model is already deployed. InferEdgeEnv does not control live inference operations; it records benchmark evidence and preserves honest comparison boundaries before or around review handoff.

EdgeBench is adjacent in benchmark motivation, but InferEdgeEnv is not a public leaderboard. It is a local-first run evidence registry and comparability checker, not a ranking surface.

MVP Scope

Included in MVP v1:

Python CLI skeleton
Typer-based CLI
Rich output
Pydantic benchmark config and target profile schemas
FakeRunner deterministic benchmark result
LocalRunner command execution with explicit metrics JSON capture
Local runtime adapter example for user-owned command integration
Result JSON and artifact directory creation
SQLite local registry
runs list and runs show
runs resources list
runs export
runs import
failed-runs list, failed-runs show, failed-runs export, and failed-runs import
Jetson tegrastats wrapper example for optional resource metrics
report compare comparability checker
report bundle-summary read-only Markdown handoff summary
pytest tests

Non-goals:

OS, VM, WSL, Docker, SSH target implementation
Cloud DB, auth, web dashboard, public leaderboard
Model or dataset upload service
Single-score model ranking
