Releases: gwonxhj/InferEdgeOrchestrator
v0.1.2 - Docs and TensorRT evidence patch
Docs and TensorRT validation evidence patch release.
This release does not change runtime scheduler behavior.
Highlights:
- Records the TensorRT worker/inference path and Jetson TensorRT runtime telemetry evidence.
- Adds Jetson TensorRT contention evidence for scheduler/load-shedding behavior.
- Adds distinct generated detector/classifier TensorRT contention evidence with curated sample telemetry.
- Updates README and PORTFOLIO wording so the project can be understood within 30 seconds, without implying any TensorRT/GPU throughput benchmark claims.
- Keeps raw reports, tegrastats logs, TensorRT plan files, and large model artifacts out of the repository.
Validation:
- Local full pytest: 60 passed.
- GitHub Actions pytest on Python 3.11: passed.
Boundary:
InferEdgeOrchestrator remains a lightweight edge runtime scheduler. v0.1.2 is not a Triton/DeepStream replacement release and not a TensorRT/GPU throughput benchmark.
InferEdgeOrchestrator v0.1.1
Docs and validation evidence patch release only.
No runtime scheduler behavior changes.
Added/packaged since v0.1.0:
- Portfolio brief in English and Korean
- Versioned sample telemetry artifacts
- Architecture documentation
- Validation evidence index
- Documentation link and language-pair pytest coverage
- Config guide documentation
- Tracked InferEdge handoff config sample
- Changelog promotion for v0.1.1
Validation:
- Local pytest: 29 passed
- GitHub Actions CI: pytest (Python 3.11) success
Key docs:
- CHANGELOG.md
- PORTFOLIO.md
- docs/validation_evidence.md
- docs/architecture.md
- configs/README.md
- examples/telemetry/README.md
InferEdgeOrchestrator v0.1.0
Initial portfolio-ready release for InferEdgeOrchestrator, a lightweight edge inference runtime scheduler for priority/deadline-aware multi-task control, bounded queues, adaptive load shedding, ONNX Runtime workers, and Jetson telemetry.
Highlights
- Scheduler core with config-driven task registration, bounded per-task queues, priority/deadline-aware scheduling, dummy worker, load shedding, and telemetry JSON export.
- ONNX Runtime worker support with config-selectable worker execution and identity ONNX smoke validation.
- Synthetic overload comparison showing detector p95 end-to-end latency improving from 782.0 ms (FIFO baseline) to 8.0 ms with the scheduler plus load shedding, while low-priority classifier work is intentionally dropped.
- Jetson Orin Nano dummy smoke validation on nano01 with telemetry generation, resource snapshots, and tegrastats parsing.
- Jetson ONNX Runtime smoke validation on nano01 using ONNX Runtime 1.23.2 and CPUExecutionProvider, with output metadata and resource snapshots recorded.
- InferEdge result.json file-based handoff for recommending Orchestrator task latency budgets without importing InferEdge internals.
- English/Korean documentation mirrors and GitHub Actions CI.
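The scheduler behavior listed above (bounded per-task queues, priority/deadline-aware selection, and load shedding) can be illustrated with a minimal sketch. This is not the project's implementation: the names `Task` and `SketchScheduler`, the drop-oldest overflow policy, and shedding by deadline expiry are all assumptions chosen for illustration, and the actual adaptive load-shedding logic may differ.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical task config; a lower `priority` value is more important."""
    name: str
    priority: int
    deadline_ms: float
    max_queue: int
    queue: deque = field(default_factory=deque)
    dropped: int = 0


class SketchScheduler:
    """Illustrative scheduler, not the InferEdgeOrchestrator code."""

    def __init__(self, tasks):
        self.tasks = {t.name: t for t in tasks}

    def submit(self, name, payload, now_ms):
        task = self.tasks[name]
        if len(task.queue) >= task.max_queue:
            # Bounded queue (assumed policy): drop the oldest request.
            task.queue.popleft()
            task.dropped += 1
        task.queue.append((now_ms, payload))

    def next_request(self, now_ms):
        # Load shedding (assumed policy): discard requests past their deadline.
        for task in self.tasks.values():
            while task.queue and now_ms - task.queue[0][0] > task.deadline_ms:
                task.queue.popleft()
                task.dropped += 1
        ready = [t for t in self.tasks.values() if t.queue]
        if not ready:
            return None
        # Highest priority first; ties go to the earliest absolute deadline.
        best = min(ready, key=lambda t: (t.priority, t.queue[0][0] + t.deadline_ms))
        _, payload = best.queue.popleft()
        return best.name, payload


# Usage: the detector outranks the classifier; stale classifier work is shed.
detector = Task("detector", priority=0, deadline_ms=50.0, max_queue=2)
classifier = Task("classifier", priority=1, deadline_ms=20.0, max_queue=2)
sched = SketchScheduler([detector, classifier])
sched.submit("classifier", "c0", now_ms=0.0)
for i in range(3):
    sched.submit("detector", f"d{i}", now_ms=float(i + 1))  # third submit evicts d0
print(sched.next_request(now_ms=30.0))  # prints ('detector', 'd1')
print(detector.dropped, classifier.dropped)  # prints 1 1
```

Passing timestamps in explicitly, instead of reading a clock inside the scheduler, keeps the sketch deterministic and easy to test.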
Validation
- Local pytest: 22 passed
- GitHub Actions CI: pytest (Python 3.11) success
- Latest main commit: 1b96bc2
Known Limitations
- This release is not a Triton or DeepStream replacement.
- The Jetson ONNX Runtime smoke validation exercises the worker path with CPUExecutionProvider; it is not TensorRT/GPU benchmark evidence.
- Jetson smoke artifacts are documented summaries, while raw reports remain ignored under reports/.