inference: add health-first backend benchmark helper by sheawinkler · Pull Request #243 · sheawinkler/ContextLattice

sheawinkler · 2026-05-30T10:54:48Z

Summary

- upgrades the local inference benchmark helper to probe endpoint health first, before any other work
- supports vLLM, vLLM Metal, MLX/mtplx, SGLang, llama.cpp, LM Studio, TGI, TensorRT-LLM, OpenAI-compatible, and Ollama endpoints
- keeps generation optional behind --chat, so the script never pulls or launches models by default
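
A minimal sketch of what a health-first probe with opt-in generation could look like, for readers who have not opened the diff. This is not the helper's actual code: the `HEALTH_PATHS` table, the `probe` and `plan` functions, and the `--base-url` and `--backend` flags are hypothetical names chosen for illustration, and the probe paths are assumptions that should be checked against each backend's own documentation. Only the `--chat` flag comes from this PR.

```python
import argparse
import time
import urllib.error
import urllib.request

# Assumed probe paths (hypothetical table, not taken from the PR).
# Verify each route against the backend's documentation before relying on it.
HEALTH_PATHS = {
    "vllm": "/health",
    "llamacpp": "/health",
    "openai": "/v1/models",
    "ollama": "/api/tags",
}


def http_get_status(url, timeout=2.0):
    """Return the HTTP status code of a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return None


def probe(base_url, backend, get_status=http_get_status):
    """Probe one endpoint's health route; never sends a generation request."""
    path = HEALTH_PATHS.get(backend, "/v1/models")
    url = base_url.rstrip("/") + path
    start = time.perf_counter()
    status = get_status(url)
    latency_ms = (time.perf_counter() - start) * 1000.0
    return {
        "backend": backend,
        "url": url,
        "status": status,
        "healthy": status == 200,
        "latency_ms": latency_ms,
    }


def plan(argv):
    """Parse flags and list the steps to run; generation is opt-in via --chat."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--base-url", default="http://127.0.0.1:8000")
    parser.add_argument("--backend", default="openai", choices=sorted(HEALTH_PATHS))
    parser.add_argument("--chat", action="store_true")
    args = parser.parse_args(argv)
    steps = ["health"]
    if args.chat:
        steps.append("chat")
    return args, steps
```

In this sketch, `plan([])` yields only the `health` step, so nothing is generated unless `--chat` is passed. The `get_status` parameter lets the probe be exercised without a running server.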

inference: add health-first backend benchmark helper

7ca3313

sheawinkler merged commit 3eaf5c8 into main May 30, 2026
1 check passed

sheawinkler deleted the sync/public-inference-benchmark-helper branch May 30, 2026 10:55