inference: add health-first backend benchmark helper #243

Merged
sheawinkler merged 1 commit into main from sync/public-inference-benchmark-helper
May 30, 2026

@sheawinkler
Owner

Summary

  • upgrades the local inference benchmark helper to health-first endpoint probes
  • supports vLLM, vLLM Metal, MLX/mtplx, SGLang, llama.cpp, LM Studio, TGI, TensorRT-LLM, OpenAI-compatible, and Ollama endpoints
  • keeps generation optional with --chat so the script never pulls or launches models by default
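
The script itself is not shown in this conversation, so the following is only a minimal sketch of what a health-first probe with opt-in generation could look like. The function names (`probe_path`, `parse_chat_flag`, `probe`), the backend labels, and the choice of probe path per backend are assumptions for illustration, not the contents of scripts/benchmark_inference_backends.sh.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: names, labels, and paths are assumptions,
# not taken from the script added in this PR.

# Map a backend label to a cheap, read-only probe path.
probe_path() {
  case "$1" in
    vllm|sglang|llamacpp|tgi) printf '%s\n' "/health" ;;
    ollama)                   printf '%s\n' "/api/tags" ;;
    *)                        printf '%s\n' "/v1/models" ;;  # OpenAI-compatible fallback
  esac
}

# Print 1 if --chat was passed, otherwise 0, so generation stays off by default.
parse_chat_flag() {
  local run_chat=0 arg
  for arg in "$@"; do
    if [ "$arg" = "--chat" ]; then
      run_chat=1
    fi
  done
  printf '%s\n' "$run_chat"
}

# Probe one endpoint with a GET request. Fails on connection errors,
# timeouts, and HTTP 4xx/5xx. Never sends a generation request.
probe() {
  local base_url="$1" backend="$2"
  curl --silent --fail --max-time 3 --output /dev/null \
    "${base_url}$(probe_path "$backend")"
}
```

Under these assumptions, `probe http://localhost:11434 ollama && echo up` would report a running Ollama server without pulling or loading a model, and a chat request would only be issued when `parse_chat_flag "$@"` prints 1.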

Local verification

  • bash -n scripts/benchmark_inference_backends.sh
  • git diff --check

@sheawinkler merged commit 3eaf5c8 into main on May 30, 2026
1 check passed
@sheawinkler deleted the sync/public-inference-benchmark-helper branch on May 30, 2026 at 10:55