
LLenergyMeasure

License: MIT · Python 3.10+ · Code style: Ruff

Measure the energy efficiency of LLM inference across different implementation configurations.

LLenergyMeasure is a Python framework for measuring the energy consumption, throughput, and computational cost (FLOPs) of LLM inference across different deployment configurations. It helps researchers compare the energy efficiency of different models, inference engines, and a wide range of implementation decisions — reproducibly and at publication quality.


Key Features

  • Multi-engine inference — Transformers, vLLM, TensorRT-LLM, SGLang (planned)
  • GPU energy measurement — NVML, Zeus, CodeCarbon, others
  • Smart sweep system — define parameter grids and run Cartesian-product experiments automatically; the sweep hierarchy scopes each config field to the appropriate engine or component and prunes invalid combinations
  • Docker isolation — launches per-experiment containers with full GPU passthrough; up-to-date Docker images for each engine are published in the registry, with full runner configurability and a local mode also available. Every study pre-flight verifies that each image's ExperimentConfig schema fingerprint matches the host's, aborting with an actionable rebuild hint on drift (`llem doctor` runs the same check on demand)
  • Reproducibility — fixed seeds, cycle ordering, thermal management, environment snapshots, and the effective config recorded with every run
  • Built-in datasets — AI Energy Score benchmark prompts included; custom JSONL datasets also supported

Quick Install

pip install "llenergymeasure[transformers]"

Run your first measurement:

llem run --model gpt2 --engine transformers

See Installation for system requirements, Docker setup, and available extras. See Getting Started to run and interpret your first experiment.
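Custom JSONL datasets (noted under Key Features) are line-delimited JSON, one prompt per line. A minimal file might look like the following — the `prompt` field name is an assumption for illustration; check the documentation for the expected schema:

```json
{"prompt": "Explain the difference between latency and throughput."}
{"prompt": "Summarise the main causes of the 1929 stock market crash."}
```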


Documentation

Researcher Docs

| Guide | Description |
| --- | --- |
| Installation | System requirements, pip install, Docker setup path |
| Getting Started | First experiment, Transformers and Docker tracks |
| Docker Setup | NVIDIA Container Toolkit walkthrough for vLLM |
| Engine Configuration | Transformers vs vLLM, parameter support matrix |
| Study & Experiment Configuration | YAML reference, sweeps, config schema |
| CLI Reference | `llem run`, `llem config`, and `llem doctor` flags and options |
| Energy Measurement | NVML, Zeus, CodeCarbon backends, measurement mechanics |
| Measurement Methodology | Warmup, baseline, thermal management, reproducibility |
| Troubleshooting | Common issues, invalid combinations, getting help |

Policy Maker Guides

| Guide | Description |
| --- | --- |
| What We Measure | Plain-language explanation of energy, throughput, and FLOPs |
| Interpreting Results | How to read llenergymeasure output |
| Getting Started (Policy Maker) | Minimal path to running a measurement |
| Comparison with Other Benchmarks | MLPerf, AI Energy Score, CodeCarbon, Zeus context |

Contributing

Contributions welcome. See the development install instructions to set up a local environment.


License

MIT

About

Research framework for measuring LLM inference efficiency: energy (J/token), throughput (tok/s), and FLOPs with MLPerf-style benchmarking.
