justllm

Production LLM calls. Just three lines.

from justllm import LLM

llm = LLM("anthropic/claude-opus-4-8")
llm("Summarize this contract.")

That call already does the work you'd normally wire up yourself, on by default:

Context compression. Headroom shrinks tool output by 50–95% before it reaches the model.
Prompt-cache optimization. Cache breakpoints go where each provider wants them (Anthropic, OpenAI, Google).
Reliability. Calls retry with backoff, then fail over to the next provider.
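To make the reliability behavior concrete, here is a minimal sketch of "retry with backoff, then fail over" in plain Python. It is not justllm's implementation. The name `call_with_failover`, its parameters, and the idea of passing providers as plain callables are assumptions made for illustration.

```python
import time
from typing import Callable, Optional, Sequence


def call_with_failover(
    providers: Sequence[Callable[[str], str]],
    prompt: str,
    retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try each provider in order, retrying each with exponential backoff.

    Hypothetical helper: `providers` are ordinary callables standing in for
    real provider clients.
    """
    last_error: Optional[Exception] = None
    for provider in providers:
        for attempt in range(retries):
            try:
                return provider(prompt)
            except Exception as exc:  # a real client would catch only transient errors
                last_error = exc
                if attempt < retries - 1:
                    # Delay doubles on each retry: base_delay, 2x, 4x, ...
                    sleep(base_delay * 2 ** attempt)
        # Retries exhausted for this provider; move on to the next one.
    raise RuntimeError("all providers failed") from last_error
```

The `sleep` parameter is injectable only so the backoff schedule can be checked without waiting. In a real client the exception filter would be narrower (timeouts, rate limits, 5xx responses).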

You don't call any of these yourself; they run inside llm(...). To turn them off per client: LLM(model, compress=False, cache="off").

pip install 'justllm[all]'

More, when you need it

You set up llm once (those three lines). After that, each of these is a single call on it. Reach for the ones you need and ignore the rest:

llm.stream("...")                    # token streaming
await llm.acall("...")               # async
llm.map(prompts, concurrency=8)      # many prompts at once, in order
llm.extract(Invoice, text)           # structured output (validated Pydantic)
llm.chat()                           # multi-turn, keeps history
llm.agent(system="...").run("...")   # tool-calling loop
llm.judge(output, criteria="...")    # LLM-as-judge score
llm.evaluate(cases)                  # run + grade a test set
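The "many prompts at once, in order" behavior of `llm.map` can be pictured with a short standard-library sketch. This is an illustration of the idea, not how justllm does it. `ordered_map` and the `fn` callable are hypothetical names, and a thread pool is only one of several ways to bound concurrency.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def ordered_map(
    fn: Callable[[str], str],
    prompts: Iterable[str],
    concurrency: int = 8,
) -> List[str]:
    """Run fn over prompts with at most `concurrency` workers.

    Results come back in the same order as the input prompts.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # Executor.map yields results in submission order,
        # even when a later call finishes before an earlier one.
        return list(pool.map(fn, prompts))
```

Here `fn` stands in for a single model call, so the caller gets one result per prompt at the matching index regardless of which request returned first.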

Also there, all opt-in: llm.embed(...), routing (Router and Cascade), OpenTelemetry traces with per-call dollar cost, Langfuse-backed prompts, and exact-match caching. Runnable versions of everything are in the cookbook.
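For readers unfamiliar with exact-match caching, the sketch below shows the general idea: hash the full request and reuse the stored response only when every field is identical. It is an assumption-laden illustration, not justllm's cache. `ExactMatchCache`, `call_model`, and the key layout are all hypothetical.

```python
import hashlib
import json
from typing import Any, Callable, Dict


class ExactMatchCache:
    """Wrap a model-calling function and reuse results for identical requests."""

    def __init__(self, call_model: Callable[..., str]) -> None:
        self._call_model = call_model  # hypothetical: call_model(model, prompt, **params)
        self._store: Dict[str, str] = {}
        self.hits = 0

    @staticmethod
    def _key(model: str, prompt: str, params: Dict[str, Any]) -> str:
        # sort_keys makes the key independent of keyword-argument order.
        payload = json.dumps(
            {"model": model, "prompt": prompt, "params": params}, sort_keys=True
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def __call__(self, model: str, prompt: str, **params: Any) -> str:
        key = self._key(model, prompt, params)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        result = self._call_model(model, prompt, **params)
        self._store[key] = result
        return result
```

Any change to the model name, the prompt text, or a sampling parameter produces a different key, which is what distinguishes exact-match caching from semantic or provider-side prompt caching.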


Why

The ecosystem splits two ways: powerful but heavy (LiteLLM, LangChain), or simple but thin (aisuite, any-llm). justllm sits in the middle: every optimization is on by default, and the surface stays at three lines. Keeping it that small was most of the work.

Feature	justllm	LiteLLM	aisuite
three-line call	yes	yes	yes
cross-provider fallback	on by default	config	no
context compression	on by default (Headroom)	manual trim	no
prompt-cache optimization	on by default	passthrough	no
structured output	yes (instructor)	passthrough	no
tool-calling agent	yes (minimal)	no	no
surface area	tiny	large	tiny

It runs on LiteLLM underneath, so think of it as the opinionated layer on top rather than a replacement.

Alpha. The wiring is tested on CI (Python 3.10–3.13) and the call paths are checked against live models.

Cookbook · Roadmap · Changelog · Contributing · MIT
