Compare Models — Daily Leaderboard + Pricing Merge

Compare Models keeps a daily, machine-readable snapshot of top chatbot models by arena ranking and inference pricing.
It fetches fresh rankings from the Open LLM/Chatbot Arena and merges them with public pricing from OpenRouter and LiteLLM, producing a single CSV you can use in notebooks, dashboards, or apps.

Primary artifact: chatbot_arena_leaderboard_with_cost.csv (updated daily).
A hosted web view with a cost calculator is also available.

What this repo contains

update_leaderboard.py — main script that:
- pulls the latest arena ranking
- fetches model pricing from OpenRouter & LiteLLM
- normalizes model names
- merges everything into a single table
- writes chatbot_arena_leaderboard_with_cost.csv
chatbot_arena_leaderboard_with_cost.csv — the daily output (committed so it’s easy to consume)
.github/workflows/ — CI that runs the update daily and pushes changes
docs/ & markdown_preview.md — optional static preview material (for GitHub Pages or a simple hosted view)
requirements.txt — minimal Python deps
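
The normalize-and-merge steps listed for update_leaderboard.py can be sketched roughly as follows. This is not the script's actual code: the normalize_name helper, the key column, the model names, and every number below are made up for illustration, and the outer join is an assumption chosen so that unmatched rows are kept.

```python
import re

import pandas as pd


def normalize_name(name: str) -> str:
    """Hypothetical normalizer: lowercase, drop a 'vendor/' prefix, unify separators."""
    name = name.strip().lower()
    name = name.split("/")[-1]           # "vendor-x/alpha-chat" -> "alpha-chat"
    return re.sub(r"[\s_]+", "-", name)  # spaces and underscores -> "-"


# Made-up ranking and pricing tables standing in for the real sources.
rankings = pd.DataFrame({
    "model": ["Alpha Chat", "Beta_Model", "Gamma Only Ranked"],
    "arena_score": [1300, 1250, 1100],
})
pricing = pd.DataFrame({
    "model": ["vendor-x/alpha-chat", "vendor-y/Beta-Model", "vendor-z/delta-only-priced"],
    "input_usd_per_1k_tokens": [0.005, 0.015, 0.001],
    "output_usd_per_1k_tokens": [0.015, 0.075, 0.002],
})

# Join on the normalized key rather than on the raw display names.
rankings["key"] = rankings["model"].map(normalize_name)
pricing["key"] = pricing["model"].map(normalize_name)

# Outer join: rows present on only one side survive with NaN in the other side's columns.
merged = rankings.merge(pricing.drop(columns="model"), on="key", how="outer")
print(merged)
```

Joining on a derived key keeps the original display name available for output while letting aliases from different sources line up.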

Why this exists

Comparing models is hard when quality and price live in different places. This repo gives you a single source of truth that answers:

Which models rank highly today?
What do they cost per input/output token?
What’s the best quality-per-dollar tradeoff right now?

Quick start

1) Clone and install

git clone https://github.com/ZachLaik/compare-models.git
cd compare-models
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

2) Run the updater locally

python update_leaderboard.py

This will regenerate chatbot_arena_leaderboard_with_cost.csv in the repo root.

No API keys are required for the default sources. If you point the script to authenticated endpoints in the future, document the required env vars here.

⸻

Using the CSV

In Python (pandas)

import pandas as pd

df = pd.read_csv("chatbot_arena_leaderboard_with_cost.csv")

Examples:

Top 10 by arena score

print(df.sort_values("arena_score", ascending=False).head(10))

# Best value: arena score divided by the output price in USD per 1k tokens (example columns)
# Note: a price of 0 yields inf and a missing price yields NaN.
df["value_score"] = df["arena_score"] / df["output_usd_per_1k_tokens"]
print(df.sort_values("value_score", ascending=False).head(10))
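
If free or unpriced models are in the table, the plain division above gives inf or NaN. A minimal sketch of a more defensive version follows; the add_value_score helper is hypothetical, the demo rows are made up, and the column names are the example ones used above, so check them against the real CSV header.

```python
import pandas as pd


def add_value_score(df, score_col="arena_score", price_col="output_usd_per_1k_tokens"):
    """Return a copy of df with a value_score column; zero or missing prices give NaN."""
    out = df.copy()
    score = pd.to_numeric(out[score_col], errors="coerce")
    price = pd.to_numeric(out[price_col], errors="coerce")
    # Treat non-positive prices as missing so they cannot produce inf.
    price = price.where(price > 0)
    out["value_score"] = score / price
    return out


# Made-up rows: one priced model, one free model, one with no price at all.
demo = pd.DataFrame({
    "model": ["model-a", "model-b", "model-c"],
    "arena_score": [1200.0, 1100.0, 1000.0],
    "output_usd_per_1k_tokens": [0.03, 0.0, None],
})
ranked = add_value_score(demo).sort_values(
    "value_score", ascending=False, na_position="last"
)
print(ranked[["model", "value_score"]])
```

Whether a free model should rank first or be excluded is a judgment call; this sketch excludes it from the ranking by leaving its value_score as NaN.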

In JavaScript (browser/Node)

// Browser (with a raw link to the CSV in your GitHub or CDN)
const res = await fetch('chatbot_arena_leaderboard_with_cost.csv');
const text = await res.text();
// Parse with PapaParse or your favorite CSV lib

⸻

Data sources & update cadence

- Ranking: Open LLM / Chatbot Arena (daily pull)
- Pricing: OpenRouter & LiteLLM public pricing pages/APIs (daily pull)
- Schedule: GitHub Actions runs once per day and commits the refreshed CSV

⸻

GitHub Actions (CI)

The workflow in .github/workflows/:

- creates a Python environment
- runs update_leaderboard.py
- commits the new CSV if anything changed

If you ever need to modify the cadence, update the schedule: block in the workflow file.

⸻

File schema (CSV)

Columns may evolve, but the table generally includes:

- model: normalized model name
- provider: e.g., OpenRouter/LiteLLM
- arena_rank / arena_score: model standing in the arena
- input_usd_per_1k_tokens: input pricing
- output_usd_per_1k_tokens: output pricing
- source_rank_url / source_price_url: provenance (when available)
- Additional helper columns used for joins and normalization

Check the current header row of chatbot_arena_leaderboard_with_cost.csv for the exact set.
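
One way to do that check from code, as a small sketch using only the standard library. The read_header and missing_columns helpers are hypothetical names, and the expected column list is an assumption taken from the example names above.

```python
import csv
from pathlib import Path


def read_header(path):
    """Return the first row of a CSV file as a list of column names."""
    with open(path, newline="", encoding="utf-8") as f:
        return next(csv.reader(f), [])


def missing_columns(path, expected):
    """Return the expected column names that are absent from the header, in order."""
    header = read_header(path)
    return [col for col in expected if col not in header]


csv_path = Path("chatbot_arena_leaderboard_with_cost.csv")
if csv_path.exists():
    print("Columns:", read_header(csv_path))
    # Assumed column names; adjust to whatever your code depends on.
    print("Missing:", missing_columns(csv_path, ["model", "arena_score"]))
```

Running a check like this before downstream processing makes a renamed column fail loudly instead of producing a silent KeyError later.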

⸻

Reproducibility & notes

- The script applies simple name normalization so “model aliases” map together before merging.
- If a model appears in rankings but not in pricing (or vice versa), it’s included with NaN for missing fields.
- No historical backfill: This repo is a daily snapshot. If you need history, keep the CSVs per date (/snapshots/2025-08-19.csv, etc.) or log to a datastore.
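
The per-date snapshot idea can be sketched as a small copy step run after each update. The archive_snapshot helper and the snapshots directory name are assumptions for illustration; the repo does not ship this.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_snapshot(csv_path, snapshot_dir="snapshots", day=None):
    """Copy csv_path to <snapshot_dir>/<YYYY-MM-DD>.csv and return the new path."""
    day = day or date.today()
    dest_dir = Path(snapshot_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / f"{day.isoformat()}.csv"
    shutil.copy2(csv_path, dest)  # overwrites an existing snapshot for the same day
    return dest


source = Path("chatbot_arena_leaderboard_with_cost.csv")
if source.exists():
    print("Archived to", archive_snapshot(source))
```

ISO dates in the file name sort chronologically, so a plain directory listing doubles as a timeline.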

⸻

Roadmap ideas

- Publish a small JSON API (Cloudflare Workers / GitHub Pages + JS) for easy consumption.
- Add value metrics (e.g., score per $), latency, context window size.
- Expand pricing sources (vendor pages) and add validation checks.
- Keep a /snapshots/ folder for historical trend charts.

⸻

Contributing

PRs welcome! Please keep the output CSV stable and documented, and avoid breaking column names without a migration note.

⸻

License

MIT

⸻

Acknowledgements

- Open LLM/Chatbot Arena for community rankings.
- OpenRouter and LiteLLM for public model pricing.
