Popular repositories

  1. benchflow Public

    AI benchmark runtime framework that allows you to integrate and evaluate AI tasks using Docker-based benchmarks.

    Python, 168 stars, 15 forks

  2. pokemon-gym Public

    Python, 86 stars, 7 forks

  3. skillsbench Public

    SkillsBench evaluates how well skills work and how effectively agents use them.

    Python, 15 stars, 6 forks

  4. jfkarena Public

    TypeScript, 7 stars

  5. paperbench Public

    Python, 5 stars, 1 fork

  6. llm-builds-linux Public

    Python, 4 stars, 1 fork

Repositories

Showing 7 of 7 repositories
  • skillsbench Public

    SkillsBench evaluates how well skills work and how effectively agents use them.

    Python 15 Apache-2.0 6 0 1 Updated Jan 1, 2026
  • llm-builds-linux Public
    Python 4 1 0 8 Updated Dec 20, 2025
  • benchflow Public

    AI benchmark runtime framework that allows you to integrate and evaluate AI tasks using Docker-based benchmarks.

    Python 168 MIT 15 0 0 Updated Dec 19, 2025
  • pokemon-gym Public
    Python 86 7 0 0 Updated Jun 30, 2025
  • paperbench Public
    Python 5 MIT 1 0 0 Updated Apr 15, 2025
  • jfkarena Public
    TypeScript 7 0 0 0 Updated Apr 1, 2025
  • jfk-ocr-demo Public
    Python 0 0 0 0 Updated Mar 25, 2025
