Pull requests: openai/evals

Pull requests list

Fix EVALS_SHOW_EVAL_PROGRESS env var parsing
#1670 opened May 21, 2026 by LeSingh1
Refresh contribution documentation references
#1662 opened May 15, 2026 by MukundaKatta
feat: add MiniMax provider support
#1661 opened May 13, 2026 by octo-patch
chore: remove obsolete GPT-4 PR guidance
#1659 opened May 11, 2026 by extrasmall0
Add agent pre-action control eval
#1658 opened May 10, 2026 by mindbomber
13 tasks done
Add atr_prompt_injection eval (modelgraded safety, 16 multilingual samples)
#1657 opened May 10, 2026 by eeee2345
13 tasks done
Add agent-tool-abstention eval (13 samples, Match template)
#1656 opened May 8, 2026 by MukundaKatta
5 tasks done
Add agent-tool-routing eval (12 samples, Match template)
#1655 opened May 8, 2026 by MukundaKatta
5 tasks done
Fix OpenAI completion args routing
#1653 opened Apr 23, 2026 by kayametehan
Add explain mode to HumanCliSolver
#1652 opened Apr 23, 2026 by kayametehan
Handle nested token usage details in oaieval
#1650 opened Apr 23, 2026 by kayametehan
Add Turkish proverbs eval
#1649 opened Apr 23, 2026 by kayametehan
eval: add RAIL Score responsible AI evaluation across 8 dimensions
#1640 opened Apr 2, 2026 by SumitVermakgp
12 tasks done
README: fix Evals starter guide link
#1623 opened Feb 19, 2026 by dcol91863
Add Logic Stress Stress-test Suite (v2, v3)
#1622 opened Feb 16, 2026 by 14H034160212 (Contributor)