OpenReview pilot benchmark (ICLR 2025) + eval tooling by jwang1230 · Pull Request #65 · ChicagoHAI/OpenAIReview

jwang1230 · 2026-04-23T20:38:28Z

Adds benchmarks/openreview_benchmark: 10-paper ICLR 2025 pilot JSONL, scripts (collect/normalize/filter/PDFs/validate/eval), OPENREVIEW.md + REPORT.md, locked eval in reports/, eval_history.jsonl, and src/reviewer/evaluate_openreview.py for LLM-judge P/R/F1. Ignores openreview_raw, pdfs, and results/ locally.
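For reviewers who want the metric spelled out, here is a minimal sketch of how precision, recall, and F1 could be derived once an LLM judge has decided which comments match. It is not the code in src/reviewer/evaluate_openreview.py. The function name `prf1`, the `PRF1` container, and the four count arguments are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PRF1:
    precision: float
    recall: float
    f1: float


def prf1(matched_pred: int, total_pred: int,
         matched_gold: int, total_gold: int) -> PRF1:
    """Compute P/R/F1 from judge match counts (hypothetical interface).

    matched_pred: predicted comments the judge matched to some human comment
    total_pred:   all predicted comments
    matched_gold: human comments the judge found covered by some prediction
    total_gold:   all human comments
    """
    # Guard against empty sets so a paper with no comments yields 0.0, not an error.
    precision = matched_pred / total_pred if total_pred else 0.0
    recall = matched_gold / total_gold if total_gold else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return PRF1(precision, recall, f1)


if __name__ == "__main__":
    print(prf1(matched_pred=3, total_pred=4, matched_gold=3, total_gold=6))
```

Keeping separate matched counts for the predicted and human sides allows for many-to-one matches, where several predicted comments map to the same human comment. Whether the actual evaluator counts matches this way is not stated in this PR.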

- Pilot JSONL, scripts, docs, locked eval report and eval_history
- evaluate_openreview.py; .gitignore for local OpenReview paths

Add OpenReview ICLR 2025 pilot benchmark and LLM-judge eval

6e3251f


jwang1230 wants to merge 1 commit into ChicagoHAI:main from jwang1230:openreview-benchmark
