Open
Description
Looking at the code below, it appears that the evaluation can be run on either the "math_test" or the "math_500_test" dataset.
Line 32 in ee3b031
split: Literal["math_test", "math_500_test"] = "math_test",
Which of these two datasets was used to obtain the scores listed under Benchmark Results in README.md?
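To make the question concrete, here is a minimal sketch of how a `split` parameter typed like the one quoted above falls back to "math_test" when the caller passes nothing. The alias `Split` and the helper `describe_split` are hypothetical names used only for illustration; they are not taken from the repository, and the actual evaluation code may behave differently.

```python
from typing import Literal, get_args

# Hypothetical alias mirroring the type annotation quoted in this issue.
Split = Literal["math_test", "math_500_test"]


def describe_split(split: Split = "math_test") -> str:
    """Return the split that would be evaluated (illustrative helper only)."""
    # Literal is a static type hint, so the value is also checked at runtime.
    if split not in get_args(Split):
        raise ValueError(f"unknown split: {split!r}")
    return split


if __name__ == "__main__":
    print(describe_split())                 # no argument: the default split
    print(describe_split("math_500_test"))  # explicit choice of the other split
```

If the README scores came from a run that did not set `split` explicitly, a signature like this would imply "math_test", but that is exactly what I would like confirmed.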