Increment public benchmarks that are affected by #2169 #2244

KartikP · 2026-01-12T16:41:12Z

Follows #2169.

Summary of issue: Many benchmarks (e.g., all that use the NeuralBenchmark() class) use explained_variance() to report a ceiled score. As pointed out in #2169, the ceiling is incorrectly squared: following Spearman-Brown correction, it is already a variance (reliability) ceiling. If the ceiling is already high, squaring reduces it only slightly, but if the ceiling is low, squaring lowers it dramatically. The result is an artificially low ceiling that inflates model scores, which biases results towards noise, since low-ceiling benchmarks are inflated the most.
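To make the size of the effect concrete, here is a minimal sketch. It assumes the previous behaviour amounts to dividing the squared raw score by the squared ceiling, and the corrected behaviour to dividing the squared raw score by the ceiling as-is. The function names and the raw value of 0.2 are hypothetical and are not taken from the codebase.

```python
def ceiled_with_squared_ceiling(raw: float, ceiling: float) -> float:
    """Assumed previous behaviour: the ceiling is squared along with the raw score."""
    return raw ** 2 / ceiling ** 2


def ceiled_with_reliability_ceiling(raw: float, ceiling: float) -> float:
    """Assumed corrected behaviour: the ceiling is already a variance ceiling."""
    return raw ** 2 / ceiling


raw = 0.2  # hypothetical raw correlation, for illustration only
for ceiling in (0.9, 0.3):
    old = ceiled_with_squared_ceiling(raw, ceiling)
    new = ceiled_with_reliability_ceiling(raw, ceiling)
    print(
        f"ceiling={ceiling}: effective ceiling {ceiling ** 2:.2f} vs {ceiling:.2f}, "
        f"ceiled score {old:.3f} vs {new:.3f}"
    )
```

Under these assumptions the two variants differ by a factor of 1/ceiling, so the gap is small for a ceiling near 1 and large for a low ceiling.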

Order of operations

1. Ensure a backup of the database is made and can be restored if necessary.
2. Merge "Corrected model-to-ceiling mapping under explained variance" (#2169).
3. Merge "Increment public benchmarks that are affected by #2169" (#2244).
4. Terminate the vision scoring triggered by the "benchmark change".
5. Manually trigger alexnet on all affected benchmarks to generate new brainscore_benchmarkinstance entries.
6. Add ceiling values for the Coggan family of benchmarks.
7. Write new model scores to the database based on the existing score_raw and ceiling (script/notebook to follow).
8. Manually trigger non-standard ceiling benchmarks.

Changes

The affected benchmarks are incremented here. This will create a new entry for each of them in the brainscore_benchmarkinstance table with an incremented version.

Recalculated scores will use the new id from the brainscore_benchmarkinstance table to reflect the update.

Public benchmarks can, for the most part, be recomputed directly from the existing score_raw and ceiling values in the database. Private (visibility) benchmarks will not be recomputed. Public benchmarks with non-standard ceiling entries in the database will be recomputed. This last category covers the Papale, Herbert, and Gifford benchmarks, which will soon be set to visible. These benchmarks have multiple ceilings across splits (or temporal bins), which are then summarized into a single ceiling value.
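The three cases above can be sketched as a small routing function. This is only an illustration: `ScoreRow`, its fields, and `route` are hypothetical names and do not reflect the actual database schema. It also assumes that non-standard-ceiling benchmarks are handled by re-triggering them, as the last step of the order of operations suggests.

```python
from dataclasses import dataclass


@dataclass
class ScoreRow:
    benchmark: str
    visible: bool            # hypothetical flag: public (True) or private (False)
    standard_ceiling: bool   # hypothetical flag: a single scalar ceiling is stored


def route(row: ScoreRow) -> str:
    """Decide how a score is handled, following the three cases described above."""
    if not row.visible:
        return "skip"                 # private benchmarks are not recomputed
    if not row.standard_ceiling:
        return "retrigger"            # assumed: re-run the benchmark manually
    return "recompute_from_database"  # use the stored score_raw and ceiling


# Hypothetical example rows, for illustration only.
rows = [
    ScoreRow("public-standard", visible=True, standard_ceiling=True),
    ScoreRow("public-multi-ceiling", visible=True, standard_ceiling=False),
    ScoreRow("private", visible=False, standard_ceiling=True),
]
for row in rows:
    print(row.benchmark, "->", route(row))
```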

Full list of affected benchmark families

Change in score

The Coggan family of benchmarks, with the following ceilings, is most affected:

tong.Coggan2024_fMRI.V1-rdm: 0.4477
tong.Coggan2024_fMRI.V2-rdm: 0.4493
tong.Coggan2024_fMRI.V4-rdm: 0.3348
tong.Coggan2024_fMRI.IT-rdm: 0.2397
tong.Coggan2024_behavior-ConditionWiseAccuracySimilarity: 0.6934
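As a rough illustration of why these benchmarks are hit hardest, the sketch below squares each ceiling listed above and computes the factor by which a ceiled score would have been inflated. It assumes the old score divides by the squared ceiling and the new score by the ceiling itself, so the inflation factor is 1/ceiling. The helper name `squared_ceiling_summary` is hypothetical.

```python
# Ceilings as listed above.
coggan_ceilings = {
    "tong.Coggan2024_fMRI.V1-rdm": 0.4477,
    "tong.Coggan2024_fMRI.V2-rdm": 0.4493,
    "tong.Coggan2024_fMRI.V4-rdm": 0.3348,
    "tong.Coggan2024_fMRI.IT-rdm": 0.2397,
    "tong.Coggan2024_behavior-ConditionWiseAccuracySimilarity": 0.6934,
}


def squared_ceiling_summary(ceilings: dict) -> dict:
    """For each benchmark, return the squared ceiling and the assumed inflation factor."""
    return {
        name: {"squared": c ** 2, "inflation": 1.0 / c}
        for name, c in ceilings.items()
    }


summary = squared_ceiling_summary(coggan_ceilings)
for name, row in summary.items():
    print(
        f"{name}: ceiling {coggan_ceilings[name]:.4f}, "
        f"squared {row['squared']:.4f}, inflation x{row['inflation']:.2f}"
    )
```

Under this assumption, the IT-rdm benchmark, which has the lowest ceiling, shows the largest inflation, and the behavioral benchmark the smallest.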

KartikP added 2 commits January 12, 2026 11:39

Increment public benchmarks that are affected by #2169

8959a9b

Increment private benchmarks that are affected by #2169

78f29fd
