Related open benchmark: Helium Model Worldview (values + cue-swap consistency) #3

Description

@connerlambden

Helium Trades released an open benchmark that may complement ValueBench on value consistency and political lean.

Helium Model Worldview Benchmark (304 prompts):

  • Stated priorities vs forced tradeoffs
  • Name-swap and cue-swap consistency
  • 50 balanced political Likert items
  • Safety refusal profiles across 12 models

Dataset: https://huggingface.co/datasets/HeliumTrades/helium-model-worldview-benchmark
Overview: https://heliumtrades.com/benchmarks/
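
For anyone unfamiliar with the name-swap and cue-swap idea listed above, here is a minimal sketch of how such a consistency rate could be computed. It is an illustration only, not the benchmark's own scoring code: the `swap_consistency` function, the `(pair_id, variant, answer)` tuple layout, and the sample answers are all hypothetical and are not taken from the dataset.

```python
from collections import defaultdict


def swap_consistency(responses):
    """Return the fraction of prompt pairs answered identically across variants.

    `responses` is a list of (pair_id, variant, answer) tuples. This layout is
    an assumption made for illustration; it is not the dataset's schema.
    """
    by_pair = defaultdict(dict)
    for pair_id, variant, answer in responses:
        by_pair[pair_id][variant] = answer

    # Only pairs with at least two variants can be checked for consistency.
    complete = [answers for answers in by_pair.values() if len(answers) >= 2]
    if not complete:
        return 0.0

    # A pair is consistent when every variant received the same answer.
    consistent = sum(1 for answers in complete if len(set(answers.values())) == 1)
    return consistent / len(complete)


# Hypothetical model answers: q1 is stable under the swap, q2 is not.
responses = [
    ("q1", "original", "agree"),
    ("q1", "swapped", "agree"),
    ("q2", "original", "agree"),
    ("q2", "swapped", "disagree"),
]
print(swap_consistency(responses))  # 0.5
```

A higher value would mean the model's answer depends less on the swapped name or cue. How the benchmark itself aggregates these comparisons may differ.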
