sanity-check always returns pass@1: 0.0 #57

@MikeSirb

Description

When running evaluate_functional_correctness with the provided example files (data/example_samples.jsonl and data/example_problem.jsonl), the output always shows pass@1: 0.0. According to the documentation, it should be around 0.5.

Command:
evaluate_functional_correctness data/example_samples.jsonl --problem_file=data/example_problem.jsonl

Result:
Reading samples...
6it [00:00, 2636.27it/s]
Running test suites...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 7.23it/s]
Writing results to data/example_samples.jsonl_results.jsonl...
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 11915.64it/s]
{'pass@1': 0.0}
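
To narrow down why every sample is counted as failing, it may help to look at the per-sample outcomes in the results file written above. The following is a minimal sketch, not part of the tool itself. It assumes each line of data/example_samples.jsonl_results.jsonl is a JSON object with a `result` string and a `passed` boolean; those field names are an assumption and should be checked against the actual file. The helper name `summarize_results` is hypothetical.

```python
import json
from collections import Counter


def summarize_results(path):
    """Tally per-sample outcomes in a results JSONL file.

    Assumes (not verified here) that each record has a "result" string
    and a "passed" boolean; missing fields are counted as "<missing>"
    and as not passed.
    """
    outcomes = Counter()
    passed = 0
    total = 0
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            total += 1
            passed += bool(record.get("passed", False))
            outcomes[str(record.get("result", "<missing>"))] += 1
    return {"total": total, "passed": passed, "outcomes": dict(outcomes)}
```

Calling `summarize_results("data/example_samples.jsonl_results.jsonl")` and printing the returned dictionary should show whether all six samples fail with the same message, which would point to a problem in the execution step rather than in the samples themselves.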
