Vicomtech/justeval-eus

Can Large Language Models Judge in Basque?

This repository contains the model responses and judge evaluations used in the paper: Judging Instruction Responses in a Low-Resource Language: A Case Study on Basque.

The inferences were generated for the just-eval-eus dataset, available at: Vicomtech/just-eval-instruct-eus

Contents

.
├── multi
│   ├── justeval_scores_multi.csv
│   ├── sampled_instructions_common.json
│   ├── sampled_instructions_specific-A.json
│   ├── sampled_instructions_specific-B.json
│   ├── sampled_instructions_specific-D.json
│   ├── sampled_instructions_specific-E.json
│   └── sampled_instructions_specific-G.json
└── safety
    ├── sampled_instructions_safety.json
    └── scores_safety.csv

Files

multi/

Contains the general-purpose instruction-response samples and their judge evaluations.

  • sampled_instructions_common.json: sampled instructions evaluated by all human annotators.
  • sampled_instructions_specific-[A/B/D/E/G].json: sampled instructions evaluated by a single human annotator.
  • justeval_scores_multi.csv: judge scores for the multi subset.

safety/

Contains the safety-oriented instruction-response samples and their judge evaluations.

  • sampled_instructions_safety.json: sampled safety instructions and generated responses.
  • scores_safety.csv: judge scores for the safety subset.

Data

The sampled_instructions JSON files include the generated responses from the inference models described in the paper.

The score CSV files include the corresponding judge responses and scores.
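
As a starting point for exploring these files, the following is a minimal sketch for loading them with the Python standard library. It assumes the files are UTF-8 encoded and that the CSVs are comma-delimited with a header row; the field and column names are not documented here, so the sketch only reports what it finds. The helper names `load_samples`, `load_scores`, and `describe` are hypothetical and not part of this repository.

```python
import csv
import json
from pathlib import Path


def load_samples(path):
    """Parse one sampled_instructions_*.json file and return its contents."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def load_scores(path):
    """Read a score CSV into a list of dicts keyed by the file's own header.

    Assumption: comma-delimited with a header row.
    """
    with Path(path).open(encoding="utf-8", newline="") as f:
        return list(csv.DictReader(f))


def describe(repo_root="."):
    """Print the top-level type and size of each JSON file and each CSV header."""
    root = Path(repo_root)
    for json_path in sorted(root.glob("*/sampled_instructions_*.json")):
        data = load_samples(json_path)
        print(json_path, type(data).__name__, len(data))
    for csv_path in sorted(root.glob("*/*.csv")):
        rows = load_scores(csv_path)
        header = list(rows[0].keys()) if rows else []
        print(csv_path, len(rows), header)
```

Calling `describe(".")` from the repository root lists every file under multi/ and safety/ together with its size and, for the CSVs, the column names, which is a reasonable first check before joining responses to their scores.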

Citation

If you use this repository, please cite the paper:

@inproceedings{ponce2026judging,
  title = {Judging Instruction Responses in a Low-Resource Language: A Case Study on Basque},
  author = {Ponce, David and Gete, Harritxu and Etchegoyhen, Thierry and Zubiaga, Irune and Soroa, Aitor},
  booktitle = {Proceedings of the 15th edition of the Language Resources and Evaluation Conference (LREC 2026)},
  note = {to appear},
  year = {2026}
}
