@leondz leondz commented Dec 10, 2025

Conduct a variety of checks and tests to assess the integrity of a garak report.jsonl file

This helps us identify where a report may be broken, deficient, or incorrectly assembled

Inventory of tests:

  • ✔️ garak version match between that used to create report and current garak used for checking
  • ✔️ whether the report was generated using a dev version of garak
  • ✔️ whether the check itself is running under a dev version of garak
  • ✔️ inventory described by config's probe_spec matches probes present in attempts
  • ✔️ each attempt status 1 has matching status 2
  • ✔️ all attempts have enough unique generations
  • ✔️ run ID is in setup run IDs
  • ✔️ detection output has correct cardinality in attempt status 2s
  • ✔️ a summary digest object is present
  • ✔️ at least one z-score is listed in the digest
  • ✔️ probes present in summary match probes requested in config
  • ✔️ the run was completed
  • ✔️ the run is <6 months old (calibration freshness)
  • ✔️ there is at least one eval statement for any probe attempted
  • ✔️ evals are performed over all status 2 attempts
  • ✔️ number of responses graded passed + nones is not more than total responses graded in eval entries
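
As an illustration of two items in the inventory above (each status 1 attempt has a matching status 2, and passed + nones does not exceed the total in eval entries), here is a minimal sketch. It is not the PR's implementation. The field names `entry_type`, `uuid`, `status`, `passed`, `nones`, and `total`, and both helper functions, are assumptions made for this example and may not match the actual report schema.

```python
import json
from collections import defaultdict


def load_entries(lines):
    """Parse JSONL lines into dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def unmatched_status1_attempts(entries):
    """Return IDs of attempts seen with status 1 but never with status 2.

    Assumes (hypothetically) that attempt entries carry 'uuid' and 'status'.
    """
    seen = defaultdict(set)
    for entry in entries:
        if entry.get("entry_type") == "attempt":
            seen[entry["uuid"]].add(entry["status"])
    return sorted(
        uid for uid, statuses in seen.items()
        if 1 in statuses and 2 not in statuses
    )


def overcounted_evals(entries):
    """Return eval entries where passed + nones exceeds total.

    Assumes (hypothetically) that eval entries carry these three counters.
    """
    return [
        entry for entry in entries
        if entry.get("entry_type") == "eval"
        and entry.get("passed", 0) + entry.get("nones", 0) > entry.get("total", 0)
    ]


# Tiny in-memory stand-in for a report.jsonl file.
sample = [
    '{"entry_type": "attempt", "uuid": "a1", "status": 1}',
    '{"entry_type": "attempt", "uuid": "a1", "status": 2}',
    '{"entry_type": "attempt", "uuid": "b2", "status": 1}',
    '{"entry_type": "eval", "probe": "p", "passed": 3, "nones": 1, "total": 3}',
]
entries = load_entries(sample)
print(unmatched_status1_attempts(entries))  # attempt "b2" has no status 2 entry
print(len(overcounted_evals(entries)))      # one eval entry is overcounted
```

Each helper returns the offending items rather than a boolean, so a caller could report exactly which attempts or eval entries look wrong.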

@leondz leondz requested a review from jmartin-tech December 10, 2025 11:51
@leondz leondz added the reporting Reporting, analysis, and other per-run result functions label Dec 10, 2025
@erickgalinkin erickgalinkin left a comment

Mostly looks good. A couple of fixes needed to make it actually run and a few value fixes.

Comment on lines 121 to 124
    if _is_dev_version(garak_version):
        add_note(
            f"report generated under development garak version {garak_version}, implementation will depend on branch+commit"
        )
Not me triggering this constantly.
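
For context on why this note fires so often on development checkouts, here is a hedged sketch of how such a check could behave. The body of `_is_dev_version` is not shown in this thread, so the `is_dev_version` rule below (a version string ending in a `pre` or `dev` marker) and the list-backed `add_note` are assumptions for illustration only.

```python
import re

# Hypothetical rule: treat a version as "dev" if it ends in a pre/dev marker,
# e.g. "1.2.3.pre1" or "1.2.3.dev0". The real helper may use a different test.
_DEV_MARKER = re.compile(r"(pre|dev)\d*$")

notes = []


def add_note(message: str) -> None:
    """Collect a human-readable note about the report (stand-in for the PR's helper)."""
    notes.append(message)


def is_dev_version(version: str) -> bool:
    """Return True when the version string carries a development marker."""
    return bool(_DEV_MARKER.search(version))


def check_report_version(garak_version: str) -> None:
    """Record a note when the report was produced by a development build."""
    if is_dev_version(garak_version):
        add_note(
            f"report generated under development garak version {garak_version}, "
            "implementation will depend on branch+commit"
        )


check_report_version("1.2.3.pre1")  # development build: adds a note
check_report_version("1.2.3")       # release build: adds nothing
print(len(notes))
```

Under this assumed rule, anyone running from a development branch would see the note on every report they generate.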

leondz commented Jan 27, 2026

NB. add documentation to match pattern in #1569

leondz and others added 9 commits January 28, 2026 09:46
Co-authored-by: Erick Galinkin <erick.galinkin@gmail.com>
Signed-off-by: Leon Derczynski <leonderczynski@gmail.com>
@leondz leondz requested a review from erickgalinkin January 28, 2026 09:29