# semantic-reliability

Here is 1 public repository matching this topic...

Kevin-Li-2025 / order-delta-bench

Deterministic benchmark for measuring stateful semantic reliability and order sensitivity in LLM-powered ordering agents.

benchmark reproducibility structured-output llm-evaluation order-sensitivity semantic-reliability

Updated May 29, 2026
Python
