Binary evidence-sufficiency dissociation in reasoning-model hidden states for fixed-question, changed-context multi-hop QA.
interpretability llm reasoning-models hidden-state-probing multi-hop-qa evidence-sufficiency mechanistic-dissociation
Updated Apr 18, 2026 - Python