Releases: shawnjason/Hallucinations
v2.0 — Machine-Checked Principal Results
v2.0 release of the Lean 4 proofs for "Language Model Hallucinations: An Impossibility Theorem and Its Architectural Consequences."
This release contains seven standalone Lean 4 proof files covering the principal formal results of the paper, organized into four groups:
Headline Impossibility
018_lm_hallucination_ceiling.lean— Theorem 1, Impossibility of Guaranteed Consistency (lm_hallucination_ceiling_deterministic,lm_hallucination_ceiling_stochastic): a language model with bounded context window of depth h is a forward-local system whose next-token selection is a function of the bounded context projection. When the global consistency constraint depends on information outside the context window, two prefixes can agree on the context projection while differing in admissibility — and no function of the context window alone determines which next tokens preserve global consistency in all such cases. TheSameContextHallucinationRiskpredicate is the LM specialization of the same-fiber conflict pattern; both decoder forms (deterministic next-token map and stochastic next-token distribution) are proved.
Certification-Depth Lower Bounds
099_multi_constraint_enumeration_delayed_failure.lean— Multi-Constraint Enumeration Delayed-Failure (multi_constraint_enumeration_delayed_failure): concrete witness for delayed-constraint failure under exact-count enumeration. A generation task imposing an exact-count constraint can be locally satisfied at every individual step (each emitted item passes the plausibility check) while becoming globally non-extendable once the running count exceeds the target. The witness uses target K = 3 with running count n = 4; emitted items cannot be retracted, so no completion can reduce the count back to K. Local item-level validity does not certify global compliance.114_terminal_sum_certification_depth.lean— Terminal-Sum Certification Depth Equals Sequence Length (certification_depth_exceeds_h_for_h_eq_1): quantitative certification-depth lower bound for the arithmetic chain. Under the terminal-sum constraint Σuₜ = K with action set U ⊇ {0, 1, 2}, the certification depth — the minimum trailing-window depth h at which extendability is recoverable from the projection Πh — equals the trajectory length T. For any h < T, the file constructs an explicit pair of prefixes τ1, τ2 with identical h-step trailing windows that differ in extendability, because the suppressed pre-window history can have accumulated different running sums while the trailing window remains identical. No bounded local certification with depth less than T suffices.
Mitigation Boundaries
115_arithmetic_cot_sufficiency_faithful_total.lean— Arithmetic CoT Sufficiency Requires Faithful Running Total (cot_certificate_implies_extendable,unfaithful_cot_gives_false_positive,arithmetic_cot_sufficiency_requires_faithfulness): companion to file 114 establishing the boundary at which chain-of-thought prompting can recover certification depth. With a faithful running total in the CoT scratchpad, extendability under the terminal-sum constraint becomes a function of just (running_total, remaining_steps) — a bounded local quantity, so certification depth collapses from T to a constant. Without faithfulness, an unfaithful CoT scratchpad gives false-positive certificates and provides no improvement over no scratchpad at all.116_grammar_insufficient_arithmetic.lean— Grammar Constraints Insufficient for Arithmetic Validity (grammar_insufficient_for_arithmetic_validity,grammar_valid_arithmetic_invalid_witness): companion to file 115 characterizing what syntactic mitigations cannot certify. Grammar-based decoding restricts output to valid syntactic productions but does not enforce global arithmetic constraints. Concrete witness[1, 1, 1]is grammar-valid but sums to 3 ≠ 2. Syntactic admissibility and semantic admissibility are independent properties: grammar covers the former, faithful running-total CoT (file 115) covers the latter, and neither covers the other.117_rerank_residual_positive.lean— Verifier-Rerank Cannot Attain Zero Structural Failure (rerank_residual_positive,rerank_residual_nonzero,rerank_residual_strictly_between): quantitative consequence of Theorem 1 under best-of-K reranking. A verifier-rerank scheme that draws K candidate completions from a base model and selects the highest-scoring candidate cannot attain zero structural failure rate at any finite K. If the per-candidate failure probability is bounded below by p > 0 (which the structural impossibility theorems establish for forward-local stochastic generation under non-local constraints), then the all-fail probability p^K decays geometrically in K but is strictly positive for every finite K. The file establishes three quantitative bounds, including the strictly-between bound 0 < p^K ≤ p.
Unified Hallucination Taxonomy
119_intrinsic_extrinsic_dcf_taxonomy.lean— Intrinsic and Extrinsic Hallucinations as DCF Taxonomy (intrinsic_is_dcf,extrinsic_is_dcf,hallucination_taxonomy_unified_under_dcf): synthesis result connecting the standard hallucination taxonomy to the delayed-constraint-failure (DCF) abstract structure. Both intrinsic hallucinations (output contradicts content verifiable in the input/context) and extrinsic hallucinations (output makes claims not verifiable from the input/context) are formalized as instances of admissibility failure relative to a global consistency constraint; they differ only in whether the relevant consistency information is inside or outside the model's context window. The file proves that intrinsic and extrinsic hallucinations are both DCF instances and that the standard taxonomy is unified under the DCF schema.
The headline impossibility (file 018) is the LM specialization of the projection-insufficiency theorem and is canonically also housed in the Projection-Insufficiency repository; it is reproduced here for self-contained verification. The certification-depth lower bounds (099, 114) translate the abstract impossibility into quantitative claims about specific generation tasks. The mitigation boundaries (115, 116, 117) then characterize what specific decoding-time interventions — faithful chain-of-thought, grammar constraints, verifier-rerank — can and cannot certify. The unified taxonomy file (119) closes the loop by showing the standard intrinsic / extrinsic distinction is two sub-cases of the same delayed-constraint-failure structure.
All files compile against current Mathlib.
Companion paper: https://doi.org/10.5281/zenodo.19715059