ML-DSA Aarch64 HOL-Light proof poly_use_hint_32 by jakemas · Pull Request #372 · awslabs/s2n-bignum

jakemas · 2026-03-13T05:49:57Z

Summary

With help, and love, from Claude.

Add formal verification proof for mldsa_poly_use_hint_32 (AArch64 NEON), establishing full functional correctness: the assembly output equals word(mldsa_use_hint_32_spec(val(a), val(h))) for each element, matching the FIPS 204 UseHint specification for GAMMA2 = (Q-1)/32 (ML-DSA security levels 3 and 5, l=5,7).

The companion _88 proof (security level 2, l=4) is in PR #375.

What the proof establishes

The proof verifies that hand-optimized NEON assembly — Barrett multiplication with magic constants, SIMD bit tricks, branchless conditional masking — produces exactly the same mathematical function as the FIPS 204 spec, for every valid input (8.4M x 2 input pairs).

Proof architecture

Barrett equivalence (BARRETT_EQUIV): The assembly computes a1 as ((2 * a * 1074791425) / 2^32 + 131072) / 262144. The spec computes it as ((a + 127) / 128 * 1025 + 2097152) / 4194304. These are completely different formulas. The proof establishes equivalence for all a < Q via 17-case analysis (one per output value 0..16), using DIV_MONO to sandwich both formulas to the same constant in each output interval.

Element correctness (ELEMENT_CORRECT): Proves val(asm_result) = spec(val(a), val(h)) for a single element. Splits into wrap (a1=16) and nowrap (a1<=15) cases. Uses bridge lemmas for native-mode real_gt/int_gt compatibility, and DIV_MONO-based upper bounds for the remainder.

SIMD endgame (MLDSA_USE_HINT_CORRECT): After 1006-step ARM symbolic execution, pushes word_subword through SIMD operations to extract per-element 32-bit results, then matches each against the element correctness theorem.

Shared helpers: WORD_AND_ONES_32, WORD_MUL_1_32, REAL_INT_GT_BRIDGE, REAL_INT_GT_BRIDGE_POS moved to common/mlkem_mldsa.ml for reuse by both _32 and _88 proofs. Unused intermediate lemmas from development removed.

Changes

arm/proofs/mldsa_poly_use_hint_32.ml — full proof with functional correctness postcondition, constant-time and memory safety
arm/mldsa/mldsa_poly_use_hint_32.S — assembly source
common/mlkem_mldsa.ml — shared helper lemmas
Supporting changes to Makefile, subroutine signatures, specifications, decode/instruction tables

Testing

Proof verified end-to-end in HOL Light (both interpreted MCP and native compilation CI). Also verified in the mldsa-native fork which runs native-mode CI. The _32 proof takes approximately 16 minutes in CI.

…ctness Add HOL-Light proofs for the ML-DSA poly_use_hint_32 AArch64 assembly implementation, covering parameter sets 65/87 (GAMMA2=(Q-1)/32). Includes: - Functional correctness proof: each output coefficient equals the FIPS 204 UseHint specification (mldsa_use_hint_32_spec), not just output bounds - Barrett reduction equivalence proof (assembly vs C decomposition) - Constant-time and memory safety proofs - Shared ML-DSA infrastructure lemmas in common/mlkem_mldsa.ml - CBMC-style test harness in tests/test.c The functional correctness proof establishes that the SIMD assembly computes exactly the same result as the reference UseHint algorithm for all valid inputs (a < Q, h in {0,1}). Signed-off-by: Jake Massimo <jakemas@amazon.com>

word_2smulh is defined in arm/proofs/instruction.ml which is not loaded by x86 proofs. Moving the lemmas inline into the proof file avoids breaking x86 proofs that load the common file.

Replace weak output-bounds postcondition with full functional correctness: each output word equals word(mldsa_use_hint_32_spec (val(x i)) (val(y i))). Key proof techniques: - Bridge lemmas (REAL_INT_GT_BRIDGE) resolve both real_gt and int_gt from a single NUM fact, avoiding backtick type mismatch - Inner if elimination via COND_CASES_TAC + A0_UPPER_32 contradiction - INT_MUL_LZERO + INT_SUB_RZERO for wrap case simplification

Remove 7 unused helper lemmas (WORD_ISHR_ROUND_18, SPEC_NOWRAP_32, HINT_H1_POS, HINT_H1_NEG, WORD_OR_NEG1_32, WORD_NEG1_VAL_32, SPEC_BOUND) and move 4 shared lemmas (WORD_AND_ONES_32, WORD_MUL_1_32, REAL_INT_GT_BRIDGE, REAL_INT_GT_BRIDGE_POS) to common/mlkem_mldsa.ml so they can be shared with the poly_use_hint_88 proof. Signed-off-by: Jake Massimo <jakemas@amazon.com>

Replace the 428-line case-split Barrett proof with a general DIV_SANDWICH + BARRETT_INTERVAL lemma approach. The general lemma proves that if both formulas are sandwiched in [k*d, (k+1)*d) for a given interval, then both equal k. The proof cascades through the 17-entry interval table, discharging numeric side conditions with NUM_REDUCE_CONV. Signed-off-by: Jake Massimo <jakemas@amazon.com>

Rename MLDSA_USE_HINT_EXEC/CORRECT/SUBROUTINE_CORRECT/SUBROUTINE_SAFE to include _32 suffix, matching the _88 naming convention. Add mldsa_poly_use_hint_32 entry to subroutine_signatures.ml for the constant-time and memory safety proof. Signed-off-by: Jake Massimo <jakemas@amazon.com>

Add FIPS 204 definitions (mldsa_cmod, mldsa_decompose_32, mldsa_use_hint_32) and equivalence proof connecting code-aligned spec to FIPS 204 Algorithm 40. Move shared arch-independent helpers (DIV_SANDWICH, INT_MOD_RESIDUE) to common/mlkem_mldsa.ml. ENSURES_STRENGTHEN_POST stays in the proof file as it references armstate. Signed-off-by: Jake Massimo <jakemas@amazon.com>

The proof body ended mid-THENL, missing the tactic list that discharges the `x DIV d < k + 1` subgoal. This caused camlp5 to fail parsing the file with: Parse error: [and_let_binding] expected after [first_let_binding] since the next `let INT_MOD_RESIDUE = ...` was read as a continuation of the incomplete `let DIV_SANDWICH` expression. Restore the missing line. Signed-off-by: Jake Massimo <jakemas@amazon.com>

Apply the same refactor that was made on pq-code-package/mldsa-native#1037 to keep the two repos consistent: - Rename the code-aligned per-coefficient spec mldsa_use_hint_32_spec to mldsa_use_hint_32_code and keep it local to the proof file. FIPS 204 mldsa_use_hint_32 (Algorithm 40) is the shared definition in common/mlkem_mldsa.ml. - Rename the code-aligned CORRECT / CORRECT_BOUND / SUBROUTINE_CORRECT to *_CODE suffixes to mark them as internal stepping stones. - Make MLDSA_USE_HINT_32_SUBROUTINE_CORRECT the single public theorem, stated in terms of FIPS 204 mldsa_use_hint_32 with the < 16 output bound as a corollary. Derived from _SUBROUTINE_CORRECT_CODE via MLDSA_USE_HINT_32_EQUIV. - Move safety proof to the end of the file. mk_safety_spec still uses _SUBROUTINE_CORRECT_CODE so the hardcoded safety goal matches. - Lift shared FIPS 204 definitions (mldsa_cmod, mldsa_decompose_32, decompose_32_r{0,1}, mldsa_use_hint_32) and helpers (LOWER/UPPER _NONWRAP_R1, MLDSA_USE_HINT_32_UNFOLD) into common/mlkem_mldsa.ml. - Keep ENSURES_STRENGTHEN_POST local to the proof file (armstate-specific). Signed-off-by: Jake Massimo <jakemas@amazon.com>

Apply the same refactor as PR awslabs#372 did for _32, mirroring pq-code-package/mldsa-native#1037 to keep the two repos consistent: - Rename the code-aligned per-coefficient spec mldsa_use_hint_88_spec to mldsa_use_hint_88_code and keep it local to the proof file. FIPS 204 mldsa_use_hint_88 (Algorithm 40) is the shared definition in common/mlkem_mldsa.ml. - Rename the code-aligned CORRECT / CORRECT_BOUND / SUBROUTINE_CORRECT to *_CODE suffixes to mark them as internal stepping stones. - Make MLDSA_USE_HINT_88_SUBROUTINE_CORRECT the single public theorem, stated in terms of FIPS 204 mldsa_use_hint_88 with the < 44 output bound as a corollary. Derived from _SUBROUTINE_CORRECT_CODE via MLDSA_USE_HINT_88_EQUIV. - Move safety proof to the end of the file. mk_safety_spec still uses _SUBROUTINE_CORRECT_CODE so the hardcoded safety goal matches. - Lift shared FIPS 204 definitions (mldsa_cmod, mldsa_decompose_88, decompose_88_r{0,1}, mldsa_use_hint_88) and helpers (LOWER/UPPER _NONWRAP_R1_88, MLDSA_USE_HINT_88_UNFOLD) into common/mlkem_mldsa.ml. - Keep ENSURES_STRENGTHEN_POST local to the proof file (armstate-specific). Signed-off-by: Jake Massimo <jakemas@amazon.com>

Resolves conflicts by including both mldsa_poly_use_hint_32 from the PR branch and the three new mldsa_pointwise_acc functions (l4, l5, l7) from main in all affected files.

jargh · 2026-05-19T04:32:43Z

I'm still thinking about the interesting aspects of this PR, but I noticed one oddity: arm/proofs/subroutine_signatures.ml seems to have an erroneous duplication at lines 4575-4578 (?)

jakemas · 2026-05-19T04:42:33Z

I'm still thinking about the interesting aspects of this PR, but I noticed one oddity: arm/proofs/subroutine_signatures.ml seems to have an erroneous duplication at lines 4575-4578 (?)

Hanno suggested a re-write to make the format in line with other proof styles: pq-code-package/mldsa-native#1037 (comment) which I will do

Using functionaltest(arm, ...) puts the test in the 'inapplicable' bucket on x86, which the summary equation successes + skipped == tested does not count, producing 'Testing all passed but is incomplete' and a non-zero exit. Switch to functionaltest(all, ...) so the function runs everywhere; the body is already gated with #ifdef __x86_64__ to return 0 without referencing the ARM-only symbol. Mirrors the pattern used for test_mldsa_poly_use_hint_32 in PR awslabs#372.

jakemas · 2026-05-29T20:31:43Z

I'm still thinking about the interesting aspects of this PR, but I noticed one oddity: arm/proofs/subroutine_signatures.ml seems to have an erroneous duplication at lines 4575-4578 (?)

made some changes in 3956e68

The poly_use_hint_32 proof exposed a non-standard _CORRECT / _CORRECT_CODE split that doesn't appear in any other proof in s2n-bignum (or in mlkem-native / mldsa-native). This consolidates the proof into a single FIPS 204-aligned _CORRECT theorem and a single _SUBROUTINE_CORRECT, matching the convention used by the rest of the codebase. Internal intermediates _CORRECT_CODE, _CORRECT_BOUND_CODE, and _SUBROUTINE_CORRECT_CODE are removed. The new _CORRECT places the input-bound antecedents (val(x i) < 8380417 /\ val(y i) <= 1) inside the postcondition (decompose-style), so the assembly's symbolic execution doesn't depend on input ranges. This matches the shape used by poly_decompose_32 and poly_decompose_88 in mldsa-native. Mechanics: - The FIPS 204 / code-aligned equivalence section (MLDSA_USE_HINT_32_EQUIV) moves above the correctness proof. - The new _CORRECT runs the 878-step ARM symbolic execution without using input bounds, then DISCH_THEN(CONJUNCTS_THEN ASSUME_TAC) picks up the bounds from the postcondition antecedent. - Per-element FIPS-equality is closed via a derived EC_FINAL helper after IMP_REWRITE_TAC[MLDSA_USE_HINT_32_EQUIV]. - The val < 16 corollary is closed by reducing val(word ..) to MOD via VAL_WORD and discharging via the bound on mldsa_use_hint_32_code. - _SUBROUTINE_CORRECT derives from _CORRECT via ARM_ADD_RETURN_NOSTACK_TAC. Verified end-to-end: the new proof closes in 25:10 wall time, exit 0, no axioms. Signed-off-by: Jake Massimo <jakemas@amazon.com>

jakemas mentioned this pull request Mar 13, 2026

HOL-Light: Prove AArch64 poly_use_hint pq-code-package/mldsa-native#486

Closed

jakemas marked this pull request as ready for review March 13, 2026 17:45

jakemas force-pushed the mldsa-poly-use-hint branch from 65aba57 to abffca5 Compare March 17, 2026 18:58

jakemas mentioned this pull request Apr 13, 2026

HOL-Light: Add HOL Light poly_use_hint proofs for AArch64 pq-code-package/mldsa-native#1037

Merged

jakemas force-pushed the mldsa-poly-use-hint branch 7 times, most recently from a5c79f4 to 354c915 Compare April 17, 2026 15:19

jakemas force-pushed the mldsa-poly-use-hint branch 6 times, most recently from ec26efe to 46a0f3a Compare April 24, 2026 06:21

jakemas mentioned this pull request Apr 24, 2026

ML-DSA Aarch64 HOL-Light proof poly_use_hint_88 #375

Open

jakemas added 6 commits April 28, 2026 18:41

Remove word_2smulh from common/mlkem_mldsa.ml, make proof self-contained

811a4c3

word_2smulh is defined in arm/proofs/instruction.ml which is not loaded by x86 proofs. Moving the lemmas inline into the proof file avoids breaking x86 proofs that load the common file.

jakemas force-pushed the mldsa-poly-use-hint branch from 7120b24 to 3176b7e Compare April 28, 2026 18:42

jakemas added 2 commits April 29, 2026 11:09

Merge branch 'main' into mldsa-poly-use-hint

dc55f26

jakemas force-pushed the mldsa-poly-use-hint branch from eda11cd to e46bea5 Compare May 1, 2026 07:52

jakemas added 2 commits May 1, 2026 18:07

jakemas added 4 commits May 13, 2026 18:32

Merge main into mldsa-poly-use-hint

a1e7429

Resolves conflicts by including both mldsa_poly_use_hint_32 from the PR branch and the three new mldsa_pointwise_acc functions (l4, l5, l7) from main in all affected files.

Merge branch 'main' into mldsa-poly-use-hint

a9ba531

Merge branch 'main' into mldsa-poly-use-hint

0eca96c

Merge branch 'main' into mldsa-poly-use-hint

47564ba

Merge branch 'main' into mldsa-poly-use-hint

7281edb

jakemas force-pushed the mldsa-poly-use-hint branch from 19cde16 to 3956e68 Compare May 29, 2026 20:31

jakemas force-pushed the mldsa-poly-use-hint branch from 3956e68 to 05e44d7 Compare May 30, 2026 02:50

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ML-DSA Aarch64 HOL-Light proof poly_use_hint_32#372

ML-DSA Aarch64 HOL-Light proof poly_use_hint_32#372
jakemas wants to merge 16 commits into
awslabs:mainfrom
jakemas:mldsa-poly-use-hint

jakemas commented Mar 13, 2026 •

edited

Loading

Uh oh!

jargh commented May 19, 2026

Uh oh!

jakemas commented May 19, 2026

Uh oh!

jakemas commented May 29, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

jakemas commented Mar 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What the proof establishes

Proof architecture

Changes

Testing

Uh oh!

jargh commented May 19, 2026

Uh oh!

jakemas commented May 19, 2026

Uh oh!

jakemas commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jakemas commented Mar 13, 2026 •

edited

Loading

jakemas commented May 29, 2026 •

edited

Loading