CBMC: Document and enforce 4 GiB limit of forall/exists quantifiers#1694
Conversation
764b4e0 to
dff9f6a
Compare
Acknowledgements: Thanks to @nmouha for identifying this issue. The forall/exists macros in cbmc.h declared the quantified variable as `unsigned`, which on 64-bit platforms is 32 bits wide. When applied to a size_t bound, this silently truncated the upper bound: the quantifier only ranged over indices up to UINT32_MAX rather than the full size_t range, leaving the predicate unchecked for any larger indices. This was demonstrated by @nmouha in #1690 with a contrived bug in mlk_ct_cmov_zero that corrupts the buffer at i == 2^32 yet still passes CBMC under the old macros. mlkem-native itself never operates on buffers that large, so no real bug is masked, but the quantifier definition is still highly misleading and could mask real bugs in the future. For now, make the limitation explicit rather than widening the quantifier: - Explicitly cast the lower and upper bounds to `uint32_t`, so passing a wider bound triggers CBMC's conversion check rather than silently truncating. - Tighten mlk_ct_cmov_zero's precondition from MLK_MAX_BUFFER_SIZE to UINT32_MAX (the only size_t-bounded buffer in mlkem-native that participates in a quantified contract). - Declare the quantified variable as `uint32_t` (was `unsigned`) as a robustness improvement towards the implementation-defined size of `unsigned`. We may still want to consider widening the quantifier variable as a follow-up, but this is a more intrusive change that prior experiments have proved to significantly impact proof performance. Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
dff9f6a to
47d4a36
Compare
CBMC Results (ML-KEM-768)Full Results (191 proofs)
|
CBMC Results (ML-KEM-512)Full Results (191 proofs)
|
CBMC Results (ML-KEM-1024)Full Results (191 proofs)
|
|
We now have data comparing the two approaches. #1694 (4 GiB limit) runs CBMC proofs in ~13 minutes. #1695 (widening the quantifier without limits) runs them in ~17 minutes. To put this in perspective, the Nix setup test alone in #1695 takes ~28 minutes. The CBMC proof time increase is modest, and if CI time is a concern, there are more impactful targets, such as reducing the Nix setup test duration. |
Thanks for the update! I think we are all in agreement that supporting
The |
This commit ports pq-code-package/mlkem-native#1694. The forall/exists macros in cbmc.h declared the quantified variable as `unsigned`, which on 64-bit platforms is 32 bits wide. When applied to a size_t bound, this silently truncated the upper bound: the quantifier only ranged over indices up to UINT32_MAX rather than the full size_t range, leaving the predicate unchecked for any larger indices. mldsa-native does accept arbitrarily large buffers at the top-level API: the message, context, and pre-string passed into mld_sign / mld_sign_verify / mld_H / mld_shake256_absorb are bounded only by MLD_MAX_BUFFER_SIZE = SIZE_MAX >> 12 (~2^52 on 64-bit), and keccak_absorb walks them in a size_t loop. However, those buffers never feed into a quantified predicate -- they appear only as arguments to memory_no_alias / memory_slice and as scalar loop invariants. Hence, the restricted scope of `forall` has no impact on the CBMC safety guarantees for mldsa-native's top-level API. The only size_t-bounded quantifier in the entire tree is mld_ct_memcmp in ct.h, and it already requires len <= UINT16_MAX. For now, make the limitation explicit rather than widening the quantifier: - Explicitly cast the lower and upper bounds to `uint32_t`, so passing a wider bound triggers CBMC's conversion check rather than silently truncating. - Declare the quantified variable as `uint32_t` (was `unsigned`) as a robustness improvement towards the implementation-defined size of `unsigned`. Note: the upstream PR also tightened mlk_ct_cmov_zero's precondition to UINT32_MAX. mldsa-native has no equivalent function, so no analogous change is needed. We may still want to consider widening the quantifier variable as a follow-up, but this is a more intrusive change that prior experiments have proved to significantly impact proof performance. Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
This is one possible way to approach #1690. We may also consider merging it as a temporary improvement while continuing to deliberate other approaches such as @nmouha's suggestion of switching to
size_tbounds and using CVC5.Acknowledgements: Thanks to @nmouha for identifying this issue.
The forall/exists macros in cbmc.h declared the quantified variable as
unsigned, which on 64-bit platforms is 32 bits wide. When applied toa size_t bound, this silently truncated the upper bound: the quantifier
only ranged over indices up to UINT32_MAX rather than the full size_t
range, leaving the predicate unchecked for any larger indices.
This was demonstrated by @nmouha in #1690
with a contrived bug in mlk_ct_cmov_zero that corrupts the buffer at
i == 2^32 yet still passes CBMC under the old macros. mlkem-native
itself never operates on buffers that large, so no real bug is masked,
but the quantifier definition is still highly misleading and could mask
real bugs in the future.
For now, make the limitation explicit rather than widening the quantifier:
uint32_t, sopassing a wider bound triggers CBMC's conversion check rather than
silently truncating.
UINT32_MAX (the only size_t-bounded buffer in mlkem-native that
participates in a quantified contract).
uint32_t(wasunsigned)as a robustness improvement towards the implementation-defined
size of
unsigned.We may still want to consider widening the quantifier variable
as a follow-up, but this is a more intrusive change that prior
experiments have proved to significantly impact proof performance.