Avoid mismatched comparison and fix error in forall/exists macros#1690
Avoid mismatched comparison and fix error in forall/exists macros#1690nmouha wants to merge 6 commits into
Conversation
Signed-off-by: Nicky Mouha <nmouha@users.noreply.github.com>
|
I think it's a good idea to flag comparisons with mismatched type, but I'm not convinced we have to make all types equal at the point of declaration. Instead, it seems to me that in a comparison with different types, one should down-cast the larger type to the smaller explicitly -- this would a) trigger CBMC to check that the cast is safe, b) require -- although we don't have a mechanical way to enforce this yet -- some |
|
Thank you, @hanno-becker, for your reply! I really appreciate your thoughts. If This is only an issue if So this is not a bug in mlkem-native, but arguably an enhancement to make |
|
Switching to |
|
Alternatively, you can leave |
Signed-off-by: Nicky Mouha <nmouha@users.noreply.github.com>
Signed-off-by: Nicky Mouha <nmouha@users.noreply.github.com>
|
Thank you, @rod-chapman, for chiming in! I really appreciate your insights. I just made some changes as I'm trying to build a stronger case for my PR.
So, I'm proposing not just to change @hanno-becker, @rod-chapman: Looking forward to your thoughts! |
Signed-off-by: Nicky Mouha <nmouha@users.noreply.github.com>
|
I fear this will blow up proof times. We have tried this before and found that quantifying over size_t instead of unsigned resulted in an SMT encoding where quantification occurs over 64-bit bit-vectors. This has a rather poor impact on proof times with Z3. |
|
You are right that there is an increase in proof times, but maybe it's still reasonable. Currently, the My PR increases the proof time on my laptop from 0.5 seconds to 2 seconds (for one SMT2 input). |
Signed-off-by: Nicky Mouha <nmouha@users.noreply.github.com>
|
Seems the proof time is halved when using cvc5 instead of Bitwuzla (from two seconds to one second). So, I've changed the Makefile to use cvc5 instead. |
|
Signed-off-by: Nicky Mouha <nicky@mouha.be>
af53dd3 to
3eda479
Compare
|
Thank you, @rod-chapman, for clarifying! I see that I had some tunnel vision as I was focusing only on Anyway, you're completely right that my proposed change to I'm proposing to solve two problems at once: complete cvc5 support (cvc5 is already in the Proof Guide and Makefile, just the Nix package is missing), and use cvc5 to avoid the timeouts. Looking forward to your thoughts! |
|
What versions of cbmc and cvc5 are you using? None of these changes work for me. |
I agree with that. This is an area that we have explored extensively, and found to be proof-time sensitive. Apart from the inconvenience of waiting for a PR to pass CI, the CBMC proofs also incur a meaningful CI cost -- we don't want to increase that unnecessarily. My stance from the original reply remains: I think it is good to flag those comparisons and potentially add a CodeQL (or similar) query to check for them, but I think the resolution should be an explicit down-cast at the time of comparison, which is then subject to the usual safety checks. That But, bottomline is still, I'm grateful you brought this up, and I think this is worth our awareness. But I don't think it justifies global changes such as changing the CBMC configuration. |
|
Thank you, @rod-chapman and @hanno-becker, for sharing your thoughts. I want to clarify that these changes are not cosmetic: they fix an error in mlkem-native's forall/exists macros. Currently, forall(i, 0, len, P(i)) does not quantify over all i in [0, len), because i is unsigned instead of size_t (so 32 bits instead of 64 bits on 64-bit platforms). This means proofs that rely on forall are unsound for values of len beyond what CBMC happened to explore. Concretely, here is an incorrect My PR fixes forall so that it actually quantifies over all indices, detecting these types of bugs. No bugs are found in mlkem-native itself, but the proofs are now sound. Regarding proof times: on the main branch I already see timeouts for mlk_indcpa_enc, mlk_keccak_squeeze_once, and poly_ntt_native on my laptop (using the Nix environment). The solver switch avoids these pre-existing timeouts for me. Therefore, I expect that #1691 (full CI) on my updated PR will show reduced solver times as well. On the TCB concern: does the current CI validate unsat cores for Z3 or Bitwuzla proofs? If not, then adding cvc5 doesn't change the trust model: unsound solver bugs are undetected either way. If solver diversity is the concern, adding a third solver actually improves confidence by reducing dependence on any single SMT solver implementation. On the len bound: the comment in verify.h says UINT16_MAX is "to control proof complexity only," but Bitwuzla proves correctness without this bound in ~2 seconds, so that justification doesn't hold. I'd be grateful if you'd consider running the full CI to check that the proof-time concerns are manageable, and consider correcting fixing the error in the forall/exists macros to ensure that the proofs are sound. |
|
@nmouha Thanks for persisting and for clarifying that you suspect a soundness issue, which is obviously much more serious. I ran your example: static MLK_INLINE void mlk_ct_cmov_zero(uint8_t *r, const uint8_t *x,
size_t len, uint8_t b)
__contract__(
requires(len <= MLK_MAX_BUFFER_SIZE)
requires(memory_no_alias(r, len))
requires(memory_no_alias(x, len))
assigns(memory_slice(r, len))
ensures(forall(i, 0, len, (r[i] == (b == 0 ? x[i] : old(r)[i])))))
{
size_t i;
for (i = 0; i < len; i++)
__loop__(
invariant(i <= len)
invariant(forall(k, 0, i, r[k] == (b == 0 ? x[k] : loop_entry(r)[k])))
decreases(len - i))
{
r[i] = mlk_ct_sel_uint8(r[i], x[i], b);
if (i == (1ull<<32)) {
r[i] ^= 1; // bug: corrupts at i == 2^16
}
}
}And I confirm that it passes CBMC. As I understand it, the issue here is not with CBMC but with our wrapper around In the case of mlkem-native, the distinction doesn't really matter because we don't deal with buffers beyond We need to improve this. We either remove the restricted scope of
I am slightly leaning towards the second option because it avoids us fighting with runtime issues while still forcing us to make the limitations of the current Thank you @nmouha again for flagging this -- I appreciate your detailed analysis and tenacity! 🙏 |
|
@mkannwischer correctly points out the the option 2 is not ideal for mldsa-native, as a buffer size limit of |
Acknowledgements: Thanks to @nmouha for identifying this issue. The forall/exists macros in cbmc.h declared the quantified variable as `unsigned`, which on 64-bit platforms is 32 bits wide. When applied to a size_t bound, this silently truncated the upper bound: the quantifier only ranged over indices up to UINT32_MAX rather than the full size_t range, leaving the predicate unchecked for any larger indices. This was demonstrated by @nmouha in #1690 with a contrived bug in mlk_ct_cmov_zero that corrupts the buffer at i == 2^32 yet still passes CBMC under the old macros. mlkem-native itself never operates on buffers that large, so no real bug is masked, but the quantifier definition was misleading and unsound for any future use beyond UINT32_MAX. Make the limitation explicit rather than widening the quantifier (which prior experiments showed blows up SMT proof times for size_t-quantified formulas): - Declare the quantified variable as `uint32_t` (was `unsigned`). - Explicitly cast the lower and upper bounds to `uint32_t`, so passing a wider bound triggers CBMC's conversion check rather than silently truncating. - Tighten mlk_ct_cmov_zero's precondition from MLK_MAX_BUFFER_SIZE to UINT32_MAX (the only size_t-bounded buffer in mlkem-native that participates in a quantified contract). - Change the loop counters in mlk_ct_memcmp and mlk_ct_cmov_zero to `uint32_t` to make the index width explicit at the use site. Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
Acknowledgements: Thanks to @nmouha for identifying this issue. The forall/exists macros in cbmc.h declared the quantified variable as `unsigned`, which on 64-bit platforms is 32 bits wide. When applied to a size_t bound, this silently truncated the upper bound: the quantifier only ranged over indices up to UINT32_MAX rather than the full size_t range, leaving the predicate unchecked for any larger indices. This was demonstrated by @nmouha in #1690 with a contrived bug in mlk_ct_cmov_zero that corrupts the buffer at i == 2^32 yet still passes CBMC under the old macros. mlkem-native itself never operates on buffers that large, so no real bug is masked, but the quantifier definition is still highly misleading and could mask real bugs in the future. For now, make the limitation explicit rather than widening the quantifier: - Explicitly cast the lower and upper bounds to `uint32_t`, so passing a wider bound triggers CBMC's conversion check rather than silently truncating. - Tighten mlk_ct_cmov_zero's precondition from MLK_MAX_BUFFER_SIZE to UINT32_MAX (the only size_t-bounded buffer in mlkem-native that participates in a quantified contract). - Declare the quantified variable as `uint32_t` (was `unsigned`) as a robustness improvement towards the implementation-defined size of `unsigned`. We may still want to consider widening the quantifier variable as a follow-up, but this is a more intrusive change that prior experiments have proved to significantly impact proof performance. Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
|
Thank you, @hanno-becker! I appreciate your detailed reply. I would like to reiterate my proposed PR not only uses Maybe running a full CI on my PR can alleviate your performance concerns? |
|
Oh, I see just now that you've started running a full CI on my PR already in #1695. Looking forward to the results! |
|
@nmouha Yes unfortunately we still need to start a PR from the main repo for full CI... |
|
@rod-chapman Do you have data regarding stability and trustworthiness of CVC5? |
Acknowledgements: Thanks to @nmouha for identifying this issue. The forall/exists macros in cbmc.h declared the quantified variable as `unsigned`, which on 64-bit platforms is 32 bits wide. When applied to a size_t bound, this silently truncated the upper bound: the quantifier only ranged over indices up to UINT32_MAX rather than the full size_t range, leaving the predicate unchecked for any larger indices. This was demonstrated by @nmouha in #1690 with a contrived bug in mlk_ct_cmov_zero that corrupts the buffer at i == 2^32 yet still passes CBMC under the old macros. mlkem-native itself never operates on buffers that large, so no real bug is masked, but the quantifier definition is still highly misleading and could mask real bugs in the future. For now, make the limitation explicit rather than widening the quantifier: - Explicitly cast the lower and upper bounds to `uint32_t`, so passing a wider bound triggers CBMC's conversion check rather than silently truncating. - Tighten mlk_ct_cmov_zero's precondition from MLK_MAX_BUFFER_SIZE to UINT32_MAX (the only size_t-bounded buffer in mlkem-native that participates in a quantified contract). - Declare the quantified variable as `uint32_t` (was `unsigned`) as a robustness improvement towards the implementation-defined size of `unsigned`. We may still want to consider widening the quantifier variable as a follow-up, but this is a more intrusive change that prior experiments have proved to significantly impact proof performance. Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
|
Closed automatically but unintentionally by the merge of #1694. Reopen. |
I searched for all comparisons where the left and right hand sides have different types.
If we exclude cases where either side is a constant (e.g.,
inlen > 0corresponds tosize_t > int), then there are only six results.Five results occur in
fips202.candfips202x4.c, and all of them involve upcastingrfromunsigned inttosize_t. So this can never result in an issue, even when verification is turned off.The sixth result occurs in
verify.h, where the comparisoni < lencorresponds tounsigned int < size_t. I am suggesting to avoid this mismatched comparison by declaringiassize_tinstead ofunsigned.This is not a bug, but a potential issue if
mlk_ct_memcmpis used to replacememcmpin an unrelated project where verification is turned off (otherwise the precondition onlenwill detect any issues).I am suggesting my proposed change not as a bug but as a code improvement, as it may benefit projects that reuse
mlk_ct_memcmpwithout verification.