[CI] Avoid mismatched comparison and fix error in forall/exists macros#1695
[CI] Avoid mismatched comparison and fix error in forall/exists macros#1695hanno-becker wants to merge 7 commits into
Conversation
Signed-off-by: Nicky Mouha <nmouha@users.noreply.github.com>
Signed-off-by: Nicky Mouha <nmouha@users.noreply.github.com>
Signed-off-by: Nicky Mouha <nmouha@users.noreply.github.com>
Signed-off-by: Nicky Mouha <nmouha@users.noreply.github.com>
Signed-off-by: Nicky Mouha <nmouha@users.noreply.github.com>
Signed-off-by: Nicky Mouha <nicky@mouha.be>
Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
CBMC Results (ML-KEM-512)
Full Results (191 proofs)
|
CBMC Results (ML-KEM-768)
Full Results (191 proofs)
|
|
It's good to see that CI passes for ML-KEM-512 and ML-KEM-768, but I'm surprised to see a failure for ML-KEM-1024. The failure is after 21m. I am assuming that the proof timeout is the default 30 minutes, so maybe this is an out-of-memory error? I'm not seeing an error on my side... So it could also be that this is just an unlucky run, in which case maybe you can run the CI again? @hanno-becker: Thanks for investigating what might have caused the CI failure! |
|
@nmouha NB I think this might be a good time to try to address #1603. If you have bandwidth, would you be interested in having a look (with AIs help, I suppose) how to adjust the CBMC runner scripts so one can configure, for each proof, a set of solvers+options that should be exercised? The results need to disambiguate the solver as well. So I'd imagine that in the proof summaries printed by the CI, we'd have a separate prover column. With that, one could more easily investigate the capabilities of, say, z3 vs bitwuzla vs cvc5 on the different proof workloads. |
That's my best guess -- the CI logs unfortunately don't give any clue. We have observed huge differences between CI performance and local performance (mostly AArch64 Mac for us) in the past -- it does not come as a surprise that we see different behavior here. |
|
I'm not sure I'm the right person to take on #1603. I don't have CI access, and the changes it would require are likely to be contentious: adding solvers (increasing the TCB) and running them in the CI (increasing the CI times). My time might be better spent on other improvements, such as fixing additional soundness bugs or optimizing the slowest proofs. If there is funding for this kind of work, please let me know. |
|
We can only consider this if we get the CI to pass. @rod-chapman Do you have time to take a look at why the CI fails here? |
I'll take care of #1603. |
|
@hanno-becker: Why not just run the CI again, perhaps on a larger EC2 instance (changing ec2_instance_type from r8g.xlarge to r8g.4xlarge in .github/workflows/cbmc.yml)? Perhaps this was just an unlucky run. But if the problem was the solver ran out of memory, then allocating more memory should do the trick... |
|
@nmouha Larger instances are more expensive, so we avoid scaling up if we can. (NB The nix runtime you observed in the other CI is a one-off because the nix config changed -- after that it's cached). But still, we'll take a look and try to understand why it fails. |
CBMC Results (ML-KEM-1024)
Full Results (191 proofs)
|
|
Looking at the CBMC results, only one proof appears to be failing. I ran it locally: The CI run hits the failure at 23m 47s, which is well within the 30-minute timeout. So, this does not seem to be a timeout issue. This points to an out-of-memory problem in the CI environment. Could we confirm the CI memory limit was reached? If so, we may just need to increase it for this proof, or consider alternative options to reduce the memory footprint. |
Copy of #1690 to run full CI.