Add ppc64le backend (supports p8 and above architectures) [full CI] #1677

Draft
mkannwischer wants to merge 29 commits into main from ppc64le

@mkannwischer
Contributor

Running the full CI on #1648

@mkannwischer added the "benchmark" label (this PR should be benchmarked in CI) on May 5, 2026

Contributor

@oqs-bot left a comment

Mac Mini (M1, 2020) benchmarks

Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-512 keypair 12319 cycles 12319 cycles 1
ML-KEM-512 encaps 14997 cycles 14997 cycles 1
ML-KEM-512 decaps 19549 cycles 19551 cycles 1.00
ML-KEM-768 keypair 21263 cycles 21264 cycles 1.00
ML-KEM-768 encaps 23873 cycles 23874 cycles 1.00
ML-KEM-768 decaps 30417 cycles 30425 cycles 1.00
ML-KEM-1024 keypair 30328 cycles 30327 cycles 1.00
ML-KEM-1024 encaps 34573 cycles 34573 cycles 1
ML-KEM-1024 decaps 44189 cycles 44190 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot left a comment

ppc64le (POWER10) benchmarks

Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-512 keypair 38059 cycles 59591 cycles 0.64
ML-KEM-512 encaps 43668 cycles 72335 cycles 0.60
ML-KEM-512 decaps 53803 cycles 92227 cycles 0.58
ML-KEM-768 keypair 67131 cycles 97410 cycles 0.69
ML-KEM-768 encaps 76341 cycles 114292 cycles 0.67
ML-KEM-768 decaps 90588 cycles 139751 cycles 0.65
ML-KEM-1024 keypair 108574 cycles 151020 cycles 0.72
ML-KEM-1024 encaps 119069 cycles 169141 cycles 0.70
ML-KEM-1024 decaps 137495 cycles 200776 cycles 0.68

Contributor

@oqs-bot left a comment

Intel Xeon 4th gen (c7i)

Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-512 keypair 12048 cycles 12038 cycles 1.00
ML-KEM-512 encaps 13632 cycles 13787 cycles 0.99
ML-KEM-512 decaps 17778 cycles 17801 cycles 1.00
ML-KEM-768 keypair 21266 cycles 21014 cycles 1.01
ML-KEM-768 encaps 22146 cycles 22184 cycles 1.00
ML-KEM-768 decaps 28443 cycles 28329 cycles 1.00
ML-KEM-1024 keypair 29577 cycles 29959 cycles 0.99
ML-KEM-1024 encaps 31745 cycles 31722 cycles 1.00
ML-KEM-1024 decaps 39476 cycles 39346 cycles 1.00

Contributor

@oqs-bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Intel Xeon 4th gen (c7i)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: 5f3d329 Previous: 2ee902c Ratio
ML-KEM-512 keypair 12044 cycles 9751 cycles 1.24
ML-KEM-512 encaps 13624 cycles 11423 cycles 1.19
ML-KEM-512 decaps 17783 cycles 15570 cycles 1.14
ML-KEM-768 keypair 21292 cycles 16302 cycles 1.31
ML-KEM-768 encaps 22015 cycles 17954 cycles 1.23
ML-KEM-768 decaps 28023 cycles 23461 cycles 1.19
ML-KEM-1024 keypair 29562 cycles 22439 cycles 1.32
ML-KEM-1024 encaps 31715 cycles 24509 cycles 1.29
ML-KEM-1024 decaps 39395 cycles 32178 cycles 1.22
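
The alert rule described above can be sketched as follows. This is a minimal illustration, not the code github-action-benchmark actually runs: the function name `find_regressions` and the strict `>` comparison are assumptions, and the ratio is taken as current cycles divided by previous cycles, which matches the Ratio column in these tables.

```python
def find_regressions(results, threshold=1.03):
    """Return (name, ratio) pairs whose current/previous ratio exceeds threshold.

    `results` maps a benchmark name to a (current_cycles, previous_cycles) tuple.
    The strict '>' comparison is an assumption made for this sketch.
    """
    flagged = []
    for name, (current, previous) in results.items():
        ratio = current / previous
        if ratio > threshold:
            # Round to two decimals, as the Ratio column does.
            flagged.append((name, round(ratio, 2)))
    return flagged


# Three rows copied from the alert table above.
sample = {
    "ML-KEM-512 keypair": (12044, 9751),
    "ML-KEM-768 keypair": (21292, 16302),
    "ML-KEM-1024 decaps": (39395, 32178),
}

for name, ratio in find_regressions(sample):
    print(f"{name}: {ratio}")
```

Under this rule a row whose cycle counts are unchanged yields a ratio of 1 and is not reported, which is why the alert tables list only a subset of the benchmarks.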

Contributor

@oqs-bot left a comment

AMD EPYC 3rd gen (c6a)

Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-512 keypair 14207 cycles 14376 cycles 0.99
ML-KEM-512 encaps 15984 cycles 16060 cycles 1.00
ML-KEM-512 decaps 21538 cycles 21627 cycles 1.00
ML-KEM-768 keypair 25114 cycles 24794 cycles 1.01
ML-KEM-768 encaps 25658 cycles 25550 cycles 1.00
ML-KEM-768 decaps 33523 cycles 33409 cycles 1.00
ML-KEM-1024 keypair 34848 cycles 37228 cycles 0.94
ML-KEM-1024 encaps 36114 cycles 37346 cycles 0.97
ML-KEM-1024 decaps 47236 cycles 46787 cycles 1.01

Contributor

@oqs-bot left a comment

AMD EPYC 4th gen (c7a)

Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-512 keypair 12796 cycles 12777 cycles 1.00
ML-KEM-512 encaps 14285 cycles 14269 cycles 1.00
ML-KEM-512 decaps 19148 cycles 19117 cycles 1.00
ML-KEM-768 keypair 22525 cycles 22412 cycles 1.01
ML-KEM-768 encaps 23071 cycles 23051 cycles 1.00
ML-KEM-768 decaps 30094 cycles 30064 cycles 1.00
ML-KEM-1024 keypair 34224 cycles 32997 cycles 1.04
ML-KEM-1024 encaps 33002 cycles 33104 cycles 1.00
ML-KEM-1024 decaps 42405 cycles 42483 cycles 1.00

Contributor

@oqs-bot left a comment

Intel Xeon 4th gen (c7i) (no-opt)

Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-512 keypair 28337 cycles 28237 cycles 1.00
ML-KEM-512 encaps 36782 cycles 36623 cycles 1.00
ML-KEM-512 decaps 45398 cycles 45123 cycles 1.01
ML-KEM-768 keypair 46215 cycles 46315 cycles 1.00
ML-KEM-768 encaps 55787 cycles 55593 cycles 1.00
ML-KEM-768 decaps 69875 cycles 69917 cycles 1.00
ML-KEM-1024 keypair 70417 cycles 70363 cycles 1.00
ML-KEM-1024 encaps 82459 cycles 82510 cycles 1.00
ML-KEM-1024 decaps 99343 cycles 99218 cycles 1.00

Contributor

@oqs-bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'AMD EPYC 4th gen (c7a)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-1024 keypair 34224 cycles 32997 cycles 1.04

Contributor

@oqs-bot left a comment

Intel Xeon 3rd gen (c6i)

Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-512 keypair 17603 cycles 17547 cycles 1.00
ML-KEM-512 encaps 19900 cycles 19938 cycles 1.00
ML-KEM-512 decaps 26420 cycles 26450 cycles 1.00
ML-KEM-768 keypair 31203 cycles 31168 cycles 1.00
ML-KEM-768 encaps 31989 cycles 32415 cycles 0.99
ML-KEM-768 decaps 41468 cycles 41536 cycles 1.00
ML-KEM-1024 keypair 43770 cycles 43998 cycles 0.99
ML-KEM-1024 encaps 45855 cycles 46270 cycles 0.99
ML-KEM-1024 decaps 58042 cycles 58266 cycles 1.00

Contributor

@oqs-bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Intel Xeon 3rd gen (c6i)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: 5f3d329 Previous: 2ee902c Ratio
ML-KEM-512 keypair 17588 cycles 16301 cycles 1.08
ML-KEM-512 encaps 19896 cycles 18736 cycles 1.06
ML-KEM-512 decaps 26412 cycles 25234 cycles 1.05
ML-KEM-768 keypair 31190 cycles 28649 cycles 1.09
ML-KEM-768 encaps 31768 cycles 30001 cycles 1.06
ML-KEM-1024 keypair 43790 cycles 37884 cycles 1.16
ML-KEM-1024 encaps 45790 cycles 40704 cycles 1.12
ML-KEM-1024 decaps 58108 cycles 54265 cycles 1.07

Contributor

@oqs-bot left a comment

Graviton4

Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-512 keypair 17686 cycles 17646 cycles 1.00
ML-KEM-512 encaps 20642 cycles 20608 cycles 1.00
ML-KEM-512 decaps 27077 cycles 27084 cycles 1.00
ML-KEM-768 keypair 29981 cycles 29899 cycles 1.00
ML-KEM-768 encaps 32757 cycles 32774 cycles 1.00
ML-KEM-768 decaps 42007 cycles 41962 cycles 1.00
ML-KEM-1024 keypair 43716 cycles 43745 cycles 1.00
ML-KEM-1024 encaps 48773 cycles 48719 cycles 1.00
ML-KEM-1024 decaps 61379 cycles 61386 cycles 1.00

Contributor

@oqs-bot left a comment

AMD EPYC 3rd gen (c6a) (no-opt)

Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-512 keypair 40197 cycles 40266 cycles 1.00
ML-KEM-512 encaps 48368 cycles 48417 cycles 1.00
ML-KEM-512 decaps 62501 cycles 62596 cycles 1.00
ML-KEM-768 keypair 63804 cycles 63800 cycles 1.00
ML-KEM-768 encaps 74937 cycles 74978 cycles 1.00
ML-KEM-768 decaps 93413 cycles 93631 cycles 1.00
ML-KEM-1024 keypair 95310 cycles 95102 cycles 1.00
ML-KEM-1024 encaps 109384 cycles 109294 cycles 1.00
ML-KEM-1024 decaps 132137 cycles 132065 cycles 1.00

Contributor

@oqs-bot left a comment

AMD EPYC 4th gen (c7a) (no-opt)

Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-512 keypair 36602 cycles 36810 cycles 0.99
ML-KEM-512 encaps 43115 cycles 43058 cycles 1.00
ML-KEM-512 decaps 55703 cycles 55671 cycles 1.00
ML-KEM-768 keypair 58678 cycles 58693 cycles 1.00
ML-KEM-768 encaps 67624 cycles 67471 cycles 1.00
ML-KEM-768 decaps 84521 cycles 84392 cycles 1.00
ML-KEM-1024 keypair 89114 cycles 89088 cycles 1.00
ML-KEM-1024 encaps 99256 cycles 99346 cycles 1.00
ML-KEM-1024 decaps 120774 cycles 120756 cycles 1.00

Contributor

@oqs-bot left a comment

Intel Xeon 3rd gen (c6i) (no-opt)

Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-512 keypair 45749 cycles 45722 cycles 1.00
ML-KEM-512 encaps 54475 cycles 54376 cycles 1.00
ML-KEM-512 decaps 69855 cycles 69830 cycles 1.00
ML-KEM-768 keypair 74173 cycles 74187 cycles 1.00
ML-KEM-768 encaps 86050 cycles 86041 cycles 1.00
ML-KEM-768 decaps 106672 cycles 106532 cycles 1.00
ML-KEM-1024 keypair 112123 cycles 112130 cycles 1.00
ML-KEM-1024 encaps 124717 cycles 124654 cycles 1.00
ML-KEM-1024 decaps 150632 cycles 150714 cycles 1.00

Contributor

@oqs-bot left a comment

Graviton3

Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-512 keypair 18675 cycles 18639 cycles 1.00
ML-KEM-512 encaps 21889 cycles 21878 cycles 1.00
ML-KEM-512 decaps 28890 cycles 28864 cycles 1.00
ML-KEM-768 keypair 31630 cycles 31545 cycles 1.00
ML-KEM-768 encaps 34788 cycles 34776 cycles 1.00
ML-KEM-768 decaps 44839 cycles 44778 cycles 1.00
ML-KEM-1024 keypair 46069 cycles 46079 cycles 1.00
ML-KEM-1024 encaps 51492 cycles 51492 cycles 1
ML-KEM-1024 decaps 65005 cycles 65023 cycles 1.00

Contributor

@oqs-bot left a comment

Graviton4 (no-opt)

Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-512 keypair 35504 cycles 35410 cycles 1.00
ML-KEM-512 encaps 40175 cycles 40114 cycles 1.00
ML-KEM-512 decaps 51132 cycles 51139 cycles 1.00
ML-KEM-768 keypair 56800 cycles 56670 cycles 1.00
ML-KEM-768 encaps 64827 cycles 65147 cycles 1.00
ML-KEM-768 decaps 78931 cycles 79294 cycles 1.00
ML-KEM-1024 keypair 87846 cycles 87857 cycles 1.00
ML-KEM-1024 encaps 97109 cycles 96871 cycles 1.00
ML-KEM-1024 decaps 115956 cycles 115822 cycles 1.00

Contributor

@oqs-bot left a comment

Arm Cortex-A76 (Raspberry Pi 5) benchmarks

Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-512 keypair 28265 cycles 28220 cycles 1.00
ML-KEM-512 encaps 34157 cycles 34107 cycles 1.00
ML-KEM-512 decaps 44377 cycles 44335 cycles 1.00
ML-KEM-768 keypair 47618 cycles 47614 cycles 1.00
ML-KEM-768 encaps 53934 cycles 53937 cycles 1.00
ML-KEM-768 decaps 68340 cycles 68365 cycles 1.00
ML-KEM-1024 keypair 70248 cycles 70248 cycles 1
ML-KEM-1024 encaps 78733 cycles 78728 cycles 1.00
ML-KEM-1024 decaps 98416 cycles 98444 cycles 1.00

Contributor

@oqs-bot left a comment

Graviton3 (no-opt)

Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-512 keypair 39016 cycles 38886 cycles 1.00
ML-KEM-512 encaps 44562 cycles 44589 cycles 1.00
ML-KEM-512 decaps 56630 cycles 56665 cycles 1.00
ML-KEM-768 keypair 62456 cycles 62296 cycles 1.00
ML-KEM-768 encaps 71385 cycles 72308 cycles 0.99
ML-KEM-768 decaps 86856 cycles 87700 cycles 0.99
ML-KEM-1024 keypair 96224 cycles 96159 cycles 1.00
ML-KEM-1024 encaps 106363 cycles 106136 cycles 1.00
ML-KEM-1024 decaps 126811 cycles 126585 cycles 1.00

Contributor

@oqs-bot left a comment

Graviton2

Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-512 keypair 28238 cycles 28274 cycles 1.00
ML-KEM-512 encaps 34161 cycles 34125 cycles 1.00
ML-KEM-512 decaps 44340 cycles 44382 cycles 1.00
ML-KEM-768 keypair 47638 cycles 47672 cycles 1.00
ML-KEM-768 encaps 53920 cycles 53906 cycles 1.00
ML-KEM-768 decaps 68398 cycles 68361 cycles 1.00
ML-KEM-1024 keypair 70366 cycles 70253 cycles 1.00
ML-KEM-1024 encaps 78752 cycles 78754 cycles 1.00
ML-KEM-1024 decaps 98551 cycles 98440 cycles 1.00

Contributor

@oqs-bot left a comment

SpacemiT K1 8 (Banana Pi F3) benchmarks

Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-512 keypair 155506 cycles 155497 cycles 1.00
ML-KEM-512 encaps 163419 cycles 163389 cycles 1.00
ML-KEM-512 decaps 206655 cycles 206624 cycles 1.00
ML-KEM-768 keypair 249866 cycles 249882 cycles 1.00
ML-KEM-768 encaps 270396 cycles 270411 cycles 1.00
ML-KEM-768 decaps 332775 cycles 332827 cycles 1.00
ML-KEM-1024 keypair 395754 cycles 395617 cycles 1.00
ML-KEM-1024 encaps 423609 cycles 422610 cycles 1.00
ML-KEM-1024 decaps 507214 cycles 506225 cycles 1.00

Contributor

@oqs-bot left a comment

Graviton2 (no-opt)

Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-512 keypair 59230 cycles 59139 cycles 1.00
ML-KEM-512 encaps 68655 cycles 68634 cycles 1.00
ML-KEM-512 decaps 87379 cycles 87351 cycles 1.00
ML-KEM-768 keypair 95187 cycles 95327 cycles 1.00
ML-KEM-768 encaps 109224 cycles 109878 cycles 0.99
ML-KEM-768 decaps 134022 cycles 134352 cycles 1.00
ML-KEM-1024 keypair 146800 cycles 148090 cycles 0.99
ML-KEM-1024 encaps 162753 cycles 163969 cycles 0.99
ML-KEM-1024 decaps 194331 cycles 195624 cycles 0.99

Contributor

@oqs-bot left a comment

Arm Cortex-A55 (Snapdragon 888) benchmarks

Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-512 keypair 59765 cycles 59775 cycles 1.00
ML-KEM-512 encaps 67418 cycles 67520 cycles 1.00
ML-KEM-512 decaps 86122 cycles 86168 cycles 1.00
ML-KEM-768 keypair 97488 cycles 97434 cycles 1.00
ML-KEM-768 encaps 110849 cycles 110991 cycles 1.00
ML-KEM-768 decaps 138197 cycles 138336 cycles 1.00
ML-KEM-1024 keypair 155074 cycles 154826 cycles 1.00
ML-KEM-1024 encaps 172210 cycles 172753 cycles 1.00
ML-KEM-1024 decaps 209438 cycles 208701 cycles 1.00

Contributor

@oqs-bot commented May 5, 2026

CBMC Results (ML-KEM-1024)

Full Results (191 proofs)
Proof Status Current Previous Change
**TOTAL** 1248s 1172s +6.5%
mlk_indcpa_enc 145s 139s +4%
mlk_indcpa_keypair_derand 130s 120s +8%
mlk_rej_uniform_c 127s 113s +12%
mlk_polyvec_basemul_acc_montgomery_cached_c 81s 73s +11%
mlk_ntt_layer 33s 31s +6%
mlk_poly_rej_uniform 33s 29s +14%
polyvec_basemul_acc_montgomery_cached_native 33s 33s +0%
poly_ntt_native 27s 24s +12%
mlk_keccak_squeezeblocks_x4 25s 23s +9%
mlk_poly_reduce_native 20s 21s -5%
keccakf1600x4_permute_native_x4 18s 18s +0%
mlk_fqmul 15s 15s +0%
mlk_poly_decompress_d11_native 14s 14s +0%
mlk_poly_decompress_d5_native 13s 14s -7%
mlk_polyvec_add 12s 11s +9%
mlk_poly_frommsg 10s 8s +25%
mlk_poly_frombytes_native 9s 8s +12%
kem_dec 8s 4s +100%
mlk_keccak_squeezeblocks 8s 7s +14%
mlk_poly_compress_d11_c 8s 6s +33%
mlk_indcpa_dec 7s 7s +0%
mlk_keccak_absorb_once_x4 7s 5s +40%
mlk_keccak_squeeze_once 7s 7s +0%
mlk_ntt_butterfly_block 7s 8s -12%
mlk_gen_matrix 6s 5s +20%
mlk_invntt_layer 6s 5s +20%
mlk_keccakf1600_permute_c 6s 4s +50%
mlk_poly_decompress_d4_native 6s 3s +100%
mlk_poly_ntt 6s 4s +50%
mlk_polymat_permute_bitrev_to_custom 6s 5s +20%
poly_decompress_d5_native_x86_64 6s 6s +0%
rej_uniform_native_x86_64 6s 5s +20%
kem_keypair 5s 3s +67%
mlk_gen_matrix_serial 5s 4s +25%
mlk_poly_cbd_eta1 5s 4s +25%
mlk_poly_compress_d5_native 5s 1s +400%
mlk_poly_rej_uniform_x4 5s 7s -29%
mlk_polyvec_compress_du 5s 3s +67%
mlk_scalar_compress_d1 5s 2s +150%
mlk_shake256x4 5s 3s +67%
poly_decompress_d11_native_x86_64 5s 3s +67%
poly_frombytes_native_x86_64 5s 3s +67%
poly_getnoise_eta1122_4x_native 5s 2s +150%
polyvec_basemul_acc_montgomery_cached_k2_native_x86_64 5s 3s +67%
kem_check_pk 4s 2s +100%
mlk_matvec_mul 4s 5s -20%
mlk_poly_compress_d11_native 4s 3s +33%
mlk_poly_compress_d4 4s 2s +100%
mlk_poly_compress_d4_c 4s 2s +100%
mlk_poly_decompress_d10_c 4s 1s +300%
mlk_poly_mulcache_compute 4s 3s +33%
mlk_polyvec_basemul_acc_montgomery_cached 4s 5s -20%
mlk_polyvec_ntt 4s 3s +33%
mlk_sha3_512 4s 1s +300%
poly_tomont_native_aarch64 4s 2s +100%
polyvec_basemul_acc_montgomery_cached_k2_native_aarch64 4s 2s +100%
intt_native_aarch64 3s 2s +50%
keccak_f1600_x1_native_aarch64 3s 2s +50%
kem_enc 3s 1s +200%
mlk_barrett_reduce 3s 2s +50%
mlk_ct_cmask_neg_i16 3s 4s -25%
mlk_ct_cmov_zero 3s 2s +50%
mlk_ct_get_optblocker_i32 3s 1s +200%
mlk_ct_sel_uint8 3s 2s +50%
mlk_keccak_absorb_once 3s 3s +0%
mlk_keccakf1600_extract_bytes (big endian) 3s 2s +50%
mlk_keccakf1600_xor_bytes (big endian) 3s 1s +200%
mlk_keccakf1600x4_xor_bytes 3s 2s +50%
mlk_keypair_getnoise_eta1 3s 3s +0%
mlk_poly_compress_d10_c 3s 3s +0%
mlk_poly_compress_d11 3s 3s +0%
mlk_poly_compress_d5 3s 2s +50%
mlk_poly_compress_d5_c 3s 1s +200%
mlk_poly_decompress_d11 3s 2s +50%
mlk_poly_decompress_d11_c 3s 4s -25%
mlk_poly_decompress_d4 3s 3s +0%
mlk_poly_decompress_d4_c 3s 3s +0%
mlk_poly_decompress_dv 3s 1s +200%
mlk_poly_getnoise_eta1_4x_native 3s 2s +50%
mlk_poly_invntt_tomont 3s 1s +200%
mlk_poly_invntt_tomont_c 3s 2s +50%
mlk_poly_mulcache_compute_native 3s 3s +0%
mlk_poly_ntt_c 3s 4s -25%
mlk_poly_tomont 3s 2s +50%
mlk_poly_tomont_native 3s 1s +200%
mlk_polyvec_decompress_du 3s 2s +50%
mlk_polyvec_invntt_tomont 3s 2s +50%
mlk_scalar_compress_d11 3s 2s +50%
mlk_scalar_compress_d4 3s 2s +50%
mlk_scalar_decompress_d11 3s 2s +50%
mlk_scalar_decompress_d4 3s 2s +50%
mlk_scalar_signed_to_unsigned_q 3s 4s -25%
mlk_shake256 3s 1s +200%
mlk_value_barrier_u32 3s 2s +50%
mlk_value_barrier_u8 3s 2s +50%
ntt_native_aarch64 3s 4s -25%
ntt_native_x86_64 3s 2s +50%
nttunpack_native_x86_64 3s 2s +50%
poly_compress_d10_native_x86_64 3s 2s +50%
poly_compress_d11_native_x86_64 3s 1s +200%
poly_compress_d4_native_x86_64 3s 4s -25%
poly_compress_d5_native_x86_64 3s 3s +0%
poly_decompress_d4_native_x86_64 3s 1s +200%
poly_mulcache_compute_native_aarch64 3s 3s +0%
poly_tobytes_native_aarch64 3s 4s -25%
polyvec_basemul_acc_montgomery_cached_k4_native_aarch64 3s 1s +200%
rej_uniform_native 3s 1s +200%
rej_uniform_native_aarch64 3s 2s +50%
intt_native_x86_64 2s 3s -33%
keccak_f1600_x1_native_aarch64_v84a 2s 5s -60%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 2s 3s -33%
keccak_f1600_x4_native_avx2 2s 1s +100%
keccakf1600x4_xor_bytes_native 2s 2s +0%
kem_enc_derand 2s 2s +0%
mlk_check_pct 2s 3s -33%
mlk_ct_cmask_nonzero_u16 2s 3s -33%
mlk_ct_cmask_nonzero_u8 2s 5s -60%
mlk_ct_get_optblocker_u32 2s 1s +100%
mlk_enc_getnoise_eta1_eta2 2s 4s -50%
mlk_keccakf1600_extract_bytes 2s 1s +100%
mlk_keccakf1600_permute 2s 4s -50%
mlk_keccakf1600x4_extract_bytes 2s 4s -50%
mlk_keccakf1600x4_xor_bytes_c 2s 4s -50%
mlk_poly_cbd_eta2 2s 2s +0%
mlk_poly_compress_d10 2s 3s -33%
mlk_poly_compress_d10_native 2s 3s -33%
mlk_poly_compress_d4_native 2s 3s -33%
mlk_poly_decompress_d10 2s 1s +100%
mlk_poly_decompress_d10_native 2s 1s +100%
mlk_poly_decompress_d5 2s 3s -33%
mlk_poly_decompress_d5_c 2s 2s +0%
mlk_poly_decompress_du 2s 2s +0%
mlk_poly_getnoise_eta1122_4x 2s 3s -33%
mlk_poly_getnoise_eta1_4x 2s 3s -33%
mlk_poly_getnoise_eta2 2s 4s -50%
mlk_poly_mulcache_compute_c 2s 1s +100%
mlk_poly_reduce 2s 3s -33%
mlk_poly_reduce_c 2s 2s +0%
mlk_poly_sub 2s 2s +0%
mlk_poly_tobytes 2s 1s +100%
mlk_poly_tobytes_native 2s 1s +100%
mlk_poly_tomont_c 2s 2s +0%
mlk_polyvec_frombytes 2s 2s +0%
mlk_polyvec_mulcache_compute 2s 5s -60%
mlk_polyvec_permute_bitrev_to_custom 2s 3s -33%
mlk_polyvec_permute_bitrev_to_custom_native 2s 3s -33%
mlk_polyvec_reduce 2s 2s +0%
mlk_polyvec_tobytes 2s 3s -33%
mlk_polyvec_tomont 2s 2s +0%
mlk_rej_uniform 2s 2s +0%
mlk_scalar_compress_d10 2s 3s -33%
mlk_scalar_compress_d5 2s 1s +100%
mlk_scalar_decompress_d10 2s 2s +0%
mlk_sha3_256 2s 5s -60%
mlk_shake128_squeezeblocks 2s 1s +100%
mlk_shake128x4_absorb_once 2s 2s +0%
mlk_shake128x4_squeezeblocks 2s 2s +0%
poly_decompress_d10_native_x86_64 2s 3s -33%
poly_invntt_tomont_native 2s 3s -33%
poly_mulcache_compute_native_x86_64 2s 3s -33%
poly_tobytes_native_x86_64 2s 4s -50%
polyvec_basemul_acc_montgomery_cached_k3_native_x86_64 2s 2s +0%
polyvec_basemul_acc_montgomery_cached_k4_native_x86_64 2s 3s -33%
sys_check_capability 2s 2s +0%
keccak_f1600_x4_native_aarch64_v84a 1s 1s +0%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 1s 2s -50%
keccakf1600_permute_native 1s 3s -67%
keccakf1600x4_extract_bytes_native 1s 1s +0%
kem_check_sk 1s 3s -67%
kem_keypair_derand 1s 1s +0%
mlk_ct_get_optblocker_u8 1s 4s -75%
mlk_ct_memcmp 1s 2s -50%
mlk_ct_sel_int16 1s 2s -50%
mlk_keccakf1600_xor_bytes 1s 2s -50%
mlk_keccakf1600x4_extract_bytes_c 1s 3s -67%
mlk_keccakf1600x4_permute 1s 3s -67%
mlk_montgomery_reduce 1s 4s -75%
mlk_poly_add 1s 1s +0%
mlk_poly_compress_du 1s 2s -50%
mlk_poly_compress_dv 1s 5s -80%
mlk_poly_frombytes 1s 3s -67%
mlk_poly_frombytes_c 1s 3s -67%
mlk_poly_tobytes_c 1s 3s -67%
mlk_poly_tomsg 1s 1s +0%
mlk_scalar_decompress_d5 1s 1s +0%
mlk_shake128_absorb_once 1s 2s -50%
mlk_value_barrier_i32 1s 3s -67%
poly_reduce_native_aarch64 1s 2s -50%
poly_reduce_native_x86_64 1s 4s -75%
poly_tomont_native_x86_64 1s 2s -50%
polyvec_basemul_acc_montgomery_cached_k3_native_aarch64 1s 5s -80%
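
The Change column in the table above appears to be the relative difference between the Current and Previous proof times. The sketch below reproduces it under that assumption; `percent_change` is a hypothetical helper name, not part of the CI tooling.

```python
def percent_change(current_s, previous_s, digits=None):
    """Relative change of a proof runtime in percent.

    With digits=None the result is rounded to a whole number, as in the
    per-proof rows; the TOTAL row appears to use one decimal place.
    """
    return round(100 * (current_s - previous_s) / previous_s, digits)


# Rows copied from the ML-KEM-1024 table above.
print(percent_change(1248, 1172, 1))  # TOTAL
print(percent_change(145, 139))       # mlk_indcpa_enc
print(percent_change(5, 1))           # mlk_poly_compress_d5_native
```

Because the times are reported in whole seconds, a proof that moves from 1s to 5s shows up as +400% even though the absolute difference is only four seconds, so large percentages on short proofs say little on their own.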

Contributor

@oqs-bot commented May 5, 2026

CBMC Results (ML-KEM-512)

Full Results (191 proofs)
Proof Status Current Previous Change
**TOTAL** 1273s 1297s -1.9%
mlk_indcpa_keypair_derand 247s 237s +4%
mlk_indcpa_enc 163s 165s -1%
mlk_rej_uniform_c 115s 121s -5%
mlk_polyvec_basemul_acc_montgomery_cached_c 51s 50s +2%
mlk_ntt_layer 31s 32s -3%
mlk_poly_rej_uniform 30s 30s +0%
mlk_keccak_squeezeblocks_x4 25s 27s -7%
poly_ntt_native 23s 23s +0%
mlk_poly_reduce_native 21s 19s +11%
keccakf1600x4_permute_native_x4 17s 17s +0%
mlk_indcpa_dec 16s 14s +14%
mlk_poly_decompress_d10_native 15s 14s +7%
mlk_fqmul 14s 15s -7%
mlk_poly_decompress_d4_native 14s 15s -7%
mlk_poly_frommsg 10s 7s +43%
mlk_polyvec_add 10s 10s +0%
mlk_poly_ntt 9s 8s +12%
mlk_keccak_squeeze_once 8s 9s -11%
mlk_keccak_squeezeblocks 8s 9s -11%
mlk_ntt_butterfly_block 8s 8s +0%
mlk_poly_frombytes_native 8s 7s +14%
polyvec_basemul_acc_montgomery_cached_native 8s 6s +33%
mlk_keccak_absorb_once_x4 6s 6s +0%
mlk_poly_rej_uniform_x4 6s 6s +0%
poly_decompress_d10_native_x86_64 6s 5s +20%
rej_uniform_native_x86_64 6s 6s +0%
mlk_gen_matrix_serial 5s 2s +150%
mlk_invntt_layer 5s 5s +0%
mlk_keccakf1600x4_permute 5s 4s +25%
mlk_poly_cbd_eta2 5s 5s +0%
mlk_poly_compress_d10_c 5s 5s +0%
poly_compress_d4_native_x86_64 5s 2s +150%
kem_dec 4s 5s -20%
mlk_check_pct 4s 3s +33%
mlk_keccakf1600_permute_c 4s 4s +0%
mlk_poly_getnoise_eta1_4x_native 4s 3s +33%
mlk_poly_mulcache_compute_c 4s 5s -20%
mlk_poly_tomsg 4s 3s +33%
mlk_polyvec_invntt_tomont 4s 2s +100%
mlk_scalar_decompress_d11 4s 2s +100%
mlk_shake256x4 4s 5s -20%
mlk_value_barrier_i32 4s 4s +0%
poly_decompress_d4_native_x86_64 4s 4s +0%
rej_uniform_native_aarch64 4s 3s +33%
keccak_f1600_x1_native_aarch64_v84a 3s 2s +50%
kem_check_pk 3s 3s +0%
kem_check_sk 3s 3s +0%
kem_enc_derand 3s 1s +200%
kem_keypair_derand 3s 2s +50%
mlk_ct_cmask_neg_i16 3s 2s +50%
mlk_ct_cmov_zero 3s 3s +0%
mlk_ct_get_optblocker_u8 3s 3s +0%
mlk_ct_sel_int16 3s 1s +200%
mlk_gen_matrix 3s 3s +0%
mlk_keccak_absorb_once 3s 3s +0%
mlk_keccakf1600_xor_bytes 3s 2s +50%
mlk_keccakf1600x4_xor_bytes 3s 2s +50%
mlk_keypair_getnoise_eta1 3s 2s +50%
mlk_matvec_mul 3s 3s +0%
mlk_poly_compress_d10 3s 1s +200%
mlk_poly_compress_d10_native 3s 2s +50%
mlk_poly_compress_d11_native 3s 1s +200%
mlk_poly_compress_d4 3s 2s +50%
mlk_poly_compress_d4_c 3s 2s +50%
mlk_poly_compress_d4_native 3s 2s +50%
mlk_poly_compress_d5 3s 2s +50%
mlk_poly_decompress_d10_c 3s 2s +50%
mlk_poly_decompress_d4_c 3s 4s -25%
mlk_poly_decompress_d5_native 3s 1s +200%
mlk_poly_decompress_du 3s 2s +50%
mlk_poly_mulcache_compute 3s 2s +50%
mlk_poly_mulcache_compute_native 3s 2s +50%
mlk_poly_ntt_c 3s 4s -25%
mlk_poly_sub 3s 2s +50%
mlk_poly_tobytes_c 3s 1s +200%
mlk_poly_tobytes_native 3s 1s +200%
mlk_poly_tomont 3s 3s +0%
mlk_polymat_permute_bitrev_to_custom 3s 2s +50%
mlk_polyvec_basemul_acc_montgomery_cached 3s 5s -40%
mlk_polyvec_decompress_du 3s 5s -40%
mlk_polyvec_frombytes 3s 3s +0%
mlk_polyvec_tobytes 3s 3s +0%
mlk_scalar_compress_d5 3s 2s +50%
mlk_scalar_decompress_d5 3s 3s +0%
mlk_sha3_256 3s 2s +50%
mlk_shake128x4_absorb_once 3s 2s +50%
mlk_shake128x4_squeezeblocks 3s 3s +0%
mlk_shake256 3s 1s +200%
mlk_value_barrier_u32 3s 2s +50%
ntt_native_aarch64 3s 2s +50%
nttunpack_native_x86_64 3s 4s -25%
poly_compress_d5_native_x86_64 3s 1s +200%
poly_frombytes_native_x86_64 3s 5s -40%
poly_reduce_native_aarch64 3s 4s -25%
polyvec_basemul_acc_montgomery_cached_k2_native_x86_64 3s 4s -25%
polyvec_basemul_acc_montgomery_cached_k3_native_aarch64 3s 3s +0%
polyvec_basemul_acc_montgomery_cached_k4_native_aarch64 3s 3s +0%
intt_native_aarch64 2s 1s +100%
intt_native_x86_64 2s 4s -50%
keccak_f1600_x1_native_aarch64 2s 2s +0%
keccak_f1600_x4_native_aarch64_v84a 2s 2s +0%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 4s -50%
keccak_f1600_x4_native_avx2 2s 1s +100%
keccakf1600_permute_native 2s 4s -50%
keccakf1600x4_xor_bytes_native 2s 2s +0%
kem_enc 2s 2s +0%
mlk_barrett_reduce 2s 2s +0%
mlk_ct_cmask_nonzero_u16 2s 1s +100%
mlk_ct_get_optblocker_i32 2s 2s +0%
mlk_ct_get_optblocker_u32 2s 4s -50%
mlk_enc_getnoise_eta1_eta2 2s 6s -67%
mlk_keccakf1600_permute 2s 3s -33%
mlk_keccakf1600_xor_bytes (big endian) 2s 4s -50%
mlk_keccakf1600x4_extract_bytes_c 2s 1s +100%
mlk_keccakf1600x4_xor_bytes_c 2s 3s -33%
mlk_montgomery_reduce 2s 1s +100%
mlk_poly_add 2s 2s +0%
mlk_poly_cbd_eta1 2s 4s -50%
mlk_poly_compress_d11 2s 1s +100%
mlk_poly_compress_d11_c 2s 3s -33%
mlk_poly_compress_d5_native 2s 3s -33%
mlk_poly_compress_du 2s 2s +0%
mlk_poly_compress_dv 2s 3s -33%
mlk_poly_decompress_d10 2s 2s +0%
mlk_poly_decompress_d11 2s 1s +100%
mlk_poly_decompress_d11_c 2s 3s -33%
mlk_poly_decompress_d11_native 2s 2s +0%
mlk_poly_decompress_d4 2s 2s +0%
mlk_poly_decompress_d5_c 2s 2s +0%
mlk_poly_decompress_dv 2s 2s +0%
mlk_poly_frombytes_c 2s 4s -50%
mlk_poly_getnoise_eta1_4x 2s 2s +0%
mlk_poly_getnoise_eta2 2s 2s +0%
mlk_poly_invntt_tomont 2s 1s +100%
mlk_poly_reduce_c 2s 1s +100%
mlk_poly_tomont_c 2s 3s -33%
mlk_poly_tomont_native 2s 3s -33%
mlk_polyvec_mulcache_compute 2s 3s -33%
mlk_polyvec_permute_bitrev_to_custom 2s 3s -33%
mlk_polyvec_permute_bitrev_to_custom_native 2s 2s +0%
mlk_polyvec_reduce 2s 4s -50%
mlk_scalar_compress_d1 2s 5s -60%
mlk_scalar_compress_d10 2s 1s +100%
mlk_scalar_compress_d11 2s 2s +0%
mlk_scalar_compress_d4 2s 2s +0%
mlk_scalar_decompress_d10 2s 3s -33%
mlk_scalar_signed_to_unsigned_q 2s 3s -33%
mlk_sha3_512 2s 1s +100%
mlk_shake128_absorb_once 2s 3s -33%
mlk_shake128_squeezeblocks 2s 2s +0%
mlk_value_barrier_u8 2s 1s +100%
ntt_native_x86_64 2s 3s -33%
poly_compress_d11_native_x86_64 2s 3s -33%
poly_decompress_d5_native_x86_64 2s 3s -33%
poly_mulcache_compute_native_aarch64 2s 4s -50%
poly_mulcache_compute_native_x86_64 2s 3s -33%
poly_reduce_native_x86_64 2s 3s -33%
poly_tobytes_native_x86_64 2s 1s +100%
poly_tomont_native_aarch64 2s 2s +0%
polyvec_basemul_acc_montgomery_cached_k2_native_aarch64 2s 2s +0%
polyvec_basemul_acc_montgomery_cached_k3_native_x86_64 2s 2s +0%
polyvec_basemul_acc_montgomery_cached_k4_native_x86_64 2s 2s +0%
rej_uniform_native 2s 3s -33%
sys_check_capability 2s 2s +0%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 1s 2s -50%
keccakf1600x4_extract_bytes_native 1s 3s -67%
kem_keypair 1s 2s -50%
mlk_ct_cmask_nonzero_u8 1s 3s -67%
mlk_ct_memcmp 1s 3s -67%
mlk_ct_sel_uint8 1s 5s -80%
mlk_keccakf1600_extract_bytes 1s 3s -67%
mlk_keccakf1600_extract_bytes (big endian) 1s 3s -67%
mlk_keccakf1600x4_extract_bytes 1s 1s +0%
mlk_poly_compress_d5_c 1s 4s -75%
mlk_poly_decompress_d5 1s 2s -50%
mlk_poly_frombytes 1s 3s -67%
mlk_poly_getnoise_eta1122_4x 1s 4s -75%
mlk_poly_invntt_tomont_c 1s 2s -50%
mlk_poly_reduce 1s 1s +0%
mlk_poly_tobytes 1s 2s -50%
mlk_polyvec_compress_du 1s 2s -50%
mlk_polyvec_ntt 1s 1s +0%
mlk_polyvec_tomont 1s 2s -50%
mlk_rej_uniform 1s 2s -50%
mlk_scalar_decompress_d4 1s 1s +0%
poly_compress_d10_native_x86_64 1s 2s -50%
poly_decompress_d11_native_x86_64 1s 5s -80%
poly_getnoise_eta1122_4x_native 1s 4s -75%
poly_invntt_tomont_native 1s 2s -50%
poly_tobytes_native_aarch64 1s 3s -67%
poly_tomont_native_x86_64 1s 3s -67%

@oqs-bot
Contributor

oqs-bot commented May 5, 2026

CBMC Results (ML-KEM-768)

Full Results (191 proofs)
Proof Status Current Previous Change
**TOTAL** 1185s 1265s -6.3%
mlk_indcpa_keypair_derand 186s 200s -7%
mlk_indcpa_enc 162s 174s -7%
mlk_rej_uniform_c 114s 142s -20%
mlk_polyvec_basemul_acc_montgomery_cached_c 42s 40s +5%
mlk_poly_rej_uniform 27s 30s -10%
mlk_ntt_layer 26s 27s -4%
mlk_keccak_squeezeblocks_x4 23s 24s -4%
poly_ntt_native 23s 30s -23%
mlk_poly_reduce_native 20s 23s -13%
mlk_fqmul 17s 17s +0%
polyvec_basemul_acc_montgomery_cached_native 17s 18s -6%
keccakf1600x4_permute_native_x4 15s 16s -6%
mlk_poly_decompress_d4_native 14s 16s -12%
mlk_indcpa_dec 13s 13s +0%
mlk_poly_decompress_d10_native 12s 13s -8%
mlk_polyvec_add 10s 13s -23%
mlk_keccak_squeeze_once 8s 8s +0%
mlk_keccak_squeezeblocks 8s 10s -20%
mlk_poly_frombytes_native 8s 8s +0%
mlk_poly_rej_uniform_x4 8s 6s +33%
mlk_poly_frommsg 7s 9s -22%
mlk_poly_invntt_tomont 7s 2s +250%
mlk_keccak_absorb_once_x4 6s 6s +0%
mlk_ntt_butterfly_block 6s 7s -14%
poly_decompress_d10_native_x86_64 6s 5s +20%
rej_uniform_native_x86_64 6s 5s +20%
kem_check_pk 5s 4s +25%
mlk_invntt_layer 5s 5s +0%
mlk_keccakf1600_permute_c 5s 6s -17%
mlk_poly_compress_d10_c 5s 3s +67%
poly_decompress_d4_native_x86_64 5s 6s -17%
poly_mulcache_compute_native_x86_64 5s 4s +25%
polyvec_basemul_acc_montgomery_cached_k4_native_aarch64 5s 3s +67%
keccakf1600x4_xor_bytes_native 4s 2s +100%
kem_keypair 4s 3s +33%
mlk_ct_cmask_nonzero_u8 4s 2s +100%
mlk_gen_matrix 4s 3s +33%
mlk_keccak_absorb_once 4s 4s +0%
mlk_keccakf1600_xor_bytes (big endian) 4s 2s +100%
mlk_poly_compress_d5_native 4s 5s -20%
mlk_poly_decompress_du 4s 2s +100%
mlk_poly_ntt 4s 8s -50%
mlk_poly_reduce_c 4s 1s +300%
mlk_polyvec_mulcache_compute 4s 3s +33%
mlk_polyvec_permute_bitrev_to_custom_native 4s 5s -20%
mlk_shake128x4_absorb_once 4s 3s +33%
mlk_shake256x4 4s 4s +0%
poly_frombytes_native_x86_64 4s 5s -20%
poly_reduce_native_aarch64 4s 1s +300%
intt_native_aarch64 3s 2s +50%
keccak_f1600_x1_native_aarch64 3s 2s +50%
kem_check_sk 3s 1s +200%
kem_dec 3s 5s -40%
kem_enc_derand 3s 2s +50%
mlk_barrett_reduce 3s 3s +0%
mlk_check_pct 3s 4s -25%
mlk_ct_cmask_nonzero_u16 3s 3s +0%
mlk_ct_get_optblocker_u32 3s 2s +50%
mlk_enc_getnoise_eta1_eta2 3s 2s +50%
mlk_keccakf1600_extract_bytes (big endian) 3s 2s +50%
mlk_keccakf1600_permute 3s 3s +0%
mlk_keypair_getnoise_eta1 3s 4s -25%
mlk_matvec_mul 3s 1s +200%
mlk_montgomery_reduce 3s 1s +200%
mlk_poly_cbd_eta2 3s 3s +0%
mlk_poly_compress_d10_native 3s 1s +200%
mlk_poly_compress_d11 3s 4s -25%
mlk_poly_compress_d11_native 3s 1s +200%
mlk_poly_compress_d4 3s 3s +0%
mlk_poly_compress_d4_native 3s 2s +50%
mlk_poly_compress_du 3s 2s +50%
mlk_poly_decompress_d11_c 3s 1s +200%
mlk_poly_decompress_d4 3s 2s +50%
mlk_poly_decompress_d5_native 3s 3s +0%
mlk_poly_getnoise_eta1122_4x 3s 2s +50%
mlk_poly_getnoise_eta1_4x_native 3s 1s +200%
mlk_poly_mulcache_compute_native 3s 1s +200%
mlk_poly_ntt_c 3s 3s +0%
mlk_poly_tomont_c 3s 2s +50%
mlk_polymat_permute_bitrev_to_custom 3s 3s +0%
mlk_polyvec_tomont 3s 2s +50%
mlk_scalar_compress_d10 3s 2s +50%
mlk_scalar_compress_d4 3s 2s +50%
mlk_sha3_256 3s 2s +50%
ntt_native_x86_64 3s 4s -25%
nttunpack_native_x86_64 3s 4s -25%
poly_compress_d10_native_x86_64 3s 2s +50%
poly_compress_d5_native_x86_64 3s 2s +50%
poly_decompress_d11_native_x86_64 3s 3s +0%
poly_reduce_native_x86_64 3s 3s +0%
polyvec_basemul_acc_montgomery_cached_k2_native_x86_64 3s 3s +0%
polyvec_basemul_acc_montgomery_cached_k4_native_x86_64 3s 3s +0%
intt_native_x86_64 2s 2s +0%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 3s -33%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 2s 1s +100%
keccak_f1600_x4_native_avx2 2s 2s +0%
keccakf1600_permute_native 2s 2s +0%
keccakf1600x4_extract_bytes_native 2s 2s +0%
mlk_ct_cmask_neg_i16 2s 3s -33%
mlk_ct_cmov_zero 2s 3s -33%
mlk_ct_get_optblocker_u8 2s 3s -33%
mlk_ct_sel_uint8 2s 2s +0%
mlk_gen_matrix_serial 2s 5s -60%
mlk_keccakf1600x4_extract_bytes 2s 2s +0%
mlk_keccakf1600x4_permute 2s 2s +0%
mlk_keccakf1600x4_xor_bytes 2s 1s +100%
mlk_keccakf1600x4_xor_bytes_c 2s 2s +0%
mlk_poly_cbd_eta1 2s 1s +100%
mlk_poly_compress_d10 2s 1s +100%
mlk_poly_compress_d11_c 2s 2s +0%
mlk_poly_compress_d4_c 2s 2s +0%
mlk_poly_compress_d5 2s 3s -33%
mlk_poly_compress_d5_c 2s 1s +100%
mlk_poly_compress_dv 2s 2s +0%
mlk_poly_decompress_d10 2s 3s -33%
mlk_poly_decompress_d11 2s 1s +100%
mlk_poly_decompress_d11_native 2s 3s -33%
mlk_poly_decompress_d5 2s 2s +0%
mlk_poly_decompress_d5_c 2s 4s -50%
mlk_poly_decompress_dv 2s 3s -33%
mlk_poly_getnoise_eta1_4x 2s 2s +0%
mlk_poly_getnoise_eta2 2s 3s -33%
mlk_poly_mulcache_compute_c 2s 4s -50%
mlk_poly_reduce 2s 1s +100%
mlk_poly_tobytes 2s 1s +100%
mlk_poly_tobytes_native 2s 2s +0%
mlk_poly_tomont_native 2s 3s -33%
mlk_poly_tomsg 2s 2s +0%
mlk_polyvec_basemul_acc_montgomery_cached 2s 3s -33%
mlk_polyvec_compress_du 2s 3s -33%
mlk_polyvec_decompress_du 2s 1s +100%
mlk_polyvec_frombytes 2s 2s +0%
mlk_polyvec_invntt_tomont 2s 3s -33%
mlk_polyvec_ntt 2s 1s +100%
mlk_polyvec_permute_bitrev_to_custom 2s 1s +100%
mlk_polyvec_reduce 2s 2s +0%
mlk_polyvec_tobytes 2s 4s -50%
mlk_rej_uniform 2s 1s +100%
mlk_scalar_compress_d1 2s 2s +0%
mlk_scalar_compress_d5 2s 1s +100%
mlk_scalar_signed_to_unsigned_q 2s 1s +100%
mlk_shake128_absorb_once 2s 2s +0%
mlk_shake128_squeezeblocks 2s 2s +0%
mlk_shake128x4_squeezeblocks 2s 1s +100%
mlk_value_barrier_u8 2s 2s +0%
ntt_native_aarch64 2s 4s -50%
poly_compress_d4_native_x86_64 2s 3s -33%
poly_decompress_d5_native_x86_64 2s 2s +0%
poly_getnoise_eta1122_4x_native 2s 1s +100%
poly_invntt_tomont_native 2s 2s +0%
poly_mulcache_compute_native_aarch64 2s 2s +0%
poly_tobytes_native_aarch64 2s 2s +0%
poly_tobytes_native_x86_64 2s 3s -33%
poly_tomont_native_aarch64 2s 2s +0%
polyvec_basemul_acc_montgomery_cached_k2_native_aarch64 2s 2s +0%
polyvec_basemul_acc_montgomery_cached_k3_native_aarch64 2s 4s -50%
polyvec_basemul_acc_montgomery_cached_k3_native_x86_64 2s 2s +0%
rej_uniform_native 2s 4s -50%
sys_check_capability 2s 4s -50%
keccak_f1600_x1_native_aarch64_v84a 1s 2s -50%
keccak_f1600_x4_native_aarch64_v84a 1s 2s -50%
kem_enc 1s 3s -67%
kem_keypair_derand 1s 4s -75%
mlk_ct_get_optblocker_i32 1s 2s -50%
mlk_ct_memcmp 1s 2s -50%
mlk_ct_sel_int16 1s 2s -50%
mlk_keccakf1600_extract_bytes 1s 2s -50%
mlk_keccakf1600_xor_bytes 1s 2s -50%
mlk_keccakf1600x4_extract_bytes_c 1s 2s -50%
mlk_poly_add 1s 2s -50%
mlk_poly_decompress_d10_c 1s 2s -50%
mlk_poly_decompress_d4_c 1s 2s -50%
mlk_poly_frombytes 1s 1s +0%
mlk_poly_frombytes_c 1s 2s -50%
mlk_poly_invntt_tomont_c 1s 1s +0%
mlk_poly_mulcache_compute 1s 2s -50%
mlk_poly_sub 1s 3s -67%
mlk_poly_tobytes_c 1s 2s -50%
mlk_poly_tomont 1s 1s +0%
mlk_scalar_compress_d11 1s 2s -50%
mlk_scalar_decompress_d10 1s 3s -67%
mlk_scalar_decompress_d11 1s 2s -50%
mlk_scalar_decompress_d4 1s 2s -50%
mlk_scalar_decompress_d5 1s 2s -50%
mlk_sha3_512 1s 2s -50%
mlk_shake256 1s 2s -50%
mlk_value_barrier_i32 1s 3s -67%
mlk_value_barrier_u32 1s 1s +0%
poly_compress_d11_native_x86_64 1s 3s -67%
poly_tomont_native_x86_64 1s 2s -50%
rej_uniform_native_aarch64 1s 1s +0%
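
The Change column above appears to be the relative difference between the Current and Previous proof times. The sketch below reproduces a few rows under that assumption; the function name `pct_change` is hypothetical and is not taken from the CI scripts.

```python
def pct_change(current_s, previous_s):
    """Relative change of the current proof time versus the previous one, in percent."""
    return 100.0 * (current_s - previous_s) / previous_s


# Rows copied from the ML-KEM-768 table above: (proof, current seconds, previous seconds).
rows = [
    ("TOTAL", 1185, 1265),
    ("mlk_indcpa_keypair_derand", 186, 200),
    ("mlk_poly_invntt_tomont", 7, 2),
    ("mlk_poly_reduce_c", 4, 1),
]

for name, current_s, previous_s in rows:
    print(f"{name}: {pct_change(current_s, previous_s):+.1f}%")
```

Because the times are reported in whole seconds, a proof that moves from 1s to 2s shows up as +100%. Large percentages on single-digit rows are therefore likely to be rounding and scheduling noise; the TOTAL row is the more reliable signal.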

Contributor

@oqs-bot oqs-bot left a comment

Arm Cortex-A72 (Raspberry Pi 4) benchmarks

Details
Benchmark suite Current: 12036a9 Previous: db75353 Ratio
ML-KEM-512 keypair 50711 cycles 50835 cycles 1.00
ML-KEM-512 encaps 59128 cycles 58693 cycles 1.01
ML-KEM-512 decaps 75104 cycles 74946 cycles 1.00
ML-KEM-768 keypair 86483 cycles 86333 cycles 1.00
ML-KEM-768 encaps 94967 cycles 95399 cycles 1.00
ML-KEM-768 decaps 117949 cycles 119161 cycles 0.99
ML-KEM-1024 keypair 129328 cycles 130734 cycles 0.99
ML-KEM-1024 encaps 141943 cycles 143257 cycles 0.99
ML-KEM-1024 decaps 174407 cycles 173329 cycles 1.01

This comment was automatically generated by workflow using github-action-benchmark.

This commit introduces separate proofs for:
- mlk_keccakf1600_permute_c()
- mlk_keccakf1600x4_extract_bytes_c()
- mlk_keccakf1600x4_xor_bytes_c()

For arithmetic functions that have a native implementation,
we have 3 CBMC proofs:

1. Proof for the pure C implementation, named XXX_c()
2. Proof for the wrapper function on top of the C implementation
3. Proof for the wrapper function on top of the native function
   (with C fallback).

This commit separates the current proofs for these three functions to follow
the above structure.

For each function, the following steps were performed:

- Add the corresponding CBMC contract, copied from the wrapper function.
- Create a dedicated CBMC proof for the pure C implementation.
- Update the existing wrapper CBMC proof Makefiles by adding XXX_C to
  USE_FUNCTION_CONTRACTS, and apply the same change to the native proof
  configuration.
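
The three-proof structure described above can be pictured with a small model. This is an illustrative Python sketch only: the real code is C, and every name below (`permute_c`, `permute_native`, `permute`, the status codes, the placeholder arithmetic) is hypothetical. The idea is that the pure C function gets its own proof, and the wrapper proofs rely on its contract instead of re-verifying its body.

```python
# Illustrative model only: the real implementation is C, all names are hypothetical.

NATIVE_OK = 0
NATIVE_FALLBACK = -1


def permute_c(state):
    """Layer 1: stand-in for the pure C implementation XXX_c()."""
    return [(x + 1) & 0xFFFF for x in state]


def permute_native(state, have_vector_unit):
    """Stand-in for a native backend that may decline at run time."""
    if not have_vector_unit:
        return NATIVE_FALLBACK, state
    return NATIVE_OK, [(x + 1) & 0xFFFF for x in state]


def permute(state, use_native=False, have_vector_unit=False):
    """Layers 2 and 3: the wrapper, built without or with a native backend."""
    if use_native:
        status, out = permute_native(state, have_vector_unit)
        if status == NATIVE_OK:
            return out
    # C fallback: taken when no native backend is built in, or when it declines.
    return permute_c(state)
```

In this picture, the first proof covers `permute_c` alone, the second covers `permute` with `use_native=False`, and the third covers `permute` with `use_native=True`, including the fallback branch.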

Signed-off-by: willieyz <willie.zhao@chelpis.com>
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
mkannwischer and others added 27 commits May 19, 2026 04:31
I recently switched from Chelpis to zeroRISC, meaning our MAINTAINERS.md
is outdated.
This commit removes affiliations and Discord identifiers from the file as they
are unnecessary.

Signed-off-by: Matthias J. Kannwischer <matthias@zerorisc.com>
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Matthias J. Kannwischer <matthias@zerorisc.com>
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
…res and tested.

1. Ran scripts/autogen and scripts/lint on Mac, but not sure if they run on ppc64le.
2. Ran simpasm on Red Hat Linux.
3. Added detailed comments on NTT and INTT implementations.
4. Used C type symbols to improve readability.
5. Fixed some typos.

Signed-off-by: Danny Tsen <dtsen@us.ibm.com>

The following tests were run on p10.

[09:28] danny@ltcden12-lp1 new_ppc64le_mlkem % ./scripts/tests func
INFO  > Functional Test    Compile     (native no_opt):  make func OPT=0 AUTO=1 -j40
INFO  > Functional Test    ML-KEM-512  (native no_opt):  make run_func_512 -j40
INFO  > Functional Test    ML-KEM-768  (native no_opt):  make run_func_768 -j40
INFO  > Functional Test    ML-KEM-1024 (native no_opt):  make run_func_1024 -j40
INFO  > Functional Test    Compile     (native opt):     make func OPT=1 AUTO=1 -j40
INFO  > Functional Test    ML-KEM-512  (native opt):     make run_func_512 -j40
INFO  > Functional Test    ML-KEM-768  (native opt):     make run_func_768 -j40
INFO  > Functional Test    ML-KEM-1024 (native opt):     make run_func_1024 -j40
All good!

[09:28] danny@ltcden12-lp1 new_ppc64le_mlkem % ./scripts/tests bench -c PERF
INFO  > Benchmark          Compile     (native no_opt):  make bench OPT=0 AUTO=1 CYCLES=PERF -j40
INFO  > Benchmark          ML-KEM-512  (native no_opt):  make run_bench_512
INFO  > Benchmark          ML-KEM-512  (native no_opt):  test/build/mlkem512/bin/bench_mlkem512
   keypair cycles = 66982
    encaps cycles = 78820
    decaps cycles = 100923

           percentile      1     10     20     30     40     50     60     70     80     90     99
   keypair percentiles:  66438  66690  66791  66857  66920  66982  67043  67122  67218  67306  71905
    encaps percentiles:  78322  78516  78618  78687  78752  78820  78878  78933  79012  79116  83825
    decaps percentiles: 100427 100634 100733 100804 100869 100923 100985 101056 101131 101253 105852

INFO  > Benchmark          ML-KEM-768  (native no_opt):  make run_bench_768
INFO  > Benchmark          ML-KEM-768  (native no_opt):  test/build/mlkem768/bin/bench_mlkem768
   keypair cycles = 111380
    encaps cycles = 125891
    decaps cycles = 154364

           percentile      1     10     20     30     40     50     60     70     80     90     99
   keypair percentiles: 110575 110914 111083 111192 111291 111380 111496 111617 111821 112414 116725
    encaps percentiles: 125081 125403 125526 125655 125776 125891 125998 126122 126293 126771 131358
    decaps percentiles: 153575 153870 154008 154131 154261 154364 154487 154630 154782 155313 159863

INFO  > Benchmark          ML-KEM-1024 (native no_opt):  make run_bench_1024
INFO  > Benchmark          ML-KEM-1024 (native no_opt):  test/build/mlkem1024/bin/bench_mlkem1024
   keypair cycles = 166809
    encaps cycles = 185315
    decaps cycles = 220229

           percentile      1     10     20     30     40     50     60     70     80     90     99
   keypair percentiles: 165339 165995 166236 166435 166616 166809 167007 167200 167505 171058 175606
    encaps percentiles: 183839 184563 184778 184951 185158 185315 185518 185744 186123 189637 192014
    decaps percentiles: 218911 219430 219705 219841 220027 220229 220436 220673 221029 224484 226901

INFO  > Benchmark          Compile     (native opt):     make bench OPT=1 AUTO=1 CYCLES=PERF -j40
INFO  > Benchmark          ML-KEM-512  (native opt):     make run_bench_512
INFO  > Benchmark          ML-KEM-512  (native opt):     test/build/mlkem512/bin/bench_mlkem512
   keypair cycles = 45750
    encaps cycles = 50661
    decaps cycles = 63561

           percentile      1     10     20     30     40     50     60     70     80     90     99
   keypair percentiles:  45248  45469  45546  45620  45690  45750  45806  45886  45954  46063  50703
    encaps percentiles:  50192  50367  50468  50542  50600  50661  50710  50771  50858  50954  55652
    decaps percentiles:  63091  63276  63381  63436  63497  63561  63623  63679  63743  63857  68437

INFO  > Benchmark          ML-KEM-768  (native opt):     make run_bench_768
INFO  > Benchmark          ML-KEM-768  (native opt):     test/build/mlkem768/bin/bench_mlkem768
   keypair cycles = 79045
    encaps cycles = 86455
    decaps cycles = 103878

           percentile      1     10     20     30     40     50     60     70     80     90     99
   keypair percentiles:  78313  78578  78742  78847  78954  79045  79169  79285  79470  79978  84430
    encaps percentiles:  85628  86009  86172  86272  86363  86455  86592  86711  86879  87292  92038
    decaps percentiles: 103041 103399 103540 103676 103788 103878 103993 104104 104274 104736 109361

INFO  > Benchmark          ML-KEM-1024 (native opt):     make run_bench_1024
INFO  > Benchmark          ML-KEM-1024 (native opt):     test/build/mlkem1024/bin/bench_mlkem1024
   keypair cycles = 124072
    encaps cycles = 134500
    decaps cycles = 157090

           percentile      1     10     20     30     40     50     60     70     80     90     99
   keypair percentiles: 122727 123259 123515 123720 123929 124072 124253 124527 125009 128466 133334
    encaps percentiles: 133064 133681 133933 134129 134320 134500 134711 134933 135346 138753 141067
    decaps percentiles: 155503 156261 156510 156694 156894 157090 157285 157605 158014 161592 166723

All good!
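
To compare the two builds in the log above, the median cycle counts can be pulled out of the `cycles = N` lines and divided. The following is a minimal sketch that uses only the ML-KEM-512 figures quoted in this log; the helper names `parse_cycles` and `speedups` are hypothetical and are not part of `scripts/tests`.

```python
import re

# Median lines copied from the ML-KEM-512 runs above.
LOG_NO_OPT = """\
   keypair cycles = 66982
    encaps cycles = 78820
    decaps cycles = 100923
"""
LOG_OPT = """\
   keypair cycles = 45750
    encaps cycles = 50661
    decaps cycles = 63561
"""


def parse_cycles(log_text):
    """Map each operation name to its reported cycle count."""
    pattern = re.compile(r"^\s*(\w+) cycles = (\d+)\s*$", re.MULTILINE)
    return {name: int(value) for name, value in pattern.findall(log_text)}


def speedups(no_opt, opt):
    """Ratio of no_opt cycles to opt cycles; values above 1 mean opt is faster."""
    return {op: no_opt[op] / opt[op] for op in no_opt}


no_opt = parse_cycles(LOG_NO_OPT)
opt = parse_cycles(LOG_OPT)
for op, factor in speedups(no_opt, opt).items():
    print(f"ML-KEM-512 {op}: {factor:.2f}x")
```

The same two helpers work unchanged on the ML-KEM-768 and ML-KEM-1024 sections of the log.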

Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
   Power Systems support ISA 2.07 and above.
2. Fixed typo, headers and return (MLK_MUST_CHECK_RETURN_VALUE).

Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
(venv) [9:51][@MacBookPro] mlkem_test/ % ./scripts/autogen
✓ Generate citations (0.2s)
✓ Generate OQS META.yml files (0.0s)
– Generate SLOTHY optimized assembly (0.0s)
✓ Check assembly register aliases (0.1s)
✓ Check assembly loop labels (0.1s)
✓ Normalize assembly macro syntax (0.3s)
✓ Generate zeta and lookup tables (0.0s)
✓ Generate HOL Light assembly (1.7s)
✓ Synchronize backends (1.4s)
✓ Generate header guards (0.1s)
✓ Complete final backend synchronization (0.6s)
– Update HOL Light bytecode (0.0s)
✓ Generate monolithic source files (1.6s)
✓ Generate undefs (1.4s)
✓ Generate test configs (0.0s)
✓ Check macro typos (0.3s)
/Users/danny/my_repo/ws/docker_mlkem/mlkem_test/./scripts/autogen:500: PyparsingDeprecationWarning: 'parseString' deprecated - use
'parse_string'
  exp = self.parser.parseString(exp, parseAll=True).as_list()[0]
✓ Generate preprocessor comments (1.6s)
✓ Format files (2.2s)
updated BIBLIOGRAPHY.md
updated mlkem/src/native/ppc64le/src/reduce.S
updated mlkem/src/native/ppc64le/src/poly_tomont.S
updated dev/ppc64le/src/ntt_ppc.S
updated dev/ppc64le/src/reduce.S
updated mlkem/src/native/ppc64le/src/intt_ppc.S
updated dev/ppc64le/src/poly_tomont.S
updated mlkem/mlkem_native.c
updated integration/liboqs/config_ppc64le.h
updated dev/ppc64le/src/intt_ppc.S
updated mlkem/src/native/ppc64le/src/ntt_ppc.S
updated mlkem/mlkem_native_asm.S
✓ Finalize and write files (0.0s)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:11  Done ✓

Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
…added

  ppc64le assembly files and consts.c in mlkem_native_asm.S and mlkem_native.c
  since autogen did not add these files for ppc64le.

Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
and fall back to default implementation if not.

Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
1. Fixed typos and minor comments.
2. Removed IZETA_NTT_OFFSET127 in consts_intt.inc.
3. Fixed _asm suffix in backend name.
4. Fixed MLK_PPC_ prefix in constants.

Manually fixed the backend names of all assembly files in
mlkem/src/native/ppc64le/src/ to run the tests, since
simpasm cannot be run properly for ppc64le in my env.

Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
…s correct.

  ./scripts/tests func runs fine.

Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
1. Fixed wrong path in ML-KEM-768_META.yml.
2. Added __POWER8_VECTOR__ guard in asm files and meta.h.
3. Fixed capitalization in asm macros and misc.
4. Renamed asm files to _ppc_asm.S.

Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
… twisted zetas.

Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
ppc64le: address review comments from mkannwischer
Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
- ruff format scripts/autogen (formatting fix)
- Add check-magic annotation for array size 2072 in consts.c and consts.h
  (7 groups of 8 base constants + 4 twiddle tables * 63 rows * 8 values)
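
The array size in the annotation can be checked directly from the breakdown given in the commit message. The constant names below are hypothetical labels for the quantities it lists.

```python
# Quantities as stated in the commit message; the names are illustrative only.
BASE_GROUPS = 7
VALUES_PER_GROUP = 8
TWIDDLE_TABLES = 4
ROWS_PER_TABLE = 63
VALUES_PER_ROW = 8

base_constants = BASE_GROUPS * VALUES_PER_GROUP
twiddle_values = TWIDDLE_TABLES * ROWS_PER_TABLE * VALUES_PER_ROW
total = base_constants + twiddle_values

print(base_constants, twiddle_values, total)
```

The two terms come to 56 and 2016, which add up to the annotated size of 2072.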

Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
…power8 (nix libc is compiled for power8 and otherwise causes illegal instructions).

Avoids unused data parameter errors in the fallback code path.

Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
@mkannwischer added and removed the benchmark label ("this PR should be benchmarked in CI") on May 21, 2026

Labels

benchmark this PR should be benchmarked in CI

5 participants