RFC: Add incremental encaps API to support ML-KEM Braid #1619

Draft: mkannwischer wants to merge 13 commits into main from incremental-enc-api

@mkannwischer (Contributor)

Split ML-KEM encapsulation into two phases (mlk_kem_enc_derand_u / mlk_kem_enc_v) to support protocols such as Braid that need to perform other operations between computing the u- and v-components of the ciphertext. The first phase requires only the public seed and H(pk), not the full public key vector. Internally, K-PKE.Encrypt is refactored into mlk_indcpa_enc_u + mlk_indcpa_enc_v. The non-incremental KEM path calls mlk_indcpa_enc directly to avoid the serialization overhead. The intermediate noise polynomial epp is serialized as 4-bit nibbles (one per coefficient, 128 bytes in total), primarily so that no precondition on the allowed values is required.
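
For readers unfamiliar with the nibble serialization mentioned above, here is a minimal C sketch of how such a format could work. Everything in it is an assumption made for illustration and is not taken from the PR: the helper names (epp_pack_sketch, epp_unpack_sketch, sext4_sketch), the constants, and the choice of 4-bit two's-complement nibbles stored low nibble first. It also assumes that epp is the CBD(eta2) noise term, whose coefficients lie in [-2, 2]. The actual encoding in the PR may differ. The sketch illustrates two points: 256 coefficients at 4 bits each occupy 128 bytes, and every possible input byte decodes to a coefficient in [-8, 7], so the decoder does not need a precondition on its input.

```c
#include <stddef.h>
#include <stdint.h>

#define POLY_N 256             /* coefficients per ML-KEM polynomial */
#define EPP_BYTES (POLY_N / 2) /* two 4-bit nibbles per byte = 128 bytes */

/* Hypothetical helper (not from the PR): pack coefficients, assumed to lie
 * in [-8, 7], as 4-bit two's-complement nibbles, low nibble first. */
static void epp_pack_sketch(uint8_t out[EPP_BYTES], const int16_t c[POLY_N])
{
  size_t i;
  for (i = 0; i < EPP_BYTES; i++)
  {
    /* Converting to uint16_t is well defined for negative values */
    uint8_t lo = (uint8_t)((uint16_t)c[2 * i] & 0xF);
    uint8_t hi = (uint8_t)((uint16_t)c[2 * i + 1] & 0xF);
    out[i] = (uint8_t)(lo | (hi << 4));
  }
}

/* Sign-extend a 4-bit value: maps 0..7 to 0..7 and 8..15 to -8..-1. */
static int16_t sext4_sketch(uint8_t nibble)
{
  return (int16_t)((int)(nibble ^ 8u) - 8);
}

/* Hypothetical helper (not from the PR): unpack 128 bytes into 256
 * coefficients. Any byte value is accepted; outputs lie in [-8, 7]. */
static void epp_unpack_sketch(int16_t c[POLY_N], const uint8_t in[EPP_BYTES])
{
  size_t i;
  for (i = 0; i < EPP_BYTES; i++)
  {
    c[2 * i] = sext4_sketch((uint8_t)(in[i] & 0xF));
    c[2 * i + 1] = sext4_sketch((uint8_t)(in[i] >> 4));
  }
}
```

Under these assumptions, packing and then unpacking a polynomial with coefficients in [-2, 2] returns the original coefficients. A tighter 3-bit packing would save 32 bytes, but the byte-aligned nibble layout keeps both directions simple.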

@mkannwischer force-pushed the incremental-enc-api branch 2 times, most recently from 325ab51 to 285fc8a (March 12, 2026 05:37)
@mkannwischer added the "benchmark" label (this PR should be benchmarked in CI) on Mar 12, 2026
@oqs-bot (Contributor) left a comment

Intel Xeon 4th gen (c7i)

| Benchmark suite | Current: 856b540 | Previous: c0fb232 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 11667 cycles | 11774 cycles | 0.99 |
| ML-KEM-512 encaps | 13401 cycles | 13356 cycles | 1.00 |
| ML-KEM-512 decaps | 17333 cycles | 17522 cycles | 0.99 |
| ML-KEM-768 keypair | 20339 cycles | 20211 cycles | 1.01 |
| ML-KEM-768 encaps | 21438 cycles | 21480 cycles | 1.00 |
| ML-KEM-768 decaps | 27521 cycles | 27490 cycles | 1.00 |
| ML-KEM-1024 keypair | 28756 cycles | 28747 cycles | 1.00 |
| ML-KEM-1024 encaps | 30828 cycles | 30705 cycles | 1.00 |
| ML-KEM-1024 decaps | 38764 cycles | 38459 cycles | 1.01 |

This comment was automatically generated by workflow using github-action-benchmark.

@oqs-bot (Contributor) left a comment

ppc64le (POWER10) benchmarks

| Benchmark suite | Current: 856b540 | Previous: c0fb232 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 59376 cycles | 59560 cycles | 1.00 |
| ML-KEM-512 encaps | 72055 cycles | 72057 cycles | 1.00 |
| ML-KEM-512 decaps | 91812 cycles | 91947 cycles | 1.00 |
| ML-KEM-768 keypair | 98208 cycles | 98659 cycles | 1.00 |
| ML-KEM-768 encaps | 114736 cycles | 115076 cycles | 1.00 |
| ML-KEM-768 decaps | 140432 cycles | 140831 cycles | 1.00 |
| ML-KEM-1024 keypair | 148862 cycles | 148847 cycles | 1.00 |
| ML-KEM-1024 encaps | 167902 cycles | 167928 cycles | 1.00 |
| ML-KEM-1024 decaps | 198941 cycles | 199093 cycles | 1.00 |

@oqs-bot (Contributor) left a comment

AMD EPYC 3rd gen (c6a)

| Benchmark suite | Current: 856b540 | Previous: c0fb232 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 13939 cycles | 13907 cycles | 1.00 |
| ML-KEM-512 encaps | 15689 cycles | 15691 cycles | 1.00 |
| ML-KEM-512 decaps | 21157 cycles | 21253 cycles | 1.00 |
| ML-KEM-768 keypair | 23701 cycles | 23709 cycles | 1.00 |
| ML-KEM-768 encaps | 25099 cycles | 25155 cycles | 1.00 |
| ML-KEM-768 decaps | 33133 cycles | 33007 cycles | 1.00 |
| ML-KEM-1024 keypair | 33205 cycles | 33204 cycles | 1.00 |
| ML-KEM-1024 encaps | 35665 cycles | 35641 cycles | 1.00 |
| ML-KEM-1024 decaps | 46453 cycles | 46195 cycles | 1.01 |

@oqs-bot (Contributor) left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'AMD EPYC 3rd gen (c6a)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

| Benchmark suite | Current: a4e4e31 | Previous: 2bf8e59 | Ratio |
|---|---|---|---|
| ML-KEM-512 encaps | 16707 cycles | 15974 cycles | 1.05 |
| ML-KEM-768 decaps | 35711 cycles | 33345 cycles | 1.07 |
| ML-KEM-1024 decaps | 50650 cycles | 46735 cycles | 1.08 |

@oqs-bot (Contributor) left a comment

Intel Xeon 4th gen (c7i) (no-opt)

| Benchmark suite | Current: 856b540 | Previous: c0fb232 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 28423 cycles | 28218 cycles | 1.01 |
| ML-KEM-512 encaps | 35312 cycles | 36635 cycles | 0.96 |
| ML-KEM-512 decaps | 45241 cycles | 45192 cycles | 1.00 |
| ML-KEM-768 keypair | 46322 cycles | 46296 cycles | 1.00 |
| ML-KEM-768 encaps | 55233 cycles | 55812 cycles | 0.99 |
| ML-KEM-768 decaps | 69681 cycles | 69913 cycles | 1.00 |
| ML-KEM-1024 keypair | 70870 cycles | 70293 cycles | 1.01 |
| ML-KEM-1024 encaps | 83960 cycles | 82553 cycles | 1.02 |
| ML-KEM-1024 decaps | 101882 cycles | 98932 cycles | 1.03 |

@oqs-bot (Contributor) left a comment

AMD EPYC 4th gen (c7a)

| Benchmark suite | Current: 856b540 | Previous: c0fb232 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 12697 cycles | 12706 cycles | 1.00 |
| ML-KEM-512 encaps | 14226 cycles | 14177 cycles | 1.00 |
| ML-KEM-512 decaps | 19050 cycles | 19036 cycles | 1.00 |
| ML-KEM-768 keypair | 21894 cycles | 21905 cycles | 1.00 |
| ML-KEM-768 encaps | 22989 cycles | 22946 cycles | 1.00 |
| ML-KEM-768 decaps | 30055 cycles | 29897 cycles | 1.01 |
| ML-KEM-1024 keypair | 30714 cycles | 30697 cycles | 1.00 |
| ML-KEM-1024 encaps | 32722 cycles | 32787 cycles | 1.00 |
| ML-KEM-1024 decaps | 42327 cycles | 42190 cycles | 1.00 |

@oqs-bot (Contributor) left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'AMD EPYC 4th gen (c7a)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

| Benchmark suite | Current: a4e4e31 | Previous: 2bf8e59 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 13236 cycles | 12779 cycles | 1.04 |
| ML-KEM-512 encaps | 15642 cycles | 14273 cycles | 1.10 |
| ML-KEM-768 decaps | 32957 cycles | 30058 cycles | 1.10 |
| ML-KEM-1024 keypair | 34340 cycles | 32987 cycles | 1.04 |
| ML-KEM-1024 decaps | 47071 cycles | 42393 cycles | 1.11 |

@oqs-bot (Contributor) left a comment

Intel Xeon 3rd gen (c6i)

| Benchmark suite | Current: 856b540 | Previous: c0fb232 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 17471 cycles | 17431 cycles | 1.00 |
| ML-KEM-512 encaps | 19845 cycles | 19836 cycles | 1.00 |
| ML-KEM-512 decaps | 26406 cycles | 26354 cycles | 1.00 |
| ML-KEM-768 keypair | 29863 cycles | 29796 cycles | 1.00 |
| ML-KEM-768 encaps | 31769 cycles | 31052 cycles | 1.02 |
| ML-KEM-768 decaps | 41439 cycles | 41419 cycles | 1.00 |
| ML-KEM-1024 keypair | 42329 cycles | 42318 cycles | 1.00 |
| ML-KEM-1024 encaps | 45595 cycles | 45892 cycles | 0.99 |
| ML-KEM-1024 decaps | 59304 cycles | 61098 cycles | 0.97 |

@oqs-bot (Contributor) left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Intel Xeon 3rd gen (c6i)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

| Benchmark suite | Current: a4e4e31 | Previous: 2bf8e59 | Ratio |
|---|---|---|---|
| ML-KEM-512 encaps | 20660 cycles | 19953 cycles | 1.04 |
| ML-KEM-768 keypair | 32264 cycles | 31153 cycles | 1.04 |
| ML-KEM-1024 decaps | 61128 cycles | 58193 cycles | 1.05 |

@oqs-bot (Contributor) left a comment

AMD EPYC 3rd gen (c6a) (no-opt)

| Benchmark suite | Current: 856b540 | Previous: c0fb232 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 40231 cycles | 40276 cycles | 1.00 |
| ML-KEM-512 encaps | 48480 cycles | 48441 cycles | 1.00 |
| ML-KEM-512 decaps | 62705 cycles | 62607 cycles | 1.00 |
| ML-KEM-768 keypair | 63832 cycles | 63754 cycles | 1.00 |
| ML-KEM-768 encaps | 74842 cycles | 75005 cycles | 1.00 |
| ML-KEM-768 decaps | 93488 cycles | 93641 cycles | 1.00 |
| ML-KEM-1024 keypair | 95299 cycles | 95232 cycles | 1.00 |
| ML-KEM-1024 encaps | 109171 cycles | 109421 cycles | 1.00 |
| ML-KEM-1024 decaps | 132011 cycles | 132194 cycles | 1.00 |

@oqs-bot (Contributor) left a comment

AMD EPYC 4th gen (c7a) (no-opt)

| Benchmark suite | Current: 856b540 | Previous: c0fb232 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 36582 cycles | 36601 cycles | 1.00 |
| ML-KEM-512 encaps | 43100 cycles | 43070 cycles | 1.00 |
| ML-KEM-512 decaps | 55713 cycles | 55708 cycles | 1.00 |
| ML-KEM-768 keypair | 58695 cycles | 58652 cycles | 1.00 |
| ML-KEM-768 encaps | 67682 cycles | 67635 cycles | 1.00 |
| ML-KEM-768 decaps | 84507 cycles | 84425 cycles | 1.00 |
| ML-KEM-1024 keypair | 89091 cycles | 88991 cycles | 1.00 |
| ML-KEM-1024 encaps | 99378 cycles | 99229 cycles | 1.00 |
| ML-KEM-1024 decaps | 121053 cycles | 120563 cycles | 1.00 |

@oqs-bot (Contributor) left a comment

Arm Cortex-A76 (Raspberry Pi 5) benchmarks

| Benchmark suite | Current: a4e4e31 | Previous: 2bf8e59 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 28285 cycles | 28220 cycles | 1.00 |
| ML-KEM-512 encaps | 34092 cycles | 34106 cycles | 1.00 |
| ML-KEM-512 decaps | 44329 cycles | 44333 cycles | 1.00 |
| ML-KEM-768 keypair | 47645 cycles | 47614 cycles | 1.00 |
| ML-KEM-768 encaps | 53834 cycles | 53939 cycles | 1.00 |
| ML-KEM-768 decaps | 68301 cycles | 68365 cycles | 1.00 |
| ML-KEM-1024 keypair | 70227 cycles | 70253 cycles | 1.00 |
| ML-KEM-1024 encaps | 78707 cycles | 78729 cycles | 1.00 |
| ML-KEM-1024 decaps | 98290 cycles | 98443 cycles | 1.00 |

@oqs-bot (Contributor) left a comment

Graviton4

| Benchmark suite | Current: 856b540 | Previous: c0fb232 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 17676 cycles | 17646 cycles | 1.00 |
| ML-KEM-512 encaps | 20593 cycles | 20606 cycles | 1.00 |
| ML-KEM-512 decaps | 27028 cycles | 27084 cycles | 1.00 |
| ML-KEM-768 keypair | 29923 cycles | 29905 cycles | 1.00 |
| ML-KEM-768 encaps | 32788 cycles | 32773 cycles | 1.00 |
| ML-KEM-768 decaps | 41939 cycles | 41963 cycles | 1.00 |
| ML-KEM-1024 keypair | 43711 cycles | 43739 cycles | 1.00 |
| ML-KEM-1024 encaps | 48758 cycles | 48736 cycles | 1.00 |
| ML-KEM-1024 decaps | 61406 cycles | 61382 cycles | 1.00 |

@oqs-bot (Contributor) left a comment

Intel Xeon 3rd gen (c6i) (no-opt)

| Benchmark suite | Current: 856b540 | Previous: c0fb232 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 45684 cycles | 45722 cycles | 1.00 |
| ML-KEM-512 encaps | 54598 cycles | 54423 cycles | 1.00 |
| ML-KEM-512 decaps | 69928 cycles | 69779 cycles | 1.00 |
| ML-KEM-768 keypair | 73225 cycles | 74154 cycles | 0.99 |
| ML-KEM-768 encaps | 86160 cycles | 86032 cycles | 1.00 |
| ML-KEM-768 decaps | 106234 cycles | 106582 cycles | 1.00 |
| ML-KEM-1024 keypair | 112133 cycles | 112073 cycles | 1.00 |
| ML-KEM-1024 encaps | 124870 cycles | 124711 cycles | 1.00 |
| ML-KEM-1024 decaps | 150839 cycles | 150591 cycles | 1.00 |

@oqs-bot (Contributor) left a comment

Graviton4 (no-opt)

| Benchmark suite | Current: 856b540 | Previous: c0fb232 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 35448 cycles | 35408 cycles | 1.00 |
| ML-KEM-512 encaps | 41305 cycles | 40111 cycles | 1.03 |
| ML-KEM-512 decaps | 51288 cycles | 51135 cycles | 1.00 |
| ML-KEM-768 keypair | 56738 cycles | 56671 cycles | 1.00 |
| ML-KEM-768 encaps | 64836 cycles | 65149 cycles | 1.00 |
| ML-KEM-768 decaps | 79062 cycles | 79291 cycles | 1.00 |
| ML-KEM-1024 keypair | 88013 cycles | 87860 cycles | 1.00 |
| ML-KEM-1024 encaps | 97113 cycles | 96876 cycles | 1.00 |
| ML-KEM-1024 decaps | 116135 cycles | 115825 cycles | 1.00 |

@oqs-bot (Contributor) left a comment

Graviton3

| Benchmark suite | Current: 856b540 | Previous: c0fb232 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 18674 cycles | 18640 cycles | 1.00 |
| ML-KEM-512 encaps | 21835 cycles | 21878 cycles | 1.00 |
| ML-KEM-512 decaps | 28794 cycles | 28869 cycles | 1.00 |
| ML-KEM-768 keypair | 31593 cycles | 31542 cycles | 1.00 |
| ML-KEM-768 encaps | 34796 cycles | 34773 cycles | 1.00 |
| ML-KEM-768 decaps | 44735 cycles | 44779 cycles | 1.00 |
| ML-KEM-1024 keypair | 46064 cycles | 46077 cycles | 1.00 |
| ML-KEM-1024 encaps | 51462 cycles | 51494 cycles | 1.00 |
| ML-KEM-1024 decaps | 65067 cycles | 65017 cycles | 1.00 |

@oqs-bot (Contributor) left a comment

Graviton2

| Benchmark suite | Current: 856b540 | Previous: c0fb232 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 28337 cycles | 28270 cycles | 1.00 |
| ML-KEM-512 encaps | 34209 cycles | 34120 cycles | 1.00 |
| ML-KEM-512 decaps | 44538 cycles | 44375 cycles | 1.00 |
| ML-KEM-768 keypair | 47612 cycles | 47674 cycles | 1.00 |
| ML-KEM-768 encaps | 53936 cycles | 53909 cycles | 1.00 |
| ML-KEM-768 decaps | 68333 cycles | 68363 cycles | 1.00 |
| ML-KEM-1024 keypair | 70349 cycles | 70257 cycles | 1.00 |
| ML-KEM-1024 encaps | 78617 cycles | 78760 cycles | 1.00 |
| ML-KEM-1024 decaps | 98461 cycles | 98451 cycles | 1.00 |

@oqs-bot (Contributor) left a comment

Graviton3 (no-opt)

| Benchmark suite | Current: 856b540 | Previous: c0fb232 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 38934 cycles | 38890 cycles | 1.00 |
| ML-KEM-512 encaps | 46774 cycles | 44600 cycles | 1.05 |
| ML-KEM-512 decaps | 56788 cycles | 56685 cycles | 1.00 |
| ML-KEM-768 keypair | 62284 cycles | 62295 cycles | 1.00 |
| ML-KEM-768 encaps | 71210 cycles | 72323 cycles | 0.98 |
| ML-KEM-768 decaps | 86947 cycles | 87695 cycles | 0.99 |
| ML-KEM-1024 keypair | 96359 cycles | 96156 cycles | 1.00 |
| ML-KEM-1024 encaps | 106402 cycles | 106137 cycles | 1.00 |
| ML-KEM-1024 decaps | 126922 cycles | 126582 cycles | 1.00 |

@oqs-bot (Contributor) left a comment

Graviton2 (no-opt)

| Benchmark suite | Current: 856b540 | Previous: c0fb232 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 59254 cycles | 59136 cycles | 1.00 |
| ML-KEM-512 encaps | 69196 cycles | 68627 cycles | 1.01 |
| ML-KEM-512 decaps | 87340 cycles | 87348 cycles | 1.00 |
| ML-KEM-768 keypair | 95410 cycles | 95336 cycles | 1.00 |
| ML-KEM-768 encaps | 110535 cycles | 109885 cycles | 1.01 |
| ML-KEM-768 decaps | 134324 cycles | 134360 cycles | 1.00 |
| ML-KEM-1024 keypair | 145962 cycles | 147936 cycles | 0.99 |
| ML-KEM-1024 encaps | 161958 cycles | 163772 cycles | 0.99 |
| ML-KEM-1024 decaps | 193999 cycles | 195429 cycles | 0.99 |

@oqs-bot (Contributor) commented Mar 12, 2026

CBMC Results (ML-KEM-512)

⚠️ Attention Required

| Proof | Status | Current | Previous | Change |
|---|---|---|---|---|
| mlk_enc_derand_u | | - | - | - |
| mlk_enc_v | | - | - | - |
| mlk_indcpa_enc | | - | 162s | - |
| mlk_indcpa_enc_u | | - | - | - |
| mlk_indcpa_enc_v | | - | - | - |
Full Results (194 proofs)
Proof Status Current Previous Change
**TOTAL** 1245s 1248s -0.2%
mlk_indcpa_keypair_derand 268s 238s +13%
mlk_rej_uniform_c 152s 109s +39%
mlk_polyvec_basemul_acc_montgomery_cached_c 56s 47s +19%
mlk_ntt_layer 40s 29s +38%
mlk_poly_rej_uniform 34s 29s +17%
mlk_keccak_squeezeblocks_x4 32s 25s +28%
poly_ntt_native 26s 25s +4%
mlk_poly_reduce_native 25s 20s +25%
keccakf1600x4_permute_native_x4 20s 16s +25%
mlk_fqmul 20s 15s +33%
mlk_indcpa_dec 18s 14s +29%
mlk_poly_decompress_d4_native 17s 12s +42%
mlk_poly_decompress_d10_native 16s 14s +14%
mlk_polyvec_add 15s 11s +36%
mlk_keccak_squeezeblocks 11s 8s +38%
mlk_poly_frommsg 11s 9s +22%
mlk_poly_frombytes_native 9s 7s +29%
mlk_poly_rej_uniform_x4 9s 7s +29%
mlk_keccak_absorb_once_x4 8s 5s +60%
mlk_poly_ntt 8s 8s +0%
polyvec_basemul_acc_montgomery_cached_k4_native_x86_64 8s 1s +700%
polyvec_basemul_acc_montgomery_cached_native 8s 7s +14%
mlk_poly_cbd_eta2 7s 4s +75%
mlk_keccak_squeeze_once 6s 8s -25%
mlk_ntt_butterfly_block 6s 7s -14%
mlk_poly_decompress_d5_c 6s 1s +500%
mlk_poly_getnoise_eta1_4x 6s 3s +100%
mlk_polyvec_tomont 6s 3s +100%
poly_decompress_d10_native_x86_64 6s 5s +20%
poly_decompress_d4_native_x86_64 6s 3s +100%
kem_enc_derand 5s 3s +67%
mlk_check_pct 5s 3s +67%
mlk_invntt_layer 5s 5s +0%
mlk_keccak_absorb_once 5s 4s +25%
mlk_polyvec_frombytes 5s 2s +150%
mlk_scalar_compress_d11 5s 1s +400%
nttunpack_native_x86_64 5s 3s +67%
poly_frombytes_native_x86_64 5s 5s +0%
rej_uniform_native_x86_64 5s 5s +0%
keccak_f1600_x1_native_aarch64 4s 1s +300%
keccakf1600x4_extract_bytes_native 4s 6s -33%
kem_dec 4s 4s +0%
mlk_barrett_reduce 4s 4s +0%
mlk_gen_matrix_serial 4s 4s +0%
mlk_keccakf1600_permute_c 4s 4s +0%
mlk_keccakf1600x4_extract_bytes_c 4s 1s +300%
mlk_poly_mulcache_compute_c 4s 3s +33%
mlk_poly_mulcache_compute_native 4s 2s +100%
mlk_poly_tomont 4s 2s +100%
mlk_polyvec_permute_bitrev_to_custom 4s 2s +100%
mlk_shake128_squeezeblocks 4s 2s +100%
mlk_shake256x4 4s 4s +0%
mlk_value_barrier_u32 4s 1s +300%
poly_compress_d10_native_x86_64 4s 5s -20%
rej_uniform_native 4s 2s +100%
intt_native_x86_64 3s 3s +0%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 3s 2s +50%
keccakf1600x4_xor_bytes_native 3s 3s +0%
kem_enc 3s 3s +0%
mlk_ct_cmask_neg_i16 3s 2s +50%
mlk_ct_cmask_nonzero_u8 3s 1s +200%
mlk_ct_memcmp 3s 2s +50%
mlk_ct_sel_int16 3s 2s +50%
mlk_gen_matrix 3s 2s +50%
mlk_keccakf1600_xor_bytes 3s 1s +200%
mlk_keccakf1600_xor_bytes (big endian) 3s 3s +0%
mlk_keccakf1600x4_xor_bytes 3s 2s +50%
mlk_keccakf1600x4_xor_bytes_c 3s 2s +50%
mlk_matvec_mul 3s 3s +0%
mlk_poly_compress_d10_c 3s 3s +0%
mlk_poly_compress_d4_c 3s 3s +0%
mlk_poly_compress_d4_native 3s 3s +0%
mlk_poly_compress_d5_c 3s 2s +50%
mlk_poly_compress_d5_native 3s 2s +50%
mlk_poly_decompress_d11 3s 1s +200%
mlk_poly_decompress_d4_c 3s 2s +50%
mlk_poly_decompress_d5_native 3s 4s -25%
mlk_poly_decompress_du 3s 1s +200%
mlk_poly_invntt_tomont_c 3s 2s +50%
mlk_poly_ntt_c 3s 3s +0%
mlk_poly_tobytes_native 3s 3s +0%
mlk_polyvec_compress_du 3s 4s -25%
mlk_polyvec_invntt_tomont 3s 1s +200%
mlk_polyvec_ntt 3s 2s +50%
mlk_polyvec_permute_bitrev_to_custom_native 3s 3s +0%
mlk_polyvec_reduce 3s 2s +50%
mlk_scalar_compress_d10 3s 1s +200%
mlk_scalar_compress_d4 3s 2s +50%
mlk_sha3_256 3s 3s +0%
mlk_shake128x4_absorb_once 3s 1s +200%
mlk_shake128x4_squeezeblocks 3s 1s +200%
mlk_value_barrier_u8 3s 2s +50%
ntt_native_x86_64 3s 3s +0%
poly_getnoise_eta1122_4x_native 3s 2s +50%
poly_reduce_native_aarch64 3s 3s +0%
poly_reduce_native_x86_64 3s 2s +50%
intt_native_aarch64 2s 2s +0%
keccak_f1600_x1_native_aarch64_v84a 2s 3s -33%
keccak_f1600_x4_native_aarch64_v84a 2s 1s +100%
keccak_f1600_x4_native_avx2 2s 3s -33%
keccakf1600_permute_native 2s 1s +100%
kem_check_pk 2s 2s +0%
kem_keypair 2s 2s +0%
kem_keypair_derand 2s 5s -60%
mlk_ct_cmask_nonzero_u16 2s 2s +0%
mlk_ct_get_optblocker_u32 2s 2s +0%
mlk_ct_sel_uint8 2s 2s +0%
mlk_keccakf1600_extract_bytes 2s 3s -33%
mlk_keccakf1600_permute 2s 4s -50%
mlk_keccakf1600x4_permute 2s 1s +100%
mlk_keypair_getnoise_eta1 2s 3s -33%
mlk_montgomery_reduce 2s 2s +0%
mlk_poly_cbd_eta1 2s 3s -33%
mlk_poly_compress_d10_native 2s 2s +0%
mlk_poly_compress_d11_native 2s 3s -33%
mlk_poly_compress_d4 2s 2s +0%
mlk_poly_compress_d5 2s 1s +100%
mlk_poly_compress_du 2s 2s +0%
mlk_poly_compress_dv 2s 2s +0%
mlk_poly_decompress_d10 2s 3s -33%
mlk_poly_decompress_d10_c 2s 2s +0%
mlk_poly_decompress_d11_c 2s 4s -50%
mlk_poly_decompress_d5 2s 3s -33%
mlk_poly_frombytes 2s 2s +0%
mlk_poly_getnoise_eta1122_4x 2s 2s +0%
mlk_poly_getnoise_eta1_4x_native 2s 4s -50%
mlk_poly_invntt_tomont 2s 2s +0%
mlk_poly_mulcache_compute 2s 3s -33%
mlk_poly_sub 2s 3s -33%
mlk_poly_tobytes 2s 2s +0%
mlk_poly_tomont_c 2s 1s +100%
mlk_polymat_permute_bitrev_to_custom 2s 4s -50%
mlk_polyvec_basemul_acc_montgomery_cached 2s 3s -33%
mlk_polyvec_decompress_du 2s 3s -33%
mlk_polyvec_mulcache_compute 2s 3s -33%
mlk_polyvec_tobytes 2s 2s +0%
mlk_rej_uniform 2s 2s +0%
mlk_scalar_compress_d1 2s 3s -33%
mlk_scalar_compress_d5 2s 3s -33%
mlk_scalar_decompress_d10 2s 2s +0%
mlk_scalar_decompress_d11 2s 2s +0%
mlk_scalar_decompress_d4 2s 3s -33%
mlk_scalar_decompress_d5 2s 3s -33%
mlk_scalar_signed_to_unsigned_q 2s 4s -50%
mlk_shake128_absorb_once 2s 1s +100%
mlk_shake256 2s 2s +0%
mlk_value_barrier_i32 2s 3s -33%
ntt_native_aarch64 2s 2s +0%
poly_compress_d4_native_x86_64 2s 4s -50%
poly_decompress_d11_native_x86_64 2s 2s +0%
poly_decompress_d5_native_x86_64 2s 1s +100%
poly_tobytes_native_aarch64 2s 3s -33%
poly_tobytes_native_x86_64 2s 2s +0%
polyvec_basemul_acc_montgomery_cached_k2_native_x86_64 2s 3s -33%
polyvec_basemul_acc_montgomery_cached_k3_native_x86_64 2s 1s +100%
polyvec_basemul_acc_montgomery_cached_k4_native_aarch64 2s 3s -33%
mlk_enc_derand_u - - -
mlk_enc_v - - -
mlk_indcpa_enc - 162s -
mlk_indcpa_enc_u - - -
mlk_indcpa_enc_v - - -
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 1s 4s -75%
kem_check_sk 1s 2s -50%
mlk_ct_cmov_zero 1s 2s -50%
mlk_ct_get_optblocker_i32 1s 2s -50%
mlk_ct_get_optblocker_u8 1s 1s +0%
mlk_keccakf1600_extract_bytes (big endian) 1s 2s -50%
mlk_keccakf1600x4_extract_bytes 1s 4s -75%
mlk_poly_add 1s 3s -67%
mlk_poly_compress_d10 1s 3s -67%
mlk_poly_compress_d11 1s 1s +0%
mlk_poly_compress_d11_c 1s 2s -50%
mlk_poly_decompress_d11_native 1s 3s -67%
mlk_poly_decompress_d4 1s 3s -67%
mlk_poly_decompress_dv 1s 2s -50%
mlk_poly_frombytes_c 1s 2s -50%
mlk_poly_getnoise_eta2 1s 2s -50%
mlk_poly_reduce 1s 3s -67%
mlk_poly_reduce_c 1s 2s -50%
mlk_poly_tobytes_c 1s 1s +0%
mlk_poly_tomont_native 1s 1s +0%
mlk_poly_tomsg 1s 3s -67%
mlk_sha3_512 1s 1s +0%
poly_compress_d11_native_x86_64 1s 3s -67%
poly_compress_d5_native_x86_64 1s 1s +0%
poly_invntt_tomont_native 1s 4s -75%
poly_mulcache_compute_native_aarch64 1s 2s -50%
poly_mulcache_compute_native_x86_64 1s 2s -50%
poly_tomont_native_aarch64 1s 3s -67%
poly_tomont_native_x86_64 1s 2s -50%
polyvec_basemul_acc_montgomery_cached_k2_native_aarch64 1s 3s -67%
polyvec_basemul_acc_montgomery_cached_k3_native_aarch64 1s 1s +0%
rej_uniform_native_aarch64 1s 3s -67%
sys_check_capability 1s 3s -67%

@oqs-bot (Contributor) commented Mar 12, 2026

CBMC Results (ML-KEM-768)

⚠️ Attention Required

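| Proof | Status | Current | Previous | Change |
|---|---|---|---|---|
| mlk_enc_derand_u | | - | - | - |
| mlk_enc_v | | - | - | - |
| mlk_indcpa_enc | | - | 160s | - |
| mlk_indcpa_enc_u | | - | - | - |
| mlk_indcpa_enc_v | | - | - | - |
| mlk_ntt_layer | ⚠️ | 42s | 28s | +50% |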
Full Results (194 proofs)
Proof Status Current Previous Change
**TOTAL** 1150s 1167s -1.5%
mlk_indcpa_keypair_derand 204s 182s +12%
mlk_rej_uniform_c 150s 113s +33%
mlk_polyvec_basemul_acc_montgomery_cached_c 48s 40s +20%
mlk_ntt_layer ⚠️ 42s 28s +50%
mlk_poly_rej_uniform 33s 31s +6%
poly_ntt_native 28s 20s +40%
mlk_keccak_squeezeblocks_x4 26s 22s +18%
mlk_poly_reduce_native 20s 19s +5%
polyvec_basemul_acc_montgomery_cached_native 20s 16s +25%
mlk_fqmul 17s 14s +21%
mlk_poly_decompress_d4_native 16s 13s +23%
keccakf1600x4_permute_native_x4 15s 15s +0%
mlk_poly_decompress_d10_native 15s 12s +25%
mlk_indcpa_dec 13s 11s +18%
mlk_polyvec_add 11s 7s +57%
mlk_keccak_squeezeblocks 10s 6s +67%
mlk_poly_frombytes_native 9s 7s +29%
mlk_poly_frommsg 9s 8s +12%
mlk_poly_ntt 9s 6s +50%
mlk_keccak_squeeze_once 8s 9s -11%
mlk_keccak_absorb_once_x4 7s 5s +40%
mlk_ntt_butterfly_block 7s 6s +17%
poly_decompress_d4_native_x86_64 7s 4s +75%
mlk_keccakf1600_permute_c 6s 7s -14%
mlk_poly_rej_uniform_x4 6s 5s +20%
poly_decompress_d10_native_x86_64 6s 5s +20%
rej_uniform_native_x86_64 6s 5s +20%
mlk_gen_matrix 5s 1s +400%
mlk_invntt_layer 5s 3s +67%
mlk_keccak_absorb_once 5s 4s +25%
mlk_polymat_permute_bitrev_to_custom 5s 3s +67%
mlk_shake256x4 5s 3s +67%
poly_frombytes_native_x86_64 5s 3s +67%
intt_native_aarch64 4s 2s +100%
keccakf1600x4_extract_bytes_native 4s 2s +100%
kem_dec 4s 5s -20%
mlk_poly_compress_d11 4s 2s +100%
mlk_poly_compress_d4_c 4s 3s +33%
mlk_poly_decompress_d11_native 4s 2s +100%
mlk_poly_reduce_c 4s 1s +300%
mlk_poly_tobytes 4s 4s +0%
mlk_poly_tomsg 4s 3s +33%
mlk_polyvec_decompress_du 4s 4s +0%
mlk_polyvec_permute_bitrev_to_custom 4s 3s +33%
mlk_scalar_compress_d5 4s 1s +300%
mlk_scalar_decompress_d11 4s 3s +33%
mlk_sha3_512 4s 2s +100%
mlk_value_barrier_u8 4s 2s +100%
polyvec_basemul_acc_montgomery_cached_k3_native_x86_64 4s 2s +100%
keccak_f1600_x4_native_aarch64_v84a 3s 2s +50%
keccak_f1600_x4_native_avx2 3s 1s +200%
keccakf1600_permute_native 3s 2s +50%
keccakf1600x4_xor_bytes_native 3s 1s +200%
kem_check_sk 3s 2s +50%
kem_enc 3s 3s +0%
kem_enc_derand 3s 1s +200%
mlk_ct_cmask_nonzero_u8 3s 2s +50%
mlk_ct_memcmp 3s 3s +0%
mlk_gen_matrix_serial 3s 3s +0%
mlk_keccakf1600_extract_bytes 3s 3s +0%
mlk_keccakf1600x4_extract_bytes 3s 2s +50%
mlk_matvec_mul 3s 6s -50%
mlk_montgomery_reduce 3s 2s +50%
mlk_poly_add 3s 2s +50%
mlk_poly_compress_d10 3s 2s +50%
mlk_poly_compress_d10_c 3s 3s +0%
mlk_poly_compress_d4 3s 3s +0%
mlk_poly_compress_d5 3s 3s +0%
mlk_poly_compress_du 3s 4s -25%
mlk_poly_compress_dv 3s 2s +50%
mlk_poly_decompress_d10_c 3s 1s +200%
mlk_poly_decompress_d4_c 3s 3s +0%
mlk_poly_decompress_dv 3s 2s +50%
mlk_poly_frombytes_c 3s 1s +200%
mlk_poly_getnoise_eta1122_4x 3s 2s +50%
mlk_poly_getnoise_eta1_4x 3s 1s +200%
mlk_poly_getnoise_eta1_4x_native 3s 2s +50%
mlk_poly_invntt_tomont 3s 1s +200%
mlk_poly_mulcache_compute 3s 4s -25%
mlk_poly_mulcache_compute_c 3s 2s +50%
mlk_poly_mulcache_compute_native 3s 3s +0%
mlk_poly_ntt_c 3s 1s +200%
mlk_poly_sub 3s 3s +0%
mlk_poly_tobytes_c 3s 4s -25%
mlk_poly_tobytes_native 3s 3s +0%
mlk_poly_tomont_native 3s 3s +0%
mlk_polyvec_frombytes 3s 2s +50%
mlk_polyvec_ntt 3s 1s +200%
mlk_polyvec_permute_bitrev_to_custom_native 3s 1s +200%
mlk_rej_uniform 3s 1s +200%
mlk_scalar_compress_d1 3s 3s +0%
mlk_scalar_compress_d10 3s 2s +50%
mlk_scalar_signed_to_unsigned_q 3s 3s +0%
mlk_shake128_absorb_once 3s 2s +50%
mlk_shake128x4_squeezeblocks 3s 4s -25%
ntt_native_aarch64 3s 1s +200%
nttunpack_native_x86_64 3s 4s -25%
poly_invntt_tomont_native 3s 4s -25%
poly_mulcache_compute_native_x86_64 3s 2s +50%
poly_tomont_native_aarch64 3s 3s +0%
polyvec_basemul_acc_montgomery_cached_k2_native_aarch64 3s 1s +200%
polyvec_basemul_acc_montgomery_cached_k4_native_aarch64 3s 2s +50%
polyvec_basemul_acc_montgomery_cached_k4_native_x86_64 3s 4s -25%
rej_uniform_native 3s 2s +50%
sys_check_capability 3s 3s +0%
intt_native_x86_64 2s 4s -50%
keccak_f1600_x1_native_aarch64_v84a 2s 4s -50%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 1s +100%
kem_keypair 2s 2s +0%
kem_keypair_derand 2s 2s +0%
mlk_barrett_reduce 2s 4s -50%
mlk_ct_cmask_neg_i16 2s 3s -33%
mlk_ct_cmask_nonzero_u16 2s 1s +100%
mlk_ct_get_optblocker_i32 2s 2s +0%
mlk_ct_get_optblocker_u8 2s 3s -33%
mlk_ct_sel_uint8 2s 1s +100%
mlk_keccakf1600_xor_bytes 2s 3s -33%
mlk_keccakf1600x4_permute 2s 3s -33%
mlk_keccakf1600x4_xor_bytes 2s 2s +0%
mlk_keccakf1600x4_xor_bytes_c 2s 3s -33%
mlk_keypair_getnoise_eta1 2s 3s -33%
mlk_poly_cbd_eta1 2s 2s +0%
mlk_poly_cbd_eta2 2s 2s +0%
mlk_poly_compress_d10_native 2s 3s -33%
mlk_poly_compress_d11_c 2s 2s +0%
mlk_poly_compress_d11_native 2s 1s +100%
mlk_poly_compress_d4_native 2s 3s -33%
mlk_poly_compress_d5_native 2s 1s +100%
mlk_poly_decompress_d10 2s 2s +0%
mlk_poly_decompress_d11 2s 2s +0%
mlk_poly_decompress_d4 2s 2s +0%
mlk_poly_decompress_d5 2s 2s +0%
mlk_poly_decompress_d5_c 2s 3s -33%
mlk_poly_decompress_du 2s 2s +0%
mlk_poly_frombytes 2s 2s +0%
mlk_poly_invntt_tomont_c 2s 1s +100%
mlk_poly_reduce 2s 2s +0%
mlk_polyvec_basemul_acc_montgomery_cached 2s 3s -33%
mlk_polyvec_compress_du 2s 2s +0%
mlk_polyvec_tobytes 2s 4s -50%
mlk_polyvec_tomont 2s 4s -50%
mlk_scalar_compress_d4 2s 2s +0%
mlk_scalar_decompress_d4 2s 2s +0%
mlk_scalar_decompress_d5 2s 2s +0%
mlk_sha3_256 2s 1s +100%
mlk_shake128_squeezeblocks 2s 2s +0%
mlk_shake256 2s 3s -33%
mlk_value_barrier_i32 2s 3s -33%
ntt_native_x86_64 2s 1s +100%
poly_compress_d10_native_x86_64 2s 4s -50%
poly_compress_d11_native_x86_64 2s 2s +0%
poly_compress_d5_native_x86_64 2s 2s +0%
poly_decompress_d11_native_x86_64 2s 1s +100%
poly_decompress_d5_native_x86_64 2s 3s -33%
poly_getnoise_eta1122_4x_native 2s 2s +0%
poly_mulcache_compute_native_aarch64 2s 3s -33%
poly_reduce_native_x86_64 2s 3s -33%
poly_tobytes_native_aarch64 2s 1s +100%
polyvec_basemul_acc_montgomery_cached_k2_native_x86_64 2s 2s +0%
polyvec_basemul_acc_montgomery_cached_k3_native_aarch64 2s 2s +0%
rej_uniform_native_aarch64 2s 4s -50%
mlk_enc_derand_u - - -
mlk_enc_v - - -
mlk_indcpa_enc - 160s -
mlk_indcpa_enc_u - - -
mlk_indcpa_enc_v - - -
keccak_f1600_x1_native_aarch64 1s 2s -50%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 1s 2s -50%
kem_check_pk 1s 3s -67%
mlk_check_pct 1s 2s -50%
mlk_ct_cmov_zero 1s 1s +0%
mlk_ct_get_optblocker_u32 1s 1s +0%
mlk_ct_sel_int16 1s 1s +0%
mlk_keccakf1600_extract_bytes (big endian) 1s 2s -50%
mlk_keccakf1600_permute 1s 2s -50%
mlk_keccakf1600_xor_bytes (big endian) 1s 1s +0%
mlk_keccakf1600x4_extract_bytes_c 1s 3s -67%
mlk_poly_compress_d5_c 1s 4s -75%
mlk_poly_decompress_d11_c 1s 3s -67%
mlk_poly_decompress_d5_native 1s 2s -50%
mlk_poly_getnoise_eta2 1s 2s -50%
mlk_poly_tomont 1s 3s -67%
mlk_poly_tomont_c 1s 3s -67%
mlk_polyvec_invntt_tomont 1s 1s +0%
mlk_polyvec_mulcache_compute 1s 2s -50%
mlk_polyvec_reduce 1s 2s -50%
mlk_scalar_compress_d11 1s 2s -50%
mlk_scalar_decompress_d10 1s 1s +0%
mlk_shake128x4_absorb_once 1s 6s -83%
mlk_value_barrier_u32 1s 2s -50%
poly_compress_d4_native_x86_64 1s 4s -75%
poly_reduce_native_aarch64 1s 3s -67%
poly_tobytes_native_x86_64 1s 2s -50%
poly_tomont_native_x86_64 1s 3s -67%

@oqs-bot (Contributor) commented Mar 12, 2026

CBMC Results (ML-KEM-1024)

⚠️ Attention Required

| Proof | Status | Current | Previous | Change |
|---|---|---|---|---|
| mlk_enc_derand_u | | - | - | - |
| mlk_enc_v | | - | - | - |
| mlk_indcpa_enc | | - | 143s | - |
| mlk_indcpa_enc_u | | - | - | - |
| mlk_indcpa_enc_v | | - | - | - |
Full Results (194 proofs)
Proof Status Current Previous Change
**TOTAL** 1108s 1264s -12.3%
mlk_rej_uniform_c 137s 132s +4%
mlk_indcpa_keypair_derand 125s 131s -5%
mlk_polyvec_basemul_acc_montgomery_cached_c 82s 81s +1%
polyvec_basemul_acc_montgomery_cached_native 34s 36s -6%
mlk_ntt_layer 32s 35s -9%
mlk_poly_rej_uniform 29s 32s -9%
poly_ntt_native 29s 27s +7%
mlk_keccak_squeezeblocks_x4 27s 27s +0%
mlk_poly_reduce_native 20s 22s -9%
keccakf1600x4_permute_native_x4 17s 18s -6%
mlk_fqmul 17s 15s +13%
mlk_poly_decompress_d11_native 15s 13s +15%
mlk_poly_decompress_d5_native 15s 16s -6%
mlk_polyvec_add 14s 13s +8%
mlk_poly_frommsg 11s 9s +22%
mlk_poly_frombytes_native 10s 9s +11%
mlk_indcpa_dec 9s 9s +0%
mlk_keccak_squeeze_once 9s 9s +0%
mlk_polymat_permute_bitrev_to_custom 9s 7s +29%
mlk_keccak_squeezeblocks 8s 9s -11%
mlk_invntt_layer 7s 5s +40%
mlk_ntt_butterfly_block 7s 8s -12%
mlk_poly_ntt 7s 7s +0%
mlk_keccak_absorb_once 6s 4s +50%
mlk_keccak_absorb_once_x4 6s 6s +0%
mlk_keccakf1600_permute_c 6s 5s +20%
mlk_poly_rej_uniform_x4 6s 9s -33%
rej_uniform_native_x86_64 6s 8s -25%
kem_dec 5s 7s -29%
mlk_gen_matrix 5s 6s -17%
mlk_gen_matrix_serial 5s 5s +0%
mlk_poly_compress_d11_c 5s 5s +0%
nttunpack_native_x86_64 5s 3s +67%
poly_compress_d5_native_x86_64 5s 4s +25%
poly_decompress_d11_native_x86_64 5s 5s +0%
poly_frombytes_native_x86_64 5s 5s +0%
rej_uniform_native_aarch64 5s 3s +67%
mlk_ct_sel_uint8 4s 3s +33%
mlk_keccakf1600_permute 4s 2s +100%
mlk_poly_cbd_eta1 4s 1s +300%
mlk_poly_compress_d4 4s 4s +0%
mlk_poly_decompress_d4 4s 3s +33%
mlk_poly_getnoise_eta1_4x 4s 3s +33%
mlk_poly_ntt_c 4s 3s +33%
mlk_poly_reduce_c 4s 1s +300%
mlk_polyvec_compress_du 4s 3s +33%
mlk_polyvec_permute_bitrev_to_custom_native 4s 4s +0%
mlk_scalar_decompress_d11 4s 1s +300%
mlk_scalar_decompress_d5 4s 3s +33%
ntt_native_x86_64 4s 4s +0%
poly_compress_d10_native_x86_64 4s 2s +100%
poly_decompress_d5_native_x86_64 4s 3s +33%
poly_mulcache_compute_native_aarch64 4s 3s +33%
poly_tobytes_native_x86_64 4s 2s +100%
polyvec_basemul_acc_montgomery_cached_k2_native_x86_64 4s 3s +33%
polyvec_basemul_acc_montgomery_cached_k4_native_x86_64 4s 2s +100%
intt_native_x86_64 3s 2s +50%
keccak_f1600_x1_native_aarch64 3s 2s +50%
keccak_f1600_x4_native_aarch64_v84a 3s 3s +0%
kem_check_sk 3s 2s +50%
kem_enc 3s 2s +50%
kem_keypair_derand 3s 3s +0%
mlk_check_pct 3s 2s +50%
mlk_ct_get_optblocker_i32 3s 1s +200%
mlk_keccakf1600_extract_bytes 3s 1s +200%
mlk_keccakf1600x4_extract_bytes 3s 1s +200%
mlk_keccakf1600x4_extract_bytes_c 3s 2s +50%
mlk_keccakf1600x4_permute 3s 1s +200%
mlk_poly_compress_d11_native 3s 2s +50%
mlk_poly_compress_d4_c 3s 3s +0%
mlk_poly_compress_d4_native 3s 3s +0%
mlk_poly_compress_d5_native 3s 1s +200%
mlk_poly_compress_du 3s 3s +0%
mlk_poly_decompress_d11 3s 2s +50%
mlk_poly_decompress_d11_c 3s 3s +0%
mlk_poly_decompress_d4_c 3s 2s +50%
mlk_poly_decompress_d4_native 3s 1s +200%
mlk_poly_decompress_d5_c 3s 3s +0%
mlk_poly_decompress_dv 3s 2s +50%
mlk_poly_frombytes 3s 2s +50%
mlk_poly_getnoise_eta1_4x_native 3s 2s +50%
mlk_poly_mulcache_compute_c 3s 5s -40%
mlk_poly_tomont_c 3s 1s +200%
mlk_poly_tomsg 3s 3s +0%
mlk_polyvec_decompress_du 3s 1s +200%
mlk_polyvec_mulcache_compute 3s 2s +50%
mlk_polyvec_ntt 3s 2s +50%
mlk_polyvec_tobytes 3s 3s +0%
mlk_scalar_compress_d1 3s 2s +50%
mlk_scalar_compress_d10 3s 3s +0%
mlk_scalar_signed_to_unsigned_q 3s 1s +200%
mlk_shake128_absorb_once 3s 2s +50%
mlk_shake256 3s 3s +0%
mlk_value_barrier_i32 3s 2s +50%
ntt_native_aarch64 3s 1s +200%
poly_compress_d11_native_x86_64 3s 2s +50%
poly_compress_d4_native_x86_64 3s 3s +0%
poly_getnoise_eta1122_4x_native 3s 3s +0%
poly_reduce_native_aarch64 3s 2s +50%
polyvec_basemul_acc_montgomery_cached_k2_native_aarch64 3s 2s +50%
polyvec_basemul_acc_montgomery_cached_k3_native_aarch64 3s 3s +0%
polyvec_basemul_acc_montgomery_cached_k4_native_aarch64 3s 3s +0%
intt_native_aarch64 2s 5s -60%
keccak_f1600_x1_native_aarch64_v84a 2s 2s +0%
keccak_f1600_x4_native_avx2 2s 3s -33%
keccakf1600x4_extract_bytes_native 2s 4s -50%
kem_enc_derand 2s 3s -33%
kem_keypair 2s 2s +0%
mlk_barrett_reduce 2s 4s -50%
mlk_ct_cmask_neg_i16 2s 3s -33%
mlk_ct_cmask_nonzero_u8 2s 2s +0%
mlk_ct_memcmp 2s 4s -50%
mlk_ct_sel_int16 2s 1s +100%
mlk_keccakf1600_extract_bytes (big endian) 2s 2s +0%
mlk_keccakf1600_xor_bytes 2s 2s +0%
mlk_keccakf1600x4_xor_bytes_c 2s 4s -50%
mlk_keypair_getnoise_eta1 2s 1s +100%
mlk_montgomery_reduce 2s 2s +0%
mlk_poly_add 2s 5s -60%
mlk_poly_cbd_eta2 2s 3s -33%
mlk_poly_compress_d10 2s 1s +100%
mlk_poly_compress_d10_c 2s 2s +0%
mlk_poly_compress_d10_native 2s 2s +0%
mlk_poly_compress_d11 2s 3s -33%
mlk_poly_compress_d5 2s 2s +0%
mlk_poly_compress_d5_c 2s 5s -60%
mlk_poly_compress_dv 2s 2s +0%
mlk_poly_decompress_d10 2s 3s -33%
mlk_poly_decompress_d10_c 2s 2s +0%
mlk_poly_decompress_d5 2s 1s +100%
mlk_poly_decompress_du 2s 4s -50%
mlk_poly_frombytes_c 2s 2s +0%
mlk_poly_getnoise_eta1122_4x 2s 1s +100%
mlk_poly_getnoise_eta2 2s 1s +100%
mlk_poly_invntt_tomont_c 2s 3s -33%
mlk_poly_mulcache_compute_native 2s 5s -60%
mlk_poly_reduce 2s 2s +0%
mlk_poly_sub 2s 4s -50%
mlk_poly_tobytes_native 2s 2s +0%
mlk_poly_tomont 2s 2s +0%
mlk_polyvec_frombytes 2s 3s -33%
mlk_polyvec_permute_bitrev_to_custom 2s 3s -33%
mlk_polyvec_reduce 2s 3s -33%
mlk_polyvec_tomont 2s 4s -50%
mlk_rej_uniform 2s 1s +100%
mlk_scalar_compress_d11 2s 1s +100%
mlk_scalar_compress_d5 2s 1s +100%
mlk_scalar_decompress_d10 2s 3s -33%
mlk_sha3_256 2s 3s -33%
mlk_sha3_512 2s 2s +0%
mlk_shake128x4_absorb_once 2s 1s +100%
mlk_shake128x4_squeezeblocks 2s 2s +0%
mlk_shake256x4 2s 3s -33%
mlk_value_barrier_u32 2s 2s +0%
mlk_value_barrier_u8 2s 2s +0%
poly_decompress_d10_native_x86_64 2s 2s +0%
poly_decompress_d4_native_x86_64 2s 3s -33%
poly_invntt_tomont_native 2s 3s -33%
poly_reduce_native_x86_64 2s 3s -33%
poly_tobytes_native_aarch64 2s 2s +0%
poly_tomont_native_aarch64 2s 2s +0%
poly_tomont_native_x86_64 2s 3s -33%
rej_uniform_native 2s 4s -50%
sys_check_capability 2s 3s -33%
mlk_enc_derand_u - - -
mlk_enc_v - - -
mlk_indcpa_enc - 143s -
mlk_indcpa_enc_u - - -
mlk_indcpa_enc_v - - -
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 1s 2s -50%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 1s 1s +0%
keccakf1600_permute_native 1s 1s +0%
keccakf1600x4_xor_bytes_native 1s 3s -67%
kem_check_pk 1s 4s -75%
mlk_ct_cmask_nonzero_u16 1s 2s -50%
mlk_ct_cmov_zero 1s 4s -75%
mlk_ct_get_optblocker_u32 1s 3s -67%
mlk_ct_get_optblocker_u8 1s 4s -75%
mlk_keccakf1600_xor_bytes (big endian) 1s 1s +0%
mlk_keccakf1600x4_xor_bytes 1s 1s +0%
mlk_matvec_mul 1s 3s -67%
mlk_poly_decompress_d10_native 1s 3s -67%
mlk_poly_invntt_tomont 1s 1s +0%
mlk_poly_mulcache_compute 1s 3s -67%
mlk_poly_tobytes 1s 2s -50%
mlk_poly_tobytes_c 1s 2s -50%
mlk_poly_tomont_native 1s 3s -67%
mlk_polyvec_basemul_acc_montgomery_cached 1s 3s -67%
mlk_polyvec_invntt_tomont 1s 2s -50%
mlk_scalar_compress_d4 1s 3s -67%
mlk_scalar_decompress_d4 1s 2s -50%
mlk_shake128_squeezeblocks 1s 3s -67%
poly_mulcache_compute_native_x86_64 1s 3s -67%
polyvec_basemul_acc_montgomery_cached_k3_native_x86_64 1s 4s -75%

@hanno-becker added and removed the benchmark label (this PR should be benchmarked in CI) Mar 13, 2026
Contributor

@hanno-becker hanno-becker left a comment

What's the purpose of 0a01cc4? Tests also serve as documentation, and using internal constants rather than public ones sets a wrong example.

If this is needed, can it be done in a preparatory PR? It seems unrelated to this PR.

@mkannwischer
Contributor Author

mkannwischer commented Mar 13, 2026

> What's the purpose of 0a01cc4? Tests also serve as documentation, and using internal constants rather than public ones sets a wrong example.
>
> If this is needed, can it be done in a preparatory PR? It seems unrelated to this PR.

The main question here is whether we want to add the new API to mlkem_native.h. If we don't, we can't test the API in the standard test_mlkem.c; we could instead add a separate test that includes kem.h but not mlkem_native.h.
The purpose of 0a01cc4 was to get something working first, so we can discuss how we want to proceed.

I agree with you that we don't want to keep it as is right now.

@hanno-becker
Contributor

Seeing that you also observed a slowdown on x86, I wonder if we should treat the incremental API as internal by default and only expose it in the public API if some new option MLK_CONFIG_ENABLE_MLKEM_BRAID is set?
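
A minimal sketch of what such opt-in gating could look like. All `DEMO_*` and `demo_*` names are hypothetical and chosen for illustration; they are not mlkem-native macros or functions, and the real option would presumably be wired through the existing API qualifiers instead. The switch is defined here only to simulate a build that opts in; without it the functions would get internal linkage.

```c
/* Hypothetical sketch: DEMO_* names are illustrative, not mlkem-native's. */

/* Simulate a build that opts in. In the proposal this would be off by
 * default, i.e. this line would be absent unless the user sets it. */
#define DEMO_CONFIG_ENABLE_MLKEM_BRAID

#if defined(DEMO_CONFIG_ENABLE_MLKEM_BRAID)
/* Opt-in: external linkage, the functions are part of the public API. */
#define DEMO_BRAID_API
#define DEMO_BRAID_API_IS_PUBLIC 1
#else
/* Default: internal linkage, the functions stay private to the build. */
#define DEMO_BRAID_API static
#define DEMO_BRAID_API_IS_PUBLIC 0
#endif

/* Placeholder bodies standing in for the two incremental phases. */
DEMO_BRAID_API int demo_kem_enc_u(void) { return 0; }
DEMO_BRAID_API int demo_kem_enc_v(void) { return 0; }
```

With the switch removed, a monolithic build that never calls the two functions would see them as unused static functions, which is the situation where a flag such as -Wno-unused-function becomes relevant.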

@hanno-becker added and removed the benchmark label (this PR should be benchmarked in CI) Mar 17, 2026
@mkannwischer force-pushed the incremental-enc-api branch 2 times, most recently from 4f0ace1 to 732adb5, on May 7, 2026 05:35
@mkannwischer added and removed the benchmark label (this PR should be benchmarked in CI) May 7, 2026
Contributor

@oqs-bot oqs-bot left a comment

Mac Mini (M1, 2020) benchmarks

Details
Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 12320 cycles 12320 cycles 1
ML-KEM-512 encaps 15047 cycles 14999 cycles 1.00
ML-KEM-512 decaps 19599 cycles 19552 cycles 1.00
ML-KEM-768 keypair 21264 cycles 21264 cycles 1
ML-KEM-768 encaps 23880 cycles 23870 cycles 1.00
ML-KEM-768 decaps 30427 cycles 30414 cycles 1.00
ML-KEM-1024 keypair 30323 cycles 30327 cycles 1.00
ML-KEM-1024 encaps 34616 cycles 34573 cycles 1.00
ML-KEM-1024 decaps 44229 cycles 44193 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Arm Cortex-A55 (Snapdragon 888) benchmarks

Details
Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 59787 cycles 59728 cycles 1.00
ML-KEM-512 encaps 67447 cycles 67429 cycles 1.00
ML-KEM-512 decaps 86139 cycles 86125 cycles 1.00
ML-KEM-768 keypair 97408 cycles 97470 cycles 1.00
ML-KEM-768 encaps 110758 cycles 110896 cycles 1.00
ML-KEM-768 decaps 137357 cycles 138405 cycles 0.99
ML-KEM-1024 keypair 154780 cycles 154989 cycles 1.00
ML-KEM-1024 encaps 171299 cycles 172090 cycles 1.00
ML-KEM-1024 decaps 207123 cycles 209372 cycles 0.99

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Arm Cortex-A72 (Raspberry Pi 4) benchmarks

Details
Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 50693 cycles 51223 cycles 0.99
ML-KEM-512 encaps 58494 cycles 59547 cycles 0.98
ML-KEM-512 decaps 74583 cycles 75793 cycles 0.98
ML-KEM-768 keypair 85700 cycles 86166 cycles 0.99
ML-KEM-768 encaps 93550 cycles 94272 cycles 0.99
ML-KEM-768 decaps 117423 cycles 117661 cycles 1.00
ML-KEM-1024 keypair 130295 cycles 129800 cycles 1.00
ML-KEM-1024 encaps 141861 cycles 142914 cycles 0.99
ML-KEM-1024 decaps 173922 cycles 174806 cycles 0.99

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

SpacemiT K1 8 (Banana Pi F3) benchmarks

Details
Benchmark suite Current: 856b540 Previous: c0fb232 Ratio
ML-KEM-512 keypair 155501 cycles 155510 cycles 1.00
ML-KEM-512 encaps 163235 cycles 163424 cycles 1.00
ML-KEM-512 decaps 206715 cycles 206679 cycles 1.00
ML-KEM-768 keypair 249857 cycles 249912 cycles 1.00
ML-KEM-768 encaps 270337 cycles 270404 cycles 1.00
ML-KEM-768 decaps 332607 cycles 332257 cycles 1.00
ML-KEM-1024 keypair 395706 cycles 396307 cycles 1.00
ML-KEM-1024 encaps 423713 cycles 423343 cycles 1.00
ML-KEM-1024 decaps 505216 cycles 507057 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@mkannwischer force-pushed the incremental-enc-api branch from 1ce787b to a4e4e31 on May 7, 2026 06:44
@mkannwischer added and removed the benchmark label (this PR should be benchmarked in CI) May 7, 2026
mkannwischer and others added 13 commits May 24, 2026 15:09
Split K-PKE.Encrypt and ML-KEM.Encaps into two phases (u and v) to
support protocols like MLKEMBraid that transmit large KEM components
in parallel over bandwidth-constrained channels.

CPA level (indcpa):
- mlk_indcpa_enc_u: computes ct_u from ek_seed, outputs intermediate
  state (sp, epp)
- mlk_indcpa_enc_v: computes ct_v from ek_vector using intermediate
  state from enc_u

CCA KEM level (kem):
- mlk_kem_enc_derand_u: FO transform + enc_u, outputs shared secret
  and intermediate state; only needs ek_seed and H(pk)
- mlk_kem_enc_v: modulus check on ek_vector + enc_v; only needs
  ek_vector

The test verifies that the incremental API produces the same
ciphertexts and shared secrets as the standard API across all three
parameter sets.

Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
Use mlk_kem_enc_derand_u + mlk_kem_enc_v as the single implementation
for both the standard and incremental encapsulation API. Serialize the
intermediate state (sp, epp) via 16-bit little-endian encoding into
separate buffers sp_serial[MLKEM_POLYVEC16_BYTES] and
epp_serial[MLKEM_POLY16_BYTES].

Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
Add CBMC contracts for mlk_indcpa_enc_u and mlk_indcpa_enc_v, including
an epp coefficient bound postcondition on enc_u (array_abs_bound ETA2+1)
and a matching precondition on enc_v (array_abs_bound 16).

Serialize epp as 4-bit nibbles (ETA2 - x) in 128 bytes instead of
16-bit LE (512 bytes), providing a natural coefficient bound on
deserialization. Revert mlk_kem_enc_derand to call mlk_indcpa_enc
directly, avoiding unnecessary serialization overhead.

Add CBMC proofs for indcpa_enc_u, indcpa_enc_v, kem_enc_derand_u,
and kem_enc_v. Update the indcpa_enc proof to compose enc_u and enc_v.

Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
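
A minimal sketch of the nibble encoding described in the commit message above, assuming ETA2 = 2 and 256 coefficients per polynomial (the ML-KEM parameters). The function names, the `EPP_*` constants, and the nibble order (even-indexed coefficient in the low nibble) are illustrative assumptions and are not taken from the PR. It shows why deserialization yields a coefficient bound for free: every possible byte decodes to two coefficients in [-13, 2], which stays within the absolute bound of 16 required by enc_v.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EPP_N 256               /* coefficients per polynomial (assumed) */
#define EPP_ETA2 2              /* ML-KEM noise parameter eta2 */
#define EPP_BYTES (EPP_N / 2)   /* two 4-bit nibbles per byte: 128 bytes */

/* Pack epp: each coefficient x in [-ETA2, ETA2] is stored as the
 * nibble (ETA2 - x), which lies in [0, 4]. Hypothetical helper. */
void epp_pack_nibbles(uint8_t out[EPP_BYTES], const int16_t epp[EPP_N])
{
  size_t i;
  for (i = 0; i < EPP_N / 2; i++)
  {
    /* Debug-only range check on the sender side. */
    assert(epp[2 * i] >= -EPP_ETA2 && epp[2 * i] <= EPP_ETA2);
    assert(epp[2 * i + 1] >= -EPP_ETA2 && epp[2 * i + 1] <= EPP_ETA2);

    uint8_t lo = (uint8_t)((EPP_ETA2 - epp[2 * i]) & 0x0F);
    uint8_t hi = (uint8_t)((EPP_ETA2 - epp[2 * i + 1]) & 0x0F);
    out[i] = (uint8_t)(lo | (hi << 4));
  }
}

/* Unpack epp: any nibble n in [0, 15] maps to ETA2 - n in [-13, 2],
 * so no precondition on the input bytes is needed. Hypothetical helper. */
void epp_unpack_nibbles(int16_t epp[EPP_N], const uint8_t in[EPP_BYTES])
{
  size_t i;
  for (i = 0; i < EPP_N / 2; i++)
  {
    epp[2 * i] = (int16_t)(EPP_ETA2 - (in[i] & 0x0F));
    epp[2 * i + 1] = (int16_t)(EPP_ETA2 - (in[i] >> 4));
  }
}
```

Compared with a 16-bit little-endian encoding (2 bytes per coefficient, 512 bytes), this uses half a byte per coefficient, and a receiver can accept arbitrary input bytes without a separate range check.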
Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
Change mlk_kem_enc_derand_u and mlk_kem_enc_v from MLK_INTERNAL_API
to MLK_EXTERNAL_API so they are not static in monolithic builds.
Add -Wno-unused-function to the monolithic_build_multilevel_native
example (matching mldsa-native) since those examples don't exercise
the incremental API.

Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
Signed-off-by: Matthias J. Kannwischer <matthias@zerorisc.com>
@mkannwischer force-pushed the incremental-enc-api branch from a4e4e31 to 856b540 on May 24, 2026 07:13
@mkannwischer added and removed the benchmark label (this PR should be benchmarked in CI) May 24, 2026

Labels

benchmark this PR should be benchmarked in CI

3 participants