x86_64: 32-byte align Keccak x4 AVX2 stack frame #1124

Merged: mkannwischer merged 1 commit into main from keccak_stack_align on May 21, 2026

@hanno-becker
Contributor

For better performance, align the stack to 32 bytes in the AVX2 x4 Keccak backend implementation.
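The arithmetic behind a 32-byte alignment can be sketched as follows. This is only an illustration of rounding an address down to a multiple of 32 (the width in bytes of one AVX2 `ymm` register); the address and the helper name `align_down` are hypothetical, and the actual instruction sequence used in the backend is not shown in this PR text.

```python
ALIGN = 32  # bytes; one AVX2 ymm register is 256 bits wide


def align_down(addr: int, align: int = ALIGN) -> int:
    """Round addr down to a multiple of align (align must be a power of two)."""
    # Clearing the low log2(align) bits is the same as AND-ing with -align.
    return addr & ~(align - 1)


# Hypothetical stack pointer value, chosen only for illustration.
rsp = 0x7FFD_1234_5678
aligned = align_down(rsp)

# The aligned value is a multiple of 32 and lies at most 31 bytes below rsp.
assert aligned % ALIGN == 0
assert 0 <= rsp - aligned < ALIGN
```

Because the distance `rsp - aligned` depends on the run-time value of the stack pointer, the offset from the aligned stack pointer to the caller's frame is not a compile-time constant, so the original stack pointer has to be kept somewhere (typically in a register) to be restored later.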

The HOL-Light proofs are updated accordingly. Unfortunately, no existing tactic for the extension from the 'core' to the 'subroutine' proofs can handle the pattern we use, so we fall back to some ad-hoc tactics. As and when s2n-bignum adds support for stack alignment to its automation, these ad-hoc tactics can hopefully be removed.

We also extend scripts/cfify to track the CFA register. A `mov rsp, %REG` re-anchors the CFA on REG, and subsequent modifications to RSP do not require CFI directives. We handle this by conditioning the subq/addq -> cfi_adjust_cfa_offset rules on the operand being the current CFA register.
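The CFA-register rule described above can be sketched roughly as follows. This is a minimal illustration and not the actual scripts/cfify code: the function name `annotate_cfi`, the regular expressions, and the restriction to AT&T-syntax `movq`/`subq`/`addq` with immediate operands are all assumptions made for the example. The emitted `.cfi_def_cfa_register` and `.cfi_adjust_cfa_offset` directives are standard GNU assembler CFI directives.

```python
import re

# Hypothetical patterns for a small subset of AT&T-syntax instructions.
MOV_FROM_RSP = re.compile(r"^\s*movq\s+%rsp\s*,\s*%(\w+)\s*$")
ADJUST = re.compile(r"^\s*(subq|addq)\s+\$(\d+)\s*,\s*%(\w+)\s*$")


def annotate_cfi(lines):
    """Return the input lines with CFI directives interleaved (sketch only)."""
    cfa_reg = "rsp"  # on function entry the CFA is anchored on RSP
    out = []
    for line in lines:
        out.append(line)

        mov = MOV_FROM_RSP.match(line)
        if mov and cfa_reg == "rsp":
            # Copying RSP into REG re-anchors the CFA on REG.
            cfa_reg = mov.group(1)
            out.append(f".cfi_def_cfa_register %{cfa_reg}")
            continue

        adj = ADJUST.match(line)
        # Only adjust the CFA offset when the operand is the current CFA register.
        if adj and adj.group(3) == cfa_reg:
            op, imm = adj.group(1), int(adj.group(2))
            delta = imm if op == "subq" else -imm
            out.append(f".cfi_adjust_cfa_offset {delta}")
    return out


example = [
    "subq $8, %rsp",     # CFA still on RSP: needs an offset adjustment
    "movq %rsp, %r11",   # CFA re-anchored on R11
    "andq $-32, %rsp",   # alignment of RSP: no directive needed
    "subq $800, %rsp",   # RSP is no longer the CFA register: no directive
]
annotated = annotate_cfi(example)
```

In this sketch, only the first `subq` receives a `.cfi_adjust_cfa_offset`, since the later one modifies RSP after the CFA has moved to R11. Restoring RSP from the saved register at function exit is deliberately left out.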

Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
@oqs-bot
Contributor

oqs-bot commented May 21, 2026

CBMC Results (ML-DSA-44, REDUCE-RAM)

Full Results (199 proofs)
Proof Status Current Previous Change
**TOTAL** 1393s 1412s -1.3%
poly_pointwise_montgomery_c 170s 164s +4%
rej_uniform_native 103s 107s -4%
mld_invntt_layer 99s 98s +1%
polyvec_matrix_pointwise_montgomery_yvec 89s 90s -1%
mld_ct_memcmp 67s 65s +3%
mld_ntt_layer 42s 41s +2%
fqmul 29s 27s +7%
mld_attempt_signature_generation 27s 26s +4%
sign_verify_internal 24s 23s +4%
keccakf1600x4_permute_native 21s 22s -5%
rej_uniform 19s 19s +0%
rej_uniform_c 19s 19s +0%
polyeta_unpack 16s 15s +7%
mld_ntt_butterfly_block 15s 14s +7%
poly_chknorm_c 14s 15s -7%
mld_check_pct 13s 16s -19%
poly_add 13s 12s +8%
polyz_unpack_c 13s 12s +8%
poly_uniform_eta_4x 11s 10s +10%
polyt0_unpack 10s 12s -17%
polyveck_chknorm 10s 11s -9%
keccak_absorb_once_x4 9s 10s -10%
poly_caddq_c 9s 8s +12%
poly_decompose_c 8s 8s +0%
poly_power2round 8s 6s +33%
polyvec_matrix_pointwise_montgomery_row 8s 7s +14%
keccak_absorb 7s 5s +40%
keccak_squeezeblocks_x4 7s 4s +75%
sign 7s 8s -12%
compute_pack_t0_t1 6s 6s +0%
keccakf1600x4_extract_bytes_native 6s 4s +50%
mld_compute_pack_z 6s 6s +0%
mld_keccakf1600_permute_c 6s 6s +0%
poly_invntt_tomont_c 6s 8s -25%
poly_uniform 6s 3s +100%
polyt0_pack 6s 3s +100%
sign_pk_from_sk 6s 4s +50%
sign_verify_pre_hash_internal 6s 3s +100%
mld_h 5s 6s -17%
pointwise_acc_native_x86_64 5s 5s +0%
poly_chknorm 5s 1s +400%
poly_decompose_32_native_aarch64 5s 3s +67%
poly_invntt_tomont 5s 3s +67%
poly_shiftl 5s 5s +0%
polyveck_invntt_tomont 5s 4s +25%
polyvecl_ntt 5s 4s +25%
polyz_unpack_17_native_aarch64 5s 3s +67%
sign_keypair 5s 4s +25%
sign_open 5s 6s -17%
sign_verify_extmu 5s 5s +0%
decompose 4s 5s -20%
keccakf1600_xor_bytes (big endian) 4s 2s +100%
keccakf1600x4_extract_bytes 4s 3s +33%
keccakf1600x4_permute 4s 3s +33%
mld_keccakf1600x4_extract_bytes_c 4s 4s +0%
mld_sample_s1_s2_serial 4s 4s +0%
pointwise_acc_native_aarch64 4s 4s +0%
pointwise_native_aarch64 4s 2s +100%
pointwise_native_x86_64 4s 4s +0%
poly_caddq 4s 3s +33%
poly_caddq_native 4s 3s +33%
poly_challenge 4s 3s +33%
poly_ntt 4s 3s +33%
poly_reduce 4s 1s +300%
poly_uniform_eta 4s 5s -20%
poly_uniform_gamma1 4s 2s +100%
poly_use_hint_native 4s 2s +100%
polyt1_unpack 4s 3s +33%
polyveck_reduce 4s 7s -43%
polyvecl_chknorm 4s 3s +33%
sign_keypair_internal 4s 5s -20%
sign_signature 4s 7s -43%
sign_signature_internal 4s 5s -20%
sign_signature_pre_hash_internal 4s 6s -33%
sign_signature_pre_hash_shake256 4s 2s +100%
sign_verify_pre_hash_shake256 4s 2s +100%
sk_s2hat_get_poly 4s 4s +0%
sys_check_capability 4s 3s +33%
intt_native_aarch64 3s 4s -25%
intt_native_x86_64 3s 6s -50%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 3s 2s +50%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 3s 4s -25%
keccak_finalize 3s 2s +50%
keccak_init 3s 2s +50%
make_hint 3s 4s -25%
mld_ct_get_optblocker_u32 3s 2s +50%
mld_polymat_expand_entry 3s 2s +50%
mld_sample_s1_s2 3s 3s +0%
montgomery_reduce 3s 3s +0%
ntt_native_x86_64 3s 4s -25%
nttunpack_native_x86_64 3s 2s +50%
pack_sig_h 3s 5s -40%
poly_chknorm_native_aarch64 3s 3s +0%
poly_invntt_tomont_native 3s 4s -25%
poly_ntt_native 3s 2s +50%
poly_permute_bitrev_to_custom_optional_native 3s 2s +50%
poly_sub 3s 2s +50%
poly_uniform_4x 3s 2s +50%
poly_uniform_gamma1_4x 3s 3s +0%
poly_use_hint_c 3s 3s +0%
polyvec_matrix_expand_serial 3s 3s +0%
polyveck_caddq 3s 6s -50%
polyveck_decompose 3s 8s -62%
polyveck_pack_eta 3s 4s -25%
polyveck_unpack_eta 3s 2s +50%
polyvecl_pointwise_acc_montgomery 3s 3s +0%
polyvecl_uniform_gamma1_serial 3s 2s +50%
polyvecl_unpack_eta 3s 2s +50%
polyw1_pack 3s 2s +50%
polyz_pack 3s 4s -25%
reduce32 3s 3s +0%
rej_eta 3s 3s +0%
rej_eta_c 3s 3s +0%
shake128_absorb 3s 4s -25%
shake128_init 3s 1s +200%
shake128_release 3s 2s +50%
shake128x4_squeezeblocks 3s 2s +50%
shake256 3s 2s +50%
shake256_finalize 3s 3s +0%
shake256_release 3s 2s +50%
shake256_squeeze 3s 2s +50%
shake256x4_absorb_once 3s 1s +200%
sig_unpack_hints 3s 3s +0%
sign_signature_extmu 3s 5s -40%
sign_verify 3s 3s +0%
sk_s1hat_get_poly 3s 2s +50%
sk_t0hat_get_poly 3s 3s +0%
unpack_sk_s1hat 3s 3s +0%
yvec_init 3s 3s +0%
caddq 2s 2s +0%
fqscale 2s 4s -50%
keccak_f1600_x1_native_aarch64 2s 3s -33%
keccak_f1600_x4_native_aarch64_v84a 2s 5s -60%
keccak_f1600_x4_native_avx2 2s 2s +0%
keccak_squeeze 2s 3s -33%
keccakf1600_permute 2s 2s +0%
keccakf1600_permute_native 2s 2s +0%
keccakf1600x4_xor_bytes 2s 4s -50%
keccakf1600x4_xor_bytes_native 2s 3s -33%
mld_ct_abs_i32 2s 4s -50%
mld_ct_cmask_neg_i32 2s 1s +100%
mld_ct_cmask_nonzero_u32 2s 2s +0%
mld_ct_cmask_nonzero_u8 2s 3s -33%
mld_ct_get_optblocker_i64 2s 4s -50%
mld_ct_get_optblocker_u8 2s 2s +0%
mld_keccakf1600_extract_bytes 2s 4s -50%
mld_prepare_domain_separation_prefix 2s 7s -71%
mld_value_barrier_i64 2s 2s +0%
mld_value_barrier_u32 2s 2s +0%
mld_value_barrier_u8 2s 2s +0%
ntt_native_aarch64 2s 2s +0%
pack_sig_c 2s 4s -50%
pack_sig_z 2s 3s -33%
pack_sk_rho_key_tr_s2 2s 2s +0%
pack_sk_s1 2s 3s -33%
poly_caddq_native_aarch64 2s 2s +0%
poly_decompose 2s 4s -50%
poly_decompose_88_native_aarch64 2s 5s -60%
poly_decompose_native 2s 3s -33%
poly_ntt_c 2s 3s -33%
poly_permute_bitrev_to_custom_optional 2s 2s +0%
poly_pointwise_montgomery_native 2s 2s +0%
poly_use_hint 2s 3s -33%
poly_use_hint_native_aarch64 2s 3s -33%
polyeta_pack 2s 5s -60%
polyt1_pack 2s 2s +0%
polyvec_matrix_expand 2s 4s -50%
polyveck_pack_w1 2s 3s -33%
polyvecl_pack_eta 2s 4s -50%
polyvecl_pointwise_acc_montgomery_c 2s 3s -33%
polyvecl_uniform_gamma1 2s 2s +0%
polyvecl_unpack_z 2s 2s +0%
polyz_unpack_19_native_aarch64 2s 6s -67%
polyz_unpack_native 2s 2s +0%
power2round 2s 3s -33%
rej_eta_native 2s 3s -33%
shake128_finalize 2s 1s +100%
shake128_squeeze 2s 3s -33%
shake128x4_absorb_once 2s 2s +0%
shake256_absorb 2s 2s +0%
shake256x4_squeezeblocks 2s 1s +100%
unpack_pk_t1 2s 2s +0%
unpack_sk 2s 3s -33%
unpack_sk_t0hat 2s 2s +0%
use_hint 2s 3s -33%
yvec_get_poly 2s 4s -50%
keccak_f1600_x1_native_aarch64_v84a 1s 2s -50%
keccakf1600_extract_bytes (big endian) 1s 2s -50%
keccakf1600_xor_bytes 1s 3s -67%
mld_ct_sel_int32 1s 1s +0%
mld_keccakf1600x4_xor_bytes_c 1s 3s -67%
poly_chknorm_native 1s 3s -67%
poly_pointwise_montgomery 1s 4s -75%
polyveck_ntt 1s 2s -50%
polyvecl_pointwise_acc_montgomery_native 1s 2s -50%
polyz_unpack 1s 2s -50%
rej_uniform_native_aarch64 1s - new
shake256_init 1s 2s -50%
unpack_sk_s2hat 1s 2s -50%
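
The Change column in these tables appears to be the relative difference between the Current and Previous times. The sketch below reproduces a few rows under that assumption; the helper name `change` and its rounding behaviour are guesses that happen to match the rows checked here, not the bot's actual code.

```python
def change(current, previous, digits=0):
    """Format the relative change between two timings in seconds.

    previous is None for a proof with no earlier measurement.
    """
    if previous is None:
        return "new"
    pct = 100 * (current - previous) / previous
    # Always print the sign, as the tables do (including "+0%").
    return f"{pct:+.{digits}f}%"


# Rows taken from the ML-DSA-44, REDUCE-RAM table above.
assert change(1393, 1412, digits=1) == "-1.3%"   # TOTAL
assert change(170, 164) == "+4%"                 # poly_pointwise_montgomery_c
assert change(5, 1) == "+400%"                   # poly_chknorm
assert change(19, 19) == "+0%"                   # rej_uniform
assert change(1, None) == "new"                  # rej_uniform_native_aarch64
```

Since the timings are reported in whole seconds, a proof going from 1s to 2s is shown as +100%, so large percentages on the short-running proofs carry little information compared with the TOTAL row.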

@oqs-bot
Contributor

oqs-bot commented May 21, 2026

CBMC Results (ML-DSA-87, REDUCE-RAM)

Full Results (199 proofs)
Proof Status Current Previous Change
**TOTAL** 1557s 1563s -0.4%
poly_pointwise_montgomery_c 189s 198s -5%
polyvec_matrix_pointwise_montgomery_yvec 135s 135s +0%
mld_invntt_layer 112s 113s -1%
rej_uniform_native 109s 116s -6%
mld_ct_memcmp 72s 70s +3%
mld_ntt_layer 45s 46s -2%
sign_verify_internal 38s 37s +3%
fqmul 31s 27s +15%
mld_attempt_signature_generation 26s 27s -4%
keccakf1600x4_permute_native 23s 24s -4%
rej_uniform 21s 23s -9%
rej_uniform_c 21s 19s +11%
mld_ntt_butterfly_block 16s 15s +7%
polyeta_unpack 16s 15s +7%
polyveck_decompose 16s 15s +7%
mld_check_pct 13s 12s +8%
poly_chknorm_c 13s 15s -13%
poly_uniform_eta_4x 12s 11s +9%
polyt0_unpack 11s 12s -8%
poly_add 10s 14s -29%
keccak_absorb_once_x4 9s 10s -10%
pointwise_acc_native_x86_64 9s 6s +50%
poly_caddq_c 9s 8s +12%
polyvec_matrix_pointwise_montgomery_row 9s 10s -10%
sign_pk_from_sk 9s 7s +29%
compute_pack_t0_t1 8s 6s +33%
mld_keccakf1600_permute_c 8s 6s +33%
pointwise_acc_native_aarch64 8s 9s -11%
keccak_absorb 7s 6s +17%
mld_sample_s1_s2 7s 6s +17%
mld_sample_s1_s2_serial 7s 6s +17%
poly_power2round 7s 9s -22%
polyz_unpack_c 7s 7s +0%
rej_eta_native 7s 4s +75%
ntt_native_aarch64 6s 2s +200%
poly_decompose_c 6s 4s +50%
poly_invntt_tomont_c 6s 8s -25%
poly_ntt 6s 5s +20%
poly_pointwise_montgomery 6s 2s +200%
polyveck_caddq 6s 6s +0%
polyveck_chknorm 6s 7s -14%
polyveck_reduce 6s 7s -14%
polyvecl_ntt 6s 9s -33%
rej_uniform_native_aarch64 6s - new
sign 6s 10s -40%
sign_keypair_internal 6s 5s +20%
sign_open 6s 3s +100%
intt_native_aarch64 5s 2s +150%
keccak_squeezeblocks_x4 5s 4s +25%
mld_compute_pack_z 5s 5s +0%
pack_sk_s1 5s 2s +150%
poly_ntt_c 5s 4s +25%
poly_permute_bitrev_to_custom_optional_native 5s 3s +67%
poly_shiftl 5s 6s -17%
polyt1_unpack 5s 3s +67%
polyveck_invntt_tomont 5s 5s +0%
rej_eta 5s 2s +150%
rej_eta_c 5s 4s +25%
sign_keypair 5s 3s +67%
sign_signature_internal 5s 4s +25%
sign_verify_pre_hash_internal 5s 4s +25%
sign_verify_pre_hash_shake256 5s 6s -17%
sk_t0hat_get_poly 5s 3s +67%
unpack_pk_t1 5s 2s +150%
keccak_f1600_x1_native_aarch64 4s 2s +100%
keccakf1600x4_extract_bytes_native 4s 1s +300%
make_hint 4s 2s +100%
mld_ct_cmask_nonzero_u8 4s 4s +0%
mld_value_barrier_i64 4s 1s +300%
ntt_native_x86_64 4s 3s +33%
pointwise_native_aarch64 4s 2s +100%
poly_caddq 4s 5s -20%
poly_chknorm_native 4s 4s +0%
poly_ntt_native 4s 2s +100%
poly_permute_bitrev_to_custom_optional 4s 3s +33%
poly_uniform_4x 4s 4s +0%
poly_uniform_gamma1 4s 3s +33%
poly_uniform_gamma1_4x 4s 3s +33%
poly_use_hint 4s 3s +33%
polyveck_ntt 4s 5s -20%
polyveck_unpack_eta 4s 4s +0%
polyvecl_chknorm 4s 5s -20%
polyz_unpack 4s 1s +300%
polyz_unpack_19_native_aarch64 4s 3s +33%
shake256_absorb 4s 4s +0%
sig_unpack_hints 4s 2s +100%
sign_signature 4s 4s +0%
sign_signature_pre_hash_internal 4s 4s +0%
sk_s1hat_get_poly 4s 3s +33%
yvec_get_poly 4s 3s +33%
caddq 3s 2s +50%
decompose 3s 3s +0%
fqscale 3s 3s +0%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 3s 3s +0%
keccak_f1600_x4_native_avx2 3s 2s +50%
keccak_finalize 3s 2s +50%
keccak_squeeze 3s 5s -40%
keccakf1600_extract_bytes (big endian) 3s 2s +50%
keccakf1600_permute 3s 2s +50%
mld_ct_get_optblocker_u32 3s 1s +200%
mld_prepare_domain_separation_prefix 3s 5s -40%
pack_sig_h 3s 2s +50%
pack_sk_rho_key_tr_s2 3s 1s +200%
pointwise_native_x86_64 3s 3s +0%
poly_caddq_native_aarch64 3s 4s -25%
poly_challenge 3s 5s -40%
poly_chknorm_native_aarch64 3s 2s +50%
poly_invntt_tomont 3s 3s +0%
poly_invntt_tomont_native 3s 4s -25%
poly_pointwise_montgomery_native 3s 4s -25%
poly_reduce 3s 2s +50%
poly_uniform 3s 2s +50%
poly_uniform_eta 3s 5s -40%
poly_use_hint_native 3s 2s +50%
polyvec_matrix_expand_serial 3s 3s +0%
polyvecl_pointwise_acc_montgomery_c 3s 2s +50%
polyvecl_uniform_gamma1_serial 3s 3s +0%
polyvecl_unpack_z 3s 2s +50%
polyz_pack 3s 3s +0%
polyz_unpack_17_native_aarch64 3s 4s -25%
polyz_unpack_native 3s 4s -25%
power2round 3s 1s +200%
reduce32 3s 3s +0%
shake128_absorb 3s 3s +0%
shake128_finalize 3s 2s +50%
shake128_release 3s 4s -25%
shake256_release 3s 4s -25%
sign_signature_pre_hash_shake256 3s 3s +0%
sign_verify_extmu 3s 2s +50%
sys_check_capability 3s 3s +0%
unpack_sk 3s 2s +50%
unpack_sk_s2hat 3s 5s -40%
intt_native_x86_64 2s 5s -60%
keccak_f1600_x1_native_aarch64_v84a 2s 1s +100%
keccak_f1600_x4_native_aarch64_v84a 2s 3s -33%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 1s +100%
keccak_init 2s 3s -33%
keccakf1600_xor_bytes (big endian) 2s 3s -33%
keccakf1600x4_extract_bytes 2s 2s +0%
keccakf1600x4_permute 2s 2s +0%
keccakf1600x4_xor_bytes 2s 2s +0%
keccakf1600x4_xor_bytes_native 2s 2s +0%
mld_ct_abs_i32 2s 2s +0%
mld_ct_cmask_neg_i32 2s 1s +100%
mld_ct_cmask_nonzero_u32 2s 4s -50%
mld_ct_get_optblocker_i64 2s 1s +100%
mld_ct_get_optblocker_u8 2s 3s -33%
mld_ct_sel_int32 2s 2s +0%
mld_h 2s 4s -50%
mld_keccakf1600_extract_bytes 2s 5s -60%
mld_keccakf1600x4_extract_bytes_c 2s 3s -33%
mld_keccakf1600x4_xor_bytes_c 2s 3s -33%
mld_polymat_expand_entry 2s 3s -33%
mld_value_barrier_u8 2s 2s +0%
montgomery_reduce 2s 4s -50%
nttunpack_native_x86_64 2s 4s -50%
pack_sig_c 2s 2s +0%
poly_caddq_native 2s 4s -50%
poly_chknorm 2s 3s -33%
poly_decompose_32_native_aarch64 2s 2s +0%
poly_decompose_native 2s 1s +100%
poly_sub 2s 5s -60%
poly_use_hint_c 2s 4s -50%
poly_use_hint_native_aarch64 2s 2s +0%
polyeta_pack 2s 3s -33%
polyt0_pack 2s 3s -33%
polyt1_pack 2s 1s +100%
polyvec_matrix_expand 2s 3s -33%
polyveck_pack_eta 2s 3s -33%
polyveck_pack_w1 2s 3s -33%
polyvecl_pack_eta 2s 4s -50%
polyvecl_pointwise_acc_montgomery_native 2s 3s -33%
polyvecl_uniform_gamma1 2s 2s +0%
polyvecl_unpack_eta 2s 2s +0%
polyw1_pack 2s 2s +0%
shake128_init 2s 5s -60%
shake128_squeeze 2s 3s -33%
shake256 2s 4s -50%
shake256_init 2s 2s +0%
shake256_squeeze 2s 4s -50%
shake256x4_absorb_once 2s 1s +100%
shake256x4_squeezeblocks 2s 2s +0%
sign_signature_extmu 2s 4s -50%
sign_verify 2s 4s -50%
sk_s2hat_get_poly 2s 4s -50%
unpack_sk_s1hat 2s 3s -33%
unpack_sk_t0hat 2s 2s +0%
use_hint 2s 5s -60%
yvec_init 2s 4s -50%
keccakf1600_permute_native 1s 2s -50%
keccakf1600_xor_bytes 1s 2s -50%
mld_value_barrier_u32 1s 2s -50%
pack_sig_z 1s 2s -50%
poly_decompose 1s 3s -67%
poly_decompose_88_native_aarch64 1s 2s -50%
polyvecl_pointwise_acc_montgomery 1s 2s -50%
shake128x4_absorb_once 1s 4s -75%
shake128x4_squeezeblocks 1s 1s +0%
shake256_finalize 1s 3s -67%

@oqs-bot
Contributor

oqs-bot commented May 21, 2026

CBMC Results (ML-DSA-65, REDUCE-RAM)

Full Results (199 proofs)
Proof Status Current Previous Change
**TOTAL** 1596s 1497s +6.6%
poly_pointwise_montgomery_c 186s 175s +6%
polyvec_matrix_pointwise_montgomery_yvec 160s 150s +7%
rej_uniform_native 112s 98s +14%
mld_invntt_layer 109s 102s +7%
mld_ct_memcmp 69s 66s +5%
mld_ntt_layer 44s 43s +2%
fqmul 29s 28s +4%
mld_attempt_signature_generation 28s 27s +4%
sign_verify_internal 26s 28s -7%
keccakf1600x4_permute_native 22s 21s +5%
rej_uniform 21s 20s +5%
rej_uniform_c 21s 18s +17%
polyvecl_chknorm 19s 18s +6%
mld_check_pct 18s 15s +20%
mld_ntt_butterfly_block 15s 17s -12%
poly_chknorm_c 14s 13s +8%
poly_uniform_eta_4x 13s 11s +18%
polyveck_decompose 13s 12s +8%
poly_add 11s 12s -8%
polyt0_unpack 11s 12s -8%
compute_pack_t0_t1 9s 7s +29%
keccak_absorb 9s 6s +50%
poly_invntt_tomont_c 9s 8s +12%
polyvec_matrix_pointwise_montgomery_row 9s 9s +0%
polyveck_caddq 9s 10s -10%
polyz_unpack_c 9s 6s +50%
mld_prepare_domain_separation_prefix 8s 5s +60%
polyvecl_ntt 8s 8s +0%
intt_native_x86_64 7s 3s +133%
keccak_absorb_once_x4 7s 10s -30%
mld_keccakf1600_permute_c 7s 7s +0%
poly_caddq_c 7s 9s -22%
poly_power2round 7s 7s +0%
sign 7s 6s +17%
mld_compute_pack_z 6s 6s +0%
nttunpack_native_x86_64 6s 4s +50%
pointwise_acc_native_aarch64 6s 5s +20%
pointwise_native_x86_64 6s 5s +20%
poly_decompose_c 6s 5s +20%
poly_shiftl 6s 5s +20%
poly_uniform 6s 5s +20%
polyveck_chknorm 6s 2s +200%
polyveck_reduce 6s 7s -14%
sign_keypair_internal 6s 6s +0%
sign_signature_extmu 6s 6s +0%
sign_signature_internal 6s 5s +20%
sign_signature_pre_hash_internal 6s 3s +100%
sign_signature_pre_hash_shake256 6s 5s +20%
keccak_squeezeblocks_x4 5s 3s +67%
keccakf1600_xor_bytes 5s 3s +67%
keccakf1600x4_xor_bytes 5s 3s +67%
mld_ct_get_optblocker_u32 5s 2s +150%
ntt_native_aarch64 5s 4s +25%
poly_challenge 5s 4s +25%
poly_chknorm_native 5s 3s +67%
poly_invntt_tomont_native 5s 2s +150%
poly_ntt 5s 2s +150%
poly_ntt_c 5s 2s +150%
poly_ntt_native 5s 3s +67%
poly_permute_bitrev_to_custom_optional 5s 3s +67%
polyeta_unpack 5s 3s +67%
polyt0_pack 5s 5s +0%
polyw1_pack 5s 4s +25%
polyz_unpack_17_native_aarch64 5s 2s +150%
rej_eta_c 5s 3s +67%
rej_eta_native 5s 7s -29%
shake128_absorb 5s 3s +67%
shake128_finalize 5s 2s +150%
sign_pk_from_sk 5s 5s +0%
use_hint 5s 2s +150%
caddq 4s 3s +33%
intt_native_aarch64 4s 4s +0%
keccakf1600_permute 4s 1s +300%
keccakf1600_xor_bytes (big endian) 4s 2s +100%
mld_sample_s1_s2 4s 5s -20%
mld_sample_s1_s2_serial 4s 5s -20%
pack_sig_c 4s 2s +100%
pack_sk_s1 4s 5s -20%
pointwise_acc_native_x86_64 4s 5s -20%
poly_decompose_native 4s 3s +33%
poly_sub 4s 3s +33%
poly_uniform_eta 4s 3s +33%
poly_use_hint 4s 3s +33%
polyeta_pack 4s 3s +33%
polyt1_unpack 4s 2s +100%
polyveck_invntt_tomont 4s 7s -43%
polyveck_pack_eta 4s 3s +33%
polyvecl_uniform_gamma1_serial 4s 3s +33%
polyvecl_unpack_eta 4s 3s +33%
polyvecl_unpack_z 4s 3s +33%
rej_eta 4s 3s +33%
rej_uniform_native_aarch64 4s - new
shake128x4_absorb_once 4s 2s +100%
shake256_finalize 4s 2s +100%
shake256_init 4s 4s +0%
sig_unpack_hints 4s 3s +33%
sign_signature 4s 5s -20%
sign_verify_pre_hash_internal 4s 3s +33%
sign_verify_pre_hash_shake256 4s 4s +0%
unpack_pk_t1 4s 3s +33%
unpack_sk_t0hat 4s 3s +33%
keccak_f1600_x1_native_aarch64_v84a 3s 1s +200%
keccak_f1600_x4_native_aarch64_v84a 3s 3s +0%
keccak_f1600_x4_native_avx2 3s 2s +50%
keccak_finalize 3s 3s +0%
keccak_squeeze 3s 2s +50%
keccakf1600x4_permute 3s 2s +50%
mld_ct_cmask_neg_i32 3s 2s +50%
mld_ct_cmask_nonzero_u8 3s 2s +50%
mld_ct_sel_int32 3s 2s +50%
mld_keccakf1600x4_xor_bytes_c 3s 1s +200%
ntt_native_x86_64 3s 3s +0%
pack_sig_h 3s 3s +0%
pack_sig_z 3s 2s +50%
pack_sk_rho_key_tr_s2 3s 2s +50%
pointwise_native_aarch64 3s 5s -40%
poly_caddq 3s 3s +0%
poly_caddq_native 3s 3s +0%
poly_caddq_native_aarch64 3s 3s +0%
poly_chknorm_native_aarch64 3s 4s -25%
poly_decompose_32_native_aarch64 3s 3s +0%
poly_decompose_88_native_aarch64 3s 3s +0%
poly_pointwise_montgomery 3s 3s +0%
poly_reduce 3s 2s +50%
poly_uniform_gamma1 3s 4s -25%
poly_use_hint_c 3s 2s +50%
poly_use_hint_native 3s 3s +0%
poly_use_hint_native_aarch64 3s 4s -25%
polyveck_ntt 3s 3s +0%
polyveck_unpack_eta 3s 2s +50%
polyvecl_pack_eta 3s 3s +0%
polyvecl_uniform_gamma1 3s 1s +200%
polyz_unpack 3s 2s +50%
shake128_init 3s 2s +50%
shake128_squeeze 3s 3s +0%
shake256 3s 3s +0%
sign_open 3s 4s -25%
sign_verify_extmu 3s 4s -25%
sk_s1hat_get_poly 3s 3s +0%
sk_t0hat_get_poly 3s 3s +0%
unpack_sk_s2hat 3s 1s +200%
decompose 2s 3s -33%
fqscale 2s 4s -50%
keccak_f1600_x1_native_aarch64 2s 4s -50%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 1s +100%
keccakf1600_permute_native 2s 2s +0%
keccakf1600x4_extract_bytes 2s 2s +0%
keccakf1600x4_extract_bytes_native 2s 2s +0%
keccakf1600x4_xor_bytes_native 2s 3s -33%
make_hint 2s 3s -33%
mld_ct_abs_i32 2s 2s +0%
mld_ct_cmask_nonzero_u32 2s 1s +100%
mld_ct_get_optblocker_i64 2s 2s +0%
mld_h 2s 4s -50%
mld_keccakf1600x4_extract_bytes_c 2s 2s +0%
mld_polymat_expand_entry 2s 6s -67%
mld_value_barrier_i64 2s 3s -33%
mld_value_barrier_u8 2s 3s -33%
montgomery_reduce 2s 3s -33%
poly_chknorm 2s 1s +100%
poly_decompose 2s 3s -33%
poly_permute_bitrev_to_custom_optional_native 2s 4s -50%
poly_pointwise_montgomery_native 2s 5s -60%
poly_uniform_4x 2s 3s -33%
poly_uniform_gamma1_4x 2s 4s -50%
polyt1_pack 2s 4s -50%
polyvec_matrix_expand 2s 3s -33%
polyvec_matrix_expand_serial 2s 2s +0%
polyveck_pack_w1 2s 1s +100%
polyvecl_pointwise_acc_montgomery 2s 4s -50%
polyvecl_pointwise_acc_montgomery_c 2s 3s -33%
polyvecl_pointwise_acc_montgomery_native 2s 2s +0%
polyz_pack 2s 3s -33%
polyz_unpack_19_native_aarch64 2s 2s +0%
polyz_unpack_native 2s 3s -33%
power2round 2s 5s -60%
reduce32 2s 4s -50%
shake128x4_squeezeblocks 2s 1s +100%
shake256_squeeze 2s 3s -33%
shake256x4_absorb_once 2s 3s -33%
shake256x4_squeezeblocks 2s 2s +0%
sign_keypair 2s 4s -50%
sign_verify 2s 3s -33%
sys_check_capability 2s 5s -60%
unpack_sk 2s 3s -33%
unpack_sk_s1hat 2s 2s +0%
yvec_get_poly 2s 5s -60%
yvec_init 2s 3s -33%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 1s 2s -50%
keccak_init 1s 2s -50%
keccakf1600_extract_bytes (big endian) 1s 3s -67%
mld_ct_get_optblocker_u8 1s 3s -67%
mld_keccakf1600_extract_bytes 1s 3s -67%
mld_value_barrier_u32 1s 2s -50%
poly_invntt_tomont 1s 2s -50%
shake128_release 1s 1s +0%
shake256_absorb 1s 2s -50%
shake256_release 1s 3s -67%
sk_s2hat_get_poly 1s 2s -50%

@oqs-bot
Contributor

oqs-bot commented May 21, 2026

CBMC Results (ML-DSA-87)

Full Results (199 proofs)
Proof Status Current Previous Change
**TOTAL** 2149s 1987s +8.2%
polyvecl_pointwise_acc_montgomery_c 300s 250s +20%
polyvec_matrix_expand 183s 169s +8%
rej_uniform_native 130s 124s +5%
mld_attempt_signature_generation 105s 102s +3%
poly_pointwise_montgomery_c 104s 92s +13%
mld_invntt_layer 99s 93s +6%
sign_verify_internal 91s 88s +3%
mld_ct_memcmp 75s 65s +15%
sign_signature_internal 55s 54s +2%
mld_ntt_layer 44s 41s +7%
polyvec_matrix_expand_serial 44s 37s +19%
fqmul 32s 26s +23%
compute_pack_t0_t1 26s 26s +0%
keccakf1600x4_permute_native 25s 24s +4%
polyvec_matrix_pointwise_montgomery_yvec 22s 21s +5%
poly_chknorm_c 19s 15s +27%
rej_uniform_c 19s 19s +0%
mld_ntt_butterfly_block 18s 17s +6%
rej_uniform 18s 15s +20%
poly_uniform_eta_4x 13s 13s +0%
polyeta_unpack 13s 11s +18%
polyt0_unpack 13s 13s +0%
keccak_absorb_once_x4 12s 9s +33%
polyveck_decompose 12s 12s +0%
poly_add 11s 11s +0%
poly_uniform_4x 11s 11s +0%
poly_invntt_tomont_c 10s 7s +43%
poly_ntt 10s 4s +150%
polyveck_ntt 10s 10s +0%
mld_check_pct 9s 11s -18%
mld_sample_s1_s2_serial 9s 9s +0%
poly_power2round 9s 7s +29%
polyz_unpack_c 9s 7s +29%
sign 9s 8s +12%
mld_compute_pack_z 8s 7s +14%
poly_caddq_c 8s 4s +100%
polyveck_caddq 8s 7s +14%
polyveck_invntt_tomont 8s 9s -11%
mld_keccakf1600_permute_c 7s 7s +0%
pointwise_acc_native_aarch64 7s 8s -12%
polyvecl_chknorm 7s 5s +40%
sign_keypair 7s 4s +75%
sign_signature_extmu 7s 4s +75%
keccak_absorb 6s 5s +20%
keccak_init 6s 3s +100%
pointwise_acc_native_x86_64 6s 8s -25%
poly_challenge 6s 2s +200%
polyt1_pack 6s 2s +200%
polyveck_chknorm 6s 6s +0%
polyvecl_ntt 6s 7s -14%
sign_open 6s 5s +20%
sign_pk_from_sk 6s 6s +0%
sign_signature 6s 4s +50%
fqscale 5s 2s +150%
intt_native_aarch64 5s 3s +67%
mld_sample_s1_s2 5s 6s -17%
ntt_native_x86_64 5s 4s +25%
pack_sig_h 5s 4s +25%
poly_sub 5s 5s +0%
poly_uniform_eta 5s 4s +25%
poly_uniform_gamma1_4x 5s 3s +67%
poly_use_hint_c 5s 4s +25%
polyveck_pack_eta 5s 4s +25%
polyvecl_uniform_gamma1 5s 2s +150%
power2round 5s 3s +67%
rej_eta_native 5s 3s +67%
shake256x4_absorb_once 5s 3s +67%
shake256x4_squeezeblocks 5s 5s +0%
sign_signature_pre_hash_internal 5s 3s +67%
caddq 4s 2s +100%
intt_native_x86_64 4s 3s +33%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 4s 4s +0%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 4s 3s +33%
keccak_squeezeblocks_x4 4s 4s +0%
keccakf1600x4_permute 4s 4s +0%
mld_ct_get_optblocker_u8 4s 3s +33%
mld_keccakf1600_extract_bytes 4s 2s +100%
mld_keccakf1600x4_extract_bytes_c 4s 3s +33%
mld_polymat_expand_entry 4s 3s +33%
mld_prepare_domain_separation_prefix 4s 5s -20%
pointwise_native_x86_64 4s 4s +0%
poly_caddq 4s 3s +33%
poly_caddq_native 4s 3s +33%
poly_chknorm_native_aarch64 4s 4s +0%
poly_decompose_88_native_aarch64 4s 2s +100%
poly_decompose_c 4s 3s +33%
poly_permute_bitrev_to_custom_optional 4s 3s +33%
poly_pointwise_montgomery_native 4s 3s +33%
poly_shiftl 4s 5s -20%
poly_uniform 4s 2s +100%
poly_use_hint_native_aarch64 4s 2s +100%
polyeta_pack 4s 4s +0%
polyt0_pack 4s 4s +0%
polyt1_unpack 4s 4s +0%
polyvec_matrix_pointwise_montgomery_row 4s 2s +100%
polyveck_unpack_eta 4s 3s +33%
polyvecl_pointwise_acc_montgomery_native 4s 4s +0%
polyz_unpack_19_native_aarch64 4s 3s +33%
shake128_squeeze 4s 2s +100%
shake256 4s 2s +100%
shake256_release 4s 1s +300%
sign_verify 4s 5s -20%
sk_s1hat_get_poly 4s 4s +0%
sys_check_capability 4s 1s +300%
unpack_sk 4s 3s +33%
unpack_sk_s1hat 4s 5s -20%
unpack_sk_t0hat 4s 5s -20%
use_hint 4s 3s +33%
yvec_get_poly 4s 2s +100%
yvec_init 4s 3s +33%
decompose 3s 5s -40%
keccak_f1600_x1_native_aarch64_v84a 3s 2s +50%
keccak_squeeze 3s 1s +200%
keccakf1600_extract_bytes (big endian) 3s 3s +0%
keccakf1600x4_xor_bytes 3s 3s +0%
mld_h 3s 5s -40%
ntt_native_aarch64 3s 4s -25%
pack_sk_rho_key_tr_s2 3s 2s +50%
pointwise_native_aarch64 3s 3s +0%
poly_caddq_native_aarch64 3s 3s +0%
poly_chknorm_native 3s 4s -25%
poly_decompose 3s 3s +0%
poly_decompose_32_native_aarch64 3s 3s +0%
poly_decompose_native 3s 3s +0%
poly_invntt_tomont_native 3s 4s -25%
poly_ntt_c 3s 4s -25%
poly_permute_bitrev_to_custom_optional_native 3s 3s +0%
poly_uniform_gamma1 3s 3s +0%
poly_use_hint_native 3s 5s -40%
polyveck_pack_w1 3s 2s +50%
polyvecl_pack_eta 3s 2s +50%
polyvecl_pointwise_acc_montgomery 3s 4s -25%
polyvecl_unpack_z 3s 1s +200%
rej_eta 3s 2s +50%
rej_eta_c 3s 6s -50%
shake128_absorb 3s 4s -25%
shake128_init 3s 4s -25%
shake128x4_absorb_once 3s 4s -25%
shake256_absorb 3s 3s +0%
shake256_init 3s 2s +50%
shake256_squeeze 3s 2s +50%
sig_unpack_hints 3s 3s +0%
sign_keypair_internal 3s 6s -50%
sign_signature_pre_hash_shake256 3s 2s +50%
sign_verify_extmu 3s 6s -50%
sign_verify_pre_hash_internal 3s 5s -40%
sign_verify_pre_hash_shake256 3s 2s +50%
sk_s2hat_get_poly 3s 2s +50%
unpack_sk_s2hat 3s 4s -25%
keccak_f1600_x1_native_aarch64 2s 2s +0%
keccak_f1600_x4_native_aarch64_v84a 2s 3s -33%
keccak_f1600_x4_native_avx2 2s 2s +0%
keccak_finalize 2s 2s +0%
keccakf1600_permute_native 2s 3s -33%
keccakf1600_xor_bytes 2s 2s +0%
keccakf1600_xor_bytes (big endian) 2s 1s +100%
keccakf1600x4_extract_bytes 2s 3s -33%
keccakf1600x4_extract_bytes_native 2s 4s -50%
keccakf1600x4_xor_bytes_native 2s 5s -60%
make_hint 2s 4s -50%
mld_ct_cmask_neg_i32 2s 1s +100%
mld_ct_cmask_nonzero_u32 2s 4s -50%
mld_ct_cmask_nonzero_u8 2s 3s -33%
mld_ct_get_optblocker_i64 2s 5s -60%
mld_ct_get_optblocker_u32 2s 4s -50%
mld_keccakf1600x4_xor_bytes_c 2s 3s -33%
mld_value_barrier_u32 2s 4s -50%
mld_value_barrier_u8 2s 3s -33%
montgomery_reduce 2s 3s -33%
nttunpack_native_x86_64 2s 3s -33%
pack_sig_c 2s 3s -33%
pack_sig_z 2s 2s +0%
pack_sk_s1 2s 3s -33%
poly_chknorm 2s 3s -33%
poly_invntt_tomont 2s 3s -33%
poly_ntt_native 2s 3s -33%
poly_pointwise_montgomery 2s 2s +0%
poly_reduce 2s 4s -50%
poly_use_hint 2s 3s -33%
polyveck_reduce 2s 2s +0%
polyvecl_uniform_gamma1_serial 2s 4s -50%
polyw1_pack 2s 2s +0%
polyz_pack 2s 3s -33%
polyz_unpack 2s 2s +0%
polyz_unpack_17_native_aarch64 2s 4s -50%
polyz_unpack_native 2s 3s -33%
reduce32 2s 3s -33%
rej_uniform_native_aarch64 2s - new
shake128x4_squeezeblocks 2s 3s -33%
shake256_finalize 2s 3s -33%
sk_t0hat_get_poly 2s 4s -50%
keccakf1600_permute 1s 3s -67%
mld_ct_abs_i32 1s 4s -75%
mld_ct_sel_int32 1s 4s -75%
mld_value_barrier_i64 1s 2s -50%
polyvecl_unpack_eta 1s 3s -67%
shake128_finalize 1s 2s -50%
shake128_release 1s 2s -50%
unpack_pk_t1 1s 3s -67%

@oqs-bot
Contributor

oqs-bot commented May 21, 2026

CBMC Results (ML-DSA-44)

Full Results (199 proofs)
Proof Status Current Previous Change
**TOTAL** 1818s 1672s +8.7%
polyvecl_pointwise_acc_montgomery_c 300s 255s +18%
rej_uniform_native 130s 117s +11%
poly_pointwise_montgomery_c 115s 97s +19%
mld_invntt_layer 103s 93s +11%
mld_ct_memcmp 81s 69s +17%
mld_attempt_signature_generation 59s 54s +9%
mld_ntt_layer 46s 44s +5%
polyvec_matrix_expand 32s 30s +7%
sign_signature_internal 32s 28s +14%
fqmul 31s 28s +11%
sign_verify_internal 27s 28s -4%
keccakf1600x4_permute_native 22s 22s +0%
polyvecl_chknorm 21s 16s +31%
mld_ntt_butterfly_block 20s 15s +33%
rej_uniform_c 19s 16s +19%
rej_uniform 18s 16s +12%
mld_check_pct 17s 15s +13%
poly_chknorm_c 17s 17s +0%
polyvec_matrix_pointwise_montgomery_yvec 16s 15s +7%
polyt0_unpack 15s 16s -6%
poly_uniform_eta_4x 14s 14s +0%
polyz_unpack_c 14s 12s +17%
compute_pack_t0_t1 13s 15s -13%
keccak_absorb_once_x4 13s 8s +62%
polyeta_unpack 13s 14s -7%
poly_uniform_4x 12s 13s -8%
polyvec_matrix_expand_serial 12s 11s +9%
poly_add 11s 10s +10%
poly_invntt_tomont_c 10s 8s +25%
poly_power2round 10s 10s +0%
mld_compute_pack_z 9s 9s +0%
sign 8s 6s +33%
keccak_absorb 7s 7s +0%
poly_decompose_c 7s 9s -22%
sign_keypair_internal 7s 6s +17%
sign_pk_from_sk 7s 6s +17%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 6s 3s +100%
mld_keccakf1600_permute_c 6s 7s -14%
poly_caddq_c 6s 3s +100%
polyveck_chknorm 6s 6s +0%
polyveck_decompose 6s 8s -25%
sign_open 6s 3s +100%
sign_signature_pre_hash_internal 6s 5s +20%
sk_s2hat_get_poly 6s 2s +200%
intt_native_x86_64 5s 3s +67%
keccakf1600_permute_native 5s 4s +25%
mld_h 5s 7s -29%
mld_keccakf1600_extract_bytes 5s 3s +67%
mld_keccakf1600x4_extract_bytes_c 5s 3s +67%
ntt_native_aarch64 5s 4s +25%
pack_sk_s1 5s 2s +150%
pointwise_acc_native_aarch64 5s 6s -17%
pointwise_acc_native_x86_64 5s 6s -17%
poly_challenge 5s 6s -17%
poly_invntt_tomont_native 5s 4s +25%
poly_permute_bitrev_to_custom_optional 5s 2s +150%
poly_sub 5s 2s +150%
polyvec_matrix_pointwise_montgomery_row 5s 3s +67%
polyveck_pack_w1 5s 3s +67%
polyvecl_unpack_z 5s 2s +150%
polyz_unpack 5s 2s +150%
rej_eta_c 5s 4s +25%
sig_unpack_hints 5s 2s +150%
sign_signature 5s 5s +0%
sign_verify_pre_hash_shake256 5s 6s -17%
unpack_sk_t0hat 5s 5s +0%
fqscale 4s 4s +0%
keccak_f1600_x4_native_avx2 4s 2s +100%
keccak_squeezeblocks_x4 4s 4s +0%
keccakf1600_xor_bytes 4s 3s +33%
keccakf1600_xor_bytes (big endian) 4s 3s +33%
keccakf1600x4_xor_bytes 4s 4s +0%
mld_ct_cmask_nonzero_u32 4s 2s +100%
mld_ct_get_optblocker_u32 4s 2s +100%
mld_sample_s1_s2_serial 4s 3s +33%
pack_sig_c 4s 4s +0%
pack_sk_rho_key_tr_s2 4s 3s +33%
poly_caddq_native_aarch64 4s 2s +100%
poly_chknorm_native 4s 4s +0%
poly_chknorm_native_aarch64 4s 2s +100%
poly_decompose_88_native_aarch64 4s 4s +0%
poly_decompose_native 4s 3s +33%
poly_ntt 4s 3s +33%
poly_pointwise_montgomery 4s 5s -20%
poly_shiftl 4s 5s -20%
poly_uniform 4s 4s +0%
poly_use_hint_native 4s 4s +0%
polyveck_invntt_tomont 4s 7s -43%
polyveck_pack_eta 4s 2s +100%
polyveck_unpack_eta 4s 3s +33%
polyvecl_ntt 4s 4s +0%
polyvecl_pointwise_acc_montgomery_native 4s 2s +100%
polyvecl_uniform_gamma1_serial 4s 2s +100%
polyz_unpack_17_native_aarch64 4s 6s -33%
power2round 4s 5s -20%
rej_eta_native 4s 4s +0%
shake256_squeeze 4s 1s +300%
sign_signature_pre_hash_shake256 4s 3s +33%
unpack_sk 4s 4s +0%
yvec_get_poly 4s 4s +0%
decompose 3s 4s -25%
intt_native_aarch64 3s 3s +0%
keccak_f1600_x1_native_aarch64 3s 3s +0%
keccak_f1600_x4_native_aarch64_v84a 3s 4s -25%
keccak_squeeze 3s 3s +0%
keccakf1600x4_permute 3s 3s +0%
keccakf1600x4_xor_bytes_native 3s 2s +50%
make_hint 3s 2s +50%
mld_ct_cmask_nonzero_u8 3s 1s +200%
mld_ct_get_optblocker_i64 3s 2s +50%
mld_polymat_expand_entry 3s 3s +0%
mld_prepare_domain_separation_prefix 3s 5s -40%
mld_sample_s1_s2 3s 3s +0%
nttunpack_native_x86_64 3s 5s -40%
pack_sig_h 3s 2s +50%
pack_sig_z 3s 2s +50%
pointwise_native_aarch64 3s 2s +50%
pointwise_native_x86_64 3s 2s +50%
poly_decompose 3s 3s +0%
poly_invntt_tomont 3s 2s +50%
poly_ntt_native 3s 3s +0%
poly_permute_bitrev_to_custom_optional_native 3s 3s +0%
poly_pointwise_montgomery_native 3s 3s +0%
poly_reduce 3s 3s +0%
poly_uniform_eta 3s 5s -40%
poly_uniform_gamma1 3s 5s -40%
poly_uniform_gamma1_4x 3s 3s +0%
poly_use_hint_c 3s 3s +0%
poly_use_hint_native_aarch64 3s 2s +50%
polyt0_pack 3s 5s -40%
polyt1_unpack 3s 3s +0%
polyveck_caddq 3s 6s -50%
polyvecl_pack_eta 3s 3s +0%
polyw1_pack 3s 1s +200%
polyz_pack 3s 2s +50%
polyz_unpack_19_native_aarch64 3s 1s +200%
polyz_unpack_native 3s 3s +0%
rej_eta 3s 2s +50%
shake128_release 3s 3s +0%
shake256 3s 2s +50%
shake256_absorb 3s 1s +200%
shake256_init 3s 1s +200%
sign_verify 3s 6s -50%
sign_verify_extmu 3s 4s -25%
unpack_pk_t1 3s 3s +0%
caddq 2s 3s -33%
keccak_f1600_x1_native_aarch64_v84a 2s 3s -33%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 3s -33%
keccak_finalize 2s 1s +100%
keccak_init 2s 2s +0%
keccakf1600_extract_bytes (big endian) 2s 2s +0%
keccakf1600_permute 2s 3s -33%
keccakf1600x4_extract_bytes 2s 3s -33%
mld_ct_abs_i32 2s 4s -50%
mld_ct_cmask_neg_i32 2s 1s +100%
mld_ct_get_optblocker_u8 2s 3s -33%
mld_ct_sel_int32 2s 3s -33%
mld_keccakf1600x4_xor_bytes_c 2s 4s -50%
mld_value_barrier_i64 2s 2s +0%
mld_value_barrier_u32 2s 2s +0%
montgomery_reduce 2s 3s -33%
ntt_native_x86_64 2s 2s +0%
poly_caddq 2s 3s -33%
poly_chknorm 2s 3s -33%
poly_decompose_32_native_aarch64 2s 3s -33%
poly_ntt_c 2s 4s -50%
poly_use_hint 2s 4s -50%
polyeta_pack 2s 4s -50%
polyt1_pack 2s 3s -33%
polyveck_ntt 2s 4s -50%
polyveck_reduce 2s 1s +100%
polyvecl_pointwise_acc_montgomery 2s 2s +0%
polyvecl_uniform_gamma1 2s 3s -33%
reduce32 2s 2s +0%
rej_uniform_native_aarch64 2s - new
shake128_absorb 2s 2s +0%
shake128_finalize 2s 2s +0%
shake128_init 2s 3s -33%
shake128x4_absorb_once 2s 3s -33%
shake256_finalize 2s 3s -33%
shake256_release 2s 1s +100%
shake256x4_absorb_once 2s 5s -60%
shake256x4_squeezeblocks 2s 3s -33%
sign_keypair 2s 6s -67%
sign_signature_extmu 2s 3s -33%
sign_verify_pre_hash_internal 2s 4s -50%
sk_s1hat_get_poly 2s 3s -33%
sk_t0hat_get_poly 2s 5s -60%
sys_check_capability 2s 4s -50%
unpack_sk_s1hat 2s 4s -50%
use_hint 2s 4s -50%
yvec_init 2s 2s +0%
keccakf1600x4_extract_bytes_native 1s 2s -50%
mld_value_barrier_u8 1s 1s +0%
poly_caddq_native 1s 2s -50%
polyvecl_unpack_eta 1s 3s -67%
shake128_squeeze 1s 2s -50%
shake128x4_squeezeblocks 1s 3s -67%
unpack_sk_s2hat 1s 3s -67%

@oqs-bot
Contributor

oqs-bot commented May 21, 2026

CBMC Results (ML-DSA-65)

Full Results (199 proofs)
Proof Status Current Previous Change
**TOTAL** 2060s 1914s +7.6%
polyvecl_pointwise_acc_montgomery_c 333s 284s +17%
polyvec_matrix_expand 154s 149s +3%
rej_uniform_native 132s 124s +6%
poly_pointwise_montgomery_c 110s 94s +17%
mld_invntt_layer 99s 92s +8%
mld_ct_memcmp 79s 69s +14%
sign_verify_internal 72s 71s +1%
mld_attempt_signature_generation 70s 67s +4%
sign_signature_internal 46s 45s +2%
mld_ntt_layer 44s 45s -2%
fqmul 30s 31s -3%
polyvec_matrix_pointwise_montgomery_yvec 29s 29s +0%
keccakf1600x4_permute_native 24s 23s +4%
polyvec_matrix_expand_serial 24s 24s +0%
rej_uniform_c 21s 17s +24%
mld_ntt_butterfly_block 18s 16s +12%
poly_chknorm_c 17s 15s +13%
rej_uniform 17s 16s +6%
polyt0_unpack 16s 17s -6%
polyveck_decompose 14s 16s -12%
poly_power2round 13s 10s +30%
poly_uniform_eta_4x 13s 13s +0%
compute_pack_t0_t1 12s 15s -20%
keccak_absorb_once_x4 12s 9s +33%
mld_check_pct 12s 13s -8%
poly_uniform_4x 12s 12s +0%
poly_add 11s 11s +0%
polyveck_caddq 10s 8s +25%
polyveck_chknorm 10s 10s +0%
polyvecl_ntt 9s 6s +50%
poly_invntt_tomont_c 8s 7s +14%
polyveck_invntt_tomont 8s 6s +33%
sign 8s 8s +0%
intt_native_aarch64 7s 2s +250%
mld_keccakf1600_permute_c 7s 8s -12%
mld_sample_s1_s2 7s 4s +75%
poly_decompose_c 7s 8s -12%
poly_shiftl 7s 5s +40%
polyveck_ntt 7s 10s -30%
keccak_absorb 6s 7s -14%
mld_compute_pack_z 6s 10s -40%
poly_use_hint 6s 6s +0%
polyvecl_unpack_z 6s 4s +50%
rej_eta 6s 3s +100%
rej_eta_native 6s 4s +50%
sign_keypair_internal 6s 6s +0%
sign_open 6s 4s +50%
sign_pk_from_sk 6s 6s +0%
unpack_sk_t0hat 6s 5s +20%
keccak_squeezeblocks_x4 5s 6s -17%
keccakf1600x4_xor_bytes_native 5s 1s +400%
mld_prepare_domain_separation_prefix 5s 4s +25%
mld_sample_s1_s2_serial 5s 3s +67%
pointwise_acc_native_aarch64 5s 8s -38%
pointwise_acc_native_x86_64 5s 5s +0%
poly_caddq_native 5s 2s +150%
poly_ntt_native 5s 3s +67%
poly_use_hint_native_aarch64 5s 3s +67%
polyvecl_chknorm 5s 6s -17%
polyvecl_pack_eta 5s 4s +25%
polyvecl_pointwise_acc_montgomery_native 5s 5s +0%
rej_eta_c 5s 5s +0%
sign_keypair 5s 4s +25%
sign_signature_extmu 5s 3s +67%
sign_signature_pre_hash_internal 5s 3s +67%
sign_signature_pre_hash_shake256 5s 3s +67%
sign_verify 5s 3s +67%
sys_check_capability 5s 1s +400%
unpack_sk 5s 4s +25%
caddq 4s 3s +33%
decompose 4s 5s -20%
keccak_finalize 4s 4s +0%
mld_keccakf1600x4_extract_bytes_c 4s 2s +100%
ntt_native_aarch64 4s 2s +100%
pack_sig_h 4s 2s +100%
pack_sk_rho_key_tr_s2 4s 2s +100%
poly_caddq_c 4s 4s +0%
poly_challenge 4s 7s -43%
poly_decompose_32_native_aarch64 4s 5s -20%
poly_decompose_native 4s 4s +0%
poly_invntt_tomont 4s 5s -20%
poly_invntt_tomont_native 4s 3s +33%
poly_pointwise_montgomery 4s 2s +100%
poly_uniform 4s 3s +33%
poly_uniform_eta 4s 4s +0%
poly_uniform_gamma1 4s 4s +0%
poly_use_hint_native 4s 3s +33%
polyeta_pack 4s 4s +0%
polyeta_unpack 4s 3s +33%
polyt1_unpack 4s 5s -20%
polyveck_pack_w1 4s 2s +100%
polyvecl_pointwise_acc_montgomery 4s 1s +300%
polyvecl_uniform_gamma1 4s 4s +0%
polyz_pack 4s 2s +100%
polyz_unpack_c 4s 5s -20%
polyz_unpack_native 4s 2s +100%
reduce32 4s 2s +100%
shake256 4s 3s +33%
shake256x4_absorb_once 4s 3s +33%
sign_verify_pre_hash_internal 4s 3s +33%
sk_s1hat_get_poly 4s 2s +100%
unpack_sk_s2hat 4s 2s +100%
use_hint 4s 2s +100%
intt_native_x86_64 3s 2s +50%
keccak_f1600_x1_native_aarch64 3s 2s +50%
keccak_f1600_x4_native_aarch64_v84a 3s 2s +50%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 3s 2s +50%
keccak_f1600_x4_native_avx2 3s 5s -40%
keccakf1600_permute_native 3s 4s -25%
keccakf1600_xor_bytes (big endian) 3s 4s -25%
keccakf1600x4_extract_bytes 3s 1s +200%
keccakf1600x4_extract_bytes_native 3s 1s +200%
keccakf1600x4_xor_bytes 3s 3s +0%
make_hint 3s 3s +0%
mld_ct_cmask_nonzero_u32 3s 4s -25%
mld_ct_cmask_nonzero_u8 3s 4s -25%
mld_ct_get_optblocker_u8 3s 2s +50%
mld_h 3s 7s -57%
mld_keccakf1600_extract_bytes 3s 3s +0%
mld_keccakf1600x4_xor_bytes_c 3s 3s +0%
mld_value_barrier_i64 3s 3s +0%
montgomery_reduce 3s 4s -25%
ntt_native_x86_64 3s 5s -40%
pack_sig_z 3s 3s +0%
pointwise_native_aarch64 3s 3s +0%
pointwise_native_x86_64 3s 2s +50%
poly_caddq 3s 3s +0%
poly_caddq_native_aarch64 3s 2s +50%
poly_chknorm_native 3s 3s +0%
poly_chknorm_native_aarch64 3s 3s +0%
poly_decompose 3s 5s -40%
poly_ntt 3s 3s +0%
poly_ntt_c 3s 2s +50%
poly_permute_bitrev_to_custom_optional_native 3s 3s +0%
poly_pointwise_montgomery_native 3s 3s +0%
poly_uniform_gamma1_4x 3s 4s -25%
poly_use_hint_c 3s 3s +0%
polyt0_pack 3s 6s -50%
polyveck_unpack_eta 3s 4s -25%
polyz_unpack 3s 3s +0%
power2round 3s 1s +200%
rej_uniform_native_aarch64 3s - new
shake128_absorb 3s 2s +50%
shake128_squeeze 3s 2s +50%
shake128x4_squeezeblocks 3s 2s +50%
shake256_release 3s 2s +50%
shake256_squeeze 3s 3s +0%
shake256x4_squeezeblocks 3s 2s +50%
sig_unpack_hints 3s 3s +0%
sign_signature 3s 5s -40%
sign_verify_extmu 3s 3s +0%
sk_s2hat_get_poly 3s 2s +50%
unpack_sk_s1hat 3s 2s +50%
yvec_get_poly 3s 5s -40%
yvec_init 3s 3s +0%
fqscale 2s 3s -33%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 3s -33%
keccak_squeeze 2s 2s +0%
keccakf1600_extract_bytes (big endian) 2s 4s -50%
keccakf1600x4_permute 2s 3s -33%
mld_ct_abs_i32 2s 1s +100%
mld_ct_cmask_neg_i32 2s 2s +0%
mld_ct_get_optblocker_i64 2s 1s +100%
mld_ct_sel_int32 2s 3s -33%
mld_value_barrier_u32 2s 3s -33%
mld_value_barrier_u8 2s 1s +100%
nttunpack_native_x86_64 2s 2s +0%
pack_sig_c 2s 3s -33%
pack_sk_s1 2s 2s +0%
poly_chknorm 2s 3s -33%
poly_permute_bitrev_to_custom_optional 2s 3s -33%
poly_reduce 2s 1s +100%
poly_sub 2s 1s +100%
polyt1_pack 2s 2s +0%
polyvec_matrix_pointwise_montgomery_row 2s 2s +0%
polyveck_pack_eta 2s 1s +100%
polyveck_reduce 2s 2s +0%
polyvecl_uniform_gamma1_serial 2s 4s -50%
polyvecl_unpack_eta 2s 2s +0%
polyw1_pack 2s 1s +100%
polyz_unpack_19_native_aarch64 2s 3s -33%
shake128_finalize 2s 4s -50%
shake128_init 2s 3s -33%
shake128_release 2s 2s +0%
shake128x4_absorb_once 2s 4s -50%
shake256_absorb 2s 2s +0%
shake256_init 2s 3s -33%
sign_verify_pre_hash_shake256 2s 2s +0%
sk_t0hat_get_poly 2s 2s +0%
unpack_pk_t1 2s 2s +0%
keccak_f1600_x1_native_aarch64_v84a 1s 1s +0%
keccak_init 1s 4s -75%
keccakf1600_permute 1s 3s -67%
keccakf1600_xor_bytes 1s 2s -50%
mld_ct_get_optblocker_u32 1s 4s -75%
mld_polymat_expand_entry 1s 4s -75%
poly_decompose_88_native_aarch64 1s 2s -50%
polyz_unpack_17_native_aarch64 1s 6s -83%
shake256_finalize 1s 1s +0%
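
The Change column in these tables appears to be the relative difference between the Current and Previous runtimes. The following is a minimal sketch of that calculation, not the bot's actual code: the function name `format_change` and the rule of treating a missing previous value as "new" are assumptions for illustration. Because the displayed seconds are rounded, recomputing from the table can differ by a percent for some rows.

```python
from typing import Optional


def format_change(current_s: float, previous_s: Optional[float], decimals: int = 0) -> str:
    """Return the relative runtime change as a signed percentage string.

    `previous_s` is None when the proof has no earlier measurement,
    which the tables show as "-" in the Previous column.
    """
    if previous_s is None:
        return "new"
    change = (current_s - previous_s) / previous_s * 100
    return f"{change:+.{decimals}f}%"


# Rows taken from the ML-DSA-65 table above.
print(format_change(2060, 1914, decimals=1))  # TOTAL
print(format_change(333, 284))                # polyvecl_pointwise_acc_montgomery_c
print(format_change(1, 6))                    # polyz_unpack_17_native_aarch64
print(format_change(3, None))                 # rej_uniform_native_aarch64
```

Most proofs run for only a few seconds, so a one-second jitter already shows up as a change of tens of percent; the TOTAL row is the more meaningful indicator.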

@hanno-becker hanno-becker marked this pull request as ready for review May 21, 2026 11:19
@hanno-becker hanno-becker requested a review from a team as a code owner May 21, 2026 11:19
@hanno-becker hanno-becker requested a review from mkannwischer May 21, 2026 11:19
@github-actions github-actions Bot left a comment

Mac Mini (M1, 2020) benchmarks (opt)

Details
Benchmark suite Current: 6c13542 Previous: ec0cdd4 Ratio
ML-DSA-44 keypair 46535 cycles 46506 cycles 1.00
ML-DSA-44 sign 131058 cycles 131078 cycles 1.00
ML-DSA-44 verify 47345 cycles 47320 cycles 1.00
ML-DSA-65 keypair 81706 cycles 81693 cycles 1.00
ML-DSA-65 sign 215431 cycles 215418 cycles 1.00
ML-DSA-65 verify 79324 cycles 79309 cycles 1.00
ML-DSA-87 keypair 132411 cycles 132409 cycles 1.00
ML-DSA-87 sign 277428 cycles 277534 cycles 1.00
ML-DSA-87 verify 134236 cycles 134235 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

Mac Mini (M1, 2020) benchmarks (no-opt)

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 112746 cycles 112740 cycles 1.00
ML-DSA-44 sign 400901 cycles 400842 cycles 1.00
ML-DSA-44 verify 120128 cycles 120086 cycles 1.00
ML-DSA-65 keypair 192883 cycles 192877 cycles 1.00
ML-DSA-65 sign 649925 cycles 649964 cycles 1.00
ML-DSA-65 verify 192956 cycles 192956 cycles 1.00
ML-DSA-87 keypair 318782 cycles 318775 cycles 1.00
ML-DSA-87 sign 828588 cycles 828851 cycles 1.00
ML-DSA-87 verify 326650 cycles 326654 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Intel Xeon 4th gen (c7i)

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 43768 cycles 45391 cycles 0.96
ML-DSA-44 sign 133272 cycles 136216 cycles 0.98
ML-DSA-44 verify 45772 cycles 47321 cycles 0.97
ML-DSA-65 keypair 76387 cycles 78853 cycles 0.97
ML-DSA-65 sign 218619 cycles 223148 cycles 0.98
ML-DSA-65 verify 76486 cycles 77818 cycles 0.98
ML-DSA-87 keypair 124993 cycles 126383 cycles 0.99
ML-DSA-87 sign 277386 cycles 280003 cycles 0.99
ML-DSA-87 verify 122348 cycles 124086 cycles 0.99

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Intel Xeon 4th gen (c7i) (no-opt)

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 94221 cycles 94250 cycles 1.00
ML-DSA-44 sign 329262 cycles 329671 cycles 1.00
ML-DSA-44 verify 98725 cycles 98848 cycles 1.00
ML-DSA-65 keypair 161876 cycles 161842 cycles 1.00
ML-DSA-65 sign 539009 cycles 538466 cycles 1.00
ML-DSA-65 verify 160676 cycles 160405 cycles 1.00
ML-DSA-87 keypair 264143 cycles 264153 cycles 1.00
ML-DSA-87 sign 694078 cycles 694626 cycles 1.00
ML-DSA-87 verify 265819 cycles 265814 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

SpacemiT K1 8 (Banana Pi F3) benchmarks (no-opt)

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 820432 cycles 820764 cycles 1.00
ML-DSA-44 sign 3222145 cycles 3222715 cycles 1.00
ML-DSA-44 verify 917121 cycles 917496 cycles 1.00
ML-DSA-65 keypair 1391448 cycles 1391037 cycles 1.00
ML-DSA-65 sign 5243597 cycles 5232579 cycles 1.00
ML-DSA-65 verify 1466874 cycles 1464573 cycles 1.00
ML-DSA-87 keypair 2298784 cycles 2299791 cycles 1.00
ML-DSA-87 sign 6610495 cycles 6616416 cycles 1.00
ML-DSA-87 verify 2406162 cycles 2407836 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

AMD EPYC 3rd gen (c6a)

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 55563 cycles 58246 cycles 0.95
ML-DSA-44 sign 165532 cycles 168182 cycles 0.98
ML-DSA-44 verify 58065 cycles 58487 cycles 0.99
ML-DSA-65 keypair 96320 cycles 96874 cycles 0.99
ML-DSA-65 sign 267696 cycles 271854 cycles 0.98
ML-DSA-65 verify 96597 cycles 97497 cycles 0.99
ML-DSA-87 keypair 155692 cycles 164997 cycles 0.94
ML-DSA-87 sign 328179 cycles 339783 cycles 0.97
ML-DSA-87 verify 151858 cycles 154373 cycles 0.98

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Graviton2

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 112606 cycles 112584 cycles 1.00
ML-DSA-44 sign 355233 cycles 355058 cycles 1.00
ML-DSA-44 verify 117577 cycles 117447 cycles 1.00
ML-DSA-65 keypair 194491 cycles 194602 cycles 1.00
ML-DSA-65 sign 585379 cycles 585206 cycles 1.00
ML-DSA-65 verify 193298 cycles 193231 cycles 1.00
ML-DSA-87 keypair 321166 cycles 321492 cycles 1.00
ML-DSA-87 sign 749549 cycles 750302 cycles 1.00
ML-DSA-87 verify 318315 cycles 318520 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

AMD EPYC 3rd gen (c6a) (no-opt)

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 133850 cycles 133824 cycles 1.00
ML-DSA-44 sign 522666 cycles 523429 cycles 1.00
ML-DSA-44 verify 146999 cycles 147203 cycles 1.00
ML-DSA-65 keypair 223918 cycles 224049 cycles 1.00
ML-DSA-65 sign 853033 cycles 850408 cycles 1.00
ML-DSA-65 verify 233445 cycles 233150 cycles 1.00
ML-DSA-87 keypair 371837 cycles 373095 cycles 1.00
ML-DSA-87 sign 1073791 cycles 1075287 cycles 1.00
ML-DSA-87 verify 384334 cycles 385379 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

AMD EPYC 4th gen (c7a)

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 46880 cycles 47240 cycles 0.99
ML-DSA-44 sign 144149 cycles 146136 cycles 0.99
ML-DSA-44 verify 49896 cycles 50648 cycles 0.99
ML-DSA-65 keypair 82602 cycles 83469 cycles 0.99
ML-DSA-65 sign 229918 cycles 230072 cycles 1.00
ML-DSA-65 verify 83172 cycles 83481 cycles 1.00
ML-DSA-87 keypair 130909 cycles 132078 cycles 0.99
ML-DSA-87 sign 280817 cycles 283164 cycles 0.99
ML-DSA-87 verify 128828 cycles 130195 cycles 0.99

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Graviton4

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 67437 cycles 67230 cycles 1.00
ML-DSA-44 sign 201387 cycles 201509 cycles 1.00
ML-DSA-44 verify 70315 cycles 70229 cycles 1.00
ML-DSA-65 keypair 119444 cycles 119440 cycles 1.00
ML-DSA-65 sign 328182 cycles 328213 cycles 1.00
ML-DSA-65 verify 116781 cycles 116941 cycles 1.00
ML-DSA-87 keypair 196729 cycles 196651 cycles 1.00
ML-DSA-87 sign 425010 cycles 424774 cycles 1.00
ML-DSA-87 verify 193266 cycles 193028 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Intel Xeon 3rd gen (c6i)

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 61922 cycles 62711 cycles 0.99
ML-DSA-44 sign 191461 cycles 193467 cycles 0.99
ML-DSA-44 verify 66255 cycles 67225 cycles 0.99
ML-DSA-65 keypair 108162 cycles 112590 cycles 0.96
ML-DSA-65 sign 314656 cycles 319856 cycles 0.98
ML-DSA-65 verify 109273 cycles 112272 cycles 0.97
ML-DSA-87 keypair 172432 cycles 172667 cycles 1.00
ML-DSA-87 sign 383790 cycles 385139 cycles 1.00
ML-DSA-87 verify 172188 cycles 172209 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

AMD EPYC 4th gen (c7a) (no-opt)

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 118781 cycles 119396 cycles 0.99
ML-DSA-44 sign 448947 cycles 446583 cycles 1.01
ML-DSA-44 verify 128855 cycles 128844 cycles 1.00
ML-DSA-65 keypair 201858 cycles 202970 cycles 0.99
ML-DSA-65 sign 719584 cycles 719324 cycles 1.00
ML-DSA-65 verify 208524 cycles 207674 cycles 1.00
ML-DSA-87 keypair 333642 cycles 335008 cycles 1.00
ML-DSA-87 sign 914941 cycles 918169 cycles 1.00
ML-DSA-87 verify 341970 cycles 342415 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Intel Xeon 3rd gen (c6i)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-87 verify 178158 cycles 172209 cycles 1.03

This comment was automatically generated by workflow using github-action-benchmark.
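
The alert above can be reproduced from the reported cycle counts. The following is a minimal sketch that assumes the ratio is current cycles divided by previous cycles and that an alert fires when the unrounded ratio exceeds the 1.03 threshold; the function names are hypothetical and this is not the benchmark action's own code.

```python
def regression_ratio(current_cycles: int, previous_cycles: int) -> float:
    """Ratio of current to previous cycle counts (greater than 1 means slower)."""
    return current_cycles / previous_cycles


def is_regression(current_cycles: int, previous_cycles: int, threshold: float = 1.03) -> bool:
    """True when the unrounded ratio exceeds the alert threshold."""
    return regression_ratio(current_cycles, previous_cycles) > threshold


# (benchmark, current cycles, previous cycles) as reported in this thread.
rows = [
    ("c6i ML-DSA-87 verify (alert run)", 178158, 172209),
    ("Cortex-A72 opt ML-DSA-44 sign", 629787, 613159),
]
for name, current, previous in rows:
    ratio = regression_ratio(current, previous)
    print(f"{name}: ratio {ratio:.2f}, alert={is_regression(current, previous)}")
```

Both rows display as 1.03 once rounded to two decimals, but only the first exceeds the threshold before rounding. This would explain why a table row showing 1.03 does not always come with an alert.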

Contributor

@oqs-bot oqs-bot left a comment

Graviton2 (no-opt)

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 213751 cycles 212528 cycles 1.01
ML-DSA-44 sign 758285 cycles 758331 cycles 1.00
ML-DSA-44 verify 230133 cycles 229988 cycles 1.00
ML-DSA-65 keypair 378479 cycles 378899 cycles 1.00
ML-DSA-65 sign 1241045 cycles 1241831 cycles 1.00
ML-DSA-65 verify 372957 cycles 372984 cycles 1.00
ML-DSA-87 keypair 604331 cycles 603666 cycles 1.00
ML-DSA-87 sign 1582990 cycles 1581558 cycles 1.00
ML-DSA-87 verify 618690 cycles 618334 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Graviton4 (no-opt)

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 128404 cycles 128472 cycles 1.00
ML-DSA-44 sign 445324 cycles 444913 cycles 1.00
ML-DSA-44 verify 136666 cycles 136570 cycles 1.00
ML-DSA-65 keypair 220459 cycles 220085 cycles 1.00
ML-DSA-65 sign 718170 cycles 718725 cycles 1.00
ML-DSA-65 verify 220838 cycles 221154 cycles 1.00
ML-DSA-87 keypair 365445 cycles 365468 cycles 1.00
ML-DSA-87 sign 918949 cycles 917811 cycles 1.00
ML-DSA-87 verify 371017 cycles 371454 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Intel Xeon 3rd gen (c6i) (no-opt)

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 150916 cycles 153976 cycles 0.98
ML-DSA-44 sign 545488 cycles 559057 cycles 0.98
ML-DSA-44 verify 163644 cycles 166793 cycles 0.98
ML-DSA-65 keypair 255567 cycles 256142 cycles 1.00
ML-DSA-65 sign 887481 cycles 889069 cycles 1.00
ML-DSA-65 verify 262294 cycles 263111 cycles 1.00
ML-DSA-87 keypair 425948 cycles 426566 cycles 1.00
ML-DSA-87 sign 1144091 cycles 1150462 cycles 0.99
ML-DSA-87 verify 440785 cycles 439448 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

Arm Cortex-A76 (Raspberry Pi 5) benchmarks (opt)

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 112466 cycles 112471 cycles 1.00
ML-DSA-44 sign 354625 cycles 354319 cycles 1.00
ML-DSA-44 verify 117059 cycles 117101 cycles 1.00
ML-DSA-65 keypair 194532 cycles 194670 cycles 1.00
ML-DSA-65 sign 584264 cycles 584352 cycles 1.00
ML-DSA-65 verify 193237 cycles 193010 cycles 1.00
ML-DSA-87 keypair 320621 cycles 321273 cycles 1.00
ML-DSA-87 sign 748906 cycles 749948 cycles 1.00
ML-DSA-87 verify 317880 cycles 318693 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Graviton3

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 71572 cycles 71489 cycles 1.00
ML-DSA-44 sign 211568 cycles 211342 cycles 1.00
ML-DSA-44 verify 74829 cycles 74924 cycles 1.00
ML-DSA-65 keypair 125984 cycles 125905 cycles 1.00
ML-DSA-65 sign 347604 cycles 347998 cycles 1.00
ML-DSA-65 verify 123873 cycles 124044 cycles 1.00
ML-DSA-87 keypair 206240 cycles 206671 cycles 1.00
ML-DSA-87 sign 443131 cycles 447427 cycles 0.99
ML-DSA-87 verify 204465 cycles 204120 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Graviton3 (no-opt)

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 137889 cycles 137948 cycles 1.00
ML-DSA-44 sign 482110 cycles 481676 cycles 1.00
ML-DSA-44 verify 148803 cycles 148659 cycles 1.00
ML-DSA-65 keypair 241031 cycles 240730 cycles 1.00
ML-DSA-65 sign 785016 cycles 784965 cycles 1.00
ML-DSA-65 verify 240613 cycles 241049 cycles 1.00
ML-DSA-87 keypair 395048 cycles 395084 cycles 1.00
ML-DSA-87 sign 1005956 cycles 1004845 cycles 1.00
ML-DSA-87 verify 402645 cycles 403184 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

Arm Cortex-A76 (Raspberry Pi 5) benchmarks (no-opt)

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 212371 cycles 212425 cycles 1.00
ML-DSA-44 sign 756804 cycles 756359 cycles 1.00
ML-DSA-44 verify 229234 cycles 229083 cycles 1.00
ML-DSA-65 keypair 378769 cycles 378500 cycles 1.00
ML-DSA-65 sign 1240394 cycles 1240209 cycles 1.00
ML-DSA-65 verify 371908 cycles 371886 cycles 1.00
ML-DSA-87 keypair 602950 cycles 602448 cycles 1.00
ML-DSA-87 sign 1580764 cycles 1580047 cycles 1.00
ML-DSA-87 verify 619634 cycles 618603 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

Arm Cortex-A55 (Snapdragon 888) benchmarks (opt)

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 271362 cycles 266927 cycles 1.02
ML-DSA-44 sign 803267 cycles 797012 cycles 1.01
ML-DSA-44 verify 272085 cycles 268934 cycles 1.01
ML-DSA-65 keypair 465548 cycles 462006 cycles 1.01
ML-DSA-65 sign 1346356 cycles 1325140 cycles 1.02
ML-DSA-65 verify 452908 cycles 447051 cycles 1.01
ML-DSA-87 keypair 796067 cycles 789145 cycles 1.01
ML-DSA-87 sign 1814243 cycles 1808930 cycles 1.00
ML-DSA-87 verify 774654 cycles 768055 cycles 1.01

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

Arm Cortex-A55 (Snapdragon 888) benchmarks (no-opt)

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 458869 cycles 456127 cycles 1.01
ML-DSA-44 sign 2126957 cycles 2116401 cycles 1.00
ML-DSA-44 verify 552019 cycles 548507 cycles 1.01
ML-DSA-65 keypair 770502 cycles 766953 cycles 1.00
ML-DSA-65 sign 3470118 cycles 3454812 cycles 1.00
ML-DSA-65 verify 855533 cycles 851822 cycles 1.00
ML-DSA-87 keypair 1257950 cycles 1239858 cycles 1.01
ML-DSA-87 sign 4350714 cycles 4301107 cycles 1.01
ML-DSA-87 verify 1376864 cycles 1364841 cycles 1.01

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

Arm Cortex-A72 (Raspberry Pi 4) benchmarks (opt)

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 220320 cycles 225607 cycles 0.98
ML-DSA-44 sign 629787 cycles 613159 cycles 1.03
ML-DSA-44 verify 218806 cycles 221594 cycles 0.99
ML-DSA-65 keypair 381297 cycles 390520 cycles 0.98
ML-DSA-65 sign 981110 cycles 1011825 cycles 0.97
ML-DSA-65 verify 362378 cycles 366404 cycles 0.99
ML-DSA-87 keypair 643194 cycles 647780 cycles 0.99
ML-DSA-87 sign 1333643 cycles 1330441 cycles 1.00
ML-DSA-87 verify 622584 cycles 623332 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Arm Cortex-A72 (Raspberry Pi 4) benchmarks (opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-87 keypair 672194 cycles 647780 cycles 1.04
ML-DSA-87 sign 1388591 cycles 1330441 cycles 1.04

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

Arm Cortex-A72 (Raspberry Pi 4) benchmarks (no-opt)

Details
Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 keypair 311539 cycles 313248 cycles 0.99
ML-DSA-44 sign 1218975 cycles 1165859 cycles 1.05
ML-DSA-44 verify 357865 cycles 340574 cycles 1.05
ML-DSA-65 keypair 577066 cycles 577106 cycles 1.00
ML-DSA-65 sign 1946955 cycles 1997418 cycles 0.97
ML-DSA-65 verify 545442 cycles 555863 cycles 0.98
ML-DSA-87 keypair 865804 cycles 881765 cycles 0.98
ML-DSA-87 sign 2425492 cycles 2508825 cycles 0.97
ML-DSA-87 verify 893414 cycles 921818 cycles 0.97

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@mkannwischer mkannwischer left a comment

Thanks @hanno-becker. LGTM.

@mkannwischer mkannwischer merged commit ec0cdd4 into main May 21, 2026
493 checks passed
@mkannwischer mkannwischer deleted the keccak_stack_align branch May 21, 2026 13:20
@hanno-becker hanno-becker added the x86_64 and enhancement labels May 22, 2026
@github-actions github-actions Bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Arm Cortex-A72 (Raspberry Pi 4) benchmarks (no-opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: 6c13542 Previous: cbc80f6 Ratio
ML-DSA-44 sign 1218975 cycles 1165859 cycles 1.05
ML-DSA-44 verify 357865 cycles 340574 cycles 1.05

This comment was automatically generated by workflow using github-action-benchmark.
