Add ppc64le backend (supports p8 and above architectures) [full CI]#1677
mkannwischer wants to merge 29 commits into
Conversation
Mac Mini (M1, 2020) benchmarks
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 12319 cycles | 12319 cycles | 1 |
| ML-KEM-512 encaps | 14997 cycles | 14997 cycles | 1 |
| ML-KEM-512 decaps | 19549 cycles | 19551 cycles | 1.00 |
| ML-KEM-768 keypair | 21263 cycles | 21264 cycles | 1.00 |
| ML-KEM-768 encaps | 23873 cycles | 23874 cycles | 1.00 |
| ML-KEM-768 decaps | 30417 cycles | 30425 cycles | 1.00 |
| ML-KEM-1024 keypair | 30328 cycles | 30327 cycles | 1.00 |
| ML-KEM-1024 encaps | 34573 cycles | 34573 cycles | 1 |
| ML-KEM-1024 decaps | 44189 cycles | 44190 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
ppc64le (POWER10) benchmarks
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 38059 cycles | 59591 cycles | 0.64 |
| ML-KEM-512 encaps | 43668 cycles | 72335 cycles | 0.60 |
| ML-KEM-512 decaps | 53803 cycles | 92227 cycles | 0.58 |
| ML-KEM-768 keypair | 67131 cycles | 97410 cycles | 0.69 |
| ML-KEM-768 encaps | 76341 cycles | 114292 cycles | 0.67 |
| ML-KEM-768 decaps | 90588 cycles | 139751 cycles | 0.65 |
| ML-KEM-1024 keypair | 108574 cycles | 151020 cycles | 0.72 |
| ML-KEM-1024 encaps | 119069 cycles | 169141 cycles | 0.70 |
| ML-KEM-1024 decaps | 137495 cycles | 200776 cycles | 0.68 |
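
The Ratio column appears to be the current cycle count divided by the previous one, so a value below 1 means the current commit is faster. A minimal sketch that recomputes it for three ppc64le rows; the function name `ratio` and the `rows` layout are illustrative assumptions, not part of the benchmark tooling:

```python
# Cycle counts copied from the ppc64le (POWER10) results: (current, previous).
rows = {
    "ML-KEM-512 keypair": (38059, 59591),
    "ML-KEM-768 keypair": (67131, 97410),
    "ML-KEM-1024 keypair": (108574, 151020),
}


def ratio(current: int, previous: int) -> float:
    """Current/previous cycle ratio; below 1.0 means the current commit is faster."""
    return current / previous


for name, (cur, prev) in rows.items():
    r = ratio(cur, prev)
    # 1 / r expresses the same change as a speedup factor.
    print(f"{name}: ratio {r:.2f}, speedup {1 / r:.2f}x")
```

Rounded to two decimals, the recomputed ratios agree with the reported 0.64, 0.69 and 0.72.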
Intel Xeon 4th gen (c7i)
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 12048 cycles | 12038 cycles | 1.00 |
| ML-KEM-512 encaps | 13632 cycles | 13787 cycles | 0.99 |
| ML-KEM-512 decaps | 17778 cycles | 17801 cycles | 1.00 |
| ML-KEM-768 keypair | 21266 cycles | 21014 cycles | 1.01 |
| ML-KEM-768 encaps | 22146 cycles | 22184 cycles | 1.00 |
| ML-KEM-768 decaps | 28443 cycles | 28329 cycles | 1.00 |
| ML-KEM-1024 keypair | 29577 cycles | 29959 cycles | 0.99 |
| ML-KEM-1024 encaps | 31745 cycles | 31722 cycles | 1.00 |
| ML-KEM-1024 decaps | 39476 cycles | 39346 cycles | 1.00 |
oqs-bot
left a comment
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Intel Xeon 4th gen (c7i)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
| Benchmark suite | Current: 5f3d329 | Previous: 2ee902c | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 12044 cycles | 9751 cycles | 1.24 |
| ML-KEM-512 encaps | 13624 cycles | 11423 cycles | 1.19 |
| ML-KEM-512 decaps | 17783 cycles | 15570 cycles | 1.14 |
| ML-KEM-768 keypair | 21292 cycles | 16302 cycles | 1.31 |
| ML-KEM-768 encaps | 22015 cycles | 17954 cycles | 1.23 |
| ML-KEM-768 decaps | 28023 cycles | 23461 cycles | 1.19 |
| ML-KEM-1024 keypair | 29562 cycles | 22439 cycles | 1.32 |
| ML-KEM-1024 encaps | 31715 cycles | 24509 cycles | 1.29 |
| ML-KEM-1024 decaps | 39395 cycles | 32178 cycles | 1.22 |
AMD EPYC 3rd gen (c6a)
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 14207 cycles | 14376 cycles | 0.99 |
| ML-KEM-512 encaps | 15984 cycles | 16060 cycles | 1.00 |
| ML-KEM-512 decaps | 21538 cycles | 21627 cycles | 1.00 |
| ML-KEM-768 keypair | 25114 cycles | 24794 cycles | 1.01 |
| ML-KEM-768 encaps | 25658 cycles | 25550 cycles | 1.00 |
| ML-KEM-768 decaps | 33523 cycles | 33409 cycles | 1.00 |
| ML-KEM-1024 keypair | 34848 cycles | 37228 cycles | 0.94 |
| ML-KEM-1024 encaps | 36114 cycles | 37346 cycles | 0.97 |
| ML-KEM-1024 decaps | 47236 cycles | 46787 cycles | 1.01 |
AMD EPYC 4th gen (c7a)
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 12796 cycles | 12777 cycles | 1.00 |
| ML-KEM-512 encaps | 14285 cycles | 14269 cycles | 1.00 |
| ML-KEM-512 decaps | 19148 cycles | 19117 cycles | 1.00 |
| ML-KEM-768 keypair | 22525 cycles | 22412 cycles | 1.01 |
| ML-KEM-768 encaps | 23071 cycles | 23051 cycles | 1.00 |
| ML-KEM-768 decaps | 30094 cycles | 30064 cycles | 1.00 |
| ML-KEM-1024 keypair | 34224 cycles | 32997 cycles | 1.04 |
| ML-KEM-1024 encaps | 33002 cycles | 33104 cycles | 1.00 |
| ML-KEM-1024 decaps | 42405 cycles | 42483 cycles | 1.00 |
Intel Xeon 4th gen (c7i) (no-opt)
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 28337 cycles | 28237 cycles | 1.00 |
| ML-KEM-512 encaps | 36782 cycles | 36623 cycles | 1.00 |
| ML-KEM-512 decaps | 45398 cycles | 45123 cycles | 1.01 |
| ML-KEM-768 keypair | 46215 cycles | 46315 cycles | 1.00 |
| ML-KEM-768 encaps | 55787 cycles | 55593 cycles | 1.00 |
| ML-KEM-768 decaps | 69875 cycles | 69917 cycles | 1.00 |
| ML-KEM-1024 keypair | 70417 cycles | 70363 cycles | 1.00 |
| ML-KEM-1024 encaps | 82459 cycles | 82510 cycles | 1.00 |
| ML-KEM-1024 decaps | 99343 cycles | 99218 cycles | 1.00 |
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'AMD EPYC 4th gen (c7a)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-1024 keypair | 34224 cycles | 32997 cycles | 1.04 |
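
The alert condition can be sketched as a comparison of the current/previous ratio against the quoted threshold of 1.03. This is an illustration under that assumption; how github-action-benchmark performs the comparison internally is not shown in this thread, and `exceeds_threshold` is a hypothetical helper name:

```python
ALERT_THRESHOLD = 1.03  # threshold quoted in the alert text


def exceeds_threshold(current: int, previous: int,
                      threshold: float = ALERT_THRESHOLD) -> bool:
    """True when the current cycle count is worse than previous by more than the threshold."""
    return current / previous > threshold


# ML-KEM-1024 keypair on AMD EPYC 4th gen (c7a): the row that triggered the alert.
print(exceeds_threshold(34224, 32997))
# ML-KEM-768 keypair on the same machine: a small change that stays below the threshold.
print(exceeds_threshold(22525, 22412))
```

Only the first row crosses the threshold, which is consistent with the alert listing a single benchmark.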
Intel Xeon 3rd gen (c6i)
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 17603 cycles | 17547 cycles | 1.00 |
| ML-KEM-512 encaps | 19900 cycles | 19938 cycles | 1.00 |
| ML-KEM-512 decaps | 26420 cycles | 26450 cycles | 1.00 |
| ML-KEM-768 keypair | 31203 cycles | 31168 cycles | 1.00 |
| ML-KEM-768 encaps | 31989 cycles | 32415 cycles | 0.99 |
| ML-KEM-768 decaps | 41468 cycles | 41536 cycles | 1.00 |
| ML-KEM-1024 keypair | 43770 cycles | 43998 cycles | 0.99 |
| ML-KEM-1024 encaps | 45855 cycles | 46270 cycles | 0.99 |
| ML-KEM-1024 decaps | 58042 cycles | 58266 cycles | 1.00 |
oqs-bot
left a comment
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Intel Xeon 3rd gen (c6i)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
| Benchmark suite | Current: 5f3d329 | Previous: 2ee902c | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 17588 cycles | 16301 cycles | 1.08 |
| ML-KEM-512 encaps | 19896 cycles | 18736 cycles | 1.06 |
| ML-KEM-512 decaps | 26412 cycles | 25234 cycles | 1.05 |
| ML-KEM-768 keypair | 31190 cycles | 28649 cycles | 1.09 |
| ML-KEM-768 encaps | 31768 cycles | 30001 cycles | 1.06 |
| ML-KEM-1024 keypair | 43790 cycles | 37884 cycles | 1.16 |
| ML-KEM-1024 encaps | 45790 cycles | 40704 cycles | 1.12 |
| ML-KEM-1024 decaps | 58108 cycles | 54265 cycles | 1.07 |
Graviton4
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 17686 cycles | 17646 cycles | 1.00 |
| ML-KEM-512 encaps | 20642 cycles | 20608 cycles | 1.00 |
| ML-KEM-512 decaps | 27077 cycles | 27084 cycles | 1.00 |
| ML-KEM-768 keypair | 29981 cycles | 29899 cycles | 1.00 |
| ML-KEM-768 encaps | 32757 cycles | 32774 cycles | 1.00 |
| ML-KEM-768 decaps | 42007 cycles | 41962 cycles | 1.00 |
| ML-KEM-1024 keypair | 43716 cycles | 43745 cycles | 1.00 |
| ML-KEM-1024 encaps | 48773 cycles | 48719 cycles | 1.00 |
| ML-KEM-1024 decaps | 61379 cycles | 61386 cycles | 1.00 |
AMD EPYC 3rd gen (c6a) (no-opt)
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 40197 cycles | 40266 cycles | 1.00 |
| ML-KEM-512 encaps | 48368 cycles | 48417 cycles | 1.00 |
| ML-KEM-512 decaps | 62501 cycles | 62596 cycles | 1.00 |
| ML-KEM-768 keypair | 63804 cycles | 63800 cycles | 1.00 |
| ML-KEM-768 encaps | 74937 cycles | 74978 cycles | 1.00 |
| ML-KEM-768 decaps | 93413 cycles | 93631 cycles | 1.00 |
| ML-KEM-1024 keypair | 95310 cycles | 95102 cycles | 1.00 |
| ML-KEM-1024 encaps | 109384 cycles | 109294 cycles | 1.00 |
| ML-KEM-1024 decaps | 132137 cycles | 132065 cycles | 1.00 |
AMD EPYC 4th gen (c7a) (no-opt)
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 36602 cycles | 36810 cycles | 0.99 |
| ML-KEM-512 encaps | 43115 cycles | 43058 cycles | 1.00 |
| ML-KEM-512 decaps | 55703 cycles | 55671 cycles | 1.00 |
| ML-KEM-768 keypair | 58678 cycles | 58693 cycles | 1.00 |
| ML-KEM-768 encaps | 67624 cycles | 67471 cycles | 1.00 |
| ML-KEM-768 decaps | 84521 cycles | 84392 cycles | 1.00 |
| ML-KEM-1024 keypair | 89114 cycles | 89088 cycles | 1.00 |
| ML-KEM-1024 encaps | 99256 cycles | 99346 cycles | 1.00 |
| ML-KEM-1024 decaps | 120774 cycles | 120756 cycles | 1.00 |
Intel Xeon 3rd gen (c6i) (no-opt)
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 45749 cycles | 45722 cycles | 1.00 |
| ML-KEM-512 encaps | 54475 cycles | 54376 cycles | 1.00 |
| ML-KEM-512 decaps | 69855 cycles | 69830 cycles | 1.00 |
| ML-KEM-768 keypair | 74173 cycles | 74187 cycles | 1.00 |
| ML-KEM-768 encaps | 86050 cycles | 86041 cycles | 1.00 |
| ML-KEM-768 decaps | 106672 cycles | 106532 cycles | 1.00 |
| ML-KEM-1024 keypair | 112123 cycles | 112130 cycles | 1.00 |
| ML-KEM-1024 encaps | 124717 cycles | 124654 cycles | 1.00 |
| ML-KEM-1024 decaps | 150632 cycles | 150714 cycles | 1.00 |
Graviton3
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 18675 cycles | 18639 cycles | 1.00 |
| ML-KEM-512 encaps | 21889 cycles | 21878 cycles | 1.00 |
| ML-KEM-512 decaps | 28890 cycles | 28864 cycles | 1.00 |
| ML-KEM-768 keypair | 31630 cycles | 31545 cycles | 1.00 |
| ML-KEM-768 encaps | 34788 cycles | 34776 cycles | 1.00 |
| ML-KEM-768 decaps | 44839 cycles | 44778 cycles | 1.00 |
| ML-KEM-1024 keypair | 46069 cycles | 46079 cycles | 1.00 |
| ML-KEM-1024 encaps | 51492 cycles | 51492 cycles | 1 |
| ML-KEM-1024 decaps | 65005 cycles | 65023 cycles | 1.00 |
Graviton4 (no-opt)
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 35504 cycles | 35410 cycles | 1.00 |
| ML-KEM-512 encaps | 40175 cycles | 40114 cycles | 1.00 |
| ML-KEM-512 decaps | 51132 cycles | 51139 cycles | 1.00 |
| ML-KEM-768 keypair | 56800 cycles | 56670 cycles | 1.00 |
| ML-KEM-768 encaps | 64827 cycles | 65147 cycles | 1.00 |
| ML-KEM-768 decaps | 78931 cycles | 79294 cycles | 1.00 |
| ML-KEM-1024 keypair | 87846 cycles | 87857 cycles | 1.00 |
| ML-KEM-1024 encaps | 97109 cycles | 96871 cycles | 1.00 |
| ML-KEM-1024 decaps | 115956 cycles | 115822 cycles | 1.00 |
Arm Cortex-A76 (Raspberry Pi 5) benchmarks
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 28265 cycles | 28220 cycles | 1.00 |
| ML-KEM-512 encaps | 34157 cycles | 34107 cycles | 1.00 |
| ML-KEM-512 decaps | 44377 cycles | 44335 cycles | 1.00 |
| ML-KEM-768 keypair | 47618 cycles | 47614 cycles | 1.00 |
| ML-KEM-768 encaps | 53934 cycles | 53937 cycles | 1.00 |
| ML-KEM-768 decaps | 68340 cycles | 68365 cycles | 1.00 |
| ML-KEM-1024 keypair | 70248 cycles | 70248 cycles | 1 |
| ML-KEM-1024 encaps | 78733 cycles | 78728 cycles | 1.00 |
| ML-KEM-1024 decaps | 98416 cycles | 98444 cycles | 1.00 |
Graviton3 (no-opt)
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 39016 cycles | 38886 cycles | 1.00 |
| ML-KEM-512 encaps | 44562 cycles | 44589 cycles | 1.00 |
| ML-KEM-512 decaps | 56630 cycles | 56665 cycles | 1.00 |
| ML-KEM-768 keypair | 62456 cycles | 62296 cycles | 1.00 |
| ML-KEM-768 encaps | 71385 cycles | 72308 cycles | 0.99 |
| ML-KEM-768 decaps | 86856 cycles | 87700 cycles | 0.99 |
| ML-KEM-1024 keypair | 96224 cycles | 96159 cycles | 1.00 |
| ML-KEM-1024 encaps | 106363 cycles | 106136 cycles | 1.00 |
| ML-KEM-1024 decaps | 126811 cycles | 126585 cycles | 1.00 |
Graviton2
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 28238 cycles | 28274 cycles | 1.00 |
| ML-KEM-512 encaps | 34161 cycles | 34125 cycles | 1.00 |
| ML-KEM-512 decaps | 44340 cycles | 44382 cycles | 1.00 |
| ML-KEM-768 keypair | 47638 cycles | 47672 cycles | 1.00 |
| ML-KEM-768 encaps | 53920 cycles | 53906 cycles | 1.00 |
| ML-KEM-768 decaps | 68398 cycles | 68361 cycles | 1.00 |
| ML-KEM-1024 keypair | 70366 cycles | 70253 cycles | 1.00 |
| ML-KEM-1024 encaps | 78752 cycles | 78754 cycles | 1.00 |
| ML-KEM-1024 decaps | 98551 cycles | 98440 cycles | 1.00 |
SpacemiT K1 8 (Banana Pi F3) benchmarks
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 155506 cycles | 155497 cycles | 1.00 |
| ML-KEM-512 encaps | 163419 cycles | 163389 cycles | 1.00 |
| ML-KEM-512 decaps | 206655 cycles | 206624 cycles | 1.00 |
| ML-KEM-768 keypair | 249866 cycles | 249882 cycles | 1.00 |
| ML-KEM-768 encaps | 270396 cycles | 270411 cycles | 1.00 |
| ML-KEM-768 decaps | 332775 cycles | 332827 cycles | 1.00 |
| ML-KEM-1024 keypair | 395754 cycles | 395617 cycles | 1.00 |
| ML-KEM-1024 encaps | 423609 cycles | 422610 cycles | 1.00 |
| ML-KEM-1024 decaps | 507214 cycles | 506225 cycles | 1.00 |
Graviton2 (no-opt)
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 59230 cycles | 59139 cycles | 1.00 |
| ML-KEM-512 encaps | 68655 cycles | 68634 cycles | 1.00 |
| ML-KEM-512 decaps | 87379 cycles | 87351 cycles | 1.00 |
| ML-KEM-768 keypair | 95187 cycles | 95327 cycles | 1.00 |
| ML-KEM-768 encaps | 109224 cycles | 109878 cycles | 0.99 |
| ML-KEM-768 decaps | 134022 cycles | 134352 cycles | 1.00 |
| ML-KEM-1024 keypair | 146800 cycles | 148090 cycles | 0.99 |
| ML-KEM-1024 encaps | 162753 cycles | 163969 cycles | 0.99 |
| ML-KEM-1024 decaps | 194331 cycles | 195624 cycles | 0.99 |
Arm Cortex-A55 (Snapdragon 888) benchmarks
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 59765 cycles | 59775 cycles | 1.00 |
| ML-KEM-512 encaps | 67418 cycles | 67520 cycles | 1.00 |
| ML-KEM-512 decaps | 86122 cycles | 86168 cycles | 1.00 |
| ML-KEM-768 keypair | 97488 cycles | 97434 cycles | 1.00 |
| ML-KEM-768 encaps | 110849 cycles | 110991 cycles | 1.00 |
| ML-KEM-768 decaps | 138197 cycles | 138336 cycles | 1.00 |
| ML-KEM-1024 keypair | 155074 cycles | 154826 cycles | 1.00 |
| ML-KEM-1024 encaps | 172210 cycles | 172753 cycles | 1.00 |
| ML-KEM-1024 decaps | 209438 cycles | 208701 cycles | 1.00 |
CBMC Results (ML-KEM-1024): Full Results (191 proofs)
CBMC Results (ML-KEM-512): Full Results (191 proofs)
CBMC Results (ML-KEM-768): Full Results (191 proofs)
Arm Cortex-A72 (Raspberry Pi 4) benchmarks
| Benchmark suite | Current: 12036a9 | Previous: db75353 | Ratio |
|---|---|---|---|
| ML-KEM-512 keypair | 50711 cycles | 50835 cycles | 1.00 |
| ML-KEM-512 encaps | 59128 cycles | 58693 cycles | 1.01 |
| ML-KEM-512 decaps | 75104 cycles | 74946 cycles | 1.00 |
| ML-KEM-768 keypair | 86483 cycles | 86333 cycles | 1.00 |
| ML-KEM-768 encaps | 94967 cycles | 95399 cycles | 1.00 |
| ML-KEM-768 decaps | 117949 cycles | 119161 cycles | 0.99 |
| ML-KEM-1024 keypair | 129328 cycles | 130734 cycles | 0.99 |
| ML-KEM-1024 encaps | 141943 cycles | 143257 cycles | 0.99 |
| ML-KEM-1024 decaps | 174407 cycles | 173329 cycles | 1.01 |
This commit introduces separate proofs for:
- mlk_keccakf1600_permute_c()
- mlk_keccakf1600x4_extract_bytes_c()
- mlk_keccakf1600x4_xor_bytes_c()

For arithmetic functions that have a native implementation, we have 3 CBMC proofs:
1. Proof for the pure C implementation, named XXX_c()
2. Proof for the wrapper function on top of the C implementation
3. Proof for the wrapper function on top of the native function (with C fallback).

This commit separates the current proofs for these three functions to follow the above structure. For each function, the following steps are performed:
- Add the corresponding CBMC contract, copied from the wrapper function.
- Create a dedicated CBMC proof for the pure C implementation.
- Update the existing wrapper CBMC proof Makefiles by adding XXX_C to USE_FUNCTION_CONTRACTS, and apply the same change to the native proof configuration.

Signed-off-by: willieyz <willie.zhao@chelpis.com>
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
I recently switched from Chelpis to zeroRISC, meaning our MAINTAINERS.md is outdated. This commit removes affiliations and Discord identifiers from the file as they are unnecessary.
Signed-off-by: Matthias J. Kannwischer <matthias@zerorisc.com>
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Matthias J. Kannwischer <matthias@zerorisc.com> Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
…res and tested.
1. Ran scripts/autogen and scripts/lint on Mac, but not sure whether they run for ppc64le.
2. Ran simpasm on Red Hat Linux.
3. Added detailed comments on NTT and INTT implementations.
4. Used C type symbols to improve readability.
5. Fixed some typos.
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
The following tests were run on p10.
[09:28] danny@ltcden12-lp1 new_ppc64le_mlkem % ./scripts/tests func
INFO > Functional Test Compile (native no_opt): make func OPT=0 AUTO=1 -j40
INFO > Functional Test ML-KEM-512 (native no_opt): make run_func_512 -j40
INFO > Functional Test ML-KEM-768 (native no_opt): make run_func_768 -j40
INFO > Functional Test ML-KEM-1024 (native no_opt): make run_func_1024 -j40
INFO > Functional Test Compile (native opt): make func OPT=1 AUTO=1 -j40
INFO > Functional Test ML-KEM-512 (native opt): make run_func_512 -j40
INFO > Functional Test ML-KEM-768 (native opt): make run_func_768 -j40
INFO > Functional Test ML-KEM-1024 (native opt): make run_func_1024 -j40
All good!
[09:28] danny@ltcden12-lp1 new_ppc64le_mlkem % ./scripts/tests bench -c PERF
INFO > Benchmark Compile (native no_opt): make bench OPT=0 AUTO=1 CYCLES=PERF -j40
INFO > Benchmark ML-KEM-512 (native no_opt): make run_bench_512
INFO > Benchmark ML-KEM-512 (native no_opt): test/build/mlkem512/bin/bench_mlkem512
keypair cycles = 66982
encaps cycles = 78820
decaps cycles = 100923
percentile 1 10 20 30 40 50 60 70 80 90 99
keypair percentiles: 66438 66690 66791 66857 66920 66982 67043 67122 67218 67306 71905
encaps percentiles: 78322 78516 78618 78687 78752 78820 78878 78933 79012 79116 83825
decaps percentiles: 100427 100634 100733 100804 100869 100923 100985 101056 101131 101253 105852
INFO > Benchmark ML-KEM-768 (native no_opt): make run_bench_768
INFO > Benchmark ML-KEM-768 (native no_opt): test/build/mlkem768/bin/bench_mlkem768
keypair cycles = 111380
encaps cycles = 125891
decaps cycles = 154364
percentile 1 10 20 30 40 50 60 70 80 90 99
keypair percentiles: 110575 110914 111083 111192 111291 111380 111496 111617 111821 112414 116725
encaps percentiles: 125081 125403 125526 125655 125776 125891 125998 126122 126293 126771 131358
decaps percentiles: 153575 153870 154008 154131 154261 154364 154487 154630 154782 155313 159863
INFO > Benchmark ML-KEM-1024 (native no_opt): make run_bench_1024
INFO > Benchmark ML-KEM-1024 (native no_opt): test/build/mlkem1024/bin/bench_mlkem1024
keypair cycles = 166809
encaps cycles = 185315
decaps cycles = 220229
percentile 1 10 20 30 40 50 60 70 80 90 99
keypair percentiles: 165339 165995 166236 166435 166616 166809 167007 167200 167505 171058 175606
encaps percentiles: 183839 184563 184778 184951 185158 185315 185518 185744 186123 189637 192014
decaps percentiles: 218911 219430 219705 219841 220027 220229 220436 220673 221029 224484 226901
INFO > Benchmark Compile (native opt): make bench OPT=1 AUTO=1 CYCLES=PERF -j40
INFO > Benchmark ML-KEM-512 (native opt): make run_bench_512
INFO > Benchmark ML-KEM-512 (native opt): test/build/mlkem512/bin/bench_mlkem512
keypair cycles = 45750
encaps cycles = 50661
decaps cycles = 63561
percentile 1 10 20 30 40 50 60 70 80 90 99
keypair percentiles: 45248 45469 45546 45620 45690 45750 45806 45886 45954 46063 50703
encaps percentiles: 50192 50367 50468 50542 50600 50661 50710 50771 50858 50954 55652
decaps percentiles: 63091 63276 63381 63436 63497 63561 63623 63679 63743 63857 68437
INFO > Benchmark ML-KEM-768 (native opt): make run_bench_768
INFO > Benchmark ML-KEM-768 (native opt): test/build/mlkem768/bin/bench_mlkem768
keypair cycles = 79045
encaps cycles = 86455
decaps cycles = 103878
percentile 1 10 20 30 40 50 60 70 80 90 99
keypair percentiles: 78313 78578 78742 78847 78954 79045 79169 79285 79470 79978 84430
encaps percentiles: 85628 86009 86172 86272 86363 86455 86592 86711 86879 87292 92038
decaps percentiles: 103041 103399 103540 103676 103788 103878 103993 104104 104274 104736 109361
INFO > Benchmark ML-KEM-1024 (native opt): make run_bench_1024
INFO > Benchmark ML-KEM-1024 (native opt): test/build/mlkem1024/bin/bench_mlkem1024
keypair cycles = 124072
encaps cycles = 134500
decaps cycles = 157090
percentile 1 10 20 30 40 50 60 70 80 90 99
keypair percentiles: 122727 123259 123515 123720 123929 124072 124253 124527 125009 128466 133334
encaps percentiles: 133064 133681 133933 134129 134320 134500 134711 134933 135346 138753 141067
decaps percentiles: 155503 156261 156510 156694 156894 157090 157285 157605 158014 161592 166723
All good!
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
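In the benchmark log above, each headline `cycles` value coincides with the 50th-percentile entry (for example, 66982 for the no_opt ML-KEM-512 keypair), which suggests the median is what gets reported. The sketch below shows one common way to compute such percentiles, the nearest-rank method; this is an assumption for illustration, not necessarily what the benchmark binary does, and the sample list simply reuses the eleven keypair percentile values from the log as a stand-in for raw measurements:

```python
import math


def nearest_rank_percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples <= it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# Stand-in data: the no_opt ML-KEM-512 keypair percentile row from the log,
# used here as if it were a set of raw cycle measurements.
samples = [66438, 66690, 66791, 66857, 66920, 66982,
           67043, 67122, 67218, 67306, 71905]

median = nearest_rank_percentile(samples, 50)
print(f"median = {median}")
print(f"p99    = {nearest_rank_percentile(samples, 99)}")
```

Reporting the median rather than the mean keeps the headline number insensitive to the occasional slow run visible in the 99th-percentile column.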
Power Systems support ISA 2.07 and above.
2. Fixed typo, headers and return (MLK_MUST_CHECK_RETURN_VALUE).
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
(venv) [9:51][@MacBookPro] mlkem_test/ % ./scripts/autogen
✓ Generate citations (0.2s)
✓ Generate OQS META.yml files (0.0s)
– Generate SLOTHY optimized assembly (0.0s)
✓ Check assembly register aliases (0.1s)
✓ Check assembly loop labels (0.1s)
✓ Normalize assembly macro syntax (0.3s)
✓ Generate zeta and lookup tables (0.0s)
✓ Generate HOL Light assembly (1.7s)
✓ Synchronize backends (1.4s)
✓ Generate header guards (0.1s)
✓ Complete final backend synchronization (0.6s)
– Update HOL Light bytecode (0.0s)
✓ Generate monolithic source files (1.6s)
✓ Generate undefs (1.4s)
✓ Generate test configs (0.0s)
✓ Check macro typos (0.3s)
/Users/danny/my_repo/ws/docker_mlkem/mlkem_test/./scripts/autogen:500: PyparsingDeprecationWarning: 'parseString' deprecated - use 'parse_string'
  exp = self.parser.parseString(exp, parseAll=True).as_list()[0]
✓ Generate preprocessor comments (1.6s)
✓ Format files (2.2s)
updated BIBLIOGRAPHY.md
updated mlkem/src/native/ppc64le/src/reduce.S
updated mlkem/src/native/ppc64le/src/poly_tomont.S
updated dev/ppc64le/src/ntt_ppc.S
updated dev/ppc64le/src/reduce.S
updated mlkem/src/native/ppc64le/src/intt_ppc.S
updated dev/ppc64le/src/poly_tomont.S
updated mlkem/mlkem_native.c
updated integration/liboqs/config_ppc64le.h
updated dev/ppc64le/src/intt_ppc.S
updated mlkem/src/native/ppc64le/src/ntt_ppc.S
updated mlkem/mlkem_native_asm.S
✓ Finalize and write files (0.0s)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:11
Done ✓
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
…added ppc64le assembly files and consts.c in mlkem_native_asm.S and mlkem_native.c since autogen did not add these files for ppc64le.
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Basil Hess <bhe@zurich.ibm.com> Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Basil Hess <bhe@zurich.ibm.com> Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
and fall back to default implementation if not. Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
1. Fixed typos and minor comments.
2. Removed IZETA_NTT_OFFSET127 in consts_intt.inc.
3. Fixed _asm suffix in backend name.
4. Fixed MLK_PPC_ prefix in constants.
Manually fixed the backend names of all assembly files in mlkem/src/native/ppc64le/src/ to run the test, since simpasm cannot be run properly for ppc64le in my env.
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
…s correct. ./scripts/tests func runs fine. Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
1. Fixed wrong path in ML-KEM-768_META.yml.
2. Added __POWER8_VECTOR__ guard in asm files and meta.h.
3. Fixed capitalization in asm macros and misc.
4. Renamed asm files to _ppc_asm.S.
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
… twisted zetas. Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
ppc64le: address review comments from mkannwischer
Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
- ruff format scripts/autogen (formatting fix)
- Add check-magic annotation for array size 2072 in consts.c and consts.h (7 groups of 8 base constants + 4 twiddle tables * 63 rows * 8 values)
Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
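The array size quoted in this commit message can be checked with a few lines of arithmetic. The variable names below are illustrative only and do not correspond to identifiers in consts.c or consts.h:

```python
# Breakdown stated in the commit message:
# 7 groups of 8 base constants + 4 twiddle tables * 63 rows * 8 values.
GROUPS, BASE_PER_GROUP = 7, 8
TABLES, ROWS, VALUES_PER_ROW = 4, 63, 8

base_constants = GROUPS * BASE_PER_GROUP            # 7 * 8
twiddle_values = TABLES * ROWS * VALUES_PER_ROW     # 4 * 63 * 8
total = base_constants + twiddle_values

print(base_constants, twiddle_values, total)
```

The two parts come to 56 and 2016, which sum to the annotated size of 2072.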
…power8 (nix libc is compiled for power8 and otherwise causes illegal instructions). Avoids unused data parameter errors in the fallback code path.
Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
Signed-off-by: Basil Hess <bhe@zurich.ibm.com>
Running the full CI on #1648