-
Notifications
You must be signed in to change notification settings - Fork 54
Add ppc64le backend (supports p8 and above architectures) #1648
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
0e13d31
8224a81
ba5f7b7
0ca9e89
3da8401
8855d0d
202a94c
78f2037
8ae2ebf
ca4fd58
d41c5c4
d555371
ff708a1
4ac67f0
bad6a0e
26d1717
e246cc7
7921977
1289fe4
f3705dd
47024dd
ca81bf5
6899543
a122b2a
d31af68
fca498b
6b242ce
3442786
12036a9
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| [//]: # (SPDX-License-Identifier: CC-BY-4.0) | ||
|
|
||
| # ppc64le backend (little endian) | ||
|
|
||
| This directory contains a native backend for little endian POWER 9 (ppc64le) and above systems. | ||
| Or, Power systems supporting ISA 2.07 and above. | ||
|
|
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,77 @@ | ||
| /* | ||
| * Copyright (c) The mlkem-native project authors | ||
| * SPDX-License-Identifier: Apache-2.0 OR ISC OR MIT | ||
| */ | ||
|
|
||
| #ifndef MLK_DEV_PPC64LE_META_H | ||
| #define MLK_DEV_PPC64LE_META_H | ||
|
|
||
| /* Identifier for this backend so that source and assembly files | ||
| * in the build can be appropriately guarded. */ | ||
| #define MLK_ARITH_BACKEND_PPC64LE_DEFAULT | ||
|
|
||
| #define MLK_ARITH_BACKEND_NAME PPC64LE_DEFAULT | ||
|
|
||
| /* Set of primitives that this backend replaces */ | ||
| #define MLK_USE_NATIVE_NTT | ||
| #define MLK_USE_NATIVE_INTT | ||
| #define MLK_USE_NATIVE_POLY_REDUCE | ||
| #define MLK_USE_NATIVE_POLY_TOMONT | ||
|
|
||
| #if !defined(__ASSEMBLER__) | ||
| #include <string.h> | ||
| #include "../../common.h" | ||
| #include "../../params.h" | ||
| #include "../api.h" | ||
| #include "src/arith_native_ppc64le.h" | ||
|
|
||
| MLK_MUST_CHECK_RETURN_VALUE | ||
| static MLK_INLINE int mlk_ntt_native(int16_t data[MLKEM_N]) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Suggested replacement (matching with AVX2/NEON, should be applied to the other functions in this file as well): /* in mlkem/src/sys.h: add MLK_SYS_CAP_PPC_ISA_2_07 to mlk_sys_cap */
MLK_MUST_CHECK_RETURN_VALUE
static MLK_INLINE int mlk_ntt_native(int16_t data[MLKEM_N])
{
if (!mlk_sys_check_capability(MLK_SYS_CAP_PPC_ISA_2_07))
return MLK_NATIVE_FUNC_FALLBACK;
mlk_ntt_ppc(data, mlk_ppc_qdata);
return MLK_NATIVE_FUNC_SUCCESS;
}The default Once the capability check is centralised on
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @bhess I disagreed with you. The purpose of that I want it to check the capability is not just checking the ISA 2.07 but also to be able to added different architecture support in the future like p10 specific or p11 etc. So, I'll be able to re-assign function pointer to a different entry point if needed. So, mlk_ntt_ppc will be a function pointer to a different entry depending on the architecture. That's for future arch support. I can remove this check now. I just want to get the implementation merged quickly because we have been over this implementation for months if not 6 months.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok sounds good, I don't think runtime checks are a blocker for this PR, so removing them seems fine and can be iterated later.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think we want a function pointer based dispatch; unless there's a good reason to do it otherwise, dispatch should follow the pattern of checking architecture flags and calling the correct function directly.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I agree. For this PR, I’m in favor of removing the checks. We can defer runtime capability checks to a later iteration if we need to support multiple architectures/ISA levels in one build. |
||
| { | ||
| #if defined(__POWER8_VECTOR__) | ||
| mlk_ntt_ppc_asm(data, mlk_ppc_qdata); | ||
| return MLK_NATIVE_FUNC_SUCCESS; | ||
| #else | ||
| (void)data; | ||
| return MLK_NATIVE_FUNC_FALLBACK; | ||
| #endif | ||
| } | ||
|
|
||
| MLK_MUST_CHECK_RETURN_VALUE | ||
| static MLK_INLINE int mlk_intt_native(int16_t data[MLKEM_N]) | ||
| { | ||
| #if defined(__POWER8_VECTOR__) | ||
| mlk_intt_ppc_asm(data, mlk_ppc_qdata); | ||
| return MLK_NATIVE_FUNC_SUCCESS; | ||
| #else | ||
| (void)data; | ||
| return MLK_NATIVE_FUNC_FALLBACK; | ||
| #endif | ||
| } | ||
|
|
||
| MLK_MUST_CHECK_RETURN_VALUE | ||
| static MLK_INLINE int mlk_poly_reduce_native(int16_t data[MLKEM_N]) | ||
| { | ||
| #if defined(__POWER8_VECTOR__) | ||
| mlk_reduce_ppc_asm(data, mlk_ppc_qdata); | ||
| return MLK_NATIVE_FUNC_SUCCESS; | ||
| #else | ||
| (void)data; | ||
| return MLK_NATIVE_FUNC_FALLBACK; | ||
| #endif | ||
| } | ||
|
|
||
| MLK_MUST_CHECK_RETURN_VALUE | ||
| static MLK_INLINE int mlk_poly_tomont_native(int16_t data[MLKEM_N]) | ||
| { | ||
| #if defined(__POWER8_VECTOR__) | ||
| mlk_poly_tomont_ppc_asm(data, mlk_ppc_qdata); | ||
| return MLK_NATIVE_FUNC_SUCCESS; | ||
| #else | ||
| (void)data; | ||
| return MLK_NATIVE_FUNC_FALLBACK; | ||
| #endif | ||
| } | ||
| #endif /* !__ASSEMBLER__ */ | ||
|
|
||
| #endif /* !MLK_DEV_PPC64LE_META_H */ | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This backend should be enabled in the cross tests in CI: See ebbaf3e.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It may be worthwhile to explicitly pass |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,24 @@ | ||
| /* | ||
| * Copyright (c) The mlkem-native project authors | ||
| * SPDX-License-Identifier: Apache-2.0 OR ISC OR MIT | ||
| */ | ||
| #ifndef MLK_DEV_PPC64LE_SRC_ARITH_NATIVE_PPC64LE_H | ||
| #define MLK_DEV_PPC64LE_SRC_ARITH_NATIVE_PPC64LE_H | ||
|
|
||
| #include <stdint.h> | ||
| #include "../../../common.h" | ||
| #include "consts.h" | ||
|
|
||
| #define mlk_ntt_ppc_asm MLK_NAMESPACE(ntt_ppc_asm) | ||
| void mlk_ntt_ppc_asm(int16_t *, const int16_t *); | ||
|
|
||
| #define mlk_intt_ppc_asm MLK_NAMESPACE(intt_ppc_asm) | ||
| void mlk_intt_ppc_asm(int16_t *, const int16_t *); | ||
|
|
||
| #define mlk_reduce_ppc_asm MLK_NAMESPACE(reduce_ppc_asm) | ||
| void mlk_reduce_ppc_asm(int16_t *r, const int16_t *); | ||
|
|
||
| #define mlk_poly_tomont_ppc_asm MLK_NAMESPACE(poly_tomont_ppc_asm) | ||
| void mlk_poly_tomont_ppc_asm(int16_t *, const int16_t *); | ||
|
|
||
| #endif /* !MLK_DEV_PPC64LE_SRC_ARITH_NATIVE_PPC64LE_H */ |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,95 @@ | ||
| /* | ||
| * Copyright (c) The mlkem-native project authors | ||
| * SPDX-License-Identifier: Apache-2.0 OR ISC OR MIT | ||
| */ | ||
|
|
||
| #include "../../../common.h" | ||
|
|
||
| #if defined(MLK_ARITH_BACKEND_PPC64LE_DEFAULT) && \ | ||
| !defined(MLK_CONFIG_MULTILEVEL_NO_SHARED) && defined(__POWER8_VECTOR__) | ||
|
|
||
| #include "consts.h" | ||
|
|
||
| /* 7 groups of 8 base constants + 4 twiddle tables * 63 rows * 8 values */ | ||
| /* check-magic: 2072 == 7 * 8 + 4 * 63 * 8 */ | ||
| MLK_ALIGN MLK_INTERNAL_DATA_DEFINITION const int16_t mlk_ppc_qdata[2072] = { | ||
| /* -Q */ | ||
| /* check-magic: -3329 == -1 * MLKEM_Q */ | ||
| -3329, | ||
| -3329, | ||
| -3329, | ||
| -3329, | ||
| -3329, | ||
| -3329, | ||
| -3329, | ||
| -3329, | ||
| /* QINV */ | ||
| /* check-magic: -3327 == pow(MLKEM_Q,-1,2^16) */ | ||
| -3327, | ||
| -3327, | ||
| -3327, | ||
| -3327, | ||
| -3327, | ||
| -3327, | ||
| -3327, | ||
| -3327, | ||
| /* Q */ | ||
| 3329, | ||
| 3329, | ||
| 3329, | ||
| 3329, | ||
| 3329, | ||
| 3329, | ||
| 3329, | ||
| 3329, | ||
| /* check-magic: 20159 == round(2^26 / MLKEM_Q) */ | ||
| 20159, | ||
| 20159, | ||
| 20159, | ||
| 20159, | ||
| 20159, | ||
| 20159, | ||
| 20159, | ||
| 20159, | ||
| /* N^-1 in Montgomery form: pow(128,-1,MLKEM_Q) * 2^16 mod MLKEM_Q = 512. | ||
| * Multiplying by this via Barrett-fqmul scales INTT output by N^-1 and | ||
| * leaves it in Montgomery form (mlk_poly_invntt_tomont contract). */ | ||
| 512, | ||
| 512, | ||
| 512, | ||
| 512, | ||
| 512, | ||
| 512, | ||
| 512, | ||
| 512, | ||
| /* check-magic: 5040 == round((512 * 2**16 + MLKEM_Q) / MLKEM_Q) // 2 */ | ||
| /* Barrett twist of N^-1*R = round_to_even(N_INV_MONT * 2^16 / MLKEM_Q) / 2 | ||
| */ | ||
| 5040, | ||
| 5040, | ||
| 5040, | ||
| 5040, | ||
| 5040, | ||
| 5040, | ||
| 5040, | ||
| 5040, | ||
| /* check-magic: 1353 == pow(2, 32, MLKEM_Q) */ | ||
| 1353, | ||
| 1353, | ||
| 1353, | ||
| 1353, | ||
| 1353, | ||
| 1353, | ||
| 1353, | ||
| 1353, | ||
| /* zetas for NTT */ | ||
| #include "consts_ntt.inc" | ||
| /* zetas for invNTT */ | ||
| #include "consts_intt.inc" | ||
| /* twisted zetas for NTT (Barrett high-mul) */ | ||
| #include "consts_ntt_tw.inc" | ||
| /* twisted zetas for invNTT (Barrett high-mul) */ | ||
| #include "consts_intt_tw.inc" | ||
| }; | ||
| #endif /* MLK_ARITH_BACKEND_PPC64LE_DEFAULT && \ | ||
| !MLK_CONFIG_MULTILEVEL_NO_SHARED && __POWER8_VECTOR__ */ |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,32 @@ | ||
| /* | ||
| * Copyright (c) The mlkem-native project authors | ||
| * SPDX-License-Identifier: Apache-2.0 OR ISC OR MIT | ||
| */ | ||
|
|
||
| #ifndef MLK_DEV_PPC64LE_SRC_CONSTS_H | ||
| #define MLK_DEV_PPC64LE_SRC_CONSTS_H | ||
| #include "../../../common.h" | ||
|
|
||
| /* Offsets into the constant table */ | ||
| /* check-magic: off */ | ||
| #define MLK_PPC_NQ_OFFSET 0 | ||
| #define MLK_PPC_QINV_OFFSET 16 | ||
| #define MLK_PPC_Q_OFFSET 32 | ||
| #define MLK_PPC_C20159_OFFSET 48 | ||
| #define MLK_PPC_N_INV_OFFSET 64 | ||
| #define MLK_PPC_N_INV_TW_OFFSET 80 | ||
| #define MLK_PPC_C1353_OFFSET 96 | ||
| #define MLK_PPC_ZETA_NTT_OFFSET 112 | ||
| #define MLK_PPC_ZETA_INTT_OFFSET 1120 | ||
| #define MLK_PPC_ZETA_NTT_TW_OFFSET 2128 | ||
| #define MLK_PPC_ZETA_INTT_TW_OFFSET 3136 | ||
| /* check-magic: on */ | ||
|
|
||
| #ifndef __ASSEMBLER__ | ||
| #define mlk_ppc_qdata MLK_NAMESPACE(ppc_qdata) | ||
| /* 7 groups of 8 base constants + 4 twiddle tables * 63 rows * 8 values */ | ||
| /* check-magic: 2072 == 7 * 8 + 4 * 63 * 8 */ | ||
| MLK_INTERNAL_DATA_DECLARATION const int16_t mlk_ppc_qdata[2072]; | ||
| #endif /* !__ASSEMBLER__ */ | ||
|
|
||
| #endif /* !MLK_DEV_PPC64LE_SRC_CONSTS_H */ |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,77 @@ | ||
| /* | ||
| * Copyright (c) The mlkem-native project authors | ||
| * SPDX-License-Identifier: Apache-2.0 OR ISC OR MIT | ||
| */ | ||
|
|
||
| /* | ||
| * WARNING: This file is auto-generated from scripts/autogen | ||
| * in the mlkem-native repository. | ||
| * Do not modify it directly. | ||
| */ | ||
|
|
||
| /* Twiddle factors for the PPC64LE inverse NTT. | ||
| * See autogen for details. | ||
| */ | ||
| -394, -394, -1175, -1175, -1219, -1219, 885, 885, | ||
| 1212, 1212, 1029, 1029, -1607, -1607, -1455, -1455, | ||
| -1179, -1179, 886, 886, 1143, 1143, -554, -554, | ||
| 1092, 1092, 1026, 1026, -525, -525, 403, 403, | ||
| 561, 561, -735, -735, -1230, -1230, -863, -863, | ||
| 319, 319, 757, 757, 1063, 1063, -556, -556, | ||
| -780, -780, 1645, 1645, 375, 375, -1239, -1239, | ||
| -1031, -1031, -109, -109, 1584, 1584, -1292, -1292, | ||
| -992, -992, 641, 641, 733, 733, 268, 268, | ||
| -1021, -1021, -941, -941, 939, 939, -892, -892, | ||
| 952, 952, -642, -642, -1482, -1482, 1461, 1461, | ||
| 1651, 1651, -1540, -1540, -1626, -1626, -540, -540, | ||
| -1173, -1173, -279, -279, 756, 756, -314, -314, | ||
| -667, -667, 233, 233, 1409, 1409, -48, -48, | ||
| 723, 723, 1100, 1100, 1637, 1637, -1041, -1041, | ||
| -568, -568, -680, -680, 17, 17, 583, 583, | ||
| 1227, 1227, 1227, 1227, 910, 910, 910, 910, | ||
| -855, -855, -855, -855, -219, -219, -219, -219, | ||
| 1481, 1481, 1481, 1481, 648, 648, 648, 648, | ||
| -682, -682, -682, -682, -712, -712, -712, -712, | ||
| 1534, 1534, 1534, 1534, -927, -927, -927, -927, | ||
| 1438, 1438, 1438, 1438, -461, -461, -461, -461, | ||
| 807, 807, 807, 807, 452, 452, 452, 452, | ||
| -1010, -1010, -1010, -1010, 1435, 1435, 1435, 1435, | ||
| 1320, 1320, 1320, 1320, -1414, -1414, -1414, -1414, | ||
| -464, -464, -464, -464, 33, 33, 33, 33, | ||
| -816, -816, -816, -816, 632, 632, 632, 632, | ||
| 650, 650, 650, 650, -1352, -1352, -1352, -1352, | ||
| -1052, -1052, -1052, -1052, -1274, -1274, -1274, -1274, | ||
| 1197, 1197, 1197, 1197, -1025, -1025, -1025, -1025, | ||
| -76, -76, -76, -76, -1573, -1573, -1573, -1573, | ||
| 289, 289, 289, 289, 331, 331, 331, 331, | ||
| 821, 821, 821, 821, 821, 821, 821, 821, | ||
| -1355, -1355, -1355, -1355, -1355, -1355, -1355, -1355, | ||
| -450, -450, -450, -450, -450, -450, -450, -450, | ||
| -936, -936, -936, -936, -936, -936, -936, -936, | ||
| -447, -447, -447, -447, -447, -447, -447, -447, | ||
| 535, 535, 535, 535, 535, 535, 535, 535, | ||
| -1235, -1235, -1235, -1235, -1235, -1235, -1235, -1235, | ||
| 1426, 1426, 1426, 1426, 1426, 1426, 1426, 1426, | ||
| 1333, 1333, 1333, 1333, 1333, 1333, 1333, 1333, | ||
| -1089, -1089, -1089, -1089, -1089, -1089, -1089, -1089, | ||
| 56, 56, 56, 56, 56, 56, 56, 56, | ||
| -283, -283, -283, -283, -283, -283, -283, -283, | ||
| 1476, 1476, 1476, 1476, 1476, 1476, 1476, 1476, | ||
| 1339, 1339, 1339, 1339, 1339, 1339, 1339, 1339, | ||
| -882, -882, -882, -882, -882, -882, -882, -882, | ||
| 296, 296, 296, 296, 296, 296, 296, 296, | ||
| -1583, -1583, -1583, -1583, -1583, -1583, -1583, -1583, | ||
| 569, 569, 569, 569, 569, 569, 569, 569, | ||
| -69, -69, -69, -69, -69, -69, -69, -69, | ||
| -543, -543, -543, -543, -543, -543, -543, -543, | ||
| 797, 797, 797, 797, 797, 797, 797, 797, | ||
| 193, 193, 193, 193, 193, 193, 193, 193, | ||
| -1410, -1410, -1410, -1410, -1410, -1410, -1410, -1410, | ||
| 1062, 1062, 1062, 1062, 1062, 1062, 1062, 1062, | ||
| 848, 848, 848, 848, 848, 848, 848, 848, | ||
| -1432, -1432, -1432, -1432, -1432, -1432, -1432, -1432, | ||
| 630, 630, 630, 630, 630, 630, 630, 630, | ||
| -687, -687, -687, -687, -687, -687, -687, -687, | ||
| -40, -40, -40, -40, -40, -40, -40, -40, | ||
| -749, -749, -749, -749, -749, -749, -749, -749, | ||
| -1600, -1600, -1600, -1600, -1600, -1600, -1600, -1600, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comment https://github.com/pq-code-package/mlkem-native/pull/1648/changes#r3187159006 regarding the CI wasn't addressed.
Could you please extend the cross ppc64le CI, so that is exercises the new backend.
Also to test the guard, we should also add a test for a platform not supporting the power 8 vector instructions.
This patch should do the job:
ppc64le-power7-ci.patch