Artifact Evaluation — FCCM 2026
This repository contains the complete artifact for reproducing the two main experimental results presented in the paper:
| # | Result | Script | Output |
|---|---|---|---|
| 1 | FPGA resource utilization (Table) | build_all.py | utilization.csv |
| 2 | CKKS HW vs. SW timing (Figure) | run_all_tests.py | timing_results.pdf |
Pre-built FPGA bitstreams for all 10 configurations are included in
Pre_Built_Bitstreams/, so Result 2 can be reproduced without any
Xilinx synthesis tools — only the Kria KV260 board is needed.
ReFHE-NTT/
├── src/ # HLS source code (NTT kernel + modular arithmetic)
├── Build/
│ ├── HLS/ # Vitis HLS project (script.tcl, directives.tcl)
│ └── VIVADO/ # Vivado block designs
│ ├── vivado_proj.tcl # Mersenne template (300 MHz PL clock)
│ └── vivado_proj_b.tcl # Barrett template (143 MHz PL clock)
├── Pre_Built_Bitstreams/ # Pre-compiled bitstreams for all 10 configurations
│ ├── MERSENNE/ # NTT_M_64_{12..16}_wrapper.xsa
│ └── BARRETT/ # NTT_B_64_{12..16}_wrapper.xsa
├── Heaan_ckks/ # CKKS integration test (HW/SW co-design)
├── builds/ # Output directory for newly built bitstreams
│
├── configure.sh # Generate src/parameters.h for a configuration
├── Makefile # Top-level build orchestration (HLS + Vivado)
├── build_all.py # [Result 1] Build all configs, collect utilization
├── package_test.sh # Package test + bitstreams for device deployment
├── utilization.csv # [Result 1] Pre-generated utilization table
└── README.md # This file
The NTT kernel is parameterized by polynomial dimension (LOGN) and modular reduction strategy (Mersenne or Barrett):
| Parameter | Values | Description |
|---|---|---|
| LOGN | 12, 13, 14, 15, 16 | log2(N), polynomial ring dimension |
| MODE | mersenne, barrett | Modular reduction strategy |
| LOGQ | 64 | Prime bit-width (fixed) |
This yields 10 configurations (5 LOGN values x 2 modes). Each mode uses a different Vivado block design with a distinct PL clock frequency:
| Mode | HLS Clock Target | Vivado PL Frequency | Vivado TCL Template |
|---|---|---|---|
| Mersenne | 5 ns | 300 MHz | Build/VIVADO/vivado_proj.tcl |
| Barrett | 7 ns | 143 MHz | Build/VIVADO/vivado_proj_b.tcl |
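The configuration space described above can be enumerated in a few lines. The following is a minimal sketch and not the logic of build_all.py itself: the design-name pattern is inferred from the pre-built bitstream file names (NTT_M_64_{12..16}_wrapper.xsa), and `MODE_SETTINGS` and `enumerate_configs` are hypothetical names.

```python
from itertools import product

LOGN_VALUES = (12, 13, 14, 15, 16)
LOGQ = 64  # fixed prime bit-width

# Per-mode settings copied from the table above.
MODE_SETTINGS = {
    "mersenne": {"tag": "M", "hls_clock_ns": 5, "pl_mhz": 300,
                 "tcl": "Build/VIVADO/vivado_proj.tcl"},
    "barrett": {"tag": "B", "hls_clock_ns": 7, "pl_mhz": 143,
                "tcl": "Build/VIVADO/vivado_proj_b.tcl"},
}


def enumerate_configs():
    """Return one dict per (mode, LOGN) pair, 10 in total."""
    configs = []
    for mode, logn in product(MODE_SETTINGS, LOGN_VALUES):
        settings = MODE_SETTINGS[mode]
        configs.append({
            # Assumed naming, inferred from the pre-built .xsa file names.
            "design": f"NTT_{settings['tag']}_{LOGQ}_{logn}",
            "mode": mode,
            "logn": logn,
            "logq": LOGQ,
            "hls_clock_ns": settings["hls_clock_ns"],
            "pl_mhz": settings["pl_mhz"],
            "tcl": settings["tcl"],
        })
    return configs


if __name__ == "__main__":
    for cfg in enumerate_configs():
        print(cfg["design"], cfg["mode"], cfg["pl_mhz"], "MHz", cfg["tcl"])
```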
| Dependency | Version | Notes |
|---|---|---|
| AMD/Xilinx Vitis HLS | 2023.2 | HLS synthesis and IP export |
| AMD/Xilinx Vivado | 2024.2 | Block design, place & route |
| Python | >= 3.6 | Runs build_all.py |
| GNU Make | any | Build orchestration |
| bash | any | configure.sh and Makefile recipes |
| Linux x86_64 | any | At least 16 GB RAM recommended |
Building all 10 configurations takes several hours. A pre-generated
utilization.csv and pre-built bitstreams are provided.
| Dependency | Version | Notes |
|---|---|---|
| PYNQ SD image | 3.0+ | Base OS for the Kria board |
| XRT (Xilinx Runtime) | included in PYNQ | FPGA programming and buffer management |
| g++ | C++17 support | Compiles the test binary on-board |
| Python 3 | >= 3.6 | Runs run_all_tests.py |
| matplotlib | any | pip3 install matplotlib |
| numpy | any | pip3 install numpy |
| sudo access | — | Required for FPGA device access |
Several scripts reference environment-specific paths via Make variables. All have defaults that match common installation layouts, but must be overridden if your setup differs.
Host machine (top-level Makefile, used by build_all.py):
| Variable | Default | Purpose |
|---|---|---|
| VITIS_HLS_SETTINGS | /home/xilinx/Vitis_HLS/2023.2/settings64.sh | Vitis HLS environment |
| VIVADO_SETTINGS | /home/xilinx/Vivado/2024.2/settings64.sh | Vivado environment |
Override from the command line or environment:
make VITIS_HLS_SETTINGS=/opt/Xilinx/Vitis_HLS/2023.2/settings64.sh \
VIVADO_SETTINGS=/opt/Xilinx/Vivado/2024.2/settings64.sh \
LOGN=16 LOGQ=64 all
Or export before running build_all.py:
export VITIS_HLS_SETTINGS=/opt/Xilinx/Vitis_HLS/2023.2/settings64.sh
export VIVADO_SETTINGS=/opt/Xilinx/Vivado/2024.2/settings64.sh
python3 build_all.py
Kria device (heaan_test/Makefile, generated by package_test.sh):
| Variable | Default | Purpose |
|---|---|---|
| XRT_SETUP | /home/ubuntu/Kria-PYNQ/pynq/sdbuild/packages/xrt/xrt_setup.sh | XRT runtime environment |
| PYNQ_VENV | /usr/local/share/pynq-venv/bin/activate | PYNQ Python virtual environment |
If your PYNQ image uses different paths, override when running:
make XRT_SETUP=/path/to/xrt_setup.sh PYNQ_VENV=/path/to/pynq-venv/bin/activate \
LOGN=16 MODE=mersenne run
The scripts named by these variables are sourced with 2>/dev/null, so if the files do not exist but XRT is already in your PATH, the build and run will still succeed.
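Before starting a long build or test run, it can help to see which paths will actually be used. The sketch below only mirrors the "environment override, else default" behaviour described above; it is not part of the artifact, `resolve_paths` is a hypothetical helper, and the defaults are copied from the two tables.

```python
import os

# Defaults copied from the tables above.
DEFAULTS = {
    "VITIS_HLS_SETTINGS": "/home/xilinx/Vitis_HLS/2023.2/settings64.sh",
    "VIVADO_SETTINGS": "/home/xilinx/Vivado/2024.2/settings64.sh",
    "XRT_SETUP": "/home/ubuntu/Kria-PYNQ/pynq/sdbuild/packages/xrt/xrt_setup.sh",
    "PYNQ_VENV": "/usr/local/share/pynq-venv/bin/activate",
}


def resolve_paths(env=None):
    """Return, per variable, the effective path and whether it exists."""
    env = os.environ if env is None else env
    resolved = {}
    for name, default in DEFAULTS.items():
        path = env.get(name, default)
        resolved[name] = {
            "path": path,
            "overridden": name in env,
            "exists": os.path.isfile(path),
        }
    return resolved


if __name__ == "__main__":
    for name, info in resolve_paths().items():
        status = "ok" if info["exists"] else "missing"
        source = "env" if info["overridden"] else "default"
        print(f"{name:<20} {info['path']}  [{source}, {status}]")
```

A path reported as missing is not necessarily fatal on the device, since the run still succeeds when XRT is already in your PATH.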
This result produces a CSV table with LUT, FF, BRAM, DSP, and URAM counts for each of the 10 NTT configurations.
A pre-generated utilization.csv is included at the repository root:
cat utilization.csv
Expected columns: Design, Mode, LOGN, LOGQ, LUTs, FF, BRAM, DSP, URAM
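To check the file programmatically rather than by eye, a short script along these lines may be useful. It is a sketch that assumes the CSV has a header row using exactly the column names listed above; `load_utilization` is a hypothetical helper, not something shipped in the repository.

```python
import csv

EXPECTED_COLUMNS = ["Design", "Mode", "LOGN", "LOGQ",
                    "LUTs", "FF", "BRAM", "DSP", "URAM"]


def load_utilization(stream):
    """Read utilization rows and fail early if a column is missing."""
    reader = csv.DictReader(stream)
    header = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


if __name__ == "__main__":
    with open("utilization.csv", newline="") as f:
        rows = load_utilization(f)
    print(f"{len(rows)} configurations found")
    for row in rows:
        print(row["Design"], row["LUTs"], row["DSP"])
```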
Step 1. Set Xilinx tool paths (adjust for your installation):
export VITIS_HLS_SETTINGS=/path/to/Vitis_HLS/2023.2/settings64.sh
export VIVADO_SETTINGS=/path/to/Vivado/2024.2/settings64.sh
Step 2. Preview the build plan (dry run):
python3 build_all.py --dry-run
This prints all 10 configurations without building.
Step 3. Build all configurations and collect utilization:
python3 build_all.py
For each of the 10 configurations the script:
- Calls configure.sh to generate src/parameters.h with the correct defines (MERSENNE_NTT or BARRETT_RED, LOGN, LOGp, etc.)
- Copies sources into Build/HLS/ and runs Vitis HLS with the mode-appropriate clock period (5 ns for Mersenne, 7 ns for Barrett)
- Substitutes the design name into the matching Vivado TCL template (vivado_proj.tcl at 300 MHz for Mersenne, vivado_proj_b.tcl at 143 MHz for Barrett) and runs place & route
- Parses the Vivado post-place utilization report (*_wrapper_utilization_placed.rpt)
- Writes one row to utilization.csv
Built bitstreams (.xsa files) are saved to builds/.
At the end, the script prints the full utilization table to the terminal.
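To make the first of these steps concrete, the sketch below renders a header with the defines named above. It is an illustration only and not the actual configure.sh logic: it assumes that LOGp carries the prime bit-width passed as LOGQ, it omits the further defines the text abbreviates as "etc.", and `render_parameters_h` is a hypothetical name.

```python
MODE_DEFINES = {"mersenne": "MERSENNE_NTT", "barrett": "BARRETT_RED"}


def render_parameters_h(mode, logn, logq=64):
    """Return illustrative header text for one configuration."""
    if mode not in MODE_DEFINES:
        raise ValueError(f"unknown mode: {mode}")
    lines = [
        "#pragma once",
        f"#define {MODE_DEFINES[mode]}",
        f"#define LOGN {logn}",
        # Assumption: LOGp holds the prime bit-width (LOGQ).
        f"#define LOGp {logq}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_parameters_h("barrett", 14))
```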
Step 4. To build a single configuration manually:
make LOGN=14 LOGQ=64 all
This result measures the end-to-end timing of NTT, Encode, and Encrypt operations using the FPGA accelerator (HW) versus a pure software implementation (SW). The test runs all 10 configurations (5 LOGN x 2 modes), each repeated 50 times, and produces a grouped bar chart.
Pre-built bitstreams are included so that this result can be reproduced without rebuilding — only the Kria KV260 board is needed.
On the host machine, run:
./package_test.sh
By default, this packages the bitstreams from builds/, i.e., the ones
generated in Result 1. If builds/ is empty or you want to skip Result 1
entirely, use --prebuilt to package the provided pre-built bitstreams
instead:
./package_test.sh --prebuilt
Either way, heaan_test.zip is created containing:
- CKKS test source code (from Heaan_ckks/)
- All 10 bitstreams (Mersenne + Barrett, LOGN 12..16)
- A self-contained Makefile with configurable XRT_SETUP and PYNQ_VENV paths
- The run_all_tests.py automation script
Transfer the zip to the board and extract it:
scp heaan_test.zip ubuntu@<kria-ip>:~/Desktop/
ssh ubuntu@<kria-ip>
cd ~/Desktop
unzip heaan_test.zip
cd heaan_test
pip3 install matplotlib numpy
Check that the XRT and PYNQ paths in the Makefile match your board. The defaults assume a standard Kria-PYNQ image:
make help # shows current variable values and available bitstreams
If your paths differ, either edit the Makefile or override on every
make / run_all_tests.py invocation (see Configurable Paths).
sudo python3 run_all_tests.py
This script automatically:
- Iterates over all 10 configurations (LOGN=12..16, MODE=mersenne+barrett)
- For each configuration:
  - Recompiles the test binary with the correct -DLOGN, -DCSV_PATH, -DHwXSA_PATH, and -DMM_WIDTH_64 (Mersenne only) flags
  - Loads the corresponding bitstream onto the FPGA
  - Runs a correctness check (HW decode must match SW decode)
  - Executes 50 timed runs measuring NTT, Encode, and Encrypt (HW and SW)
  - Writes per-configuration results to timing_{mode}_{logn}.csv
- Aggregates all results into results_summary.csv
- Generates timing_results.pdf and timing_results.png
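The per-configuration compile flags listed above could be assembled roughly as follows. This is a hedged sketch, not the code in run_all_tests.py or the Makefile: the quoting of the two path macros is an assumption, and `compile_flags` is a hypothetical helper.

```python
def compile_flags(mode, logn, csv_path, xsa_path):
    """Build the preprocessor flags for one configuration."""
    flags = [
        f"-DLOGN={logn}",
        # Assumption: both paths are passed as C string literals.
        f'-DCSV_PATH="{csv_path}"',
        f'-DHwXSA_PATH="{xsa_path}"',
    ]
    if mode == "mersenne":
        # This define is added for Mersenne builds only.
        flags.append("-DMM_WIDTH_64")
    return flags


if __name__ == "__main__":
    # Paths below are illustrative; adjust them to the extracted layout.
    print(" ".join(compile_flags(
        "mersenne", 16,
        "timing_mersenne_16.csv",
        "bitstreams/MERSENNE/NTT_M_64_16_wrapper.xsa",
    )))
```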
Options:
# Fewer runs per configuration (faster, less precise)
sudo python3 run_all_tests.py --nruns 10
# Regenerate the plot from existing CSV files (no FPGA needed)
python3 run_all_tests.py --plot-only
# Compile and run Mersenne LOGN=16 with 50 runs
make LOGN=16 MODE=mersenne run
# Compile and run Barrett LOGN=14 with 20 runs
make LOGN=14 MODE=barrett NRUNS=20 run
The run target requires sudo for FPGA access and automatically sources
the XRT and PYNQ environments.
| File | Description |
|---|---|
| timing_{mode}_{logn}.csv | Per-run timing for one configuration (microseconds) |
| results_summary.csv | Aggregated averages across all configurations (milliseconds) |
| timing_results.pdf | Grouped bar chart (publication quality) |
| timing_results.png | Same chart in PNG format |
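Note the unit change between the per-run files (microseconds) and the summary (milliseconds). The sketch below shows one way such an aggregation could work, assuming every column in a per-run CSV is numeric; the column names in the usage comment are hypothetical, and the real layout written by the test binary may differ.

```python
import csv
from statistics import mean


def summarize_runs(stream):
    """Average each column of a per-run CSV (microseconds), return ms."""
    reader = csv.DictReader(stream)
    columns = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name in columns:
            columns[name].append(float(row[name]))
    # 1 ms = 1000 microseconds
    return {name: mean(values) / 1000.0
            for name, values in columns.items() if values}


if __name__ == "__main__":
    # Hypothetical file and column names, for illustration only.
    with open("timing_mersenne_16.csv", newline="") as f:
        for name, avg_ms in summarize_runs(f).items():
            print(f"{name}: {avg_ms:.3f} ms")
```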
Build fails with "Mersenne not available"
Mersenne reduction requires pre-computed shift-based multipliers, available only for LOGQ in {52, 55, 60, 61, 63, 64}. For other bit-widths, only Barrett mode is built.
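The availability rule can be expressed as a small lookup. This is a sketch of the rule as stated here, not the check performed by the Makefile; `modes_for` is a hypothetical name.

```python
# Bit-widths for which pre-computed shift-based multipliers exist.
MERSENNE_LOGQ = frozenset({52, 55, 60, 61, 63, 64})


def modes_for(logq):
    """Return the reduction modes that can be built for a bit-width."""
    modes = ["barrett"]  # Barrett is built for any bit-width
    if logq in MERSENNE_LOGQ:
        modes.insert(0, "mersenne")
    return modes


if __name__ == "__main__":
    for logq in (52, 58, 64):
        print(logq, modes_for(logq))
```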
"Bitstream not found" on the device
Verify that bitstreams/MERSENNE/ and bitstreams/BARRETT/ contain the
.xsa files. Re-run ./package_test.sh on the host if they are missing.
XRT / PYNQ environment not found
If make run fails to find XRT or PYNQ, check that the paths are correct:
ls /home/ubuntu/Kria-PYNQ/pynq/sdbuild/packages/xrt/xrt_setup.sh
ls /usr/local/share/pynq-venv/bin/activate
Override via make XRT_SETUP=... PYNQ_VENV=... run if they differ.
If XRT is already in your PATH (e.g., sourced in .bashrc), the defaults
will work even if the script paths do not exist.
Stack overflow / segfault on the device
Large polynomial dimensions (LOGN=15, 16) require a large stack. The Makefile
sets ulimit -s 1000000 automatically. If running manually:
sudo bash -c 'ulimit -s 1000000; ./main'
matplotlib font warnings
The plotting script uses serif fonts with a fallback chain
(DejaVu Serif, Palatino, Times New Roman). If LaTeX rendering is not
available (no texlive installed), it falls back to standard matplotlib
fonts automatically. No additional font packages are required.
Plot shows only some configurations
Ensure all timing_{mode}_{logn}.csv files exist before running
--plot-only. Missing CSVs mean the corresponding tests did not complete.
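A quick way to see which configurations are missing before running --plot-only is sketched below. It assumes that {mode} expands to the lowercase mode name and {logn} to the numeric value; `missing_timing_csvs` is a hypothetical helper that is not included in the package.

```python
from pathlib import Path

MODES = ("mersenne", "barrett")
LOGN_VALUES = range(12, 17)  # 12..16 inclusive


def missing_timing_csvs(directory="."):
    """List expected per-configuration CSVs that are not present."""
    base = Path(directory)
    expected = [f"timing_{mode}_{logn}.csv"
                for mode in MODES for logn in LOGN_VALUES]
    return [name for name in expected if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_timing_csvs()
    if missing:
        print("Missing results for:")
        for name in missing:
            print("  ", name)
    else:
        print("All 10 timing CSVs are present.")
```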
@inproceedings{guerrini2026refhe,
title = {ReFHE-NTT: Resource-Driven NTT FPGA Architecture for Fully Homomorphic Encryption},
author = {Guerrini, Valentino and Sorrentino, Giuseppe and Barenghi, Alessandro and Conficconi, Davide},
booktitle = {Proceedings of the 34th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM)},
year = {2026},
organization = {IEEE},
note = {To appear}
}