A collection of linear algebra kernels optimized for PULP-based architectures. The project aims to be portable, testable, and easy to extend across architectures. Currently PLAY supports two PULP architectures: Pulp Open and Spatz.
- include/: Contains the single public header, play.h, which exposes the library APIs.
- source/: Contains the implementation of the linear algebra kernels. Each subdirectory corresponds to a specific kernel implementation and each kernel subdirectory has a top-level <kernel>.c and an arch/ folder with per-arch sources.
- test/: Contains test cases for each kernel for each supported architecture. Each kernel under test/<kernel>/ includes test code, per-target wrappers, test data and test data generation scripts.
This project uses a consistent pattern to support multiple architectures while hiding implementation details from the user.
For each kernel:
- There is a public dispatcher, placed in source/<kernel>/<kernel>.c, which forwards to a single unified symbol named <kernel>_impl() (for example matrix_mul_impl()).
- Each architecture implements its own version of the kernel in source/<kernel>/arch/<kernel>_<arch>.c (for example matrix_mul_pulp_open.c, matrix_mul_spatz.c).
- To avoid duplicating small per-kernel header files, this repository uses a centralized internal header, include/internal/arch_interface.h, which collects the prototypes for all unified per-kernel implementation symbols (for example vector_mul_impl, vector_sub_impl, ...).
  - Dispatcher files (source/<kernel>/<kernel>.c) and arch implementations (source/<kernel>/arch/<kernel>_<arch>.c) include this centralized header.
This means the public API and the test-level code do not need to know architecture internals: the build picks the correct architecture implementation, which provides the unified symbol. Users simply call the API they need, which is exposed and documented in the single header file play.h.
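The dispatcher pattern described above can be sketched in a single C file. This is a minimal illustration, not the repository's code: the public name vector_mul, its element-wise semantics, and the signature shown are all assumptions, and the three parts would normally live in separate files as indicated by the comments.

```c
#include <stddef.h>

/* --- include/internal/arch_interface.h (sketch) ---
 * Unified symbol every architecture backend defines.
 * The signature is hypothetical. */
void vector_mul_impl(const float *a, const float *b, float *out, size_t len);

/* --- source/vector_mul/vector_mul.c (sketch) ---
 * Public dispatcher: it only forwards to the unified symbol and
 * contains no architecture-specific code. */
void vector_mul(const float *a, const float *b, float *out, size_t len)
{
    vector_mul_impl(a, b, out, len);
}

/* --- source/vector_mul/arch/vector_mul_<arch>.c (sketch) ---
 * Portable stand-in for a backend. The build compiles exactly one
 * such file, so exactly one definition of vector_mul_impl is linked. */
void vector_mul_impl(const float *a, const float *b, float *out, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        out[i] = a[i] * b[i];   /* assumed element-wise product */
    }
}
```

Because the dispatcher refers only to vector_mul_impl, swapping the backend file changes the implementation without touching the caller or the public header.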
Each kernel has an associated test case in the test directory. Each test subfolder contains a dedicated folder for the source code and compilation files for each supported architecture, along with a shared folder containing the test input data and a Python script for generating them. Each architecture implements its own main.c, which loads the source data, calls the kernel under test with this data, and compares the calculated result with the expected one.
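The load, run, and compare flow of a test can be sketched as follows. Everything here is illustrative: the array names, the LEN value, the tolerance-based comparison, and the placeholder kernel are assumptions rather than the contents of the repository's data.h or main.c.

```c
#include <stddef.h>

#define LEN 4  /* hypothetical size; the generator script chooses the real one */

/* Stand-ins for arrays a generated data.h might provide (names assumed). */
static const float src_a[LEN]    = {1.0f, 2.0f, 3.0f, 4.0f};
static const float src_b[LEN]    = {0.5f, 0.5f, 0.5f, 0.5f};
static const float expected[LEN] = {0.5f, 1.5f, 2.5f, 3.5f};

/* Placeholder kernel so the sketch builds without the library. */
static void vector_sub(const float *a, const float *b, float *out, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        out[i] = a[i] - b[i];
    }
}

/* Count elements whose absolute error against the reference exceeds tol. */
int count_mismatches(const float *got, const float *ref, size_t len, float tol)
{
    int errors = 0;
    for (size_t i = 0; i < len; i++) {
        float diff = got[i] - ref[i];
        if (diff < 0.0f) {
            diff = -diff;
        }
        if (diff > tol) {
            errors++;
        }
    }
    return errors;
}

/* Body a per-architecture main.c could wrap: run the kernel on the
 * input data, then compare against the expected output.
 * Returns the number of mismatching elements (0 means pass). */
int run_vector_sub_test(void)
{
    float result[LEN];
    vector_sub(src_a, src_b, result, LEN);
    return count_mismatches(result, expected, LEN, 1e-6f);
}
```

Returning a mismatch count rather than a boolean lets a wrapper report how many elements failed; whether the actual tests compare exactly or with a tolerance is not stated here.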
Before running tests, set up the environment for your target architecture and simulation platform.
To run a test targeting the Pulp Open architecture, it is necessary to install the RISCV GNU Toolchain and the pulp-sdk. Follow the instructions provided here for the toolchain and here for the pulp-sdk.
Once the toolchain and SDK are installed, set the PULP_RISCV_GCC_TOOLCHAIN environment variable to your toolchain path and source the pulp-open configuration from the SDK. For example:
export PULP_RISCV_GCC_TOOLCHAIN="/opt/riscv"
source /home/sam/repos/pulp-sdk/configs/pulp-open.sh

Note: use the actual installation paths on your local machine.
To run a test targeting the Pulp Open architecture in RTL, you must also build the hardware first. Follow the instructions provided here.
To run a test targeting the Spatz architecture, you must build and install the Spatz repository. Follow the instructions provided here. Note: this also installs the Spatz-specific LLVM and GCC toolchains. Once installed, set the following environment variables:
export LLVM_PATH="/opt/riscv/spatz-14-llvm"
export GCC_PATH="/home/sam/repos/spatz/install/riscv-gcc"
export SPATZ_SW_DIR="/home/sam/repos/spatz/sw"

Note: use the actual installation paths on your local machine.
To run a test targeting the Spatz architecture on GVSoC, you must set an additional environment variable:
export GVSOC_PATH="/home/sam/repos/gvsoc/install/bin/gvsoc"

Note: use the actual installation paths on your local machine.
To run a test targeting the Spatz architecture on Verilator, you must set an additional environment variable:
export VLT_PATH="/home/sam/repos/spatz/hw/system/spatz_cluster/bin/spatz_cluster.vlt"

Note: use the actual installation paths on your local machine.
To run a test targeting the Spatz architecture on QuestaSim, you must set an additional environment variable:
export VSIM_PATH="/home/sam/repos/spatz/hw/system/spatz_cluster/bin/spatz_cluster.vsim"

Note: use the actual installation paths on your local machine.
For testing, the architecture is selected at compile time via the TARGET make variable (e.g. make TARGET=PULP_OPEN or make TARGET=SPATZ). The top-level test Makefile conditionally includes the architecture-specific Makefile; these target-specific Makefiles build sources and set compile-time flags (for example APP_CFLAGS += -DTARGET_IS_PULP_OPEN). Test sources are collected with a wildcard like $(wildcard $(SRC_DIR)/<kernel>/*.c) but are filtered to exclude any arch/ subdirectory, so architecture implementations are not pulled in automatically. The add_arch_impl macro (defined in test/common/arch_select.mk) appends the architecture-specific implementation file source/<kernel>/arch/<kernel>_$(ARCH_SUFFIX).c to the list of source files to be compiled.
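As a rough illustration of how a flag such as -DTARGET_IS_PULP_OPEN can reach C code, the sketch below branches on the macro with the preprocessor. The helper selected_target is hypothetical, and TARGET_IS_SPATZ is assumed by analogy with TARGET_IS_PULP_OPEN; check the target-specific Makefiles for the exact macro names.

```c
/* Hypothetical helper: report which target the build selected.
 * Exactly one branch survives preprocessing, depending on the
 * -DTARGET_IS_* flag passed on the compiler command line. */
const char *selected_target(void)
{
#if defined(TARGET_IS_PULP_OPEN)
    return "PULP_OPEN";
#elif defined(TARGET_IS_SPATZ)   /* macro name assumed, not confirmed */
    return "SPATZ";
#else
    return "UNKNOWN";            /* no -DTARGET_IS_* flag was given */
#endif
}
```

Compiled without any -DTARGET_IS_* flag, the function falls through to the last branch; test wrappers can use the same mechanism to pick per-target headers or startup code.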
For testing, several make flags are available:
- Flags common to all targets
  - TARGET=<target>: Specify the target architecture. Supported targets are PULP_OPEN and SPATZ
  - STATS=1: Collect and print performance counters
  - ENABLE_LOGGING=1: Enable logging
  - PRINT_DATA=1: Print inputs, computed and reference values
- Flags for PULP_OPEN
  - USE_CLUSTER=1: Run on the cluster (multi-core) instead of the fabric controller, which is the default
  - NUM_CORES=<n>: Number of cluster cores when USE_CLUSTER=1 (default NUM_CORES=1, max value NUM_CORES=8)
- Flags for SPATZ
  - NUM_CC=<n>: Number of Core-Complexes in Cluster (default NUM_CC=1, max value NUM_CC=2)
  - PLATFORM=<platform>: Emulator platform. Supported platforms are GVSOC, VLT and VSIM (default VLT)
To build and run a test, run:
make -C test/<kernel> clean all run TARGET=<target> PLATFORM=<platform> <FLAGS>

For example:
make -C test/matrix_mul clean all run TARGET=PULP_OPEN PLATFORM=GVSOC USE_CLUSTER=1 NUM_CORES=1 ENABLE_LOGGING=1 STATS=1

Each test case includes a Python script (generator.py) to generate input data and expected results. The script typically takes parameters such as vector or matrix dimensions. Data is saved to test/<kernel>/test_data/data.h.
To regenerate data:
python3 generator.py <LEN>

For example:
python3 test/vector_mul/test_data/generator.py 128

The test/runners/<arch> directory contains several Python scripts to automate test execution, parse outputs and traces, and to generate graphs and benchmark tables. For example, see the Pulp Open benchmarks for QuestaSim here and the Pulp Open benchmarks for GVSoC here.
To use PLAY as a library in your project, follow these steps:
- Include the Header: Add the single header in your source code: #include "play.h"
- Include Path: Add the library's include directory to your compiler's include path (e.g., -Ipath/to/PLAY/include).
- Compilation & Linking: Compile and link the library by adding:
  - The top-level implementation files: source/*/*.c
  - The architecture-specific implementation file(s) for your target: source/<kernel>/arch/<kernel>_$(ARCH_SUFFIX).c (e.g., source/<kernel>/arch/<kernel>_pulp_open.c).
The single play.h header simplifies adoption: it exposes the public API while the build system selects and compiles the correct backend implementation for the chosen architecture.
- Kernel implementation (add architecture support to kernels)
  - Pick an architecture suffix, e.g. myarch.
  - Add source/<kernel>/arch/<kernel>_myarch.c and implement <kernel>_impl(...).
  - Note: the implementation must match the function signature declared in include/internal/arch_interface.h exactly.
- Register the architecture in the build mapping
  - Update test/common/arch_select.mk to map your TARGET to ARCH_SUFFIX so add_arch_impl will pick the correct file for each kernel.
- Add test cases
  - Implement the test case under test/<kernel>/<new-arch> (wraps for PULP and/or SPATZ as needed).
- Test data generation
  - If your new architecture requires different test data or conditional data, adapt the Python test-data generator(s) (e.g. test/*/test_data/generator.py) so generated headers (data.h) or test inputs include the appropriate #if TARGET_IS_<ARCH> branches.
- Update top-level build files
  - Adapt the top-level Makefile(s) or CMake configuration to include/build the target architecture. This usually means adding a target or including the proper per-architecture build snippet so the project supports a make TARGET=MYARCH (or equivalent) invocation.
Note: remember to document any new toolchain or dependency requirements for your architecture and add example build commands to the README.
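For the kernel implementation step, a new backend file might look like the sketch below, using vector_sub and the myarch suffix as the example. The prototype shown is hypothetical; the real one must be copied from include/internal/arch_interface.h, and the loop body stands in for whatever target-specific code the architecture needs.

```c
#include <stddef.h>

/* Prototype as include/internal/arch_interface.h might declare it.
 * Hypothetical signature: use the one from the real header. */
void vector_sub_impl(const float *a, const float *b, float *out, size_t len);

/* source/vector_sub/arch/vector_sub_myarch.c (sketch)
 * The definition must agree with the prototype above; if the types
 * differ, the compiler rejects the file with a conflicting-types error. */
void vector_sub_impl(const float *a, const float *b, float *out, size_t len)
{
    /* Plain scalar loop as a placeholder; a real backend would use the
     * target's parallel or vector features here. */
    for (size_t i = 0; i < len; i++) {
        out[i] = a[i] - b[i];
    }
}
```

Including the centralized header in the backend file, rather than repeating the prototype by hand, lets the compiler catch any signature drift automatically.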