sparseDiTEngine is a sparse attention runtime for diffusion-transformer style
video workloads. The repo ships:

- the core `spade` Python package under `python/spade`
- an in-tree `block_sparse_attn` package under `python/block_sparse_attn`
- native CUDA/CUTLASS/ThunderKittens kernels under `csrc`
- model integrations for Wan 2.1, Wan 2.2, and Hunyuan Video under `model`

For step-by-step environment setup, see INSTALL.md.

- `python/block_sparse_attn`: first-party block sparse attention Python API
- `csrc/block_sparse_attn`: native block sparse attention extension sources
- `python/spade`: sparse engine APIs and model-facing integration code
- `tests`: unit and integration tests

The native extension lives in-tree and can be built directly from this repository.

    export THUNDERKITTENS_ROOT="${THUNDERKITTENS_ROOT:-$PWD/3rdparty/ThunderKittens}"
    python csrc/block_sparse_attn/setup.py build_ext --inplace

If you need to control compilation targets, set `BLOCK_SPARSE_ATTN_CUDA_ARCHS`, for example:

    export BLOCK_SPARSE_ATTN_CUDA_ARCHS="80;89;90a"

For source-tree usage:

    export PYTHONPATH="$PWD/python"
    python -c "import block_sparse_attn; import spade"

The `block_sparse_attn` package preserves the historical import name while now
living directly inside this repository.
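
As a slightly more verbose alternative to the one-line import check, the following preflight sketch reports which of the two packages cannot be found on the import path and echoes the requested CUDA architectures. It is illustrative only and not part of the repository: the helper names `parse_cuda_archs` and `missing_packages` are hypothetical, and splitting `BLOCK_SPARSE_ATTN_CUDA_ARCHS` on semicolons is an assumption based on the example value above, not a statement about how `setup.py` parses it.

```python
import importlib.util
import os

# Package names taken from this README; the helpers below are hypothetical.
REQUIRED_PACKAGES = ("block_sparse_attn", "spade")


def parse_cuda_archs(value):
    """Split a semicolon-separated arch string such as "80;89;90a" into a list.

    Assumption: the variable uses the same format as the README example.
    """
    return [arch.strip() for arch in value.split(";") if arch.strip()]


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the top-level names that cannot be located on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    archs = parse_cuda_archs(os.environ.get("BLOCK_SPARSE_ATTN_CUDA_ARCHS", ""))
    print("requested CUDA archs:", archs or "(not set)")

    missing = missing_packages()
    if missing:
        print("not importable:", ", ".join(missing))
    else:
        print("all packages importable")
```

Run it from the repository root after exporting `PYTHONPATH`. Note that `importlib.util.find_spec` only locates a package without executing it, so this check does not confirm that the native extension was built or loads correctly; the `python -c` import above remains the stronger test.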

- `block_sparse_attn_func_bnsh` is the single BNSH forward entrypoint.
- Hopper-only pure block-sparse forward kernels are dispatched through the same `block_sparse_attn` package; there is no separate legacy Hopper extension.
- Model subdirectories under `model/` keep their own upstream licenses and notices.