mxfp

Here are 2 public repositories matching this topic...

yifu-ding / MP-Sparse-Attn

MP-Sparse-Attn provides Triton kernels for Diagonal-Tiled Mixed-Precision Attention, targeting efficient low-bit MXFP inference for Transformer models. It combines tile-level mixed-precision computation and kernel fusion to accelerate attention on modern GPUs.

triton attention mixed-precision low-bit-quantization llm-inference mxfp

Updated Jun 15, 2026
Python
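
For readers new to the topic tag, the following is a minimal sketch of MXFP-style block quantization. It is not code from MP-Sparse-Attn. It assumes the common microscaling layout: blocks of 32 values that share one power-of-two scale, with each element stored as FP4 (E2M1). The names `quantize_block` and `dequantize_block` are hypothetical, and rounding is simplified to nearest-value with ties going to the smaller magnitude.

```python
import math

# Magnitudes representable by an FP4 E2M1 element (sign is handled separately).
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
E2M1_EMAX = 2      # exponent of the largest E2M1 value: 6.0 = 1.5 * 2**2
BLOCK_SIZE = 32    # assumed block size; callers split tensors into such blocks


def quantize_block(block):
    """Return (shared_exponent, quantized_elements) for one block of floats."""
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0.0] * len(block)
    # Pick the shared power-of-two scale so the largest element lands
    # near the top of the E2M1 range.
    shared_exp = math.floor(math.log2(max_abs)) - E2M1_EMAX
    scale = 2.0 ** shared_exp
    quantized = []
    for x in block:
        # Saturate to the largest representable magnitude, then snap to the grid.
        mag = min(abs(x) / scale, E2M1_GRID[-1])
        nearest = min(E2M1_GRID, key=lambda g: abs(g - mag))
        quantized.append(math.copysign(nearest, x))
    return shared_exp, quantized


def dequantize_block(shared_exp, quantized):
    """Reconstruct approximate floats from the shared exponent and elements."""
    scale = 2.0 ** shared_exp
    return [q * scale for q in quantized]
```

Values that already sit on the scaled grid round-trip exactly; everything else is rounded to the nearest grid point, which is where the accuracy cost of a low-bit format comes from.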

Pushkinist / rMLX

Rust-native single-binary MLX inference + conversion backend for Apple Silicon

Updated Jun 18, 2026
Rust
