## Problem

FA3 kernels crash on RTX 5090 (Blackwell, SM 12.0): `no kernel image is available for execution on the device`

## Upstream Status

- Dao-AILab/flash-attention#2268: SM 12.0 support split into PRs #2329-#2336
- FA4 beta exists but requires Linux-only `nvidia-cutlass-dsl`
- tridao estimate: 2-3 weeks for full SM 12.0

## Current Workaround

FlexAttention (PyTorch 2.5+) with sliding window: val_bpb 1.680

## Action Items

- [ ] Monitor FA4 SM 12.0 PRs
- [ ] Update Dockerfile + train.py when FA4 lands
- [ ] Evaluate SageAttention3 Blackwell variant
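
For reference, a rough sketch of what the FlexAttention sliding-window workaround could look like. This is not the code in `train.py`. The window size `WINDOW`, the helper names `needs_fallback`, `sliding_window_causal`, and `flex_sliding_window_attention`, and the rule "fall back only on SM 12.0" are all assumptions made for illustration. The only real APIs used are `create_block_mask` and `flex_attention` from `torch.nn.attention.flex_attention`.

```python
# Hypothetical sketch: causal sliding-window attention via FlexAttention.
# WINDOW and all helper names are illustrative, not taken from train.py.

WINDOW = 1024  # assumed window size; the issue does not state one


def needs_fallback(capability):
    """Return True when FA3 should be skipped.

    `capability` is a (major, minor) tuple such as the one returned by
    torch.cuda.get_device_capability(). Assumption: only SM 12.0 is
    affected, as reported for the RTX 5090 in this issue.
    """
    return tuple(capability) == (12, 0)


def sliding_window_causal(b, h, q_idx, kv_idx):
    """mask_mod for FlexAttention: True where a query may attend to a key.

    Works on plain ints as well as index tensors, since it only uses
    comparison, subtraction, and `&`.
    """
    causal = q_idx >= kv_idx               # no attention to future tokens
    in_window = (q_idx - kv_idx) < WINDOW  # only the last WINDOW tokens
    return causal & in_window


def flex_sliding_window_attention(q, k, v):
    """q, k, v: tensors shaped (batch, heads, seq_len, head_dim)."""
    # Imported lazily so the helpers above can be used without PyTorch.
    from torch.nn.attention.flex_attention import (
        create_block_mask,
        flex_attention,
    )

    seq_len = q.shape[-2]
    # B=None and H=None broadcast the mask over batch and heads.
    block_mask = create_block_mask(
        sliding_window_causal,
        B=None,
        H=None,
        Q_LEN=seq_len,
        KV_LEN=seq_len,
        device=q.device,
    )
    return flex_attention(q, k, v, block_mask=block_mask)
```

In practice `flex_attention` is usually wrapped in `torch.compile` to get fused kernels. That step is left out here to keep the sketch short. Once FA4 lands with SM 12.0 support, `needs_fallback` would be the single place to change.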