Flash Attention 3/4 does not support Blackwell (RTX 5090, SM 12.0)

## Problem

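FA3 kernels crash on RTX 5090 (Blackwell, SM 12.0) with the CUDA error `no kernel image is available for execution on the device`, which indicates that the installed build contains no kernels compiled for this architecture.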
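
Until upstream support lands, one way to avoid the crash is to pick the attention backend from the device's compute capability before any FA3 kernel is launched. The following is a minimal sketch, not the code in `train.py`: the function names `pick_attention_backend` and `detect_capability` and the backend labels are hypothetical, and the rule "FA3 only on SM 9.0" is an assumption made for illustration.

```python
def pick_attention_backend(capability):
    """Return a backend label for a (major, minor) compute capability.

    Assumption for illustration: FA3 is used only on Hopper (SM 9.0);
    everything else, including SM 12.0, falls back to FlexAttention.
    """
    major, minor = capability
    if (major, minor) == (9, 0):
        return "fa3"
    return "flex_attention"


def detect_capability():
    """Query the current GPU; returns None if torch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # Returns a (major, minor) tuple for the current device.
    return torch.cuda.get_device_capability()


if __name__ == "__main__":
    cap = detect_capability()
    if cap is None:
        print("No CUDA device detected")
    else:
        print(f"SM {cap[0]}.{cap[1]} -> {pick_attention_backend(cap)}")
```

Keeping the decision in a pure function makes it easy to flip the rule once SM 12.0 kernels are available.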

## Upstream Status
- Dao-AILab/flash-attention#2268: SM 12.0 support split into PRs #2329-#2336
- FA4 beta exists but requires Linux-only `nvidia-cutlass-dsl`
- Estimate from tridao: 2-3 weeks for full SM 12.0 support

## Current Workaround
FlexAttention (PyTorch 2.5+) with a sliding-window mask. This setup reaches val_bpb 1.680.
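
For reference, a sliding-window mask with FlexAttention can be sketched roughly as below. This is an illustrative example rather than the exact code used for the reported run: the window size `WINDOW`, the causal constraint, and the helper name `attend` are assumptions, since the issue does not state them.

```python
WINDOW = 1024  # hypothetical window size; the issue does not state one


def sliding_window_causal(b, h, q_idx, kv_idx):
    """mask_mod: a query attends to itself and the previous WINDOW - 1 keys."""
    causal = q_idx >= kv_idx
    in_window = q_idx - kv_idx < WINDOW
    return causal & in_window


def attend(q, k, v):
    """q, k, v: (batch, heads, seq_len, head_dim) tensors on the same device."""
    # Imported lazily so the mask function can be inspected without torch.
    from torch.nn.attention.flex_attention import create_block_mask, flex_attention

    seq_len = q.shape[-2]
    # B=None and H=None broadcast the same mask over batch and heads.
    block_mask = create_block_mask(
        sliding_window_causal,
        B=None,
        H=None,
        Q_LEN=seq_len,
        KV_LEN=seq_len,
        device=q.device,
    )
    return flex_attention(q, k, v, block_mask=block_mask)
```

In practice `flex_attention` is usually wrapped in `torch.compile` to get a fused kernel; the uncompiled call shown here is mainly useful for checking correctness.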

## Action Items
- [ ] Monitor FA4 SM 12.0 PRs
- [ ] Update Dockerfile + train.py when FA4 lands
- [ ] Evaluate SageAttention3 Blackwell variant
