FlexAttention sliding window replaces SDPA fallback #5

@2imi9

Summary

Replaced the SDPA fallback, which did not apply a sliding window, with FlexAttention (requires PyTorch 2.5+). FlexAttention supports both sliding-window attention and GQA natively.
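For readers unfamiliar with FlexAttention, here is a minimal sketch of how a causal sliding window with GQA can be expressed through `torch.nn.attention.flex_attention`. It is not the code from the commit: the helper names `sliding_window_causal_mask` and `flex_sliding_window_attention` are hypothetical, and the window convention (the current token plus the `window - 1` tokens before it) is an assumption that may differ from the actual implementation.

```python
def sliding_window_causal_mask(window: int):
    """Build a FlexAttention mask_mod: causal, limited to the last `window` tokens.

    Assumption: the window includes the current token, so a query at position q
    may attend to keys at positions q - window + 1 .. q.
    """
    def mask_mod(b, h, q_idx, kv_idx):
        causal = q_idx >= kv_idx                # never attend to future tokens
        in_window = (q_idx - kv_idx) < window   # drop keys that are too far back
        return causal & in_window
    return mask_mod


def flex_sliding_window_attention(q, k, v, window: int):
    """Hypothetical wrapper. q: (B, Hq, T, D); k, v: (B, Hkv, T, D), Hq divisible by Hkv."""
    # Imported here so the mask logic above can be checked without PyTorch installed.
    from torch.nn.attention.flex_attention import create_block_mask, flex_attention

    seq_len = q.shape[-2]
    # B=None and H=None broadcast one mask across batch and heads.
    block_mask = create_block_mask(
        sliding_window_causal_mask(window), None, None, seq_len, seq_len, device=q.device
    )
    # enable_gqa lets several query heads share one key/value head.
    use_gqa = q.shape[1] != k.shape[1]
    return flex_attention(q, k, v, block_mask=block_mask, enable_gqa=use_gqa)
```

In practice `flex_attention` is usually wrapped in `torch.compile` to get a fused kernel, and the block mask can be built once per sequence length and reused across layers rather than rebuilt on every call.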

Results

| Backend | Sliding Window | val_bpb | tok/sec |
| --- | --- | --- | --- |
| SDPA (old) | No | 1.739 | ~70k |
| FlexAttention | Yes | 1.680 | ~83k |

Commit: 8c7bed4
