Problem Description
The check quoted below enables FP8 only for gfx942. As far as I understand, gfx950 and gfx1201 should also support FP8 at the architecture level.
    @functools.cache
    def arch_supports_fp8():
        return is_hip() and get_arch() in ('gfx942')
There may be other blockers that have to be resolved before FP8 can be enabled for those platforms in flash-attention. Please treat this issue as a bug report if the fix is as simple as updating this check, or as a feature request if more changes are needed. Either way, I think FP8 can eventually be enabled for gfx950 and gfx1201.
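For illustration, if the fix really is limited to the check, it might look something like the sketch below. This is only a sketch under assumptions: `FP8_ARCHS` and `fp8_supported_on` are hypothetical names, the `is_hip` and `get_arch` stubs stand in for the real helpers in `utils.py`, and I have not verified that the FP8 kernels actually run on gfx950 or gfx1201. One thing to watch: `('gfx942')` is a plain string rather than a tuple, so the current `in` is a substring test, and adding more names needs a real tuple.

```python
import functools

# Assumption: the architectures proposed in this issue. Not verified
# against the actual kernels.
FP8_ARCHS = ("gfx942", "gfx950", "gfx1201")


def is_hip() -> bool:
    # Hypothetical stub for the helper of the same name in utils.py.
    return True


def get_arch() -> str:
    # Hypothetical stub; the real helper queries the active device.
    return "gfx950"


def fp8_supported_on(arch: str, hip: bool = True) -> bool:
    # Exact membership test against a real tuple of architecture names.
    return hip and arch in FP8_ARCHS


@functools.cache
def arch_supports_fp8() -> bool:
    # Same shape as the existing check, delegating to the helper above.
    return fp8_supported_on(get_arch(), is_hip())
```

Keeping the names in one tuple would also make it easy to add or drop an architecture once it has been tested.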
Code referenced above: flash-attention/flash_attn/flash_attn_triton_amd/utils.py, lines 774 to 776 in ea8fe36