Support byte_mirco_perf on intel gaudi2 by yupengzh-intel · Pull Request #141 · bytedance/xpu-perf

yupengzh-intel · 2025-08-27T08:21:35Z

Add an HPU backend under the byte_micro_perf/backends folder.
Copy the ops that HPU supports from the GPU backend.
Gaudi doesn't support flash attention, so we use fused_sdpa to cover the prefill case of the flash_attention op.
Add mark_step in byte_micro_perf/core/op.py to break the HPU graph in lazy mode, which prevents graph-creation issues when too many tensors are prepared at once.
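
To illustrate the mark_step change, here is a minimal sketch of flushing the lazy-mode graph periodically while many tensors are being prepared. It is not the code from byte_micro_perf/core/op.py: the helper name `create_tensors_with_graph_breaks`, its `break_every` parameter, and the injectable `mark_step` argument are hypothetical and chosen only for illustration. The one real call is `habana_frameworks.torch.core.mark_step()`, and the sketch falls back to a no-op when the Gaudi PyTorch bridge is not installed.

```python
from typing import Callable, Iterable, List, Optional

try:
    # Present only when the Intel Gaudi PyTorch bridge is installed.
    import habana_frameworks.torch.core as htcore

    _default_mark_step: Callable[[], None] = htcore.mark_step
except ImportError:
    def _default_mark_step() -> None:
        """No-op stand-in so the sketch also runs on hosts without HPU."""
        return None


def create_tensors_with_graph_breaks(
    factories: Iterable[Callable[[], object]],
    break_every: int = 8,
    mark_step: Optional[Callable[[], None]] = None,
) -> List[object]:
    """Hypothetical helper: build tensors, breaking the lazy graph in batches.

    In lazy mode, ops are accumulated into a graph until a step boundary.
    Calling mark_step every `break_every` tensors keeps each accumulated
    graph small instead of letting one graph cover every allocation.
    """
    if break_every < 1:
        raise ValueError("break_every must be at least 1")
    flush = _default_mark_step if mark_step is None else mark_step

    tensors: List[object] = []
    for count, make in enumerate(factories, start=1):
        tensors.append(make())
        if count % break_every == 0:
            flush()

    # Flush the remainder so no pending ops carry over into later work.
    if tensors and len(tensors) % break_every != 0:
        flush()
    return tensors
```

On a Gaudi host, each factory would typically be something like `lambda: torch.randn(shape, device="hpu")`. The batch size of 8 is an arbitrary assumption, not a value taken from this PR.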

yupengzh-intel added 11 commits August 19, 2025 13:01

add gaudi2 support

0f26c7c

bug fix

c64f6c0

remove unused code

7a02ad5

remove unsupported op

c3a07dc

support flash_attention op decode case on Gaudi2 with vllm flat_pa

109a875

change flash_attention op from empty kv_cache to random kv_cache

234b2f2

fix flash_attention bug

bf0f108

remove unused code

f967345

improve HPU flash_attention performance

f38092b

rm unsupported fa case

1aa8c01

rm duplicated fa json

1a68317

yupengzh-intel wants to merge 11 commits into bytedance:main from yupengzh-intel:main
