[Examples] add profiling method for decode phase #657

Open
GuoningHuang wants to merge 2 commits into buddy-compiler:main from GuoningHuang:profiling
GuoningHuang (Contributor) commented Dec 30, 2025

Added a profiling method for the decode phase of DeepSeek-R1.
This code profiles the execution latency of each operator in the first Transformer layer during the decode phase.
To use it, replace the subgraph0_decode.mlir file generated in build/examples/BuddyDeepSeekR1 with the one uploaded in this PR.
The output is as follows:
[image: profiling output]
The 1.5B model has 28 layers in total, so multiplying the per-layer latency by 28 gives approximately the end-to-end execution time.
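
The extrapolation from one layer to the whole model can be sketched as below. This is only an illustration of the arithmetic, not part of the PR: the operator names and latency values are placeholders (the real numbers are in the screenshot above), and the helper `estimate_decode_latency_ms` is hypothetical. Only the layer count of 28 comes from the description.

```python
# Layer count of the 1.5B model, as stated in the PR description.
NUM_LAYERS = 28


def estimate_decode_latency_ms(per_op_latency_ms, num_layers=NUM_LAYERS):
    """Sum per-operator latencies of one layer and scale by the layer count.

    Returns (per_layer_ms, estimated_total_ms).
    """
    per_layer_ms = sum(per_op_latency_ms.values())
    return per_layer_ms, per_layer_ms * num_layers


# Placeholder values for illustration only; these are NOT measured results.
placeholder_ops_ms = {
    "rms_norm": 0.25,
    "attention": 1.5,
    "mlp": 2.25,
}

per_layer_ms, total_ms = estimate_decode_latency_ms(placeholder_ops_ms)
print(f"per-layer latency: {per_layer_ms} ms")
print(f"estimated decode latency for {NUM_LAYERS} layers: {total_ms} ms")
```

The estimate covers only the repeated Transformer layers, so work outside them (for example the embedding lookup and the output head) is not included, which is consistent with the result being close to, rather than equal to, the end-to-end time.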
