[frontend]: automatically inject profiling instructions in DynamoCompiler #665
Draft
trdthg wants to merge 1 commit into buddy-compiler:main
This commit introduces the `enable_profile` option to automatically measure
the execution time of each graph node. When enabled, the DynamoCompiler performs
the following instrumentation steps after `lower_to_top_level_ir`:
- Injects `rtclock` for timestamp acquisition and `record_timing` for data recording
- Injects a global string for each node in the format "op_name_{node_index}_{node_name}"
- Injects timing probes around each node to
calculate and record the elapsed execution time.
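
To make the probe scheme concrete, here is a minimal pure-Python sketch of the same idea. It is not the code from this PR: the node list, `make_label`, and `run_with_probes` are hypothetical names, `time.perf_counter` stands in for the injected `rtclock` call, and a dictionary write stands in for `record_timing`. The label format follows the one quoted in the commit message.

```python
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a lowered graph: (node_name, callable) pairs.
Node = Tuple[str, Callable[[float], float]]


def make_label(index: int, name: str) -> str:
    # Prefixing the node index keeps labels unique even when two nodes
    # share a name (assumed format: "op_name_{node_index}_{node_name}").
    return f"op_name_{index}_{name}"


def run_with_probes(nodes: List[Node], value: float) -> Tuple[float, Dict[str, float]]:
    timings: Dict[str, float] = {}
    for index, (name, fn) in enumerate(nodes):
        start = time.perf_counter()  # plays the role of the first rtclock call
        value = fn(value)            # the node being measured
        end = time.perf_counter()    # plays the role of the second rtclock call
        timings[make_label(index, name)] = end - start  # role of record_timing
    return value, timings


# Two nodes deliberately share the name "arg0_1" to show the labels stay distinct.
nodes: List[Node] = [
    ("arg0_1", lambda x: x),
    ("add", lambda x: x + 1.0),
    ("arg0_1", lambda x: x),
    ("output", lambda x: x * 2.0),
]
result, timings = run_with_probes(nodes, 1.0)
for label, seconds in timings.items():
    print(f"{label}: {seconds:.9f} s")
```

Because the index is part of the label, repeated node names do not overwrite each other's timings, and the labels sort in execution order as long as the indices have the same number of digits.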
Summary
This commit introduces the `enable_profile` option to automatically measure the execution time of each graph node. When enabled, the DynamoCompiler performs the following instrumentation steps after `lower_to_top_level_ir`:
- Injects `rtclock` for timestamp acquisition and `record_timing` for data recording

BuddyTransformer example output
There were duplicate entries for `arg0_1` and `output` in the nodes list. I don't know why, so I prepended an ID to each string as a workaround. The IDs might also be useful for sorting.

Checklist