@jeffdaily is this indeed the case?
Also, why are you adding backticks here?
ROCm has full upstream support for both the modern PyTorch profiler, via kineto and ROCm's libroctracer, and the older autograd profiler, via ROCm's roctx.
Run any application, PyTorch or otherwise, using ROCm's rocprof. The PyTorch GPU traces will be collected as part of your rocprof output.
@malfet I quoted the `supported_activities` function from the official PyTorch API, which does not mention anything about ROCm GPUs. If ROCm profiling is supported, I suppose the docs should say something about it there.
Also, the backticks are there because of an error in the PySpelling test: `ROCm` is not in the dictionary, so the test flagged it as a misspelled word. I suppose I can either keep the backticks or add the word to a custom dictionary list.
We should update that comment. The ROCm profiler uses the roctracer library to trace on-device HIP kernels. Passing `ProfilerActivity.CUDA` is the correct activity to use for both CUDA and ROCm, because ROCm PyTorch "masquerades" as CUDA so that users do not have to make any changes to their PyTorch models when running on ROCm. PyTorch chose to expose "CUDA" in its public APIs, and we chose to reuse those APIs to make the transition from CUDA to ROCm easier for users.
See https://pytorch.org/docs/stable/notes/hip.html.
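For readers following the thread, the point about `ProfilerActivity.CUDA` can be illustrated with a minimal sketch. It assumes a recent PyTorch build with `torch.profiler` available; the `gpu_backend` helper, the tensor size, and the matmul workload are hypothetical choices for illustration and are not part of this PR. The same script is meant to run unchanged on CUDA, ROCm, or CPU-only builds.

```python
import torch
from torch.profiler import ProfilerActivity, profile


def gpu_backend() -> str:
    """Hypothetical helper: report which GPU stack this PyTorch build targets."""
    # torch.version.hip is a version string on ROCm builds and None otherwise.
    if getattr(torch.version, "hip", None) is not None:
        return "ROCm"
    if torch.version.cuda is not None:
        return "CUDA"
    return "CPU-only"


# Always trace CPU; add the "CUDA" activity whenever a GPU is usable.
# On ROCm builds torch.cuda.is_available() is also the check to use,
# since ROCm PyTorch reuses the torch.cuda namespace.
activities = [ProfilerActivity.CPU]
if torch.cuda.is_available():
    activities.append(ProfilerActivity.CUDA)

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(256, 256, device=device)

with profile(activities=activities) as prof:
    y = x @ x  # small workload so the trace has something to record

print(f"backend: {gpu_backend()}, device: {device}")
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```

Nothing in the script branches on ROCm versus CUDA when choosing activities; `gpu_backend` only reports which build is in use.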