Commit 647f422

Updating Mem Profiler content
1 parent 8f0556f commit 647f422

1 file changed (+22, -21 lines)

_posts/2020-7-20-pytorch-1.6-released.md

Lines changed: 22 additions & 21 deletions
@@ -53,33 +53,34 @@ print(example(torch.ones([])))
 
 ## [Beta] Memory Profiler
 
-The `torch.autograd` API now includes a memory profiler that lets you inspect the cost of different operators inside your CPU and GPU models. There are two modes implemented at the moment - CPU-only using [profile](https://pytorch.org/docs/master/autograd.html#torch.autograd.profiler.profile) and nvprof based (registers both CPU and GPU activity) using [emit_nvtx](https://pytorch.org/docs/master/autograd.html#torch.autograd.profiler.emit_nvtx).
+The `torch.autograd.profiler` API now includes a memory profiler that lets you inspect the tensor memory cost of different operators inside your CPU and GPU models.
 
 Here is an example usage of the API:
 
 ```python
-x = torch.randn((1, 1), requires_grad=True)
-with torch.autograd.profiler.profile(profile_memory=True) as prof:
-    for _ in range(100):  # any normal python code, really!
-        y = x ** 2
-        y.backward()
-# NOTE: some columns were removed for brevity
-print(prof.key_averages().table(sort_by="self_cpu_time_total"))
+import torch
+import torchvision.models as models
+import torch.autograd.profiler as profiler
+
+model = models.resnet18()
+inputs = torch.randn(5, 3, 224, 224)
+with profiler.profile(profile_memory=True, record_shapes=True) as prof:
+    model(inputs)
+
+# NOTE: some columns were removed for brevity
+print(prof.key_averages().table(sort_by="self_cpu_memory_usage", row_limit=10))
+# ---------------------------  ---------------  ---------------  ---------------
+# Name                         CPU Mem          Self CPU Mem     Number of Calls
+# ---------------------------  ---------------  ---------------  ---------------
+# empty                        94.79 Mb         94.79 Mb         123
+# resize_                      11.48 Mb         11.48 Mb         2
+# addmm                        19.53 Kb         19.53 Kb         1
+# empty_strided                4 b              4 b              1
+# conv2d                       47.37 Mb         0 b              20
+# ---------------------------  ---------------  ---------------  ---------------
 ```
 
-```python
--------------------------------  --------------  ------------  ---------------  -------
-Name                             Self CPU total  CPU time avg  Number of Calls  CPU Mem
--------------------------------  --------------  ------------  ---------------  -------
-mul                              32.048ms        32.048ms      200              800 b
-pow                              27.041ms        27.041ms      200              800 b
-PowBackward0                     9.727ms         55.483ms      100
-torch::autograd::AccumulateGrad  9.148ms         9.148ms       100
-torch::autograd::GraphRoot       691.816us       691.816us     100
--------------------------------  --------------  ------------  ---------------  -------
-```
-
-* Design doc ([Link](https://github.com/pytorch/pytorch/pull/37775))
+* PR ([Link](https://github.com/pytorch/pytorch/pull/37775))
 * Documentation ([Link](https://pytorch.org/docs/stable/autograd.html#profiler))
 
 # Distributed Training & RPC
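
Reviewer note: the example added in this commit needs `torchvision` to build `resnet18`. For trying the same `profile_memory=True` call without that dependency, a minimal sketch might look like the following. The small `Sequential` model and its layer sizes are placeholders chosen for illustration and are not part of the post. The reported numbers will differ by machine and PyTorch version, so no expected output is shown.

```python
import torch
import torch.autograd.profiler as profiler

# Hypothetical stand-in model; the post itself profiles torchvision's resnet18.
model = torch.nn.Sequential(
    torch.nn.Linear(128, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 10),
)
inputs = torch.randn(32, 128)

# profile_memory=True asks the profiler to track tensor allocations per operator;
# record_shapes=True additionally records operator input shapes.
with profiler.profile(profile_memory=True, record_shapes=True) as prof:
    model(inputs)

# Sort operators by the memory they allocated themselves (excluding children).
table = prof.key_averages().table(sort_by="self_cpu_memory_usage", row_limit=10)
print(table)
```

Sorting by `self_cpu_memory_usage` rather than a total surfaces the operators that allocate memory directly, which matches the ordering used in the post's updated example.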
