Updating Mem Profiler content

brucejlin1 · web-flow · commit 647f422f6c08 · 2020-07-28T08:12:40.000-07:00
diff --git a/_posts/2020-7-20-pytorch-1.6-released.md b/_posts/2020-7-20-pytorch-1.6-released.md
@@ -53,33 +53,34 @@ print(example(torch.ones([])))
 
 ## [Beta] Memory Profiler 
 
-The `torch.autograd` API now includes a memory profiler that lets you inspect the cost of different operators inside your CPU and GPU models. There are two modes implemented at the moment - CPU-only using [profile](https://pytorch.org/docs/master/autograd.html#torch.autograd.profiler.profile) and nvprof based (registers both CPU and GPU activity) using [emit_nvtx](https://pytorch.org/docs/master/autograd.html#torch.autograd.profiler.emit_nvtx). 
+The `torch.autograd.profiler` API now includes a memory profiler that lets you inspect the tensor memory cost of different operators inside your CPU and GPU models.
 
 Here is an example usage of the API:
 
 ```python
-x = torch.randn((1, 1), requires_grad=True)
-with torch.autograd.profiler.profile(profile_memory=True) as prof:
-    for _ in range(100): # any normal python code, really!
-        y = x ** 2
-        y.backward()
-    # NOTE: some columns were removed for brevity
-print(prof.key_averages().table(sort_by="self_cpu_time_total"))
+import torch
+import torchvision.models as models
+import torch.autograd.profiler as profiler
+
+model = models.resnet18()
+inputs = torch.randn(5, 3, 224, 224)
+with profiler.profile(profile_memory=True, record_shapes=True) as prof:
+    model(inputs)
+
+# NOTE: some columns were removed for brevity
+print(prof.key_averages().table(sort_by="self_cpu_memory_usage", row_limit=10))
+# ---------------------------  ---------------  ---------------  ---------------
+# Name                         CPU Mem          Self CPU Mem     Number of Calls
+# ---------------------------  ---------------  ---------------  ---------------
+# empty                        94.79 Mb         94.79 Mb         123
+# resize_                      11.48 Mb         11.48 Mb         2
+# addmm                        19.53 Kb         19.53 Kb         1
+# empty_strided                4 b              4 b              1
+# conv2d                       47.37 Mb         0 b              20
+# ---------------------------  ---------------  ---------------  ---------------
  ```
 
-```python
-------------------------------- ---------   -------------  ---------------   -------
-Name Self                       CPU total   CPU time avg   Number of Calls   CPU Mem
-------------------------------- ---------   -------------  ---------------   -------
-mul                             32.048ms    32.048ms       200               800 b
-pow                             27.041ms    27.041ms       200               800 b
-PowBackward                     09.727ms    55.483ms       100
-torch::autograd::AccumulateGrad 9.148ms     9.148ms        100
-torch::autograd::GraphRoot      691.816us   691.816us      100
-------------------------------- ---------   -------------  ---------------   -------
-```
-
-* Design doc ([Link](https://github.com/pytorch/pytorch/pull/37775))
+* PR ([Link](https://github.com/pytorch/pytorch/pull/37775))
 * Documentation ([Link](https://pytorch.org/docs/stable/autograd.html#profiler))
 
 # Distributed Training & RPC