Commit c6cded4 (1 parent da7a6cd): Update 2020-7-20-pytorch-1.6-released.md

1 file changed: 23 additions, 56 deletions

_posts/2020-7-20-pytorch-1.6-released.md

title: 'PyTorch 1.6 released w/ Native AMP Support, Microsoft joins as maintainer'
author: Team PyTorch
---

Today, we’re announcing the availability of PyTorch 1.6, along with updated domain libraries. We are also excited to announce that the team at Microsoft is now maintaining Windows builds and binaries and will also be supporting the community on GitHub as well as the PyTorch Windows [discussion forums](https://discuss.pytorch.org/c/windows/).

The PyTorch 1.6 release includes a number of new APIs, tools for performance improvement and profiling, as well as major updates to both distributed data parallel (DDP) and remote procedure call (RPC) based distributed training.

A few of the highlights include:

1. Automatic mixed precision (AMP) training is now natively supported and a stable feature (see below for more details), thanks to NVIDIA’s contributions;
2. Native TensorPipe support now added for tensor-aware, point-to-point communication primitives built specifically for machine learning;
3. Added support for complex tensors to the frontend API surface;
4. New profiling tools providing tensor-level memory consumption information; and
Additionally, from this release onward, features will be classified as Stable, Beta and Prototype.

## [Stable] Automatic Mixed Precision (AMP)

AMP allows users to easily enable automatic mixed precision training, enabling higher performance and memory savings of up to 50% on Tensor Core GPUs. Using the natively supported `torch.cuda.amp` API, AMP provides convenience methods for mixed precision, where some operations use the `torch.float32` (float) datatype and other operations use `torch.float16` (half). Some ops, like linear layers and convolutions, are much faster in `float16`. Other ops, like reductions, often require the dynamic range of `float32`. Mixed precision tries to match each op to its appropriate datatype.

* Design doc ([Link](https://github.com/pytorch/pytorch/issues/25081))
* Documentation ([Link](https://pytorch.org/docs/stable/amp.html))
* Usage examples ([Link](https://pytorch.org/docs/stable/notes/amp_examples.html))

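The workflow above can be sketched as a minimal training loop. The tiny model, random data, and hyperparameters below are illustrative placeholders (not from the post), and the loop falls back to full precision when no GPU is present:

```python
import torch

# Minimal AMP training-loop sketch using the torch.cuda.amp API.
# The model and data are placeholders; AMP only kicks in on a CUDA device.
use_amp = torch.cuda.is_available()
device = "cuda" if use_amp else "cpu"

model = torch.nn.Linear(128, 10).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
# GradScaler rescales the loss so float16 gradients do not underflow
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

inputs = torch.randn(32, 128, device=device)
targets = torch.randint(0, 10, (32,), device=device)

for _ in range(3):
    optimizer.zero_grad()
    # autocast runs matmul-heavy ops in float16 and reductions in float32
    with torch.cuda.amp.autocast(enabled=use_amp):
        loss = torch.nn.functional.cross_entropy(model(inputs), targets)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()

print(float(loss))
```

The same loop runs unchanged in full precision when `enabled=False`, which is what makes AMP easy to adopt incrementally.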
## [Beta] Fork/Join Parallelism

This release adds support for a language-level construct as well as runtime support for executing TorchScript programs in parallel.

Parallel execution of TorchScript programs is enabled through two primitives: `torch.jit.fork` and `torch.jit.wait`. In the below example, we parallelize execution of `foo`:

```python
import torch
from typing import List

# … (the bodies of foo and example are elided in this diff)

print(example(torch.ones([])))
```

* Documentation ([Link](https://pytorch.org/docs/stable/jit.html))

## [Beta] Memory Profiler

```
…
torch::autograd::GraphRoot          691.816us       691.816us       100
----------------------------------- --------------- --------------- ---------------
```

* Design doc ([Link](https://github.com/pytorch/pytorch/pull/37775))
* Documentation ([Link](https://pytorch.org/docs/stable/autograd.html#profiler))

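A minimal sketch of the tensor-level memory reporting, assuming the `profile_memory` flag on the autograd profiler; the small matmul/softmax workload below is illustrative:

```python
import torch
from torch.autograd import profiler

# Profile a tiny workload and record per-operator memory usage
x = torch.randn(100, 100)
with profiler.profile(profile_memory=True, record_shapes=True) as prof:
    y = x.mm(x)
    z = y.softmax(dim=1)

# Aggregate per-operator stats, sorted by memory each op itself allocated
print(prof.key_averages().table(sort_by="self_cpu_memory_usage", row_limit=5))
```

The printed table includes memory columns alongside the familiar timing columns, similar to the truncated output shown above.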
# Distributed Training & RPC

```python
torch.distributed.rpc.init_rpc(
    # … (arguments elided in this diff)
)
torch.distributed.rpc.rpc_sync(...)
```

* Design doc ([Link](https://github.com/pytorch/pytorch/issues/35251))
* Documentation ([Link](https://pytorch.org/docs/stable/rpc/index.html))
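Since the snippet above is elided by the diff, here is a runnable single-process sketch of the `init_rpc`/`rpc_sync` pattern; the worker name, rendezvous address, and payload are illustrative, not from the post:

```python
import os
import torch
import torch.distributed.rpc as rpc

# Rendezvous settings for a single-process, self-addressed RPC group
os.environ["MASTER_ADDR"] = "127.0.0.1"
os.environ["MASTER_PORT"] = "29500"

rpc.init_rpc("worker0", rank=0, world_size=1)

# rpc_sync ships tensors point-to-point and blocks until the result returns
ret = rpc.rpc_sync("worker0", torch.add, args=(torch.ones(2), torch.ones(2)))
print(ret)  # tensor([2., 2.])

rpc.shutdown()
```

In a real deployment each rank runs on its own host and `rpc_sync` targets a remote worker name rather than the local one.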

## [Beta] DDP+RPC

PyTorch Distributed supports two powerful paradigms: DDP for full sync data parallel training of models and the RPC framework, which allows for distributed model parallelism. Previously, these two features worked independently and users couldn’t mix and match them to try out hybrid parallelism paradigms.

Starting in PyTorch 1.6, we’ve enabled DDP and RPC to work together seamlessly so that users can combine these two techniques to achieve both data parallelism and model parallelism. An example is where users would like to place large embedding tables on parameter servers and use the RPC framework for embedding lookups, but store smaller dense parameters on trainers and use DDP to synchronize the dense parameters. Below is a simple code snippet.

```python
for data in batch:
    # … (forward pass and loss computation elided in this diff)
    torch.distributed.autograd.backward([loss])
```

* DDP+RPC Tutorial ([Link](https://pytorch.org/tutorials/advanced/rpc_ddp_tutorial.html))
* Documentation ([Link](https://pytorch.org/docs/stable/rpc/index.html))
* Usage Examples ([Link](https://github.com/pytorch/examples/pull/800))

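The combination can be sketched in a single process: DDP synchronizes the dense module's gradients while RPC stands in for parameter-server lookups. The worker names, ports, and toy model below are illustrative assumptions, not the tutorial's code:

```python
import os
import torch
import torch.distributed as dist
import torch.distributed.rpc as rpc
from torch.nn.parallel import DistributedDataParallel as DDP

os.environ["MASTER_ADDR"] = "127.0.0.1"
os.environ["MASTER_PORT"] = "29500"

# Process group that DDP uses to all-reduce dense gradients
dist.init_process_group("gloo", rank=0, world_size=1)
# Separate rendezvous on another port for the RPC layer
opts = rpc.TensorPipeRpcBackendOptions(init_method="tcp://127.0.0.1:29501")
rpc.init_rpc("trainer0", rank=0, world_size=1, rpc_backend_options=opts)

# Small dense parameters live on the trainer and are wrapped in DDP
dense = DDP(torch.nn.Linear(8, 4))

# A real setup would rpc_sync an embedding lookup on a parameter server
features = rpc.rpc_sync("trainer0", torch.ones, args=((2, 8),))
loss = dense(features).sum()
loss.backward()  # DDP all-reduces the dense gradients here

rpc.shutdown()
dist.destroy_process_group()
```

With more than one trainer, each rank would run this same script; DDP keeps the dense parameters in sync while the embedding table stays sharded behind RPC.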
## [Beta] RPC - Asynchronous User Functions

```python
ret = rpc.rpc_sync(
    # … (arguments elided in this diff)
)

print(ret)  # prints tensor([3., 3.])
```

* Tutorial for performant batch RPC using Asynchronous User Functions ([Link](https://github.com/pytorch/tutorials/blob/release/1.6/intermediate_source/rpc_async_execution.rst))
* Documentation ([Link](https://pytorch.org/docs/stable/rpc.html#torch.distributed.rpc.functions.async_execution))
* Usage examples ([Link](https://github.com/pytorch/examples/tree/stable/distributed/rpc/batch))

# Frontend API Updates

The PyTorch 1.6 release brings beta level support for complex tensors including…

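A brief sketch of the new complex tensor support; the values below are illustrative:

```python
import torch

# Complex tensors carry real and imaginary parts in one dtype
z = torch.tensor([1 + 2j, 3 - 1j], dtype=torch.complex64)

print(z.real)        # tensor([1., 3.])
print(z.imag)        # tensor([ 2., -1.])
print(z.conj() * z)  # squared magnitudes, as complex values
```

Real and imaginary views share storage with the complex tensor, so existing real-valued ops compose naturally with complex data.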
## torchvision 0.7

torchvision 0.7 introduces two new pretrained semantic segmentation models, [FCN ResNet50](https://arxiv.org/abs/1411.4038) and [DeepLabV3 ResNet50](https://arxiv.org/abs/1706.05587), both trained on COCO and using smaller memory footprints than the ResNet101 backbone. We also introduced support for AMP (Automatic Mixed Precision) autocasting for torchvision models and operators, which automatically selects the floating point precision for different GPU operations to improve performance while maintaining accuracy.

* Release notes ([Link](https://github.com/pytorch/vision/releases))

## torchaudio 0.6

torchaudio now officially supports Windows. This release also introduces a new model module (with wav2letter included), new functionals (contrast, cvm, dcshift, overdrive, vad, phaser, flanger, biquad), datasets (GTZAN, CMU), and a new optional sox backend with support for TorchScript.

* Release notes ([Link](https://github.com/pytorch/audio/releases))

# Additional updates

This is a great opportunity to connect with the community and practice your machine learning skills.

## LPCV Challenge

The [2020 CVPR Low-Power Vision Challenge (LPCV) - Online Track for UAV video](https://lpcv.ai/2020CVPR/video-track) submission deadline is coming up shortly. You have until July 31, 2020 to build a system that can accurately discover and recognize characters in video captured by an unmanned aerial vehicle (UAV), using PyTorch and a Raspberry Pi 3B+.

## Prototype Features

To reiterate, prototype features in PyTorch are early features that we are looking for feedback on…

Cheers!
Team PyTorch