_posts/2021-6-14-pytorch-1.9-new-library-releases.md
Today, we are announcing updates to a number of PyTorch libraries, alongside the PyTorch 1.9 release.
Some highlights include:
* **TorchVision** - Added new SSD and SSDLite models, quantized kernels for object detection, GPU JPEG decoding, and iOS support. See the [release notes](https://github.com/pytorch/vision/releases).
* **TorchAudio** - Added a wav2vec 2.0 model deployable in non-Python environments (including C++, Android, and iOS), many performance improvements to lfilter, spectral operations, and resampling, options for quality control in resampling (i.e., Kaiser window support), the start of a migration to native complex tensor operations, and improved autograd support. See the [release notes](https://github.com/pytorch/audio/releases).
* **TorchText** - Added a new high-performance Vocab module that provides common functional APIs for NLP workflows. See the [release notes](https://github.com/pytorch/text/releases).
We’d like to thank the community for their support and work on this latest release.
Features in PyTorch releases are classified as Stable, Beta, and Prototype.
# TorchVision 0.10
### (Stable) Quantized kernels for object detection
The forward pass of the nms and roi_align operators now supports tensors with a quantized dtype, which can help lower the memory footprint of object detection models, particularly in mobile environments. For more details, refer to [the documentation](https://pytorch.org/vision/stable/ops.html#torchvision.ops.roi_align).
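As a refresher on what the operator computes, non-max suppression can be sketched in a few lines of plain Python (an illustration of the algorithm only, not the quantized torchvision kernel):

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) form."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_threshold):
    """Greedily keep the highest-scoring box, dropping any remaining box
    whose IoU with it exceeds the threshold; returns kept indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores, iou_threshold=0.5))  # [0, 2]: the two overlapping boxes collapse to one
```

Running this on quantized tensors instead of Python lists is exactly where the new kernels save memory.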
### (Stable) Speed optimizations for Tensor transforms
The resize and flip transforms have been optimized, and their runtime has improved by up to 5x on the CPU.
### (Stable) Documentation improvements
Significant improvements were made to the documentation. In particular, a new gallery of examples is available. These examples visually illustrate how each transform acts on an image, and the output of the segmentation models is now properly documented and illustrated.
The example gallery will be extended in the future to provide more comprehensive examples and serve as a reference for common torchvision tasks. For more details, refer to [the documentation](https://pytorch.org/vision/stable/auto_examples/index.html).
### (Beta) New models for detection
[SSD](https://arxiv.org/abs/1512.02325) and [SSDlite](https://arxiv.org/abs/1801.04381) are two popular object detection architectures that are efficient in terms of speed and provide good results for low-resolution pictures. In this release, we provide implementations for the original SSD model with a VGG16 backbone and for its mobile-friendly variant SSDlite with a MobileNetV3-Large backbone.
The models were pre-trained on COCO train2017 and can be used as follows:
Accuracy numbers on COCO val2017 are available in the release notes.
TorchVision 0.10 now provides pre-compiled iOS binaries for its C++ operators, which can be used in iOS applications.
# TorchAudio 0.9.0
### (Stable) Complex Tensor Migration
TorchAudio has functions that handle complex-valued tensors. These functions follow a convention of using an extra dimension to represent the real and imaginary parts. In PyTorch 1.6, the native complex type was introduced. As its API stabilizes, torchaudio has started to migrate to the native complex type.
In this release, we added support for native complex tensors, and you can opt in to use them. Using the native complex types, we have verified that affected functions continue to support autograd and TorchScript; moreover, switching to native complex types improves their performance. For more details, refer to [pytorch/audio#1337](https://github.com/pytorch/audio/issues/1337).
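To make the two representations concrete, here is a plain-Python illustration of the convention (in PyTorch itself, `torch.view_as_complex` and `torch.view_as_real` convert between the two layouts):

```python
# Pseudo-complex convention: each value is a [real, imag] pair stored in an
# extra trailing dimension; native representation: one complex number.

def to_native(pseudo):
    """Convert a list of [real, imag] pairs to Python complex numbers."""
    return [complex(re, im) for re, im in pseudo]

def to_pseudo(native):
    """Convert complex numbers back to [real, imag] pairs."""
    return [[z.real, z.imag] for z in native]

pseudo = [[1.0, 2.0], [3.0, -4.0]]   # shape (..., 2) in tensor terms
native = to_native(pseudo)
print(native)                         # [(1+2j), (3-4j)]
assert to_pseudo(native) == pseudo    # lossless round-trip
```

With native complex dtypes, spectral functions can drop the trailing dimension bookkeeping entirely.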
We have added the model architectures from [Wav2Vec2.0](https://arxiv.org/abs/2006.11477).
The following code snippet illustrates such a use case. Please check out our [c++ example directory](https://github.com/pytorch/audio/tree/master/examples/libtorchaudio) for the complete example. Currently, it is designed for running inference. If you would like more support for training, please file a feature request.
```python
# Import fine-tuned model from Hugging Face Hub
from transformers import Wav2Vec2ForCTC
from torchaudio.models.wav2vec2.utils import import_huggingface_model

original = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")
imported = import_huggingface_model(original)

# Import fine-tuned model from fairseq
import fairseq
from torchaudio.models.wav2vec2.utils import import_fairseq_model
```
For more details, see [the documentation](https://pytorch.org/audio/0.9.0/models.html#wav2vec2-0).
In release 0.8, we vectorized the operation in ```torchaudio.compliance.kaldi.resample_waveform```.
We have:
* Added Kaiser window support for a wider range of resampling quality.
* Added ```rolloff``` parameter for anti-aliasing control.
* Added the mechanism to precompute the kernel and cache it in ```torchaudio.transforms.Resample``` for even faster operation.
* Moved the implementation from ```torchaudio.compliance.kaldi.resample_waveform``` to ```torchaudio.functional.resample``` and deprecated ```torchaudio.compliance.kaldi.resample_waveform```.
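For intuition about the Kaiser window option, here is a small pure-Python sketch of the window itself (an illustration, not torchaudio's implementation); the `beta` parameter trades main-lobe width against side-lobe attenuation in the interpolation filter:

```python
import math

def bessel_i0(x, terms=25):
    """Zeroth-order modified Bessel function I0(x) via its power series."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2  # term_k = (x/2)^(2k) / (k!)^2
        total += term
    return total

def kaiser_window(n, beta):
    """Symmetric Kaiser window of length n."""
    denom = bessel_i0(beta)
    return [
        bessel_i0(beta * math.sqrt(1.0 - (2.0 * i / (n - 1) - 1.0) ** 2)) / denom
        for i in range(n)
    ]

w = kaiser_window(9, beta=14.0)
# beta=0 degenerates to a rectangular window; larger beta tapers harder.
assert all(abs(v - 1.0) < 1e-12 for v in kaiser_window(9, beta=0.0))
assert abs(w[4] - 1.0) < 1e-12  # peak at the center
```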
For more details, see [the documentation](https://pytorch.org/audio/0.9.0/transforms.html#resample).