Commit 20f6f6f

Merge pull request pytorch#588 from pytorch/PyTorch-1.8-Blogs
PyTorch 1.8 Blog Posts
2 parents d8de607 + 2163ca5 commit 20f6f6f

3 files changed, 384 additions and 0 deletions

Lines changed: 158 additions & 0 deletions
@@ -0,0 +1,158 @@
1+
---
2+
layout: blog_detail
3+
title: 'New PyTorch library releases including TorchVision Mobile, TorchAudio I/O, and more'
4+
author: Team PyTorch
5+
---
6+
7+
Today, we are announcing updates to a number of PyTorch libraries, alongside the [PyTorch 1.8 release](https://pytorch.org/blog/pytorch-1.8-released). The updates include new releases of the domain libraries TorchVision, TorchText and TorchAudio, as well as a new version of TorchCSPRNG. These releases include a number of new features and improvements and, along with the PyTorch 1.8 release, provide a broad set of updates for the PyTorch community to build on.
8+
9+
Some highlights include:
10+
* **TorchVision** - Added support for PyTorch Mobile including [Detectron2Go](https://ai.facebook.com/blog/d2go-brings-detectron2-to-mobile) (D2Go), auto-augmentation of data during training, on-the-fly type conversion, and [AMP autocasting](https://pytorch.org/docs/stable/amp.html).
11+
* **TorchAudio** - Major improvements to I/O, including defaulting to the sox_io backend and file-like object support. Added the Kaldi Pitch feature and support for a CMake-based build, allowing TorchAudio to better support non-Python environments.
12+
* **TorchText** - Updated the dataset loading API to be compatible with standard PyTorch data loading utilities.
13+
* **TorchCSPRNG** - Support for cryptographically secure pseudorandom number generators for PyTorch is now stable with new APIs for AES128 ECB/CTR and CUDA support on Windows.
14+
15+
Please note that, starting in PyTorch 1.6, features are classified as Stable, Beta, and Prototype. Prototype features are not included as part of the binary distribution and are instead available through either building from source, using nightlies or via a compiler flag. You can see the detailed announcement [here](https://pytorch.org/blog/pytorch-feature-classification-changes/).
16+
17+
18+
# TorchVision 0.9.0
19+
### [Stable] TorchVision Mobile: Operators, Android Binaries, and Tutorial
20+
We are excited to announce the first on-device support and binaries for a PyTorch domain library. We have seen significant appetite in both research and industry for on-device vision support to allow low latency, privacy friendly, and resource efficient mobile vision experiences. You can follow this [new tutorial](https://github.com/pytorch/android-demo-app/tree/d2go/D2Go) to build your own Android object detection app using TorchVision operators, D2Go, or your own custom operators and model.
21+
22+
<div class="text-center">
23+
<img src="{{ site.url }}/assets/images/tochvisionmobile.png" width="30%">
24+
</div>
25+
26+
### [Stable] New Mobile models for Classification, Object Detection and Semantic Segmentation
27+
We have added support for the MobileNetV3 architecture and provided pre-trained weights for Classification, Object Detection and Segmentation. It is easy to get up and running with these models: just import and load them as you would any ```torchvision``` model:
28+
```python
import torch
import torchvision

# Classification
x = torch.rand(1, 3, 224, 224)
m_classifier = torchvision.models.mobilenet_v3_large(pretrained=True)
m_classifier.eval()
predictions = m_classifier(x)

# Quantized Classification
x = torch.rand(1, 3, 224, 224)
m_classifier = torchvision.models.quantization.mobilenet_v3_large(pretrained=True)
m_classifier.eval()
predictions = m_classifier(x)

# Object Detection: Highly Accurate High Resolution Mobile Model
x = [torch.rand(3, 300, 400), torch.rand(3, 500, 400)]
m_detector = torchvision.models.detection.fasterrcnn_mobilenet_v3_large_fpn(pretrained=True)
m_detector.eval()
predictions = m_detector(x)

# Semantic Segmentation: Highly Accurate Mobile Model
x = torch.rand(1, 3, 520, 520)
m_segmenter = torchvision.models.segmentation.deeplabv3_mobilenet_v3_large(pretrained=True)
m_segmenter.eval()
predictions = m_segmenter(x)
```
56+
These models are highly competitive with TorchVision’s existing models on resource efficiency, speed, and accuracy. See our [release notes](https://github.com/pytorch/vision/releases) for detailed performance metrics.
57+
58+
### [Stable] AutoAugment
59+
[AutoAugment](https://arxiv.org/pdf/1805.09501.pdf) is a common Data Augmentation technique that can increase the accuracy of Scene Classification models. Though the data augmentation policies are directly linked to their trained dataset, empirical studies show that ImageNet policies provide significant improvements when applied to other datasets. We’ve implemented 3 policies learned on the following datasets: ImageNet, CIFAR10 and SVHN. These can be used standalone or mixed-and-matched with existing transforms:
60+
```python
61+
from torchvision import transforms
62+
63+
t = transforms.AutoAugment()
64+
transformed = t(image)
65+
66+
67+
transform=transforms.Compose([
68+
transforms.Resize(256),
69+
transforms.AutoAugment(),
70+
transforms.ToTensor()])
71+
```
72+
### Other New Features for TorchVision
73+
* [Stable] All read and decode methods in the io.image package now support:
74+
    * Palette, Grayscale Alpha and RGB Alpha image types during PNG decoding
75+
    * On-the-fly conversion of image from one type to the other during read
76+
* [Stable] WiderFace dataset
77+
* [Stable] Improved FasterRCNN speed and accuracy by introducing a score threshold on RPN
78+
* [Stable] Modulation input for DeformConv2D
79+
* [Stable] Option to write audio to a video file
80+
* [Stable] Utility to draw bounding boxes
81+
* [Beta] Autocast support in all Operators
82+
Find the full TorchVision release notes [here](https://github.com/pytorch/vision/releases).
83+
84+
# TorchAudio 0.8.0
85+
### I/O Improvements
86+
We have continued our work from the [previous release](https://github.com/pytorch/audio/releases/tag/v0.7.0) to improve TorchAudio’s I/O support, including:
87+
* [Stable] Changing the default backend to “sox_io” (for Linux/macOS), and updating the “soundfile” backend’s interface to align with that of “sox_io”. The legacy backend and interface are still accessible, though their use is strongly discouraged.
88+
* [Stable] File-like object support in the “sox_io” backend, the “soundfile” backend and sox_effects.
89+
* [Stable] New options to change the format, encoding, and bits_per_sample when saving.
90+
* [Stable] Added GSM, HTK, AMB, AMR-NB and AMR-WB format support to the “sox_io” backend.
91+
* [Beta] A new ```functional.apply_codec``` function which can degrade audio data by applying audio codecs supported by the “sox_io” backend, entirely in memory.
92+
Here are some examples of features that landed in this release:
93+
94+
```python
import io

import boto3
import requests
import torchaudio

# URL, S3_BUCKET and S3_KEY are placeholders to be defined by the caller.

# Load audio over HTTP
with requests.get(URL, stream=True) as response:
    waveform, sample_rate = torchaudio.load(response.raw)

# Save to a bytes buffer as 16-bit signed integer PCM
buffer_ = io.BytesIO()
torchaudio.save(
    buffer_, waveform, sample_rate,
    format="wav", encoding="PCM_S", bits_per_sample=16)

# Apply effects while loading audio from S3
client = boto3.client('s3')
response = client.get_object(Bucket=S3_BUCKET, Key=S3_KEY)
waveform, sample_rate = torchaudio.sox_effects.apply_effect_file(
    response['Body'],
    [["lowpass", "-1", "300"], ["rate", "8000"]])

# Apply GSM codec to Tensor
encoded = torchaudio.functional.apply_codec(
    waveform, sample_rate, format="gsm")
```
116+
117+
Check out the revamped audio preprocessing tutorial, [Audio Manipulation with TorchAudio](https://pytorch.org/tutorials/beginner/audio_preprocessing_tutorial.html).
118+
119+
### [Stable] Switch to CMake-based build
120+
The previous version of TorchAudio used CMake to build third-party dependencies. Starting in 0.8.0, TorchAudio also uses CMake to build its C++ extension. This opens the door to integrating TorchAudio in non-Python environments (such as C++ applications and mobile). We will continue working on adding example applications and mobile integrations.
121+
122+
### [Beta] Improved and New Audio Transforms
123+
We have added two widely requested operators in this release: the SpectralCentroid transform and the Kaldi Pitch feature extraction (detailed in ["A pitch extraction algorithm tuned for automatic speech recognition"](https://ieeexplore.ieee.org/document/6854049)). We’ve also exposed a normalization method to Mel transforms, and additional STFT arguments to Spectrogram. We would like to ask our community to continue to [raise feature requests](https://github.com/pytorch/audio/issues/new?assignees=&labels=&template=feature-request.md) for core audio processing features like these!
124+
125+
### Community Contributions
126+
We had more contributions from the open source community in this release than ever before, including several completely new features. We would like to extend our sincere thanks to the community. Please check out the newly added [CONTRIBUTING.md](https://github.com/pytorch/audio/blob/master/CONTRIBUTING.md) for ways to contribute code, and remember that reporting bugs and requesting features are just as valuable. We will continue posting well-scoped work items as issues labeled “help-wanted” and “contributions-welcome” for anyone who would like to contribute code, and are happy to coach new contributors through the contribution process.
127+
128+
Find the full TorchAudio release notes [here](https://github.com/pytorch/audio/releases).
129+
130+
# TorchText 0.9.0
131+
### [Beta] Dataset API Updates
132+
In this release, we are updating TorchText’s dataset API to be compatible with PyTorch data utilities, such as DataLoader, and are deprecating TorchText’s custom data abstractions such as ```Field```. The updated datasets are simple string-by-string iterators over the data. For guidance about migrating from the legacy abstractions to use modern PyTorch data utilities, please refer to our [migration guide](https://github.com/pytorch/text/blob/master/examples/legacy_tutorial/migration_tutorial.ipynb).
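
To illustrate what compatibility with `DataLoader` can look like, here is a minimal sketch. The `samples` list is a hypothetical stand-in for an updated dataset yielding `(label, text)` pairs, and the whitespace tokenizer, toy vocabulary and `collate_batch` helper are assumptions for this example rather than TorchText APIs.

```python
import torch
from torch.utils.data import DataLoader

# Hypothetical stand-in for an updated TorchText dataset: (label, raw text) pairs.
samples = [
    (1, "stocks rally as markets open"),
    (2, "team wins the championship game"),
    (1, "markets close lower"),
    (2, "the game goes to overtime"),
]

# Toy whitespace vocabulary built from the samples (index 0 is reserved for unknown tokens).
vocab = {"<unk>": 0}
for _, text in samples:
    for token in text.split():
        vocab.setdefault(token, len(vocab))


def collate_batch(batch):
    """Turn raw (label, text) pairs into tensors: labels, flat token ids, start offsets."""
    labels, token_ids, offsets = [], [], [0]
    for label, text in batch:
        ids = [vocab.get(token, 0) for token in text.split()]
        labels.append(label - 1)  # shift labels to start at 0
        token_ids.extend(ids)
        offsets.append(offsets[-1] + len(ids))
    return torch.tensor(labels), torch.tensor(token_ids), torch.tensor(offsets[:-1])


loader = DataLoader(samples, batch_size=2, shuffle=False, collate_fn=collate_batch)
batches = list(loader)
```

Keeping tokenization and numericalization in the collate function is one way to replace what ```Field``` used to do; the flat ids plus offsets layout matches what `torch.nn.EmbeddingBag` expects.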
133+
134+
The text datasets listed below have been updated as part of this work. For examples of how to use these datasets, please refer to our [end-to-end text classification tutorial](https://pytorch.org/tutorials/beginner/text_sentiment_ngrams_tutorial.html).
135+
* **Language modeling:** WikiText2, WikiText103, PennTreebank, EnWik9
136+
* **Text classification:** AG_NEWS, SogouNews, DBpedia, YelpReviewPolarity, YelpReviewFull, YahooAnswers, AmazonReviewPolarity, AmazonReviewFull, IMDB
137+
* **Sequence tagging:** UDPOS, CoNLL2000Chunking
138+
* **Translation:** IWSLT2016, IWSLT2017
139+
* **Question answering:** SQuAD1, SQuAD2
140+
141+
Find the full TorchText release notes [here](https://github.com/pytorch/text/releases).
142+
143+
# [Stable] TorchCSPRNG 0.2.0
144+
We [released TorchCSPRNG in August 2020](https://pytorch.org/blog/torchcsprng-release-blog/), a PyTorch C++/CUDA extension that provides cryptographically secure pseudorandom number generators for PyTorch. Today, we are releasing the 0.2.0 version and designating the library as stable. This release includes a new API for encrypt/decrypt with AES128 ECB/CTR as well as CUDA 11 and Windows CUDA support.
145+
146+
Find the full TorchCSPRNG release notes [here](https://github.com/pytorch/csprng/releases/).
147+
148+
149+
150+
151+
152+
153+
154+
Thanks for reading, and if you are excited about these updates and want to participate in the future of PyTorch, we encourage you to join the [discussion forums](https://discuss.pytorch.org/) and [open GitHub issues](https://github.com/pytorch).
155+
156+
Cheers!
157+
158+
***Team PyTorch***
