
Commit 125fb17

Merge branch 'site' into homepage-update-ticker

2 parents 8a6d90e + 90894d1

3,937 files changed: +41,622 −11,983 lines


.github/workflows/check-quickstartmodule.yml

Lines changed: 5 additions & 0 deletions

@@ -58,6 +58,11 @@ jobs:
       acc_type: [ "cuda11.x", "cuda10.2", "accnone" ]
       py_vers: [ "3.7", "3.8", "3.9" ]
       os: ["ubuntu-18.04", "macos-latest", "windows.4xlarge"]
+      # We don't actively build for CUDA 10.2 on windows so we should
+      # skip it in our test matrix for pip install
+      exclude:
+        - os: "windows.4xlarge"
+          acc_type: "cuda10.2"
     env:
       TEST_ACC: ${{ matrix.acc_type }}
       TEST_VER: ${{ matrix.rel_type }}

Gemfile.lock

Lines changed: 4 additions & 4 deletions

@@ -211,23 +211,23 @@ GEM
       rb-fsevent (~> 0.10, >= 0.10.3)
       rb-inotify (~> 0.9, >= 0.9.10)
     mercenary (0.3.6)
-    mini_portile2 (2.6.1)
+    mini_portile2 (2.8.0)
     minima (2.5.1)
       jekyll (>= 3.5, < 5.0)
       jekyll-feed (~> 0.9)
       jekyll-seo-tag (~> 2.1)
     minitest (5.14.4)
     multipart-post (2.1.1)
-    nokogiri (1.12.5)
-      mini_portile2 (~> 2.6.1)
+    nokogiri (1.13.3)
+      mini_portile2 (~> 2.8.0)
       racc (~> 1.4)
     octokit (4.20.0)
       faraday (>= 0.9)
       sawyer (~> 0.8.0, >= 0.5.3)
     pathutil (0.16.2)
       forwardable-extended (~> 2.6)
     public_suffix (4.0.6)
-    racc (1.5.2)
+    racc (1.6.0)
     rb-fsevent (0.10.4)
     rb-inotify (0.10.1)
       ffi (~> 1.0)

_case_studies/amazon-ads.md

Lines changed: 10 additions & 0 deletions

@@ -0,0 +1,10 @@
+---
+layout: blog_detail
+title: Amazon Ads
+logo: assets/images/amazon-ads-logo.png
+featured-home: true
+order: 1
+link: /blog/amazon-ads-case-study/
+---
+
+Reduce inference costs by 71% and drive scale out using PyTorch, TorchServe, and AWS Inferentia.

_case_studies/salesforce.md

Lines changed: 1 addition & 1 deletion

@@ -3,7 +3,7 @@ layout: blog_detail
 title: Salesforce
 logo: assets/images/salesforce.png
 featured-home: true
-order: 1
+order: 2
 link:
 ---

_case_studies/stanford-university.md

Lines changed: 1 addition & 1 deletion

@@ -3,7 +3,7 @@ layout: blog_detail
 title: Stanford University
 logo: assets/images/stanford-university.png
 featured-home: true
-order: 2
+order: 3
 link:
 ---

_case_studies/udacity.md

Lines changed: 0 additions & 10 deletions
This file was deleted.

_includes/main_menu.html

Lines changed: 8 additions & 0 deletions

@@ -60,6 +60,14 @@
           <span class="dropdown-title docs-title">torchvision</span>
           <p></p>
         </a>
+        <a class="nav-dropdown-item" href="{{ site.baseurl }}/data">
+          <span class="dropdown-title docs-title">TorchData</span>
+          <p></p>
+        </a>
+        <a class="nav-dropdown-item" href="{{ site.baseurl }}/torchrec">
+          <span class="dropdown-title docs-title">TorchRec</span>
+          <p></p>
+        </a>
         <a class="nav-dropdown-item" href="{{ site.baseurl }}/serve">
           <span class="dropdown-title docs-title">TorchServe</span>
           <p></p>

_includes/quick_start_local.html

Lines changed: 1 addition & 1 deletion

@@ -35,7 +35,7 @@
       <div class="option-text">PyTorch Build</div>
     </div>
     <div class="col-md-4 option block version selected" id="stable">
-      <div class="option-text">Stable (1.10.2)</div>
+      <div class="option-text">Stable (1.11.0)</div>
     </div>
     <div class="col-md-4 option block version" id="preview">
       <div class="option-text">Preview (Nightly)</div>

_layouts/hub_detail.html

Lines changed: 1 addition & 1 deletion

@@ -32,7 +32,7 @@ <h1>
   <a href="https://colab.research.google.com/github/pytorch/pytorch.github.io/blob/master/{{ page.path | replace: "_hub", "assets/hub" | replace: ".md", ".ipynb" }}"><button class="btn btn-lg with-right-white-arrow detail-colab-link">Open on Google Colab</button></a>
   {% if page.demo-model-link %}
   {% if page.demo-model-button-text == blank or page.demo-model-button-text == nil %}
-  <a href="{{ page.demo-model-link }}"><button class="btn btn-lg with-right-white-arrow detail-web-demo-link">Demo Model Output</button></a>
+  <a href="{{ page.demo-model-link }}"><button class="btn btn-lg with-right-white-arrow detail-web-demo-link">Open Model Demo</button></a>
   {% else %}
   <a href="{{ page.demo-model-link }}"><button class="btn btn-lg with-right-white-arrow detail-web-demo-link">{{ page.demo-model-button-text }}</button></a>
   {% endif %}
Lines changed: 86 additions & 0 deletions
This file was added. Its contents:

---
layout: blog_detail
title: 'Introducing TorchRec, a library for modern production recommendation systems'
author: Meta AI - Donny Greenberg, Colin Taylor, Dmytro Ivchenko, Xing Liu, Anirudh Sudarshan
featured-img: ''
---

We are excited to announce [TorchRec](https://github.com/pytorch/torchrec), a PyTorch domain library for Recommendation Systems. This new library provides common sparsity and parallelism primitives, enabling researchers to build state-of-the-art personalization models and deploy them in production.

<p align="left">
<img src="/assets/images/introducing-torchrec/torchrec_lockup.png" width="40%">
</p>

## How did we get here?
Recommendation Systems (RecSys) account for a large share of production-deployed AI today, but you might not know it from looking at GitHub. Unlike areas such as Vision and NLP, much of the ongoing innovation and development in RecSys happens behind closed company doors. For academic researchers studying these techniques, and for companies building personalized user experiences, the field is far from democratized. Further, RecSys is largely defined by learning models over sparse and/or sequential events, which overlaps substantially with other areas of AI. Many of the techniques are transferable, particularly for scaling and distributed execution. A large portion of the global investment in AI goes into developing these RecSys techniques, so cordoning them off blocks that investment from flowing into the broader AI field.

By mid-2020, the PyTorch team had received a lot of feedback that there was no large-scale, production-quality recommender systems package in the open-source PyTorch ecosystem. While we were trying to find a good answer, a group of engineers at Meta wanted to contribute Meta's production RecSys stack as a PyTorch domain library, with a strong commitment to growing an ecosystem around it. This seemed like a good idea, benefiting researchers and companies across the RecSys domain. So, starting from Meta's stack, we began modularizing and designing a fully scalable codebase adaptable to diverse recommendation use cases. Our goal was to extract the key building blocks from across Meta's software stack to simultaneously enable creative exploration and scale. After nearly two years of benchmarks, migrations, and testing across Meta, we're excited to finally embark on this journey together with the RecSys community. We want this package to open a dialogue and collaboration across the RecSys industry, starting with Meta as the first sizable contributor.

## Introducing TorchRec
TorchRec includes a scalable low-level modeling foundation alongside rich batteries-included modules. We initially target "two-tower" ([[1]], [[2]]) architectures that have separate submodules to learn representations of candidate items and the query or context. Input signals can be a mix of floating-point "dense" features and high-cardinality categorical "sparse" features, the latter requiring large embedding tables to be trained. Efficient training of such architectures combines data parallelism, which replicates the "dense" part of the computation, with model parallelism, which partitions large embedding tables across many nodes.

In particular, the library includes:
- Modeling primitives, such as embedding bags and jagged tensors, that enable easy authoring of large, performant multi-device/multi-node models using hybrid data parallelism and model parallelism (see the jagged-tensor sketch after this list).
- Optimized RecSys kernels powered by [FBGEMM](https://github.com/pytorch/FBGEMM), including support for sparse and quantized operations.
- A sharder that can partition embedding tables with a variety of strategies, including data-parallel, table-wise, row-wise, table-wise-row-wise, and column-wise sharding.
- A planner that can automatically generate optimized sharding plans for models.
- Pipelining that overlaps dataloading device transfer (copy to GPU), inter-device communication (input_dist), and computation (forward, backward) for increased performance.
- GPU inference support.
- Common modules for RecSys, such as models and public datasets (Criteo & MovieLens).
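
To make the jagged-tensor primitive concrete, here is a minimal sketch using TorchRec's `KeyedJaggedTensor`, which represents a batch of variable-length sparse-feature IDs (the feature names and ID values below are made up for illustration):

```python
import torch
from torchrec.sparse.jagged_tensor import KeyedJaggedTensor

# A batch of 3 examples with two variable-length sparse features:
#              example 0    example 1   example 2
# "product"    [101, 202]   []          [303]
# "category"   [7]          [8, 9]      [10]
features = KeyedJaggedTensor(
    keys=["product", "category"],
    values=torch.tensor([101, 202, 303, 7, 8, 9, 10]),
    # Per-key, per-example ID counts; values are laid out key-major.
    lengths=torch.tensor([2, 0, 1, 1, 2, 1]),
)

# Each key can be pulled out as a JaggedTensor.
product_ids = features["product"]
```

An `EmbeddingBagCollection` consumes a batch in this form and returns pooled embeddings per feature key, with no padding needed for the empty or ragged entries.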

To showcase the flexibility of this tooling, let's look at the following code snippet, pulled from our DLRM Event Prediction example:

```python
import torch
from torchrec.datasets.criteo import DEFAULT_CAT_NAMES, DEFAULT_INT_NAMES
from torchrec.distributed.model_parallel import DistributedModelParallel
from torchrec.models.dlrm import DLRM
from torchrec.modules.embedding_configs import EmbeddingBagConfig
from torchrec.modules.embedding_modules import EmbeddingBagCollection

# device, args, and epochs are provided by the surrounding training script.

# Specify the sparse embedding layers
eb_configs = [
    EmbeddingBagConfig(
        name=f"t_{feature_name}",
        embedding_dim=64,
        num_embeddings=100_000,
        feature_names=[feature_name],
    )
    for feature_name in DEFAULT_CAT_NAMES
]

# Instantiate the model with the embedding configuration
# The "meta" device indicates lazy instantiation, with no memory allocated
train_model = DLRM(
    embedding_bag_collection=EmbeddingBagCollection(
        tables=eb_configs, device=torch.device("meta")
    ),
    dense_in_features=len(DEFAULT_INT_NAMES),
    dense_arch_layer_sizes=[512, 256, 64],
    over_arch_layer_sizes=[512, 512, 256, 1],
    dense_device=device,
)

# Distribute the model over many devices, just as one would with DDP.
model = DistributedModelParallel(
    module=train_model,
    device=device,
)

optimizer = torch.optim.SGD(model.parameters(), lr=args.learning_rate)
# Optimize the model in a standard loop just as you would any other model!
# Or, you can use the pipeliner to synchronize communication and compute
for epoch in range(epochs):
    # Train
    ...
```
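
The sharder and planner from the list above can also be driven explicitly rather than letting `DistributedModelParallel` generate a plan on its own. Here is a minimal sketch, assuming the `torchrec.distributed.planner` API and reusing the hypothetical `train_model` and `device` variables from the snippet above:

```python
from torchrec.distributed.embeddingbag import EmbeddingBagCollectionSharder
from torchrec.distributed.model_parallel import DistributedModelParallel
from torchrec.distributed.planner import EmbeddingShardingPlanner, Topology

# Describe the hardware the plan should target.
topology = Topology(world_size=8, compute_device="cuda")

# Propose a sharding strategy (table-wise, row-wise, etc.) for each
# embedding table, then hand the resulting plan to DistributedModelParallel.
planner = EmbeddingShardingPlanner(topology=topology)
plan = planner.plan(train_model, [EmbeddingBagCollectionSharder()])

model = DistributedModelParallel(
    module=train_model,
    device=device,
    plan=plan,
)
```

Precomputing the plan this way makes the placement of each table inspectable and reproducible across ranks.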

## Scaling Performance
TorchRec has state-of-the-art infrastructure for scaled Recommendations AI, powering some of the largest models at Meta. It was used to train a 1.25-trillion-parameter model that shipped to production in January, and a 3-trillion-parameter model that will be in production soon. This should be a good indication that PyTorch is fully capable of handling the largest-scale RecSys problems in industry. We've heard from many in the community that sharded embeddings are a pain point; TorchRec addresses that cleanly. Unfortunately, it is challenging to provide large-scale benchmarks with public datasets, as most open-source benchmarks are too small to show performance at scale.

## Looking ahead
Open source and open technology have universal benefits. Meta is seeding the PyTorch community with a state-of-the-art RecSys package, with the hope that many will join in building it forward, enabling new research and helping many companies. The team behind TorchRec plans to continue this program indefinitely, building up TorchRec to meet the needs of the RecSys community, welcoming new contributors, and continuing to power personalization at Meta. We're excited to begin this journey and look forward to contributions, ideas, and feedback!

## References
[[1]] Sampling-Bias-Corrected Neural Modeling for Large Corpus Item Recommendations

[[2]] DLRM: An advanced, open source deep learning recommendation model

[1]: https://research.google/pubs/pub48840/
[2]: https://ai.facebook.com/blog/dlrm-an-advanced-open-source-deep-learning-recommendation-model/
