@@ -14,9 +14,9 @@ PyTorch with each method having their advantages in certain use cases:
* `DistributedDataParallel (DDP) <#learn-ddp>`__
* `Fully Sharded Data Parallel (FSDP) <#learn-fsdp>`__
* `Remote Procedure Call (RPC) distributed training <#learn-rpc>`__
- * `Pipeline Parallelism <#learn-pipeline-parallelism>`__
+ * `Custom Extensions <#custom-extensions>`__

- Read more about these options in [Distributed Overview](../beginner/dist_overview.rst).
+ Read more about these options in `Distributed Overview <../beginner/dist_overview.html>`__.

.. _learn-ddp:
@@ -47,16 +47,23 @@ Learn DDP
+++
:octicon:`code;1em` Code

+ .. grid-item-card:: :octicon:`file-code;1em`
+ Distributed Training with Uneven Inputs Using
+ the Join Context Manager
+ :shadow: none
+ :link: ../advanced/generic_join.html
+ :link-type: url
+
+ This tutorial describes the Join context manager and shows how to use
+ it with DistributedDataParallel for training with uneven inputs across
+ workers.
+ +++
+ :octicon:`code;1em` Code
+
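+ As a quick illustration (not code from the linked tutorial), a minimal
+ sketch of the Join context manager with DDP, assuming two CPU processes,
+ the gloo backend, and an arbitrary rendezvous port:
+
+ .. code-block:: python
+
+     import os
+     import torch
+     import torch.distributed as dist
+     import torch.multiprocessing as mp
+     from torch.distributed.algorithms.join import Join
+     from torch.nn.parallel import DistributedDataParallel as DDP
+
+     def worker(rank, world_size):
+         os.environ["MASTER_ADDR"] = "localhost"
+         os.environ["MASTER_PORT"] = "29500"
+         dist.init_process_group("gloo", rank=rank, world_size=world_size)
+         model = DDP(torch.nn.Linear(1, 1))
+         # Uneven inputs: each rank gets a different number of batches.
+         inputs = [torch.randn(1) for _ in range(rank + 1)]
+         # Join lets ranks that run out of inputs shadow the collectives
+         # issued by ranks that still have work, so no rank hangs.
+         with Join([model]):
+             for x in inputs:
+                 model(x).sum().backward()
+         dist.destroy_process_group()
+
+     if __name__ == "__main__":
+         mp.spawn(worker, args=(2,), nprocs=2)
+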
.. _learn-fsdp:

Learn FSDP
----------

- Fully-Sharded Data Parallel (FSDP) is a tool that distributes model
- parameters across multiple workers, therefore enabling you to train larger
- models.
-
-
.. grid:: 3

.. grid-item-card:: :octicon:`file-code;1em`
@@ -86,9 +93,6 @@ models.
Learn RPC
---------

- Distributed Remote Procedure Call (RPC) framework provides
- mechanisms for multi-machine model training
-

.. grid:: 3

.. grid-item-card:: :octicon:`file-code;1em`
@@ -114,38 +118,44 @@ mechanisms for multi-machine model training
:octicon:`code;1em` Code

.. grid-item-card:: :octicon:`file-code;1em`
- Distributed Pipeline Parallelism Using RPC
+ Implementing Batch RPC Processing Using Asynchronous Executions
:shadow: none
:link: https://example.com
:link-type: url

- Learn how to use a Resnet50 model for distributed pipeline parallelism
- with the Distributed RPC APIs.
+ In this tutorial you will build batch-processing RPC applications
+ with the @rpc.functions.async_execution decorator.
+++
:octicon:`code;1em` Code

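+ As a quick illustration (not code from the linked tutorial), a minimal
+ sketch of the ``@rpc.functions.async_execution`` decorator on two RPC
+ workers; the worker names, port, and tensor values are arbitrary:
+
+ .. code-block:: python
+
+     import os
+     import torch
+     import torch.distributed.rpc as rpc
+     import torch.multiprocessing as mp
+
+     @rpc.functions.async_execution
+     def async_add(to, x, y, z):
+         # Return a Future immediately; the RPC that invoked this function
+         # only completes once the chained callback has produced its result.
+         return rpc.rpc_async(to, torch.add, args=(x, y)).then(
+             lambda fut: fut.wait() + z
+         )
+
+     def run(rank, world_size):
+         os.environ["MASTER_ADDR"] = "localhost"
+         os.environ["MASTER_PORT"] = "29501"
+         rpc.init_rpc(f"worker{rank}", rank=rank, world_size=world_size)
+         if rank == 0:
+             out = rpc.rpc_sync("worker1", async_add,
+                                args=("worker0", torch.ones(2), 1, 1))
+             print(out)  # tensor([3., 3.])
+         rpc.shutdown()
+
+     if __name__ == "__main__":
+         mp.spawn(run, args=(2,), nprocs=2)
+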
.. grid:: 3

.. grid-item-card:: :octicon:`file-code;1em`
- Implementing Batch RPC Processing Using Asynchronous Executions
+ Combining Distributed DataParallel with Distributed RPC Framework
:shadow: none
:link: https://example.com
:link-type: url

- In this tutorial you will build batch-processing RPC applications
- with the @rpc.functions.async_execution decorator.
+ In this tutorial you will learn how to combine distributed data
+ parallelism with distributed model parallelism.
+++
:octicon:`code;1em` Code

+ .. _custom-extensions:
+
+ Custom Extensions
+ -----------------
+
+ .. grid:: 3
+
.. grid-item-card:: :octicon:`file-code;1em`
- Combining Distributed DataParallel with Distributed RPC Framework
+ Customize Process Group Backends Using Cpp Extensions
:shadow: none
- :link: https://example.com
+ :link: intermediate/process_group_cpp_extension_tutorial.html
:link-type: url

- In this tutorial you will learn how to combine distributed data
- parallelism with distributed model parallelism.
+ In this tutorial you will learn to implement a custom `ProcessGroup`
+ backend and plug that into the PyTorch distributed package using
+ cpp extensions.
+++
:octicon:`code;1em` Code
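+
+ As a rough sketch of the idea only: once such a cpp extension is built,
+ the custom backend can be registered and selected from Python. Here
+ ``dummy_collectives`` and ``createProcessGroupDummy`` are hypothetical
+ names standing in for your own extension:
+
+ .. code-block:: python
+
+     import os
+     import torch
+     import torch.distributed as dist
+
+     import dummy_collectives  # hypothetical cpp-extension module
+
+     # Make the custom backend selectable by name, like "gloo" or "nccl".
+     dist.Backend.register_backend("dummy",
+                                   dummy_collectives.createProcessGroupDummy)
+
+     os.environ["MASTER_ADDR"] = "localhost"
+     os.environ["MASTER_PORT"] = "29502"
+     dist.init_process_group("dummy", rank=0, world_size=1)
+
+     x = torch.ones(4)
+     dist.all_reduce(x)  # dispatched to the custom backend's allreduce
+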
-
- .. _learn-pipeline-parallelism: