Commit 7c09eb9

Update 2020-4-21-pytorch-1-dot-5-released-with-new-and-updated-apis.md
1 parent 2d6a82a commit 7c09eb9

File tree

1 file changed: +1 −1 lines changed


_posts/2020-4-21-pytorch-1-dot-5-released-with-new-and-updated-apis.md

Lines changed: 1 addition & 1 deletion
@@ -75,7 +75,7 @@ The Distributed [RPC framework](https://pytorch.org/docs/stable/rpc.html) was la
 The RPC API allows users to specify functions to run and objects to be instantiated on remote nodes. These functions are transparently recorded so that gradients can backpropagate through remote nodes using Distributed Autograd.

 ### Distributed Autograd
-Distributed Autograd connects the autograd graph across several nodes and allows gradients to flow through during the backwards pass. Gradients are accumulated into a context (as opposed to the .grad field as with Autograd) and users must specify their model’s forward pass under a with `dist_autograd.context()` manager in order to ensure that all RPC communication is recorded properly. Currently, only FAST mode is implemented (see [https://pytorch.org/docs/stable/rpc/distributed_autograd.html#distributed-autograd-design](https://pytorch.org/docs/stable/rpc/distributed_autograd.html#distributed-autograd-design) for the difference between FAST and SMART modes).
+Distributed Autograd connects the autograd graph across several nodes and allows gradients to flow through during the backwards pass. Gradients are accumulated into a context (as opposed to the .grad field as with Autograd) and users must specify their model’s forward pass under a with `dist_autograd.context()` manager in order to ensure that all RPC communication is recorded properly. Currently, only FAST mode is implemented (see [this doc](https://pytorch.org/docs/stable/rpc/distributed_autograd.html#distributed-autograd-design) for the difference between FAST and SMART modes).

 ### Distributed Optimizer
 The distributed optimizer creates RRefs to optimizers on each worker with parameters that require gradients, and then uses the RPC API to run the optimizer remotely. The user must collect all remote parameters and wrap them in an `RRef`, as this is required input to the distributed optimizer. The user must also specify the distributed autograd `context_id` so that the optimizer knows in which context to look for gradients.
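For readers skimming the diff, here is a minimal sketch of the pattern the Distributed Autograd paragraph describes: the forward pass, including its RPC calls, runs inside a `dist_autograd.context()` block, and gradients accumulate into that context rather than into `.grad` fields. The worker name `"worker1"` and the `remote_relu` helper are hypothetical, and the sketch assumes `rpc.init_rpc()` has already been called on each worker.

```python
import torch
import torch.distributed.autograd as dist_autograd
import torch.distributed.rpc as rpc

def remote_relu(x):
    # Hypothetical function executed on "worker1" via RPC.
    return torch.nn.functional.relu(x)

with dist_autograd.context() as context_id:
    x = torch.randn(4, 4, requires_grad=True)
    # RPC calls made inside the context are recorded so gradients can
    # backpropagate through the remote node.
    y = rpc.rpc_sync("worker1", remote_relu, args=(x,))
    loss = y.sum()
    # Gradients accumulate into the context, not into x.grad.
    dist_autograd.backward(context_id, [loss])
    grads = dist_autograd.get_gradients(context_id)  # dict: tensor -> grad
```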

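And a similarly hedged sketch of the distributed optimizer step described above: parameter `RRef`s are collected from the workers and handed to `DistributedOptimizer`, and `step()` is given the `context_id` so it can find this iteration's gradients. Again, `"worker1"`, the `Linear(4, 4)` module, and the `param_rrefs` helper are illustrative only, under the same RPC setup as the previous sketch.

```python
import torch
import torch.distributed.autograd as dist_autograd
import torch.distributed.rpc as rpc
from torch import optim
from torch.distributed.optim import DistributedOptimizer
from torch.distributed.rpc import RRef

def param_rrefs(module_rref):
    # Runs on the module's owner: wrap each parameter in an RRef.
    return [RRef(p) for p in module_rref.local_value().parameters()]

# A remote module living on "worker1" (illustrative).
remote_net = rpc.remote("worker1", torch.nn.Linear, args=(4, 4))
remote_params = rpc.rpc_sync("worker1", param_rrefs, args=(remote_net,))

# The distributed optimizer takes RRefs to all parameters it should update,
# plus the usual optimizer arguments.
dist_optim = DistributedOptimizer(optim.SGD, remote_params, lr=0.05)

with dist_autograd.context() as context_id:
    out = remote_net.rpc_sync().forward(torch.randn(4, 4))
    loss = out.sum()
    dist_autograd.backward(context_id, [loss])
    # step() needs the context_id to locate the gradients for this pass.
    dist_optim.step(context_id)
```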