1 file changed, 2 insertions(+), 1 deletion(-)
@@ -19,7 +19,8 @@ This repository contains PyTorch implementations of deep reinforcement learning
 1. *REINFORCE* <sub><sup>([Williams et al. 1992](http://www-anw.cs.umass.edu/~barto/courses/cs687/williams92simple.pdf))</sup></sub>
 1. *Deep Deterministic Policy Gradients (DDPG)* <sub><sup>([Lillicrap et al. 2016](https://arxiv.org/pdf/1509.02971.pdf))</sup></sub>
 1. *Twin Delayed Deep Deterministic Policy Gradients (TD3)* <sub><sup>([Fujimoto et al. 2018](https://arxiv.org/abs/1802.09477))</sup></sub>
-1. *Soft Actor-Critic (SAC & SAC-Discrete)* <sub><sup>([Haarnoja et al. 2018](https://arxiv.org/pdf/1812.05905.pdf) and [Christodoulou 2019](https://arxiv.org/abs/1910.07207))</sup></sub>
+1. *Soft Actor-Critic (SAC)* <sub><sup>([Haarnoja et al. 2018](https://arxiv.org/pdf/1812.05905.pdf))</sup></sub>
+1. *Soft Actor-Critic for Discrete Actions (SAC-Discrete)* <sub><sup>([Christodoulou 2019](https://arxiv.org/abs/1910.07207))</sup></sub>
 1. *Asynchronous Advantage Actor Critic (A3C)* <sub><sup>([Mnih et al. 2016](https://arxiv.org/pdf/1602.01783.pdf))</sup></sub>
 1. *Synchronous Advantage Actor Critic (A2C)*
 1. *Proximal Policy Optimisation (PPO)* <sub><sup>([Schulman et al. 2017](https://openai-public.s3-us-west-2.amazonaws.com/blog/2017-07/ppo/ppo-arxiv.pdf))</sup></sub>