robot0102/Deep-Reinforcement-Learning-Algorithms-with-PyTorch

 
 

Deep Reinforcement Learning Algorithms with PyTorch

This repository contains PyTorch implementations of deep reinforcement learning algorithms.

Algorithms Implemented

  1. Deep Q Learning (DQN) (Mnih 2013)
  2. DQN with Fixed Q Targets (Mnih 2013)
  3. Double DQN (van Hasselt 2015)
  4. Double DQN with Prioritised Experience Replay (Schaul 2016)
  5. REINFORCE (Williams 1992)
  6. PPO (Schulman 2017)
  7. DDPG (Lillicrap 2016)
  8. Hill Climbing
  9. Genetic Evolution
  10. DQN with Hindsight Experience Replay (DQN-HER) (Andrychowicz 2018)
  11. DDPG with Hindsight Experience Replay (DDPG-HER) (Andrychowicz 2018)

All implementations are able to quickly solve Cart Pole (discrete actions), Mountain Car Continuous (continuous actions), Bit Flipping (discrete actions with dynamic goals) or Fetch Reach (continuous actions with dynamic goals). I plan to add A2C, A3C and PPO-HER soon.

Results

1. Cart Pole (Discrete Actions)

The plot below shows DQN, DQN with Fixed Q Targets, Double DQN, Double DQN with Prioritised Experience Replay and PPO playing Cart Pole for 450 episodes. The line is the mean result across 3 random seeds, and the shaded area represents plus and minus 1 standard deviation. The hyperparameters used can be found in the file Results/Cart_Pole.py.

Cart Pole Results
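
As a rough illustration of how a mean curve with a plus/minus 1 standard deviation band can be computed from several seeded runs, here is a minimal sketch. The function name `mean_and_band` and the choice of population standard deviation are assumptions made for this example; the repository's own plotting code may differ.

```python
from statistics import mean, pstdev


def mean_and_band(runs):
    """Compute the per-episode mean and a +/- 1 standard deviation band.

    `runs` is a list of score lists, one list per random seed, all of the
    same length. Hypothetical helper, not part of the repository.
    """
    means, lower, upper = [], [], []
    # zip(*runs) groups together the scores every seed got at the same episode.
    for scores_at_episode in zip(*runs):
        m = mean(scores_at_episode)
        s = pstdev(scores_at_episode)  # assumption: population standard deviation
        means.append(m)
        lower.append(m - s)
        upper.append(m + s)
    return means, lower, upper


# Three made-up runs (one per seed), three episodes each.
runs = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [5.0, 6.0, 7.0]]
means, lower, upper = mean_and_band(runs)
```

The `lower` and `upper` lists are the edges of the shaded area; a plotting library would typically fill the region between them.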

2. Mountain Car (Continuous Actions)

The plot below shows PPO and DDPG playing Mountain Car for 450 episodes. The line is the mean result across 3 random seeds, and the shaded area represents plus and minus 1 standard deviation. The hyperparameters used can be found in the file Results/Mountain_Car.py.

Mountain Car Continuous Results

3. Hindsight Experience Replay (HER) Experiments

Bit Flipping

The plot below shows the performance of DQN with and without Hindsight Experience Replay (HER) in the Bit Flipping environment (14 bits) described in the paper Hindsight Experience Replay 2018. The results replicate those in the paper and show that adding HER allowed the agent to solve a problem that vanilla DQN could not solve in practice. The only difference between the two agents is the addition of HER: the hyperparameters were the same for both agents and the same as in the paper. They can be found in the file Results/Bit_Flipping/Results.py.
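
To give a feel for what HER adds, here is a minimal, self-contained sketch of goal relabelling on a bit-flipping task. It is not the repository's implementation: the function names are hypothetical, the episode uses a random policy, the reward is computed on the next state as a simplification, and only the simple "final" relabelling strategy (replace the goal with the state reached at the end of the episode) is shown.

```python
import random


def flip(state, action):
    """Return a new state tuple with the bit at index `action` flipped."""
    return tuple(b ^ 1 if i == action else b for i, b in enumerate(state))


def reward(state, goal):
    """Sparse reward: 0 when the goal is reached, -1 otherwise."""
    return 0 if state == goal else -1


def run_random_episode(n_bits, rng):
    """Collect (state, action, reward, next_state, goal) tuples with a random policy."""
    state = tuple(rng.randint(0, 1) for _ in range(n_bits))
    goal = tuple(rng.randint(0, 1) for _ in range(n_bits))
    episode = []
    for _ in range(n_bits):
        action = rng.randrange(n_bits)
        next_state = flip(state, action)
        episode.append((state, action, reward(next_state, goal), next_state, goal))
        state = next_state
    return episode


def her_relabel_final(episode):
    """Relabel every transition as if the final achieved state had been the goal."""
    achieved = episode[-1][3]
    return [(s, a, reward(s2, achieved), s2, achieved) for (s, a, _, s2, _) in episode]


rng = random.Random(0)
episode = run_random_episode(14, rng)
relabelled = her_relabel_final(episode)
```

In a HER-style agent both the original and the relabelled transitions would go into the replay buffer, so even an episode that never reached its real goal still contributes at least one transition with a non-negative reward.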

Bit Flipping Results Fetch Reach Results

Fetch Reach

The plot below shows the performance of DDPG with and without Hindsight Experience Replay in the Fetch Reach environment, which is introduced in this OpenAI blog post. The results mirror those seen in the paper Multi-Goal Reinforcement Learning 2018 and show that adding Hindsight Experience Replay dramatically improved the agent's ability to learn the task. The hyperparameters were the same for both agents and the same as in the paper. They can be found in the file Results/Fetch_Reach/Results.py.

Bit Flipping Results

Usage

The repository's high-level structure is:

├── Agents
│   ├── Actor_Critic_Agents
│   ├── DQN_Agents
│   ├── Policy_Gradient_Agents
│   └── Stochastic_Policy_Search_Agents
├── Environments
│   ├── Open_AI_Gym_Environments
│   ├── Other_Environments
│   └── Unity_Environments
├── Results
│   ├── Bit_Flipping_Environment
│   ├── Cart_Pole
│   ├── Fetch_Reach
│   ├── Mountain_Car_Continuous
│   └── Tennis
├── Tests
└── Utilities
    ├── Data_Structures
    └── Models

i) To Watch the Agents Learn the Above Games

To watch all the different agents learn the above games, follow these steps:

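git clone https://github.com/p-christ/Deep_RL_Implementations.git
cd Deep_RL_Implementations

# Create and activate a conda environment (--yes skips the confirmation prompt)
conda create --name myenvname --yes
conda activate myenvname

pip3 install -r requirements.txt
# Add the repository root (the current directory) to PYTHONPATH
export PYTHONPATH="${PYTHONPATH}:$(pwd)"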

Then, to watch them learn Cart Pole, run: python Results/Cart_Pole/Results.py

To watch them learn Mountain Car run: python Results/Mountain_Car_Continuous/Results.py

To watch them learn Tennis you will need to download the environment:

  1. Linux: click here
  2. Mac OSX: click here
  3. Windows (32-bit): click here
  4. Windows (64-bit): click here

and then run: python Results/Tennis/Results.py

To watch them learn Bit Flipping run: python Results/Bit_Flipping/Results.py

To watch them learn Fetch Reach run: python Results/Fetch_Reach/Results.py

ii) To Train the Agents on your Own Game

To use the algorithms with your own game instead, follow these steps:

  1. Create an Environment class to represent your game - the environment class you create should extend the Base_Environment class found in the Environments folder to make it compatible with all the agents.

  2. Create a config object with the hyperparameters and game you want to use. See Results/Cart_Pole/Results.py for an example of this.

  3. Use function run_games_for_agents to have the different agents play the game. Again see Results/Cart_Pole/Results.py for an example of this.
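
To illustrate step 1, here is a minimal sketch of what a custom environment class might look like. It is only a stand-in: `HypotheticalBaseEnvironment`, its `reset` and `step` methods, the `LineWalk` game and the `play_episode` helper are all invented for this example and are not the repository's Base_Environment interface. Check the Base_Environment class in the Environments folder for the methods you actually need to implement.

```python
from abc import ABC, abstractmethod


class HypotheticalBaseEnvironment(ABC):
    """Stand-in for the repository's Base_Environment; NOT its real interface."""

    @abstractmethod
    def reset(self):
        """Start a new episode and return the initial state."""

    @abstractmethod
    def step(self, action):
        """Apply an action and return (next_state, reward, done)."""


class LineWalk(HypotheticalBaseEnvironment):
    """Toy game: start at 0 and reach `target` by moving left (0) or right (1)."""

    def __init__(self, target=3, max_steps=10):
        self.target = target
        self.max_steps = max_steps
        self.position = 0
        self.steps_taken = 0

    def reset(self):
        self.position = 0
        self.steps_taken = 0
        return self.position

    def step(self, action):
        if action not in (0, 1):
            raise ValueError("action must be 0 (left) or 1 (right)")
        self.position += 1 if action == 1 else -1
        self.steps_taken += 1
        reached = self.position == self.target
        # The episode ends on success or when the step budget runs out.
        done = reached or self.steps_taken >= self.max_steps
        return self.position, (1.0 if reached else 0.0), done


def play_episode(env, policy):
    """Run one episode with `policy` (a function state -> action); return total reward."""
    state = env.reset()
    total, done = 0.0, False
    while not done:
        state, reward, done = env.step(policy(state))
        total += reward
    return total


env = LineWalk(target=3)
score_always_right = play_episode(env, lambda state: 1)
score_always_left = play_episode(LineWalk(target=3, max_steps=5), lambda state: 0)
```

Keeping the game behind a small base-class interface like this is what lets every agent interact with it the same way, which is the reason the steps above ask you to extend Base_Environment.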
