1. Module intro and roadmap
2. Value-based vs Policy-based vs Actor-Critic
3. Policy Gradients (PG)
4. REINFORCE - Monte-Carlo PG
5. AC - Actor-Critic
6. A2C - Advantage Actor-Critic
7. A3C - Asynchronous Advantage Actor-Critic
8. TRPO - Trust Region Policy Optimization
9. PPO - Proximal Policy Optimization
10. DDPG - Deep Deterministic Policy Gradient
11. Stable-Baselines library overview (see the minimal sketch after this list)
12. Atari example with stable-baselines
13. Mario example with stable-baselines
14. StreetFighter example with stable-baselines
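To preview what the library-based lessons look like in code, here is a minimal sketch of training and running an agent. It assumes the Stable-Baselines3 API and a Gymnasium environment; the module's own examples may use the older stable-baselines package, and CartPole-v1 stands in for the Atari, Mario, and StreetFighter environments covered later.

```python
# Minimal sketch assuming Stable-Baselines3 and Gymnasium (assumptions, not
# necessarily the exact versions used in the module's examples).
import gymnasium as gym
from stable_baselines3 import PPO

# CartPole-v1 is a stand-in environment; later lessons swap in
# Atari, Mario, and StreetFighter environments.
env = gym.make("CartPole-v1")

# Train a PPO agent for a small number of timesteps.
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10_000)

# Run the trained policy for one episode.
obs, _ = env.reset()
done = False
while not done:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, _ = env.step(action)
    done = terminated or truncated
env.close()
```

The same train-then-evaluate pattern carries over to the later environment-specific lessons; mostly the environment construction and preprocessing change.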