Advanced Policy Gradients
CS 285: Deep Reinforcement Learning, Decision Making, and Control
Sergey Levine

Class Notes
1. Homework 2 due today (11:59 pm)!
   • Don’t be late!
2. Homework 3 comes out this week
   • Start early! Q-learning takes a while to run

Today’s Lecture
1. Why does policy gradient work?
2. Policy gradient is a type of policy iteration
3. Policy gradient as a c