REINFORCE: Reinforcement Learning Most Fundamental Algorithm

Andriy Drozdyuk•15.2K views•13:42

About this video

If you would like to see more videos like this please consider supporting me on Patreon - https://www.patreon.com/andriydrozdyuk

Reinforcement Learning: An Introduction, 2nd Ed, Sutton & Barto
For REINFORCE algorithm see Section "13.3 REINFORCE: Monte Carlo Policy Gradient": http://incompleteideas.net/book/the-book-2nd.html

Complete code used in the video can be found here: https://github.com/drozzy/reinforce

0:00 - Introduction
0:15 - Intro to RL
0:38 - Problem with Environment
1:02 - Why is this a problem for RL?
1:41 - Puppy treats (low level of abstraction)
2:14 - Good actions (middle level of abstraction)
3:22 - Reward as a signal (high level of abstraction)
4:04 - REINFORCE Algorithm Overview
5:11 - Collected Trajectory
6:01 - Product of G and Policy Gradient
6:34 - Two key concepts: sample and evaluate
6:48 - Sampling an action
7:22 - Sampling in REINFORCE
7:38 - Evaluating an action
8:24 - Sampling vs. Evaluating
8:41 - Sampling using torch.distributions.Categorical
9:12 - Evaluating using torch.distributions.Categorical
9:50 - Env/NN/Optim
10:07 - Collect One Episode of Experience
10:53 - Compute Discounted Returns
11:44 - Update the Policy
12:41 - Executing Trained Policy
13:04 - Demo Cart Pole Balancing
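For readers skimming the chapter list, the "Compute Discounted Returns" and "Product of G and Policy Gradient" steps can be sketched roughly as below. This is not the code from the video or the linked repository. It is a minimal pure-Python illustration in which the function names (`discounted_returns`, `reinforce_loss`), the default `gamma`, and the list-based inputs are all assumptions made for the example. It also drops the extra gamma^t factor that appears in the book's boxed pseudocode, a common simplification.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    g = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def reinforce_loss(action_probs, actions, rewards, gamma=0.99):
    """Surrogate loss: negative sum over steps of G_t * log pi(a_t | s_t).

    action_probs[t] is the (hypothetical) policy's probability list at step t,
    actions[t] is the index of the action that was actually sampled.
    """
    returns = discounted_returns(rewards, gamma)
    total = 0.0
    for probs, action, g in zip(action_probs, actions, returns):
        # "Evaluating" an action: look up the log-probability of the taken action.
        total += g * math.log(probs[action])
    return -total
```

In a PyTorch version the per-step log-probability would typically come from `torch.distributions.Categorical(...).log_prob(action)`, and minimizing this loss with an optimizer would perform the policy update.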

Video Information

Views: 15.2K
Likes: 738
Duration: 13:42
Published: Aug 16, 2021
Quality: HD