REINFORCE: Reinforcement Learning Most Fundamental Algorithm


Andriy Drozdyuk•15.2K views•Aug 16, 2021•13:42


About this video

If you would like to see more videos like this please consider supporting me on Patreon - https://www.patreon.com/andriydrozdyuk

Reinforcement Learning: An Introduction, 2nd Ed, Sutton & Barto
For REINFORCE algorithm see Section "13.3 REINFORCE: Monte Carlo Policy Gradient": http://incompleteideas.net/book/the-book-2nd.html

Complete code used in the video can be found here: https://github.com/drozzy/reinforce

0:00 - Introduction
0:15 - Intro to RL
0:38 - Problem with Environment
1:02 - Why is this a problem for RL?
1:41 - Puppy treats (low level of abstraction)
2:14 - Good actions (middle level of abstraction)
3:22 - Reward as a signal (high level of abstraction)
4:04 - REINFORCE Algorithm Overview
5:11 - Collected Trajectory
6:01 - Product of G and Policy Gradient
6:34 - Two key concepts: sample and evaluate
6:48 - Sampling an action
7:22 - Sampling in REINFORCE
7:38 - Evaluating an action
8:24 - Sampling vs. Evaluating
8:41 - Sampling using torch.distributions.Categorical
9:12 - Evaluating using torch.distributions.Categorical
9:50 - Env/NN/Optim
10:07 - Collect One Episode of Experience
10:53 - Compute Discounted Returns
11:44 - Update the Policy
12:41 - Executing Trained Policy
13:04 - Demo Cart Pole Balancing
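The chapter list names the core steps of REINFORCE: sampling an action, evaluating an action, computing discounted returns, and multiplying G by the policy gradient. The block below is a minimal, dependency-free sketch of those steps, not the code from the video (which is in the linked repository and uses torch.distributions.Categorical). The function names, the plain-list representation of action probabilities, and the default gamma of 0.99 are all assumptions made for illustration.

```python
import math
import random


def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every step t of one episode."""
    returns = []
    g = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def sample_action(probs, rng=random):
    """Sampling: draw an action index according to the policy's probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def log_prob(probs, action):
    """Evaluating: log-probability the policy assigns to an already chosen action."""
    return math.log(probs[action])


def reinforce_loss(log_probs, returns):
    """Surrogate loss -sum_t G_t * log pi(a_t | s_t) for one episode.

    This is a common simplified form. With an autodiff library, minimizing it
    follows the REINFORCE gradient estimate; here it only computes the value.
    """
    return -sum(g * lp for g, lp in zip(returns, log_probs))
```

In an actual training loop the probabilities would come from a neural network and the loss would be backpropagated by an autodiff library. Plain Python lists are used here only to keep the arithmetic visible.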

Video Information

Views: 15.2K
Likes: 738
Duration: 13:42
Published: Aug 16, 2021
Quality: HD

About the Channel

Andriy Drozdyuk
