
Reinforcement Learning @UCZvMhJ3EaNvpacdlMmm3VKA@youtube.com

4.6K subscribers



28:52  Tutorial 1 - Probability Basics 2
20:16  Tutorial 2 - Linear Algebra 2
55:07  RL Framework and Applications
24:59  Introduction to Immediate RL
17:54  Bandit Optimalities
39:17  Value Function Based Methods
28:32  Introduction to RL
23:02  Tutorial 1 - Probability Basics 1
21:40  Tutorial 2 - Linear Algebra 1
43:12  Solving POMDP
33:28  POMDP Introduction
42:04  MAXQ Value Function Decomposition
30:12  MAXQ
15:14  Option Discovery
33:56  Hierarchical Abstract Machines
26:32  Learning with Options
22:44  Options
26:20  Semi Markov Decision Processes
22:40  Types of Optimality
31:27  Hierarchical Reinforcement Learning
20:04  Policy Gradient with Function Approximation
26:25  REINFORCE (cont'd)
12:49  Actor Critic and REINFORCE
36:42  Policy Gradient Approach
31:05  DQN and Fitted Q Iteration
17:15  LSPI and Fitted Q
49:21  LSTD and LSTDQ
27:18  Function Approximation and Eligibility Traces
22:15  State Aggregation Methods
15:21  Linear Parameterization
38:23  Function Approximation
32:39  Backward View of Eligibility Traces
33:10  Eligibility Trace Control
46:40  Eligibility Traces
30:13  Lec 33 - Q-Learning
22:26  Thompson Sampling
07:06  Lec 34 - Afterstate
22:08  TD(0) Control
35:11  TD(0)
36:24  UCT
27:40  Control in Monte Carlo
34:54  Dynamic Programming
16:33  Off Policy MC
22:47  Monte Carlo
13:26  Policy Iteration
23:28  Value Iteration
31:14  Lpi Convergence
18:03  Convergence Proof
25:52  Banach Fixed Point Theorem
31:23  Lec 20 - Cauchy Sequence and Green's Equation
29:26  Bellman Optimality Equation
33:08  MDP Modelling
14:24  Bellman Equation
36:49  Full RL Introduction
25:30  Policy Search
41:55  REINFORCE
30:09  PAC Bounds
12:32  Contextual Bandits
14:22  Thompson Sampling
44:41  Returns, Value functions and MDPs