Policy Gradient

In Value/Policy-Based Control, we discussed value-based and policy-based control methods. However, there are still many details to cover before we can progress to more advanced algorithms. One of the key concepts in that progression is the policy gradient.

On-Off Policy

We discussed the difference between value-based and policy-based approaches in Value/Policy-Based Control. However, there is another important aspect to consider when distinguishing between algorithms: whether a method is on-policy or off-policy.

Value/Policy-Based Control

Before diving into reinforcement learning with deep neural networks, it's important to understand two fundamental concepts: the difference between policy-based and value-based approaches, and the distinction between on-policy and off-policy learning.

Temporal Difference

We previously talked about Monte Carlo methods in RL and their usage. However, the MC method is not well-suited for online learning, especially in tasks such as autonomous driving, where decisions need to be made continuously.
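The contrast with MC can be seen in a minimal tabular TD(0) sketch: unlike Monte Carlo, which waits for a full episode return, TD(0) updates the value estimate after every single transition, so it works online. All names below (`td0_update`, the toy two-state chain) are illustrative, not from the original posts.

```python
# Minimal sketch of a tabular TD(0) value update; states are plain
# integer indices and the environment here is a hypothetical toy chain.

def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.99):
    """One online TD(0) step: V(s) <- V(s) + alpha * (r + gamma*V(s') - V(s))."""
    V[s] += alpha * (r + gamma * V[s_next] - V[s])
    return V

# Toy usage: state 0 always transitions to terminal state 1 with reward 1,
# so V[0] should converge toward 1.0 (V[1] stays 0 as the terminal value).
V = [0.0, 0.0]
for _ in range(100):
    V = td0_update(V, s=0, r=1.0, s_next=1)
print(V[0])  # approaches 1.0
```

Each call uses only one observed transition `(s, r, s')`, which is exactly why TD methods can learn while the task is still running.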

Monte Carlo

In previous posts, we discussed the Bellman Equation in the context of Bellman Equation - Policy Iteration and Bellman Equation - Value Iteration, both of which assume access to background knowledge about the environment, i.e., the transition dynamics of the MDP.