CSY - CSYYYSC (Page 8)

Soft Actor Critic (SAC)

16 Jun 2025 · 8 min read · Reinforcement Learning

In the previous post, we discussed Proximal Policy Optimization (PPO) and its strengths, which have made it a popular choice in recent years. However, as an on-policy method, PPO suffers from a key

Proximal Policy Optimization (PPO)

14 Jun 2025 · 7 min read · Reinforcement Learning

In previous sections on Policy Gradient and REINFORCE, we introduced the core concepts of reinforcement learning (RL) algorithms that rely on gradients applied directly to the policy. These are known as policy-based approaches

Deep Q Network (DQN)

13 Jun 2025 · 5 min read · Reinforcement Learning

DQN extends Q-Learning by replacing the Q-table with a neural network, enabling it to handle high-dimensional and continuous state spaces, such as images in video games [Playing Atari with Deep Reinforcement Learning]. In

Q-Learning

12 Jun 2025 · 4 min read · Reinforcement Learning

Today, we’re going to introduce one of the most well-known value-based reinforcement learning (RL) approaches: Q-Learning. So, what is Q-Learning? Q-Learning is a method where an agent learns to make decisions by

REINFORCE

11 Jun 2025 · 4 min read · Reinforcement Learning

Today, we will discuss the REINFORCE algorithm (REward Increment = Nonnegative Factor × Offset Reinforcement × Characteristic Eligibility), which is derived from the Policy Gradient method we previously covered. In short, REINFORCE is a policy-based reinforcement