Advanced Exploration: Hindsight Experience Replay

Sebastian Dittert · Published in Analytics Vidhya · Jul 17, 2020 · 7 min read


One of the challenges in reinforcement learning is the sparse reward setting: the agent receives a reward only when it reaches the goal state. Most reinforcement learning algorithms, however, need reward feedback to learn to solve the task, or to learn anything at all.

Without any reward signal, most algorithms are therefore destined to fail at the task if they never encounter the goal state. To reach difficult goal states, and to eventually learn from the received reward, special exploration strategies are needed.
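To make the setting concrete, a goal-conditioned sparse reward might look like the minimal sketch below. The function name, the vector-valued goals, and the tolerance are illustrative assumptions, not from the article; the point is that the agent gets a reward only on success and no signal about how close it came otherwise.

```python
import numpy as np

def sparse_reward(achieved_goal, desired_goal, tolerance=0.05):
    # Reward is given only when the achieved goal is within `tolerance`
    # of the desired goal; everywhere else the agent gets no feedback
    # about "how close" it came.
    distance = np.linalg.norm(np.asarray(achieved_goal) - np.asarray(desired_goal))
    return 1.0 if distance < tolerance else 0.0
```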

In this article, I want to introduce Hindsight Experience Replay (HER), one such exploration strategy that makes it possible to learn quickly in sparse reward settings.

The beauty of HER is that it is a rather simple and intuitive extension that can be built into several off-policy learning algorithms. With this simple add-on, those algorithms become capable of solving sparse reward settings that they could not solve without HER.
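To make the idea more tangible, here is a minimal sketch of the hindsight relabeling step on top of a standard replay buffer. The tuple layout, the `reward_fn` argument, and the "future" sampling parameter `k` are assumptions for illustration and not tied to any particular library; the core idea is only that stored transitions are copied with the original goal replaced by a goal that was actually achieved later in the same episode, so the sparse reward can be recomputed and now often signals success.

```python
import random

def her_relabel(episode, reward_fn, k=4):
    # episode: list of transitions
    #   (state, action, next_state, achieved_goal_after_step, desired_goal)
    # Returns extra transitions whose desired goal is swapped for a goal
    # actually achieved later in the same episode ("future" strategy).
    relabeled = []
    for t, (s, a, s_next, achieved, _) in enumerate(episode):
        # sample k timesteps from the remainder of this episode
        future_indices = random.choices(range(t, len(episode)), k=k)
        for idx in future_indices:
            new_goal = episode[idx][3]          # goal that was actually reached
            r = reward_fn(achieved, new_goal)   # e.g. the sparse_reward above
            relabeled.append((s, a, s_next, new_goal, r))
    return relabeled
```

In a training loop, these relabeled transitions would simply be appended to the replay buffer alongside the original ones, and the off-policy algorithm trains on both without further modification.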

But let’s have a look at how HER works…

Intuition behind HER

Imagine you are a golfer and you have to hole the ball. To bring this into the context of RL: we are the agent and our task is to hole the ball. If we do so, we receive a reward of +1. For any hit…

