Contrastive Learning As a Reinforcement Learning Algorithm

Benjamin Eysenbach, Tianjun Zhang, Ruslan Salakhutdinov, Sergey Levine

Paper, Code, Video

tldr: Representation learning is an integral part of successful reinforcement learning (RL) algorithms, and most prior methods treat representation learning as a perception problem. These representations are typically learned by perception-specific auxiliary losses (e.g., VAE) or data augmentations (e.g., random cropping). In this paper, we take an alternative tack, showing how representation learning can emerge from reinforcement learning. To do this, we build upon prior work and apply contrastive representation learning to action-labeled trajectories, in such a way that the (inner product of) learned representations exactly corresponds to a goal-conditioned value function. Across a range goal-conditioned RL tasks, we demonstrate that our method achieves higher success rates and is more sample efficient than prior methods, with especially large gains on image-based tasks.

Below, we show examples of contrastive RL (NCE) solving some of the image-based tasks in the paper. In each video, the left shows what the agent is observing, and the right shows the desired goal image.

Opening the drawer

Sliding the window

Bin picking. Note that we do not use any demonstrations, reward functions, or reward shaping.

Pushing an object. This task is challenging because of partial observability: the agent cannot see the object at the start of the episode.