**Contrastive Learning as a Reinforcement Learning Algorithm**

Benjamin Eysenbach,   Tianjun Zhang,   Ruslan Salakhutdinov,   Sergey Levine

Paper,   Code,   Video

![](images/method_figure.png)

*__tldr__*: Representation learning is an integral part of successful reinforcement learning (RL) algorithms, and most prior methods treat representation learning as a perception problem: representations are typically learned with perception-specific auxiliary losses (e.g., a VAE) or data augmentations (e.g., random cropping). In this paper, we take an alternative tack, showing how representation learning can emerge from reinforcement learning itself. To do this, we build on prior work and apply contrastive representation learning to action-labeled trajectories, such that the inner product of the learned representations exactly corresponds to a goal-conditioned value function (a minimal code sketch of this objective appears at the bottom of this page). Across a range of goal-conditioned RL tasks, we demonstrate that our method achieves higher success rates and is more sample efficient than prior methods, with especially large gains on image-based tasks.

Below, we show examples of contrastive RL (NCE) solving some of the image-based tasks in the paper. In each video, the left shows what the agent is observing, and the right shows the desired goal image.

![Opening the drawer.](videos/sawyer_drawer_image.mp4 width="100%")

![Sliding the window.](videos/sawyer_window_image.mp4 width="100%")

![Bin picking. Note that we do not use any demonstrations, reward functions, or reward shaping.](videos/sawyer_bin_image.mp4 width="100%")

![Pushing an object. This task is challenging because of partial observability: the agent cannot see the object at the start of the episode.](videos/sawyer_push_image_partial.mp4 width="100%")

--------------------------
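For concreteness, here is a minimal, self-contained sketch of the contrastive (NCE) critic objective described in the tldr: a batch of state-action pairs and goals is encoded, the pairwise inner products serve as classifier logits, and a binary cross-entropy loss treats same-trajectory pairs (the diagonal) as positives. The linear encoders `sa_encode` and `g_encode`, the parameter names, and the random toy data are simplifying assumptions for illustration, not the released implementation (see the Code link above for that).

```python
"""Minimal sketch of a contrastive RL (NCE) critic loss, under simplifying
assumptions (linear encoders, random toy data). Not the authors' released code."""
import jax
import jax.numpy as jnp

def sa_encode(W_sa, states, actions):
    # Hypothetical linear encoder phi(s, a) for state-action pairs.
    return jnp.concatenate([states, actions], axis=-1) @ W_sa

def g_encode(W_g, goals):
    # Hypothetical linear encoder psi(g) for goal observations.
    return goals @ W_g

def nce_critic_loss(params, states, actions, goals):
    """Binary NCE: classify whether goal j is a future state of (s, a) pair i.

    The critic is the inner product f(s, a, g) = phi(s, a)^T psi(g), so the
    learned representations double as a goal-conditioned value function.
    """
    phi = sa_encode(params['W_sa'], states, actions)   # (B, d)
    psi = g_encode(params['W_g'], goals)               # (B, d)
    logits = phi @ psi.T                               # (B, B) inner products
    labels = jnp.eye(logits.shape[0])                  # positives on the diagonal
    # Sigmoid binary cross-entropy, written out explicitly.
    log_p = jax.nn.log_sigmoid(logits)
    log_not_p = jax.nn.log_sigmoid(-logits)
    return -(labels * log_p + (1.0 - labels) * log_not_p).mean()

# Toy usage on random data.
key = jax.random.PRNGKey(0)
k1, k2, k3, k4, k5 = jax.random.split(key, 5)
B, s_dim, a_dim, d = 32, 10, 4, 16
params = {
    'W_sa': jax.random.normal(k1, (s_dim + a_dim, d)) * 0.1,
    'W_g': jax.random.normal(k2, (s_dim, d)) * 0.1,
}
states = jax.random.normal(k3, (B, s_dim))
actions = jax.random.normal(k4, (B, a_dim))
goals = jax.random.normal(k5, (B, s_dim))  # in practice: future states from the same trajectories
loss, grads = jax.value_and_grad(nce_critic_loss)(params, states, actions, goals)
```

In the full method, the goals in a batch are states sampled from the future of the same trajectories as the corresponding state-action pairs, and the policy is trained to choose actions that maximize this inner-product critic for the commanded goal.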