CURL: Contrastive Unsupervised Representation Learning for Reinforcement Learning

Part of the Proceedings of the International Conference on Machine Learning (ICML 2020) pre-proceedings




Michael Laskin, Aravind Srinivas, Pieter Abbeel


Reinforcement learning for control tasks in which the agent learns from raw, high-dimensional pixels has proven difficult and sample-inefficient. Operating on high-dimensional observations poses a challenging credit assignment problem, which hinders the agent's ability to learn optimal policies quickly. One promising approach to improving the sample efficiency of image-based RL algorithms is to learn low-dimensional representations from the raw input using unsupervised learning. To that end, we propose a new model: Contrastive Unsupervised Representation Learning for Reinforcement Learning (CURL). CURL extracts high-level features from raw pixels using a contrastive learning objective and performs off-policy control on top of the extracted features. CURL achieves state-of-the-art performance and is the first image-based algorithm, across both model-free and model-based settings, to nearly match the sample efficiency and performance of state-based features on five of the six DeepMind Control benchmarks.
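To make the contrastive objective concrete, the sketch below shows a bilinear InfoNCE-style loss of the kind CURL builds on: two augmented views of the same observation form a positive pair, and the other samples in the batch serve as negatives. This is a minimal NumPy illustration under assumed shapes and names (`z_q`, `z_k`, `W` are hypothetical), not the authors' implementation, which additionally uses a momentum-updated key encoder.

```python
import numpy as np

def info_nce_loss(z_q, z_k, W):
    """Bilinear InfoNCE loss.

    z_q: (B, D) query embeddings (one augmented crop per observation)
    z_k: (B, D) key embeddings (a second augmented crop)
    W:   (D, D) learned bilinear weight matrix
    Row i of z_k is the positive for row i of z_q; all other rows
    in the batch act as negatives.
    """
    logits = z_q @ W @ z_k.T                       # (B, B) similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The "correct class" for query i is key i, i.e. the diagonal.
    return -np.mean(np.diag(log_probs))

# Toy usage with random embeddings (illustrative only).
rng = np.random.default_rng(0)
B, D = 8, 16
z_q = rng.normal(size=(B, D))
z_k = rng.normal(size=(B, D))
W = rng.normal(size=(D, D))
loss = info_nce_loss(z_q, z_k, W)
```

Minimizing this loss pulls each query toward its matching key while pushing it away from the other keys in the batch, so the encoder learns features that are invariant to the data augmentation.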