Discovering and Achieving Goals via World Models

Russell Mendonca*, Oleh Rybkin*, Kostas Daniilidis, Danijar Hafner, Deepak Pathak

ICML 2021 Unsupervised RL Workshop (oral, 6%)
ICML 2021 Self-Supervised Learning Workshop (oral)


How can artificial agents learn to solve a wide range of tasks in complex visual environments in the absence of external supervision? We decompose this question into two problems: global exploration of the environment and learning to reliably reach situations found during exploration. We introduce the Latent Explorer Achiever (LEXA), a unified solution to both that learns a world model from high-dimensional image inputs and uses it to train an explorer and an achiever policy from imagined trajectories. Unlike prior methods that explore by reaching previously visited states, the explorer plans to discover unseen, surprising states through foresight, which are then used as diverse targets for the achiever. After the unsupervised phase, LEXA solves tasks specified as goal images zero-shot, without any additional learning. We introduce a challenging benchmark spanning four standard robotic manipulation and locomotion domains with a total of over 40 test tasks. LEXA substantially outperforms previous approaches to unsupervised goal reaching, achieving goals that require interacting with multiple objects in sequence. Finally, to demonstrate its scalability and generality, we train a single general LEXA agent across the four distinct environments.

Method

LEXA explores the world and learns to reach arbitrary goal images purely from pixels, without any form of supervision. After the unsupervised interaction phase, LEXA solves complex tasks by reaching user-specified goal images.
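The sketch below illustrates the training loop described above under a generic model-based RL setup: a latent world model trained on image observations, an explorer policy rewarded for reaching surprising (uncertain) imagined states, and an achiever policy rewarded for reaching goal images sampled from past experience. All names (WorldModel, explorer_reward, achiever_reward, and the policy/replay interfaces) are illustrative placeholders, not the authors' released implementation.

```python
# Schematic LEXA training loop (simplified Python pseudocode).
# The classes and functions below are hypothetical stand-ins used to show
# the structure of the method, not the actual codebase.
import numpy as np


class WorldModel:
    """Stand-in for a latent dynamics model learned from image observations."""

    def encode(self, image):
        # Map an image observation to a latent state (dummy random latent here).
        return np.random.randn(32)

    def imagine(self, policy, start_latent, horizon=15):
        # Roll the policy forward inside the learned model, never touching
        # the real environment (dummy rollout here).
        return [start_latent + 0.1 * np.random.randn(32) for _ in range(horizon)]

    def fit(self, replay):
        pass  # update the dynamics model on replayed image sequences


def explorer_reward(latent_traj):
    # Surprise-style intrinsic reward: high where the model is uncertain,
    # so the explorer plans toward unseen states (simplified proxy here).
    return [float(np.var(z)) for z in latent_traj]


def achiever_reward(latent_traj, goal_latent):
    # Reward for getting close to the goal in latent space; the actual method
    # uses a learned distance, approximated here by negative L2 distance.
    return [-float(np.linalg.norm(z - goal_latent)) for z in latent_traj]


def train_lexa(env, world_model, explorer, achiever, replay, num_iters=1000):
    for _ in range(num_iters):
        # 1. Collect real experience with the explorer; no external rewards.
        replay.extend(env.rollout(explorer))
        # 2. Update the world model on all experience gathered so far.
        world_model.fit(replay)
        # 3. Train both policies purely from imagined latent trajectories.
        start = world_model.encode(replay.sample_image())
        goal = world_model.encode(replay.sample_image())  # past images as goals
        traj_e = world_model.imagine(explorer, start)
        explorer.update(traj_e, explorer_reward(traj_e))
        traj_a = world_model.imagine(achiever, start)
        achiever.update(traj_a, achiever_reward(traj_a, goal))
    # After unsupervised training, the achiever reaches user-specified goal
    # images zero-shot by encoding the goal and acting toward it.
    return achiever
```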

RoboYoga

These goal images require the agent to reach and maintain diverse poses.

RoboBins

These goal images require picking and placing two blocks one after the other.

Kitchen

These challenging goal images require the agent to perform three tasks one after another.