Working like a dog
Berkeley researchers may be one step closer to making robot dogs our new best friends. Using advances in reinforcement learning (RL), two separate teams have developed cutting-edge approaches to shorten training times for quadruped robots, getting them to walk — and even roll over — in record time.
In a first for the robotics field, a team led by Sergey Levine, associate professor of electrical engineering and computer sciences, demonstrated a robot dog learning to walk without prior training in just 20 minutes. The robot relied solely on trial and error in the field to master the movements necessary to walk and adapt to different settings. Levine’s team accelerated learning by leveraging advances in RL algorithms and machine learning frameworks, enabling the robot to learn more efficiently from its mistakes while interacting with its environment. “We are studying how to allow the robot to learn from its mistakes and continue to improve while it is acting in the real world,” said Laura Smith, a Ph.D. student and co-author of the paper, along with postdoctoral researcher Ilya Kostrikov.
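The trial-and-error loop described above can be sketched in miniature. Levine’s team used deep model-free RL on real hardware; the toy below is only an illustrative analogue, a tabular Q-learning agent on a hypothetical one-dimensional “walking” task, showing how a policy improves purely from interaction and mistakes (all names and parameters here are assumptions, not the paper’s method).

```python
import random

# Toy stand-in for learning to walk by trial and error: a 1-D chain
# where the agent must learn to step forward to reach a goal state.
N_STATES = 5          # positions 0..4; reaching state 4 is the "goal"
ACTIONS = [0, 1]      # 0 = step back, 1 = step forward

def step(state, action):
    """Environment dynamics: move along the chain; reward only at the goal."""
    nxt = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

def train(episodes=500, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # Q-table: state x action
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy: mostly exploit, sometimes explore (the "mistakes")
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: q[s][act])
            s2, r, done = step(s, a)
            # Q-learning update: improve the estimate from each interaction
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

q = train()
policy = [max(ACTIONS, key=lambda act: q[s][act]) for s in range(N_STATES)]
```

After training, the greedy policy steps forward from every non-goal state, i.e. the agent has learned to “walk” from experience alone.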
A different team at Berkeley, led by Pieter Abbeel, professor of electrical engineering and computer sciences, took another approach to helping a robot dog teach itself to roll over, stand up and walk in just one hour of real-world training time. The robot also proved it could adapt. Within 10 minutes, it learned to withstand pushes or quickly roll over and get back on its feet, without any resets or intervention from the researchers. The team — including Ph.D. student Alejandro Escontrela, Danijar Hafner, Philipp Wu (B.S.’19 EECS/ME) and Ken Goldberg, professor of industrial engineering and operations research and of electrical engineering and computer sciences — employed an RL algorithm called Dreamer that uses a learned world model. The model is built from data gathered during the robot’s ongoing interactions with the world, and the robot uses it to imagine the potential outcomes of its actions. “The robot sort of dreams and imagines what the consequences of its actions would be and then trains itself in imagination to improve, thinking of different actions and exploring a different sequence of events,” said Escontrela.
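The “training in imagination” idea can also be sketched with a toy. Dreamer itself learns a latent neural world model and trains an actor-critic inside it; the Dyna-style sketch below is only a hypothetical analogue of the same principle: record what the world did, then replay imagined transitions from that learned model to improve without further real-world steps (all names and parameters are illustrative assumptions).

```python
import random

# Dyna-style toy of the world-model idea: the agent memorizes observed
# transitions as a learned "world model", then "dreams" -- replaying
# imagined transitions from the model to train its values cheaply.
N_STATES = 5
ACTIONS = [0, 1]      # 0 = step back, 1 = step forward

def real_step(state, action):
    """The real environment: a 1-D chain with reward only at the goal."""
    nxt = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

def train(real_episodes=50, imagined_updates=20, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    model = {}  # learned world model: (state, action) -> (next_state, reward)
    for _ in range(real_episodes):
        s = 0
        while s != N_STATES - 1:
            if rng.random() < 0.2:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: q[s][act])
            s2, r = real_step(s, a)
            model[(s, a)] = (s2, r)          # remember what the world did
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
            # "dreaming": train on imagined transitions from the model,
            # without taking any additional real-world steps
            for _ in range(imagined_updates):
                ps, pa = rng.choice(list(model))
                ps2, pr = model[(ps, pa)]
                q[ps][pa] += alpha * (pr + gamma * max(q[ps2]) - q[ps][pa])
    return q

q = train()
policy = [max(ACTIONS, key=lambda act: q[s][act]) for s in range(N_STATES)]
```

Because most updates come from imagined experience, the agent needs far fewer real interactions, which is the intuition behind the one-hour training time reported above.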
Learn more:
Step by step
A walk in the park: Learning to walk in 20 minutes with model-free reinforcement learning (arXiv)
DayDreamer: World models for physical robot learning (arXiv)