Toyota is using a generative AI-based method to teach robots to peel veggies, prepare snacks, and perform other dexterous tasks that could make them useful in the real world.
“This [method] allows you to teach skills to robots faster and with significantly fewer demonstrations than ever before,” said Russ Tedrake, VP of robotics research at Toyota Research Institute (TRI).
The challenge: A holy grail for robotics is the creation of general-purpose robots that can enter our workplaces or homes and quickly learn to perform new tasks. To reach it, though, we’re going to need a fast, efficient method for training the AIs powering the bots.
While several techniques show promise, such as having AIs learn from demonstration videos or from rewards (much like giving a dog treats during training), the hunt for the fastest, most efficient robot training method is ongoing.
What’s new? TRI has announced what it believes is a “breakthrough” method for teaching robots new dexterous skills, the kind that require a precise touch, such as pouring liquids or handling soft objects.
TRI says it has already used this method to quickly teach robots more than 60 skills, including how to use hand mixers, flip pancakes, and place dishes in a drying rack. Its goal is to reach 200 skills by the end of 2023 and 1,000 by the end of 2024.
“The tasks that I’m watching these robots perform are simply amazing — even one year ago, I would not have predicted that we were close to this level of diverse dexterity,” said Tedrake.
How it works: TRI’s new training method centers on “haptic demonstrations.”
These are created by having a researcher manually control a robot using a specially developed teleoperation interface. The interface provides haptic feedback, so the operator feels it whenever the robot makes contact with something.
The operator will usually walk the robot through a new task repeatedly for an hour or two, demonstrating it anywhere from a few dozen to hundreds of times, while the robot’s cameras and haptic sensors record the process.
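To make that concrete, here is a rough sketch of what a single logged demonstration might look like in code. The field names and shapes (camera_rgb, wrist_force, commanded_pose) are illustrative assumptions, not TRI’s actual data format:

```python
# A minimal sketch of how one haptic demonstration might be logged.
# Field names and shapes are illustrative assumptions, not TRI's real pipeline.
from dataclasses import dataclass, field
from typing import List
import numpy as np


@dataclass
class Timestep:
    camera_rgb: np.ndarray      # e.g. an (H, W, 3) image from a robot-mounted camera
    wrist_force: np.ndarray     # e.g. a (6,) force/torque reading from a haptic sensor
    commanded_pose: np.ndarray  # e.g. a (7,) end-effector pose set by the teleoperator


@dataclass
class Demonstration:
    timesteps: List[Timestep] = field(default_factory=list)

    def record(self, camera_rgb, wrist_force, commanded_pose):
        """Append one synchronized frame of sensor data and operator action."""
        self.timesteps.append(Timestep(camera_rgb, wrist_force, commanded_pose))


# Every control tick during teleoperation adds one timestep; a few dozen to a few
# hundred such demonstrations make up the training data for a single skill.
demo = Demonstration()
demo.record(np.zeros((480, 640, 3), dtype=np.uint8), np.zeros(6), np.zeros(7))
```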
The demonstration data is then fed to TRI’s AI model, which learns via “diffusion policy,” an approach developed by TRI and researchers at Columbia University.
The technique is based on the diffusion method powering some text-to-image AIs, such as Stable Diffusion and DALL-E 2, but instead of generating images from text, this version enables the AI to generate physical actions for a robot based on sensor data.
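For readers who want to see the idea in code, below is a heavily simplified sketch of a diffusion policy in PyTorch: a network learns to predict the noise added to short demonstrated action sequences, and at run time the robot “denoises” pure noise into an action plan conditioned on its current sensor observation. The MLP denoiser, the dimensions, and the noise schedule are assumptions chosen for illustration, not TRI and Columbia’s actual architecture; the researchers’ arXiv paper has the real details.

```python
# Illustrative sketch of a diffusion policy (generic DDPM recipe applied to
# robot action sequences). Sizes, network, and schedule are assumptions.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, HORIZON, STEPS = 32, 7, 16, 50   # illustrative sizes

# Simple linear noise schedule.
betas = torch.linspace(1e-4, 0.02, STEPS)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)


class NoisePredictor(nn.Module):
    """Predicts the noise in a noisy action sequence, given obs and timestep."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM + HORIZON * ACT_DIM + 1, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, HORIZON * ACT_DIM),
        )

    def forward(self, obs, noisy_actions, t):
        x = torch.cat([obs, noisy_actions.flatten(1), t.float().unsqueeze(1)], dim=1)
        return self.net(x).view(-1, HORIZON, ACT_DIM)


def training_loss(model, obs, actions):
    """One training step: add noise to demonstrated actions, learn to predict it."""
    t = torch.randint(0, STEPS, (obs.shape[0],))
    noise = torch.randn_like(actions)
    ab = alpha_bars[t].view(-1, 1, 1)
    noisy = ab.sqrt() * actions + (1 - ab).sqrt() * noise
    return nn.functional.mse_loss(model(obs, noisy, t), noise)


@torch.no_grad()
def sample_actions(model, obs):
    """At run time: iteratively denoise random noise into an action sequence."""
    actions = torch.randn(obs.shape[0], HORIZON, ACT_DIM)
    for step in reversed(range(STEPS)):
        t = torch.full((obs.shape[0],), step)
        eps = model(obs, actions, t)
        ab, a, b = alpha_bars[step], alphas[step], betas[step]
        actions = (actions - (b / (1 - ab).sqrt()) * eps) / a.sqrt()
        if step > 0:
            actions = actions + b.sqrt() * torch.randn_like(actions)
    return actions


model = NoisePredictor()
obs = torch.randn(4, OBS_DIM)                     # batch of sensor observations
demo_actions = torch.randn(4, HORIZON, ACT_DIM)   # matching demonstrated actions
loss = training_loss(model, obs, demo_actions)    # minimized by an optimizer in training
plan = sample_actions(model, obs)                 # (4, HORIZON, ACT_DIM) action plan
```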
Speed training: Diffusion policy is complicated (you can read the researchers’ paper on arXiv for all the details), but the bottom line, according to TRI, is that it’s much faster than other training methods.
“Our usual procedure is to teach the robot in the afternoon, let it learn overnight, and the next morning it’s able to do the new behavior,” said Ben Burchfiel, manager of dexterous manipulation at TRI.
This approach also opens the door to AIs that can quickly learn many different tasks, the same way other generative AIs can produce a wide variety of images or write on a wide range of topics.
“This method has great potential for building what we call ‘large-scale behavioral models,’” said Tedrake. “Just as large-scale language models revolutionized chatbots, these behavioral models allow robots to perform useful tasks in ways they couldn’t before.”
Looking ahead: A limitation of TRI’s approach is that its bots can struggle to complete tasks under conditions that differ significantly from those in the demonstration data. For example, a robot that had no problem emptying a cup of ice into an uncluttered sink — like the one in its demonstration data — can struggle to empty it into a cluttered sink.
The researchers believe their robots will become more flexible as they’re exposed to more diverse training data. To that end, they’re now building what Tedrake calls a “kindergarten for robots,” with a curriculum of haptic demonstrations and computer simulations designed to teach robots foundational skills that will be useful in many real-world situations.
“We anticipate the next breakthrough will be when we’ve trained the robots with enough dexterous skills that they’re able to generalize, performing a new skill that they’ve never been taught,” said Tedrake.