The Robot Report Staff
2025-04-27 08:30:00
www.therobotreport.com
Cornell University researchers have developed a new robotic framework powered by artificial intelligence. RHyME — Retrieval for Hybrid Imitation under Mismatched Execution — allows robots to learn tasks by watching a single how-to video.
Robots can be finicky learners, said the Cornell team. Historically, they have required precise, step-by-step directions to complete basic tasks. They also tend to quit when things go off-script, like after dropping a tool or losing a screw. However, RHyME could fast-track the development and deployment of robotic systems by significantly reducing the time, energy, and money needed to train them, the researchers claimed.
“One of the annoying things about working with robots is collecting so much data on the robot doing different tasks,” said Kushal Kedia, a doctoral student in the field of computer science. “That’s not how humans do tasks. We look at other people as inspiration.”
Kedia will present the paper, “One-Shot Imitation under Mismatched Execution,” next month at the Institute of Electrical and Electronics Engineers’ (IEEE) International Conference on Robotics and Automation (ICRA) in Atlanta.
Paving the path for home robots
The university team said home robot assistants are still a long way off because they lack the wits to navigate the physical world and its countless contingencies.
To get robots up to speed, researchers like Kedia are training them with how-to videos — human demonstrations of various tasks in a lab setting. The Cornell researchers said they hope this approach, a branch of machine learning called “imitation learning,” will enable robots to learn a sequence of tasks faster and be able to adapt to real-world environments.
“Our work is like translating French to English – we’re translating any given task from human to robot,” said senior author Sanjiban Choudhury, assistant professor of computer science.
This translation task still faces a broader challenge: Humans move too fluidly for a robot to track and mimic, and training robots requires a lot of video. Furthermore, video demonstrations of, say, picking up a napkin or stacking dinner plates must be performed slowly and flawlessly. Any mismatch in actions between the video and the robot has historically spelled doom for robot learning, the researchers said.
“If a human moves in a way that’s any different from how a robot moves, the method immediately falls apart,” Choudhury said. “Our thinking was, ‘Can we find a principled way to deal with this mismatch between how humans and robots do tasks?’”
Cornell RHyME helps robots learn multi-step tasks
RHyME is the team’s answer – a scalable approach that makes robots less finicky and more adaptive. It enables a robotic system to use its own memory and connect the dots when performing a task it has viewed only once, drawing on robot videos it has seen before.
For example, a RHyME-equipped robot shown a video of a human fetching a mug from the counter and placing it in a nearby sink will comb its bank of videos and draw inspiration from similar actions, like grasping a cup and lowering a utensil.
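The retrieval step described above can be illustrated with a minimal sketch. This is not the actual RHyME implementation; it assumes, for illustration only, that each video clip has already been encoded as a fixed-length embedding vector, and it ranks the robot's stored clips by cosine similarity to the human demonstration. The function names, embedding dimension, and toy vectors are all hypothetical.

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve_similar_clips(human_clip_embedding, robot_clip_embeddings, top_k=2):
    """Return indices of the robot's stored clips most similar to the
    human demonstration, ranked by cosine similarity (highest first)."""
    scores = [cosine_similarity(human_clip_embedding, e)
              for e in robot_clip_embeddings]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]

# Toy 4-dimensional embeddings standing in for learned video features.
human_fetch_mug = np.array([1.0, 0.9, 0.1, 0.0])   # human fetching a mug
robot_memory = [
    np.array([0.9, 1.0, 0.0, 0.1]),   # robot grasping a cup
    np.array([0.0, 0.1, 1.0, 0.9]),   # robot wiping a table
    np.array([0.6, 0.5, 0.3, 0.2]),   # robot lowering a utensil
]

print(retrieve_similar_clips(human_fetch_mug, robot_memory))  # → [0, 2]
```

In this toy setup, the mug-fetching demonstration retrieves the cup-grasping and utensil-lowering clips rather than the unrelated table-wiping one, mirroring the "draw inspiration from similar actions" behavior the article describes.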
The team said RHyME paves the way for robots to learn multiple-step sequences while significantly lowering the amount of robot data needed for training. RHyME requires just 30 minutes of robot data; in a lab setting, robots trained using the system achieved a more than 50% increase in task success compared to previous methods, the Cornell researchers said.
“This work is a departure from how robots are programmed today. The status quo of programming robots is thousands of hours of teleoperation to teach the robot how to do tasks. That’s just impossible,” Choudhury stated. “With RHyME, we’re moving away from that and learning to train robots in a more scalable way.”