Maria Deutscher
2025-03-12 18:11:00
siliconangle.com
Google LLC today introduced two new artificial intelligence models, Gemini Robotics and Gemini Robotics-ER, that are optimized to power autonomous machines.
The algorithms are based on the company’s Gemini 2.0 series of large language models. Introduced in December, the LLMs can process not only text but also multimodal data such as video. That capability enables the new Gemini Robotics and Gemini Robotics-ER models to analyze footage from a robot’s cameras when making decisions.
Gemini Robotics is described as a vision-language-action model. According to Google, robots equipped with the model can perform complex tasks based on natural language instructions. A user could, for example, ask the AI to fold paper into origami shapes or place items in a Ziploc bag.
Historically, teaching an industrial robot a new task required manual programming. That work requires specialized skills and can consume a significant amount of time. To ease the robot configuration process, Google’s researchers built Gemini Robotics with generality in mind. The company says that the AI can carry out tasks it was not taught to perform during training, which reduces the need for manual programming.
To test how well Gemini Robotics responds to new tasks, Google evaluated it using an AI generalization benchmark. The company determined the algorithm more than doubled the performance of earlier vision-language-action models. According to Google, Gemini Robotics can not only perform tasks it was not taught to perform but also change how it carries out those tasks when environmental conditions change.
“If an object slips from its grasp, or someone moves an item around, Gemini Robotics quickly replans and carries on — a crucial ability for robots in the real world, where surprises are the norm,” Carolina Parada, head of robotics at Google DeepMind, detailed in a blog post.
The other new AI model that the company debuted today, Gemini Robotics-ER, is geared toward spatial reasoning. This is a term for the complex sequence of computations that a robot must carry out before it can perform a task. Picking up a coffee mug, for example, requires a robotic arm to find the handle and calculate the angle from which it should be approached.
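To make the coffee mug example concrete, the following is a minimal Python sketch of the kind of geometry involved in choosing an approach angle. It is not Google's method and does not use any Gemini API. The function names (`approach_angle`, `pregrasp_point`), the flat two-dimensional coordinates, and the idea of approaching along the line from the mug's center through its handle are all assumptions made for illustration.

```python
import math

def approach_angle(mug_center, handle_pos):
    """Angle in degrees of the direction from the mug's center to its handle.

    Hypothetical helper: assumes both points are (x, y) coordinates in meters
    on the table plane, as a perception step might report them.
    """
    dx = handle_pos[0] - mug_center[0]
    dy = handle_pos[1] - mug_center[1]
    return math.degrees(math.atan2(dy, dx))

def pregrasp_point(handle_pos, angle_deg, standoff):
    """Point `standoff` meters beyond the handle along the approach direction.

    Hypothetical helper: a gripper could start here and move straight in
    toward the handle.
    """
    angle = math.radians(angle_deg)
    return (
        handle_pos[0] + standoff * math.cos(angle),
        handle_pos[1] + standoff * math.sin(angle),
    )

# Example: the handle sits 5 cm from the mug's center along the +y axis.
angle = approach_angle((0.0, 0.0), (0.0, 0.05))
start = pregrasp_point((0.0, 0.05), angle, standoff=0.10)
```

A real system would work in three dimensions and account for the gripper's shape and obstacles. The sketch only shows why locating the handle has to come before the angle can be computed.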
After developing a plan for how to carry out a task, Gemini Robotics-ER uses Gemini 2.0’s coding capabilities to turn the plan into a configuration script. This script programs the robot in which the AI is installed. If a task proves too complicated for Gemini Robotics-ER, developers can teach it the best course of action with a “handful of human demonstrations.”
“Gemini Robotics-ER can perform all the steps necessary to control a robot right out of the box, including perception, state estimation, spatial understanding, planning and code generation,” Parada wrote. “In such an end-to-end setting the model achieves a 2x-3x success rate compared to Gemini 2.0.”
Google will make Gemini Robotics-ER available to several partners, including Apptronik Inc., a humanoid robot startup that raised $350 million last month. The funding round saw the search giant join as an investor. Google will collaborate with Apptronik to develop humanoid robots equipped with Gemini 2.0.
Image: Google