Insider Brief
- Google DeepMind released Gemini Robotics-ER 1.6, a robotics reasoning model designed to improve how machines interpret visual inputs, plan tasks and determine task completion in physical environments
- The model adds capabilities including improved spatial reasoning, multi-view perception and instrument reading, enabling robots to identify objects, understand scenes and interpret gauges in industrial settings
- Google DeepMind said the system also shows gains in safety and reliability, with improved hazard detection and adherence to physical constraints, and is available via the Gemini API and Google AI Studio
Google DeepMind has released Gemini Robotics-ER 1.6, an updated robotics reasoning model designed to improve how machines interpret and act in physical environments.
According to Google DeepMind, the model provides a high-level reasoning layer for robots, enabling systems to better understand visual inputs, plan tasks and determine when actions are complete. The release reflects continued efforts to connect advances in AI models with real-world robotics use cases, particularly in environments that require spatial awareness and decision-making.
What is Gemini Robotics-ER 1.6?
Gemini Robotics-ER 1.6 is a reasoning-first model built to support embodied AI systems, allowing robots to process visual information and translate it into physical actions. It can also interact with external tools, including search and vision-language-action systems, to support task execution.
Google DeepMind highlighted several areas of improvement over earlier versions of the model:
- Spatial reasoning and object understanding: Improved ability to identify, count and locate objects, with more accurate detection and fewer false detections of objects that are not actually present
- Pointing and relational reasoning: Uses spatial “pointing” as an intermediate step to reason about relationships, trajectories and constraints in a scene (a sketch of a pointing query follows this list)
- Task planning and success detection: Determines whether a task has been completed, allowing robots to decide whether to retry or move to the next step
- Multi-view perception: Combines inputs from multiple cameras, such as overhead and wrist-mounted views, to build a more complete understanding of dynamic or partially obscured environments
- Instrument reading: Adds the ability to interpret gauges, thermometers and sight glasses, a capability developed in collaboration with Boston Dynamics for inspection and monitoring tasks
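To make the pointing and detection behavior above concrete, the sketch below shows how such a query might look through the Gemini API’s Python SDK. The model identifier, the prompt wording and the output convention (points returned as JSON [y, x] pairs normalized to 0-1000, the format used by earlier Gemini Robotics-ER releases) are assumptions for illustration rather than details confirmed in the announcement.

```python
# Minimal sketch of a spatial "pointing" query against the Gemini API.
# Assumptions: the google-genai Python SDK, an API key in the environment,
# a placeholder model ID, and points returned as JSON [y, x] pairs
# normalized to 0-1000 (the convention of earlier Gemini Robotics-ER releases).
import json

from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

MODEL_ID = "gemini-robotics-er-1.6"  # placeholder; check the current model catalog

with open("workbench.jpg", "rb") as f:
    image_part = types.Part.from_bytes(data=f.read(), mime_type="image/jpeg")

prompt = (
    "Point to each screwdriver on the bench. Answer as a JSON list of "
    '{"point": [y, x], "label": <name>} entries, with coordinates '
    "normalized to 0-1000. Return an empty list if none are visible."
)

response = client.models.generate_content(
    model=MODEL_ID,
    contents=[image_part, prompt],
)

# Real integration code should tolerate fenced or malformed JSON here.
points = json.loads(response.text)
print(points)
```

In a real stack, the returned normalized points would be mapped back into camera or world coordinates before being handed to a planner or a vision-language-action policy.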
Focus on real-world robotics applications
The company pointed out that the instrument-reading capability reflects a practical use case in industrial settings, where robots such as Boston Dynamics’ Spot capture images of equipment that must be interpreted accurately. To do that, the model uses what Google DeepMind calls “agentic vision,” a combination of visual reasoning and intermediate computational steps, such as zooming into images and estimating measurements, to derive readings.
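The announcement does not spell out how agentic vision is invoked, but the pattern it describes can be approximated from the outside as a two-step call: locate the instrument, crop the frame locally to “zoom in,” then ask for the reading. The sketch below assumes the google-genai Python SDK, a placeholder model ID and a bounding-box JSON convention borrowed from other Gemini detection prompts; the released model performs comparable steps internally.

```python
# Illustrative "locate, zoom, read" loop approximating the agentic-vision
# pattern described above. The model ID, prompt wording and the
# [ymin, xmin, ymax, xmax] 0-1000 bounding-box format are assumptions.
import json

from PIL import Image
from google import genai

client = genai.Client()
MODEL_ID = "gemini-robotics-er-1.6"  # placeholder

def read_gauge(image_path: str) -> str:
    img = Image.open(image_path)

    # Step 1: ask for the gauge's bounding box in the assumed JSON format.
    locate = client.models.generate_content(
        model=MODEL_ID,
        contents=[
            img,
            'Return the bounding box of the pressure gauge as JSON: '
            '{"box_2d": [ymin, xmin, ymax, xmax]}, normalized to 0-1000.',
        ],
    )
    box = json.loads(locate.text)["box_2d"]  # handle fenced/invalid JSON in real code

    # Step 2: crop ("zoom into") the gauge and ask for the needle reading.
    w, h = img.size
    crop = img.crop((
        int(box[1] * w / 1000), int(box[0] * h / 1000),
        int(box[3] * w / 1000), int(box[2] * h / 1000),
    ))
    reading = client.models.generate_content(
        model=MODEL_ID,
        contents=[crop, "Read the value indicated by the needle, including units."],
    )
    return reading.text

print(read_gauge("spot_inspection_frame.jpg"))
```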
“Capabilities like instrument reading and more reliable task reasoning will enable Spot to see, understand and react to real-world challenges completely autonomously,” Marco da Silva, vice president and general manager of Spot at Boston Dynamics, noted in the announcement.
Improvements in safety and reliability
Google DeepMind said the model shows improved adherence to safety constraints, including better identification of potential hazards and more consistent decision-making around what objects can be safely manipulated. The system was also evaluated on tasks involving safety instruction following and risk detection in text and video scenarios.
“On these tasks, our Gemini Robotics-ER models improve over baseline Gemini 3.0 Flash performance (+6% in text, +10% in video) in perceiving injury risks accurately,” the company reported.
Availability
Gemini Robotics-ER 1.6 is available through the Gemini API and Google AI Studio, with developer tools and example workflows provided to support integration into robotics systems.
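As one illustration of the kind of workflow those tools support, and of the success detection described earlier, the sketch below sends before-and-after camera frames together with the instruction and asks whether the step completed, so a control loop can decide to retry or move on. The SDK calls are standard Gemini API usage; the model ID, prompt and expected JSON shape are assumptions for illustration.

```python
# Sketch of a task-success check over before/after camera frames.
# Assumptions: the google-genai SDK, an API key in the environment,
# a placeholder model ID, and an illustrative prompt and JSON shape.
import json

from google import genai
from google.genai import types

client = genai.Client()
MODEL_ID = "gemini-robotics-er-1.6"  # placeholder

def step_succeeded(before_jpeg: bytes, after_jpeg: bytes, instruction: str) -> bool:
    response = client.models.generate_content(
        model=MODEL_ID,
        contents=[
            types.Part.from_bytes(data=before_jpeg, mime_type="image/jpeg"),
            types.Part.from_bytes(data=after_jpeg, mime_type="image/jpeg"),
            f'The first image was taken before attempting: "{instruction}". '
            "The second image was taken after. Did the step complete successfully? "
            'Answer as JSON: {"success": true or false, "reason": <short string>}.',
        ],
    )
    verdict = json.loads(response.text)  # handle fenced/invalid JSON in real code
    return bool(verdict.get("success", False))

# A control loop could then retry the step or advance to the next one, e.g.:
# if not step_succeeded(before, after, "place the cup on the shelf"): retry()
```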
“For robots to be truly helpful in our daily lives and industries, they must do more than follow instructions, they must reason about the physical world,” the company said. “From navigating a complex facility to interpreting the needle on a pressure gauge, a robot’s ‘embodied reasoning’ is what allows it to bridge the gap between digital intelligence and physical action.”
Image credit: Google DeepMind