MIT Researchers Develop Long-Term Memory Framework for Robots

Insider Brief

MIT researchers have developed DAAAM, a memory framework that allows robots to build detailed maps of their surroundings and answer natural-language questions about what they have seen, where they saw it and when.
The system combines robotic mapping with computer vision, enabling robots to attach descriptions to objects and locations and later retrieve that information through language-based queries rather than relying only on coordinates or visual data.
In testing, DAAAM improved accuracy by 21% to 53% compared with existing methods, and the researchers said the technology could support applications ranging from industrial and warehouse robotics to autonomous inspection systems and augmented reality tools.

MIT researchers have developed a memory system that allows robots to remember detailed information about large environments and retrieve it using natural-language questions.

The study was led by Nicolas Gorlo, a graduate student at MIT, with Luca Carlone, associate professor in MIT’s Department of Aeronautics and Astronautics and director of the MIT SPARK Laboratory, and Lukas Schmid, a former MIT research scientist who is now a professor at the University of Technology Nuremberg in Germany. It was funded in part by the U.S. Army Research Laboratory and the Office of Naval Research.

DAAAM

The researchers call their system Describe Anything, Anywhere, at Any Moment, or DAAAM. It combines robotic mapping with advances in computer vision to create what they describe as a form of spatiotemporal memory. The goal is to allow robots not only to navigate an environment but also to remember what they have seen, where they saw it and when.

The work takes on the spatiotemporal memory challenge in robotics. While modern AI systems can recognize objects and understand language, they often struggle to connect those capabilities to detailed memories of real-world environments. Humans routinely perform this task, remembering where they left an item or recalling details about a location visited days earlier.

“If we want robots to work side-by-side with humans and interact better with humans, they must speak the same language,” Carlone noted. “The robot must be able to reason about time and space the same way humans do. That is essentially what our method is doing. It is turning a traditional map into a language-based map that is easier for the robot to think about and access using language.”

Merging Two Technologies

The researchers combined two technologies that are typically developed separately. Computer vision models can generate rich descriptions of objects and scenes but often operate on individual images. Robotic mapping systems can create large-scale three-dimensional maps but usually lack detailed semantic information about what those maps contain.

As a robot moves through an environment, the system creates a 3D map while attaching descriptions to objects it encounters. For example, it can identify a building, recognize objects nearby and store information about their location within the map. Rather than simply recording coordinates, the system builds a searchable memory that links places, objects and descriptions together.
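To make the idea of a searchable memory concrete, here is a minimal Python sketch of a store that links an object, a description, a place and a time. It is an illustration only, not the researchers' implementation: the class names, fields and sample entries are all hypothetical, and a simple keyword match stands in for whatever retrieval the real system uses.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryEntry:
    """One remembered object: what it was, where it was, and when it was seen."""
    label: str                            # short object type (hypothetical field)
    description: str                      # free-text description attached to the object
    position: tuple[float, float, float]  # map coordinates, assumed to be in metres
    place: str                            # named region of the map
    seen_at: float                        # seconds since the robot started mapping


@dataclass
class SpatialMemory:
    """A flat list of entries with keyword search; a stand-in for a real map."""
    entries: list[MemoryEntry] = field(default_factory=list)

    def add(self, entry: MemoryEntry) -> None:
        self.entries.append(entry)

    def find(self, keyword: str) -> list[MemoryEntry]:
        # Case-insensitive substring match on the label and the description.
        needle = keyword.lower()
        return [e for e in self.entries
                if needle in e.label.lower() or needle in e.description.lower()]


# Made-up sample data for illustration.
memory = SpatialMemory()
memory.add(MemoryEntry("toolbox", "red metal toolbox with a dented lid",
                       (4.2, 1.0, 0.0), "loading dock", 12.5))
memory.add(MemoryEntry("pallet", "wooden pallet stacked with boxes",
                       (9.8, 3.1, 0.0), "aisle 3", 47.0))

for hit in memory.find("red"):
    print(hit.label, hit.place, hit.position, hit.seen_at)
```

Each hit carries the place, coordinates and time alongside the description, so one lookup can answer what, where and when.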

The Processing Delay Problem

A key hurdle was speed, according to the researchers. Existing methods that generate detailed descriptions of objects can take several seconds per scene, making them impractical for robots operating in real time.

To tackle this problem, the researchers developed a method that groups nearby objects and selects only the most useful images for detailed analysis. By processing multiple objects simultaneously, the system reduced computational demands and increased annotation speed by roughly an order of magnitude, the researchers said.
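The article does not spell out how the grouping and image selection work, so the Python sketch below is an assumption-laden illustration of the general idea rather than the researchers' algorithm. It buckets objects into grid cells (a hypothetical stand-in for "nearby") and, for each group, picks the camera frame that shows the most objects in that group. The function names, the 2-metre cell size and the sample data are all invented for the example.

```python
from collections import defaultdict


def group_by_cell(objects, cell_size=2.0):
    """Bucket objects into square grid cells; objects sharing a cell form a group."""
    groups = defaultdict(list)
    for obj_id, (x, y) in objects.items():
        key = (int(x // cell_size), int(y // cell_size))
        groups[key].append(obj_id)
    return list(groups.values())


def best_frame(group, visibility):
    """Return the frame that shows the most objects from the group."""
    wanted = set(group)
    return max(visibility, key=lambda frame: len(visibility[frame] & wanted))


# Made-up data: object positions in metres and the objects visible in each frame.
objects = {"chair": (0.5, 0.5), "table": (1.0, 1.5), "door": (5.0, 0.2)}
visibility = {
    "frame_1": {"chair"},
    "frame_2": {"chair", "table"},
    "frame_3": {"door"},
}

groups = group_by_cell(objects)
plan = [(group, best_frame(group, visibility)) for group in groups]
for group, frame in plan:
    print(frame, "describes", group)
```

In this toy case, three objects need only two description requests instead of three, which is the kind of saving that batching aims for.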

Once information is stored, the robot must be able to retrieve it efficiently. The researchers integrated a large language model that uses specialized search tools to locate relevant information within the robot’s memory. Depending on the question, the system can search by object type, location or other contextual information.
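One way to picture the retrieval step is as a small set of search functions that a language model can choose between. The Python sketch below is a simplified illustration under stated assumptions: the tool names, the memory records and the dictionary-shaped tool call are hypothetical, and the language model itself is left out, with its choice of tool hard-coded.

```python
import math

# Made-up memory records: object type, 2D position in metres, time seen in seconds.
MEMORY = [
    {"label": "drill", "position": (2.0, 1.0), "time": 10.0},
    {"label": "ladder", "position": (8.0, 3.0), "time": 25.0},
    {"label": "drill", "position": (9.0, 3.5), "time": 40.0},
]


def search_by_type(label):
    """Return every record whose object type matches the label exactly."""
    return [m for m in MEMORY if m["label"] == label]


def search_near(point, radius):
    """Return every record within `radius` metres of `point`."""
    return [m for m in MEMORY if math.dist(m["position"], point) <= radius]


def search_in_time_range(start, end):
    """Return every record seen between `start` and `end` seconds, inclusive."""
    return [m for m in MEMORY if start <= m["time"] <= end]


# Hypothetical tool registry that a language model would pick from.
TOOLS = {
    "search_by_type": search_by_type,
    "search_near": search_near,
    "search_in_time_range": search_in_time_range,
}


def run_tool_call(call):
    """Dispatch a tool call of the form {"tool": name, "args": {...}}."""
    return TOOLS[call["tool"]](**call["args"])


# A question such as "Where did you see a drill?" might map to this call.
drills = run_tool_call({"tool": "search_by_type", "args": {"label": "drill"}})
for record in drills:
    print(record["position"], record["time"])
```

Keeping each tool narrow means a question about an object type, a location or a time window touches only the records it needs, rather than the whole map.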

In testing, DAAAM outperformed existing approaches, achieving accuracy improvements ranging from 21% to 53%, depending on the type of query.

The researchers envision several applications. In industrial settings, workers could ask robotic assistants to retrieve partially completed components or locate tools. Similar capabilities could support warehouse operations, autonomous inspection systems and service robots.

What’s Next?

Beyond robotics, the framework could be used in augmented reality systems that help maintenance workers identify anomalies or assist people navigating complex environments.

The researchers plan to expand the system to capture significant events in addition to objects and locations. They are also exploring ways to allow robots to express confidence in their answers, which could improve reliability in real-world deployments.

“Ultimately, we want to have robots that can help with any sort of tasks. With this framework, we are trying to create the foundations to enable a generalist agent that can do anything you ask,” Gorlo added.

Image credit: MIT
