
Alibaba DAMO Academy has recently released RynnBrain, an embodied foundation model grounded in physical reality. The model demonstrates strong capabilities in physical world understanding, spatial reasoning, and robot task planning.

Model Specifications

RynnBrain is available in three variants:

Core Capabilities

1. Comprehensive Egocentric Understanding

RynnBrain excels at fine-grained video understanding and egocentric cognition, covering tasks such as embodied question answering (QA), counting, and optical character recognition (OCR).

2. Diverse Spatio-temporal Localization

RynnBrain can localize objects, target areas, and motion trajectories precisely across its episodic memory.

3. Physical-space Reasoning

RynnBrain employs an interleaved reasoning strategy that alternates between textual reasoning and spatial grounding, so that each reasoning step stays rooted in the physical environment.
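To make the idea of interleaving concrete, here is a minimal sketch of how a reasoning trace that alternates between text and spatial references could be represented. The class names, the box format, and the inline tag syntax are illustrative assumptions; the article does not specify RynnBrain's actual output format.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class TextStep:
    """A piece of free-form textual reasoning."""
    text: str


@dataclass
class GroundingStep:
    """A spatial reference: a labelled box on one video frame (hypothetical format)."""
    label: str
    frame_index: int
    # (x_min, y_min, x_max, y_max), normalised to [0, 1]; an assumed convention
    box: Tuple[float, float, float, float]


Step = Union[TextStep, GroundingStep]


def is_interleaved(trace: List[Step]) -> bool:
    """Return True if no two neighbouring steps are of the same kind."""
    return all(type(a) is not type(b) for a, b in zip(trace, trace[1:]))


def render(trace: List[Step]) -> str:
    """Flatten the trace into one string, writing grounding steps as inline tags."""
    parts = []
    for step in trace:
        if isinstance(step, TextStep):
            parts.append(step.text)
        else:
            x0, y0, x1, y1 = step.box
            parts.append(
                f"<{step.label} @frame {step.frame_index}: "
                f"({x0:.2f}, {y0:.2f}, {x1:.2f}, {y1:.2f})>"
            )
    return " ".join(parts)


# Made-up example trace: each textual claim is followed by the region it refers to.
trace = [
    TextStep("The mug is on the left edge of the table"),
    GroundingStep("mug", frame_index=12, box=(0.10, 0.40, 0.22, 0.58)),
    TextStep("so the gripper should approach from the right"),
    GroundingStep("free space", frame_index=12, box=(0.25, 0.35, 0.45, 0.60)),
]

print(is_interleaved(trace))
print(render(trace))
```

Pairing every textual claim with an explicit region is one plausible way to keep a chain of reasoning checkable against the scene rather than against language alone.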

4. Physics-aware Precise Planning

RynnBrain integrates localized affordances and object information into its plans, so that downstream VLA (Vision-Language-Action) models can execute intricate tasks from fine-grained instructions.
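As a rough illustration of what affordance-aware planning might hand to a downstream VLA model, the sketch below attaches an optional affordance point to each plan step and turns the step into a fine-grained instruction string. `PlanStep`, `to_instruction`, and the coordinate convention are hypothetical names chosen for this example; they are not part of any released RynnBrain interface.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PlanStep:
    """One step of a task plan (hypothetical structure)."""
    action: str
    target: str
    # Optional (x, y) point in normalised image coordinates; an assumed convention
    affordance_point: Optional[Tuple[float, float]] = None


def to_instruction(step: PlanStep) -> str:
    """Build an instruction string, appending the affordance point when one is present."""
    if step.affordance_point is None:
        return f"{step.action} {step.target}"
    x, y = step.affordance_point
    return f"{step.action} {step.target} at ({x:.2f}, {y:.2f})"


# Made-up plan for illustration only.
plan: List[PlanStep] = [
    PlanStep("grasp", "the mug handle", affordance_point=(0.18, 0.47)),
    PlanStep("move to", "the sink"),
    PlanStep("release", "the mug", affordance_point=(0.71, 0.52)),
]

instructions = [to_instruction(step) for step in plan]
for line in instructions:
    print(line)
```

Steps that carry a located point give the executing model a concrete spatial target, while steps without one fall back to a plain language instruction.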

Specialized Models

In addition to the base models, DAMO Academy has released three post-trained specialized models:

Technical Report and Resources

DAMO Academy has also published a detailed technical report and open-sourced model weights and code on Hugging Face and ModelScope.
