Alibaba DAMO Academy has recently released RynnBrain, an embodied foundation model grounded in physical reality. The model demonstrates strong capabilities in physical world understanding, spatial reasoning, and robot task planning.
Model Specifications
RynnBrain is available in three variants:
- RynnBrain-2B: Lightweight dense model
- RynnBrain-8B: Standard dense model
- RynnBrain-30B-A3B: MoE (Mixture-of-Experts) model with 3B active parameters
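The "3B active parameters" figure reflects how MoE models send each token through only a few experts rather than the whole network. The following is a minimal, generic sketch of top-k expert routing, not RynnBrain's actual architecture: the number of experts, the value of top_k, the parameter counts, and the function names are all illustrative assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of router scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, top_k):
    """Pick the top_k experts for one token and renormalize their weights."""
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]

def active_parameters(shared_params, params_per_expert, top_k):
    """Parameters used per token: shared layers plus only the selected experts."""
    return shared_params + top_k * params_per_expert

# Hypothetical router scores for 4 experts (assumed values, for illustration only)
routing = route_top_k([0.1, 2.0, -1.0, 1.5], top_k=2)
for expert_index, weight in routing:
    print(f"expert {expert_index}: weight {weight:.3f}")
```

Only the selected experts run for a given token, which is why a model's total parameter count can be much larger than the count that is active per token.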
Core Capabilities
1. Comprehensive Egocentric Understanding
RynnBrain excels at fine-grained video understanding and egocentric cognition, covering tasks such as embodied question answering (QA), counting, and optical character recognition (OCR).
2. Diverse Spatio-temporal Localization
The model has strong localization capabilities across episodic memory, enabling it to precisely identify objects, target areas, and motion trajectories.
3. Physical-space Reasoning
The model employs an interleaved reasoning strategy that alternates between textual reasoning and spatial grounding, keeping each reasoning step rooted in the physical environment.
4. Physics-aware Precise Planning
The model integrates located affordances and object information into its plans, so that downstream VLA (Vision-Language-Action) models can execute intricate tasks from fine-grained instructions.
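To make the idea of interleaving text with spatial grounding more concrete, here is a small sketch of how a plan that mixes language and located points might be represented and flattened into an instruction for a downstream VLA model. Everything in it is an assumption made for illustration: the TextStep and PointStep classes, the normalized coordinate convention, and the angle-bracket tag format are hypothetical and are not taken from RynnBrain's published output format.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

@dataclass
class TextStep:
    """A span of natural-language reasoning or instruction."""
    text: str

@dataclass
class PointStep:
    """A spatial reference tied to one video frame (hypothetical schema)."""
    label: str                # e.g. "affordance" or "area"
    frame_index: int          # which frame the point refers to
    xy: Tuple[float, float]   # normalized image coordinates in [0, 1] (assumed)

Step = Union[TextStep, PointStep]

def render_instruction(trace: List[Step]) -> str:
    """Flatten an interleaved trace into one string, inlining points as tags."""
    parts = []
    for step in trace:
        if isinstance(step, TextStep):
            parts.append(step.text)
        else:
            x, y = step.xy
            parts.append(f"<{step.label} frame={step.frame_index} x={x:.2f} y={y:.2f}>")
    return " ".join(parts)

# Hypothetical plan step: text alternates with grounded points
trace = [
    TextStep("Grasp the mug by its handle at"),
    PointStep("affordance", 12, (0.42, 0.61)),
    TextStep("and place it on the shelf at"),
    PointStep("area", 12, (0.80, 0.25)),
]
print(render_instruction(trace))
```

Keeping points as typed objects rather than raw text lets a consumer validate coordinates or map them to robot frames before passing the instruction on.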
Specialized Models
In addition to the base models, DAMO Academy has released three post-trained specialized models:
- RynnBrain-Plan: Robot task planning
- RynnBrain-Nav: Vision-language navigation
- RynnBrain-CoP: Chain-of-Point reasoning
Technical Report and Resources
DAMO Academy has also published a detailed technical report and open-sourced model weights and code on Hugging Face and ModelScope.
Related Links:
- Project Page: https://alibaba-damo-academy.github.io/RynnBrain.github.io/
- GitHub Repository: https://github.com/alibaba-damo-academy/RynnBrain
- Live Demo: Hugging Face Spaces
- Technical Report: PDF Download