HoloAgent is a unified embodied-agent framework for general-purpose robots, integrating closed-loop execution, 3D spatial memory, and robot skills for real-world tasks.
🔥 News
- [2026.06] HoloAgent-0 is released. Code is under preparation and will be released soon.
- [2025.09] FSR-VLN is released for fast-and-slow vision-language navigation.
✅ Release Status
- HoloAgent-0 code update
- HoloAgent-0 project page & paper
- FSR-VLN code
🧩 Components
- Embodied AgentOS: Coordinates high-level task planning, runtime feedback, and closed-loop robot execution.
- 3D Spatial Memory: Grounds robot reasoning in physical-world spatial representations for long-horizon tasks.
- Embodied Skills: Connects agent decisions to executable robot navigation and manipulation skills.
- FSR-VLN: Provides fast-and-slow vision-language navigation with a hierarchical multi-modal scene graph.
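To make the 3D Spatial Memory component more concrete, here is a minimal sketch of a memory that stores labelled 3D points and answers "nearest object" queries. It is an illustration only, not the HoloAgent implementation: `MemoryEntry`, `SpatialMemory`, and the flat point-list representation are assumptions introduced for this example.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class MemoryEntry:
    label: str                            # hypothetical semantic label, e.g. "mug"
    position: tuple[float, float, float]  # (x, y, z) in a shared map frame


class SpatialMemory:
    """Toy 3D memory: stores labelled points and answers nearest-object queries."""

    def __init__(self) -> None:
        self._entries: list[MemoryEntry] = []

    def add(self, label: str, position: tuple[float, float, float]) -> None:
        self._entries.append(MemoryEntry(label, position))

    def nearest(
        self, label: str, robot_position: tuple[float, float, float]
    ) -> MemoryEntry | None:
        """Return the closest stored entry with this label, or None if unseen."""
        matches = [e for e in self._entries if e.label == label]
        if not matches:
            return None
        return min(matches, key=lambda e: math.dist(e.position, robot_position))


memory = SpatialMemory()
memory.add("mug", (4.0, 0.0, 0.8))
memory.add("mug", (1.0, 1.0, 0.8))
memory.add("sofa", (2.0, 3.0, 0.0))

# Ask for the mug closest to a robot standing at the map origin.
target = memory.nearest("mug", robot_position=(0.0, 0.0, 0.0))
print(target)
```

A real spatial memory would likely hold richer geometry and features than labelled points; the sketch only shows why grounding a query in positions lets a planner pick one concrete target among several candidates.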
🧠 Framework for Closed-Loop Robot Execution
AgentOS turns language instructions into monitored skill graphs and closes the loop across spatial retrieval, execution, memory updates, and recovery.
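The closed loop described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not AgentOS code: the `Skill` signature, `run_skill_graph`, the retry-based recovery policy, and the dictionary used as memory are all hypothetical. Skills run in dependency order, each result is logged (monitoring), successes are written to memory, and a failed skill is retried before the run aborts.

```python
from graphlib import TopologicalSorter
from typing import Callable

# Hypothetical skill type: receives the shared memory, returns True on success.
Skill = Callable[[dict], bool]


def run_skill_graph(
    skills: dict[str, Skill],
    deps: dict[str, set[str]],
    memory: dict,
    max_retries: int = 1,
) -> list[str]:
    """Run skills in dependency order with retries; return an event log."""
    log: list[str] = []
    for name in TopologicalSorter(deps).static_order():
        for _attempt in range(max_retries + 1):
            ok = skills[name](memory)                        # execute
            log.append(f"{name}:{'ok' if ok else 'fail'}")   # monitor
            if ok:
                memory.setdefault("done", []).append(name)   # memory update
                break
        else:
            # Every attempt failed: stop instead of running dependent skills.
            log.append(f"{name}:abort")
            break
    return log


attempts = {"grasp": 0}


def navigate(memory: dict) -> bool:
    memory["at_table"] = True
    return True


def grasp(memory: dict) -> bool:
    # Simulated flaky skill: fails on the first attempt, succeeds on the second.
    attempts["grasp"] += 1
    return memory.get("at_table", False) and attempts["grasp"] > 1


def place(memory: dict) -> bool:
    return True


skills = {"navigate": navigate, "grasp": grasp, "place": place}
deps = {"navigate": set(), "grasp": {"navigate"}, "place": {"grasp"}}
memory: dict = {}
log = run_skill_graph(skills, deps, memory)
print(log)  # ['navigate:ok', 'grasp:fail', 'grasp:ok', 'place:ok']
```

Retrying the same skill is the simplest possible recovery policy; a fuller system might replan or query spatial memory for an alternative target instead.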
🤖 Real-Robot Demonstrations
Compressed previews from real-hardware deployments. Full-resolution videos are available on the project page.
| Navigation and Dance Coordination | Long-Horizon Mobile Manipulation |
|---|---|
| ![]() | ![]() |
| Coordinate navigation and humanoid motion across robots. | Decompose long-horizon manipulation into navigation, grasping, placement, and recovery. |
| Active Exploration in a New Environment | Interactive Humanoid Command Execution |
|---|---|
| ![]() | ![]() |
| Explore new spaces and update 3D memory online. | Follow open-ended commands with navigation and embodied actions. |
| A Day with a Robot Companion | A Day in the Life of a Robot Guide |
|---|---|
| ![]() | ![]() |
| Combine language, 3D reasoning, navigation, interaction, and action. | Guide users through workspaces with spatial-memory-aware routes. |
🤖 FSR-VLN
FSR-VLN is the HoloAgent navigation component, combining a Hierarchical Multi-modal Scene Graph with Fast-to-Slow Navigation Reasoning for efficient long-range spatial reasoning.
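As a rough intuition for fast-to-slow reasoning over a hierarchical graph, the sketch below descends a small scene hierarchy with a cheap score and falls back to an expensive check only when the cheap result is not confident. It is not the FSR-VLN algorithm: `Node`, `fast_score` (word overlap standing in for embedding similarity), `slow_verify` (a stand-in for a vision-language model call), and the `threshold` value are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Node:
    name: str
    description: str
    children: list["Node"] = field(default_factory=list)


def fast_score(query: str, node: Node) -> float:
    """Cheap stand-in for feature similarity: share of query words in the node text."""
    query_words = query.lower().split()
    if not query_words:
        return 0.0
    node_words = set(node.description.lower().split())
    return sum(w in node_words for w in query_words) / len(query_words)


def fast_descend(query: str, root: Node) -> tuple[Node, float]:
    """Greedy coarse-to-fine descent: follow the best-scoring child at each level."""
    node, score = root, 0.0
    while node.children:
        node = max(node.children, key=lambda c: fast_score(query, c))
        score = fast_score(query, node)
    return node, score


def leaves(node: Node) -> list[Node]:
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def find_goal(
    query: str,
    root: Node,
    slow_verify: Callable[[str, Node], bool],
    threshold: float = 0.5,
) -> tuple[str, str]:
    """Return (goal name, path used). The slow path runs only on low fast confidence."""
    candidate, score = fast_descend(query, root)
    if score >= threshold:
        return candidate.name, "fast"
    for leaf in leaves(root):
        if slow_verify(query, leaf):
            return leaf.name, "slow"
    return candidate.name, "fast-fallback"


root = Node("floor_1", "first floor", [
    Node("kitchen", "kitchen sink fridge mug", [
        Node("mug_1", "red mug on counter"),
        Node("fridge_1", "white fridge"),
    ]),
    Node("office", "office desk monitor chair", [
        Node("desk_1", "wooden desk"),
        Node("chair_1", "black office chair"),
    ]),
])


def toy_verifier(query: str, node: Node) -> bool:
    # Hypothetical stand-in for an expensive model that understands "sit".
    return "sit" in query and "chair" in node.description


print(find_goal("red mug", root, toy_verifier))            # ('mug_1', 'fast')
print(find_goal("somewhere to sit", root, toy_verifier))   # ('chair_1', 'slow')
```

The point of the split is cost: literal queries resolve through the cheap descent, and only queries that need inference pay for the slow check.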
🏗 Getting Started
The current repository includes FSR-VLN and navigation-agent setup. HoloAgent-0 code will be added in a future release.
1. Semantic Mapping and Retrieval Pipeline
- Task: Set up the semantic mapping and retrieval system by following the instructions in `fsr_vln/README.md`.
- Steps:
  - Download the necessary pre-trained model checkpoints.
  - Download and configure the required datasets.
  - Set up the environment and dependencies as specified.
  - Run the complete pipeline to verify that semantic mapping and visual place retrieval both work.
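Before running the pipeline, it can help to confirm that the downloaded files are where you expect them. The following is a small optional helper, not part of the repository: `missing_paths` is a hypothetical function, and the `checkpoints` and `data` locations are placeholders. Use the paths given in `fsr_vln/README.md` instead.

```python
from pathlib import Path


def missing_paths(required: dict[str, Path]) -> list[str]:
    """Return the names of required items whose path does not exist."""
    return [name for name, path in required.items() if not path.exists()]


required = {
    "setup instructions": Path("fsr_vln/README.md"),
    "model checkpoints": Path("checkpoints"),  # placeholder location
    "datasets": Path("data"),                  # placeholder location
}

missing = missing_paths(required)
print("missing:", ", ".join(missing) if missing else "nothing")
```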
2. Navigation Agent Setup and Execution
- Task: Set up and test the navigation agent by following the instructions in `nav_agent/README.md`.
- Steps:
  - Install all required dependencies for the navigation environment.
  - Configure the necessary parameters and environment settings.
  - Run the navigation agent to confirm that it starts successfully and performs its intended tasks.
📚 Publications & Citation
If you find our project useful, please consider citing it:
@misc{holoagent2026holoagent0,
title={HoloAgent-0: A Unified Embodied Agent Framework with 3D Spatial Memory},
year={2026},
eprint={2606.23565},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2606.23565},
}
@misc{zhou2025fsrvlnfastslowreasoning,
title={FSR-VLN: Fast and Slow Reasoning for Vision-Language Navigation with Hierarchical Multi-modal Scene Graph},
author={Xiaolin Zhou and Tingyang Xiao and Liu Liu and Yucheng Wang and Maiyue Chen and Xinrui Meng and Xinjie Wang and Wei Feng and Wei Sui and Zhizhong Su},
year={2025},
eprint={2509.13733},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2509.13733},
}
🙏 Acknowledgements
This project builds on and is inspired by several outstanding open-source projects: OVO, HOV-SG, rerun, dimos, and openclaw.
⚖️ License
This project is licensed under the Apache License 2.0. See the LICENSE file for details.





