Research

Interactive Embodied Intelligence Lab

PI: Xueyan ZOU

Research areas: interactive embodied intelligence, world models, dexterous control and perception, embodied foundation models

Lab Overview

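We aim to build dexterous manipulation systems from rich perceptual signals, with a focus on interactive deployment in the real world. Our main research directions are: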

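* Interactive world models: build dynamic, interactive digital twins that simulate the physical interaction properties of real environments, and use a closed-loop “Real-to-Sim-to-Real” pipeline to deploy policies seamlessly in the real world.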

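* Dexterous control and perception: learn high-degree-of-freedom, low-level control policies for dexterous manipulation and agile locomotion from multimodal perception data collected in physical environments.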

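* Embodied foundation models: combine large language models and large multimodal models with embodied reasoning, so that foundation models can understand the physical world and translate high-level intent into low-level actions.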

Lab website: https://maureenzou.github.io/lab.html

Representative Publications

1. Dex1B: Learning with 1B Demonstrations for Dexterous Manipulation

Jianglong Ye, Keyi Wang, Chengjing Yuan, Ruihan Yang, Yiquan Li, Jiyue Zhu, Yuzhe Qin, Xueyan Zou, Xiaolong Wang

RSS 2025


2. NaVILA: Legged Robot Vision-Language-Action Model for Navigation

An-Chieh Cheng*, Xandone Ji*, Zhaojing Yang*, Zaitian Gongye, Xueyan Zou, Jan Kautz, Erdem Bıyık, Hongxu Yin, Sifei Liu, Xiaolong Wang

RSS 2025


3. WildLMa: Long Horizon Loco-Manipulation in the Wild

Ri-Zhao Qiu*, Yuchen Song*, Xuanbin Peng*, Sai Aneesh Suryadevara, Ge Yang, Minghuan Liu, Mazeyu Ji, Chengzhe Jia, Ruihan Yang, Xueyan Zou, Xiaolong Wang

ICRA 2025


4. 3D-Spatial Multimodal Memory

Xueyan Zou, Yuchen Song, Ri-Zhao Qiu, Xuanbin Peng, Jianglong Ye, Sifei Liu, Xiaolong Wang

ICLR 2025


5. What can Foundation Models’ Embeddings do?

Xueyan Zou, Linjie Li, Jianfeng Wang, Jianwei Yang, Mingyu Ding, Junyi Wei, Feng Li, Hao Zhang, Shilong Liu, Arul Aravinthan, Yong Jae Lee†, Lijuan Wang†

NeurIPS 2024


6. GraspSplats: Efficient Manipulation with 3D Feature Splatting

Mazeyu Ji*, Ri-Zhao Qiu*, Xueyan Zou, Xiaolong Wang

CoRL 2024


7. Semantic-SAM: Segment and Recognize Anything at Any Granularity

Feng Li*, Hao Zhang*, Peize Sun, Xueyan Zou, Shilong Liu, Jianwei Yang, Chunyuan Li, Lei Zhang†, Jianfeng Gao†

ECCV 2024


8. LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models

Hao Zhang, Hongyang Li, Feng Li, Tianhe Ren, Xueyan Zou, Shilong Liu, Shijia Huang, Jianfeng Gao, Lei Zhang, Chunyuan Li, Jianwei Yang

ECCV 2024


9. LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents

Shilong Liu, Hao Cheng, Haotian Liu, Hao Zhang, Feng Li, Tianhe Ren, Xueyan Zou, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang, Jianfeng Gao, Chunyuan Li

ECCV 2024


10. Visual In-Context Prompting

Feng Li, Qing Jiang, Hao Zhang, Tianhe Ren, Shilong Liu, Xueyan Zou, Huaizhe Xu, Hongyang Li, Chunyuan Li, Jianwei Yang, Lei Zhang, Jianfeng Gao

CVPR 2024


11. Segment Everything Everywhere All at Once

Xueyan Zou*, Jianwei Yang*, Hao Zhang*, Feng Li*, Linjie Li, Jianfeng Wang, Lijuan Wang, Jianfeng Gao†, Yong Jae Lee†

NeurIPS 2023


12. A Simple Framework for Open-Vocabulary Segmentation and Detection

Hao Zhang*, Feng Li*, Xueyan Zou, Shilong Liu, Chunyuan Li, Jianfeng Gao, Jianwei Yang†, Lei Zhang†

ICCV 2023


13. Generalized Decoding for Pixel, Image and Language

Xueyan Zou*, Zi-Yi Dou*, Jianwei Yang*, Zhe Gan, Linjie Li, Chunyuan Li, Xiyang Dai, Harkirat Behl, Jianfeng Wang, Lu Yuan, Nanyun Peng, Lijuan Wang, Yong Jae Lee†, Jianfeng Gao†

CVPR 2023, Poster and Demo


14. Extension of “Delving deeper into network anti-aliasing in ConvNets”

Xueyan Zou, Fanyi Xiao, Zhiding Yu, Yuheng Li, Yong Jae Lee

IJCV, Special Issue


15. Progressive Temporal Feature Alignment Network for Video Inpainting

Xueyan Zou, Linjie Yang, Ding Liu, Yong Jae Lee

CVPR 2021


16. Delving deeper into network anti-aliasing in ConvNets

Xueyan Zou, Fanyi Xiao, Zhiding Yu, Yong Jae Lee

BMVC 2020, Best Paper Award

Lab Members

News