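PI: Xueyan ZOU
Research areas: multimodal large models, embodied intelligence, and spatial intelligence within artificial intelligence
The group conducts research on multimodal large models, embodied intelligence, and spatial intelligence, and aims to advance the application and optimization of foundation models in intelligent perception, robot navigation, and intelligent agent systems. On the research side, the group's PI led the development of a series of general-purpose multimodal models, including X-Decoder, SEEM, and FIND, which have had some influence in image understanding, and has further applied multimodal understanding to embodied and spatial intelligence tasks. This line of work has been published at top computer vision conferences (CVPR/ICCV/ECCV), top robotics conferences (RSS/CoRL/ICRA), and top machine learning conferences (NeurIPS/ICLR). The PI received the BMVC 2020 Best Paper Award and a Microsoft foundation model research grant, among other honors.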
1. Dex1B: Learning with 1B Demonstrations for Dexterous Manipulation
Jianglong Ye, Keyi Wang, Chengjing Yuan, Ruihan Yang, Yiquan Li, Jiyue Zhu, Yuzhe Qin, Xueyan Zou, Xiaolong Wang
RSS 2025
2. NaViLA: Legged Robot Vision-Language-Action Model for Navigation
An-Chieh Cheng*, Yandong Ji*, Zhaojing Yang*, Zaitian Gongye, Xueyan Zou, Jan Kautz, Erdem Bıyık, Hongxu Yin, Sifei Liu, Xiaolong Wang
RSS 2025
3. WildLMa: Long Horizon Loco-Manipulation in the Wild
Ri-Zhao Qiu*, Yuchen Song*, Xuanbin Peng*, Sai Aneesh Suryadevara, Ge Yang, Minghuan Liu, Mazeyu Ji, Chengzhe Jia, Ruihan Yang, Xueyan Zou, Xiaolong Wang
ICRA 2025
4. 3D-Spatial Multimodal Memory
Xueyan Zou, Yuchen Song, Ri-Zhao Qiu, Xuanbin Peng, Jianglong Ye, Sifei Liu, Xiaolong Wang
ICLR 2025
5. What can Foundation Models’ Embeddings do?
Xueyan Zou, Linjie Li, Jianfeng Wang, Jianwei Yang, Mingyu Ding, Junyi Wei, Feng Li, Hao Zhang, Shilong Liu, Arul Aravinthan, Yong Jae Lee†, Lijuan Wang†
NeurIPS 2024
6. GraspSplats: Efficient Manipulation with 3D Feature Splatting
Mazeyu Ji*, Ri-Zhao Qiu*, Xueyan Zou, Xiaolong Wang
CoRL 2024
7. Semantic-SAM: Segment and Recognize Anything at Any Granularity
Feng Li*, Hao Zhang*, Peize Sun, Xueyan Zou, Shilong Liu, Jianwei Yang, Chunyuan Li, Lei Zhang†, Jianfeng Gao†
ECCV 2024
8. LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models
Hao Zhang, Hongyang Li, Feng Li, Tianhe Ren, Xueyan Zou, Shilong Liu, Shijia Huang, Jianfeng Gao, Lei Zhang, Chunyuan Li, Jianwei Yang
ECCV 2024
9. LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
Shilong Liu, Hao Cheng, Haotian Liu, Hao Zhang, Feng Li, Tianhe Ren, Xueyan Zou, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang, Jianfeng Gao, Chunyuan Li
ECCV 2024
10. Visual In-Context Prompting
Feng Li, Qing Jiang, Hao Zhang, Tianhe Ren, Shilong Liu, Xueyan Zou, Huaizhe Xu, Hongyang Li, Chunyuan Li, Jianwei Yang, Lei Zhang, Jianfeng Gao
CVPR 2024
11. Segment Everything Everywhere All at Once
Xueyan Zou*, Jianwei Yang*, Hao Zhang*, Feng Li*, Linjie Li, Jianfeng Wang, Lijuan Wang, Jianfeng Gao†, Yong Jae Lee†
NeurIPS 2023
12. A Simple Framework for Open-Vocabulary Segmentation and Detection
Hao Zhang*, Feng Li*, Xueyan Zou, Shilong Liu, Chunyuan Li, Jianfeng Gao, Jianwei Yang†, Lei Zhang†
ICCV 2023
13. Generalized Decoding for Pixel, Image and Language
Xueyan Zou*, Zi-Yi Dou*, Jianwei Yang*, Zhe Gan, Linjie Li, Chunyuan Li, Xiyang Dai, Harkirat Behl, Jianfeng Wang, Lu Yuan, Nanyun Peng, Lijuan Wang, Yong Jae Lee†, Jianfeng Gao†
CVPR 2023, Poster and Demo
14. Extension of “Delving deeper into network anti-aliasing in ConvNets”
Xueyan Zou, Fanyi Xiao, Zhiding Yu, Yuheng Li, Yong Jae Lee
IJCV, Special Issue
15. Progressive Temporal Feature Alignment Network for Video Inpainting
Xueyan Zou, Linjie Yang, Ding Liu, Yong Jae Lee
CVPR 2021
16. Delving deeper into network anti-aliasing in ConvNets
Xueyan Zou, Fanyi Xiao, Zhiding Yu, Yong Jae Lee
BMVC 2020, Best Paper Award