Research

MEOW LAB (Modeling Egocentric Omniscient Worlds)

PI: Miao Liu

Research Direction: Egocentric Vision & Multimodal Generative AI

Research Group Introduction

Our laboratory focuses on Egocentric Vision and Multimodal Generative AI. We are committed to:

- Designing AI that sees through your eyes, learns your skills, and understands your intentions: the next generation of human-centric intelligent systems that can see what you see, learn what you know, and understand what you think.

I am currently a Senior Research Scientist at Meta GenAI. I received my Ph.D. from the Georgia Institute of Technology and was selected as a Young Talent in the National High-Level Talent Program (Overseas). My work focuses on the foundations of egocentric vision and multimodal generative AI. As a primary author, I have published over 20 papers in top conferences and journals such as CVPR, ECCV, ACL, and TPAMI. My research has been cited more than 9,000 times, with eight papers selected for oral presentation. My papers at CVPR 2022 and ECCV 2024 were shortlisted for the Best Paper Award, and my BMVC 2022 paper received the Best Student Paper Award.

As a primary author, I have contributed to the construction of several widely recognized egocentric video datasets, including Ego4D, EgoExo4D, EGTEA Gaze+, and the Behavior Vision Suite, all of which have been widely adopted in both academia and industry. In first-person behavior understanding, I have proposed several action recognition and anticipation algorithms, which will be deployed in the next generation of smart glasses at Meta Reality Labs. Additionally, during my tenure at Meta GenAI, I have been deeply involved in the training and evaluation of multimodal generative models, including EMU, Llama 3, and Llama 4 (the multimodal components only).

Representative Publications

1. Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs

Zeyi Huang, Yuyang Ji, Xiaofang Wang, Nikhil Mehta, Tong Xiao, Donghyun Lee, Sigmund Vanvalkenburgh, Shengxin Zha, Bolin Lai, Licheng Yu, Ning Zhang, Yong Jae Lee*, Miao Liu* (* Co-corresponding author)

Conference on Computer Vision and Pattern Recognition (CVPR), 2025


2. LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning

Bolin Lai, Xiaoliang Dai, Lawrence Chen, Guan Pang, James M. Rehg, Miao Liu

European Conference on Computer Vision (ECCV), 2024 (Oral, Best Paper Award Candidate, 15/8585)


3. Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation

Bolin Lai, Fiona Ryan, Wenqi Jia, Miao Liu*, James M. Rehg* (* Co-corresponding author)

European Conference on Computer Vision (ECCV), 2024


4. In the Eye of Transformer: Global-Local Correlation for Egocentric Gaze Estimation

Bolin Lai, Miao Liu*, Fiona Ryan, James M. Rehg (* Co-corresponding author)

British Machine Vision Conference (BMVC), 2022 (Spotlight, Best Student Paper, 2/770)


5. Egocentric Activity Recognition and Localization on a 3D Map

Miao Liu, Lingni Ma, Kiran Somasundaram, Yin Li, Kristen Grauman, James M. Rehg, Chao Li

European Conference on Computer Vision (ECCV), 2022 (Spotlight presentation at the 2nd International Ego4D Workshop@ECCV 2022)


6. Generative Adversarial Network for Future Hand Segmentation from Egocentric Video

Wenqi Jia*, Miao Liu*, James M. Rehg (* Co-first author)

European Conference on Computer Vision (ECCV), 2022


7. Ego4D: Around the World in 3,000 Hours of Egocentric Video

Kristen Grauman, Andrew Westbury, Eugene Byrne*, Zachary Chavis*, Antonino Furnari*, Rohit Girdhar*, Jackson Hamburger*, Hao Jiang*, Miao Liu*, Xingyu Liu*, Miguel Martin*, Tushar Nagarajan*, ... , Jitendra Malik (* Co-first author (student))

Conference on Computer Vision and Pattern Recognition (CVPR), 2022 (Oral, Best Paper Finalist, 33/8161)


8. Attention Distillation for Learning Video Representations

Miao Liu, Xin Chen, Yun Zhang, Yin Li, James M. Rehg

British Machine Vision Conference (BMVC), 2020 (Oral, top 5.0%)


9. Forecasting Human Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video

Miao Liu, Siyu Tang, Yin Li, James M. Rehg

European Conference on Computer Vision (ECCV), 2020 (Oral, top 2.0%)

Group Members

News & Updates
