Assistant Professor (incoming)
Research focuses on multimodal large models and natural language processing.
2018: Bachelor's degree from the Department of Computer Science and Technology, Tsinghua University.
2023: Ph.D. from the Department of Computer Science and Technology, Tsinghua University.
September 2023-Present: Postdoctoral Research Fellow at the School of Computing, National University of Singapore.
By improving model architectures, training methods, and data construction, we have significantly enhanced the scalability of multimodal large models. We led the development of the MiniCPM series of efficient, lightweight, high-performance multimodal large models. These models were selected for the Hugging Face 2024 list of most popular and downloaded models, and were recognized among the Top 10 Major Scientific and Technological Achievements at the Zhongguancun Forum Annual Conference.
We proposed a unified encoding framework for high-resolution images, multiple images, and videos, enabling efficient modeling and knowledge transfer of high-definition visual content.
We introduced a fine-grained human feedback learning method for multimodal models, which significantly reduces hallucinations in generated content and improves model reliability.
We developed a large biomedical multimodal model for scientific research, enabling for the first time deep interaction between molecular structures and natural language.
We have published over 30 papers in top-tier conferences and journals such as CVPR, ACL, ICLR, and *Nature Communications*, with multiple works selected as conference highlights or journal featured articles. Our work has garnered over 6,500 citations on Google Scholar, 36,000+ stars on GitHub, and over 10 million downloads of our open-source models. We have received honors including the CAAI Outstanding Doctoral Dissertation Award, Intel China Academic Achievement Award, and WAIC Yunfan Award.