PI: Yinpeng DONG
Research Directions: AI theory, machine learning, safety and alignment of large models, generative AI, etc.
The Tsinghua Safe and Trustworthy AI Research (T-STAR) Lab works toward an in-depth, scientific understanding of current AI models and develops novel theories and methods to improve the generalization, safety, reliability, efficiency, and trustworthiness of machine learning approaches, especially generative AI.
▪ STAIR: Improving Safety Alignment with Introspective Reasoning (Oral, Accept rate ~1%)
Yichi Zhang, Siyuan Zhang, Yao Huang, Zeyu Xia, Zhengwei Fang, Xiao Yang, Ranjie Duan, Dong Yan, Yinpeng Dong#, and Jun Zhu
International Conference on Machine Learning (ICML), 2025
▪ MultiTrust: A Comprehensive Benchmark Towards Trustworthy Multimodal Large Language Models
Yichi Zhang, Yao Huang, Yitong Sun, Chang Liu, Zhe Zhao, Zhengwei Fang, Yifan Wang, Huanran Chen, Xiao Yang, Xingxing Wei, Hang Su, Yinpeng Dong#, and Jun Zhu
Advances in Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track, 2024
▪ A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking
Chang Liu*, Yinpeng Dong*, Wenzhao Xiang, Xiao Yang, Hang Su, Jun Zhu, Yuefeng Chen, Yuan He, Hui Xue, and Shibao Zheng
International Journal of Computer Vision (IJCV), 2024
▪ Exploring the Transferability of Visual Prompting for Multimodal Large Language Models (Highlight, Accept rate ~2.8%)
Yichi Zhang, Yinpeng Dong#, Siyuan Zhang, Tianzan Min, Hang Su, and Jun Zhu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
▪ Query-Efficient Black-box Adversarial Attacks Guided by a Transfer-based Prior
Yinpeng Dong, Shuyu Cheng, Tianyu Pang, Hang Su, and Jun Zhu
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
▪ Benchmarking Adversarial Robustness on Image Classification (Oral)
Yinpeng Dong, Qi-An Fu, Xiao Yang, Tianyu Pang, Hang Su, Zihao Xiao, and Jun Zhu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020
▪ Evading Defenses to Transferable Adversarial Examples by Translation-Invariant Attacks (Oral)
Yinpeng Dong, Tianyu Pang, Hang Su, and Jun Zhu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019
▪ Boosting Adversarial Attacks with Momentum (Spotlight)
Yinpeng Dong, Fangzhou Liao, Tianyu Pang, Hang Su, Jun Zhu, Xiaolin Hu, and Jianguo Li
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018