Selected Publications


Data Intelligence

BLINK-Twice: You see, but do you observe? A Reasoning Benchmark on Visual Perception
Junyan Ye, Dongzhi Jiang, Jun He, Baichuan Zhou, Zilong Huang, Zhiyuan Yan, Hongsheng Li, Conghui He, Weijia Li
NeurIPS 2025
Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning
Honglin Lin*, Qizhi Pei*, Xin Gao, Zhuoshi Pan, Yu Li, Juntao Li, Conghui He, Lijun Wu
NeurIPS 2025
MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer
Honglin Lin, Zhuoshi Pan, Yu Li, Qizhi Pei, Xin Gao, Mengzhang Cai, Conghui He, Lijun Wu
EMNLP 2025
Middo: Model-Informed Dynamic Data Optimization for Enhanced LLM Fine-Tuning via Closed-Loop Learning
Zinan Tang, Xin Gao, Qizhi Pei, Zhuoshi Pan, Mengzhang Cai, Jiang Wu, Conghui He, Lijun Wu
EMNLP 2025
Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models
Xinlin Zhuang*, Jiahui Peng*, Ren Ma*, Yinfan Wang, Tianyi Bai, Xingjian Wei, Jiantao Qiu, Chi Zhang, Ying Qian, Conghui He
ACL 2025
OpenHuEval: Evaluating Large Language Model on Hungarian Specifics
Haote Yang*, Xingjian Wei*, Jiang Wu*, Noémi Ligeti-Nagy*, Jiaxing Sun*, Yinfan Wang*, Zijian Győző Yang*, Junyuan Gao*, Jingchao Wang*, Bowen Jiang*, Shasha Wang, Nanjun Yu, Zihao Zhang, Shixin Hong, Hongwei Liu, Wei Li, Songyang Zhang, Dahua Lin, Lijun Wu, Gábor Prószéky, Conghui He
ACL 2025
A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis
Xin Gao, Qizhi Pei, Zinan Tang, Yu Li, Honglin Lin, Jiang Wu, Lijun Wu, Conghui He
ACL 2025
CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenge
Yu Li, Qizhi Pei, Mengyuan Sun, Honglin Lin, Chenlin Ming, Xin Gao, Jiang Wu, Conghui He
ACL 2025
Large Language Models Meet Symbolic Provers for Logical Reasoning Evaluation
Chengwen Qi*, Ren Ma*, Bowen Li*, He Du, Binyuan Hui, Jinwang Wu, Yuanjun Laili, Conghui He
ICLR 2025
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models
Junyan Ye*, Baichuan Zhou*, Zilong Huang*, Junan Zhang*, Tianyi Bai*, Hengrui Kang, Jun He, Honglin Lin, Zihao Wang, Tong Wu, Zhizheng Wu, Yiping Chen, Dahua Lin, Conghui He, Weijia Li
ICLR 2025 (Spotlight)
Harnessing Diversity for Important Data Selection in Pretraining Large Language Models
Chi Zhang*, Huaping Zhong*, Kuan Zhang, Chengliang Chai, Rui Wang, Xinlin Zhuang, Tianyi Bai, Jiantao Qiu, Lei Cao, Ju Fan, Ye Yuan, Guoren Wang, Conghui He
ICLR 2025 (Spotlight)
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations
Linke Ouyang*, Yuan Qu*, Hongbin Zhou*, Jiawei Zhu*, Rui Zhang*, Qunshu Lin*, Bin Wang*, Zhiyuan Zhao, Man Jiang, Xiaomeng Zhao, Jin Shi, Fan Wu, Pei Chu, Minghao Liu, Zhenxiang Li, Chao Xu, Bo Zhang, Botian Shi, Zhongying Tu, Conghui He
CVPR 2025
Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching
Bin Wang*, Fan Wu*, Linke Ouyang*, Zhuangcheng Gu, Rui Zhang, Renqiu Xia, Bo Zhang, Conghui He
CVPR 2025
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Qingyun Li*, Zhe Chen*, Weiyun Wang*, Wenhai Wang*, Shenglong Ye*, Zhenjiang Jin*, Guanzhou Chen*, Yinan He*, Zhangwei Gao*, Erfei Cui*, Jiashuo Yu*, Hao Tian*, Jiasheng Zhou*, Chao Xu*, Bin Wang*, Xingjian Wei*, Wei Li*, Wenjian Zhang*, Bo Zhang*, Pinlong Cai*, Licheng Wen*, Xiangchao Yan*, Zhenxiang Li*, Pei Chu*, Yi Wang*, Min Dou, Changyao Tian, Xizhou Zhu, Lewei Lu, Yushi Chen, Junjun He, Zhongying Tu*, Tong Lu, Yali Wang, Limin Wang, Dahua Lin, Yu Qiao, Botian Shi, Conghui He, Jifeng Dai
ICLR 2025
Spot the fake: Large multimodal model-based synthetic image detection with artifact explanation
Siwei Wen*, Junyan Ye*, Peilin Feng, Hengrui Kang, Zichen Wen, Yize Chen, Jiang Wu, Wenjun Wu, Conghui He, Weijia Li
arXiv 2025
LEGION: Learning to Ground and Explain for Synthetic Image Detection
Hengrui Kang*, Siwei Wen*, Zichen Wen*, Junyan Ye, Weijia Li, Peilin Feng, Baichuan Zhou, Bin Wang, Dahua Lin, Linfeng Zhang, Conghui He
arXiv 2025
Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining
Tianyi Bai, Ling Yang, Zhen Hao Wong, Jiahui Peng, Xinlin Zhuang, Chi Zhang, Lijun Wu, Jiantao Qiu, Wentao Zhang, Binhang Yuan, Conghui He
arXiv 2024
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
Zhiyuan Zhao*, Hengrui Kang*, Bin Wang, Conghui He
arXiv 2024

Large Language Models and Multimodal LLMs

Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning
Tianyi Bai*, Yuxuan Fan*, Jiantao Qiu, Fupeng Sun, Jiayi Song, Junlin Han, Zichen Liu, Conghui He, Wentao Zhang, Binhang Yuan
NeurIPS 2025
Multi-step Visual Reasoning with Visual Tokens Scaling and Verification
Tianyi Bai*, Zengjie Hu*, Fupeng Sun*, Jiantao Qiu, Yizhen Jiang, Guangxin He, Bohan Zeng, Conghui He, Binhang Yuan, Wentao Zhang
NeurIPS 2025
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation
Zichen Wen, Shaobo Wang, Yufa Zhou, Junyuan Zhang, Qintong Zhang, Yifeng Gao, Zhaorun Chen, Bin Wang, Weijia Li, Conghui He
NeurIPS 2025
Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More
Zichen Wen*, Yifeng Gao, Shaobo Wang, Junyuan Zhang, Qintong Zhang, Weijia Li, Conghui He, Linfeng Zhang
EMNLP 2025
Token Pruning in Multimodal Large Language Models: Are We Solving the Right Problem?
Zichen Wen*, Yifeng Gao*, Weijia Li, Conghui He, Linfeng Zhang
ACL 2025
LEMMA: Learning from Errors for MatheMatical Advancement in LLMs
Zhuoshi Pan, Yu Li, Honglin Lin, Qizhi Pei, Zinan Tang, Wei Wu, Chenlin Ming, H. Vicky Zhao, Conghui He, Lijun Wu
ACL 2025
MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion
Qizhi Pei, Lijun Wu, Zhuoshi Pan, Yu Li, Honglin Lin, Chenlin Ming, Xin Gao, Conghui He, Rui Yan
ACL 2025
Harnessing Diversity for Important Data Selection in Pretraining Large Language Models
Chi Zhang*, Huaping Zhong*, Kuan Zhang, Chengliang Chai, Rui Wang, Xinlin Zhuang, Tianyi Bai, Jiantao Qiu, Lei Cao, Ju Fan, Ye Yuan, Guoren Wang, Conghui He
ICLR 2025
Leveraging BEV Paradigm for Ground-to-Aerial Image Synthesis
Junyan Ye*, Jun He*, Weijia Li, Zhutao Lv, Yi Lin, Jinhua Yu, Haote Yang, Conghui He
ICCV 2025
Where am I? Cross-View Geo-localization with Natural Language Descriptions
Junyan Ye, Honglin Lin, Leyan Ou, Dairong Chen, Zihao Wang, Qi Zhu, Conghui He, Weijia Li
ICCV 2025
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation
Junyuan Zhang*, Qintong Zhang*, Bin Wang*, Linke Ouyang, Zichen Wen, Ying Li, Ka-Ho Chow, Conghui He, Wentao Zhang
ICCV 2025
MLLM-DataEngine: An Iterative Refinement Approach for MLLM
Zhiyuan Zhao*, Linke Ouyang*, Bin Wang*, Siyuan Huang, Pan Zhang, Xiaoyi Dong, Jiaqi Wang, Conghui He
ICME 2025
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
Zhiyuan Zhao*, Bin Wang*, Linke Ouyang*, Xiaoyi Dong, Jiaqi Wang, Songyang Zhang, Hang Yan, Conghui He
ICME 2025
Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations
Jiaxing Sun*, Weiquan Huang*, Jiang Wu*, Chenya Gu, Wei Li, Songyang Zhang, Hang Yan, Conghui He
ACL 2024
Parrot Captions Teach CLIP to Spot Text
Yiqi Lin*, Conghui He*, Alex Jinpeng Wang*, Bin Wang*, Weijia Li, Mike Zheng Shou
ECCV 2024
VIGC: Visual Instruction Generation and Correction
Bin Wang*, Fan Wu*, Xiao Han*, Jiahui Peng*, Huaping Zhong*, Pan Zhang, Xiaoyi Dong, Weijia Li, Wei Li, Jiaqi Wang, Conghui He
AAAI 2024
Utilize the Flow Before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning
Runchuan Zhu*, Zhipeng Ma*, Jiang Wu*, Junyuan Gao, Jiaqi Wang, Dahua Lin, Conghui He
AAAI 2025
UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios
Baichuan Zhou*, Haote Yang*, Dairong Chen*, Junyan Ye*, Tianyi Bai, Jinhua Yu, Songyang Zhang, Dahua Lin, Conghui He, Weijia Li
AAAI 2025
GRAIT: Gradient-Driven Refusal-Aware Instruction Tuning for Effective Hallucination Mitigation
Runchuan Zhu*, Zinco Jiang*, Jiang Wu*, Zhipeng Ma, Jiahe Song, Fengshuo Bai, Dahua Lin, Lijun Wu, Conghui He
NAACL 2025 (Findings)

AI for Science

3D-MolT5: Towards Unified 3D Molecule-Text Modeling with 3D Molecular Tokenization
Qizhi Pei, Rui Yan, Kaiyuan Gao, Jinhua Zhu, Lijun Wu
ICLR 2025
Fast and Accurate Blind Flexible Docking
Zizhuo Zhang, Lijun Wu, Kaiyuan Gao, Jiangchao Yao, Tao Qin, Bo Han
ICLR 2025
VHM: Versatile and Honest Vision Language Model for Remote Sensing Image Analysis
Chao Pang*, Xingxing Weng*, Jiang Wu*, Jiayu Li, Yi Liu, Jiaxing Sun, Weijia Li, Shuai Wang, Litong Feng, Gui-Song Xia, Conghui He
AAAI 2025
SG-BEV: Satellite-Guided BEV Fusion for Cross-View Semantic Segmentation
Junyan Ye, Qiyan Luo, Jinhua Yu, Huaping Zhong, Zhimeng Zheng, Conghui He, Weijia Li
CVPR 2024 (Highlight)
3D Building Reconstruction from Monocular Remote Sensing Images with Multi-level Supervisions
Weijia Li*, Haote Yang*, Zhenghao Hu, Juepeng Zheng, Gui-Song Xia, Conghui He
CVPR 2024
Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network
Junyan Ye, Zhutao Lv, Weijia Li, Jinhua Yu, Haote Yang, Huaping Zhong, Conghui He
ECCV 2024
OmniCity: Omnipotent City Understanding with Multi-level and Multi-view Images
Weijia Li, Yawen Lai, Linning Xu, Yuanbo Xiangli, Jinhua Yu, Conghui He, Gui-Song Xia, Dahua Lin
CVPR 2023

Note: * denotes equal contribution, † denotes corresponding author