Selected Publications
Data Intelligence
- Junyan Ye*, Baichuan Zhou*, Zilong Huang*, Junan Zhang*, Tianyi Bai*, Hengrui Kang, Jun He, Honglin Lin, Zihao Wang, Tong Wu, Zhizheng Wu, Yiping Chen, Dahua Lin, Conghui He†, Weijia Li†. LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models ICLR 2025 Spotlight
- Chi Zhang*, Huaping Zhong*, Kuan Zhang, Chengliang Chai†, Rui Wang, Xinlin Zhuang, Tianyi Bai, Jiantao Qiu, Lei Cao, Ju Fan, Ye Yuan, Guoren Wang, Conghui He†. Harnessing Diversity for Important Data Selection in Pretraining Large Language Models ICLR 2025 Spotlight
- Linke Ouyang*, Yuan Qu*, Hongbin Zhou*, Jiawei Zhu*, Rui Zhang*, Qunshu Lin*, Bin Wang*, Zhiyuan Zhao, Man Jiang, Xiaomeng Zhao, Jin Shi, Fan Wu, Pei Chu, Minghao Liu, Zhenxiang Li, Chao Xu, Bo Zhang, Botian Shi, Zhongying Tu, Conghui He†. OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations CVPR 2025
- Bin Wang*, Fan Wu*, Linke Ouyang*, Zhuangcheng Gu, Rui Zhang, Renqiu Xia, Bo Zhang, Conghui He†. Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching CVPR 2025
- Qingyun Li*, Zhe Chen*, Weiyun Wang*, Wenhai Wang*, Shenglong Ye*, Zhenjiang Jin*, Guanzhou Chen*, Yinan He*, Zhangwei Gao*, Erfei Cui*, Jiashuo Yu*, Hao Tian*, Jiasheng Zhou*, Chao Xu*, Bin Wang*, Xingjian Wei*, Wei Li*, Wenjian Zhang*, Bo Zhang*, Pinlong Cai*, Licheng Wen*, Xiangchao Yan*, Zhenxiang Li*, Pei Chu*, Yi Wang*, Min Dou, Changyao Tian, Xizhou Zhu, Lewei Lu, Yushi Chen, Junjun He, Zhongying Tu*, Tong Lu, Yali Wang, Limin Wang, Dahua Lin, Yu Qiao, Botian Shi, Conghui He†, Jifeng Dai†. OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text ICLR 2025
- Xinlin Zhuang*, Jiahui Peng*, Ren Ma*, Yinfan Wang, Tianyi Bai, Xingjian Wei, Jiantao Qiu, Chi Zhang, Ying Qian, Conghui He†. Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models arXiv 2025
- Siwei Wen*, Junyan Ye*, Peilin Feng, Hengrui Kang, Zichen Wen, Yize Chen, Jiang Wu, Wenjun Wu, Conghui He, Weijia Li†. Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation arXiv 2025
- Hengrui Kang*, Siwei Wen*, Zichen Wen*, Junyan Ye, Weijia Li†, Peilin Feng, Baichuan Zhou, Bin Wang, Dahua Lin, Linfeng Zhang, Conghui He†. LEGION: Learning to Ground and Explain for Synthetic Image Detection arXiv 2025
- Tianyi Bai, Ling Yang, Zhen Hao Wong, Jiahui Peng, Xinlin Zhuang, Chi Zhang, Lijun Wu, Jiantao Qiu†, Wentao Zhang†, Binhang Yuan, Conghui He†. Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining arXiv 2024
- Zhiyuan Zhao*, Hengrui Kang*, Bin Wang, Conghui He†. DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception arXiv 2024
Large Language Models and Multimodal LLMs
- Jiaxing Sun*, Weiquan Huang*, Jiang Wu*, Chenya Gu, Wei Li, Songyang Zhang, Hang Yan, Conghui He†. Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations ACL 2024
- Yiqi Lin*, Conghui He*†, Alex Jinpeng Wang*, Bin Wang*, Weijia Li, Mike Zheng Shou. Parrot Captions Teach CLIP to Spot Text ECCV 2024
- Bin Wang*, Fan Wu*, Xiao Han*, Jiahui Peng*, Huaping Zhong*, Pan Zhang, Xiaoyi Dong, Weijia Li, Wei Li, Jiaqi Wang, Conghui He†. VIGC: Visual Instruction Generation and Correction AAAI 2024
- Runchuan Zhu*, Zhipeng Ma*, Jiang Wu*, Junyuan Gao, Jiaqi Wang, Dahua Lin, Conghui He†. Utilize the Flow Before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning AAAI 2025
- Baichuan Zhou*, Haote Yang*, Dairong Chen*, Junyan Ye*, Tianyi Bai, Jinhua Yu, Songyang Zhang, Dahua Lin, Conghui He†, Weijia Li†. UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios AAAI 2025
- Runchuan Zhu*, Zinco Jiang*, Jiang Wu*, Zhipeng Ma, Jiahe Song, Fengshuo Bai, Dahua Lin, Lijun Wu, Conghui He†. GRAIT: Gradient-Driven Refusal-Aware Instruction Tuning for Effective Hallucination Mitigation NAACL 2025 Findings
AI for Science
- Qizhi Pei, Rui Yan†, Kaiyuan Gao, Jinhua Zhu, Lijun Wu†. 3D-MolT5: Towards Unified 3D Molecule-Text Modeling with 3D Molecular Tokenization ICLR 2025
- Zizhuo Zhang, Lijun Wu†, Kaiyuan Gao, Jiangchao Yao, Tao Qin, Bo Han†. Fast and Accurate Blind Flexible Docking ICLR 2025
- Chao Pang*, Xingxing Weng*, Jiang Wu*, Jiayu Li, Yi Liu, Jiaxing Sun, Weijia Li, Shuai Wang, Litong Feng, Gui-Song Xia†, Conghui He†. VHM: Versatile and Honest Vision Language Model for Remote Sensing Image Analysis AAAI 2025
- Junyan Ye, Qiyan Luo, Jinhua Yu, Huaping Zhong, Zhimeng Zheng, Conghui He, Weijia Li†. SG-BEV: Satellite-Guided BEV Fusion for Cross-View Semantic Segmentation CVPR 2024 Highlight
- Weijia Li*, Haote Yang*, Zhenghao Hu, Juepeng Zheng, Gui-Song Xia, Conghui He†. 3D Building Reconstruction from Monocular Remote Sensing Images with Multi-level Supervisions CVPR 2024
- Junyan Ye, Zhutao Lv, Weijia Li†, Jinhua Yu, Haote Yang, Huaping Zhong, Conghui He†. Cross-view Image Geo-localization with Panorama-BEV Co-Retrieval Network ECCV 2024
- Weijia Li, Yawen Lai, Linning Xu, Yuanbo Xiangli, Jinhua Yu, Conghui He†, Gui-Song Xia†, Dahua Lin. OmniCity: Omnipotent City Understanding with Multi-level and Multi-view Images CVPR 2023