Faculty & Researchers
Conghui He (何聪辉)
Young Scientist and Principal Investigator (PI) at Shanghai Artificial Intelligence Laboratory
Previously a Senior Researcher at WeChat, where he developed the Plato graph computing framework.
Ph.D. from Tsinghua University (2013-2018), B.S. from Sun Yat-sen University (2009-2013).
Email: heconghui@pjlab.org.cn
Research Interests
High-Performance Computing, Computer Vision, Large Language Models, Data-Centric AI, Pre-training Data Preparation, Multimodal Learning
Awards
- 2023 SenseTime Award (Top 1 team out of 100)
- 2021 SenseTime Outstanding Team Award (Top 10 teams out of 200)
- 2019 Tencent Technology Breakthrough Award - Gold (Top 1 team out of 50)
- 2018 Outstanding Doctoral Graduate Award
- 2017 ACM Gordon Bell Prize (Highest honor in HPC applications)
- 2013 IEEE-IBM Smarter Planet Challenge Global Winner (Team Leader, 1/54)
Key Research, Projects & Reports
- OpenDataLab: An open platform with 7700+ datasets, serving 40k+ developers.
- MinerU: A one-stop, open-source, high-quality data extraction tool for PDF, web, and e-books.
- InternLM: A series of 7B and 20B foundation and chat models.
- PDF-Extract-Kit: A comprehensive library for high-quality PDF content extraction.
- Report: "对话上海AI Lab何聪辉:从DeepSeek看数据的重要性,低成本实现"四两拨千斤""
- Report: "中国超算再获"戈登贝尔奖",成果对地震预测研究有借鉴意义"
Selected Publications
- Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching, CVPR 2025
- OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations, CVPR 2025
- GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training, ICLR 2025
- OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text, ICLR 2025
- MMBench: Is Your Multi-modal Model an All-around Player? ECCV 2024
Lijun Wu (吴郦军)
Young Scientist, Shanghai Artificial Intelligence Laboratory.
Formerly a Research Scientist at ByteDance and a Senior Researcher at Microsoft Research Asia (MSRA).
Email: lijun_wu@outlook.com
Research Interests
Large Language Models (post-training, RLHF), Synthetic Data Optimization, AI4Science (LLM4Science, Drug Discovery)
Awards
- 2013 IEEE-IBM Smarter Planet Challenge Global Winner
- 2018 MSRA Ph.D. Fellowship
- 2019 WMT International Machine Translation Competition - First Place in 8 Tracks
- 2021 OGB-LSC @ KDD Cup - Runner-up
- 2024 ACL Language + Molecules - 1st and 2nd Place in Two Tracks
Key Research, Projects & Reports
- Report: "2018全球首个人工媲美的中英机器翻译系统"
- Report: "A study of reinforcement learning for neural machine translation"
- Report: "WMT 2019国际机器翻译大赛:微软亚洲研究院以8项第一成为冠军"
- Report: "R-Drop: A simple and effective regular method to correct the defects of Dropout"
- Report: "非自回归生成研究最新综述,近200篇文献揭示挑战和未来方向"
- Report: "230页长文,涵盖5大科学领域,微软团队使用GPT-4探索LLM对科学发现的影响"
- Report: "加速药物发现:基于生成式AI的靶点感知分子生成器TamGen"
- Report: "NatureLM:驱动科学发现与创新的跨领域AI大模型"
Selected Publications
- Nature Language Model: Deciphering the Language of Nature for Scientific Discovery, arXiv 2025
- 3D-MolT5: Towards Unified 3D Molecule-Text Modeling with 3D Molecular Tokenization, ICLR 2025
- FABind+: Enhancing Molecular Docking through Improved Pocket Prediction and Pose Generation, KDD 2025
- Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey, arXiv 2024
- Target-aware Molecule Generation for Drug Design Using a Chemical Language Model, Nature Communications, 2024
Bin Wang (王斌)
Young Researcher, Shanghai Artificial Intelligence Laboratory.
Ph.D. from University of Chinese Academy of Sciences (UCAS Scholar).
Algorithm Lead for MinerU project.
Email: wangbin@pjlab.org.cn
Research Interests
Intelligent Document Parsing and Understanding, Data Autonomous Iteration Agents, Multimodal Large Models
Awards
- ImageNet Large Scale Visual Recognition Challenge (ILSVRC2016 VID) - 3rd Place Globally
- UCAS Zhu Li Yuehua Excellent Doctoral Scholarship
Key Research & Projects
- MinerU: https://github.com/opendatalab/MinerU
- PDF-Extract-Kit: https://github.com/opendatalab/PDF-Extract-Kit
- DocLayout-YOLO: https://github.com/opendatalab/DocLayout-YOLO
- OmniDocBench: https://github.com/opendatalab/OmniDocBench
Research Focus Areas
- Intelligent Document Parsing & Understanding: Developing practical algorithms for layout detection, table recognition, chemical element recognition, geometric parsing, etc., for RAG and AI4S.
- Multimodal Large Models: Focusing on vertical domain multimodal large models using data-centric algorithms, generative models, and reinforcement learning to address OOD problems.
- Data Autonomous Iteration Agents: Using agent technology to automate data iteration processes (quality improvement, distribution balancing, safety validation) for efficient AI model training.
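As a purely hypothetical illustration (not code from the lab), one round of such a data-iteration loop can be sketched as quality scoring, distribution balancing, and safety validation applied in sequence; every function name and threshold below is an assumption made for the example.

```python
# Hypothetical sketch of one data-iteration round:
# quality filtering -> source-distribution balancing -> safety validation.
# None of these names correspond to an actual lab codebase.
from collections import defaultdict
from typing import Callable, Dict, List

def iterate_dataset(samples: List[dict],
                    quality_score: Callable[[dict], float],
                    is_safe: Callable[[dict], bool],
                    min_quality: float = 0.5,
                    max_per_source: int = 1000) -> List[dict]:
    """Filter, rebalance, and safety-check samples shaped like {'text': ..., 'source': ...}."""
    # 1) Quality improvement: drop samples below the quality threshold.
    kept = [s for s in samples if quality_score(s) >= min_quality]

    # 2) Distribution balancing: cap how many samples any single source contributes.
    per_source: Dict[str, int] = defaultdict(int)
    balanced = []
    for s in kept:
        if per_source[s["source"]] < max_per_source:
            per_source[s["source"]] += 1
            balanced.append(s)

    # 3) Safety validation: keep only samples that pass the safety check.
    return [s for s in balanced if is_safe(s)]

# Toy usage with stand-in scoring and safety functions.
data = [{"text": "hello world", "source": "web"}, {"text": "spam", "source": "web"}]
clean = iterate_dataset(data,
                        quality_score=lambda s: len(s["text"]) / 20,
                        is_safe=lambda s: "spam" not in s["text"])
print(len(clean))  # 1
```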
Selected Publications
- Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching, CVPR 2025
- OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations, CVPR 2025
- GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training, ICLR 2025
- OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text, ICLR 2025
- MinerU: An Open-Source Solution for Precise Document Content Extraction, arXiv 2024
Jiang Wu (吴江)
Young Researcher, Shanghai Artificial Intelligence Laboratory.
B.S. and Ph.D. from Tsinghua University.
Email: wujiang@pjlab.org.cn
Research Interests
Large Language Models, Multimodal Large Models, Intelligent Document Parsing and Understanding
Awards
- Led the development of an industry-leading satellite imagery analysis system that set new technical benchmarks and has been deployed in multiple satellite and surveying-and-mapping centers.
Selected Publications
- Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations, ACL 2024
- VHM: Versatile and Honest Vision Language Model for Remote Sensing Image Analysis, AAAI 2025
- Utilize the Flow Before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning, AAAI 2025
- GRAIT: Gradient-Driven Refusal-Aware Instruction Tuning for Effective Hallucination Mitigation, NAACL 2025 Findings
- OpenHuEval: Evaluating Large Language Model on Hungarian Specifics, arXiv 2025
Jiantao Qiu (邱剑涛)
Young Researcher, Shanghai Artificial Intelligence Laboratory.
B.S. and Ph.D. in Electronic Engineering from Tsinghua University.
Email: qiujiantao@pjlab.org.cn
Research Interests
Large Language Model Datasets, HTML Document Understanding, Energy-Efficient Neural Network Accelerator Design, Multi-Machine Collaborative System Design
Awards
- AI 2000 Most Influential Scholar Award, Honorable Mention in AAAI/IJCAI (2023, for work on FPGAs) - Top 3 Rising Star in FPGA
Selected Publications
- Going Deeper with Embedded FPGA Platform for Convolutional Neural Network, FPGA 2016

Wentao Zhang (张文涛)
Assistant Professor, Researcher, and Doctoral Supervisor at the International Machine Learning Research Center, Peking University.
Research Consultant at Shanghai Artificial Intelligence Laboratory.
Formerly at Tencent Machine Learning Platform Department, Apple AIML, and Mila - Quebec AI Institute.
Email: zhangwentao1@pjlab.org.cn
Research Interests
Data-centric machine learning and large model data governance
Awards
- WWW'22 Best Student Paper Award (1/1822)
- AP-Web'23 Best Paper Runner Up Award
- CIKM'24 Best Student Full Paper Award (1/1496)
- Apple Scholar (2021, sole recipient in Asia-Pacific)
- World Artificial Intelligence Conference (WAIC) Yunfan Award (1 of 15 globally)
- Excellent Doctoral Dissertation Awards from Peking University, the Beijing Municipality, and the Chinese Association for Artificial Intelligence, 2023
- Peking University "Weiming Young Scholar", 2024
- World Internet Conference Leading Scientific and Technological Achievement Award, 2024
- Huawei Spark Award, 2024
- Chinese Institute of Electronics Science and Technology Progress Award (First Prize), 2023
Key Research, Projects & Reports
- Angel: a high-performance distributed machine learning and graph computing platform, jointly designed by Tencent and PKU.
- SGL: a scalable graph learning toolkit for extremely large graph datasets.
- MindWare: a powerful AutoML system that automates feature engineering, algorithm selection, and hyperparameter tuning.
- OpenBox: an efficient open-source system designed for solving generalized black-box optimization (BBO) problems.
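For orientation, the sketch below shows black-box optimization in the style of OpenBox's published quick-start interface: define a search space, wrap the objective, and let the optimizer search under a fixed evaluation budget. The Branin test function and variable names are illustrative, and the exact return-value key expected from the objective ("objectives") may differ between OpenBox versions.

```python
# Minimal black-box optimization sketch following OpenBox's quick-start style.
import numpy as np
from openbox import Optimizer, space as sp

# Define a search space with two continuous variables.
space = sp.Space()
x1 = sp.Real("x1", -5.0, 10.0, default_value=0.0)
x2 = sp.Real("x2", 0.0, 15.0, default_value=0.0)
space.add_variables([x1, x2])

def branin(config):
    """Objective to minimize; OpenBox passes a sampled configuration."""
    x1, x2 = config["x1"], config["x2"]
    y = ((x2 - 5.1 / (4 * np.pi ** 2) * x1 ** 2 + 5 / np.pi * x1 - 6) ** 2
         + 10 * (1 - 1 / (8 * np.pi)) * np.cos(x1) + 10)
    # Recent OpenBox releases expect a dict carrying a list of objective values.
    return {"objectives": [y]}

# Run Bayesian optimization for a fixed budget of 50 evaluations.
opt = Optimizer(branin, space, max_runs=50, task_id="bbo_demo")
history = opt.run()
print(history)
```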
Selected Publications
- PAS: Data-Efficient Plug-and-Play Prompt Augmentation System, ICDE 2025
- DataSculpt: Crafting Data Landscapes for Long-Context LLMs through Multi-Objective Partitioning, ICDE 2025
- Facilitating Multi-turn Function Calling for LLMs via Compositional Instruction Tuning, ICLR 2025
- Towards Precise Scaling Laws for Video Diffusion Transformers, CVPR 2025
- Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models, NeurIPS 2024
Weijia Li (李唯嘉)
Associate Professor ("Hundred Talents Program"), Sun Yat-sen University.
Research Consultant, Shanghai Artificial Intelligence Laboratory.
Ph.D. from Tsinghua University, Postdoc at MMLab, CUHK.
Email: liweijia@pjlab.org.cn
Research Interests
Multimodal Large Models, Image Generation, Synthetic Data Detection, AI4Earth
Key Research, Projects & Reports
- Report: "GPT-4o图像生成架构被"破解"了?自回归主干+扩散解码器,还有4o图像生成全面测评基准"
- Report: "ICLR 2025 Spotlight |合成数据伪装术 vs 大模型火眼金睛,中大&上海AI Lab提出合成检测基准LOKI"
- Report: "城市复杂环境下具身大模型基准测试!UrBench:综合评估多模态大模型在多视图城市场景中的基准"
- Report: "论文赏读 | CVPR24 | 结合卫星和街景图像实现精细的建筑属性分割,入选Highlight!"
Selected Publications
- LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models (ICLR 2025, Spotlight)
- UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios (AAAI 2025)
- GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT-4o in Image Generation (arXiv 2025)
- LEGION: Learning to Ground and Explain for Synthetic Image Detection (arXiv 2025)
- Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation (arXiv 2025)
Joint Training Students
Hengrui Kang (康恒锐)
Undergraduate Background: Honors College, University of Electronic Science and Technology of China (UESTC)
Joint Ph.D. Program: Shanghai Jiao Tong University (SJTU) & Shanghai AI Laboratory (Currently 1st-year Ph.D. student)
Internship Experience: Joined OpenDataLab as an intern in early December 2023
Key Work: Contributed to the lab's intelligent document parsing work (the MinerU project) and carried out preliminary explorations in Trustworthy AI (synthetic image detection).
Research Interests: Synthetic data detection, intelligent document parsing and generation
Jiahe Song (宋家和)
Current Institution: Third-year Master's student, School of Computer Science, Peking University (PKU)
Joint Ph.D. Program: School of AI, Shanghai Jiao Tong University (SJTU) & Shanghai AI Laboratory (Starting Sep 2025)
Internship Experience: Started internship at Shanghai AI Laboratory (Beijing base) in October 2024 (Advisor: Dr. Jiang Wu)
Key Work: Co-first author of the paper "PM4Bench: A Parallel Multilingual Multi-Modal Multi-task Benchmark for Large Vision Language Model"
Research Interests: Multimodal large models, AI for science
Honglin Lin (林泓霖)
Current Institution: Final-year undergraduate student, Artificial Intelligence, Beijing University of Posts and Telecommunications (BUPT)
Joint Ph.D. Program: School of AI, Shanghai Jiao Tong University (SJTU) & Shanghai AI Laboratory (Starting Sep 2025)
Internship Experience: Has interned at the lab for more than half a year, since receiving recommended postgraduate admission in 2024
Research Interests: Mathematical reasoning in large models, data synthesis, etc.
Junbo Niu (牛俊博)
Current Institution: Final-year undergraduate student, Automation, Beihang University
Joint Ph.D. Program: Peking University (PKU) & Shanghai AI Laboratory (Starting Sep 2025)
Research Interests: Multimodal Understanding (Video Understanding, OCR) & Data-Centric Machine Learning
Xin Gao (高鑫)
Current Institution: Software Engineering, University of Electronic Science and Technology of China (UESTC)
Joint Ph.D. Program: Shanghai Jiao Tong University (SJTU) & Shanghai AI Laboratory (Starting Sep 2025)
Internship Experience: Joined the lab as an intern in early August 2024
Research Interests: Data synthesis, evaluation, and filtering for large models, etc.
Yu Li (李宇)
Current Institution: Final-year undergraduate student, School of Cyber Science and Engineering, Wuhan University
Joint Ph.D. Program: University of Science and Technology of China (USTC) & Shanghai AI Laboratory (Starting Sep 2025)
Internship Experience: Joined the lab as an intern at the end of October 2024
Research Interests: Logical reasoning in large models, data synthesis, etc.
Zichen Wen (温子辰)
Current Institution: Undergraduate student, University of Electronic Science and Technology of China (UESTC)
Joint Ph.D. Program: Shanghai AI Laboratory & Shanghai Jiao Tong University (SJTU) (Starting Sep 2025)
Research Interests: Efficient AI (including Lightweight and Efficient Large Models for Language/Multimodality, and Data-Efficient Artificial Intelligence)