Overview
The OpenDataLab research group (Data Platform Center) focuses on Data-Centric AI algorithm research and is deeply involved in major national projects. The team's achievements include OpenDataLab, China's largest open-source data platform, and MinerU, the leading PDF parsing tool in the open-source field. Its work has propelled Shanghai AI Lab's Shusheng series of large models (such as InternLM and InternVL) to the top international level, with their GitHub open-source projects reaching . The center's research directions include (multimodal) large models, data synthesis and detection, intelligent understanding of scientific documents, and AI4Science. Each year the team publishes a large number of research papers in Nature sub-journals and at ICLR, NeurIPS, CVPR, ACL, and other venues, with over xx top conference papers published to date. The research group is led by mentors from Microsoft, ByteDance, Tsinghua University, and other institutions, who bring rich industry and academic experience and serve as area chairs for conferences such as NeurIPS, ACL, EMNLP, and AAAI. The center provides ample computing resources for joint training students and regularly organizes activities that enrich research life and foster a good research atmosphere.
Highlights
Joint Training Opportunities
We offer joint training opportunities with leading universities, professors, and industry partners to nurture the next generation of AI researchers.
Partner institutions include Shanghai Jiao Tong University (SJTU), Fudan University, the University of Science and Technology of China (USTC), and others.
Academic Achievements
Our team recently had multiple papers accepted to top AI conferences including NeurIPS, ICLR, and ACL, covering research directions such as multimodal learning, data-centric AI, and intelligent document understanding.
Visit our publications page for more updates.
Recent Recognition
OpenDataLab has been widely recognized for its significant contributions to open data and AI research, receiving numerous industry and academic accolades for its innovative work.
Our platform has proudly become the largest and most comprehensive open data hub in China, serving a vast community of researchers and developers.
Our Mission
OpenDataLab is committed to democratizing access to high-quality data for AI research, creating innovative methods for efficient learning, and advancing state-of-the-art multimodal models. We believe better data leads to better models and are dedicated to pushing the boundaries of AI through our data-centric approach.