# Using MinerU
## Quick Model Source Configuration

MinerU uses `huggingface` as the default model source. If you cannot access huggingface due to network restrictions, you can switch the model source to `modelscope` through an environment variable:

```bash
export MINERU_MODEL_SOURCE=modelscope
```
## Quick Usage via Command Line

MinerU ships with a built-in command line tool that lets you parse documents directly from the command line:

```bash
mineru -p <input_path> -o <output_path>
```
> **Tip**
>
> - `<input_path>`: local PDF/image/DOCX file or directory
> - `<output_path>`: output directory
> - Without `--api-url`, the CLI launches a temporary local `mineru-api`
> - With `--api-url`, the CLI connects directly to an existing local or remote FastAPI service
For more information about output files, please refer to the Output File Documentation.
> **Note**
>
> The command line tool automatically attempts cuda/mps acceleration on Linux and macOS.
>
> Windows users who need cuda acceleration should visit the PyTorch official website and select the command appropriate for their cuda version to install acceleration-enabled torch and torchvision.
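For example, an install command for cuda 12.8 looks like the following (an illustrative snapshot; pick the index URL matching your cuda version from the PyTorch site):

```bash
# Install cuda-enabled torch/torchvision from the cu128 wheel index;
# substitute the index URL that matches your installed cuda version.
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128
```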
If you need to adjust parsing behavior with custom parameters, see the more detailed Command Line Tools Usage Instructions in the documentation.
## Advanced Usage via API, WebUI, http-client/server
- **FastAPI calls:**

  ```bash
  mineru-api --host 0.0.0.0 --port 8000
  ```

  > **Tip**
  >
  > - Access `http://127.0.0.1:8000/docs` in your browser to view the API documentation.
  > - Health endpoint: `GET /health` returns `protocol_version`, `processing_window_size`, `max_concurrent_requests`, and task stats.
  > - Asynchronous task submission endpoint: `POST /tasks`
  > - Synchronous parsing endpoint: `POST /file_parse`
  > - Task query endpoints: `GET /tasks/{task_id}`, `GET /tasks/{task_id}/result`
  > - API outputs are controlled by the server and written to `./output` by default.
  > - Uploads currently support PDF, image, and DOCX files.

  `POST /tasks` returns immediately with a `task_id`. `POST /file_parse` uses the same task manager internally, waits for the task to finish, and then returns the final result synchronously. When a task is waiting in the queue, both the submission response and the task-status response may include `queued_ahead` to indicate how many tasks are ahead of it.

  Tasks are tracked in-process only, for a single `mineru-api` instance; task status is not preserved across service restarts, `--reload`, or multi-process deployments. Completed or failed tasks are retained for 24 hours by default, after which their task state and output directory are cleaned up automatically and the task status and result endpoints return `404`. Use `MINERU_API_TASK_RETENTION_SECONDS` and `MINERU_API_TASK_CLEANUP_INTERVAL_SECONDS` to adjust the retention and cleanup polling intervals.

  Use `--enable-vlm-preload true` to warm up the local VLM model during service startup instead of waiting for the first VLM or hybrid request.

  Asynchronous task submission example:

  ```bash
  curl -X POST http://127.0.0.1:8000/tasks \
    -F "files=@demo/pdfs/demo1.pdf" \
    -F "return_md=true"
  ```

  Synchronous parsing example:

  ```bash
  curl -X POST http://127.0.0.1:8000/file_parse \
    -F "files=@demo/pdfs/demo1.pdf" \
    -F "return_md=true" \
    -F "response_format_zip=true" \
    -F "return_original_file=true"
  ```

  Poll task status and fetch results:

  ```bash
  curl http://127.0.0.1:8000/tasks/<task_id>
  curl http://127.0.0.1:8000/tasks/<task_id>/result
  curl http://127.0.0.1:8000/health
  ```

  HTTP asynchronous call code example: Python version
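  The same submit-then-poll pattern can also be sketched directly with `curl` (a minimal sketch assuming `jq` is installed; apart from the documented `task_id`, response field names and pre-completion status codes are assumptions):

  ```bash
  # Submit a file asynchronously and capture the returned task_id.
  BASE=http://127.0.0.1:8000
  TASK_ID=$(curl -s -X POST "$BASE/tasks" \
    -F "files=@demo/pdfs/demo1.pdf" \
    -F "return_md=true" | jq -r '.task_id')

  # Poll until the result endpoint responds successfully, printing the
  # task status (which may include queued_ahead) between attempts.
  until curl -sf "$BASE/tasks/$TASK_ID/result" -o result.json; do
    curl -s "$BASE/tasks/$TASK_ID"
    sleep 2
  done
  ```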
- **Start the Gradio WebUI visual frontend:**

  ```bash
  mineru-gradio --server-name 0.0.0.0 --server-port 7860
  ```

  > **Tip**
  >
  > - Access `http://127.0.0.1:7860` in your browser to use the Gradio WebUI.
  > - Without `--api-url`, Gradio starts a reusable local `mineru-api`; with `--api-url`, it reuses an existing local or remote service.
  > - `--enable-vlm-preload true` makes Gradio start its local `mineru-api` during WebUI startup and wait for VLM preload to finish; it is ignored when `--api-url` points to an existing service.
  > - The WebUI currently accepts PDF, image, and DOCX uploads.
- **Use `mineru-router` for multi-service / multi-GPU orchestration:**

  ```bash
  mineru-router --host 0.0.0.0 --port 8002 --local-gpus auto
  ```

  > **Tip**
  >
  > - `mineru-router` exposes the same `/health`, `/tasks`, `/file_parse`, `/tasks/{task_id}`, and `/tasks/{task_id}/result` interface set as `mineru-api`.
  > - Repeat `--upstream-url` to aggregate multiple existing `mineru-api` services (as in the sketch below), or use `--local-gpus` to launch local workers automatically.
  > - `--enable-vlm-preload true` only applies to router-managed local workers; it does not preload remote services passed through `--upstream-url`.
  > - It is intended for advanced multi-service, multi-GPU, and unified-entry deployments.
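  For instance, a router fronting two already-running `mineru-api` services might be started like this (a sketch; the upstream URLs are placeholders):

  ```bash
  # Aggregate two existing mineru-api services behind one entry point;
  # replace the placeholder upstream URLs with your own services.
  mineru-router --host 0.0.0.0 --port 8002 \
    --upstream-url http://10.0.0.1:8000 \
    --upstream-url http://10.0.0.2:8000
  ```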
- **Using the `http-client/server` method:**

  ```bash
  # Start openai-compatible server (requires a vllm or lmdeploy environment)
  mineru-openai-server --port 30000
  ```

  > **Tip**
  >
  > In another terminal, connect to the openai-compatible server via the http client:
  >
  > ```bash
  > mineru -p <input_path> -o <output_path> -b hybrid-http-client -u http://127.0.0.1:30000
  > ```
  >
  > - `vlm-http-client` is the lightweight remote client option and does not require local `torch`.
  > - `hybrid-http-client` requires local pipeline dependencies such as `mineru[pipeline]` and `torch`.
> **Note**
>
> All officially supported vllm/lmdeploy parameters can be passed to MinerU through command line arguments, including for the following commands: `mineru`, `mineru-openai-server`, `mineru-gradio`, `mineru-api`, and `mineru-router`.
>
> We have compiled some commonly used vllm/lmdeploy parameters and usage methods in the Advanced Command Line Parameters documentation.
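For example, vllm's `--gpu-memory-utilization` option can be forwarded when starting the API service (a sketch assuming a vllm backend; the value is illustrative):

```bash
# Pass a vllm engine parameter straight through mineru-api.
mineru-api --host 0.0.0.0 --port 8000 --gpu-memory-utilization 0.5
```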
## Extending MinerU Functionality with Configuration Files

MinerU works out of the box, but it also supports extending functionality through configuration files. You can edit the `mineru.json` file in your user directory to add custom configurations.
> **Important**
>
> The `mineru.json` file is generated automatically when you use the built-in model download command `mineru-models-download`, or you can create it by copying the configuration template file to your user directory and renaming it to `mineru.json`.
Here are some available configuration options:
- `latex-delimiter-config`:
  - Used to configure LaTeX formula delimiters.
  - Defaults to the `$` symbol; it can be changed to other symbols or strings as needed (see the sketch below).
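  A minimal sketch, assuming the `display`/`inline` key layout from the configuration template (verify the exact keys against your template version), keeping the default `$`-based delimiters:

  ```json
  "latex-delimiter-config": {
      "display": {"left": "$$", "right": "$$"},
      "inline": {"left": "$", "right": "$"}
  }
  ```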
- `llm-aided-config`:
  - Used to configure parameters for LLM-assisted title hierarchy.
  - Compatible with all LLM models supporting the openai protocol; defaults to Alibaba Cloud Bailian's `qwen3-next-80b-a3b-instruct` model.
  - You need to configure your own API key and set `enable` to `true` to enable this feature.
  - If your API provider does not support the `enable_thinking` parameter, remove it manually.
  - For example, the `llm-aided-config` section of your configuration file may look like:

    ```json
    "llm-aided-config": {
        "api_key": "your_api_key",
        "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "model": "qwen3-next-80b-a3b-instruct",
        "enable_thinking": false,
        "enable": false
    }
    ```

  - To remove the `enable_thinking` parameter, simply delete the line containing `"enable_thinking": false`, resulting in:

    ```json
    "llm-aided-config": {
        "api_key": "your_api_key",
        "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "model": "qwen3-next-80b-a3b-instruct",
        "enable": false
    }
    ```
- `models-dir`:
  - Used to specify the local model storage directory.
  - Specify model directories for the `pipeline` and `vlm` backends separately (see the sketch below).
  - After specifying the directories, you can use local models by setting the environment variable `export MINERU_MODEL_SOURCE=local`.
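  A minimal sketch with placeholder paths, assuming separate `pipeline` and `vlm` keys as described above:

  ```json
  "models-dir": {
      "pipeline": "/path/to/models/pipeline",
      "vlm": "/path/to/models/vlm"
  }
  ```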