Advanced Command Line Parameters

vllm Acceleration Parameter Optimization

Performance Optimization Parameters

Tip

If vllm-accelerated VLM inference already works for you and you want to improve inference speed further, try the following parameters:

  • If you have multiple GPUs, you can use vllm's multi-card parallel mode to increase throughput, for example: --data-parallel-size 2

Parameter Passing Instructions

Tip

  • All officially supported vllm parameters can be passed to MinerU as command line arguments. This applies to the following commands: mineru, mineru-vllm-server, mineru-gradio, and mineru-api.
  • To learn more about vllm parameter usage, refer to the official vllm documentation.

GPU Device Selection and Configuration

CUDA_VISIBLE_DEVICES Basic Usage

Tip

  • In all cases, you can choose which GPU devices are visible by setting the CUDA_VISIBLE_DEVICES environment variable at the beginning of the command line. For example:
    CUDA_VISIBLE_DEVICES=1 mineru -p <input_path> -o <output_path>
    
  • This method works for all command line calls, including mineru, mineru-vllm-server, mineru-gradio, and mineru-api, and applies to both the pipeline and vlm backends.
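
The inline prefix form shown above sets the variable only for the single command it precedes, which is standard shell behavior rather than anything specific to MinerU. The following minimal sketch illustrates this with a plain `sh -c` child process standing in for a MinerU command; the variable names `child_view` and `shell_view` are hypothetical and exist only for this demonstration.

```shell
# Start from a clean state so the demonstration is deterministic.
unset CUDA_VISIBLE_DEVICES

# Inline prefix: the variable exists only in the environment of this one command.
child_view=$(CUDA_VISIBLE_DEVICES=1 sh -c 'echo "${CUDA_VISIBLE_DEVICES:-unset}"')

# The current shell itself is not affected by the inline prefix.
shell_view="${CUDA_VISIBLE_DEVICES:-unset}"

echo "child process sees: $child_view"
echo "current shell sees: $shell_view"
```

If the same device selection should apply to every command in a session, `export CUDA_VISIBLE_DEVICES=1` would set it for the whole shell instead.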

Common Device Configuration Examples

Tip

Here are some common CUDA_VISIBLE_DEVICES setting examples:

CUDA_VISIBLE_DEVICES=1  # Only device 1 will be visible
CUDA_VISIBLE_DEVICES=0,1  # Devices 0 and 1 will be visible
CUDA_VISIBLE_DEVICES="0,1"  # Same as above, quotation marks are optional
CUDA_VISIBLE_DEVICES=0,2,3  # Devices 0, 2, 3 will be visible; device 1 is masked
CUDA_VISIBLE_DEVICES=""  # No GPU will be visible
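
One detail worth keeping in mind with settings like these: inside the process, CUDA renumbers the visible devices starting from 0, so with `0,2,3` physical device 2 appears as device 1. The sketch below makes that mapping explicit and counts the visible entries. Both helper functions are hypothetical illustrations, not part of MinerU or CUDA, and they assume the value is a plain comma-separated list of numeric IDs.

```shell
# Hypothetical helper: count the entries in a CUDA_VISIBLE_DEVICES value.
# An empty value yields 0, matching the "no GPU will be visible" case.
count_visible_devices() {
    printf '%s\n' "$1" | awk -F',' '{ print NF }'
}

# Hypothetical helper: show how the listed physical IDs are renumbered
# from 0 inside the process.
show_device_mapping() {
    printf '%s\n' "$1" | awk -F',' '{
        for (i = 1; i <= NF; i++)
            printf "logical %d -> physical %s\n", i - 1, $i
    }'
}

count_visible_devices "0,2,3"
show_device_mapping "0,2,3"
```

The count can serve as a quick sanity check when choosing a value for --data-parallel-size.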

Practical Application Scenarios

Tip

Here are some possible usage scenarios:

  • If you have multiple GPUs and want to start vllm-server on cards 0 and 1 with multi-card parallelism, use the following command:

    CUDA_VISIBLE_DEVICES=0,1 mineru-vllm-server --port 30000 --data-parallel-size 2
    
  • If you have multiple GPUs and want to start two FastAPI services on cards 0 and 1, each listening on a different port, use the following commands:

    # In terminal 1
    CUDA_VISIBLE_DEVICES=0 mineru-api --host 127.0.0.1 --port 8000
    # In terminal 2
    CUDA_VISIBLE_DEVICES=1 mineru-api --host 127.0.0.1 --port 8001
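
The one-service-per-GPU pattern above extends naturally to more cards. As a rough sketch, the hypothetical helper below prints one launch command per GPU ID, assigning consecutive ports from a base port. The function name, the base port, and the port-numbering scheme are assumptions made for illustration; only the `--host` and `--port` flags come from the example above. The helper prints the commands rather than running them, so it has no side effects.

```shell
# Hypothetical helper: print one mineru-api launch command per GPU ID.
# Usage: print_api_commands <base_port> <gpu_id>...
print_api_commands() {
    base_port="$1"
    shift
    index=0
    for gpu in "$@"; do
        # Assumed scheme: each service gets base_port plus its position in the list.
        port=$((base_port + index))
        echo "CUDA_VISIBLE_DEVICES=$gpu mineru-api --host 127.0.0.1 --port $port"
        index=$((index + 1))
    done
}

print_api_commands 8000 0 1
```

Each printed line can then be run in its own terminal, as in the two-terminal example.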