
**Issue with Dockerized DeepSpeed-MII Persistent Deployment (LLama 3 70B Model) – Tensor Parallelism Not Effective** #568

@InderjeetVishnoi


Hi Team,

I’m facing an issue while containerizing a DeepSpeed-MII deployment using the persistent server mode.

Steps Performed:

  1. I created a mii_server.py script following the DeepSpeed-MII documentation, using persistent mode:

    import mii
    
    MODEL_PATH = "./llama-3-70b-finetuned"  # LoRA+base model merged
    DEPLOYMENT_NAME = "test-deepspeed"
    
    client = mii.serve(
        model_name_or_path=MODEL_PATH,
        deployment_name=DEPLOYMENT_NAME,
        tensor_parallel=2,
        enable_restful_api=True,
        restful_api_port=8084,
        max_length=2048
    )
  2. This script runs successfully on my local machine using 2 GPUs.

  3. I built a Docker image based on the same setup and ran it using:

    docker run --gpus all --shm-size=10g -e CUDA_VISIBLE_DEVICES=0,1 -p 8084:8084 <image>

However, inside the container the model consistently hits CUDA OOM errors. Both GPUs report a shortfall of approximately 10 GiB, which is the same shortfall I see when I try to run the model on a single GPU without tensor parallelism.
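For context on that comparison, a rough weights-only estimate shows how much each GPU should hold if the model is really being sharded. This is a back-of-the-envelope sketch under stated assumptions: exactly 70e9 parameters, 2 bytes per parameter (fp16/bf16), even sharding across ranks, and no allowance for KV cache, activations, or framework overhead. `weights_gib_per_gpu` is a hypothetical helper name, not part of MII.

```python
GIB = 1024 ** 3  # bytes per GiB


def weights_gib_per_gpu(num_params, bytes_per_param=2, tensor_parallel=1):
    """Estimate weights-only memory per GPU in GiB.

    Assumes the weights are split evenly across `tensor_parallel` ranks
    and ignores KV cache, activations, and runtime overhead.
    """
    total_bytes = num_params * bytes_per_param
    return total_bytes / tensor_parallel / GIB


if __name__ == "__main__":
    # 70e9 is the nominal parameter count (an assumption, not a measured value).
    for tp in (1, 2):
        per_gpu = weights_gib_per_gpu(70e9, 2, tp)
        print(f"tensor_parallel={tp}: {per_gpu:.1f} GiB of weights per GPU")
```

Under these assumptions the weights alone come to roughly 130 GiB on one GPU versus about 65 GiB on each of two, so an identical deficit in both setups would suggest the shards are not being split.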

This leads me to believe that tensor parallelism isn’t being correctly applied in the containerized environment, even though it works as expected locally.
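One way to narrow this down is to confirm, from inside the container, that the process is allowed to see at least as many GPUs as `tensor_parallel` requests. The following is a minimal sketch, not part of MII: `visible_gpu_ids` and `enough_gpus_for_tp` are hypothetical helper names, and the check only inspects the `CUDA_VISIBLE_DEVICES` variable rather than querying the driver.

```python
import os
from typing import List, Mapping, Optional


def visible_gpu_ids(env: Optional[Mapping[str, str]] = None) -> Optional[List[str]]:
    """Return the device IDs listed in CUDA_VISIBLE_DEVICES, or None if it is unset."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None  # unset: this variable imposes no restriction
    # An empty string yields an empty list, i.e. no GPUs visible.
    return [tok.strip() for tok in raw.split(",") if tok.strip()]


def enough_gpus_for_tp(tensor_parallel: int,
                       env: Optional[Mapping[str, str]] = None) -> bool:
    """Check whether CUDA_VISIBLE_DEVICES lists at least `tensor_parallel` devices."""
    ids = visible_gpu_ids(env)
    if ids is None:
        return True  # cannot rule anything out from the environment alone
    return len(ids) >= tensor_parallel


if __name__ == "__main__":
    print("CUDA_VISIBLE_DEVICES ->", visible_gpu_ids())
    print("enough for tensor_parallel=2:", enough_gpus_for_tp(2))
```

Running this as the container's entrypoint before starting the server would show whether the environment matches the local setup. `torch.cuda.device_count()` is a more direct check when PyTorch is importable, since it reflects what the CUDA runtime actually exposes.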


Environment Details

  • Model: LLaMA-3 70B (merged LoRA)

  • CUDA Version: 12.1.1

  • Python Version: 3.10

  • Requirements:

    deepspeed-mii
    numpy==2.1.3
    triton==3.3.1
    

Is there any additional configuration or consideration required when deploying DeepSpeed-MII in Docker to ensure tensor parallelism is honored?

Any guidance or recommended best practices for dockerizing the MII persistent server with large models would be highly appreciated.

Thanks,
Inderjeet Vishnoi


I can also share the Dockerfile and volume/mount details if that would help.
