PrismAudio local inference OOM: VideoPrism JAX model attempts to allocate 95GB compute buffer on single 16GB GPU (RTX 5060 Ti) #58

@iad96

System: RTX 5060 Ti (16GB VRAM), 48GB RAM, WSL2 Ubuntu, CUDA 13.2
Issue: Running demo.sh locally fails during feature extraction with a JAX OOM error:
RESOURCE_EXHAUSTED: Out of memory while trying to allocate 95651102720 bytes.
The byte size of input/output arguments (95950995456) exceeds the base limit (13682291507).
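The VideoPrism JAX model requests roughly 95 GB (about 89 GiB) on a 16GB GPU. According to the second line of the error, the input/output arguments alone (95950995456 bytes) are about 7x the 13682291507-byte limit, so JAX appears to reject the computation up front rather than run out of memory partway through. As it stands, this makes local single-GPU inference impossible on consumer hardware.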
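To put the numbers from the error message in perspective, the sizes can be converted with a few lines of Python. This is only arithmetic on the three byte counts quoted above; the helper name `to_gib` is made up for illustration.

```python
# Byte counts copied from the error message above.
requested_bytes = 95_651_102_720  # single allocation that failed
argument_bytes = 95_950_995_456   # input/output arguments of the computation
limit_bytes = 13_682_291_507      # "base limit" reported in the error


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


print(f"requested: {to_gib(requested_bytes):.2f} GiB")
print(f"arguments: {to_gib(argument_bytes):.2f} GiB")
print(f"limit:     {to_gib(limit_bytes):.2f} GiB")
print(f"arguments / limit: {argument_bytes / limit_bytes:.1f}x")
```

The arguments come out at roughly 89 GiB against a limit of under 13 GiB, so freeing a few gigabytes elsewhere on the card would not be enough on its own.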
The Hugging Face Space works fine, so the issue appears to be specific to the local inference pipeline. My guess is that the model is compiled or batched for multi-GPU server setups, with no single-GPU fallback.
Would appreciate any guidance on running this locally on a single consumer GPU.
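One direction that might be worth trying, sketched here under heavy assumptions: turn off JAX's GPU memory preallocation and hand the feature extractor fewer frames per call. `XLA_PYTHON_CLIENT_PREALLOCATE` is a documented JAX environment variable. Everything else (`extract_features`, `chunk_ranges`, `extract_in_chunks`, the chunk size of 16, and the idea that the extractor can be called on shorter clips) is hypothetical and not taken from the PrismAudio code. Preallocation settings alone will not make a ~95 GB computation fit into 16 GB, so shrinking the per-call input is the part that would matter.

```python
import os

# Must be set before `import jax`. Stops JAX from reserving a large
# fraction of GPU memory up front; it does not shrink the computation itself.
os.environ.setdefault("XLA_PYTHON_CLIENT_PREALLOCATE", "false")


def chunk_ranges(num_frames: int, chunk_size: int) -> list[tuple[int, int]]:
    """Split [0, num_frames) into consecutive half-open (start, stop) ranges."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, num_frames))
        for start in range(0, num_frames, chunk_size)
    ]


def extract_in_chunks(frames, extract_features, chunk_size=16):
    """Call a (hypothetical) feature extractor on one chunk of frames at a time.

    `extract_features` is a placeholder for whatever the local pipeline
    uses; it is assumed here to accept a slice of frames.
    """
    outputs = []
    for start, stop in chunk_ranges(len(frames), chunk_size):
        outputs.append(extract_features(frames[start:stop]))
    return outputs
```

With a stand-in extractor such as `len`, `extract_in_chunks(list(range(40)), len, chunk_size=16)` processes three chunks of 16, 16, and 8 frames. Whether per-chunk features are equivalent to what the model produces on the full clip depends on how VideoPrism is used in the pipeline, so this would need confirmation from the maintainers.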
