InspireMusic-1.5B inference fails, while InspireMusic-Base works just fine #55

@chak14

I tried running inference with InspireMusic-Base and everything works, but when I switch to the InspireMusic-1.5B model it fails with this error:

Exception in thread Thread-8 (llm_job):
Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/content/InspireMusic/inspiremusic/cli/model.py", line 148, in llm_job
    for i in self.llm.inference(**inference_kwargs):
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py", line 57, in generator_context
    response = gen.send(request)
               ^^^^^^^^^^^^^^^^^
  File "/content/InspireMusic/inspiremusic/llm/llm.py", line 374, in inference
    top_ids = self.sampling_ids(logp, out_tokens, ignore_eos=i < min_len).item()
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA error: device-side assert triggered
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

The list of tensors is empty for UUID: 3e0648bc-3981-11f0-b707-0242ac1c000c

Do you have any idea why this could be happening?

Thanks for your work on the repo!

P.S. I don't think it's related, but I changed qwen_encoder.py to use attn_implementation="eager" instead of attn_implementation="flash_attention_2" (I just switched the default to eager).
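A note for anyone trying to reproduce this: CUDA errors are reported asynchronously, so the `sampling_ids(...)` line in the traceback may not be where the failure actually started. The sketch below shows one way to narrow it down, under two assumptions that the traceback does not confirm: that setting `CUDA_LAUNCH_BLOCKING=1` before CUDA is initialized gives a more accurate stack trace here, and that the assert may come from NaN or +inf values in `logp` reaching the sampler (an out-of-range token index is another common cause). `describe_bad_values` is a hypothetical helper, not part of InspireMusic.

```python
import math
import os

# Make CUDA kernel launches synchronous so the Python traceback points at
# the operation that actually failed. This has to be set before CUDA is
# initialized, so put it at the very top of the entry script.
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")


def describe_bad_values(values):
    """Summarize suspicious entries in a flat sequence of log-probabilities.

    Hypothetical debugging helper (not an InspireMusic function).
    -inf is treated as legitimate, since masked tokens commonly get a
    log-probability of -inf; NaN and +inf are counted as bad. If `finite`
    is 0, there is no valid token left to sample.
    """
    values = list(values)
    return {
        "total": len(values),
        "nan": sum(1 for v in values if math.isnan(v)),
        "pos_inf": sum(1 for v in values if v == math.inf),
        "finite": sum(1 for v in values if math.isfinite(v)),
    }


# Assumed usage, just before the sampling call in llm.py (`logp` is the
# tensor named in the traceback):
#     print(describe_bad_values(logp.float().flatten().tolist()))
```

If the summary reports any NaN values, the problem is likely upstream of sampling (for example in the attention or precision settings) rather than in `sampling_ids` itself; if everything is finite, checking the sampled token IDs against the vocabulary size would be the next thing to try.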
