Skip to content

Add dedicated 'audio' key for multimodal models #650

@HelloCheck0840

Description

@HelloCheck0840

The Problem:
Currently, to pass audio data to multimodal models like gemma4:e2b, people are forced to use the images key.

Why this is an issue:

  • Naming Confusion: Currently, audio data often has to be passed into a field called images. This is confusing because audio files are not images.
  • Future Multimodal Support: For models that support both image and audio simultaneously, a single images bucket creates ambiguity.

Suggested Solution:
Add dedicated 'audio' key.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions