Description
Feature request:
The Transformer is the dominant architecture for modern LLMs, and its design has largely converged. For example, most state-of-the-art open-source models adopt GQA/MLA for the attention layer and MoE for the MLP layer. As a result, a Transformer can be viewed as a composition of standardized building blocks. This makes it possible to abstract a unified architectural specification that spans different open-source models and can serve as the Transformer specification in ModelPack. Many valuable capabilities become possible on top of such a specification.
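To make the idea concrete, below is a minimal sketch of what such a unified specification could look like as Python dataclasses. The field names and structure are assumptions made for discussion here, not the schema of the in-progress PR.

```python
# Purely illustrative sketch of a unified Transformer specification.
# All field names are assumptions for discussion, not the ModelPack schema.
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass
class AttentionSpec:
    kind: Literal["mha", "gqa", "mla"]        # attention variant
    num_heads: int
    num_kv_heads: Optional[int] = None        # set for GQA
    head_dim: Optional[int] = None


@dataclass
class MLPSpec:
    kind: Literal["dense", "moe"]
    intermediate_size: int
    num_experts: Optional[int] = None         # set for MoE
    num_experts_per_token: Optional[int] = None


@dataclass
class TransformerSpec:
    hidden_size: int
    num_layers: int
    vocab_size: int
    attention: AttentionSpec
    mlp: MLPSpec
    norm: Literal["rmsnorm", "layernorm"] = "rmsnorm"
    position_embedding: Literal["rope", "alibi", "none"] = "rope"
```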
Expected Outcome:
- Jointly complete a unified Transformer specification (an in-progress PR already exists)
- Using vLLM and SGLang, conduct POCs on three or more mainstream open-source Transformer models based on this specification
- Design a workflow or Claude skill that can automatically generate Transformer specification definitions from models in the Hugging Face transformers repository (see the sketch after this list)
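As a rough sketch of the last item, the snippet below derives the hypothetical TransformerSpec defined earlier from a Hugging Face config. The attribute names follow common Llama/Mixtral-style configs and are read defensively with getattr; a real generator would need per-architecture fallbacks.

```python
# Rough sketch: map a Hugging Face config onto the (hypothetical) TransformerSpec.
# Attribute names follow common Llama/Mixtral-style configs; this is not a
# complete or authoritative mapping.
from transformers import AutoConfig


def spec_from_hf(model_id: str) -> TransformerSpec:
    cfg = AutoConfig.from_pretrained(model_id)

    num_heads = cfg.num_attention_heads
    num_kv_heads = getattr(cfg, "num_key_value_heads", num_heads)
    attention = AttentionSpec(
        kind="gqa" if num_kv_heads < num_heads else "mha",
        num_heads=num_heads,
        num_kv_heads=num_kv_heads,
        head_dim=getattr(cfg, "head_dim", cfg.hidden_size // num_heads),
    )

    # MoE configs expose expert counts under different names across models.
    num_experts = getattr(cfg, "num_local_experts", None) or getattr(
        cfg, "n_routed_experts", None
    )
    mlp = MLPSpec(
        kind="moe" if num_experts else "dense",
        intermediate_size=cfg.intermediate_size,
        num_experts=num_experts,
        num_experts_per_token=getattr(cfg, "num_experts_per_tok", None),
    )

    return TransformerSpec(
        hidden_size=cfg.hidden_size,
        num_layers=cfg.num_hidden_layers,
        vocab_size=cfg.vocab_size,
        attention=attention,
        mlp=mlp,
    )
```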
Use case:
For inference engines, the specification enables automatic support for multiple Transformer models, so newly trained Transformer models can be served without per-model adaptation.
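To illustrate what "without per-model adaptation" could mean mechanically, here is a toy, spec-driven dispatch: layer construction keys off the spec fields rather than per-model modeling files. The builder functions return placeholder strings and are invented for this sketch; they do not reflect vLLM or SGLang internals.

```python
# Hypothetical illustration of spec-driven model construction in an engine.
# Builders are placeholders; real engines would instantiate kernels/modules.
def build_attention(spec: AttentionSpec) -> str:
    if spec.kind == "mla":
        return f"MLA(heads={spec.num_heads})"
    return f"GQA(heads={spec.num_heads}, kv_heads={spec.num_kv_heads})"


def build_mlp(spec: MLPSpec) -> str:
    if spec.kind == "moe":
        return f"MoE(experts={spec.num_experts}, top_k={spec.num_experts_per_token})"
    return f"DenseMLP(intermediate={spec.intermediate_size})"


def build_model(spec: TransformerSpec) -> list[tuple[str, str]]:
    # One generic loop replaces per-model modeling code.
    return [(build_attention(spec.attention), build_mlp(spec.mlp))
            for _ in range(spec.num_layers)]
```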