…ence

FusedMatMul EP fusion:
- Reject fusion when MatMul output has multiple consumers (fixes SwiGLU SiLU pattern where x*sigmoid(x) needs the raw MatMul output)
- Extend HasSingleConsumer to treat graph outputs as consumers
- Reject scale scalars whose rank exceeds the other Mul/Div input's rank

GroupQueryAttention node support:
- Skip missing optional inputs with null type info in data type check
- Move CPU constant input bypass before DML unknown-type rejection so int64 inputs like GQA's total_sequence_length are accepted
- Propagate input/output aliases to ORT kernel defs for KV cache sharing
- Fix OrtStatus leak in kernel lookup

Partial shape inferrer padding:
- When a shape inferrer returns fewer outputs than the node has, fill remaining output shapes from the creation context (fixes DynamicQuantizeLinear whose scalar outputs were allocated with the input shape, breaking downstream ConvInteger)
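The FusedMatMul fusion guards can be sketched as follows. This is a minimal Python model, not the EP's C++ code: the `Node` class, `has_single_consumer`, and `scale_rank_ok` are hypothetical names chosen for illustration, and the real HasSingleConsumer may count readers differently (this sketch counts consuming nodes, not input slots).

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical, simplified graph node: tensors are referred to by name.
    op_type: str
    inputs: list
    outputs: list


def has_single_consumer(tensor, nodes, graph_outputs):
    """True when exactly one reader exists; a graph output counts as a reader."""
    readers = sum(1 for n in nodes if tensor in n.inputs)
    if tensor in graph_outputs:
        readers += 1
    return readers == 1


def scale_rank_ok(scale_shape, other_shape):
    """A fusable scale holds one element and does not out-rank the other input."""
    one_element = all(d == 1 for d in scale_shape)
    return one_element and len(scale_shape) <= len(other_shape)


# SwiGLU-style pattern: y feeds both Sigmoid and Mul, so the raw MatMul
# output must stay available and the Mul must not be folded into the MatMul.
swiglu = [
    Node("MatMul", ["x", "w"], ["y"]),
    Node("Sigmoid", ["y"], ["s"]),
    Node("Mul", ["y", "s"], ["out"]),
]
print(has_single_consumer("y", swiglu, ["out"]))  # False
```

The rank check matters because a one-element scale of shape `[1, 1, 1]` would broadcast a rank-2 MatMul result up to rank 3, so folding it into the MatMul would change the output shape.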
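The ordering of the GroupQueryAttention input checks can be illustrated with a small sketch. Everything here is assumed for illustration: `inputs_supported`, the `DEVICE_TYPES` set, and the input layout are hypothetical and do not mirror the actual DML data type tables. The point is only the order of the three checks: skip inputs with no type info, then bypass CPU constant inputs, and only then reject unknown types.

```python
# Placeholder set of element types the device path accepts (an assumption,
# not the real DML support table).
DEVICE_TYPES = {"float", "float16", "int32"}


def inputs_supported(input_types, cpu_constant_indices):
    """Return True when every input passes the data type check.

    input_types: one entry per input slot; None marks a missing optional
    input that carries no type info.
    cpu_constant_indices: slots read on the CPU, which never reach the device.
    """
    for index, dtype in enumerate(input_types):
        if dtype is None:
            continue  # missing optional input: nothing to validate
        if index in cpu_constant_indices:
            continue  # CPU constant input: bypass before type rejection
        if dtype not in DEVICE_TYPES:
            return False  # unknown device type: reject the node
    return True


# Illustrative layout: three float16 tensors, two absent optional inputs,
# an int32 tensor, and an int64 scalar that is consumed on the CPU.
types = ["float16", "float16", "float16", None, None, "int32", "int64"]
print(inputs_supported(types, cpu_constant_indices={6}))  # True
print(inputs_supported(types, cpu_constant_indices=set()))  # False
```

If the unknown-type rejection ran before the CPU constant bypass, the int64 slot would fail the node even though the device never reads it, which is the situation the reordering addresses.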
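The padding rule for partial shape inference can be sketched in a few lines. This is a hedged illustration, not the EP's implementation: `pad_output_shapes` and its arguments are hypothetical names, and shapes are modeled as plain lists of dimensions with `[]` standing for a scalar.

```python
def pad_output_shapes(inferred_shapes, context_shapes):
    """Fill outputs the inferrer did not cover with shapes from the context.

    inferred_shapes: shapes returned by the shape inferrer, possibly fewer
    than the node's output count.
    context_shapes: one shape per node output, taken from the creation context.
    """
    if len(inferred_shapes) > len(context_shapes):
        raise ValueError("inferrer returned more shapes than the node has outputs")
    missing = context_shapes[len(inferred_shapes):]
    return list(inferred_shapes) + list(missing)


# DynamicQuantizeLinear has three outputs: the quantized tensor plus a scalar
# scale and a scalar zero point. Suppose the inferrer reports only the first.
inferred = [[2, 3]]
context = [[2, 3], [], []]
print(pad_output_shapes(inferred, context))  # [[2, 3], [], []]
```

Without the padding step, the two scalar outputs would have no inferred shape of their own, which matches the failure described above where they were allocated with the input shape.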