[Instruction] Support 2CTA mode of tensor core in blackwell #89

Merged
yaoyaoding merged 2 commits into main from tcgen05-mma-2sm
Mar 8, 2026

@yaoyaoding
Member

Summary

  • Add cta_group parameter to tcgen05.mma(), tcgen05.commit(), and tma.global_to_shared() to support 2-SM collaborative MMA operations
  • Add cluster.map_shared_addr() instruction (PTX mapa.shared::cluster) for mapping shared memory addresses across CTAs in a cluster
  • Fix tcgen05.commit multicast ctaMask type from uint64 to uint16 per PTX ISA spec
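To illustrate why a 16-bit ctaMask is sufficient, here is a minimal sketch of building such a mask. It assumes bit i of the mask selects the CTA whose rank within the cluster is i, which is how I read the PTX ISA description of multicast masks. The helper name `cta_mask` and its `mask_bits` parameter are hypothetical and are not part of this PR or the project's API.

```python
def cta_mask(ranks, mask_bits=16):
    """Hypothetical helper: build a multicast CTA mask with one bit per CTA rank.

    Assumption: bit i selects the CTA whose rank in the cluster is i,
    and the mask is mask_bits wide (16 for a uint16 ctaMask).
    """
    mask = 0
    for rank in ranks:
        if not 0 <= rank < mask_bits:
            raise ValueError(f"CTA rank {rank} does not fit in a {mask_bits}-bit mask")
        mask |= 1 << rank
    return mask


# A CTA pair (ranks 0 and 1), as used in a 2-CTA collaborative MMA.
pair_mask = cta_mask([0, 1])

# Every rank representable in a 16-bit mask.
full_mask = cta_mask(range(16))

assert pair_mask == 0b11
assert full_mask == 0xFFFF
```

Under this assumption, a mask covering every representable rank is 0xFFFF, so a uint16 holds it with no unused high bits, whereas a uint64 would carry 48 bits that no CTA rank can occupy.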

Signed-off-by: Yaoyao Ding <dingyaoyao.cs@gmail.com>
yaoyaoding merged commit b0abb7c into main Mar 8, 2026
8 checks passed
yaoyaoding deleted the tcgen05-mma-2sm branch March 8, 2026 03:02
yaoyaoding mentioned this pull request Mar 12, 2026
17 tasks