[Misc] Enhancements of shared memory layout and cluster api #87

Merged: yaoyaoding merged 2 commits into main from blackwell-gemm on Mar 6, 2026
@yaoyaoding (Member)

  1. Shared Layout Enhancements

    • Reshape operation for shared tensors (ReshapeSharedInst, shared_reshape(), inference/validation rules)
    • Canonicalization of SharedLayout (merge consecutive modes, remove singletons, normalize swizzle)
    • Added __eq__/__hash__ to SharedLayout and Swizzle for structural equality
  2. TMA Multicast Support

    • CopyAsyncTensorGlobalToSharedInst and TMA load API now accept optional multicast_mask (uint16)
      to copy data to multiple CTAs in a cluster
  3. Cluster API Refactor

    • cluster.block_id() → cluster.blockIdx (property)
    • cluster.shape() → cluster.clusterDim (property)
    • cluster.block_rank() → cluster.blockRank (property)
    • Backed by predefined Var objects instead of inline declare() calls
  4. New Hidet Primitives

    • Predefined CUDA variables (clusterBlockIdx, clusterDim, clusterBlockRank, clusterIdx, etc.)
    • BindPredefinedVariablesPass — binds predefined vars to PTX intrinsics at function entry
    • Fence primitives (fence.proxy.async, fence.proxy.tensormap::generic)
    • Integer intrinsics (brev, bfe, bfi, popc, clz, fns, prmt)
  5. Misc

    • single_warp() convenience method on RootInstructionGroup
    • flash-attn → flash-attn-4 in CI, updated import paths
    • Blackwell example renamed matmul_v5.py → matmul_v7.py
    • Improved error messages in VectorizedEvaluator
    • Transpiler attribute access simplified (removed hasattr check)
    • Stub flatten() method on RootInstructionGroup (incomplete)
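
For item 2 (TMA multicast), the description does not show how a multicast_mask value is formed. In PTX multicast copies, the 16-bit CTA mask has one bit per destination CTA, indexed by that CTA's rank within the cluster. The following is a minimal sketch under that convention; the helper names `multicast_mask` and `mask_to_ranks` are hypothetical and not part of this PR's API.

```python
MAX_CLUSTER_RANKS = 16  # the mask is a uint16, so at most 16 ranks are addressable


def multicast_mask(ranks):
    """Build a 16-bit mask with bit r set for every destination CTA rank r.

    Hypothetical helper for illustration; not an API from this PR.
    """
    mask = 0
    for r in ranks:
        if not 0 <= r < MAX_CLUSTER_RANKS:
            raise ValueError(f"cluster rank {r} does not fit in a uint16 mask")
        mask |= 1 << r
    return mask


def mask_to_ranks(mask):
    """Decode a 16-bit mask back into the sorted list of destination ranks."""
    return [r for r in range(MAX_CLUSTER_RANKS) if (mask >> r) & 1]


# Send one tile to the CTAs with cluster rank 0 and 2.
mask = multicast_mask([0, 2])
print(bin(mask), mask_to_ranks(mask))
```

A mask built this way would be passed wherever the TMA load API accepts multicast_mask; how ranks map to positions in a multi-dimensional cluster is left to the cluster API and is not assumed here.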

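For the integer intrinsics in item 4, a host-side reference model can help when checking kernel results. The sketch below gives plain-Python models of the unsigned 32-bit behaviour of brev, popc, clz, bfe, and bfi as described in the PTX ISA. The function signatures are assumptions for illustration and may not match the primitives added in this PR; fns and prmt are omitted.

```python
MASK32 = 0xFFFFFFFF


def brev(x):
    """Reverse the bit order of a 32-bit value."""
    return int(f"{x & MASK32:032b}"[::-1], 2)


def popc(x):
    """Count the set bits of a 32-bit value."""
    return bin(x & MASK32).count("1")


def clz(x):
    """Count leading zero bits of a 32-bit value (32 when x is 0)."""
    return 32 - (x & MASK32).bit_length()


def bfe(x, pos, length):
    """Extract `length` bits of x starting at bit `pos` (unsigned variant)."""
    return ((x & MASK32) >> pos) & ((1 << length) - 1)


def bfi(src, base, pos, length):
    """Insert the low `length` bits of src into base at bit `pos`."""
    field = (((1 << length) - 1) << pos) & MASK32
    return ((base & ~field) | ((src << pos) & field)) & MASK32


print(hex(brev(1)), popc(0xFF), clz(1), hex(bfe(0xABCD, 4, 8)), hex(bfi(0xF, 0, 4, 4)))
```

These models only cover the 32-bit unsigned forms; the signed bfe variant sign-extends the extracted field and is not modelled.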
Signed-off-by: Yaoyao Ding <dingyaoyao.cs@gmail.com>
@yaoyaoding merged commit 5159bf0 into main on Mar 6, 2026
8 checks passed