
Enabling support for process decompositions with empty pencils. #102

Closed
romerojosh wants to merge 2 commits into main from empty_pencil_support

@romerojosh
Collaborator

Fixes #101.

As it stands right now, cuDecomp does not allow users to create grid descriptors that could leave any process with an empty pencil (zero elements) in any pencil orientation (i.e. x-pencils, y-pencils, z-pencils), which is very conservative. Users may want to use cuDecomp on only a subset of the pencil orientations and may not care if the other orientations leave some processes empty. For example, in #101, the user is interested in using halo-exchange routines on z-pencils with a global grid of N x N x 1. The current empty pencil checks do not allow this usage at all, because the x- and y-pencil orientations attempt to distribute the dimension of size 1 across processes.

This PR addresses this limitation by:

  1. Loosening the empty pencil checks during halo autotuning to only the pencil orientation being tested.
  2. Removing the general restriction on creating grid descriptors resulting in empty pencils.
  3. Allowing cuDecomp to work on distributions resulting in empty pencils where possible:
     - Transpose APIs will work in these scenarios as expected.
     - Halo exchange APIs will throw an error if the halo exchange is along a dimension with an empty pencil, since it is not well defined what a halo exchange should do in that configuration.
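
To make the N x N x 1 case concrete, here is a minimal sketch in plain Python (not cuDecomp code) that computes local pencil shapes and reports which orientations leave some rank empty. It assumes a 2D process grid `pdims`, that each pencil orientation keeps its own axis whole and splits the other two axes over `pdims[0]` and `pdims[1]` in axis order, and that any remainder goes to the lowest-indexed ranks. The helper names `split_extent`, `pencil_shape`, and `has_empty_pencil` are hypothetical and exist only for this illustration.

```python
# Illustrative only: models a 2D pencil decomposition, not cuDecomp internals.

def split_extent(n, parts, idx):
    """Number of elements rank `idx` owns when `n` elements are block-split
    across `parts` ranks (assumed: remainder goes to the lowest ranks)."""
    base, rem = divmod(n, parts)
    return base + (1 if idx < rem else 0)

# Assumed mapping: for a pencil along `axis`, the two remaining global axes
# are split over pdims[0] and pdims[1], in that order.
SPLIT_AXES = {0: (1, 2), 1: (0, 2), 2: (0, 1)}

def pencil_shape(gdims, pdims, axis, coords):
    """Local (x, y, z) shape of the pencil along `axis` for the rank at
    process-grid coordinates `coords`."""
    shape = list(gdims)
    for p, ax in enumerate(SPLIT_AXES[axis]):
        shape[ax] = split_extent(gdims[ax], pdims[p], coords[p])
    return tuple(shape)

def has_empty_pencil(gdims, pdims, axis):
    """True if any rank holds zero elements in the given orientation."""
    return any(
        0 in pencil_shape(gdims, pdims, axis, (i, j))
        for i in range(pdims[0])
        for j in range(pdims[1])
    )

gdims = (8, 8, 1)   # flat N x N x 1 grid with N = 8
pdims = (2, 2)      # 2 x 2 process grid

# Check each orientation on its own, as in item 1 of the list above.
empty = {name: has_empty_pencil(gdims, pdims, ax) for ax, name in enumerate("xyz")}
print(empty)
```

Under these assumptions, only the x- and y-pencil orientations contain empty ranks, because they split the size-1 z dimension over two processes. The z-pencils keep z whole and give each rank a 4 x 4 x 1 block, so a check restricted to the orientation actually in use would accept this decomposition.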

An alternative to (3) is to have cuDecomp throw an error whenever it detects communication involving dimensions with empty pencils. This would still allow use cases like #101, but without the added complexity of empty pencil handling. This may be the preferred solution, but I'm not sure at the moment.

Signed-off-by: Josh Romero <joshr@nvidia.com>
@romerojosh
Collaborator Author

Going to close this one in favor of #103.

@romerojosh romerojosh closed this Feb 11, 2026

Development

Successfully merging this pull request may close these issues.

autotuneHaloBackend aggressively rejects valid decompositions for flat 3D grids (e.g. NxNx1)
