Use native NVSHMEM synchronization APIs in NVSHMEM backends #107

Merged
romerojosh merged 11 commits into main from nvshmem_barrier_signal on Mar 4, 2026
@romerojosh
Collaborator

This PR replaces the CPU-based MPI synchronization primitives currently used in the NVSHMEM backends with native NVSHMEM APIs. This has been found to improve performance in many cases, and it makes the NVSHMEM backends free of CPU synchronization.

In particular, the non-pipelined NVSHMEM backend replaces the existing CPU synchronization pattern (quiet -> stream synchronize -> MPI_Barrier) with a call to nvshmemx_barrier_on_stream. In the pipelined NVSHMEM backend, the existing barrier synchronization is reimplemented with signal-based APIs, which allows more targeted synchronization between only the GPU pairs involved in each stage.
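To illustrate the difference between a global barrier and pairwise signals, here is a minimal host-only sketch. It is an analogy, not the code from this PR: each std::thread stands in for one PE (GPU), a std::atomic flag stands in for an NVSHMEM signal word, and the function name run_pipelined_exchange and the ring-offset stage schedule are assumptions made for illustration only. The point it shows is that each PE waits only on the one peer that sends to it in a given stage, rather than on all PEs.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Host-only analogy (hypothetical helper, not part of the PR):
// thread "me" plays the role of PE "me".
// recv[dst * npes + src]   : slot on PE dst that PE src writes into
// signal[dst * npes + src] : flag on PE dst that PE src raises after writing
// Returns true if every PE observed the expected payload from every peer.
bool run_pipelined_exchange(int npes) {
  assert(npes > 0);
  std::vector<int> recv(static_cast<std::size_t>(npes) * npes, -1);
  std::vector<std::atomic<std::uint64_t>> signal(static_cast<std::size_t>(npes) * npes);
  for (auto& s : signal) s.store(0);
  std::atomic<bool> ok{true};

  auto worker = [&](int me) {
    // Assumed schedule: in stage s, PE me sends to (me + s) and receives from (me - s).
    for (int stage = 1; stage < npes; ++stage) {
      const int dst = (me + stage) % npes;
      const int src = (me - stage + npes) % npes;

      // "Put with signal": write the payload, then raise the flag with release
      // ordering so the payload is visible once the flag is observed.
      recv[dst * npes + me] = 100 * me + dst;
      signal[dst * npes + me].store(1, std::memory_order_release);

      // Targeted wait: block only on this stage's source PE, not on all PEs.
      while (signal[me * npes + src].load(std::memory_order_acquire) != 1) {
        std::this_thread::yield();
      }

      // Consume the data that the source PE delivered for this stage.
      if (recv[me * npes + src] != 100 * src + me) ok.store(false);
    }
  };

  std::vector<std::thread> threads;
  for (int pe = 0; pe < npes; ++pe) threads.emplace_back(worker, pe);
  for (auto& t : threads) t.join();
  return ok.load();
}
```

Calling run_pipelined_exchange(4) runs three stages across four threads and returns true. In this sketch each (destination, source) pair is used in exactly one stage, so a flag value of 1 is enough; a backend that reuses signal words across repeated operations would need to reset them or use increasing values.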

…nchornization routines. Use better scoped signal-based synchronization in pipelined backend.

Signed-off-by: Josh Romero <joshr@nvidia.com>
…rs to avoid quiet call.

Signed-off-by: Josh Romero <joshr@nvidia.com>
…e signal wait directly in consumer stream in NVSHMEM pipelined backend.

Signed-off-by: Josh Romero <joshr@nvidia.com>
Signed-off-by: Josh Romero <joshr@nvidia.com>
…vshmem_sync_event.

Signed-off-by: Josh Romero <joshr@nvidia.com>
Signed-off-by: Josh Romero <joshr@nvidia.com>
…needed conditional barrier logic in NVSHMEM pipelined backend.

Signed-off-by: Josh Romero <joshr@nvidia.com>
Signed-off-by: Josh Romero <joshr@nvidia.com>
@romerojosh
Collaborator Author

/build

@github-actions

github-actions bot commented Mar 3, 2026

🚀 Build workflow triggered! View run

@github-actions

github-actions bot commented Mar 3, 2026

❌ Build workflow failed! View run

Signed-off-by: Josh Romero <joshr@nvidia.com>
@romerojosh
Collaborator Author

/build

@github-actions

github-actions bot commented Mar 3, 2026

🚀 Build workflow triggered! View run

@github-actions

github-actions bot commented Mar 3, 2026

✅ Build workflow passed! View run

…tween ops.

Signed-off-by: Josh Romero <joshr@nvidia.com>
Signed-off-by: Josh Romero <joshr@nvidia.com>
@romerojosh
Collaborator Author

/build

@github-actions

github-actions bot commented Mar 4, 2026

🚀 Build workflow triggered! View run

@github-actions

github-actions bot commented Mar 4, 2026

✅ Build workflow passed! View run

@romerojosh romerojosh merged commit d5716ce into main Mar 4, 2026
4 checks passed
romerojosh added a commit that referenced this pull request Mar 8, 2026
romerojosh added a commit that referenced this pull request Mar 8, 2026
…107)"

This reverts commit d5716ce.

Signed-off-by: Josh Romero <joshr@nvidia.com>
romerojosh added a commit that referenced this pull request Mar 8, 2026
…107)" (#110)

This reverts commit d5716ce.

Signed-off-by: Josh Romero <joshr@nvidia.com>
romerojosh added a commit that referenced this pull request Mar 10, 2026
…ckends (#107)" (#110)"

This reverts commit e401242.

Signed-off-by: Josh Romero <joshr@nvidia.com>
romerojosh added a commit that referenced this pull request Mar 11, 2026
* Revert "Revert "Use native NVSHMEM synchronization APIs in NVSHMEM backends (#107)" (#110)"

This reverts commit e401242.

Signed-off-by: Josh Romero <joshr@nvidia.com>

* Apply NVSHMEM_CUMEM_GRANULARITY workaround for older NVSHMEM versions.

Signed-off-by: Josh Romero <joshr@nvidia.com>

* Increase NVSHMEM_CUMEM_GRANULARITY to maximum of 2 GiB.

Signed-off-by: Josh Romero <joshr@nvidia.com>

---------

Signed-off-by: Josh Romero <joshr@nvidia.com>