Enforce NVSHMEM minimum version of 2.6.0 to remove old workarounds. Update NVSHMEM usage guidance. #106
Merged
romerojosh merged 3 commits into main on Feb 23, 2026
…pdate NVSHMEM usage guidance. Signed-off-by: Josh Romero <joshr@nvidia.com>
Signed-off-by: Josh Romero <joshr@nvidia.com>
/build
🚀 Build workflow triggered!
❌ Build workflow failed!
Signed-off-by: Josh Romero <joshr@nvidia.com>
/build
🚀 Build workflow triggered!
✅ Build workflow passed!
In preparation for some upcoming work on the NVSHMEM backends, this PR clears out some outdated NVSHMEM code that worked around issues that have long since been resolved. In place of the workarounds, compile-time and runtime checks of the NVSHMEM version have been added to enforce a minimum version of 2.6.0. Given that you have to go all the way back to NVHPC SDK 22.5 to get a version older than this, I think this minimum version is reasonable.
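For readers who want a feel for what a combined compile-time and runtime minimum-version check can look like, here is a minimal sketch. It is not the code from this PR. The `SKETCH_NVSHMEM_*` macros, `version_at_least`, and `check_runtime_version` are hypothetical names, and the stand-in macros exist only so the sketch compiles without NVSHMEM installed. In a real build the values would presumably come from the version macros in the NVSHMEM headers and from the version the library reports at run time.

```cpp
#include <cstdio>

// Stand-in values so this sketch compiles without NVSHMEM installed.
// Assumption: a real build would take these from the NVSHMEM headers
// instead. The SKETCH_ names are hypothetical.
#define SKETCH_NVSHMEM_MAJOR 2
#define SKETCH_NVSHMEM_MINOR 6
#define SKETCH_NVSHMEM_PATCH 0

// True when (major, minor, patch) is at least (req_major, req_minor, req_patch).
constexpr bool version_at_least(int major, int minor, int patch,
                                int req_major, int req_minor, int req_patch) {
  return (major != req_major) ? (major > req_major)
       : (minor != req_minor) ? (minor > req_minor)
                              : (patch >= req_patch);
}

// Compile-time check: the build fails if the headers are older than 2.6.0.
static_assert(version_at_least(SKETCH_NVSHMEM_MAJOR, SKETCH_NVSHMEM_MINOR,
                               SKETCH_NVSHMEM_PATCH, 2, 6, 0),
              "NVSHMEM 2.6.0 or newer is required");

// Runtime check: the library loaded at run time can differ from the headers
// used at build time, so the caller passes in the version that the library
// itself reports.
bool check_runtime_version(int major, int minor, int patch) {
  if (!version_at_least(major, minor, patch, 2, 6, 0)) {
    std::fprintf(stderr,
                 "NVSHMEM %d.%d.%d found, but 2.6.0 or newer is required\n",
                 major, minor, patch);
    return false;
  }
  return true;
}
```

The two checks cover different failure modes: the `static_assert` catches an old toolkit at build time, while the runtime function catches an older shared library being picked up at launch.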
In addition, this PR changes the NVSHMEM usage guidance surrounding the use of NVSHMEM_DISABLE_CUDA_VMM=1. At the time we added NVSHMEM support, MPI compatibility with CUDA VMM memory was often problematic, leading to the recommendation to use that setting, but it is better supported these days. Users are still encouraged to read the NVSHMEM documentation for any existing limitations on the platforms they are running on (e.g. systems using libfabric and Slingshot networking, which still require this environment variable).