Infrastructure changes preparing for explicit graph construction #1762
Open
Andy-Jost wants to merge 2 commits into NVIDIA:main from
…work

Rename cuda/core/_graph.py to cuda/core/_graph/__init__.py to create a package that will house the explicit graph construction module alongside the existing stream-capture-based implementation.

Ref: NVIDIA#1317
Made-with: Cursor
Phase 1 groundwork for explicit CUDA graph construction (issue NVIDIA#1317):

- Add HandleRegistry template for reverse-lookup of CUDA handles back to their owning shared_ptr (via weak_ptr), enabling reconstruction of Python objects from driver-returned handles.
- Extend EventBox with metadata fields (timing_disabled, busy_waited, ipc_enabled, device_id, context) accessed via get_box() pointer arithmetic, replacing cached Python-level fields.
- Add event and kernel reverse-lookup registries for handle recovery.
- Add Event.from_handle() and Kernel reverse-lookup integration with library-mismatch warning.
- Convert _graph.py to _graph/ package (rename only, no content changes).

Closes NVIDIA#1317 (partial)
Made-with: Cursor
/ok to test b5f9970
Summary
Groundwork for explicit CUDA graph construction (#1317).
Changes
- HandleRegistry template for reverse-lookup of CUDA handles back to their owning shared_ptr via weak_ptr, enabling reconstruction of Python objects from driver-returned handles.
- EventBox metadata fields (timing_disabled, busy_waited, ipc_enabled, device_id, context) stored in C++ alongside the CUevent handle, accessed via get_box() pointer arithmetic. Replaces cached Python-level fields.
- Reverse-lookup registries for events and kernels, with automatic registration/cleanup.
- Event.from_handle() for foreign CUevent handles.
- Kernel.from_handle now uses the kernel registry, with a library-mismatch warning.
- _graph.py → _graph/__init__.py (rename only).
Test Coverage
- Tests in test_module.py for the Kernel.from_handle library-mismatch warning and foreign kernel handle wrapping.
Related Work