Jeeja 0 8 0 new#9

Open
kpjeeja wants to merge 5 commits into intel-staging:libfabric_nixl_0.8.0 from kpjeeja:jeeja_0_8_0_new
kpjeeja commented Jan 16, 2026

Migrating the intel_GPU patch to the 0.8.0 branch. It includes these patches:

  • af25ad6 libfabric: Add SynapseAI dmabuf support
  • 4d8d473 libfabric: Add clear provider specific config
  • dea93e2 libfabric: better hw detection, FI_HMEM support
  • 417cf28 libfabric: Genericize discovery. Add verbs support.
  • 98c2f77 libfabric: add SYNAPSEAI support

tsg- added 5 commits December 15, 2025 16:52
- Enable smart auto-detection of device types
- Dual-path memory registration (GDR, FI_HMEM)
- Device type as custom param (priority: envvar, backend param,
  auto-detect)
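
The priority order in the last item (envvar, then backend param, then auto-detect) could be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the environment variable name `EXAMPLE_DEVICE_TYPE`, the parameter key `device_type`, and the helpers `resolveDeviceType` and `autoDetectDeviceType` are all hypothetical names chosen for the example.

```cpp
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical names throughout: the env var, the parameter key, and the
// detection stub are illustrative and are not taken from the patch.
constexpr const char *kDeviceTypeEnvVar = "EXAMPLE_DEVICE_TYPE";
constexpr const char *kDeviceTypeParamKey = "device_type";

using BackendParams = std::map<std::string, std::string>;

// Stand-in for hardware probing; a real backend would query the system.
inline std::string autoDetectDeviceType() {
    return "auto";
}

// Returns the first non-empty source in priority order:
// 1. environment variable, 2. backend parameter, 3. auto-detection.
inline std::string resolveDeviceType(const BackendParams &params) {
    if (const char *env = std::getenv(kDeviceTypeEnvVar); env && *env != '\0') {
        return env;
    }
    if (auto it = params.find(kDeviceTypeParamKey);
        it != params.end() && !it->second.empty()) {
        return it->second;
    }
    return autoDetectDeviceType();
}
```

With this shape, an operator can override the device type per process through the environment without touching the backend configuration, and auto-detection only runs when neither explicit source is set.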
yafshar pushed a commit that referenced this pull request Jun 10, 2026
* DDN Infinia NIXL plugin

commit 849540b737a961217c5a6d43582ed440a3729885
Merge: 59147d8f 187e0514
Author: Keyur Desai <kdesai@ddn.com>
Date:   Mon Apr 20 12:17:57 2026 -0400

    Merge pull request ai-dynamo#23 from 3rdParty/upstream_merge

    Upstream merge

commit 187e05148fdc616e2b20a13bb8969f7a088c2554
Merge: 59147d8f 4d80978
Author: Keyur Desai <kdesai@ddn.com>
Date:   Mon Apr 20 16:13:05 2026 +0000

    Merge remote-tracking branch 'upstream/main' into upstream_merge

commit 59147d8f05a2fcdf1afb5b010d2065ef1a8ed8b5
Merge: d263402c 14baff3b
Author: Keyur Desai <kdesai@ddn.com>
Date:   Mon Apr 20 10:31:22 2026 -0400

    Merge pull request ai-dynamo#22 from 3rdParty/RED-39451-1

    Cleaned up documentation and trimmed README

commit 14baff3b61977e3649d62fc0114e00199c22a211
Author: Keyur Desai <kdesai@ddn.com>
Date:   Mon Apr 20 13:55:58 2026 +0000

    Cleaned up documentation and trimmed README

commit d263402c1a2dea1fbf467555c9e5d3be28ee470c
Merge: f755d40e 152d6d50
Author: Keyur Desai <kdesai@ddn.com>
Date:   Fri Apr 17 16:59:17 2026 -0400

    Merge pull request ai-dynamo#21 from 3rdParty/RED-39451

    Convert DDN license to Apache 2 license

commit 152d6d50ce21ea53284ab35b54ba5ba4e4bd7380
Merge: d2c8e1c5 f755d40e
Author: Keyur Desai <kdesai@ddn.com>
Date:   Fri Apr 17 16:32:17 2026 -0400

    Merge branch 'main' into RED-39451

commit f755d40e8a8ea8113ded75303df0de40f1ae750b
Author: Joseph Skazinski <jskazinski@ddn.com>
Date:   Fri Apr 17 12:57:21 2026 -0700

    Simplify registerMem and use BatchTask<> (ai-dynamo#20)

    * Simplify registerMem and use BatchTask<>

    - Refactored registerMem to eliminate duplication and clarify logic
    - Switched xfer requests to use new BatchTask<> and removed old vector of BatchRequest
    - Added reserveOperations() to prevent vector reallocation issues
    - Validated with up to 10K operations per batch

    * fix(infinia): remove deprecated red_async_executor_t usage

    Update plugin for red_async.hpp API changes:
    - Remove red_async_executor_t (no longer exists in new API)
    - Fix RED_ASYNC_DEFAULT_MAX_RETRIES constant name
    - Correct memory type check in deregisterMem (VRAM_SEG/DRAM_SEG only)

    * infinia: Implement queryMem with RED async API and enhance test output

    - Implement queryMem() using red_async::BatchTask with HEAD operations
    - Add timing instrumentation for all test phases
    - Standardize data/throughput display to MiB/MiB/s
    - Add test summary with keys, object size, and phase breakdown
    - Update plugin documentation with async API patterns

    * Add RED async config parameters to INFINIA backend

    Support sthreads, num_buffers, num_ring_entries, and coremask
    configuration from infinia.conf. Update backend, client, tests,
    and documentation with new defaults and parsing logic.

    * enhance INFINIA init message

    * allow an empty string for coremask

commit d2c8e1c513ad3a9a84ebb88e7ba4bc48bedba40b
Author: Keyur Desai <kdesai@ddn.com>
Date:   Wed Apr 15 20:05:14 2026 +0000

    Convert DDN license to Apache 2 license

commit 0cdb085a1cc9f7c552eca862ca2ac6002a08de42
Author: Joseph Skazinski <jskazinski@ddn.com>
Date:   Mon Apr 13 10:42:08 2026 -0700

    RED-40574: Infinia plugin - remove mutex, enhance tracing, optimize (ai-dynamo#19)

commit b6025814c063e99143d7b0fc6eeb2336e26be29d
Author: Joseph Skazinski <jskazinski@ddn.com>
Date:   Fri Apr 10 07:24:31 2026 -0700

    remove chunking from infinia backend to storage (ai-dynamo#18)

    * remove chunking from infinia backend to storage

    * revert changes to nixl worker

    * revert changes to nixl worker

    * incorporate hpeng changes and additional cleanup

commit 9f288d5dc7ba6c6d7eef1e8047171b44c7fc02b5
Merge: 7b55446c 13da437d
Author: Hongbo Peng <hpeng@ddn.com>
Date:   Wed Apr 8 09:58:01 2026 +0800

    Merge pull request ai-dynamo#17 from hpeng/RED-39914

    move memory registration/deregistration to registerMem/deregisterMem

commit 13da437d6f89009691727c6ee4fa7e4b3215375b
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Tue Apr 7 04:52:02 2026 +0000

    move memory registration/deregistration to registerMem/deregisterMem instead of prepXfer

commit 7b55446cfbd723e4a2b212daf85b96d3d764fbe8
Author: Joseph Skazinski <jskazinski@ddn.com>
Date:   Mon Apr 6 10:45:55 2026 -0700

    fix(infinia): thread-safe devId map, OBJ_SEG-only storage, remove uns… (ai-dynamo#16)

    * fix(infinia): thread-safe devId map, OBJ_SEG-only storage, remove unsafe cast

    * Use NIXL_DEBUG level based on review comments

commit 9e3d4d7cd0ae1d022daaf3e966b5e211c24e9607
Merge: 7254c9c8 ea4d0e70
Author: Keyur Desai <kdesai@ddn.com>
Date:   Mon Mar 23 22:49:30 2026 -0400

    Merge pull request ai-dynamo#15 from skocol/sean/performance-optimizations

    nixlInfiniaBackendReqH::postTransfer: avoid large copy when crossing a thread boundary

commit ea4d0e70613d777fb04180f8283bc47a019393f5
Author: Sean Kocol <skocol@ddn.com>
Date:   Tue Mar 24 00:48:51 2026 +0000

    Avoid spawning a thread for each xfer request

commit 7254c9c81491d5d3c2c5edb3fd0126a830d40c53
Merge: edd3ca6e 201ff61a
Author: Keyur Desai <kdesai@ddn.com>
Date:   Mon Mar 23 15:17:52 2026 -0400

    Merge pull request ai-dynamo#14 from 3rdParty/RED-39416

    Add config-file parse support to infinia NIXL plugin

commit 201ff61a0ddf176c476c161f6288411d310cba50
Author: Keyur Desai <kdesai@ddn.com>
Date:   Sun Mar 22 03:54:51 2026 +0000

    Add --config-file parse support to infinia NIXL plugin

commit edd3ca6e4f04e76a27cb6ab4d1db27b4fb61f376
Merge: dd555160 859b059e
Author: Keyur Desai <kdesai@ddn.com>
Date:   Fri Mar 20 20:03:03 2026 -0400

    Merge pull request ai-dynamo#12 from 3rdParty/RED-39416

    RED-39416: Fix broken nixlbench after async lib integration

commit 859b059e6576b04a47abaee0d0d142e6b23719ad
Author: Keyur Desai <kdesai@ddn.com>
Date:   Fri Mar 20 13:48:21 2026 +0000

    RED-39416: Fix broken nixlbench after async lib integration

commit dd555160a78f4812aeeef1ff64c2f4d03e46c7bf
Author: Joseph Skazinski <jskazinski@ddn.com>
Date:   Thu Mar 19 21:09:27 2026 -0700

    RED-38463: async lib improvements (#11)

    * RED-38463: {async_lib} library performance improvements

    * RED-38463: {async_lib} readme doc updates

    * RED-38463: {async_lib} update nixl plugin hpeng with suggested changes

    * refactor(infinia): use red_async_executor_t for batching

    Replace manual batching with library's batch executor API.
    Adds auto-tuning, progress callbacks, and 6 tuning parameters.
    Fixes queue exhaustion errors, 6x performance improvement.

    * feat(infinia): add GPU Direct Storage for RDMA GPU-to-storage transfers

    Auto-detect VRAM, register with register_gpu_memory(), fallback to CPU staging if unavailable

    * RED-38463: fix(infinia): add executor null checks for GPU memory cleanup

    * RED-38463: fix(infinia): add red_async:: namespace qualifiers

    * RED-38463: {async_lib} renamed red_async.hpp

    * RED-38463: correct argument parsing for INFINIA plugin

commit 80ab90136b8502999f9a9ff987fbb3e6e468935f
Merge: d94e662d a795db55
Author: Keyur Desai <kdesai@ddn.com>
Date:   Thu Mar 5 18:06:07 2026 -0500

    Merge pull request #9 from 3rdParty/RED-26691

    Initial commit of NIXL plugin with async kv interface

commit a795db553f55852f0416b9afc7313605b176d424
Author: Keyur Desai <kdesai@ddn.com>
Date:   Tue Feb 24 18:25:40 2026 +0000

    Initial commit of NIXL plugin with async kv interface

commit d94e662d38c7dd1fd79f52129bc3af2042a1ffeb
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Tue Feb 24 03:55:06 2026 +0000

    disable infinia backend as there is no red_aisdk lib for now

commit b05ef984fb51feaba3cca27e4e587be7ff54a508
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Tue Feb 24 03:54:20 2026 +0000

    add dependency for nixl common to support abseil path

commit 4391f688e4e7a417a0aeb9d4d898749c4f6e3b1d
Merge: 3edd527a 77b9ab7
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Tue Feb 24 03:13:58 2026 +0000

    Merge remote-tracking branch 'upstream/main'

commit 3edd527a32557c8b312c6b780eed8c3666aa69fa
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Tue Feb 24 03:05:51 2026 +0000

    Revert "Merge pull request #6 from ziliu/feat/nixlbench-support-infinia-plugin"

    This reverts commit b1c788c76aa44c25517d0074e6aa1c5888ee15f6, reversing
    changes made to 53d00bc661e030c27bac84420276d71f10470108.

commit b1c788c76aa44c25517d0074e6aa1c5888ee15f6
Merge: 53d00bc6 62e9c3cd
Author: Keyur Desai <kdesai@ddn.com>
Date:   Thu Jul 10 22:20:54 2025 -0400

    Merge pull request #6 from ziliu/feat/nixlbench-support-infinia-plugin

    nixlbench support infinia plugin

commit 62e9c3cd919e5882bb0b5bcf751c1b59054a8c41
Author: Zirui Liu <ziliu@ddn.com>
Date:   Fri Jul 11 02:11:03 2025 +0000

    add explanation to populate()

commit 4eeaef84231e4db5f0426764d95a2ccf875dc28f
Author: Zirui Liu <ziliu@ddn.com>
Date:   Thu Jul 3 10:56:43 2025 +0000

    fix populate() to avoid unnecessary offset check

commit f0b8e625d68cf59c7cc2bba73bf64746ef69218d
Author: Zirui Liu <ziliu@ddn.com>
Date:   Tue Jul 1 14:29:35 2025 +0000

    generate the same series of random keys

commit 22bb0e67803d2805732b52bf24712171118217b4
Author: Zirui Liu <ziliu@ddn.com>
Date:   Tue Jul 1 04:07:21 2025 +0000

    clean up comments

commit 66aa8d89d6762c72951d3c1a3d92acdd180ae135
Author: Zirui Liu <ziliu@ddn.com>
Date:   Tue Jul 1 01:52:34 2025 +0000

    additional changes to exchangeMetadata and exchangeIOV

commit 2559a2b784ea154d095ca9d91097c82a6375813e
Author: Zirui Liu <ziliu@ddn.com>
Date:   Fri Jun 27 09:10:03 2025 +0000

    initialize keys to support infinia plugin

commit 5b231682072a7e4e780056684f16b4b9248d9a8b
Author: Zirui Liu <ziliu@ddn.com>
Date:   Thu Jun 26 17:15:47 2025 +0000

    add micros for infinia

commit 8c9fae653738704e5155a3ccd1812f99cc20185f
Author: Zirui Liu <ziliu@ddn.com>
Date:   Thu Jun 26 17:14:56 2025 +0000

    add empty functions to include infinia backend. todo: initialize keys and values

commit 53d00bc661e030c27bac84420276d71f10470108
Merge: 22b88867 e704cf35
Author: Hongbo Peng <hpeng@ddn.com>
Date:   Wed Jun 25 09:57:12 2025 +0800

    Merge pull request #5 from hpeng/RED-27683

    eliminate memcpy in the infinia Nixl plugin and aisdk code path

commit e704cf350c410a4f599a3f232d9ff64e9ab48b81
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Tue Jun 24 16:00:32 2025 +0000

    rename red_aisdk_client_get/put for iomem

commit ffdb0decc06ea633a725d49e5b5440ed8a5cb56e
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Tue Jun 24 15:04:03 2025 +0000

    fix typo for GET

commit f9ad22f966fd9fb1a374b9b3a418ad8a8b498bc4
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Tue Jun 24 07:25:15 2025 +0000

    eliminate memcpy in the infinia Nixl plugin and aisdk code path
    free allocated mem in error case

commit 22b8886757d6bdf6580a9a434d06537b926e0786
Merge: ce7ff3a4 d3dc1953
Author: Hongbo Peng <hpeng@ddn.com>
Date:   Fri Jun 20 08:52:32 2025 +0800

    Merge pull request #4 from hpeng/RED-27294-2

    remove unsupported opts and replace GDS with Infinia

commit d3dc19536d8039d3a177c9ac30076366669e4fe6
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Thu Jun 19 04:33:33 2025 +0000

    remove unsupported opts and replace GDS with Infinia

commit ce7ff3a40c3a9efde3b24cedd39291cd870003b4
Merge: 3d41192a ab048cd8
Author: Hongbo Peng <hpeng@ddn.com>
Date:   Wed Jun 18 16:22:00 2025 +0800

    Merge pull request #3 from hpeng/RED-27294

    RED-27294 Infinia support in nixl plugin with enhanced unit test

commit ab048cd8abdc4d14c4aaf43641be91871cf36f44
Merge: 0114d347 249833d2
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Wed Jun 18 16:18:15 2025 +0800

    Merge branch 'main' into RED-27294 to resolve conflicts

commit 249833d20a1b6a7ce65fe63d3847be638da082b3
Merge: 3d41192a 0a60245a
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Wed Jun 18 15:29:13 2025 +0800

    Merge branch 'hpeng-RED-27294'

commit 0a60245a2d8176912108c7e06de1d9fc3e575447
Merge: 3d41192a 0114d347
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Wed Jun 18 15:25:57 2025 +0800

    Merge branch 'RED-27294' of github.red.datadirectnet.com:hpeng/nixl into hpeng-RED-27294

commit 0114d347ab6f1ba029235c74a2e6d1fa846a2325
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Wed Jun 18 06:01:50 2025 +0000

    replace printf with std::cerr

commit 7fa6691a9f8b4201825eaa7e4e9f41df6657d5ad
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Wed Jun 18 05:57:50 2025 +0000

    remove tail whitespace and TAB

commit 7c9185ca1b174f09f30ece8b6ca8db3f531ada4b
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Wed Jun 18 05:46:41 2025 +0000

    red_kv_put does NOT support append. Set maximum transfer size to 10M.

commit 59bab12284d3fd353cb59247a4824c1cd1da7ee9
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Wed Jun 18 05:12:05 2025 +0000

    set cpp_args correctly

commit 358663db870b07bf1e5cad1f54d717f23dd7f39d
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Wed Jun 18 04:52:46 2025 +0000

    replace TAB with whitespace
    set separate seed for key generator

commit bdd7e29610404e77f8fff956619ec1d49833a1aa
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Wed Jun 18 04:18:42 2025 +0000

    fix typo

commit 884c929bbedc232463aa7e74907df413d44657e2
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Wed Jun 18 04:10:25 2025 +0000

    Update return code from red_aisdk
    remove Async code branch

commit 01f76922034aafdf4121903eee353b90a0c1120c
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Wed Jun 18 02:48:05 2025 +0000

    remove unsupported options

commit 61782a9352cbb541b32d49780dfd59bef25a63dc
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Tue Jun 17 05:41:29 2025 +0000

    generate reproducible uuid for test and remove uuid dep

commit 9007afeb68cb9182f012d40ee1fe3a0c3db5ae90
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Tue Jun 17 05:05:51 2025 +0000

    no support to skip read/write

commit 5cb8bed42841fee64c1459cf23c726c8cc6d8975
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Tue Jun 17 03:11:50 2025 +0000

    rename macro USE_VRAM as HAVE_CUDA

commit 6c928dde7951189989d7ba96a9a49d06e17fe328
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Tue Jun 17 02:33:28 2025 +0000

    add macro USE_VRAM to run on DRAM only

commit 32c8798de6c0588e6b78ddebd7598eae401f8614
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Tue Jun 17 02:21:37 2025 +0000

    Code cleanup for commit

commit 8714093925b7dcab39d159d9883b73b0ab387e4a
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Tue Jun 17 02:20:26 2025 +0000

    Move check from postXfer to preXfer
    Code cleanup for commit

commit 747b552688afdbfd40a78f80546378f96f149bc4
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Fri Jun 13 02:06:44 2025 +0000

    Add more args for the Infinia backend unit test to support
        different size of keys
        different numbers of reqs
        multiple iters

commit be8778de311bed41d9a0039393d65e2026a119fd
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Fri Jun 13 02:05:50 2025 +0000

    Remove the memcpy for DRAM.
    Replace gds with infinia

commit 3d41192a7b5bc6d63f977f521d1222ea8c1e6aa9
Merge: a5206749 dbc32b75
Author: Keyur Desai <kdesai@ddn.com>
Date:   Mon Jun 9 22:42:41 2025 -0400

    Merge pull request #2 from hpeng/RED-25299-2

    code update to sync with upstream

commit dbc32b75572a556e1247d12c5a2728b680bda006
Author: Hongbo PENG <hpeng@ddn.com>
Date:   Mon Jun 9 07:38:32 2025 +0000

    1. code update for upstream changes: ai-dynamo#295
    2. fix typos

commit a520674908464ce3235bb03da90de9c470c631bf
Merge: e0b34a3 a3dcf962
Author: Keyur Desai <kdesai@ddn.com>
Date:   Tue May 27 22:00:17 2025 -0400

    Merge pull request #1 from kdesai/RED-25299

    Initial commit of NIXL Infinia backend plugin

commit a3dcf9624bcaa876adec21408c4827aa45a3fcca
Author: Keyur Desai <kdesai@bb-4.vms.virts.svc.devintel2.local>
Date:   Fri May 23 02:36:55 2025 +0000

    Initial commit of NIXL Infinia backend plugin

* SQUASHME: Addressed code review comments 1.

* SQUASHME: clang and precommit hook fix

* SQUASHME: GPU direct fix

* SQUASHME: Addressed 2nd round of coderabbit comments

* SQUASHME: Addressed comments from Adit

* SQUASHME: clang fix

* SQUASHME: Addressed Adit's comment on remote support

* SQUASHME: Addressed comments from Colin - part 2

* SQUASHME: clang fixes

* SQUASHME: Removed unnecessary StrFormat

* SQUASHME: Removed C++ 20 override from Infinia meson.build

* SQUASHME: removed format, as requested by Adit

---------

Co-authored-by: Adit Ranadive <aranadive@nvidia.com>