Pull requests: SemiAnalysisAI/InferenceX
[AMD] MiniMax-M3 FP4 MI355X vLLM STP: close gap vs ATOM (INT4 all-reduce + index-sharing + AR fusion)
#1969
opened Jul 1, 2026 by
Fangzhou-Ai
Collaborator
Draft
[AMD] Update MiniMax-M3 FP8 MI355X ATOM image and serving args (0630)
AMD
full-sweep-enabled
#1968
opened Jul 1, 2026 by
seungrokj
Collaborator
8 tasks
[AMD] Update MiniMax-M3 FP4 MI355X ATOM image and serving args (0630)
AMD
full-sweep-enabled
#1967
opened Jul 1, 2026 by
seungrokj
Collaborator
8 tasks
[AMD] Enable AITER MoE for MiniMax-M3 FP4 MI355X vLLM MTP (fix EP startup hang)
full-sweep-fail-fast
#1964
opened Jun 30, 2026 by
Fangzhou-Ai
Collaborator
Add DSV4 FP4 B200 Dynamo-vLLM point-specific disagg recipes
full-sweep-enabled
#1963
opened Jun 30, 2026 by
RohitNagraj
Collaborator
Update DSR1 B200 FP4 SGLang MTP config (image + low-latency search space)
full-sweep-enabled
#1962
opened Jun 30, 2026 by
RohitNagraj
Collaborator
test the GB300 cluster after the node patch
full-sweep-enabled
#1961
opened Jun 30, 2026 by
richardhuo-nv
Collaborator
chore(deps): bump the github-actions group across 1 directory with 2 updates
dependencies
Pull requests that update a dependency file
github_actions
Pull requests that update GitHub Actions code
#1960
opened Jun 30, 2026 by
dependabot
Bot
[Klaud Cold] [AMD] Enable AITER MoE for MiniMax-M3 MI355X FP4 vLLM MTP benchmark
full-sweep-fail-fast
#1958
opened Jun 30, 2026 by
functionstackx
Collaborator
Update Qwen3.5 FP4 MI355X MTP recipe with tuned env/flags
#1957
opened Jun 29, 2026 by
amd-fuyuajin
Collaborator
[merging June 30 at 4pm PT] making this an hard guideline & enforcing consistent reviews on upstream sglang/vllm docker repo to PR CheckList
#1956
opened Jun 29, 2026 by
functionstackx
Collaborator
[AMD] Enable AITER MoE for MiniMax-M3 MI355X vLLM MTP benchmarks
#1955
opened Jun 29, 2026 by
Fangzhou-Ai
Collaborator
Draft
2 of 3 tasks
[AMD] Tune MiniMax-M3 MXFP8 MI300X vLLM: async scheduling + big-prefill, fix conc256 EP8→EP1
full-sweep-enabled
#1951
opened Jun 29, 2026 by
ZhengGong-amd
Collaborator
7 of 8 tasks
[AMD] Update MiniMax-M3-MXFP4 MI355X vLLM disagg perf and config
full-sweep-enabled
#1943
opened Jun 26, 2026 by
Duyi-Wang
Collaborator
[AMD] Add MiniMax-M3-FP4 MI355X ATOMESH update 0623
AMD
evals-only
Suppress throughput and run only eval jobs; combine with all-evals to expand selection
#1940
opened Jun 26, 2026 by
seungrokj
Collaborator
8 tasks
Add MiniMax-M3 MXFP8 B300 1k/1k sweep and update image
full-sweep-enabled
#1937
opened Jun 25, 2026 by
RohitNagraj
Collaborator
Add MiniMax-M3 NVFP4 B200 single-node vLLM benchmark (EAGLE3 spec decode)
full-sweep-enabled
#1933
opened Jun 25, 2026 by
Ankur-singh
Collaborator
[AMD] Add MiniMax-M3-FP8 MI355X ATOMESH update 0623
AMD
evals-only
Suppress throughput and run only eval jobs; combine with all-evals to expand selection
#1930
opened Jun 25, 2026 by
seungrokj
Collaborator
8 tasks
[Do Not Merge][NV] dsv4-fp4-b200 sglang image to nightly
full-sweep-enabled
#1923
opened Jun 24, 2026 by
hshrivastava-droid
Collaborator
Add GLM-5-FP8 GB300 multinode dynamo-sglang MTP benchmark
full-sweep-enabled
#1907
opened Jun 23, 2026 by
hshrivastava-droid
Collaborator