Pull requests: radixark/miles
Pull requests list
[AMD CI] [1/N] Support Python 3.10 on ROCm base
#1339
opened Jun 12, 2026 by
XinyuJiangCMU
Contributor
[fix] quantizer_fp8: emit deepgemm ue8m0 scales only when dense GEMM …
#1338
opened Jun 12, 2026 by
xiuhu17
Contributor
[weight checker] enable in all CIs + ULP-based quant error tolerance
run-ci-image
#1336
opened Jun 12, 2026 by
yueming-yuan
Collaborator
1 task
fix(rollout): drain engines before offload memory release
#1335
opened Jun 12, 2026 by
EazyReal
fix(ray): retry transient ActorUnavailableError at bringup
#1333
opened Jun 12, 2026 by
EazyReal
fix(sglang): authenticate control-plane and router calls
#1332
opened Jun 12, 2026 by
EazyReal
[refactor] use begin/end_weight_update instead of post_process_weights
run-ci-low-precision
run-ci-megatron
#1329
opened Jun 12, 2026 by
yueming-yuan
Collaborator
fix(chat-template): harden tool-call argument decoding
#1327
opened Jun 12, 2026 by
EazyReal
fix(metrics): group pass-rate by real sample identity
#1326
opened Jun 12, 2026 by
EazyReal
fix(rollout): apply rollout sample filter in the manager
#1324
opened Jun 12, 2026 by
EazyReal
[OPD] [4/N] Teacher ensembles + exact tail-bucket top-k KL + scoring robustness
#1322
opened Jun 11, 2026 by
maocheng23
Contributor
ROCm/support test_deepep_fp8: e2e docs, aiter/sglang patches, mori rollout harness on gfx950
#1320
opened Jun 11, 2026 by
kailashg26
Draft
feat: add FlashQLA backend for Qwen GDN linear-attention layers
#1318
opened Jun 11, 2026 by
Zhichenzzz
Contributor
fix: load Qwen 3.5 checkpoint with unfused experts
#1317
opened Jun 10, 2026 by
lawrence-harmonic
Contributor
[OPD] [3/N] Multi-teacher routing: per-sample teacher selection via --opd-teacher-urls
#1314
opened Jun 9, 2026 by
maocheng23
Contributor
fix(qwen3-vl): per-segment mRoPE + vision under CP + THD packing
#1308
opened Jun 8, 2026 by
Zhichenzzz
Contributor
fix(mtp): track megatron mtp_model_layer rename in raw converters
#1307
opened Jun 8, 2026 by
Zhichenzzz
Contributor
DO NOT MERGE: CI test
run-ci-model-scripts
Run model script smoke tests
#1306
opened Jun 8, 2026 by
yueming-yuan
Collaborator