[bug] Switch to reload_weights API for loading weights in legacy inference codepath#1685

Merged
erictang000 merged 1 commit into main from fix-load-weights-moe on Jun 1, 2026

Conversation

@SumanthRH
Member

What does this PR do?

Fixes #1680

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
@SumanthRH SumanthRH marked this pull request as ready for review May 18, 2026 05:33
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the weight loading logic in vllm_worker.py by introducing the set_current_vllm_config context manager and switching to model_runner.reload_weights. Feedback suggests optimizing memory efficiency by passing the weight generator directly to the loading function instead of accumulating tensors in an intermediate list.
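To illustrate the memory point in the review summary, here is a minimal sketch in plain Python. It does not use vLLM or the code from this PR. `Tracker`, `receive_weights`, and `consume` are hypothetical stand-ins, and small `bytearray` buffers stand in for tensors. The sketch only models how many buffers are held at once when an iterator is first collected into a list versus passed straight to the consumer.

```python
from typing import Iterable, Iterator, List, Tuple


class Tracker:
    """Bookkeeping model: counts how many weight buffers are held at once."""

    def __init__(self) -> None:
        self.live = 0
        self.peak = 0

    def acquire(self) -> None:
        self.live += 1
        self.peak = max(self.peak, self.live)

    def release(self) -> None:
        self.live -= 1


def receive_weights(
    names: List[str], tracker: Tracker
) -> Iterator[Tuple[str, bytearray]]:
    """Hypothetical source: materializes one (name, buffer) pair per step."""
    for name in names:
        tracker.acquire()
        yield name, bytearray(4)


def consume(
    weights: Iterable[Tuple[str, bytearray]], tracker: Tracker
) -> List[str]:
    """Hypothetical loader: uses each buffer once, then marks it as done."""
    loaded = []
    for name, _buf in weights:
        loaded.append(name)
        tracker.release()
    return loaded


names = ["layer0.w", "layer1.w", "layer2.w"]

# Eager: every buffer is materialized into a list before loading starts.
eager = Tracker()
consume(list(receive_weights(names, eager)), eager)

# Streaming: the generator is handed over directly, one buffer at a time.
streaming = Tracker()
consume(receive_weights(names, streaming), streaming)

print(eager.peak, streaming.peak)
```

Under these assumptions the eager path holds all three buffers at its peak, while the streaming path holds one. Whether the same saving applies in vllm_worker.py depends on whether the real loading function accepts a lazy iterable, which this sketch does not verify.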

Comment thread skyrl/backends/skyrl_train/inference_servers/vllm_worker.py
@erictang000 erictang000 merged commit db42526 into main Jun 1, 2026
9 of 10 checks passed
@erictang000 erictang000 deleted the fix-load-weights-moe branch June 1, 2026 23:32
Development

Successfully merging this pull request may close these issues.

WorkerWrap.load_weights silently corrupts unquantized MoE rollouts after the first weight sync since vllm>=0.20.0

2 participants