[train] Use custom wheel for vllm-router for `/chat/completions` fix by SumanthRH · Pull Request #1601 · NovaSky-AI/SkyRL

SumanthRH · 2026-04-30T20:39:34Z

What does this PR do?

Addresses the issue with /chat/completions in the new inference codepath #1591 .

The issue is that vllm-router drops extra arguments (i.e arguments not in the OpenAI spec). I made a PR to fix this: vllm-project/router#162

While we wait for a new vllm-router release, we should ensure that SkyRL integrations that rely on /chat/completions don't break because of this.

This PR moves our vllm-router dependency to use a custom wheel built with cherry-picking the fix on top of the latest release (0.1.14). Wheel is currently built just for x86_64 arch for now.

Signed-off-by: SumanthRH <sumanthrh@anyscale.com>

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 1 additional finding.

gemini-code-assist

Code Review

This pull request introduces a custom wheel for vllm-router to resolve an issue with the /chat/completions endpoint. The review feedback points out an inconsistency: the wheel is restricted to x86_64 architectures, which may cause users on other Linux platforms (such as ARM64) to unknowingly use the buggy version from PyPI. It is recommended to use a git source or provide wheels for other architectures to ensure the fix is applied consistently across all supported platforms.

SumanthRH · 2026-05-01T19:22:43Z

It looks like the official PyPI wheel has some additional steps to package the wheel, and unfortunately the steps are not listed in the README.

Seeing the following errors on CI (inspecting the vllm-router logs):

Process vllm-router:
Traceback (most recent call last):
  File "/home/ray/anaconda3/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/home/ray/anaconda3/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ray/default/skyrl/backends/skyrl_train/inference_servers/vllm_router.py", line 44, in _run_router_with_logging
    launch_router(router_args)
  File "/home/ray/.cache/uv/builds-v0/.tmpkuayMU/lib/python3.12/site-packages/vllm_router/launch_router.py", line 52, in launch_router
    raise e
  File "/home/ray/.cache/uv/builds-v0/.tmpkuayMU/lib/python3.12/site-packages/vllm_router/launch_router.py", line 45, in launch_router
    raise RuntimeError("Rust Router is not installed")
RuntimeError: Rust Router is not installed

Signed-off-by: SumanthRH <sumanthrh@anyscale.com>

SumanthRH · 2026-05-16T05:34:33Z

GPU CI is passing: https://github.com/NovaSky-AI/SkyRL/actions/runs/25903325504/job/76131304194?pr=1601

I believe we are good to merge. For now. Will track building a wheel for ARM64 in an issue.

Signed-off-by: SumanthRH <sumanthrh@anyscale.com>

SumanthRH · 2026-06-07T06:33:46Z

GPU CI test failures after merge from main are unrelated : https://github.com/NovaSky-AI/SkyRL/actions/runs/26924876368/job/79432790863

I've done a manual run and all tests pass with the new wheel.

# What does this PR do? Excludes nixl cu13 from the packages. After #1601 , vllm model resolution fails with ```bash , ip=10.0.53.33, actor_id=4bb0516283686471f260700b04000000, repr=<skyrl.backends.skyrl_train.inference_servers.vllm_server_actor.VLLMServerActor object at 0x706173dea0f0>) File "/home/ray/anaconda3/lib/python3.12/concurrent/futures/_base.py", line 449, in result return self.__get_result() ^^^^^^^^^^^^^^^^^^^ File "/home/ray/anaconda3/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result raise self._exception ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/tmp/ray/session_2026-06-07_06-36-06_373207_3903/runtime_resources/working_dir_files/s3_anyscale-production-data-cld-hxkifz7xa22mwicp21nzkds1lw_org_xc6lv84h3d7m9dljcc17esfw2i_cld_hxkifz7xa22mwicp21nzkds1lw_runtime_env_packages_pkg_78d2d462c60fbf0f44597ff416ba6c12/skyrl/backends/skyrl_train/inference_servers/vllm_server_actor.py", line 279, in start await self._wait_until_healthy() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/tmp/ray/session_2026-06-07_06-36-06_373207_3903/runtime_resources/working_dir_files/s3_anyscale-production-data-cld-hxkifz7xa22mwicp21nzkds1lw_org_xc6lv84h3d7m9dljcc17esfw2i_cld_hxkifz7xa22mwicp21nzkds1lw_runtime_env_packages_pkg_78d2d462c60fbf0f44597ff416ba6c12/skyrl/backends/skyrl_train/inference_servers/vllm_server_actor.py", line 294, in _wait_until_healthy raise exc ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/tmp/ray/session_2026-06-07_06-36-06_373207_3903/runtime_resources/working_dir_files/s3_anyscale-production-data-cld-hxkifz7xa22mwicp21nzkds1lw_org_xc6lv84h3d7m9dljcc17esfw2i_cld_hxkifz7xa22mwicp21nzkds1lw_runtime_env_packages_pkg_78d2d462c60fbf0f44597ff416ba6c12/skyrl/backends/skyrl_train/inference_servers/vllm_server_actor.py", line 335, in _run_server self._engine = AsyncLLMEngine.from_engine_args( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/ray/.cache/uv/builds-v0/.tmpHNSj9v/lib/python3.12/site-packages/vllm/v1/engine/async_llm.py", line 242, in from_engine_args vllm_config = engine_args.create_engine_config(usage_context) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/ray/.cache/uv/builds-v0/.tmpHNSj9v/lib/python3.12/site-packages/vllm/engine/arg_utils.py", line 1627, in create_engine_config model_config = self.create_model_config() ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/ray/.cache/uv/builds-v0/.tmpHNSj9v/lib/python3.12/site-packages/vllm/engine/arg_utils.py", line 1475, in create_model_config return ModelConfig( ^^^^^^^^^^^^ File "/home/ray/.cache/uv/builds-v0/.tmpHNSj9v/lib/python3.12/site-packages/pydantic/_internal/_dataclasses.py", line 121, in __init__ s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s) pydantic_core._pydantic_core.ValidationError: 1 validation error for ModelConfig Value error, Model architectures ['Qwen3MoeForCausalLM'] failed to be inspected. Please check the logs for more details. [type=value_error, input_value=ArgsKwargs((), {'model': ...nderer_num_workers': 1}), input_type=ArgsKwargs] For further information visit https://errors.pydantic.dev/2.13/v/value_error ``` The root cause is the `nixl` package. The `nixl` package on PyPI is a meta package that includes both `nixl-cu12` and `nixl-cu13`. `nixl-cu13` ships a binary that breaks vllm model inspection. I haven't figured out why the vllm-router upgrade triggered this, but the fix seems to work Signed-off-by: SumanthRH <sumanthrh@anyscale.com>

use custom wheel for vllm-router for /chat/completions fix

d2dbc86

Signed-off-by: SumanthRH <sumanthrh@anyscale.com>

SumanthRH marked this pull request as ready for review April 30, 2026 20:40

SumanthRH added the run_train_gpu_ci label Apr 30, 2026

devin-ai-integration Bot reviewed Apr 30, 2026

View reviewed changes

gemini-code-assist Bot reviewed Apr 30, 2026

View reviewed changes

Comment thread pyproject.toml Outdated

x

0d8850f

Signed-off-by: SumanthRH <sumanthrh@anyscale.com>

SumanthRH added run_train_gpu_ci run_train_megatron_gpu_ci and removed run_train_gpu_ci labels May 15, 2026

Merge remote-tracking branch 'origin/main' into use-patched-vllm-router

4addedf

Signed-off-by: SumanthRH <sumanthrh@anyscale.com>

SumanthRH added run_train_gpu_ci and removed run_train_gpu_ci labels Jun 4, 2026

Merge remote-tracking branch 'origin/main' into use-patched-vllm-router

82e5710

Signed-off-by: SumanthRH <sumanthrh@anyscale.com>

SumanthRH merged commit 7d325f2 into main Jun 7, 2026
4 of 5 checks passed

SumanthRH mentioned this pull request Jun 7, 2026

[chore] Exclude nixl-cu13 from packages #1756

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[train] Use custom wheel for vllm-router for `/chat/completions` fix#1601

[train] Use custom wheel for vllm-router for `/chat/completions` fix#1601
SumanthRH merged 4 commits into
mainfrom
use-patched-vllm-router

SumanthRH commented Apr 30, 2026 •

edited

Loading

Uh oh!

devin-ai-integration Bot left a comment

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

SumanthRH commented May 1, 2026 •

edited

Loading

Uh oh!

SumanthRH commented May 16, 2026

Uh oh!

SumanthRH commented Jun 7, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

SumanthRH commented Apr 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

✅ Devin Review: No Issues Found

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

SumanthRH commented May 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

SumanthRH commented May 16, 2026

Uh oh!

SumanthRH commented Jun 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

SumanthRH commented Apr 30, 2026 •

edited

Loading

SumanthRH commented May 1, 2026 •

edited

Loading

SumanthRH commented Jun 7, 2026 •

edited

Loading