DRAFT: fix: run Nemotron Nano v2 workplace assistant recipe #2868
Open
snowmanwwg wants to merge 2 commits into
Signed-off-by: Wenwen Gao <wenweng@cw-dfw-cs-001-vscode-01.cm.cluster>
The PR makes the Nemotron Nano v2 workplace-assistant NeMo-Gym recipe runnable with the current vLLM stack.
It fixes four concrete issues we hit:
Missing/invalid vLLM tool parser
The recipe was configured to use tool_parser: nemotron_json, but no working local parser plugin was wired in. The model’s HF-cache parser was also incompatible with the installed vLLM package paths.
The PR adds nemo_rl/models/generation/vllm/tool_parsers/nemotron_json.py and points the recipe at it.
vLLM parser constructor mismatch
Current vLLM instantiates tool parsers as parser_cls(tokenizer, request.tools). The previous shim only accepted tokenizer, causing:
TypeError: __init__() takes 2 positional arguments but 3 were given
The new parser accepts both signatures, so it works with current vLLM and remains tolerant of older base parser constructors.
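As a rough illustration of the dual-signature idea, here is a minimal sketch. It does not import vLLM: `LegacyBaseParser`, `CurrentBaseParser`, and `make_tolerant_parser` are hypothetical stand-ins, and the actual parser in this PR may detect the base signature differently.

```python
import inspect


class LegacyBaseParser:
    """Hypothetical stand-in for an older base parser: tokenizer only."""

    def __init__(self, tokenizer):
        self.tokenizer = tokenizer


class CurrentBaseParser:
    """Hypothetical stand-in for a newer base parser: tokenizer plus tools."""

    def __init__(self, tokenizer, tools=None):
        self.tokenizer = tokenizer
        self.tools = tools


def make_tolerant_parser(base_cls):
    """Build a subclass that can be constructed with or without tools."""

    class TolerantParser(base_cls):
        def __init__(self, tokenizer, tools=None):
            # Inspect the base constructor; the parameter count includes `self`.
            params = inspect.signature(base_cls.__init__).parameters
            if len(params) >= 3:
                super().__init__(tokenizer, tools)
            else:
                super().__init__(tokenizer)
            # Keep the tools around even when the base class ignores them.
            self.tools = tools

    return TolerantParser
```

Inspecting the signature, rather than catching `TypeError` around the base call, avoids hiding an unrelated `TypeError` raised inside the base constructor.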
Slurm Ray job cleanup
After the driver command finished, ray.sub did not signal the Ray sidecars to stop, so the Slurm allocation could remain alive even after training completed.
The PR touches LOG_DIR/ENDED after a non-empty COMMAND exits and preserves the driver exit code.
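The cleanup logic can be sketched as follows. ray.sub is a Slurm submission script, so this Python version only mirrors the control flow. `run_driver` is a hypothetical name, and returning 0 without a marker for an empty command is an assumption, since the description only covers the non-empty case.

```python
import subprocess
from pathlib import Path


def run_driver(command: str, log_dir: str) -> int:
    """Run the driver command, write an ENDED marker, and return its exit code."""
    if not command:
        # Assumption: with an empty command nothing ran, so no marker is written.
        return 0
    result = subprocess.run(command, shell=True)
    # Signal the sidecars that the driver is done, whether it succeeded or not.
    Path(log_dir).mkdir(parents=True, exist_ok=True)
    (Path(log_dir) / "ENDED").touch()
    # Preserve the driver's exit code instead of the marker step's status.
    return result.returncode
```

The marker is written before returning so that a failing driver still releases the allocation while its exit code reaches the caller.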
Context-length overflow handling
In nemo_rl/models/generation/vllm/vllm_worker_async.py:716, I changed the async chat endpoint so that our local ValueError for overlong prompts is converted to an HTTP 400 response, the same as vLLM validation errors.
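The error mapping can be sketched without a web framework. The function names, the length check, and the response body shape below are all hypothetical; only the idea of turning the local ValueError into a 400 comes from this PR.

```python
from http import HTTPStatus


def check_prompt_length(num_prompt_tokens: int, max_model_len: int) -> None:
    """Hypothetical local check that raises ValueError for overlong prompts."""
    if num_prompt_tokens >= max_model_len:
        raise ValueError(
            f"Prompt has {num_prompt_tokens} tokens, limit is {max_model_len}"
        )


def handle_chat_request(num_prompt_tokens: int, max_model_len: int) -> tuple[int, dict]:
    """Return (status_code, body), mapping the local ValueError to HTTP 400."""
    try:
        check_prompt_length(num_prompt_tokens, max_model_len)
    except ValueError as exc:
        # Client error: report it as a bad request, not an unhandled server error.
        body = {"error": {"message": str(exc), "type": "BadRequestError"}}
        return int(HTTPStatus.BAD_REQUEST), body
    # Placeholder success body; real generation is out of scope for this sketch.
    return int(HTTPStatus.OK), {"choices": []}
```

Returning a 400 lets the caller treat an overlong prompt as a rejected request instead of a server failure.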
The PR also makes malformed model tool-call output non-fatal and quieter: malformed ... generations fall back to normal content instead of crashing request handling or spamming exception tracebacks.
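A minimal sketch of the fallback behavior, assuming tool calls arrive as a JSON list between two markers. The `START` and `END` delimiters and the result dictionary shape are hypothetical placeholders, not the markers or types used by the actual parser.

```python
import json
import logging

logger = logging.getLogger(__name__)

# Hypothetical delimiters; the real markers emitted by the model are not shown here.
START, END = "<<TOOLS>>", "<</TOOLS>>"


def extract_tool_calls(text: str) -> dict:
    """Parse tool calls from model output, falling back to plain content."""
    fallback = {"tools_called": False, "tool_calls": [], "content": text}
    if START not in text:
        return fallback
    content, _, rest = text.partition(START)
    payload, _, _ = rest.partition(END)
    try:
        calls = json.loads(payload)
        if not isinstance(calls, list):
            raise ValueError("expected a JSON list of tool calls")
        tool_calls = [
            {"name": c["name"], "arguments": json.dumps(c.get("arguments", {}))}
            for c in calls
        ]
    except (ValueError, KeyError, TypeError, AttributeError) as exc:
        # Log quietly at debug level instead of emitting a full traceback.
        logger.debug("Malformed tool-call output, returning content: %s", exc)
        return fallback
    return {"tools_called": True, "tool_calls": tool_calls, "content": content or None}
```

With this shape, a truncated or invalid payload yields the raw text as ordinary content, so the request still completes.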
Usage
# Add a code snippet demonstrating how to use this
Before your PR is "Ready for review"
Pre checks:
Additional Information