APR-57: add hosted OPD CLI config #699
@codex review

Codex Review: Didn't find any major issues. Already looking forward to the next diff.

@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 45322b6646
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@codex review

Codex Review: Didn't find any major issues. You're on a roll.

45322b6 to 961bcdb
Summary
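- Accept `loss = "opd"` with `[teacher]` in hosted training CLI configs
- Keep `loss = "rl"` as the generated template default while documenting OPD as an accepted option
- Include `loss: "opd"` and the teacher config in the `/rft/runs` payload

Scope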
This is CLI-only. It does not implement hosted/platform acceptance or teacher-logprob runtime wiring.
Ready for review, but merge should remain ordered behind the hosted API/runtime path for OPD.
Verification
- `.venv/bin/python -m pytest packages/prime/tests/test_rl_config.py packages/prime/tests/test_rl_api.py` (51 passed)
- `.venv/bin/ruff format packages/prime/src/prime_cli/commands/rl.py packages/prime/tests/test_rl_api.py packages/prime/tests/test_rl_config.py`
- `.venv/bin/ruff check packages/prime/src/prime_cli/commands/rl.py packages/prime/tests/test_rl_api.py packages/prime/tests/test_rl_config.py`
- `git diff --check`

Review Notes
Linear: APR-57
Note
Low Risk
Small validation and documentation changes plus tests; no auth or runtime logic. Users could submit OPD runs before the backend supports them if merged ahead of the API.
Overview
Hosted Training CLI now treats `loss = "opd"` like SFT for config and API: OPD is documented in the generated template, validation no longer rejects OPD as unsupported, and `[teacher]` is required for both `sft` and `opd` (still forbidden for `rl`).

Template comments were broadened from SFT-only distillation to SFT/OPD with shared teacher setup. Tests cover loading OPD TOML, rejecting OPD without a teacher, and asserting `/rft/runs` payloads include `loss: "opd"` and teacher config.

This is CLI-only; platform/runtime OPD support is assumed to land separately.
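
For reviewers who want the rule in one place, here is a minimal sketch of the loss/teacher validation described above. It is not the code in `prime_cli/commands/rl.py`: the function names, the top-level position of `loss`, the fallback to `"rl"` when `loss` is absent, and the `model` key under `[teacher]` are all assumptions made for illustration.

```python
# Hypothetical sketch of the rule: [teacher] is required for "sft" and "opd",
# and forbidden for "rl". Names and config shape are assumptions.

TEACHER_REQUIRED = {"sft", "opd"}
SUPPORTED_LOSSES = {"rl"} | TEACHER_REQUIRED


def validate_loss_and_teacher(config: dict) -> None:
    """Raise ValueError if the loss / [teacher] combination is not allowed."""
    # Assumption: a missing `loss` falls back to "rl", the template default.
    loss = config.get("loss", "rl")
    has_teacher = "teacher" in config
    if loss not in SUPPORTED_LOSSES:
        raise ValueError(f"unsupported loss: {loss!r}")
    if loss in TEACHER_REQUIRED and not has_teacher:
        raise ValueError(f"[teacher] is required when loss = {loss!r}")
    if loss == "rl" and has_teacher:
        raise ValueError('[teacher] is not allowed when loss = "rl"')


def build_run_payload(config: dict) -> dict:
    """Build a payload dict carrying the loss and, if present, the teacher."""
    validate_loss_and_teacher(config)
    payload = {"loss": config.get("loss", "rl")}
    if "teacher" in config:
        payload["teacher"] = dict(config["teacher"])
    return payload


# Equivalent TOML (assumed layout):
#   loss = "opd"
#
#   [teacher]
#   model = "example/teacher-model"
opd_config = {"loss": "opd", "teacher": {"model": "example/teacher-model"}}
payload = build_run_payload(opd_config)
```

In this sketch an OPD config without `[teacher]` and an RL config with `[teacher]` both raise `ValueError`, which matches the behavior the tests in this PR are described as covering.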
Reviewed by Cursor Bugbot for commit 961bcdb. Bugbot is set up for automated code reviews on this repo.