APR-57: add hosted OPD CLI config #699

Open
tim0120 wants to merge 1 commit into main from feat/hosted-opd-cli

Conversation

tim0120 commented Jun 1, 2026

Contributor

Summary

  • allow Hosted Training configs to use loss = "opd" with [teacher]
  • keep loss = "rl" as the generated template default while documenting OPD as an accepted option
  • send OPD loss and teacher config through the existing /rft/runs payload

Scope

This is CLI-only. It does not implement hosted/platform acceptance or teacher-logprob runtime wiring.

Ready for review, but this should merge only after the hosted API/runtime path for OPD has landed.

Verification

  • .venv/bin/python -m pytest packages/prime/tests/test_rl_config.py packages/prime/tests/test_rl_api.py (51 passed)
  • .venv/bin/ruff format packages/prime/src/prime_cli/commands/rl.py packages/prime/tests/test_rl_api.py packages/prime/tests/test_rl_config.py
  • .venv/bin/ruff check packages/prime/src/prime_cli/commands/rl.py packages/prime/tests/test_rl_api.py packages/prime/tests/test_rl_config.py
  • git diff --check

Review Notes

  • Codex CLI review found no actionable regressions
  • Claude review could not run because the org had hit its monthly usage limit at the time of the earlier review

Linear: APR-57


Note

Low Risk
Small validation and documentation changes plus tests; no auth or runtime logic. If this merges ahead of the API, users could submit OPD runs before the backend supports them.

Overview
Hosted Training CLI now treats loss = "opd" like SFT for config and API: OPD is documented in the generated template, validation no longer rejects OPD as unsupported, and [teacher] is required for both sft and opd (still forbidden for rl).
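
The pairing rule described above can be sketched roughly as follows. This is not the code in `rl.py`: `check_teacher` and the two constant sets are hypothetical names, and treating a missing `loss` as `"rl"` is an assumption based on the template default.

```python
from typing import Any, Mapping

# Pairing described in the overview: sft and opd need [teacher], rl must not have it.
TEACHER_REQUIRED = frozenset({"sft", "opd"})
TEACHER_FORBIDDEN = frozenset({"rl"})


def check_teacher(config: Mapping[str, Any]) -> None:
    """Raise ValueError if `loss` and `[teacher]` are inconsistent (hypothetical helper)."""
    # Assumption: a missing `loss` falls back to "rl", mirroring the template default.
    loss = config.get("loss", "rl")
    has_teacher = "teacher" in config

    if loss in TEACHER_REQUIRED:
        if not has_teacher:
            raise ValueError(f'loss = "{loss}" requires a [teacher] section')
    elif loss in TEACHER_FORBIDDEN:
        if has_teacher:
            raise ValueError(f'[teacher] is not allowed with loss = "{loss}"')
    else:
        raise ValueError(f"unknown loss: {loss!r}")


# Both of these pass without raising.
check_teacher({"loss": "opd", "teacher": {"model": "example/teacher-model"}})
check_teacher({"loss": "rl"})
```

Calling `check_teacher({"loss": "opd"})` raises `ValueError`, which corresponds to the "rejecting OPD without a teacher" case the tests are said to cover.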

Template comments were broadened from SFT-only distillation to SFT/OPD with shared teacher setup. Tests cover loading OPD TOML, rejecting OPD without a teacher, and asserting /rft/runs payloads include loss: "opd" and teacher config.
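
A payload assertion of the kind described could look roughly like this. `build_run_payload` is a hypothetical stand-in for whatever the CLI uses to assemble the `/rft/runs` request body; a real payload would carry more fields, and the `model` key under `teacher` is a placeholder.

```python
from typing import Any, Mapping


def build_run_payload(config: Mapping[str, Any]) -> dict[str, Any]:
    """Copy loss and optional teacher settings into a request body (hypothetical helper)."""
    payload: dict[str, Any] = {"loss": config["loss"]}
    if "teacher" in config:
        # Copy so later changes to the payload do not mutate the loaded config.
        payload["teacher"] = dict(config["teacher"])
    return payload


opd_payload = build_run_payload(
    {"loss": "opd", "teacher": {"model": "example/teacher-model"}}
)
assert opd_payload["loss"] == "opd"
assert opd_payload["teacher"] == {"model": "example/teacher-model"}

# An rl config has no [teacher], so the key is left out of the payload.
assert "teacher" not in build_run_payload({"loss": "rl"})
```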

This is CLI-only; platform/runtime OPD support is assumed to land separately.

Reviewed by Cursor Bugbot for commit 961bcdb. Bugbot is set up for automated code reviews on this repo.

tim0120 commented Jun 1, 2026

Contributor Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Already looking forward to the next diff.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

tim0120 commented Jun 2, 2026

Contributor Author

@codex review

chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 45322b6646

Comment thread packages/prime/src/prime_cli/commands/rl.py
tim0120 commented Jun 2, 2026

Contributor Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. You're on a roll.

AmeenP force-pushed the feat/hosted-opd-cli branch from 45322b6 to 961bcdb on June 15, 2026 14:28
AmeenP marked this pull request as ready for review on June 15, 2026 14:30

2 participants