[codex] align hosted training config examples by tim0120 · Pull Request #15 · PrimeIntellect-ai/lab-cookbook

tim0120 · 2026-06-01T21:54:16Z

Summary

- Update SFT and OPD cookbook configs to match the hosted CLI distillation surface.
- Add loss = "opd" and remove unsupported public teacher knobs like teacher_tau, save, and replay.
- Route training environment overrides through env.args so examples validate with the hosted train config parser.
- Align the related RL continuation examples/docs with the same Hosted Training env schema.

Why

The new OPD setup uses the public hosted CLI shape (loss = "opd" plus [teacher]) while PrimeRL/runtime handles teacher endpoint wiring internally. The cookbook should document that hosted surface instead of PrimeRL internal config fields or stale env override tables.

Hosted Training uses one env config schema across rl, sft, and opd: per-env taskset/harness overrides are passed through env.args, not top-level [env.taskset] or [env.harness] tables. Keeping the related RL examples on the old shape would leave cookbook training examples that the new parser rejects.

Validation

- python3 + tomllib: parsed all 38 cookbook TOML files
- uv run python: loaded changed hosted training TOMLs with prime_cli.commands.rl.load_config from the local OPD CLI branch
- git diff --check

tim0120 wants to merge 1 commit into main from codex/hosted-distillation-configs
