[codex] compact training sample transport payloads#2809

Open
samsja wants to merge 1 commit into codex/orchestrator-r3-memory from codex/orchestrator-compact-training-samples


samsja commented Jun 14, 2026

Member

Summary

Stacked on #2807. This continues the train-trace memory work by compacting the remaining trainer-bound TrainingSample list payloads at the orchestrator/train transport boundary.

What changed

  • Adds PackedArray fields to TrainingSample for byte-backed token IDs, masks, logprobs, per-token temperatures, teacher logprobs, and mm_token_type_ids.
  • Adds prime_rl.transport.compact helpers to compact samples before send, read compact lengths without inflation, and inflate only when preparing selected trainer microbatches.
  • Compacts batch.samples immediately after the optional teacher-logprobs step and immediately before the TrainingBatch is sent.
  • Updates trainer prepare_sample to read either legacy lists or packed arrays.
  • Updates MultiPacker validation/token-budget accounting to use compact lengths so buffered samples do not inflate just to schedule packing.
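
To illustrate the idea behind the items above, here is a minimal sketch of a byte-backed array that can report its length without inflating into a Python list, plus an accessor that accepts either a legacy list or the packed form. `PackedArraySketch`, `field_len`, and `as_list` are hypothetical names made up for this example; the actual `PackedArray` and `prime_rl.transport.compact` helpers in this PR may be structured differently.

```python
from array import array
from dataclasses import dataclass


@dataclass(frozen=True)
class PackedArraySketch:
    """Hypothetical stand-in for a byte-backed field; not the PR's PackedArray."""

    typecode: str  # array-module typecode, e.g. "i" (C int) or "f" (C float)
    data: bytes

    @classmethod
    def pack(cls, values, typecode):
        # Store the values as one contiguous bytes object.
        return cls(typecode=typecode, data=array(typecode, values).tobytes())

    def __len__(self):
        # Element count comes from the byte length; no list is materialised.
        return len(self.data) // array(self.typecode).itemsize

    def unpack(self):
        # Inflate back into a Python list only when the values are needed.
        out = array(self.typecode)
        out.frombytes(self.data)
        return out.tolist()


def field_len(value):
    """Length of a field stored either as a legacy list or in packed form."""
    return len(value)


def as_list(value):
    """Return a Python list for either representation."""
    if isinstance(value, PackedArraySketch):
        return value.unpack()
    return list(value)


token_ids = PackedArraySketch.pack([101, 7592, 2088, 102], "i")
logprobs = PackedArraySketch.pack([-0.5, -1.25], "f")
```

In this sketch, scheduling code would call only `field_len`, and `as_list` would be deferred until a sample is actually selected for a microbatch.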

Synthetic Size Check

Single synthetic TrainingSample with 30k tokens (15k prompt + 15k completion), scalar completion temperature, no R3 bytes:

  • msgpack payload: 315,075 bytes -> 183,882 bytes (58.36% of previous)
  • approximate Python list payload for compacted fields: 2,643,088 bytes -> 184,943 bytes (7.00% of previous)
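
For intuition on why the in-process reduction is much larger than the msgpack reduction, the sketch below compares a rough estimate of a Python list's footprint with the same values stored as raw bytes. This is not the script used to produce the numbers above; `approx_list_bytes` is a hypothetical helper, and the estimate is shallow (it ignores object sharing and allocator overhead).

```python
import sys
from array import array


def approx_list_bytes(values):
    # Size of the list object (its pointer table) plus each element object.
    return sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)


n_tokens = 30_000
token_ids = list(range(1000, 1000 + n_tokens))  # synthetic IDs, assumed to fit a C int

packed = array("i", token_ids).tobytes()

list_bytes = approx_list_bytes(token_ids)
packed_bytes = len(packed)
ratio = packed_bytes / list_bytes

print(f"list ~ {list_bytes:,} bytes, packed = {packed_bytes:,} bytes ({ratio:.2%} of list)")
```

Each list element costs a pointer in the list plus a full Python object, while the packed form costs only the fixed item size per token. msgpack already encodes integers compactly, which is consistent with the wire-size saving being smaller than the in-memory saving.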

Validation

  • uv run pytest tests/unit/orchestrator/test_batch.py tests/unit/train/rl/test_packer_compact.py -> 15 passed
  • uv run pytest tests/unit/orchestrator tests/unit/train/rl/test_packer_compact.py -> 89 passed
  • uv run ruff check src/prime_rl/transport/types.py src/prime_rl/transport/compact.py src/prime_rl/trainer/batch.py src/prime_rl/trainer/rl/packer.py src/prime_rl/orchestrator/orchestrator.py tests/unit/orchestrator/test_batch.py tests/unit/train/rl/test_packer_compact.py -> passed
  • uv run ruff format --check src/prime_rl/transport/types.py src/prime_rl/transport/compact.py src/prime_rl/trainer/batch.py src/prime_rl/trainer/rl/packer.py src/prime_rl/orchestrator/orchestrator.py tests/unit/orchestrator/test_batch.py tests/unit/train/rl/test_packer_compact.py -> passed
  • git diff --check -> passed

Note

Medium Risk
Changes the hot orchestrator→trainer data path for all training samples; behavior is covered by roundtrip and packer tests and legacy list fields remain supported via accessors.

Overview
Shrinks orchestrator→trainer msgpack traffic and in-process memory by replacing large per-sample Python lists with byte-backed PackedArray fields on TrainingSample, compacting immediately before TrainingBatch send (after optional teacher logprobs).

Adds prime_rl.transport.compact to pack/unpack token IDs, masks, logprobs, temperatures, teacher logprobs, and mm_token_type_ids, expose lengths without inflating lists, and inflate only in prepare_sample when building microbatches. MultiPacker validation and token budgeting use those length helpers so buffered compact samples are not expanded just for scheduling.
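
The scheduling point can be sketched as follows: a greedy grouping of buffered samples under a token budget that reads token counts from byte lengths alone. This is an illustration under assumptions, not MultiPacker's algorithm; `CompactSampleSketch`, `compact_num_tokens`, and `schedule_by_token_budget` are hypothetical names, and the 4-byte item size is assumed.

```python
from dataclasses import dataclass

INT32_BYTES = 4  # assumed item size for packed token IDs


@dataclass
class CompactSampleSketch:
    """Hypothetical compact sample holding token IDs as raw bytes."""

    prompt_ids: bytes
    completion_ids: bytes


def compact_num_tokens(sample):
    # Token count from byte lengths only; nothing is decoded or inflated.
    return (len(sample.prompt_ids) + len(sample.completion_ids)) // INT32_BYTES


def schedule_by_token_budget(samples, max_tokens):
    """Greedily group sample indices so each group stays within max_tokens."""
    batches, current, used = [], [], 0
    for idx, sample in enumerate(samples):
        n = compact_num_tokens(sample)
        if n > max_tokens:
            raise ValueError(f"sample {idx} has {n} tokens, budget is {max_tokens}")
        if current and used + n > max_tokens:
            # Current group is full; start a new one.
            batches.append(current)
            current, used = [], 0
        current.append(idx)
        used += n
    if current:
        batches.append(current)
    return batches


def make_sample(n_prompt, n_completion):
    # Zero-filled placeholder bytes; only the lengths matter for scheduling.
    return CompactSampleSketch(bytes(INT32_BYTES * n_prompt), bytes(INT32_BYTES * n_completion))


buffered = [make_sample(2, 1), make_sample(1, 1), make_sample(2, 2), make_sample(0, 1)]
groups = schedule_by_token_budget(buffered, max_tokens=5)
```

Only the samples in a selected group would then need to be inflated into lists or tensors, which is the behavior the description attributes to prepare_sample.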

Reviewed by Cursor Bugbot for commit b0ea1f9. Bugbot is set up for automated code reviews on this repo.
