feat(agent): add GSP-B (full-broadcast) variant by jdbloom · Pull Request #16 · NESTLab/RL-CollectiveTransport

jdbloom · 2026-04-13T14:48:32Z

Summary

GSP-B is the full-broadcast sibling of GSP-N. Each agent's GSP input is `[self_prox, self_prev_gsp, other_0_prox, other_0_prev_gsp, ...]`, of length `2 * n_agents`: the agent's own values come first, followed by every other agent in ascending id order. Predictions are made per agent, with the same forward-pass shape as GSP-N.
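The layout described above can be sketched in plain Python. This is an illustration only, not the code in this PR: the function name `make_broadcast_states` is hypothetical, and it assumes one scalar proximity value and one scalar previous prediction per agent.

```python
from typing import List, Sequence


def make_broadcast_states(
    prox: Sequence[float], prev_gsp: Sequence[float]
) -> List[List[float]]:
    """Build one self-first broadcast view per agent (hypothetical helper)."""
    if len(prox) != len(prev_gsp):
        raise ValueError("prox and prev_gsp must have one entry per agent")
    n_agents = len(prox)
    states = []
    for self_id in range(n_agents):
        # Self comes first, then every other agent in ascending id order.
        order = [self_id] + [i for i in range(n_agents) if i != self_id]
        state: List[float] = []
        for i in order:
            state.extend([prox[i], prev_gsp[i]])
        states.append(state)
    return states


# Three agents: each view has length 2 * n_agents = 6.
states = make_broadcast_states(prox=[0.1, 0.2, 0.3], prev_gsp=[1.0, 2.0, 3.0])
```

For agent 1, the view starts with its own pair `(0.2, 2.0)`, followed by agent 0 and then agent 2.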

Why

The Option A (direct-MSE) smoke test revealed that plain GSP is signal-starved: with only 4 prox readings and no previous-prediction feedback loop, the MSE optimum is the constant-mean baseline, so direct supervised training converges there and can't do better. See Stelaris `docs/research/2026-04-13-gsp-information-collapse-analysis.md`.
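The constant-mean claim follows from a standard property of squared error: if the inputs carry no usable information about the target, the best a predictor can do is emit a single constant, and the constant that minimizes MSE is the target mean. A small numeric check, using synthetic targets rather than any data from this project:

```python
import random

random.seed(0)
# Synthetic targets standing in for the quantity the predictor is trained on.
targets = [random.uniform(-1.0, 1.0) for _ in range(1000)]
mean = sum(targets) / len(targets)


def mse(constant: float, ys: list) -> float:
    """Mean squared error of a predictor that always outputs `constant`."""
    return sum((y - constant) ** 2 for y in ys) / len(ys)


baseline = mse(mean, targets)
# Any constant other than the mean has strictly higher error:
# mse(c) = variance + (c - mean) ** 2.
worse = [mse(mean + delta, targets) for delta in (-0.5, -0.1, 0.1, 0.5)]
```

A network trained with direct MSE on uninformative inputs therefore has no incentive to move away from this baseline.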

GSP-B gives plain GSP's full-broadcast conceptual model the same kind of enriched input that GSP-N gives its neighborhood — proximity AND previous predictions from every agent. Known limitation (same as plain GSP): input size is coupled to `n_agents`, so trained policies don't transfer across team sizes. That's literally why GSP-N exists, and this PR doesn't change that tradeoff.

Changes

- `Agent.__init__`: new `broadcast: bool = False` param, mutually exclusive with `neighbors=True` (passing both raises `ValueError`). When set, `gsp_input_size` is overridden to `2 * n_agents`. The per-agent `gsp_observation` ring buffer allocation is unified with the `neighbors` path.
- `Agent.make_gsp_states_broadcast(prox, prev_gsp)`: new state builder, self-first ordering.
- `Agent.choose_agent_gsp`: extended to route broadcast through the same per-agent forward-pass path as neighbors.
- `Main.py`: the GSP prediction and storage branches add `elif model.gsp_broadcast:` dispatches, parallel to the existing `gsp_neighbors` path. Reads `BROADCAST` from the config dict.
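The constructor rules above (mutual exclusivity, and the `2 * n_agents` override) can be sketched as a standalone function. The name `resolve_gsp_input_size` and the `default_input_size` parameter are assumptions made for illustration; the real logic lives inside `Agent.__init__` and may be organized differently.

```python
def resolve_gsp_input_size(
    n_agents: int,
    default_input_size: int,
    neighbors: bool = False,
    broadcast: bool = False,
) -> int:
    """Return the GSP network input size for the selected variant (sketch)."""
    if neighbors and broadcast:
        # The two variants are mutually exclusive.
        raise ValueError("neighbors and broadcast cannot both be True")
    if broadcast:
        # One (prox, prev_gsp) pair per agent, self included.
        return 2 * n_agents
    # Plain GSP and GSP-N keep whatever size the caller already configured.
    return default_input_size
```

Under these assumptions, `resolve_gsp_input_size(8, default_input_size=6, broadcast=True)` returns 16, which is also why a network trained for one team size cannot be reused for another.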

Test plan

8 new tests in `tests/test_agent/test_gsp_broadcast.py`:
- property wiring (`gsp_broadcast`)
- `gsp_network_input == 2 * n_agents` for n=4 and n=8
- state builder returns one state per agent
- self-first ordering
- others in ascending id order (skipping self)
- neighbors + broadcast raises ValueError
- plain GSP (neither) keeps legacy input

Full RL-CT suite: 120/120 pass (was 112, +8).

Companion

Stelaris launcher PR (to be opened shortly): in `tools/dispatcher/launcher.py`, `CONDITION_FLAGS` must add a `"GSP-B"` entry, and `build_config` must pass the broadcast flag through to the RL-CT `make_config`.
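A rough sketch of what the companion change might look like. The actual structure of `CONDITION_FLAGS` and the signature of `build_config` in Stelaris are not shown in this PR, so everything below is hypothetical, including the `NEIGHBORS` key. Only the `BROADCAST` key name comes from this PR's description of `Main.py`.

```python
# Hypothetical shape: condition name -> config flags consumed by RL-CT.
CONDITION_FLAGS = {
    "GSP-N": {"NEIGHBORS": True, "BROADCAST": False},
    "GSP-B": {"NEIGHBORS": False, "BROADCAST": True},
}


def build_config(condition: str, base: dict = None) -> dict:
    """Merge the flags for `condition` into a copy of `base` (sketch only)."""
    config = dict(base or {})
    config.update(CONDITION_FLAGS[condition])
    return config


config = build_config("GSP-B", base={"N_AGENTS": 4})
```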

🤖 Generated with Claude Code

GSP-B is the full-broadcast sibling of GSP-N: each agent's GSP input is the concatenation of (prox, prev_gsp) for self first, then every other agent in ascending id order. Total input length = 2 * n_agents.

Plain GSP was limited to raw per-robot proximity values with no previous-prediction feedback loop, which left the predictor signal-starved under the new direct-MSE training path (see research note docs/research/2026-04-13-gsp-information-collapse-analysis.md in Stelaris: Option A revealed that plain GSP's 4-prox-flag input converges to the trivial-mean baseline because there isn't enough signal to beat it). GSP-B provides the same kind of enriched input that GSP-N gives its neighborhood, but broadcast to all agents.

Known limitation (shared with plain GSP): the input size is coupled to n_agents, so a trained GSP-B policy does not transfer across team sizes. GSP-N is the transferable variant by design; that was the whole reason GSP-N exists.

Changes:
- Agent.__init__ gains a `broadcast: bool = False` param. Mutually exclusive with `neighbors=True` (raises ValueError). When set, gsp_input_size is overridden to 2 * n_agents.
- New property `gsp_broadcast`.
- New method `make_gsp_states_broadcast(agent_prox_values, agent_prev_gsp)` that builds per-agent self-first views. Maintains the gsp_observation ring buffer the same way make_gsp_states does so recurrent/attention variants can layer on top later if wanted.
- `choose_agent_gsp` extended to route broadcast through the same per-agent forward-pass path as neighbors (both are per-agent self-centric predictors with the same inference shape).
- Main.py's GSP prediction and storage branches add a `gsp_broadcast` dispatch that calls `make_gsp_states_broadcast` and stores per-agent transitions analogous to the GSP-N path.

Tests (tests/test_agent/test_gsp_broadcast.py, 8 cases):
- broadcast=True flips gsp_broadcast property True
- gsp_network_input is 2 * n_agents (parameterized over n_agents)
- make_gsp_states_broadcast returns one state per agent
- self-first ordering is correct for each agent
- others appear in ascending id order (skipping self)
- neighbors=True + broadcast=True raises ValueError
- plain GSP (both False) keeps legacy input size

Full RL-CT suite: 120/120 pass.

Companion change required in Stelaris: launcher.py CONDITION_FLAGS must add a "GSP-B" entry and build_config must pass the broadcast flag through to the Agent constructor.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

jdbloom merged commit d452013 into master Apr 13, 2026
3 checks passed

jdbloom deleted the feat/gsp-b-broadcast branch April 13, 2026 14:50
