Move complex-level scores from per-interface to end-of-report (with PAE) by DimaMolod · Pull Request #14 · KosinskiLab/AlphaJudge

DimaMolod · 2026-06-01T13:49:18Z

Summary

Two focused tweaks to the AlphaJudge validation report layout, plus a README touch-up so the docs match.

1. Move complex-level features off the per-interface slider pages

confidence_score and pDockQ/mpDockQ are scalars per predicted complex, not per chain pair — they had the same value on every interface page of a given model, which was visually misleading and duplicated information.

- _AF_DERIVED_FEATURES drops confidence_score and pDockQ/mpDockQ. iptm stays because AF3 reports a per-pair chain_pair_iptm.
- New _COMPLEX_LEVEL_FEATURES = (confidence_score, pDockQ/mpDockQ).
- New _complex_evidence_page renders the two complex-level sliders on top and the AlphaFold-DB-style green PAE heatmap below.
  - Per-run report: one such page is always appended at the end (replaces the old PAE-only page).
  - Aggregate report: appends a "Per-complex evidence" section after the per-interface slider pages, with one combined page per top-N complex.
- runner.process_many stamps every aggregated row with an absolute source_dir so the aggregate report can locate each complex's PAE PNG without an extra flag.
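
To illustrate why the two scalars belong on a single complex-level page, here is a minimal sketch of the split. The tuple names mirror this PR, but the tuple contents beyond what is listed above, the row layout, the example values, and the helper `split_features` are assumptions for illustration and not the code in report.py.

```python
# Hypothetical sketch: tuple names mirror the PR; contents, rows, and the
# helper below are assumed for illustration only.
_AF_DERIVED_FEATURES = ("iptm",)  # per-interface; the real tuple may hold more
_COMPLEX_LEVEL_FEATURES = ("confidence_score", "pDockQ/mpDockQ")


def split_features(row):
    """Separate one summary row into per-interface and complex-level values."""
    per_interface = {k: row[k] for k in _AF_DERIVED_FEATURES if k in row}
    complex_level = {k: row[k] for k in _COMPLEX_LEVEL_FEATURES if k in row}
    return per_interface, complex_level


# Two interface rows of the same (made-up) model: iptm differs per chain pair,
# while the complex-level scalars are identical on both rows.
rows = [
    {"interface": "A_B", "iptm": 0.81, "confidence_score": 0.74, "pDockQ/mpDockQ": 0.52},
    {"interface": "A_C", "iptm": 0.43, "confidence_score": 0.74, "pDockQ/mpDockQ": 0.52},
]
split = [split_features(r) for r in rows]
```

In this toy data the complex-level dictionaries are equal for both rows, which is the duplication the old per-interface pages showed.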

2. Fix `--max-complexes` semantics in the aggregate report

With per-interface ranking, ranked[:max_complexes] silently sliced interface rows rather than complexes, so a single multimer could fill the cap and exclude every other complex. The cap now walks the rows in descending metascore order and keeps every interface row whose complex is among the first max_complexes distinct complexes encountered. The per-complex evidence section honours the same cap (min(top_n, max_complexes)).
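
The difference between the two behaviours can be sketched as follows. This is a simplified illustration, not the code in report.py: the three-field row layout, the complex names, the scores, and the helper `cap_by_complex` are all assumptions.

```python
def cap_by_complex(ranked, max_complexes=None):
    """Keep every interface row whose complex is among the first
    `max_complexes` distinct complexes met while walking `ranked` in order."""
    if max_complexes is None:
        return list(ranked)
    kept_complexes = []
    kept_rows = []
    for complex_id, interface, score in ranked:
        if complex_id not in kept_complexes:
            if len(kept_complexes) >= max_complexes:
                continue  # cap reached: skip rows of any further complex
            kept_complexes.append(complex_id)
        kept_rows.append((complex_id, interface, score))
    return kept_rows


# Made-up cohort: one multimer with three interfaces and two dimers.
rows = [
    ("multimer", "A_B", 0.9),
    ("multimer", "A_C", 0.8),
    ("multimer", "B_C", 0.3),
    ("dimer_1", "A_B", 0.7),
    ("dimer_2", "A_B", 0.5),
]
ranked = sorted(rows, key=lambda t: t[2], reverse=True)

old_slice = ranked[:2]                 # old behaviour: two interface rows
new_rows = cap_by_complex(ranked, 2)   # new behaviour: two complexes

# The evidence section follows the same cap (top_n=10 is the value
# mentioned in the commit message).
top_n, max_complexes = 10, 2
n_evidence_pages = top_n if max_complexes is None else min(top_n, max_complexes)
```

With `max_complexes=2`, the old slice contains only the two best multimer interfaces, while the walk keeps all three multimer rows plus the `dimer_1` row.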

3. README touch-up (docs drift)

The README still described --aggregate_report as "one slider page per interface plus a cover" and report.pdf without mentioning the Complex-level confidence & PAE final page. Updated the --aggregate_report bullet and the two output bullets so they describe the actual page sequence.

Files changed

src/alphajudge/report.py — new _complex_evidence_page, updated _metric_rows_for_slider_panel, per-run + aggregate generators, max_complexes semantics.
src/alphajudge/runner.py — stamp source_dir on aggregated rows.
test/test_report.py — updated page-count expectations.
README.md — describe Per-complex evidence section.

No new dependencies, no metascore math change, no CLI changes.

Test plan

pytest test/test_meta_score.py test/test_report.py -q — 9 passed.
Per-run report on AF3 9-chain multimer (8hhy): 18 pages — cover, per-interface table, 15 slider pages, complex-evidence + PAE.
Aggregate report on mixed cohort (1 multimer + 2 random dimers): 21 pages — cover, 17 interface pages, 3 per-complex evidence pages with PAE.
--max-complexes 1 on the mixed cohort: 17 pages = cover + all 15 interfaces of the top complex + 1 evidence page (rather than the old behaviour of capping at 1 interface row total).
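
The page counts above follow from simple arithmetic, sketched here as a sanity check. The helper names are hypothetical, and the sketch assumes the complexes are listed in ranked order, that each random dimer contributes one interface, and that top_n is 10 as stated in the commit message.

```python
def per_run_pages(n_interfaces):
    # cover + per-interface table + one slider page per interface
    # + the combined complex-evidence/PAE page
    return 1 + 1 + n_interfaces + 1


def aggregate_pages(interfaces_per_complex, top_n=10, max_complexes=None):
    # interfaces_per_complex: interface counts, best-ranked complex first
    kept = interfaces_per_complex
    if max_complexes is not None:
        kept = interfaces_per_complex[:max_complexes]
    evidence = min(top_n, len(kept))  # one evidence page per kept complex
    return 1 + sum(kept) + evidence   # cover + interface pages + evidence


cohort = [15, 1, 1]  # the multimer and the two dimers from the test plan
pages_8hhy = per_run_pages(15)
pages_cohort = aggregate_pages(cohort)
pages_capped = aggregate_pages(cohort, max_complexes=1)
```

These evaluate to 18, 21, and 17 pages respectively, matching the three report runs listed in the test plan.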

🤖 Generated with Claude Code

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0d3e01a841

chatgpt-codex-connector · 2026-06-01T13:53:20Z

+    ranked.sort(key=lambda t: t[3], reverse=True)
+
+    top_rows = [(label, score, r) for label, _, _, score, r in ranked[:top_n]]
+    ranked_per_page = ranked if max_complexes is None else ranked[:max_complexes]


Apply max_complexes to complexes, not raw interfaces

When --max-complexes is used on an aggregate report that includes multimers, ranked contains one entry per interface row, so this slice limits the PDF to the first N interfaces rather than the first N complexes. A single complex with many high-scoring chain pairs can consume the entire cap and exclude other complexes from the interface pages, which contradicts the max_complexes option/help and makes capped cohort reports misleading.

The per-interface slider panel previously repeated two scalars that are properties of the predicted complex (confidence_score and pDockQ/mpDockQ), even though they have the same value on every chain pair of a given model. That was visually misleading and duplicated the information across every page.

Reorganisation:
- _AF_DERIVED_FEATURES drops confidence_score and pDockQ/mpDockQ. iptm stays because AF3 provides per-pair chain_pair_iptm so it is meaningfully per-interface.
- New _COMPLEX_LEVEL_FEATURES = (confidence_score, pDockQ/mpDockQ).
- New _complex_evidence_page renders the complex-level sliders on top and embeds the PAE heatmap below. One such page is appended to every per-run report (it replaces the old PAE-only page), so the per-run report always carries PAE even when the matrix-derived PNG is missing.
- generate_aggregate_report appends a "Per-complex evidence" section after all per-interface slider pages, with one combined page per unique complex (limited to top_n=10 to keep cohort PDFs bounded). Each evidence page locates its PAE PNG via the new source_dir column on every summary row.

Runner change:
- process_many stamps every aggregated row with an absolute source_dir so the aggregate report can resolve per-complex side files (e.g. the PAE PNG) without a separate --predictions-root flag.

Tests updated to match the new page counts.
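
As a rough sketch of how the source_dir column lets the aggregate report find side files: the helper names, the dictionary row layout, the example directory, and the PNG filename below are all assumptions, since none of them are stated in this PR.

```python
from pathlib import Path

# Assumed filename for illustration; the real PAE PNG name is not given here.
PAE_PNG_NAME = "pae.png"


def stamp_source_dir(rows, run_dir):
    """Return copies of `rows` with an absolute source_dir added to each."""
    absolute = str(Path(run_dir).resolve())
    return [{**row, "source_dir": absolute} for row in rows]


def find_pae_png(row):
    """Return the PAE PNG path next to the row's predictions, or None."""
    candidate = Path(row["source_dir"]) / PAE_PNG_NAME
    return candidate if candidate.is_file() else None


# Hypothetical prediction directory that does not need to exist on disk.
rows = stamp_source_dir([{"interface": "A_B"}], "predictions/complex_1")
pae_path = find_pae_png(rows[0])
```

Because the path is made absolute when the row is stamped, the report can be generated from any working directory; a missing PNG simply yields None so the page can be rendered without the heatmap.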

In the per-interface aggregate, ``ranked`` holds one entry per chain-pair interface, so a single multimer can fill the entire ``ranked[:max_complexes]`` slice and silently exclude every other complex. That contradicts the option name and help string. Walk the rows in descending metascore order instead and keep every interface row whose complex is among the first ``max_complexes`` distinct complexes encountered; this preserves the per-complex semantics. The per-complex evidence section now also respects the same cap (``min(top_n, max_complexes)``) so user-supplied caps shrink both sections consistently.

The README still described --aggregate_report as "one slider page per interface" plus a cover, which has been incomplete since the layout change in the previous commit (09f1053): the aggregate report now also appends a Per-complex evidence section with one combined slider+PAE page per top-N complex, and the per-run report's last page now combines the complex-level confidence sliders with the PAE heatmap. Updates the bullet for --aggregate_report and the two output bullets that describe report.pdf / aggregate PDF contents.

chatgpt-codex-connector Bot reviewed Jun 1, 2026

DimaMolod added 2 commits June 1, 2026 15:59

DimaMolod force-pushed the report_percentiles branch from 806983b to 86511f5 June 1, 2026 13:59

DimaMolod changed the title ~~AlphaJudge percentile-style validation reports~~ Move complex-level scores from per-interface to end-of-report (with PAE) Jun 1, 2026

DimaMolod merged commit 99fb43a into main Jun 1, 2026
8 checks passed

DimaMolod deleted the report_percentiles branch June 1, 2026 14:31

DimaMolod merged 3 commits into main from report_percentiles
