
solve_global_scene: treat MP_MAXITER as calibration failure #354

Merged
bl4ckb0ne merged 1 commit into collabora:master from rocketmark:upstream-pr/gss-maxiter-guard
Mar 23, 2026

Contributor

@rocketmark rocketmark commented Mar 13, 2026

Summary

solve_global_scene uses res <= 0 to detect solver failure, but MP_MAXITER (return code 5) is positive, so that check does not catch it. When the GSS solver hits its iteration limit without converging, status_failure stays false and the unconverged lighthouse positions are accepted and written to disk.

On the next launch those positions are loaded as the starting point for calibration. If the unconverged solve is significantly wrong, all subsequent tracking is corrupted and the only recovery is to delete config.json.

Demonstration

BEFORE fix:
GSS solver hits iteration limit
res = MP_MAXITER (5)
res <= 0 → false
status_failure = false
Unconverged lighthouse positions written to disk
Next launch: loads bad positions → tracking corrupted
Result: config.json must be deleted to recover

AFTER fix:
GSS solver hits iteration limit
res = MP_MAXITER (5)
res <= 0 || res == MP_MAXITER → true
status_failure = true
GSS result discarded, existing positions preserved
Result: calibration retried on next GSS cycle

The regular per-object pose solver (line ~522) already rejects res <= 0 correctly; solve_global_scene was missing the MP_MAXITER case.
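
As a minimal sketch of the two conditions compared above (not the actual code in src/poser_mpfit.c), the checks can be written as small predicates. The function names are hypothetical, and MP_MAXITER is redefined locally as a stand-in for the constant from mpfit.h, using the value 5 stated in this PR.

```c
#include <stdbool.h>

/* Stand-in for the mpfit constant; this PR gives its value as 5. */
#define MP_MAXITER 5

/* Hypothetical helper: the pre-fix condition. Only zero or negative
 * return codes are treated as failure, so MP_MAXITER slips through. */
static bool gss_failed_before(int res) {
    return res <= 0;
}

/* Hypothetical helper: the post-fix condition. Hitting the iteration
 * limit is rejected the same way as an explicit solver error. */
static bool gss_failed_after(int res) {
    return res <= 0 || res == MP_MAXITER;
}
```

With res = MP_MAXITER, gss_failed_before returns false while gss_failed_after returns true; every other return code is classified identically by both.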

Impact

Any deployment that triggers GSS calibration (lighthouse position solving) is exposed. The failure mode is silent: there is no crash and no warning, but tracking quality stays degraded until the config is cleared. It is more likely on resource-constrained hardware (embedded, Pi), where the solver may not converge within the default iteration budget.

Change

One line in src/poser_mpfit.c, inside solve_global_scene only. The per-object pose solver is not touched.
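
To illustrate why a change to the failure condition alone is enough to keep stored positions intact, here is a rough sketch of the commit-on-success pattern. It is not the libsurvive implementation: the types lh_pose and scene_state, the constant NUM_LH, and the function commit_gss_result are all hypothetical, and MP_MAXITER is a local stand-in using the value 5 from this PR.

```c
#include <stdbool.h>

/* Stand-in for the mpfit constant; this PR gives its value as 5. */
#define MP_MAXITER 5

/* Hypothetical lighthouse count and pose layout, for illustration only. */
#define NUM_LH 2

typedef struct {
    double pos[3];
    double rot[4];
} lh_pose;

typedef struct {
    lh_pose lighthouses[NUM_LH];
} scene_state;

/* Copies the candidate solution over the stored one only when the
 * solver result passes the failure check. Returns true if committed. */
static bool commit_gss_result(scene_state *stored,
                              const scene_state *candidate, int res) {
    bool status_failure = res <= 0 || res == MP_MAXITER;
    if (status_failure) {
        return false; /* discard the candidate, keep existing positions */
    }
    *stored = *candidate;
    return true;
}
```

Because the candidate is only copied after the check, an unconverged solve leaves the previous positions untouched and nothing wrong reaches the saved config.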

Found via

Observed in production: GSS calibration would occasionally produce wildly wrong lighthouse positions that persisted across restarts. Correlating with solver return codes showed MP_MAXITER returns were being silently accepted. Deleting config.json recovered tracking; adding res == MP_MAXITER to the failure condition prevented recurrence.
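
For anyone wanting to do a similar correlation, one possible approach is to label each solver return code when logging it. The sketch below is an assumption about how that could look, not code from this PR: the enum, both helper functions, and the log format are hypothetical, and MP_MAXITER is a local stand-in with the value 5 given above.

```c
#include <stdio.h>

/* Stand-in for the mpfit constant; this PR gives its value as 5. */
#define MP_MAXITER 5

/* Hypothetical three-way classification of a GSS solver return code. */
typedef enum {
    GSS_SOLVER_ERROR,    /* res <= 0 */
    GSS_ITERATION_LIMIT, /* res == MP_MAXITER */
    GSS_ACCEPTED         /* any other positive code */
} gss_outcome;

static gss_outcome classify_gss_result(int res) {
    if (res <= 0) return GSS_SOLVER_ERROR;
    if (res == MP_MAXITER) return GSS_ITERATION_LIMIT;
    return GSS_ACCEPTED;
}

static const char *gss_outcome_name(gss_outcome o) {
    switch (o) {
    case GSS_SOLVER_ERROR:    return "solver error";
    case GSS_ITERATION_LIMIT: return "iteration limit reached";
    case GSS_ACCEPTED:        return "accepted";
    }
    return "unknown";
}

/* Writes one line per solve so bad positions can be matched to codes. */
static void log_gss_result(FILE *out, int res) {
    fprintf(out, "GSS solve returned %d (%s)\n", res,
            gss_outcome_name(classify_gss_result(res)));
}
```

Calling log_gss_result(stderr, res) after each GSS solve would make iteration-limit exits visible in the log rather than silent.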

rocketmark pushed a commit to rocketmark/libsurvive that referenced this pull request Mar 13, 2026
Collaborator

@bl4ckb0ne bl4ckb0ne left a comment


The commit message is missing an Assisted-by tag. You can also drop the Fix: part.

MP_MAXITER (return code 5) means the solver exhausted its iteration
budget without converging. The current code treats this as success
(res <= 0 is false), so the unconverged — and potentially wrong —
lighthouse positions are written to disk via survive_recording and
the GSS result is accepted.

On the next launch those positions are loaded as the starting point
for calibration. If the unconverged solve is significantly wrong,
all subsequent tracking is corrupted and the only recovery is to
delete config.json.

Adding res == MP_MAXITER to the failure condition rejects unconverged
GSS solves the same way as explicit solver errors.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@rocketmark rocketmark force-pushed the upstream-pr/gss-maxiter-guard branch from 748a039 to a8cd1ab Compare March 22, 2026 14:41
Contributor Author

Changes made.

@bl4ckb0ne bl4ckb0ne merged commit 2f4303c into collabora:master Mar 23, 2026
2 of 4 checks passed