Add gradient-based optimization (CG) and multi-start support#65
Open
Marius1311 wants to merge 1 commit into main from
Conversation
Add analytical Jacobian for the GPCCA rotation matrix objective, enabling gradient-based optimizers (CG, L-BFGS-B, BFGS) as alternatives to Nelder-Mead. Add multi-start optimization via SO(k) rotation perturbation to escape local optima.

Key changes:
- _jacobian(): backpropagates through _fill_matrix (row-sum constraint, max condition, rescaling) for the correct gradient
- _perturb_rotation(): perturbs the rotation matrix on the SO(k) manifold via expm(epsilon * S) with random skew-symmetric S
- _opt_soft(): dispatches to scipy.optimize.minimize for gradient-based methods, keeps fmin for Nelder-Mead
- _gpcca_core(): multi-start loop with degeneracy filtering
- _fill_matrix(): optional _return_intermediates for the Jacobian backward pass, avoiding forward-pass duplication
- GPCCA.optimize(): new parameters method, n_starts, perturbation_scale, seed -- all backward compatible (defaults reproduce old behavior)

Tests:
- Jacobian vs finite differences (m=3, m=5)
- CG vs Nelder-Mead crispness comparison
- All 4 methods produce valid memberships
- Full GPCCA pipeline with CG
- _perturb_rotation preserves orthogonality
- Multi-start >= single-start crispness
- Seed determinism
Summary
This adds gradient-based optimizers (CG, L-BFGS-B, BFGS) as alternatives to Nelder-Mead for the GPCCA rotation matrix optimization, along with a multi-start mechanism to improve solution quality.
Motivation
The existing Nelder-Mead optimizer works well for small numbers of macrostates but becomes very slow for m > 15 — it optimizes over O(m²) parameters without gradient information. In CellRank, users increasingly want to compute 20+ macrostates, which motivated exploring gradient-based alternatives.
What's in this PR
**Analytical Jacobian** (`_jacobian`): Computes the exact gradient of the crispness objective by backpropagating through `_fill_matrix`. The three transformations in `_fill_matrix` (row-sum constraint, max condition, rescaling) introduce dependencies between matrix entries that a naive gradient (differentiating the objective w.r.t. individual entries) would miss. The implementation backpropagates through all three steps correctly. Verified against `scipy.optimize.approx_fprime` in tests.

**Gradient-based optimizers**:
`_opt_soft` now accepts a `method` parameter. For `"CG"`, `"L-BFGS-B"`, and `"BFGS"`, it uses `scipy.optimize.minimize` with the analytical Jacobian. Nelder-Mead remains the default and uses the existing `fmin` path; behavior is unchanged when no method is specified.

**Multi-start optimization** (
`_perturb_rotation`, `_gpcca_core`): For `n_starts > 1`, the first run uses the deterministic ISA initialization, and subsequent runs perturb the initial rotation matrix on the SO(k) manifold via `expm(epsilon * S)` with random skew-symmetric S. The result with the best crispness is kept. Degenerate solutions (where a cluster is never the argmax for any cell) are filtered out. Default is `n_starts=1` (fully backward compatible).

**`_fill_matrix` refactoring**: Added an optional `_return_intermediates` parameter that returns the pre-scaling matrix, argmax indices, and scale factor needed by the Jacobian backward pass. This avoids duplicating the forward-pass logic between `_fill_matrix` and `_jacobian`.

Testing we did
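The Jacobian verification pattern used in the tests, comparing the analytical gradient against `scipy.optimize.approx_fprime`, can be sketched with a stand-in objective. Note that `objective` and `jacobian` here are placeholders for illustration, not the pyGPCCA internals (the real test differentiates the crispness objective through `_fill_matrix`):

```python
import numpy as np
from scipy.optimize import approx_fprime

def objective(alpha):
    # stand-in smooth objective over the flattened rotation parameters
    return float(np.sum(np.cos(alpha)) + 0.5 * np.dot(alpha, alpha))

def jacobian(alpha):
    # analytical gradient of the stand-in objective
    return -np.sin(alpha) + alpha

rng = np.random.default_rng(0)
for m in (3, 5):  # mirrors the m=3 / m=5 test cases
    alpha0 = rng.standard_normal((m - 1) ** 2)  # (m-1)^2 free parameters
    num_grad = approx_fprime(alpha0, objective, 1e-7)
    assert np.allclose(jacobian(alpha0), num_grad, atol=1e-5)
```

The same pattern applies to the real Jacobian: flatten the free rotation parameters, evaluate both gradients at random points, and compare elementwise.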
We benchmarked on the CellRank pancreas dataset (2,531 cells, PseudotimeKernel) and bone marrow dataset (5,780 cells) across m = 3..50.
Benchmark scripts are in a separate analysis repo.
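The property the orthogonality test relies on, that `expm(epsilon * S)` with skew-symmetric `S` is a proper rotation, can be checked directly. `perturb_rotation_sketch` below is a hypothetical stand-in for `_perturb_rotation`, not the actual implementation:

```python
import numpy as np
from scipy.linalg import expm

def perturb_rotation_sketch(rot, perturbation_scale=0.1, seed=None):
    # expm of a scaled random skew-symmetric matrix is a proper rotation,
    # so left-multiplying keeps rot on the SO(k) manifold
    rng = np.random.default_rng(seed)
    k = rot.shape[0]
    A = rng.standard_normal((k, k))
    S = A - A.T  # skew-symmetric: S.T == -S
    return expm(perturbation_scale * S) @ rot

rot = perturb_rotation_sketch(np.eye(4), perturbation_scale=0.1, seed=42)
assert np.allclose(rot @ rot.T, np.eye(4), atol=1e-8)  # orthogonality
assert np.isclose(np.linalg.det(rot), 1.0)             # proper rotation
```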
Tests
14 new tests covering:
- Jacobian vs finite differences (m=3, m=5)
- CG vs Nelder-Mead crispness comparison
- All 4 methods produce valid memberships
- Full `GPCCA.optimize()` pipeline with CG
- `_perturb_rotation` preserves orthogonality (det of applied rotation = ±1)
- Multi-start >= single-start crispness
- Seed determinism

API
All new parameters on `GPCCA.optimize()`:

- `method`: `"Nelder-Mead"` (default), `"CG"`, `"L-BFGS-B"`, `"BFGS"`
- `n_starts`: number of optimization runs (default 1)
- `perturbation_scale`: angular scale for the SO(k) perturbation (default 0.1)
- `seed`: random seed for reproducibility

Defaults reproduce the existing behavior exactly.