Sparse pullback for big performance gain #2170
Co-authored-by: Kaya Unalmis <kayaunalmis@proton.me>
When is this getting merged?
f0uriest
left a comment
Still about 1/2 to go but leaving these here for now.
One big point is that if we're messing with custom AD stuff, I think it would be good practice to add tests comparing AD of the relevant objectives to finite differences (can use very low res, we don't care about physics convergence), both to check that the implementation is correct and to guard against us accidentally applying the sparse pullback in places where it's not strictly correct.
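A minimal sketch of the kind of test suggested above, assuming a toy function with a hand-written pullback registered through `jax.custom_vjp`. The names `soft_abs`, `objective`, and `finite_difference_grad` are hypothetical stand-ins and are not taken from this PR; a real test would call the actual objectives at low resolution.

```python
import jax
import jax.numpy as jnp

# Double precision keeps the finite-difference round-off small.
jax.config.update("jax_enable_x64", True)


@jax.custom_vjp
def soft_abs(x):
    # Hypothetical stand-in for a function that gets a custom pullback.
    return jnp.sqrt(x**2 + 1.0)


def _soft_abs_fwd(x):
    y = jnp.sqrt(x**2 + 1.0)
    return y, (x, y)  # save residuals for the backward pass


def _soft_abs_bwd(residuals, cotangent):
    x, y = residuals
    # Hand-written pullback: d/dx sqrt(x^2 + 1) = x / sqrt(x^2 + 1).
    return (cotangent * x / y,)


soft_abs.defvjp(_soft_abs_fwd, _soft_abs_bwd)


def objective(x):
    # Toy scalar objective built on the custom-AD function.
    return jnp.sum(soft_abs(x) * jnp.cos(x))


def finite_difference_grad(f, x, eps=1e-6):
    """Central differences, one coordinate of a 1-D input at a time."""
    basis = jnp.eye(x.size, dtype=x.dtype)
    return jnp.stack(
        [(f(x + eps * e) - f(x - eps * e)) / (2 * eps) for e in basis]
    )


x0 = jnp.array([0.3, -1.2, 2.0])
grad_ad = jax.grad(objective)(x0)  # uses the hand-written pullback
grad_fd = finite_difference_grad(objective, x0)
max_abs_err = float(jnp.max(jnp.abs(grad_ad - grad_fd)))
```

In a test suite, `max_abs_err` would be asserted against a tolerance chosen for the step size. JAX also provides `jax.test_util.check_grads`, which performs a similar numerical comparison and may be a better fit than a hand-rolled loop.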
That is not true/possible. See the supplementary information in publications. Briefly, for nontrivial computational problems where not everything is C^∞, an algorithm to solve a problem needs to have amazing convergence properties, and be robust to topology changes, for the discretization error to be correlated enough near a given point in the optimization space for the finite difference derivative to have any chance of being accurate (again, explained better in the PDF). You can see that finite difference derivatives only make sense for high-resolution computations of the algorithm. Auto diff makes sense at any resolution because it estimates the derivative from information at a single point in the optimization landscape. (Of course, if discretization error is high, an optimization could still stall, since varying discretization error can affect the descent direction, but that's unrelated to this discussion.) In general you'll need high res to get finite diff to match auto diff.
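A toy illustration of one mechanism behind this disagreement, not taken from the PR or the cited publications: a midpoint-rule estimate of the integral of `max(p - x, 0)` over `[0, 1]` has a kink in `p` at every quadrature node. The function `discretized_objective`, the step `eps = 0.05`, and the point `p = 0.4` are all assumptions chosen for the sketch. AD differentiates the discretized function at a single point, while a central difference with a fixed step can straddle a kink.

```python
import jax
import jax.numpy as jnp

jax.config.update("jax_enable_x64", True)


def discretized_objective(p, n):
    # Midpoint-rule estimate of the integral of max(p - x, 0) on [0, 1].
    # As a function of p it is piecewise linear with a kink at each node.
    nodes = (jnp.arange(n) + 0.5) / n
    return jnp.sum(jnp.maximum(p - nodes, 0.0)) / n


def ad_fd_mismatch(p, n, eps=0.05):
    # AD: derivative of the discretized function at the single point p.
    ad = jax.grad(discretized_objective)(p, n)
    # FD: central difference with a fixed step that may straddle a kink.
    fd = (
        discretized_objective(p + eps, n) - discretized_objective(p - eps, n)
    ) / (2 * eps)
    return float(jnp.abs(ad - fd))


mismatch_low_res = ad_fd_mismatch(0.4, n=4)
mismatch_high_res = ad_fd_mismatch(0.4, n=400)
```

With these particular choices the step at `n=4` straddles the node at 0.375 and the two derivatives differ by about 0.06, while at `n=400` they agree to round-off. This only shows that a fixed-step comparison can fail at low resolution in a simple non-smooth setting; it says nothing about how the actual objectives behave.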
- `sparse_pullback` and `sparse_pullback_map` …
- … `bounce1d` optimization.
- … (`is_reshaped`, `is_fourier`) that users said were confusing (backwards compatible), as well as the developer flags `Bref`, `Lref` that should not be there.
- `pitch_batch_size` was getting ignored. This fixes that by adding a `strip_dim0` flag to `batch_map`.
- notes