fix: optimize segment compaction by reusing first vector index#114
fix: optimize segment compaction by reusing first vector index#114kgeg401 wants to merge 3 commits into
Conversation
|
Thank you for your contribution. The zvec maintainer team is currently on holiday for the Chinese New Year (Spring Festival) in mainland China. Please expect delayed responses, and we appreciate your patience. |
0cb23df to
98d5708
Compare
98d5708 to
17581fd
Compare
|
Added a quantized-path regression test for segment compaction:\n\n- New CompactTask_QuantizedVectorIndexThreeSegmentsRegression in segment_helper_test.cc.\n- Covers multi-segment, no-filter compaction with quantized vector indexes and verifies quantized index availability after compaction.\n\nThis is intended to directly exercise the get_quant_vector_indexer(...) merge-reuse path. |
|
Added the quantized-path regression test for segment compaction, and the PR is otherwise unchanged. This should be ready for maintainer review when bandwidth allows. |
|
Thanks a lot for the contribution, and sorry for the long delay on this review — we had some internal bandwidth constraints. We've made PR/issue handling to a regular cadence :) On the change itself: I feel this optimization is better placed in the core reducer layer( Also, nice improvement on deduplicating repeated merge logic. One more cleanup suggestion: merge_options construction could likely be moved out of the lambda and built once outside the loop. |
|
Following up on my earlier comment — after digging deeper into the implementation and thinking about where this is heading, I'd like to walk back the suggestion to push this into Reversing the layering call Moving "pick a base indexer / copy the file / fall back to full rebuild" breaks the abstraction. Also, future compaction/optimize improvements might bring more db-level scheduling (e.g., segments reordering). So we'd rather keep this layer in A few issues in the current implementation
Next steps We've taken your PR as a starting point, fixed the issues above, and refactored a bit. That work is up at #440. We'll close this PR once that's merged — thanks again for the initial version, it gave us a base to build on :) |
Summary
Root Cause
ReduceVectorIndex always rebuilt merged indexes from scratch, even when compaction had no filter and segments were layout-compatible. That caused avoidable index rebuild work during merge/compact.
Validation
Fixes #98