fix: #430 HNSW insert beam + distance-based pruning + storage rebuild + CI guards#471
Open
ruvnet wants to merge 5 commits into
Open
fix: #430 HNSW insert beam + distance-based pruning + storage rebuild + CI guards#471ruvnet wants to merge 5 commits into
ruvnet wants to merge 5 commits into
Conversation
…ning + storage rebuild Three remaining root causes from issue #430, plus the storage-rebuild gap from PR #460. Bug B — insert beam was clamped to ef_construction.min(m * 2). With defaults (m=16, ef_construction=200) the beam silently became 32. Late- inserted clusters got wired through whatever was near the entry point instead of through ef_construction-wide neighbour search. Bug C — adjacency-list pruning used `drain(0..drain_count)`, dropping the OLDEST edges regardless of distance. Proper HNSW pruning keeps the m CLOSEST edges. Now sort by `calculate_distance` to the anchor vector and truncate to m. Kept a fallback that preserves the newest-m behaviour when the anchor vector lookup fails so we never panic on a missing vector. Storage — VectorDB::new() always created a fresh empty HnswIndex, so previously persisted vectors were invisible to search after reopening the database. Now rebuild via storage.get_all_ids() + index.insert_batch() on open, and seed VectorDbStats.total_vectors with the recovered count. Tests: - test_pruning_keeps_closest_not_newest: builds a hub with 20 close neighbours then 6 far neighbours, asserts no "far_*" id appears in top-10 around the hub. Fails on FIFO pruning. - test_index_rebuilt_from_storage_on_open: writes 5 vectors via one VectorDB instance, reopens against the same path, asserts search returns the persisted match. Fails on the historical empty-index bug. Regression-guard CI additions: - hnsw-insert-beam-no-m2-clamp: textually forbids the ef_construction.min(m*2) pattern in index.rs. - hnsw-distance-based-neighbor-pruning: requires calculate_distance and the `> m * 2` overflow gate to both live in index.rs. - vector-db-rebuilds-index-on-open: requires storage.get_all_ids() in vector_db.rs. - hnsw-recall-at-1 job now also runs the two new tests. Supersedes PR #460 (CoolDude1969) which covered storage rebuild + an overlapping heap fix already in main from PR #466. Closes #430. Co-Authored-By: claude-flow <ruv@ruv.net>
Surface the #430 HNSW correctness fixes (insert beam, distance-based pruning, storage rebuild) to npm consumers. Bump applies to the meta package and all 5 platform-specific subpackages so optionalDependencies resolve consistently after publish-all.yml runs. Co-Authored-By: claude-flow <ruv@ruv.net>
The expanded README and 0.1.1 version were already published to npm by an earlier release, but never committed back to git. Verified identical to `npm pack @ruvector/diskann@0.1.1`. Bringing the working tree in sync so future bumps start from a clean baseline. Co-Authored-By: claude-flow <ruv@ruv.net>
No behaviour change — collapses single-expression closure and assignment onto one line per rustfmt defaults so the rustfmt CI job passes. Co-Authored-By: claude-flow <ruv@ruv.net>
The `optional-deps-resolvable-on-npm` regression guard fails because
@ruvector/router-<platform>@0.1.31 doesn't exist on npm yet — those
platform binaries are only published by `publish-all.yml` after a tag is
cut, which happens AFTER this PR merges.
Splitting the work:
- This PR: HNSW correctness fix + CI guards (keeps regression-guard
green on every commit).
- Follow-up release PR: bump @ruvector/router meta + 5 platform
packages to 0.1.31, tag v0.1.31, publish-all.yml ships the fix.
This commit reverts c5c7e7f and is itself reverted in the release PR.
Co-Authored-By: claude-flow <ruv@ruv.net>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes the three remaining root causes from #430 plus the storage-rebuild gap that PR #460 separately identified. All previously-undocumented HNSW correctness bugs at scale.
ef_construction.min(m * 2)silently becamem * 2(32 with defaults) instead of 200. Late-inserted clusters got wired through whatever sat near the entry point. Removed the clamp.drain(0..drain_count), dropping the OLDEST edges regardless of distance. Proper HNSW keeps the m closest. Now scores viacalculate_distanceand truncates to m, with a guarded fallback if the anchor vector lookup fails.VectorDB::new()always created a fresh emptyHnswIndex, so persisted vectors were invisible to search after reopen. Rebuild viastorage.get_all_ids()+index.insert_batch()on open and seedtotal_vectorsfrom the recovered count.Tests
test_pruning_keeps_closest_not_newest— builds a hub with 20 close neighbours then 6 far neighbours, asserts nofar_*id appears in top-10 around the hub. Fails on FIFO pruning.test_index_rebuilt_from_storage_on_open— writes 5 vectors with oneVectorDB, reopens against the same path, asserts search returns the persisted match. Fails on the empty-index regression.All 21 router-core unit tests pass locally (release):
```
test result: ok. 21 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
```
Regression-guard CI
Three new structural guards complement the behavioural
hnsw-recall-at-1job:The existing `hnsw-recall-at-1` job now also runs `test_pruning_keeps_closest_not_newest` and `test_index_rebuilt_from_storage_on_open`.
Release plan
This PR does NOT bump `@ruvector/router` — the original 0.1.31 bump was reverted because the `optional-deps-resolvable-on-npm` regression guard fails when platform-specific subpackages don't yet exist on npm (CI publishes them only after a tag is cut).
After this PR merges, a separate release PR will:
Test plan
Supersedes
PR #460 (CoolDude1969) — overlapping work; this PR keeps the storage-rebuild idea and adds bugs B + C + CI guards. Credited in the closing comment.
🤖 Generated with claude-flow