feat(memory): pluggable storage backends#998
feat(memory): pluggable storage backends#998erma07 wants to merge 2 commits intoRightNow-AI:mainfrom
Conversation
jaberjaber23
left a comment
There was a problem hiding this comment.
Security Audit — APPROVED (with rebase needed)
Verdict: APPROVE — clean, well-architected refactor. Cannot merge due to conflicts.
Security findings:
-
No data exfiltration — The HTTP semantic backend only routes to a user-configured
memory_api_urlfrom config, with automatic fallback to local storage on failure. No hardcoded external endpoints. -
No SQL injection — PostgreSQL backend uses parameterized queries (
$1,$2etc.) throughout. Noformat!()with SQL + user data found. -
Unsafe blocks are legitimate — All 5
unsafeblocks are forsqlite_vec::sqlite3_vec_initFFI registration (SQLite vector extension). This is an existing pattern that was moved, not introduced. The transmute is required for the SQLite auto-extension API. -
No secrets in diff — Docker Compose test credentials (
POSTGRES_PASSWORD: openfang) are for local integration testing only, under thetest/dbprofile. Acceptable. -
No new unsafe code paths — The trait-based architecture is actually more secure than before:
usage_conn()which leaked rawrusqlite::Connectionto external crates has been removed. All external callers now go throughArc<dyn UsageBackend>trait objects. -
Line count justified — 7171 additions across 22 Rust files + Cargo.lock changes. The breakdown: ~1300 lines Cargo.lock dependency updates, ~2000 lines PostgreSQL backend (11 files mirroring SQLite), ~700 lines Qdrant backend, ~300 lines HTTP backend rename/extension, ~800 lines backend traits + helpers, ~500 lines migration code, ~500 lines tests, ~1000 lines SQLite file reorganization (moved, not new).
-
Qdrant backend — Uses official
qdrant-clientcrate via gRPC. Auto-creates collections with cosine similarity. No authentication bypass.
The PR has merge conflicts (DIRTY state). Author needs to rebase before merge.
|
This PR has merge conflicts. Please rebase onto the latest main branch and resolve conflicts so we can merge. |
f2ec259 to
d42e5f7
Compare
|
resolved |
jaberjaber23
left a comment
There was a problem hiding this comment.
Thanks for the ambitious pluggable-memory work @erma07 — the direction (trait split, SQLite vs Postgres vs Qdrant, HTTP store) is something we want. But at +7243/-2112 across 48 files, this has to land clean. Requesting changes.
Blockers
HttpSemanticStore::rememberreturns a randomMemoryId::new()instead of the server's ID (crates/openfang-memory/src/http/semantic.rs— around therememberimpl). It logsresp.idbut discards it.forgetandupdate_embeddingthen go to fallback with wrong IDs — silently broken for the entire HTTP mode.HttpSemanticStore::recallfabricates a freshMemoryId::new()per row — IDs are unstable across calls, so dedup/forget/update semantics don't work at all for HTTP recall.MemorySubstrate::open_postgresusestokio::runtime::Handle::current().block_on(...). Panics ifMemorySubstrate::openis ever called outside a Tokio runtime. Confirm the kernel's boot path and either guarantee a runtime is active or switch to async-open.- Docs vs code disagreement on
semantic_backend.substrate.rsheader comments list"qdrant"and"postgres+qdrant"as valid values, butselect_semantichas no"postgres"arm —Ok(default)fallback silently accepts unknown values. Either implement Postgres-backed semantic or remove it from the docs. Cargo.lockshows workspace version0.5.9 → 0.5.5and dropsrustlsfromopenfang-kernel. Almost certainly a wrong-base-branch artifact. Please rebase on currentmainand regenerate the lockfile.- Breaking API changes without a migration note.
usage_conn()is removed and module paths (consolidation,knowledge, etc.) moved undersqlite::…. External consumers usinguse openfang_memory::knowledge::…will break. Either add re-exports for one release or add a clear MIGRATION note. - Silent audit-log restore failure.
AuditLog::with_backendusesif let Ok(rows) = backend.load_entries(...)and proceeds as if empty on Err. For a security audit trail, a failed load should at least warn loudly, ideally fail-closed.
Concerns
unsafeFFI registration forsqlite-vecviasqlite3_auto_extension+transmutein the productionSqliteBackend::openpath — the PR claim "no new unsafe" is inaccurate. The usage is standard for the crate, but please be explicit in the docs/changelog.QdrantSemanticStore::recallreturnsOk(vec![])whenquery_embeddingisNone. Should returnError an explicit "embedding required" error.- JSONL mirror writes
sessions_dir.join(format!("{}.jsonl", session.id.0))— document thatSessionIdis UUID-only so reviewers can rule out path traversal. docker-compose.ymladdsPOSTGRES_PASSWORD: openfang— fine for thedevprofile, please mark it so nobody copies it to prod.- Whether
openfang-kernel/ the binary crates enable thepostgres/qdrantfeatures onopenfang-memoryisn't visible in the snippets I reviewed — please confirm end-to-end wiring.
New dependencies (all direct adds trustworthy, some supply-chain work needed)
sqlite-vec, hex, sha2, tokio-postgres, deadpool-postgres, pgvector, qdrant-client — names match well-known crates, not typosquats. sqlite-vec is specialized; please confirm it's maintained and has wheels/builds for all platforms in the release matrix (Windows in particular). The transitive set (tonic, prost-types, hyper-timeout, deadpool, stringprep, etc.) is typical for gRPC + Postgres and acceptable.
Recommendation
Land as-is: no. Land after fixes: yes, this is valuable. Specifically:
- Fix the HTTP ID handling (blockers 1 & 2).
- Either implement or remove
semantic_backend = "postgres". - Rebase and regenerate
Cargo.lockcleanly against currentmain. - Resolve the Tokio-handle concern for
open_postgres. - Restore behavior on audit-log load failure.
- Integration-test the hybrid matrix (
backend = "sqlite"+semantic_backend = "qdrant", etc.) — thebackend_integration.rstest file is there, just make sure CI actually runs it with thepostgres,qdrantfeatures enabled.
Happy to re-review once those are in.
…t support Redesign the openfang-memory crate for pluggable storage backends. The main backend (sqlite or postgres) and semantic backend (sqlite, postgres, qdrant, http) are independently configurable. Architecture: - substrate.rs is 100% backend-agnostic (zero rusqlite imports) - 9 backend traits: Structured, Semantic, Knowledge, Session, Usage, PairedDevices, TaskQueue, Consolidation, Audit - SessionBackend has 5 default trait impls (create_session, canonical_context, append_canonical, store_llm_summary, etc.) - Shared helpers.rs for serialization/parsing across backends - JSONL session mirror extracted to standalone filesystem utility Backends: - sqlite/ — 11 files, full implementation with sqlite-vec vectors - postgres/ — 11 files, full implementation with pgvector - qdrant/ — semantic-only, gRPC vector similarity search - http/ — semantic-only, remote memory-api gateway with fallback External callers migrated: - kernel uses memory.usage_arc() and memory.audit() (was usage_conn()) - api routes use memory.usage() trait method - runtime AuditLog uses AuditBackend trait (was raw rusqlite Connection) - MeteringEngine accepts Arc<dyn UsageBackend> (was Arc<UsageStore>) Config: [memory] backend = "sqlite" # or "postgres" semantic_backend = "qdrant" # independently: sqlite/postgres/qdrant/http Docker: pgvector/pg18 + qdrant services for integration testing. 65 tests (40 unit + 25 integration) across all backends.
…ructure Resolves the seven blockers and five concerns from the pluggable-memory PR review, plus a restructure pass that tightens correctness + ergonomics. Blockers - HTTP semantic store parses server-emitted UUIDs via a new parse_memory_id helper instead of fabricating MemoryId::new(); forget/update_embedding return explicit Err instead of silently delegating to a disconnected fallback store (B1, B2). - New async MemorySubstrate::open_async does all Postgres init via real .await; no more nested Handle::current().block_on. Kernel boot uses it; sync open() errors cleanly for async-only backends (B3). - PostgresSemanticStore implemented over deadpool-postgres + pgvector. Config is now typed: MemoryBackendKind / SemanticBackendKind with rename_all=snake_case + deny_unknown_fields. Dispatch is an exhaustive enum match; unknown values are rejected at parse time (B4). - Rebased onto v0.5.10, Cargo.lock regenerated cleanly (B5). - MIGRATION.md documents every break with before/after snippets (B6). - AuditLog::with_backend now returns OpenFangResult<Self>; load errors and integrity-check failures both tracing::error! and propagate via ?. Boot fails closed on a broken audit trail (B7). Concerns - SAFETY comments on both sqlite-vec FFI registration blocks (C1). - Qdrant recall returns Err on None embedding; Qdrant/HTTP init now fail-fast with the backend name + URL in the error message, no silent SQLite fallback (C2). - jsonl.rs documents that SessionId wraps uuid::Uuid, so the filename component is path-traversal safe (C3). - docker-compose.yml is fully env-overridable with dev-only comments (C4). - Feature forwarding wired end-to-end: openfang-memory -> openfang-kernel -> openfang-cli (C5). Test feature_gated_backend_errors_cleanly_when_ feature_off locks the behavior in. Restructure pass - Dropped DEFAULT from the Postgres task_queue v1 migration so it matches SQLite's strict NOT NULL semantics. - Error classification fix: "no Postgres pool available" is now Config (caller misuse), consistent with sibling feature-gate mismatches. - Qdrant recall skips + warns on malformed payloads instead of fabricating MemoryId/AgentId; mirrors the HTTP-side fix. - QdrantCollectionGuard RAII teardown deletes test collections on Drop. - MemoryConfig connection fields grouped into typed structs (PostgresConnConfig, QdrantConnConfig, HttpMemoryConnConfig) via serde(flatten) -- TOML shape is byte-identical; Rust consumers that construct MemoryConfig by hand must update to nested fields. - Unit tests for parse_memory_id; error-path test for Postgres pool misuse; four hybrid-matrix integration tests (sqlite+qdrant, sqlite+postgres, postgres+qdrant, qdrant_unreachable_fails_fast). - Removed dead PgSemanticStore alias and RawGraphRow::r_id field. CI: new integration-backends job brings up pgvector/pgvector:pg16 and qdrant/qdrant:latest as services and runs cargo test -p openfang-memory --features postgres,qdrant --test backend_integration, exercising the hybrid matrix end-to-end. Verification: cargo build across default / --no-default-features / --features postgres,qdrant all green; 2302 workspace lib tests pass; clippy clean on the same matrix with -D warnings.
d42e5f7 to
32196b7
Compare
|
@jaberjaber23 , please review the changes |
Summary
Redesign the
openfang-memorycrate with pluggable storage backends. The main storage backend (SQLite or PostgreSQL) and the semantic/vector backend (SQLite, PostgreSQL, Qdrant, HTTP) are now independently configurable, allowing mix-and-match deployments like PostgreSQL for structured data with Qdrant for vector search.Architecture
The orchestration layer (
substrate.rs) is now 100% backend-agnostic — zero database-specific imports. All storage is abstracted through 9 backend traits with implementations per database:Each backend lives in its own folder (
sqlite/,postgres/,qdrant/,http/) with identical file structure (11 files each for SQLite and PostgreSQL). Shared serialization and parsing logic is extracted intohelpers.rsto eliminate cross-backend duplication.SessionBackenduses Rust default trait implementations for 5 methods (create_session,create_session_with_label,append_canonical,canonical_context,store_llm_summary) — new backends only need to implement the storage primitives.Changes
PairedDevicesBackend,TaskQueueBackend,ConsolidationBackend,AuditBackend(added to existingStructuredBackend,SemanticBackend,KnowledgeBackend,SessionBackend,UsageBackend)sqlite/, all database-specific code isolated in its backend folderArc<dyn UsageBackend>,AuditBackend) — no more leakedrusqlite::Connectiontypessemantic_backendallows independent vector backend selectionpgvector/pgvector:pg18andqdrant/qdrant:latestservices for integration testingTesting
cargo clippy --workspace --all-targets -- -D warnings passes
cargo test --workspace passes (65 tests: 40 unit + 25 integration)
Integration tested against live PostgreSQL (pgvector/pg18) and Qdrant
All SQLite, PostgreSQL, and Qdrant backends verified with identical test suite
Security
No new unsafe code (existing sqlite-vec FFI registration unchanged)
No secrets or API keys in diff
User input validated at boundaries
usage_conn() removed — no more raw database connection leaks to external crates
Configuration