Experiment Memory bloomnet

Building our own frame-graph storage in SQLite with tantivy BM25 and HNSW vector search will achieve the same retrieval quality as Memvid while preserving our typed graph structure and audit pipeline

April 7, 2026

architecture, knowledge-graph, competitive-analysis, rust

Hypothesis

Building our own frame-graph storage in SQLite with tantivy BM25 and HNSW vector search will achieve the same retrieval quality as Memvid while preserving our typed graph structure and audit pipeline.

Result: confirmed

Key Findings

Approach C delivered: 711 frames indexed, hybrid search working, 72 scanners running, 86% audit pass rate on first run. SQLite + tantivy + HNSW provides all the retrieval capabilities without Memvid's single-writer lock or flat SPO triplet limitations.

Background

The vault needed to evolve from a manually maintained Obsidian wiki into an agent-accessible knowledge engine. The idea note ideas/2026-04-08-vault-engine-architecture explored three candidate architectures, each with different tradeoffs among retrieval quality, graph fidelity, and integration complexity.

The Three Approaches

Approach A: Memvid as Core Dependency

Use Memvid’s video-frame knowledge store as the primary storage layer. Vault notes would be encoded as Memvid frames, with search delegated to Memvid’s built-in BM25 + vector pipeline.

Pros: minimal code, proven retrieval quality, active community. Cons: single-writer lock prevents concurrent indexing; flat SPO triplet model cannot represent our typed dimensions, lifecycle chains, or visibility gates; Memvid’s frame schema is designed for video segments, not structured knowledge notes.

Approach B: Memvid as Sidecar

Keep the existing vault file structure but run Memvid alongside it as a search sidecar. Notes would be dual-indexed: filesystem for direct access, Memvid for retrieval.

Pros: retains file-based vault, adds retrieval without restructuring. Cons: dual-write consistency problem (filesystem and Memvid can drift); still inherits Memvid’s flat triplet model for graph queries; adds a Python dependency to an otherwise pure-Rust stack; sidecar process management adds operational complexity.

Approach C: Frame-Inspired Native Implementation

Build our own frame-graph storage natively in Rust, borrowing Memvid’s “smart frame” concept (content + metadata + embeddings in a unified record) but implementing it in SQLite with tantivy BM25 and HNSW vector search. Full control over schema, indexing, and query semantics.

Pros: typed graph schema matches our 14 vault dimensions exactly; single SQLite database, no dual-write; pure Rust, no Python sidecar; visibility gates and temporal queries built into the query layer; Karpathy ratchet enforced at the storage level. Cons: more code to write (estimated 10-15K lines); we must build and maintain the BM25 and vector search integration ourselves (mitigated by the tantivy and HNSW crates); no community validation until we benchmark.
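
The "smart frame" idea borrowed in Approach C (content, metadata, and embedding in one record) can be sketched roughly as follows. This is an illustrative sketch only, not the vault-core schema: the `Frame` struct, its field names, and the `embedding_blob` helpers are hypothetical. The 384-float length matches the bge-small embedding size reported in the results, and the little-endian byte layout is an assumption about how a vector might be stored in a single SQLite BLOB column.

```rust
use std::collections::BTreeMap;

/// Embedding width for bge-small, as reported in the results section.
pub const EMBEDDING_DIM: usize = 384;

/// Hypothetical unified record: text, typed metadata, and vector live together,
/// so one row can feed the BM25 index, the vector index, and the scanners.
#[derive(Debug, Clone, PartialEq)]
pub struct Frame {
    pub id: u64,
    /// One of the vault's typed dimensions, e.g. "idea" or "experiment".
    pub dimension: String,
    /// Note text (what a full-text index would tokenize).
    pub content: String,
    /// Free-form typed fields (hypothetical; the real schema is not shown in this note).
    pub metadata: BTreeMap<String, String>,
    /// Dense embedding of `content`.
    pub embedding: Vec<f32>,
}

impl Frame {
    /// Builds a frame, rejecting embeddings of the wrong width.
    pub fn new(id: u64, dimension: &str, content: &str, embedding: Vec<f32>) -> Result<Frame, String> {
        if embedding.len() != EMBEDDING_DIM {
            return Err(format!("expected {} floats, got {}", EMBEDDING_DIM, embedding.len()));
        }
        Ok(Frame {
            id,
            dimension: dimension.to_string(),
            content: content.to_string(),
            metadata: BTreeMap::new(),
            embedding,
        })
    }

    /// Packs the embedding as little-endian bytes (assumed BLOB layout).
    pub fn embedding_blob(&self) -> Vec<u8> {
        self.embedding.iter().flat_map(|v| v.to_le_bytes()).collect()
    }

    /// Inverse of `embedding_blob`; returns None if the byte length is wrong.
    pub fn embedding_from_blob(blob: &[u8]) -> Option<Vec<f32>> {
        if blob.len() != EMBEDDING_DIM * 4 {
            return None;
        }
        Some(
            blob.chunks_exact(4)
                .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
                .collect(),
        )
    }
}
```

Keeping the vector in the same record as the text and metadata is what removes the dual-write problem described under Approach B: there is only one row to update.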

Decision

Approach C was selected. The decisive factors:

Graph fidelity: our vault has 14 typed dimensions with explicit lifecycle chains (idea -> experiment -> breakthrough). Neither Memvid approach preserves this structure natively.
Audit pipeline: the 72 scanners need direct access to typed fields, not SPO triplets.
Stack coherence: rusty-bloomnet and rusty-dakka are pure Rust. Adding a Python sidecar would break the deployment model.
Concurrency: Memvid’s single-writer lock would block the file watcher from indexing during search queries.
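
To illustrate the graph-fidelity point, here is a minimal sketch of how a typed lifecycle chain differs from a flat triplet of strings. The `Stage` enum, `LifecycleEdge`, and `link` are hypothetical names introduced for illustration; only the chain idea -> experiment -> breakthrough comes from this note.

```rust
/// The lifecycle stages named in this note.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    Idea,
    Experiment,
    Breakthrough,
}

impl Stage {
    /// The only stage this one may lead to, if any.
    pub fn next(self) -> Option<Stage> {
        match self {
            Stage::Idea => Some(Stage::Experiment),
            Stage::Experiment => Some(Stage::Breakthrough),
            Stage::Breakthrough => None,
        }
    }
}

/// A typed edge between two notes (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LifecycleEdge {
    pub from_id: u64,
    pub from: Stage,
    pub to_id: u64,
    pub to: Stage,
}

/// Creates an edge only if it follows the lifecycle chain.
/// A flat (subject, predicate, object) triplet of strings has no such check.
pub fn link(from_id: u64, from: Stage, to_id: u64, to: Stage) -> Result<LifecycleEdge, String> {
    if from.next() == Some(to) {
        Ok(LifecycleEdge { from_id, from, to_id, to })
    } else {
        Err(format!("{:?} -> {:?} is not a valid lifecycle step", from, to))
    }
}
```

With this shape, an edge that skips from an idea straight to a breakthrough is rejected when it is created, rather than being discovered later by a scanner.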

Results

Approach C was implemented over April 8-9 in two phases: build (Apr 8) and integration (Apr 9).

Build day: 5 crates, 12,600 lines, 112+ tests.
vault-core: 12 SQLite tables, tantivy BM25, HNSW vectors, ONNX embeddings (bge-small, 384 dimensions), RRF fusion.
vault-scanners: 72 deterministic scanners.
vault-eval: KQI computation and ratchet.
vault-mcp: 20 tools and 5 resources.
vault-cli: 14 subcommands.
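
For readers unfamiliar with RRF (reciprocal rank fusion), the sketch below shows the general technique for merging a BM25 ranking with a vector ranking: each document scores the sum of 1 / (k + rank) over the lists it appears in, with rank starting at 1. This is a generic illustration, not the vault-core code. The function name `rrf_fuse` is hypothetical, and k = 60 is the constant commonly used in the literature; the value used in vault-core is not stated here.

```rust
use std::collections::HashMap;

/// Commonly used RRF smoothing constant (an assumption, not taken from vault-core).
pub const RRF_K: f64 = 60.0;

/// Fuses several best-first rankings of document ids into one list.
/// Returns (id, fused score) pairs sorted by descending score, ties broken by id.
pub fn rrf_fuse(rankings: &[Vec<u64>], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for ranking in rankings {
        for (i, id) in ranking.iter().enumerate() {
            // enumerate() is 0-based; RRF ranks are 1-based.
            let rank = (i + 1) as f64;
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}
```

Because RRF uses only rank positions, it needs no calibration between BM25 scores and vector distances, which live on different scales. For example, fusing a BM25 ranking of [1, 2, 3] with a vector ranking of [3, 1, 4] puts document 1 first, since it ranks high in both lists.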

Integration day: rv init indexed 711 frames with 19,475 scanner checks at 86% pass rate. MCP server registered, peon and stella wired, pre-commit hook and launchd file watcher deployed, 5 Leptos pages built.
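
As a rough illustration of how a pass rate like the one above can be derived from deterministic scanners, the sketch below runs every scanner against every note and divides passed checks by total checks. Everything here is hypothetical: the `Note` struct, the `Scanner` trait, and the two example scanners are invented for illustration, and the real vault-scanners crate may select which scanners apply to which frames rather than running all of them everywhere.

```rust
/// Minimal stand-in for an indexed note (hypothetical fields).
pub struct Note {
    pub title: String,
    pub dimension: String,
}

/// A deterministic check over typed fields: same note in, same verdict out.
pub trait Scanner {
    fn name(&self) -> &'static str;
    fn check(&self, note: &Note) -> bool;
}

/// Example scanner: the note must have a non-empty title.
pub struct HasTitle;
impl Scanner for HasTitle {
    fn name(&self) -> &'static str { "has-title" }
    fn check(&self, note: &Note) -> bool { !note.title.trim().is_empty() }
}

/// Example scanner: the note must declare a dimension.
pub struct HasDimension;
impl Scanner for HasDimension {
    fn name(&self) -> &'static str { "has-dimension" }
    fn check(&self, note: &Note) -> bool { !note.dimension.trim().is_empty() }
}

/// Runs every scanner on every note; returns passed / total, or None if nothing ran.
pub fn pass_rate(notes: &[Note], scanners: &[&dyn Scanner]) -> Option<f64> {
    let mut total = 0usize;
    let mut passed = 0usize;
    for note in notes {
        for scanner in scanners {
            total += 1;
            if scanner.check(note) {
                passed += 1;
            }
        }
    }
    if total == 0 { None } else { Some(passed as f64 / total as f64) }
}
```

Because the checks read typed fields directly, each verdict is reproducible from the stored record alone, which is the property the audit pipeline relies on.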

The hypothesis is confirmed: SQLite + tantivy + HNSW delivers the retrieval capabilities we need while preserving the typed graph structure, audit pipeline, and visibility model that Memvid’s architecture cannot accommodate.