Journal

Vault knowledge engine: 12,600 lines across 5 Rust crates in one day

vault-engine · rust · knowledge-graph · hybrid-search

2026-04-08

Signal

Built the complete vault knowledge engine in a single day: 5 Rust crates (vault-core, vault-scanners, vault-eval, vault-mcp, vault-cli) totaling 12,600 lines. The competitive analysis against five comparable systems drove the architecture decisions. The vault went from an interesting wiki with manual audit skills to a real knowledge engine with agent-queryable search, deterministic quality gates, and programmatic access.

Evidence

Competitive analysis: crawled 6 systems via parallel Opus sub-agents, producing a weighted ranking across 18 dimensions. The ranking showed we were leading by a hair (6.09 vs 6.08 for the nearest competitor). That margin was not comfortable. It said we had a good architecture and no moat. The day’s work was about building the moat.

Architecture decision: chose Approach C, a frame-inspired native implementation, over two alternatives that would have treated a third-party tool as either a dependency or a sidecar. The native path is more work upfront but avoids coupling the vault’s data model to someone else’s schema. That coupling is what turns a promising system into a maintenance burden six months later.

vault-core: SQLite with 12 tables, tantivy BM25, HNSW vectors, ONNX embeddings (bge-small at 384 dimensions), reciprocal rank fusion, visibility gates, temporal queries, graph traversal, file watcher. This is the foundation. Every other crate in the system builds on this.
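For reference, a minimal sketch of the reciprocal rank fusion step that merges the BM25 and vector result lists. This is not the vault-core code: the function name, the use of string ids, and the constant k = 60 (the value commonly quoted for RRF) are assumptions for illustration.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists of document ids with reciprocal rank fusion.
/// Each inner list is ordered best-first. `k` damps the advantage of the
/// very top ranks (60.0 is the commonly quoted default; an assumption here).
pub fn reciprocal_rank_fusion(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();

    for ranking in rankings {
        for (idx, doc_id) in ranking.iter().enumerate() {
            // Ranks are 1-based: the top hit contributes 1 / (k + 1).
            let rank = (idx + 1) as f64;
            *scores.entry((*doc_id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }

    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties broken by id so the output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

A document that appears in both lists collects a contribution from each, so it tends to outrank a document that is first in only one list. Only ranks are used, so the BM25 and cosine scores never need to be put on a common scale.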

vault-scanners: 72 deterministic scanners across 7 categories. The scanners run before any LLM touches the data, which means the quality floor is predictable. LLMs get bolted on for the jobs that actually need judgment, not for work that deterministic code can do better and cheaper.
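A sketch of what "deterministic scanner" means in practice: a pure function from note text to findings, with no model call anywhere in the path. The `Scanner` trait, the `Finding` struct, and the `TodoScanner` rule are all hypothetical names for illustration, not the actual vault-scanners API.

```rust
/// One issue reported by a scanner (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Finding {
    pub scanner: &'static str,
    pub line: usize, // 1-based line number
    pub message: String,
}

/// Hypothetical scanner interface: same input text, same findings, every run.
pub trait Scanner {
    fn name(&self) -> &'static str;
    fn scan(&self, text: &str) -> Vec<Finding>;
}

/// Example rule: flag lines that still contain a TODO marker.
pub struct TodoScanner;

impl Scanner for TodoScanner {
    fn name(&self) -> &'static str {
        "todo-marker"
    }

    fn scan(&self, text: &str) -> Vec<Finding> {
        text.lines()
            .enumerate()
            .filter(|(_, line)| line.contains("TODO"))
            .map(|(idx, _)| Finding {
                scanner: self.name(),
                line: idx + 1,
                message: "unresolved TODO marker".to_string(),
            })
            .collect()
    }
}

/// Run every scanner over one document and collect the findings in order.
pub fn run_scanners(scanners: &[Box<dyn Scanner>], text: &str) -> Vec<Finding> {
    scanners.iter().flat_map(|s| s.scan(text)).collect()
}
```

Because each rule is a pure function, the same vault state always yields the same findings, which is what makes the quality floor predictable.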

vault-eval: KQI computation with 7 weighted components plus a Karpathy ratchet for continuous optimization. The ratchet keeps pushing the metric up one small step at a time, which is the only reliable way I have found to move a composite score without over-fitting to a single dimension.
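A rough sketch of the two pieces, under stated assumptions: the composite is taken to be a weighted mean of component scores, and the ratchet is read as "keep a change only if the composite strictly improves, otherwise revert". The names `composite_score` and `Ratchet` are illustrative; the actual KQI components and weights are not reproduced here.

```rust
/// Weighted mean of component scores: sum(w_i * c_i) / sum(w_i).
/// Returns None if the slices differ in length, are empty, or the
/// weights do not sum to a positive number.
pub fn composite_score(components: &[f64], weights: &[f64]) -> Option<f64> {
    if components.len() != weights.len() || components.is_empty() {
        return None;
    }
    let total_weight: f64 = weights.iter().sum();
    if total_weight <= 0.0 {
        return None;
    }
    let weighted: f64 = components.iter().zip(weights).map(|(c, w)| c * w).sum();
    Some(weighted / total_weight)
}

/// Keep-only-improvements ratchet: remembers the best score seen so far.
pub struct Ratchet {
    best: f64,
}

impl Ratchet {
    pub fn new(baseline: f64) -> Self {
        Self { best: baseline }
    }

    pub fn best(&self) -> f64 {
        self.best
    }

    /// Returns true and records the score if the candidate strictly beats
    /// the best so far; returns false otherwise, leaving the caller to
    /// revert the change that produced it.
    pub fn try_accept(&mut self, candidate: f64) -> bool {
        if candidate > self.best {
            self.best = candidate;
            true
        } else {
            false
        }
    }
}
```

With 7 components the call would pass two 7-element slices. A stricter variant could also reject any change that lowers an individual component, which is one way to guard against over-fitting to a single dimension.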

vault-mcp: 20 MCP tools and 5 resources over stdio JSON-RPC. This is the agent-facing surface. Every tool here is something an agent can call without me being present, which is what makes the whole system useful when I am not in the loop.

vault-cli: the rv binary with 14 subcommands. This is the human-facing surface. If the MCP tools are for agents, rv is for me.

Testing: 112 tests passing, subagent-driven development with spec and quality reviews for each crate. No crate shipped without both a spec and an audit pass.

So What

The competitive analysis drove the scope. When you are leading by 0.01 on a composite score, you either need a radically different architecture or a dramatic expansion in what you cover. I chose expansion. The key insight from the analysis: no system combines a deep typed knowledge graph, lifecycle chains, an audit ratchet, hybrid retrieval, temporal frames, and MCP. We now do. The projected score with these additions jumps to 8.30, which is about 36% ahead of the next system (8.30 against 6.08). That is a real margin.

The day itself was an exercise in what a 12K-line Rust sprint looks like when the spec work has already been done. Each crate took a handful of subagent turns because the boundaries were clear: core owns the data, scanners own the quality checks, eval owns the metrics, mcp owns the protocol, cli owns the human surface. When the module boundaries are that clean, the implementation rate can be close to linear in lines of spec.

This transforms the vault from a static document store into an active, queryable, self-auditing knowledge system. It is also the largest single-day build the vault has seen, which is a nice artifact of how much leverage you get when the preparatory work is real.

What’s Next

The immediate next step is dogfooding. Every day I work in the vault, the engine should surface at least one thing I would otherwise have missed. If it does not, the retrieval layer is underperforming and the RRF fusion weights need tuning. If it does, the system is already paying rent.

After that, the extension path is clear: more scanners, more MCP tools, more aggressive ratchet cycles on the KQI. The hard part is shipped. The rest is iteration.

Log

  • Sessions: long sprint, subagent-driven
  • Repos: rusty-bloomnet (5 new crates)
  • Lines: 12,600 across 5 crates
  • Tests: 112+ passing
  • Mode: spec-then-implement via parallel subagents