Pitfall ai-agents

Circular Knowledge Corruption

April 2, 2026

pitfallai-agentsKNOWLEDGE

What Happened

In the LLM Knowledge Base pattern’s compounding loop (output filed back as raw, then recompiled), errors in LLM-generated articles get re-ingested as authoritative source material. Each compilation cycle amplifies the original error because the system cannot distinguish between:

Human-curated raw sources (high authority, externally verified)
LLM-generated outputs filed back (lower authority, potentially erroneous)

Over multiple compilation cycles, a small factual error grows into an established “fact,” self-reinforced by multiple articles that all trace back to the same erroneous LLM output. This is the knowledge-management equivalent of training an LLM on its own outputs: model collapse through self-reinforcement.

Root Cause

See definitions/root-cause-analysis for the analytical framework. Specific cause: the compounding loop lacks provenance tracking. Filed-back LLM outputs enter the raw/ directory with the same authority as human-curated sources. The compilation step weights all raw sources equally, giving LLM-generated content the same trust as verified external content.

How to Avoid

Provenance tagging: tag all filed-back outputs with source: llm-generated and a timestamp. Weight human-curated sources higher at compile time
Validation gate: add a verification step before filing outputs back to raw/ : cross-check claims against external sources
Periodic human review: randomly sample compiled articles and verify factual accuracy against original sources
Hash-based drift detection: track article content hashes across compilations. Flag articles that drift significantly between cycles for manual review
Generation depth tracking: track how many compilation cycles each fact has survived. Facts supported only by other LLM-generated facts (no external anchor) get flagged
Periodic full recompilation: occasionally recompile the entire wiki from original raw sources only (excluding filed-back outputs) to detect accumulated drift

Provenance tracking is not optional. It is the difference between a virtuous cycle and a vicious one.

research/2026-04-02-karpathy-llm-knowledge-base-pattern : primary source
topics/self-improving-agent-patterns : universal anti-pattern: feedback corruption
ideas/2026-04-02-vault-as-llm-knowledge-base : applying the pattern to this vault
topics/pitfalls/rubric-overfitting : related: both are cases where the system’s own outputs corrupt its evaluation

What Happened

Root Cause

How to Avoid

Related