122K Lines Deleted: Red-Team Server to Visualization MCP
149 sessions, $341.93, cache hit rate 0.9972. Three of the four breakthroughs this week arrived fully formed in a single day each: the 122K-line deletion, the 15,030-line scaffold, and the 1,404-line audio notification system. The clearest diagnostic signal ran in the opposite direction: 45 sessions across two days on the same project, $72 spent, zero commits. The sprint-to-zero pattern is as informative as the breakthrough one.
122K Lines Deleted: Red-Team Server to Visualization MCP
The before state was 520 files and 122,675 lines of red-team attack scaffolding. The after state was 4 brand templates, 36 visualization scripts, and a 108-check quality validator. Both states live in the same repository. The delta happened in one commit on Feb 25.
The decision rule was direct: the codebase no longer matched the question being asked. The red-team attack automation had been the right shape for an adversarial testing project. That project was complete. The next question was visualization tooling for a different domain entirely. Rather than retrofitting attack scaffolding toward visualization, the entire surface was replaced.
The new architecture is a pure context provider: an MCP server that serves plan sections, prompt templates, visualization scripts, and quality checklists via MCP resources and tools. No API keys required. The client LLM does all script generation. The server supplies brand context and quality rules. The 39,950-line top commit added the brand token system, 36 viz scripts, and the unified quality validator in a single push.
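As a rough illustration of the "context provider, not generator" split, here is a minimal stdlib-only sketch. It does not use the MCP SDK, and the class name, the URI scheme, and the stored values are all hypothetical placeholders rather than the server's real resources.

```python
from dataclasses import dataclass, field


@dataclass
class ContextProvider:
    """Serves stored context by URI; it never generates scripts itself."""

    resources: dict[str, str] = field(default_factory=dict)

    def register(self, uri: str, content: str) -> None:
        self.resources[uri] = content

    def list_resources(self) -> list[str]:
        return sorted(self.resources)

    def read(self, uri: str) -> str:
        # Unknown URIs are an error, not a cue to improvise content.
        if uri not in self.resources:
            raise KeyError(f"unknown resource: {uri}")
        return self.resources[uri]


provider = ContextProvider()
# Hypothetical URIs and placeholder values, for illustration only.
provider.register("brand://slate/palette", "primary=#334155")
provider.register("checklist://quality/contrast", "Text must meet the contrast threshold.")
```

The point of the shape is that the provider holds no model credentials and has no generation path: a client reads context, then does its own script generation.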
Four brand templates (slate, aurora, earth, journal) carry custom typography, palettes, and spacing. The quality validator enforces 12 checks per chart family across legibility, contrast, palette consistency, axis formatting, and export size: 108 checks total. Four chart families were rewritten with domain-specific best practices: raincloud plots, force-directed networks, Kaplan-Meier survival curves, multiline time series. 38 new implementation rules landed across 4 prompt templates.
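A validator of this kind can be pictured as a registry of named checks run against a chart description. The sketch below is an assumption-heavy illustration: the check names, the chart dictionary keys, the 4.5 contrast threshold (the WCAG AA value for normal text), and the 2 MB export limit are not taken from the actual validator. Only the contrast-ratio formula is the standard WCAG one.

```python
from typing import Callable


def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    s = channel / 255
    return s / 12.92 if s <= 0.03928 else ((s + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(fg: str, bg: str) -> float:
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


Check = Callable[[dict], bool]

# Hypothetical checks and thresholds; the real validator's rules are not shown in the source.
CHECKS: dict[str, Check] = {
    "contrast": lambda c: contrast_ratio(c["text_color"], c["background"]) >= 4.5,
    "export_size": lambda c: c["export_bytes"] <= 2_000_000,
}


def validate(chart: dict) -> list[str]:
    """Return the names of failed checks, in registration order."""
    return [name for name, check in CHECKS.items() if not check(chart)]
```

Calling `validate` on a chart with black text on a white background and a small export returns an empty list; light gray text on white with an oversized export fails both checks.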
OAuth client persistence was fixed on the same day, ending the requirement to re-authorize on every server restart. The MCP SDK was pinned at 1.10.1 after testing surfaced two problems. First, SDK version 1.26.0 auto-enables DNS rebinding protection, which rejects Railway’s internal hostname routing. Second, Starlette’s Mount issues a 307 redirect that Claude.ai does not follow for POST requests. Diagnosis took five commits in 28 minutes; the fix was the version pin plus ASGI path-rewrite middleware.
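The path-rewrite half of the fix can be sketched as a pure ASGI middleware. This assumes the 307 in question is the trailing-slash redirect and that the server is mounted at `/mcp`; neither detail is stated above, and the class name is hypothetical. The idea is to rewrite the path before routing so no redirect is ever issued.

```python
import asyncio


class TrailingSlashRewrite:
    """Pure ASGI middleware: rewrite '/mcp' to '/mcp/' before routing.

    The mount path is an assumption for illustration.
    """

    def __init__(self, app, mount_path: str = "/mcp"):
        self.app = app
        self.mount_path = mount_path.rstrip("/")

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http" and scope.get("path") == self.mount_path:
            # Copy the scope so the caller's dict is not mutated.
            scope = dict(scope)
            scope["path"] = self.mount_path + "/"
            scope["raw_path"] = scope["path"].encode("ascii")
        await self.app(scope, receive, send)


# Stand-in downstream app that records the path it was routed with.
seen = []


async def record_path(scope, receive, send):
    seen.append(scope["path"])


app = TrailingSlashRewrite(record_path)
asyncio.run(app({"type": "http", "method": "POST", "path": "/mcp"}, None, None))
asyncio.run(app({"type": "http", "method": "POST", "path": "/mcp/messages"}, None, None))
```

Only the exact mount path is rewritten; deeper paths pass through untouched, so the middleware stays invisible to everything except the one request shape that triggered the redirect.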
The technical architecture decisions proved durable: 259 R charts generated in a later wiki enrichment sweep trace back to the templates introduced here. The server is a stable context substrate, not a generation engine.
The harder question is why deletion is rare. When code is generated quickly, the sunk cost feels larger than when it is written slowly. AI-assisted development makes large codebases fast to produce, which makes them harder to abandon. The decision to delete 122K lines required treating generation velocity as irrelevant to the value judgment.
Transferable insight: The courage to delete code is rarest when code is cheapest to produce. Sunk cost scales with generation speed. The right question is whether the codebase matches the current question, not whether it took effort to write.
The Ralph Loop: Autonomous Builds Without Human Handoffs
Mixin: how narrow scope makes agent loops reliable
The hypothesis was direct: a bash loop invoking Claude Code once per spec file could autonomously build a multi-component CLI system without human handoffs. The Ralph Loop experiment confirmed it, with three design decisions making the difference.
First: scoping via a RALPH_SPEC environment variable. Agents given full project context inject unnecessary dependencies (the prompt anti-dependency bias). Agents given a single spec by file path reference stay within scope. The anti-dependency bias finding is a generalizable rule, not a project-specific quirk.
Second: removing set -e from the loop. Agent failures are recoverable events, not terminal conditions. A rigid shell that exits on the first error wastes all prior successful work. The correct model treats each spec as independently failable.
Third: a plan-reviewer plus coder-replan escape hatch. A separate Claude invocation reads the spec and the proposed plan before execution commits to it. If they diverge, the agent replans rather than proceeding off-track. This handles the long tail of agent failures without requiring human review of every step.
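The three decisions above can be sketched together. The original loop was bash; this is a Python stand-in, not the author's script. The function name, the `review_plan` callback, the `max_replans` limit, and the result strings are hypothetical, and the agent command is passed in rather than assumed.

```python
import os
import subprocess
import sys
from typing import Callable, Sequence


def run_ralph_loop(
    spec_paths: Sequence[str],
    agent_cmd: Sequence[str],
    review_plan: Callable[[str], bool] = lambda spec: True,
    max_replans: int = 2,
) -> dict[str, str]:
    results: dict[str, str] = {}
    for spec in spec_paths:
        # Escape hatch: each call stands in for one plan-and-review round.
        approved = any(review_plan(spec) for _ in range(max_replans + 1))
        if not approved:
            results[spec] = "plan-rejected"
            continue
        # Narrow scope: the agent sees one spec path, not the whole project.
        env = {**os.environ, "RALPH_SPEC": spec}
        proc = subprocess.run(list(agent_cmd), env=env)
        # No 'set -e' equivalent: record the failure and move to the next spec.
        results[spec] = "ok" if proc.returncode == 0 else f"failed({proc.returncode})"
    return results


# Stand-in "agent" that fails on any spec whose path contains 'bad'.
fake_agent = [
    sys.executable,
    "-c",
    "import os, sys; sys.exit(3 if 'bad' in os.environ['RALPH_SPEC'] else 0)",
]
outcome = run_ralph_loop(["specs/a.md", "specs/bad.md", "specs/c.md"], fake_agent)
```

With the stand-in agent, the middle spec fails and the third still runs, which is the behavior that removing `set -e` buys.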
The kiro-cli-factory scaffolding was generated without a single manual handoff. The pattern subsequently propagated to two other projects. Autonomous build loops work when the scope is narrow and the escape hatch is real.
Transferable insight: Autonomous AI build loops fail at scale because scope creeps and errors cascade. Narrow scoping per spec, error tolerance, and a pre-execution plan-review step make the difference between a reliable loop and an expensive dead end.
15,030 Lines in One Day: The Monorepo That Survived Three Renames
Mixin: domain boundaries as the durable architectural unit
Thursday, Feb 26: the AutoHunt job-automation monorepo was scaffolded at 15,030 lines across 107 files and 9 packages. The stack: Turborepo plus pnpm workspaces, Drizzle ORM over SQLite, XState v5 for per-application state machines, 10 platform adapters, and a Next.js 14 dashboard with Socket.io. One day.
Every interface boundary set on Feb 26 survived three project renames: autohunt to autojob to autosearch to jobs-apply. It survived a full SaaS migration. The reason is that the architecture encoded domain constraints rather than implementation choices. Per-platform variance requires adapters. Per-application state requires a state machine. Real-time observability requires a persistent connection. These are constraints imposed by the problem, not choices made by the developer. Constraints encoded in structure do not break on rename.
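To make the two core boundaries concrete, here is a small sketch of an adapter interface plus a per-application state machine. The project itself uses TypeScript and XState v5; this Python version is only an illustration, and the state names, event names, and `ExampleBoardAdapter` are hypothetical.

```python
from typing import Protocol


class PlatformAdapter(Protocol):
    """Boundary 1: per-platform variance lives behind one interface."""

    name: str

    def submit(self, application_id: str) -> bool: ...


# Boundary 2: per-application state as an explicit transition table.
TRANSITIONS = {
    ("draft", "SUBMIT"): "submitted",
    ("submitted", "REJECT"): "rejected",
    ("submitted", "INTERVIEW"): "interviewing",
}


class ApplicationMachine:
    def __init__(self) -> None:
        self.state = "draft"

    def send(self, event: str) -> str:
        # Events with no transition from the current state are ignored.
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state


class ExampleBoardAdapter:
    name = "example-board"

    def submit(self, application_id: str) -> bool:
        return True  # a real adapter would talk to the platform here


def apply(adapter: PlatformAdapter, machine: ApplicationMachine, application_id: str) -> str:
    if adapter.submit(application_id):
        machine.send("SUBMIT")
    return machine.state
```

Nothing in `apply` or `ApplicationMachine` mentions a project name or a specific platform, which is one way to read the claim that such boundaries survive renames.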
The two days immediately following destroyed the pattern. Friday: 36 sessions averaging 1.2 minutes, Haiku-dominant, zero commits. Saturday: 9 sessions averaging 1.3 minutes, 100% Opus, still zero commits. $72 across two days with no durable artifacts. The model shift from Haiku to Opus on Saturday did not change the outcome, ruling out model selection as the variable. Session length (1.2-1.3 minutes both days) is the structural constraint: 45 sessions that are each too short to reach a commit are not equivalent to 3 sessions that are each long enough to complete a unit of work.
The MQI trajectory tells the same story. Monday opened at 0.2827. Thursday’s 69-session peak registered the week’s lowest MQI: 0.1538, with a composite Z of -1.0202. The week’s highest session-count day was the week’s lowest quality day. MQI correlates with session volume and project breadth, not with model tier. More sessions across more projects in shorter average durations produce lower composite quality scores, even when the individual model selections are high-capability.
Commit gating is the fix: require at least one WIP commit per session, even a single markdown bullet summarizing what was found and what blocked progress. The commit exists as an artifact even if the code does not.
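One way to enforce this is a check at the end of each session. The sketch below assumes the session recorded its starting revision; the function names and the `WIP.md` log file are hypothetical. It relies only on the standard `git rev-parse`, `git add`, and `git commit -m` commands, and the git runner is injectable so the logic can be exercised without a repository.

```python
import subprocess
from pathlib import Path
from typing import Callable, Sequence


def git(args: Sequence[str], cwd: str) -> str:
    done = subprocess.run(
        ["git", *args], cwd=cwd, capture_output=True, text=True, check=True
    )
    return done.stdout.strip()


def ensure_session_commit(
    start_rev: str,
    summary: str,
    repo: str = ".",
    run: Callable[[Sequence[str], str], str] = git,
    log_name: str = "WIP.md",  # hypothetical log file name
) -> bool:
    """Create a WIP commit if the session produced none. Returns True if it did."""
    if run(["rev-parse", "HEAD"], repo) != start_rev:
        return False  # the session already committed something
    # Append a one-bullet summary of what was found and what blocked progress.
    with (Path(repo) / log_name).open("a", encoding="utf-8") as fh:
        fh.write(f"- {summary}\n")
    run(["add", log_name], repo)
    run(["commit", "-m", f"WIP: {summary}"], repo)
    return True
```

A session wrapper would capture `git rev-parse HEAD` at start and call `ensure_session_commit` at exit, so even a 1.2-minute session leaves a durable trace.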
Transferable insight: Get the domain boundaries right on day one and the rest is renaming. Get them wrong and no rename saves you. The durability test is whether the structure survives the question changing, not whether the code compiles.
Zeitgeist
By the Numbers
| Metric | Value |
|---|---|
| Sessions | 149 |
| Total cost | $341.93 |
| Largest single day | $194.90 (Thu Feb 26, 69 sessions) |
| redcorsair sessions | 46 |
| redcorsair cost | $173.32 |
| Lines deleted (redcorsair) | 122,675 |
| Lines added (redcorsair) | 44,634 |
| Top commit additions | 39,950 |
| AutoHunt lines day 1 | 15,030 |
| AutoHunt packages | 9 |
| AutoHunt platform adapters | 10 |
| Zero-commit AutoHunt days | 2 (Fri-Sat) |
| Zero-commit AutoHunt cost | $72 |
| Avg cache hit rate | 0.9972 |
| MQI delta vs prior week | -0.1865 |
Changelog
260507: Generated by journalize-weekly (topic-first format, v2 regeneration)
Rewrote from per-project format to topic-first. Primary: 122K line deletion pivot. Mixins: ralph loop narrow-scope pattern + monorepo domain boundary durability. Stripped private project refs from frontmatter and body per Phase 4 rules. Added definitions: cascade-attack, seeded-prng, snapshot-testing, monorepo. Title trimmed from 77 to 57 chars to meet under-70 rule.