Event-driven A/B system: 0% to fully autonomous in one session
0% operational: consent gate split-brain dropping 100% of events, 7/8 experiments paused, evaluator never run, no promotion loop (30 days dormant) -> Fully autonomous: 8 experiments running, threshold-triggered Bayesian evaluation, auto-promotion with challenger rotation, 18 events verified in Neon via browser test
A 9-issue audit revealed that the A/B experiment system (built 2026-03-28) had not collected a single event or run a single evaluation in 30 days. The consent gate was split-brain (localStorage vs. cookie), 7 of 8 experiments were paused, and the conversion goals targeted a dead waitlist metric. A 15-task rebuild via Subagent-Driven Development delivered an event-driven architecture in one session.
What Broke
The original system had a fatal measurement gap: CookieBanner wrote ja_consent=1 to document.cookie, while hasConsent() read localStorage.cookie_consent. Every visitor who accepted cookies got zero events tracked. This meant zero data in the evaluator, zero verdicts, zero promotions. The system was architecturally incapable of producing results.
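The mismatch is easier to see next to a version where the writer and the reader share one key and one store. This is a minimal sketch, not the project's actual code: the `KeyValueStore` interface, `grantConsent`, `createMemoryStore`, and the stored value `"1"` are assumptions for illustration. Only `hasConsent` and the `cookie_consent` key come from the description above.

```typescript
// Minimal storage shape. In a browser, window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// One shared constant for the key. The bug came from the writer and the
// reader each using a different store and a different key.
const CONSENT_KEY = "cookie_consent";

// Hypothetical writer: what the consent banner would call on "accept".
// The stored value "1" is an assumption.
function grantConsent(store: KeyValueStore): void {
  store.setItem(CONSENT_KEY, "1");
}

// Reader: checks the same store and key the writer used.
function hasConsent(store: KeyValueStore): boolean {
  return store.getItem(CONSENT_KEY) === "1";
}

// In-memory stand-in so the sketch can be exercised outside a browser.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
  };
}
```

Passing the store in as a parameter keeps both functions testable without a DOM, which is one way a round-trip test (grant, then read) could have caught the split-brain before it shipped.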
On top of that: 7 of 8 experiments were paused in the static JSON config, the evaluator cron was never wired up, conversion-goals.ts still targeted waitlist_submit (the product pivoted to download+signup on 2026-04-14), and there was no promotion loop to act on verdicts even if they existed.
What Changed
Replaced the entire architecture:
| Before | After |
|---|---|
| Static funnel-experiments.json | DB experiments table with 60s TTL cache |
| Manual cron evaluation | Threshold-triggered via event route (fire-and-forget) |
| No promotion loop | promotion-engine.ts: conclude, dequeue challenger, create successor |
| No observability | /api/internal/ab-health with staleness detection |
| Consent split-brain | Unified on localStorage (cookie-banner.tsx was already deleted) |
| 7/8 experiments paused | All 8 seeded as running in Neon |
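The promotion loop in the table (conclude, dequeue challenger, create successor) can be sketched as a pure function. This is an illustration under stated assumptions, not the contents of promotion-engine.ts: the `Experiment` shape, the `slot` field, the challenger queue as a plain array, and the rule that the winner becomes the successor's control are all hypothetical.

```typescript
type ExperimentStatus = "running" | "concluded";

// Hypothetical experiment record; field names are assumptions.
interface Experiment {
  id: string;
  slot: string; // the thing under test, e.g. a page section
  control: string;
  challenger: string;
  status: ExperimentStatus;
}

interface PromotionResult {
  concluded: Experiment;
  successor: Experiment | null; // null when no challenger is queued
  remainingQueue: string[];
}

// Conclude the finished experiment, take the next challenger off the queue,
// and create a successor in which the winner is the new control.
function promote(
  experiment: Experiment,
  winner: string,
  challengerQueue: string[],
  successorId: string
): PromotionResult {
  const concluded: Experiment = { ...experiment, status: "concluded" };
  if (challengerQueue.length === 0) {
    return { concluded, successor: null, remainingQueue: [] };
  }
  const [nextChallenger, ...remainingQueue] = challengerQueue;
  const successor: Experiment = {
    id: successorId,
    slot: experiment.slot,
    control: winner,
    challenger: nextChallenger,
    status: "running",
  };
  return { concluded, successor, remainingQueue };
}
```

Returning new values instead of mutating the inputs makes the three steps easy to wrap in a single database transaction, so a crash cannot leave an experiment concluded with no successor.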
The Bayesian evaluation uses Beta-Binomial posterior with 20k MCMC samples and 6-tier stopping rules (fast at 80% confidence, standard at 90%, precision at 95%, plus loser/inconclusive/timeout tiers). Adaptive thresholds scale evaluation frequency to traffic: clamp(daily_rate / 2, min=25, max=100).
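To make the evaluation step concrete, here is a minimal sketch of estimating the probability that a challenger beats control under a Beta-Binomial model. Several things are assumptions rather than the evaluator's actual code: the uniform Beta(1, 1) prior, the seedable `mulberry32` generator, all function names, and the use of independent Monte Carlo draws from the closed-form Beta posteriors in place of whatever sampler the real system runs. The 80%, 90%, and 95% tier levels are taken from the text; how the loser, inconclusive, and timeout tiers are decided is not specified, so they are left out.

```typescript
type Rng = () => number;

// Small seedable PRNG (mulberry32) so runs are reproducible.
function mulberry32(seed: number): Rng {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Standard normal via Box-Muller; 1 - rng() avoids log(0).
function sampleNormal(rng: Rng): number {
  const u = 1 - rng();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * rng());
}

// Gamma(shape, 1) via the Marsaglia-Tsang method.
function sampleGamma(shape: number, rng: Rng): number {
  if (shape < 1) {
    // Boost small shapes: Gamma(a) = Gamma(a + 1) * U^(1/a)
    return sampleGamma(shape + 1, rng) * Math.pow(1 - rng(), 1 / shape);
  }
  const d = shape - 1 / 3;
  const c = 1 / Math.sqrt(9 * d);
  for (;;) {
    const x = sampleNormal(rng);
    const v = Math.pow(1 + c * x, 3);
    if (v <= 0) continue;
    const u = 1 - rng();
    if (Math.log(u) < 0.5 * x * x + d - d * v + d * Math.log(v)) return d * v;
  }
}

// Beta(a, b) as a ratio of two Gamma draws.
function sampleBeta(a: number, b: number, rng: Rng): number {
  const x = sampleGamma(a, rng);
  const y = sampleGamma(b, rng);
  return x / (x + y);
}

interface ArmCounts {
  conversions: number;
  visitors: number;
}

// Fraction of paired posterior draws in which the challenger's rate is higher.
function probChallengerWins(
  control: ArmCounts,
  challenger: ArmCounts,
  samples: number = 20000,
  rng: Rng = Math.random
): number {
  let wins = 0;
  for (let i = 0; i < samples; i++) {
    // Posterior under an assumed Beta(1, 1) prior: Beta(1 + successes, 1 + failures)
    const pControl = sampleBeta(1 + control.conversions, 1 + control.visitors - control.conversions, rng);
    const pChallenger = sampleBeta(1 + challenger.conversions, 1 + challenger.visitors - challenger.conversions, rng);
    if (pChallenger > pControl) wins++;
  }
  return wins / samples;
}

// Confidence levels of the three winner tiers named in the text.
const CONFIDENCE_TIERS = { fast: 0.8, standard: 0.9, precision: 0.95 } as const;
type Tier = keyof typeof CONFIDENCE_TIERS;

function clearsTier(probability: number, tier: Tier): boolean {
  return probability >= CONFIDENCE_TIERS[tier];
}
```

A call such as `probChallengerWins({ conversions: 50, visitors: 1000 }, { conversions: 100, visitors: 1000 }, 20000, mulberry32(42))` returns a value close to 1, which would clear all three winner tiers; identical arms return a value near 0.5 and clear none.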
Verification
Browser test against production (2026-04-27): opened site in isolated Playwright context, accepted consent, browsed 6 pages. Results in Neon:
- 18 analytics events recorded
- 8 experiment assignments created (one per experiment)
- event_count incrementing on all experiments
- Evaluation will fire automatically when count crosses threshold
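The threshold crossing mentioned in the last item can be sketched as follows, using the adaptive formula clamp(daily_rate / 2, min=25, max=100). Rounding down, firing on every multiple of the threshold, and all function names here are assumptions; the real event route may track crossings differently.

```typescript
function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Events between evaluations, scaled to traffic. Flooring is an assumption.
function evaluationThreshold(dailyRate: number): number {
  return clamp(Math.floor(dailyRate / 2), 25, 100);
}

// Assumed crossing rule: evaluate each time the count hits a multiple
// of the threshold.
function shouldEvaluate(eventCount: number, threshold: number): boolean {
  return eventCount > 0 && eventCount % threshold === 0;
}

// Hypothetical hook the event route would call after incrementing event_count.
// Returns true when an evaluation was started.
function onEventRecorded(
  eventCount: number,
  dailyRate: number,
  evaluate: () => Promise<void>,
  onError: (err: unknown) => void = () => {}
): boolean {
  if (!shouldEvaluate(eventCount, evaluationThreshold(dailyRate))) return false;
  // Fire-and-forget: the promise is not awaited, so the event response is
  // not delayed, but rejections are still caught and reported.
  void evaluate().catch(onError);
  return true;
}
```

With these assumptions, a site averaging 120 events per day evaluates every 60 events, while a low-traffic site at 10 per day is held at the floor of 25 so that evaluations do not run on almost no data.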
Pattern
This is a textbook case of the Infrastructure Without Operations anti-pattern: building the capability without building the operational loop. The code was correct, the tests passed, but the system was architecturally incapable of producing results because the measurement layer was broken and nobody was checking.