Event-driven A/B system: 0% to fully autonomous in one session

Before: 0% operational. A split-brain consent gate dropped 100% of events, 7 of 8 experiments were paused, the evaluator had never run, and there was no promotion loop (30 days dormant). After: fully autonomous. 8 experiments running, threshold-triggered Bayesian evaluation, auto-promotion with challenger rotation, and 18 events verified in Neon via browser test.

April 26, 2026

breakthrough, jobs-apply, ab-testing, bayesian

The system went from 0% operational to fully autonomous. A 9-issue audit revealed that the A/B experiment system (built 2026-03-28) had not collected a single event or run a single evaluation in 30 days. The consent gate was split-brain (localStorage vs. cookie), 7 of 8 experiments were paused, and conversion goals targeted a dead waitlist metric. A 15-task rebuild via Subagent-Driven Development delivered an event-driven architecture in one session.

What Broke

The original system had a fatal measurement gap: CookieBanner wrote ja_consent=1 to document.cookie, while hasConsent() read localStorage.cookie_consent. The writer and the reader never looked at the same place, so every visitor who accepted cookies had zero events tracked. That meant no data for the evaluator, no verdicts, and no promotions. The system was architecturally incapable of producing results.
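To make the mismatch concrete, here is a minimal sketch of a unified consent gate in which the writer and the reader share one key in one store. The key name `cookie_consent` comes from the text above; the stored value `"1"`, the `KeyValueStore` interface, and the helper names `grantConsent` and `createMemoryStore` are assumptions for illustration, not the project's actual code.

```typescript
// Key read by hasConsent() in the post; the value "1" is an assumption.
const CONSENT_KEY = "cookie_consent";

// Minimal storage shape so the sketch also runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in for window.localStorage (hypothetical helper).
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
  };
}

// Writer: what the consent banner would call when the visitor accepts.
function grantConsent(store: KeyValueStore): void {
  store.setItem(CONSENT_KEY, "1");
}

// Reader: what the tracker would call before sending an event.
function hasConsent(store: KeyValueStore): boolean {
  return store.getItem(CONSENT_KEY) === "1";
}

const consentStore = createMemoryStore();
const consentBefore = hasConsent(consentStore);
grantConsent(consentStore);
const consentAfter = hasConsent(consentStore);
console.log(consentBefore, consentAfter); // prints: false true
```

Because both functions go through the same constant and the same store, a round-trip test like the last four lines would have caught the original split-brain immediately. In a browser, `window.localStorage` satisfies `KeyValueStore` and can be passed in directly.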

On top of that: 7 of 8 experiments were paused in the static JSON config, the evaluator cron was never wired up, conversion-goals.ts still targeted waitlist_submit (the product pivoted to download+signup on 2026-04-14), and there was no promotion loop to act on verdicts even if they existed.

What Changed

Replaced the entire architecture:

Before	After
Static `funnel-experiments.json`	DB `experiments` table with 60s TTL cache
Manual cron evaluation	Threshold-triggered via event route (fire-and-forget)
No promotion loop	`promotion-engine.ts`: conclude, dequeue challenger, create successor
No observability	`/api/internal/ab-health` with staleness detection
Consent split-brain	Unified on `localStorage` (`cookie-banner.tsx` was already deleted)
7/8 experiments paused	All 8 seeded as `running` in Neon
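
The "DB table with 60s TTL cache" row can be sketched as a small wrapper around an async loader. This is an illustrative sketch only: the `ExperimentRow` shape, the `createTtlCache` helper, and the injectable clock are assumptions, since the post does not show the real schema or cache code.

```typescript
// Hypothetical row shape; the real `experiments` schema is not shown in the post.
interface ExperimentRow {
  id: string;
  status: "running" | "paused" | "concluded";
}

const TTL_MS = 60_000; // 60s TTL, as stated in the table

// Wraps an async loader in a time-based cache. `now` is injectable for tests.
function createTtlCache<T>(
  load: () => Promise<T>,
  ttlMs: number,
  now: () => number = Date.now,
): () => Promise<T> {
  let cached: { value: T; expiresAt: number } | null = null;
  return async () => {
    const t = now();
    // Serve the cached value while it is still fresh.
    if (cached !== null && t < cached.expiresAt) return cached.value;
    const value = await load();
    cached = { value, expiresAt: t + ttlMs };
    return value;
  };
}

// Usage with a fake loader and a fake clock instead of a real database.
async function demoCache(): Promise<number> {
  let loads = 0;
  let clock = 0;
  const getExperiments = createTtlCache<ExperimentRow[]>(
    async () => {
      loads += 1;
      return [{ id: "exp-1", status: "running" }];
    },
    TTL_MS,
    () => clock,
  );
  await getExperiments(); // miss: loads from the fake "DB"
  clock = 30_000;
  await getExperiments(); // hit: still within the TTL
  clock = 60_000;
  await getExperiments(); // miss: TTL has expired
  return loads; // 2 loads for 3 reads
}
```

The trade-off of this design is that a status change in the table (for example, a promotion concluding an experiment) can take up to one TTL to reach every reader. Concurrent misses would each hit the loader in this sketch; deduplicating them is left out for brevity.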

The Bayesian evaluation uses a Beta-Binomial posterior with 20k MCMC samples and 6-tier stopping rules: fast at 80% confidence, standard at 90%, precision at 95%, plus loser, inconclusive, and timeout tiers. Adaptive thresholds scale evaluation frequency to traffic: clamp(daily_rate / 2, min=25, max=100). For example, a daily rate of 120 gives a threshold of 60, while a daily rate of 10 is held at the floor of 25.
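
As a rough illustration of the posterior comparison, the sketch below estimates the probability that a challenger's conversion rate exceeds the control's. It is not the project's evaluator: it assumes a uniform Beta(1, 1) prior, and it draws directly from the conjugate Beta posterior (plain Monte Carlo via a Marsaglia-Tsang gamma sampler) rather than running MCMC. All function and type names are hypothetical.

```typescript
interface ArmCounts {
  conversions: number;
  visitors: number;
}

type Rng = () => number; // returns a uniform value in [0, 1)

// Standard normal via Box-Muller; 1 - rng() lies in (0, 1], avoiding log(0).
function sampleNormal(rng: Rng): number {
  const u = 1 - rng();
  const v = rng();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Gamma(shape, 1) via Marsaglia-Tsang; valid for shape >= 1, which always
// holds here because the Beta(1, 1) prior adds 1 to each count.
function sampleGamma(shape: number, rng: Rng): number {
  const d = shape - 1 / 3;
  const c = 1 / Math.sqrt(9 * d);
  for (;;) {
    const x = sampleNormal(rng);
    const v = Math.pow(1 + c * x, 3);
    if (v <= 0) continue;
    const u = 1 - rng();
    if (Math.log(u) < 0.5 * x * x + d - d * v + d * Math.log(v)) return d * v;
  }
}

// Beta(a, b) as a ratio of two gamma draws.
function sampleBeta(a: number, b: number, rng: Rng): number {
  const x = sampleGamma(a, rng);
  const y = sampleGamma(b, rng);
  return x / (x + y);
}

// Monte Carlo estimate of P(challenger rate > control rate).
function probabilityChallengerWins(
  control: ArmCounts,
  challenger: ArmCounts,
  samples: number = 20_000, // sample count taken from the post
  rng: Rng = Math.random,
): number {
  let wins = 0;
  for (let i = 0; i < samples; i++) {
    const pControl = sampleBeta(
      1 + control.conversions,
      1 + control.visitors - control.conversions,
      rng,
    );
    const pChallenger = sampleBeta(
      1 + challenger.conversions,
      1 + challenger.visitors - challenger.conversions,
      rng,
    );
    if (pChallenger > pControl) wins += 1;
  }
  return wins / samples;
}

// Example: 10/100 conversions on control vs. 30/100 on the challenger.
const pWin = probabilityChallengerWins(
  { conversions: 10, visitors: 100 },
  { conversions: 30, visitors: 100 },
);
```

A result like `pWin` is the quantity that confidence tiers such as 80%, 90%, and 95% would be compared against. How the real system maps it onto its six tiers is not described in the post, so that step is left out.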

Verification

Browser test against production (2026-04-27): opened the site in an isolated Playwright context, accepted consent, and browsed 6 pages. Results in Neon:

18 analytics events recorded
8 experiment assignments created (one per experiment)
event_count incrementing on all experiments
Evaluation will fire automatically when count crosses threshold
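
The threshold trigger described above can be sketched as follows. The clamp formula comes from the post; the `ExperimentCounters` shape, the "events since last evaluation" rule, and the function names are assumptions about how the crossing might be detected.

```typescript
const MIN_THRESHOLD = 25;
const MAX_THRESHOLD = 100;

// clamp(daily_rate / 2, min=25, max=100), as given in the post.
function adaptiveThreshold(dailyRate: number): number {
  return Math.min(MAX_THRESHOLD, Math.max(MIN_THRESHOLD, dailyRate / 2));
}

// Hypothetical per-experiment bookkeeping.
interface ExperimentCounters {
  eventCount: number;
  lastEvaluatedAt: number; // eventCount at the previous evaluation
}

// Assumed rule: evaluate once enough new events have arrived since last time.
function shouldEvaluate(c: ExperimentCounters, dailyRate: number): boolean {
  return c.eventCount - c.lastEvaluatedAt >= adaptiveThreshold(dailyRate);
}

// Called from the event route for every tracked event.
// Returns true when an evaluation was kicked off.
function recordEvent(
  c: ExperimentCounters,
  dailyRate: number,
  evaluate: () => Promise<void>,
): boolean {
  c.eventCount += 1;
  if (!shouldEvaluate(c, dailyRate)) return false;
  c.lastEvaluatedAt = c.eventCount;
  // Fire-and-forget: do not await, but never leave a rejection unhandled.
  void evaluate().catch((err) => console.error("evaluation failed", err));
  return true;
}
```

Under these assumptions, the 18 events from the browser test sit below the minimum threshold of 25, which is consistent with no evaluation having fired yet.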

Pattern

This is a textbook case of the Infrastructure Without Operations anti-pattern: building the capability without building the operational loop. The code looked correct and the tests passed, yet the system could not produce results because the measurement layer was broken and nobody was checking.