
LinkedIn 40% to 100%: Quality Ratcheting in 10 Runs

rust-migration · ratchet · vault-build · oil-model

40% to 100% in 10 runs over 18 days. That is the LinkedIn Easy Apply arc: Run 1 failed 3 of 5 attempts; Run 10 submitted 6 of 6 with no silent failures. The same week, fixing the oil model’s Monte Carlo time-window bug produced its first Polymarket-beating Brier score, and the vault launched with 200+ notes, zero edges, and a lifecycle chain that existed on paper only. Three projects, three feedback loops that did not exist until measurement forced them into existence.

LinkedIn 40% to 100% in 10 Runs

The 18-day ratchet arc had three distinct failure layers. Runs 1-5 exposed click-level failures: buttons not found, modals not confirmed, form state not verified. Runs 6-8 were the anti-detection detour forced by the account restriction: Gaussian timing, reading simulation, Bezier mouse paths, 3-submissions-per-30-minute safety rail. Runs 9-10 exposed field-level failures in select dropdowns, autocomplete inputs, and multi-step form flows.
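The 3-submissions-per-30-minute safety rail can be pictured as a sliding-window cap. The sketch below is a minimal illustration, not the automation's actual code: the class name `SafetyRail`, the method `try_submit`, and the choice of a sliding (rather than fixed) window are all assumptions.

```python
from collections import deque


class SafetyRail:
    """Sliding-window cap: at most `limit` submissions per `window_s` seconds.

    Hypothetical helper; the defaults mirror the 3-per-30-minute rail.
    """

    def __init__(self, limit=3, window_s=30 * 60):
        self.limit = limit
        self.window_s = window_s
        self._stamps = deque()  # timestamps of accepted submissions

    def try_submit(self, now):
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False  # rail is closed; caller should wait
        self._stamps.append(now)
        return True
```

Passing `now` in explicitly (instead of reading the clock inside the method) keeps the rail easy to test and lets the caller decide how long to sleep when it returns `False`.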

The restriction at Run 6 looked like a setback. It was the forcing function. Until LinkedIn flagged the session, the automation was optimizing submission speed. The restriction required slowing down, adding behavioral signals, confirming every modal state. Those fixes made the system undetectable and reliable at the same time.

Run 9 (2026-04-02) landed proactive modal scrolling, CDP selectedIndex <= 0 placeholder detection replacing the flawed value !== "" check, and autocomplete ArrowDown+Enter for city/location/school fields. Result: 2/3 (67%). The one failure was a 240-second timeout on a Databricks/PySpark dropdown: the detection caught it, but the fix had not landed yet.
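One plausible reason the value check was flawed, stated here as an assumption rather than the confirmed root cause: a placeholder option such as "Select an option" can carry a non-empty value, so the dropdown looks answered when it is not. The sketch models a select element as plain data; `SelectState` and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SelectState:
    """Hypothetical snapshot of a <select> element as read over CDP."""
    options: List[str]    # option values in DOM order
    selected_index: int   # -1 when nothing is selected


def is_unanswered_by_value(s: SelectState) -> bool:
    # The flawed pattern: any non-empty value counts as an answer.
    value = s.options[s.selected_index] if s.selected_index >= 0 else ""
    return value == ""


def is_unanswered_by_index(s: SelectState) -> bool:
    # Assumes index 0 is the placeholder; -1 means no selection at all.
    return s.selected_index <= 0


placeholder = SelectState(["Select an option", "Yes", "No"], selected_index=0)
answered = SelectState(["Select an option", "Yes", "No"], selected_index=1)
```

With these inputs, the value check reports the untouched placeholder as answered, while the index check flags it and still accepts the real answer. The index check does depend on the placeholder sitting at position 0, which is an assumption about the form, not a guarantee.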

Run 10 deployed all 10 remaining fixes in a single batch: maxSteps raised from 10 to 15, stuck-cycle recovery when sameButtonCount >= 2, resume radio auto-select, CDP select placeholder fix (the Databricks root cause), expanded tech pattern recognition (20+ variants), select fallback to highest option when no pattern matches, autocomplete city fix, scroll-to-bottom before submit, progressive verification retries at 7s/3s/5s, and safety rail reduced from 15 to 7 minutes. Result: 6/6 (100%).
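Two of these fixes, the maxSteps ceiling and stuck-cycle recovery, interact inside the form-driving loop. A rough sketch of that interaction follows. It assumes sameButtonCount counts consecutive repeats of the same button; `run_form`, `next_button`, and `recover` are hypothetical names, and the real recovery action is not described in this journal.

```python
from typing import Callable, List, Optional

MAX_STEPS = 15        # raised from 10 in Run 10
STUCK_THRESHOLD = 2   # mirrors the sameButtonCount >= 2 condition


def run_form(next_button: Callable[[], str],
             recover: Callable[[], None]) -> List[str]:
    """Drive a multi-step form and return a log of the actions taken."""
    log: List[str] = []
    last_button: Optional[str] = None
    same_button_count = 0
    for _ in range(MAX_STEPS):
        button = next_button()
        if button == last_button:
            same_button_count += 1
        else:
            last_button, same_button_count = button, 0
        if same_button_count >= STUCK_THRESHOLD:
            # Same button offered again and again: the form is not advancing.
            recover()
            log.append("recover")
            same_button_count = 0
            continue
        log.append(button)
        if button == "Submit":
            break
    return log
```

The step ceiling bounds the worst case, and the stuck counter spends those steps on recovery instead of repeating a click that has already failed to advance the form.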

The structural lesson: layered failure systems require layered fixes applied in order. You cannot reach field-level failures until click-level failures are fixed. You cannot reach form-level failures until field-level failures are fixed. And you cannot reach any of them reliably until the behavioral anti-detection is solid enough that the platform lets you keep running.

The ratchet methodology (fix the top failure mode, measure, repeat) worked here because the failure modes surfaced sequentially and each round’s evidence pointed directly at the next layer. 26 individual fixes. 18 days. One metric that only went up.

On the same day Run 10 closed (2026-04-02), the CHRO audit system returned its first scan: 6-dimension review, 1,335 findings, 0/100. Five pipeline bugs fixed. A WAF (web application firewall) false-positive was blocking resume uploads via the /api/profiles/import path: Cloudflare’s rules flagged the binary PDF payload. Fix: base64+JSON encoding, path renamed to /parse-resume. The firewall was optimized for web browsing, not API payloads. The WAF is not the adversary; the WAF was never configured to know what it was protecting.
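The base64+JSON fix amounts to wrapping the binary PDF in an ASCII-only body. A minimal sketch of that encoding, with the field names `filename` and `content_b64` invented for illustration (the actual /parse-resume schema is not given here):

```python
import base64
import json


def build_resume_payload(pdf_bytes: bytes, filename: str) -> str:
    """Wrap raw PDF bytes in a JSON body so no raw binary crosses the WAF."""
    return json.dumps({
        "filename": filename,  # hypothetical field name
        "content_b64": base64.b64encode(pdf_bytes).decode("ascii"),  # hypothetical
    })


def read_resume_payload(body: str) -> bytes:
    """Server-side inverse: recover the original bytes from the JSON body."""
    return base64.b64decode(json.loads(body)["content_b64"])
```

The trade-off is size: base64 inflates the payload by roughly a third, which is usually acceptable for resume-sized files.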

Transferable insight: Layered failure systems require layered repair in sequence. Fixing a downstream failure before the upstream failure is fixed means re-encountering the upstream failure once the downstream one is patched. Measure after each fix; let the evidence determine the order.

The Vault That Has Zero Edges

Mixin: lifecycle chain declared validated, deep audit found zero actual edges

The vault launched on 2026-03-30: 13 linked repos, 18 symlinks, 6 plugins, 200+ notes, zero lint violations after 7 migration batches. The pitfall-to-idea-to-experiment-to-skill-to-breakthrough lifecycle chain was declared validated.

The 2026-04-01 deep audit found it had zero actual edges. Notes were isolated islands. Pitfall files did not link to ideas. Experiments did not link to the breakthroughs they produced. Skills existed with no experiments that generated them. The knowledge graph had nodes and no edges.

The gap between “notes exist” and “notes are connected” is the gap between a filing system and a knowledge system. The former tells you where things are stored. The latter tells you how they relate, what caused what, which pitfalls produced which experiments, which experiments produced which skills. None of that was wired.
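Measuring edges rather than files can be as small as the sketch below. It assumes Obsidian-style `[[target]]` wikilinks and takes notes as an in-memory mapping of name to text; `find_edges` and `isolated` are hypothetical helpers, not the vault's audit script.

```python
import re
from typing import Dict, Set, Tuple

# Captures the link target, stopping before any alias (|) or heading (#).
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def find_edges(notes: Dict[str, str]) -> Set[Tuple[str, str]]:
    """Return (source, target) pairs for wikilinks that resolve to a known note."""
    edges = set()
    for name, text in notes.items():
        for target in WIKILINK.findall(text):
            target = target.strip()
            if target in notes and target != name:
                edges.add((name, target))
    return edges


def isolated(notes: Dict[str, str]) -> Set[str]:
    """Notes that neither link out to nor are linked from any other note."""
    touched = {n for edge in find_edges(notes) for n in edge}
    return set(notes) - touched
```

A launch check built on this would compare `len(find_edges(notes))` against zero and list `isolated(notes)`, so "validated" means connected rather than merely present.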

Remediation began the same day: 21 parallel Sonnet agents refreshed 580 vault files by 2026-04-03, adding bidirectional wikilinks, category tags, and cross-references. 259 R charts generated and embedded. Three daily hooks deployed: vault-audit-daily.sh, vault-system-status.sh, failure-flush.sh. The lifecycle chain was re-measured after the edge-wiring pass. It had edges.

The cron hardening on 2026-04-02 found 6 silently-broken scheduled jobs. Three fixed: keychain-locked usage-probe that failed silently on every run, GNU-only head -z crash in obsidian-cron, and exit-code inversion in vault-stale that reported success on failure. Scheduled infrastructure fails silently. No crash, no log, no alert: just tasks that do not run.

Transferable insight: Declaring a system works without measuring its connections is the knowledge-graph equivalent of shipping untested code. Node count is not edge count. Measure edges, not files.

A Monte Carlo Time-Window Bug That Made the Model Predict the Past

Mixin: lookahead bias in the oil model produced degenerate 0%/100% probabilities

The oil model’s v18 Monte Carlo tracked a mar31Day peak variable (the realized WTI crude futures peak for March) and then compared each simulation run against April Polymarket markets using that same variable. Once time advanced past March 31, every simulation locked to the already-known March peak. Lookahead bias: the model was describing the past while appearing to predict the future.

The symptom was degenerate probabilities. Positions the model assessed as 0% probable had already happened. Positions it assessed as 100% certain were already locked in. A model that outputs only 0% and 100% is not a probability model: it is a history lookup. It cannot be calibrated and cannot be improved.
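For reference, the Brier score is the mean squared error between forecast probabilities and 0/1 outcomes. The small sketch below (the function name `brier` is my own) shows why a history lookup is seductive: scored against outcomes it has already seen, it is perfect.

```python
from typing import Sequence


def brier(forecasts: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equal-length, non-empty inputs")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


known = [1, 0, 1]
lookup_score = brier([1.0, 0.0, 1.0], known)   # forecasts copied from outcomes
coin_flip_score = brier([0.5, 0.5, 0.5], known)
```

Here `lookup_score` is 0.0 and `coin_flip_score` is 0.25. The perfect score says nothing about markets that have not resolved yet, which is exactly where a forecaster has to earn its number.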

Fixing the time window, by scoping mar31Day to only the data available at each simulation’s prediction point, immediately gave the model a gradient to improve along. A 28-step Karpathy ratchet followed: 6 accepted changes, 22+ rejected. Terminal result: R² 0.9134, MAPE 2.26%, +2.16pp Brier vs Polymarket uniform, 55% win probability across all thresholds. v18.1 is the first version that functions as a competitive forecaster rather than a backtested artifact.
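The accept/reject pattern of the ratchet can be sketched generically: try one change at a time and keep it only if the score strictly improves. This is an illustration of the idea under my own naming (`ratchet`, `candidates`, `score`), not the oil model's tuning harness, and it assumes lower scores are better.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def ratchet(baseline: T,
            candidates: Iterable[Callable[[T], T]],
            score: Callable[[T], float]) -> Tuple[T, List[bool]]:
    """Apply candidate changes one at a time; keep only strict improvements."""
    best, best_score = baseline, score(baseline)
    accepted: List[bool] = []
    for change in candidates:
        trial = change(best)
        trial_score = score(trial)
        if trial_score < best_score:  # lower is better, e.g. a Brier score
            best, best_score = trial, trial_score
            accepted.append(True)
        else:
            accepted.append(False)    # rejected: state stays at previous best
    return best, accepted
```

Because rejected changes never alter `best`, the tracked score can only stay flat or improve, which is why a run with far more rejections than acceptances is still forward progress.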

Financial models that accidentally use future data will always pass backtests. The bug is structurally undetectable by normal testing: the model looks calibrated on historical data because it has seen the historical data. The only detection method is checking that every variable in the model is scoped to data available at prediction time, not data available at evaluation time.
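One way to mechanize that scoping check is a perturbation test: change everything after the prediction day and confirm the feature does not move. The sketch below is a generic illustration with hypothetical names (`peak_leaky`, `peak_as_of`, `uses_future`); it does not reproduce the model's mar31Day logic.

```python
from typing import Callable, List


def peak_leaky(prices: List[float], t: int) -> float:
    # Bug pattern: reads the whole series, including days after t.
    return max(prices)


def peak_as_of(prices: List[float], t: int) -> float:
    # Fix pattern: only prices[0..t] are observable on day t.
    return max(prices[: t + 1])


def uses_future(feature: Callable[[List[float], int], float],
                prices: List[float], t: int) -> bool:
    """Perturb every value after day t; a leak-free feature must not change."""
    perturbed = prices[: t + 1] + [p * 10 for p in prices[t + 1:]]
    return feature(prices, t) != feature(perturbed, t)


prices = [70.0, 75.0, 72.0, 90.0, 80.0]
```

On this series, `uses_future(peak_leaky, prices, 2)` is true and `uses_future(peak_as_of, prices, 2)` is false. A single perturbation can miss a leak that happens not to change the output, so in practice the check would be repeated across several days and perturbations.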

Transferable insight: Lookahead bias is the silent killer of time-series models. A model with future data in its inputs always passes backtests. Variable scope at prediction time, not evaluation time, is the only check that catches it.

Zeitgeist

@Fried_rice
Claude Code source code leaked via npm source map files, an unintentional exposure that made the tool’s internals visible to the developer community before any official disclosure.
@claudeai
Claude Code Security: AI-powered codebase vulnerability scanning in limited research preview. Agents that can audit their own codebase for security vulnerabilities are a different class of tool than agents that only generate code.
@claudeai
Computer use lands in Claude Code: open apps, click through UI, test from CLI. The agentic surface area expands from files to the full desktop.

By the Numbers

Metric | Value
--- | ---
Repositories | 11
Commits | 200
Largest commit | +58,259 lines (oil v18.1 data refresh)
LinkedIn success rate arc | 40% (Run 1, Mar 15) to 100% (Run 10, Apr 2)
Individual fixes applied | 26 across 10 runs
Rust crates compiled | 3 (rusty-dakka, rusty-bloomnet, rusty-dakka-editor)
Rust lines landed | 14,800
Oil model Brier improvement | +2.16pp vs Polymarket after MC time-window fix
Oil R² | 0.9134
Vault notes at launch | 200+
Vault lifecycle chain edges at launch | 0
Files refreshed by parallel agents | 580
Silently-broken cron jobs found | 6 (3 fixed)
CHRO audit first scan | 0/100 (1,335 findings)

Changelog

260507: Generated by journalize-weekly (topic-first format, v2 regeneration)

BLACKOUT week: session telemetry unavailable, article synthesized from git commits (200 total across 11 repos), daily journals (2026-03-30 through 2026-04-05), vault experiment/breakthrough/pitfall files, and 2026-W14-packet.json. No session counts, costs, or MQI data exist for this week. Topic-first format replaces project-by-project reporting. New definitions created: lookahead-bias.