About Gutierrez Public Lab
I build self-improving systems across 12+ codebases, not as a concept but as working infrastructure: 607 tests, verified metrics, and feedback loops that make each system better without me.
The problem: nobody can see any of it. This site exists to solve a distribution problem. It turns private experiments into public stories with real outcomes you can verify.
Two Focuses
Learning Experiments
Every project decision becomes a testable hypothesis. Quant finance, browser automation, data quality, developer analytics: all domains where I started with zero expertise, each tracked with research-lab rigor.
Self-Improving Agents
Systems that make themselves better. Self-improving skills, self-healing tests, self-documenting history, self-managing context, self-improving data models, self-orchestrating compute.
The Lifecycle Chain
Every failure eventually enables a win. The chain connects them with verifiable metrics.
This is the connective tissue. Every pitfall points to the experiment that investigated it. Every breakthrough traces back to the failure that motivated it. The chain is navigable.
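To make "navigable" concrete, here is a minimal sketch of walking the chain backwards from a breakthrough to the pitfall that motivated it. It is an illustration under assumptions, not the site's actual implementation: the `Note` class, the `motivated_by` field, and every note title in `EXAMPLE_VAULT` are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    """One vault note. `motivated_by` is a hypothetical field holding upstream note titles."""
    title: str
    kind: str  # e.g. "pitfall", "idea", "experiment", "skill", "breakthrough"
    motivated_by: list[str] = field(default_factory=list)


def trace_back(vault: dict[str, Note], start: str) -> list[str]:
    """Follow upstream links depth-first from `start`; return titles in visit order."""
    seen: list[str] = []
    stack = [start]
    while stack:
        title = stack.pop()
        if title in seen or title not in vault:
            continue  # skip repeats and links to notes that do not exist
        seen.append(title)
        # Reverse so the first listed upstream note is visited first.
        stack.extend(reversed(vault[title].motivated_by))
    return seen


# Made-up example data, only to show the shape of a chain.
EXAMPLE_VAULT = {
    "pitfall-a": Note("pitfall-a", "pitfall"),
    "idea-a": Note("idea-a", "idea", ["pitfall-a"]),
    "experiment-a": Note("experiment-a", "experiment", ["idea-a"]),
    "breakthrough-a": Note("breakthrough-a", "breakthrough", ["experiment-a"]),
}

chain = trace_back(EXAMPLE_VAULT, "breakthrough-a")
print(" <- ".join(chain))
```

Run as written, this prints the breakthrough first and the originating pitfall last, which is the direction a reader would follow when asking "what failure led to this win?".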
The Thesis
I believe we're in a specific moment: AGI capabilities exist, but the infrastructure for individual AI behavior does not. Five pillars need to exist for AI agents to behave as individuals: personality (the foundation), persistence, memory, preferences, and social modeling.
Every project in this lab is a piece of the puzzle. Read the full thesis →
Projects (9)
Monte Carlo pricing engine for WTI crude with calibration-proper dual-platform ratchet against Polymarket and Kalshi prediction markets.
6-channel job automation with CDP browser control, behavioral adaptation, and J-Score v2 matching.
Parallel Claude Code orchestrator with live API rate limit monitoring, spawn/kill UI, and Tauri v2 desktop.
Unified developer intelligence system: usage analytics, dynamic session context curation with adaptive forgetting, and multi-channel agent notifications.
This site. Turns private experiments into public stories with real metrics. The distribution layer for everything else.
Computable voice profile distilled from a decade of X/Twitter archive. Four measured lanes, 8 live posting rounds, brand-voice skill integrated into writing tasks.
Second personality vector. Voice at the email register: how tone, formality, and structure shift when the channel changes. Gmail import pipeline feeds the same distillation engine as brand-voice.
Third personality vector. Visual personality extracted from the Apple Photos library using on-device ML metadata. Integrated as Step 15d of the BloomNet ingest.
Fourth personality vector. Consumption side of the flywheel: Meta feed and YouTube watch history ingested as vault frames, reconciled against the brand-voice production lanes. Project frame and two ingest experiments authored 2026-04-17; API and Takeout pipelines pending first run.
Tools
Obsidian vault, 609+ notes, 14 dimensions, 33 skills, BloomNet for session management
R + ggplot2, Paul Tol colorblind-safe palettes, Slate brand system, 118+ charts
Astro + Tailwind CSS, MDX, static generation, Cloudflare Pages
Methodology
- Pitfalls document failures with root causes
- Ideas propose solutions to those failures
- Experiments test ideas with measurable metrics
- Skills codify what works into reusable patterns
- Breakthroughs verify the wins with before/after data
Everything lives in an Obsidian vault connected via wikilinks and tracked with frontmatter schemas. This site syncs from that vault, transforms the content, and generates publication-quality visualizations using R and ggplot2.
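As a rough illustration of the first step in that sync, the sketch below reads one note's frontmatter and collects its wikilink targets using only the Python standard library. It makes several assumptions: the frontmatter is treated as flat `key: value` lines (a real pipeline would likely use a proper YAML parser), the `type` and `status` fields and the sample note text are hypothetical, and the actual sync code for this site may work quite differently.

```python
import re

# Matches [[Target]], [[Target|alias]] and [[Target#Heading]]; captures only Target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a note into (flat frontmatter dict, body). Nested YAML is not handled."""
    if not text.startswith("---\n"):
        return {}, text
    end = text.find("\n---", 3)  # closing delimiter line
    if end == -1:
        return {}, text
    meta: dict[str, str] = {}
    for line in text[4:end].splitlines():
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    body = text[end + 4:].lstrip("\n")
    return meta, body


def wikilinks(body: str) -> list[str]:
    """Return link targets in order of appearance, without aliases or heading anchors."""
    return [target.strip() for target in WIKILINK.findall(body)]


# Hypothetical note content, invented for this example.
SAMPLE = """---
type: experiment
status: done
---
Tests [[Idea A|the retry idea]], motivated by [[Pitfall A#Root cause]].
"""

meta, body = split_frontmatter(SAMPLE)
links = wikilinks(body)
print(meta, links)
```

Keeping the parser this small is a deliberate simplification: it shows how typed notes and their links can become a graph, while leaving schema validation and rendering to later stages.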