Definitions
The Glossary
Every technical term used in this lab, explained for humans. Hover over a dotted-underline term anywhere on the site for a quick tooltip, or click through for the full explanation with a visualization.
A/B Test
Split your audience in two, show each group a different version, and let the numbers pick the winner.
Ablation Study
Remove one piece, measure the drop. Tells you which component is actually doing the work.
Baseline
The starting measurement everything gets compared to. Without one, improvement is just a claim.
Benchmark Comparison
Measure your system against a known standard. Not 'is it good?' but 'is it better than the alternative?'
Brier Score
A measure of how accurate your probability forecasts are. 0 is perfect; 0.25 is what always guessing 50/50 scores.
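A minimal sketch of the computation, assuming binary outcomes coded 0/1 (the function name is illustrative):

```python
def brier_score(probs, outcomes):
    """Mean squared gap between predicted probabilities and what happened (0 or 1)."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# Always guessing 0.5 scores 0.25 no matter what happens.
print(brier_score([0.5, 0.5], [1, 0]))  # 0.25
```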
Canary Deployment
Route a small percentage of traffic to the new version. If nothing breaks, gradually increase.
Canvas 2D Rendering
Browser-native pixel drawing API for games, visualizations, and interactive graphics.
CDP (Chrome DevTools Protocol)
The protocol for controlling Chrome programmatically. How the lab automates a real browser.
Chaos Engineering
Deliberately break things in production to find weaknesses before real failures do.
Control Group
The group that gets no change. The 'before' picture you compare everything against.
CVaR (Conditional Value at Risk)
The average loss in the worst-case scenarios. Measures how bad things get when they get bad.
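A toy sketch, assuming losses are positive numbers and `alpha` is the confidence level (names illustrative):

```python
def cvar(losses, alpha=0.95):
    """Average of the worst (1 - alpha) fraction of losses."""
    n_tail = max(1, int(len(losses) * (1 - alpha)))
    tail = sorted(losses, reverse=True)[:n_tail]
    return sum(tail) / len(tail)

# For losses 1..100 at alpha=0.95, this averages the worst five: 96..100.
```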
DQI (Data Quality Index)
Composite score measuring how complete and correct your data is. Seven components, one number.
Drawdown
The fall from peak to trough. How much you lost before recovering, and how long it took.
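A sketch of maximum drawdown over a price series (an illustrative helper, not the lab's code):

```python
def max_drawdown(prices):
    """Largest peak-to-trough fall, as a fraction of the peak."""
    peak, worst = prices[0], 0.0
    for p in prices:
        peak = max(peak, p)
        worst = max(worst, (peak - p) / peak)
    return worst

# 100 -> 120 -> 90 -> 130: fell 25% from the 120 peak before recovering.
```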
Electron
Framework for building desktop apps with web technologies (Chromium + Node.js).
Embeddings
Text converted to numbers that capture meaning. Similar ideas land near each other in vector space.
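"Near each other" is usually measured with cosine similarity; a toy example with hand-made vectors:

```python
import math

def cosine_similarity(a, b):
    """How aligned two embedding vectors are: 1.0 same direction, 0.0 unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Real embeddings have hundreds or thousands of dimensions, but the comparison works the same way.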
ETL Pipeline
Extract-Transform-Load pattern for moving and reshaping data between systems.
Fine-Tuning
Taking a pre-trained AI model and teaching it something specific with targeted examples.
Futures Contract
An agreement to buy or sell something at a set price on a future date. The backbone of commodities.
Greedy Parameter Sweep
Tune one setting at a time, locking in the best value before moving to the next. Not exhaustive, but fast and reliable when the settings don't interact much.
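One way to sketch a greedy sweep: fix each parameter at its best value before tuning the next (all names illustrative):

```python
def greedy_sweep(score, grid):
    """Tune parameters one at a time, keeping the best value found so far."""
    best = {name: values[0] for name, values in grid.items()}
    for name, values in grid.items():
        for v in values:
            trial = {**best, name: v}
            if score(trial) > score(best):
                best = trial
    return best
```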
Hallucination
When AI confidently says something that isn't true. Plausible fiction presented as fact.
Hypothesis Test
A formal bet: state what you expect, run the experiment, and let the data confirm or reject it.
Karpathy Ratchet
A quality metric that only goes up, never down. Each iteration must beat the previous best.
Knowledge Graph
Connecting facts as a web of relationships, not rows in a table. Entities linked by meaning.
Lifecycle Chain
The connective tissue of this lab: failure to pitfall to experiment to breakthrough. Nothing is wasted.
LLM Agent Architecture
Design patterns for autonomous AI agents that use tools, memory, and planning.
MAE (Mean Absolute Error)
The average distance between your predictions and reality. Lower is better.
MAPE (Mean Absolute Percentage Error)
MAE as a percentage. Makes errors comparable even when the numbers are wildly different scales.
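Both metrics in a few lines (illustrative helpers; MAPE assumes no actual value is zero):

```python
def mae(preds, actuals):
    """Average absolute error, in the same units as the data."""
    return sum(abs(p - a) for p, a in zip(preds, actuals)) / len(preds)

def mape(preds, actuals):
    """MAE expressed as a percentage of the actual values."""
    return 100 * sum(abs(p - a) / abs(a) for p, a in zip(preds, actuals)) / len(preds)
```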
MCP (Model Context Protocol)
Anthropic's standard for giving AI models tools and data. Like USB, but for AI connections.
Monte Carlo Simulation
Run thousands of random scenarios to map the range of possible outcomes. Dice rolls for decisions.
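A toy version for a price series, assuming daily returns are normally distributed (all names and parameters illustrative):

```python
import random

def simulate_final_prices(start, drift, vol, days, runs=10000, seed=0):
    """Run many random price paths and collect where each one ends up."""
    rng = random.Random(seed)
    finals = []
    for _ in range(runs):
        price = start
        for _ in range(days):
            price *= 1 + rng.gauss(drift, vol)
        finals.append(price)
    return finals  # the distribution of possible outcomes
```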
Multi-Persona Audit
Three or more expert reviewers with different lenses audit the same system. Each lens catches blind spots the others miss.
OAuth
Authorization protocol that lets apps access user data without sharing passwords.
Overfitting
When a model memorizes training data instead of learning the pattern. Perfect in class, fails the exam.
Parameter Tuning
Adjusting the knobs on your model to find the best settings. Tuning explores; ratchets only go up.
Polymarket
A prediction market where people bet real money on future events. The lab's oil model beats it.
Progressive Deployment
Deploy, measure breakage, fix, deploy again. Each iteration hardens the system through real-world contact.
Proof of Concept
Build the smallest possible version to prove the idea works before investing fully.
Quantitative Audit
Multi-round validation: check internal consistency, then forward accuracy, then external benchmarks.
R-squared (R²)
How much of reality your model explains. 1.0 means it captures everything, 0 means it captures nothing.
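The standard formula, sketched (one minus the ratio of leftover error to total variance):

```python
def r_squared(preds, actuals):
    """Fraction of the variance in the actuals that the predictions explain."""
    mean = sum(actuals) / len(actuals)
    ss_res = sum((a - p) ** 2 for p, a in zip(preds, actuals))
    ss_tot = sum((a - mean) ** 2 for a in actuals)
    return 1 - ss_res / ss_tot

# Perfect predictions score 1.0; just predicting the mean scores 0.0.
```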
Ralph Loop
An autonomous build loop: iterate over spec files, invoke an AI agent per spec, assemble the whole system.
Rate Limiting
Controlling how fast you can make requests. A bouncer for APIs that prevents overload and abuse.
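One common implementation is a token bucket; a minimal sketch (not any particular library's API):

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity` requests, refilling `rate` tokens per second."""

    def __init__(self, rate, capacity):
        self.rate, self.capacity = rate, capacity
        self.tokens, self.last = capacity, time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill tokens for the time elapsed since the last check.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # the bouncer says no
```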
Red Team Testing
Hire someone to attack your system on purpose. Find vulnerabilities before a real attacker does.
Regression Testing
Run all existing tests after every change. Make sure fixing one thing didn't break ten others.
RMSE (Root Mean Square Error)
Average prediction error that penalizes big misses more than small ones. Lower is better.
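A sketch showing why big misses cost more (function name illustrative):

```python
import math

def rmse(preds, actuals):
    """Square errors before averaging, so one big miss outweighs many small ones."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(preds, actuals)) / len(preds))

# Errors [2, 2] and [0, 4] have the same MAE, but [0, 4] has a higher RMSE.
```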
Root Cause Analysis
Ask 'why?' five times. Don't fix the symptom; find and fix the actual cause underneath.
Sensitivity Analysis
Tweak each input and measure how much the output changes. Find which levers actually move the needle.
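A one-at-a-time sketch, the simplest form of sensitivity analysis (names illustrative):

```python
def sensitivity(model, base, step=0.01):
    """Nudge each input by `step` (relative) and report how much the output moves."""
    baseline = model(base)
    effects = {}
    for name, value in base.items():
        bumped = {**base, name: value * (1 + step)}
        effects[name] = model(bumped) - baseline
    return effects  # the biggest entries are the levers that matter
```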
Sharpe Ratio
Return per unit of risk. Measures whether you're getting paid enough for the danger you're taking.
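A sketch using population standard deviation (conventions on risk-free rate and annualization vary; names illustrative):

```python
import math

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the volatility of returns."""
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / len(excess)
    var = sum((r - mean) ** 2 for r in excess) / len(excess)
    return mean / math.sqrt(var)
```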
Statistical Significance
Is your result real or just luck? The math that tells you whether to trust your experiment.
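A sketch of a two-proportion z-test, the usual math behind an A/B result; |z| above 1.96 corresponds to p < 0.05, two-sided (names illustrative):

```python
import math

def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """z-score for the difference between two conversion rates."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# 10% vs 15% conversion over 1000 users each is well past the 1.96 threshold.
```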
Technical Spike
Time-boxed exploration to reduce risk before committing to a full build. Learning, not building.
Token
How AI reads text: not words, but chunks. A token is roughly 3/4 of a word on average.
Transformer
The neural network architecture behind GPT, Claude, and most modern LLMs. Built on attention.
Volatility
How much a price bounces around. High volatility means big swings, not just big losses.
WebSocket
A persistent two-way connection between client and server. Like a phone call, not sending letters.