Model Escalation
The pattern where an AI pipeline shifts to a more capable (and costly) model as task difficulty increases, either through explicit routing logic or emergent queue composition.
The pattern where an AI pipeline shifts to a more capable (and costly) model as task difficulty increases, either through explicit routing logic or emergent queue composition.
Model escalation occurs when a pipeline transitions from a faster, lower-cost model to a deeper, higher-cost one as target complexity increases. The trigger can be explicit (a routing rule based on target metadata) or emergent (the easy targets were consumed first, leaving only hard targets in the queue).
In a well-architected research pipeline, escalation should be visible in the data: session count drops, average session length grows, cost per session rises, but total weekly cost stays within the same band because the volume also drops.
The failure mode is uncontrolled escalation: hard targets queued in parallel without difficulty gating, forcing expensive models to run at sweep volume rather than reasoning volume.