Iteration Objective
The second axis of the self-improvement taxonomy: what the agent is iterating toward (goal, metric, correctness, diversity, robustness, equilibrium).
The iteration objective is what an agent’s improvement loop converges toward. It is the second axis of the self-improvement taxonomy, orthogonal to “what improves” (harness, skills, config, knowledge, decisions, weights). Two agents can both improve config (the same “what”) yet use completely different iteration strategies: one ratchets a scalar metric (IO-2), while another explores a tree of candidates (IO-4). Eight objectives are documented: goal-seeking, metric-ratcheting, reflection-accumulating, search-exploring, population-evolving, adversarial-competing, state-reconciling, stress-hardening.
How It Works
Classify by asking: “When the agent finishes one iteration, what signal tells it whether to keep the result?” Goal-seeking checks a binary completion condition. Metric-ratcheting checks a scalar. Reflection checks verbal self-critique. Search checks node evaluation. Evolution checks fitness. Adversarial checks win/loss. Reconciliation checks state delta. Stress checks survival. The signal type determines the convergence properties and failure modes.
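The classification question above can be sketched as a lookup from signal type to objective. This is a minimal illustration, not part of the vault: the `Objective` enum, the `SIGNAL_TO_OBJECTIVE` table, the signal labels, and `classify` are hypothetical names chosen for this example.

```python
from enum import Enum


class Objective(Enum):
    """The eight iteration objectives, numbered as in the IO table."""
    GOAL_SEEKING = 1
    METRIC_RATCHETING = 2
    REFLECTION_ACCUMULATING = 3
    SEARCH_EXPLORING = 4
    POPULATION_EVOLVING = 5
    ADVERSARIAL_COMPETING = 6
    STATE_RECONCILING = 7
    STRESS_HARDENING = 8


# Hypothetical labels for the keep/discard signal checked after one iteration.
SIGNAL_TO_OBJECTIVE = {
    "binary completion": Objective.GOAL_SEEKING,
    "scalar metric": Objective.METRIC_RATCHETING,
    "verbal critique": Objective.REFLECTION_ACCUMULATING,
    "node evaluation": Objective.SEARCH_EXPLORING,
    "fitness": Objective.POPULATION_EVOLVING,
    "win/loss": Objective.ADVERSARIAL_COMPETING,
    "state delta": Objective.STATE_RECONCILING,
    "survival": Objective.STRESS_HARDENING,
}


def classify(signal: str) -> Objective:
    """Map the signal an agent checks after an iteration to its objective."""
    return SIGNAL_TO_OBJECTIVE[signal.strip().lower()]
```

For example, `classify("scalar metric")` yields `Objective.METRIC_RATCHETING`, while an unrecognised label raises `KeyError`, which is a reasonable prompt to ask the classification question again.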
The Huang Constraint
Every working iteration objective relies on external feedback (test suites, environment rewards, scalar metrics, adversarial opponents). Huang et al. (ICLR 2024) reported that pure intrinsic self-correction, where the model judges its own work with no external signal, tends to degrade performance rather than improve it. This is the foundational constraint of the taxonomy. See topics/pitfalls/self-correction-without-external-feedback.
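One way to read the constraint is that the accept/reject decision should live outside the component that proposes changes. The sketch below illustrates this for a goal-seeking loop under stated assumptions: `propose` and `external_check` are hypothetical callables supplied by the caller (for instance, a model call and a test-suite run), and `max_iters` is an illustrative budget.

```python
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def goal_seeking_loop(
    propose: Callable[[T], T],
    external_check: Callable[[T], bool],
    start: T,
    max_iters: int = 10,
) -> Tuple[Optional[T], int]:
    """Iterate until an external check passes or the budget runs out.

    The proposer never grades its own output: only `external_check`
    (assumed to be something like a test suite) decides completion.
    Returns (candidate, iterations_used), or (None, max_iters) on a stall.
    """
    candidate = start
    for iteration in range(1, max_iters + 1):
        candidate = propose(candidate)
        if external_check(candidate):
            return candidate, iteration
    return None, max_iters
```

With a toy proposer `lambda x: x + 1` and check `lambda x: x >= 3` starting from `0`, the loop returns after three iterations; with a budget of two it reports a stall instead of declaring success on its own judgment.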
The Eight Objectives
| IO | Name | Signal Type | Vault Instance | Convergence |
|---|---|---|---|---|
| 1 | [Goal-Seeking](/definitions/goal-seeking-loop) | Binary (done/not done) | Ralph Loop | Terminates or stalls |
| 2 | [Metric-Ratcheting](/definitions/karpathy-ratchet) | Scalar (higher is better) | Karpathy Ratchet | Monotonic, plateaus |
| 3 | [Reflection-Accumulating](/definitions/reflection-accumulating-loop) | Verbal critique | Lifecycle Chain | Unbounded (Huang risk) |
| 4 | [Search-Exploring](/definitions/search-exploring-loop) | Node evaluation | Ratchet hypothesis phases | Branch-bounded |
| 5 | [Population-Evolving](/definitions/population-evolving-loop) | Fitness function | Multi-persona audit | Open-ended |
| 6 | [Adversarial-Competing](/definitions/adversarial-competing-loop) | Win/loss signal | Agent-MQI | Arms race |
| 7 | [State-Reconciling](/definitions/state-reconciling-loop) | State delta | Stella hooks, repair-sweep | Equilibrium-seeking |
| 8 | [Stress-Hardening](/definitions/stress-hardening-loop) | Survival under stress | Probe suite | Antifragile |
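The "Monotonic, plateaus" entry for IO-2 can be made concrete with a small ratchet sketch. It assumes a higher-is-better scalar, as the table states; `propose`, `score`, and the `patience` stopping rule are hypothetical choices for illustration, not the definition given in definitions/karpathy-ratchet.

```python
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def ratchet(
    propose: Callable[[T], T],
    score: Callable[[T], float],
    start: T,
    patience: int = 3,
    max_iters: int = 100,
) -> Tuple[T, List[float]]:
    """Keep a candidate only if it strictly improves the external score.

    Stops after `patience` consecutive non-improving proposals (a plateau)
    or after `max_iters` proposals. Returns the best candidate and the
    best-score history, which never decreases.
    """
    best = start
    best_score = score(best)
    history = [best_score]
    stale = 0
    for _ in range(max_iters):
        candidate = propose(best)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            # Ratchet forward: accept and reset the plateau counter.
            best, best_score, stale = candidate, candidate_score, 0
        else:
            # Reject: the previous best is kept unchanged.
            stale += 1
        history.append(best_score)
        if stale >= patience:
            break
    return best, history
```

Because rejected candidates are discarded, the recorded best score can only stay flat or rise, which is the monotonic behaviour listed in the table; the plateau shows up as a run of identical values at the end of `history`.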
Related
- topics/self-improving-agent-patterns: master topic hub
- skills/self-improving-agent-patterns: pattern selection skill
- definitions/karpathy-ratchet: IO-2 canonical instance
- definitions/ralph-loop: IO-1 canonical instance
- definitions/lifecycle-chain: IO-3 vault instance