// term 79 · Agentic Systems

Reflection Loop

Self-Review Mechanism

An agent reviewing and critiquing its own output before finalizing it — generate, examine, revise, repeat. The reflection loop builds a draft-and-review discipline into the system itself, catching errors at the moment they're cheapest to fix: before anyone else sees them.

Self-Critique · Revision · Quality · Draft-Review

// Pattern

draft → critique → revise

The writing-process discipline, systematized — first attempts treated as drafts by architecture.

// Lift

consistent

Reflection measurably improves accuracy, completeness, and constraint adherence across task families — one of agentic AI's most reliable patterns.

// Cost

2–3x tokens

Critique and revision passes multiply per-output compute — quality bought with latency and spend, tiered to stakes.
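As a rough arithmetic sketch of where a 2–3x figure can come from, assume each reflection pass costs one critique plus one revision, each expressed as a fraction of the draft's token count. The function name and the default ratios are illustrative assumptions, not measurements.

```python
def token_multiplier(passes: int,
                     critique_ratio: float = 0.5,
                     revision_ratio: float = 1.0) -> float:
    """Total tokens relative to a single-pass draft.

    Assumes (hypothetically) that a critique costs half a draft and a
    revision costs a full draft; real ratios depend on prompts and models.
    """
    if passes < 0:
        raise ValueError("passes must be non-negative")
    return 1.0 + passes * (critique_ratio + revision_ratio)


print(token_multiplier(0))  # 1.0, single-pass generation
print(token_multiplier(1))  # 2.5, one critique-and-revise pass
```

Under these assumed ratios a single reflection pass already lands inside the 2–3x band, and each further pass adds another 1.5x.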

// full definition

What Reflection Loop actually is

First drafts are first drafts, whether human or machine — and single-pass generation amounts to shipping them. The reflection loop adds what good work always had: a review step. The system generates, then examines its own output — against the task's requirements, against the evidence, against explicit quality criteria — produces a critique, and revises. The error caught in reflection costs tokens; the same error caught downstream costs trust, rework, or consequences.

The mechanism works because critique is easier than creation. Models detect flaws in presented work more reliably than they avoid those flaws while generating — the same asymmetry that makes human editors valuable. Effective reflection sharpens this further with explicit criteria: not “improve this” but “check every figure against the source,” “verify each requirement is addressed,” “list claims lacking citations.” Vague reflection produces cosmetic revision; targeted reflection produces corrections.
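A minimal sketch of what "explicit criteria" can look like in code. In a real system a model would usually perform the critique; here two deterministic checks stand in for it so the example runs on its own. The names `figures_match_source`, `make_requirement_check`, and `critique` are hypothetical.

```python
import re
from typing import Callable

# A check takes (draft, source) and returns findings; an empty list means pass.
Check = Callable[[str, str], list[str]]

NUMBER = r"\d+(?:\.\d+)?"


def figures_match_source(draft: str, source: str) -> list[str]:
    """Flag every number in the draft that does not appear in the source."""
    source_numbers = set(re.findall(NUMBER, source))
    return [
        f"figure {n} not found in source"
        for n in re.findall(NUMBER, draft)
        if n not in source_numbers
    ]


def make_requirement_check(required_terms: list[str]) -> Check:
    """Build a check that flags required terms the draft never mentions."""
    def check(draft: str, source: str) -> list[str]:
        lowered = draft.lower()
        return [
            f"requirement not addressed: {term}"
            for term in required_terms
            if term.lower() not in lowered
        ]
    return check


def critique(draft: str, source: str, checks: list[Check]) -> list[str]:
    """Run every check and collect the findings into one list."""
    findings: list[str] = []
    for check in checks:
        findings.extend(check(draft, source))
    return findings


source = "Revenue was 120 in Q1 and 135 in Q2."
draft = "Revenue grew from 120 to 140."
findings = critique(
    draft, source, [figures_match_source, make_requirement_check(["Q1", "Q2"])]
)
print(findings)
```

Each finding names a specific defect, which is what gives the revision pass something concrete to correct rather than something vague to polish.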

Variants tune the pattern to stakes. Single-pass reflection catches the worst cheaply; iterative loops repeat until criteria pass or budgets cap; separate critic configurations — a fresh model instance unanchored to the draft's reasoning, sometimes a different model entirely — strengthen the review by removing the author's bias toward their own work. In agent workflows, reflection extends beyond outputs to actions: plans reviewed before execution, results examined after, the loop standing in wherever a careful human would pause and check.

The pattern's limits deserve clear eyes. Reflection inherits the reviewer's blind spots: a model that doesn't know a fact is wrong won't flag it, and self-review without external grounding polishes more than it corrects. Convergence isn't guaranteed. Loops can oscillate or drift past their best draft, which is why iteration caps and pass-fail criteria matter. And the economics are real: the quality lift costs two to three times the tokens per output, a price worth paying for consequential outputs and worth skipping for disposable ones. Tier the loop to the stakes, like every other reliability layer.
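One way to guard against oscillation and drift, sketched under simplifying assumptions: cap the iterations, stop as soon as the critique comes back empty, and return the best draft seen so far. The finding count is used here as a crude quality proxy, and `reflect`, `toy_critique`, and `toy_revise` are hypothetical stand-ins for model calls.

```python
from typing import Callable


def reflect(
    draft: str,
    critique: Callable[[str], list[str]],
    revise: Callable[[str, list[str]], str],
    max_iterations: int = 3,
) -> tuple[str, list[list[str]]]:
    """Critique and revise until no findings remain or the cap is hit.

    Returns the draft with the fewest findings seen, plus the critique
    history, so a loop that drifts still hands back its best draft.
    """
    current, findings = draft, critique(draft)
    best, best_findings = current, findings
    history = [findings]
    for _ in range(max_iterations):
        if not findings:  # pass-fail gate: criteria met
            break
        current = revise(current, findings)
        findings = critique(current)
        history.append(findings)
        if len(findings) < len(best_findings):
            best, best_findings = current, findings
    return best, history


def toy_critique(text: str) -> list[str]:
    findings = []
    if "TODO" in text:
        findings.append("remove TODO marker")
    if not text.endswith("."):
        findings.append("end with a period")
    return findings


def toy_revise(text: str, findings: list[str]) -> str:
    # Fix only the first finding per pass so the loop has to iterate.
    if findings[0] == "remove TODO marker":
        return text.replace("TODO ", "")
    return text + "."


final, history = reflect("TODO ship the report", toy_critique, toy_revise)
print(final)         # ship the report.
print(len(history))  # 3
```

A production loop would likely replace the finding count with a scored rubric, but the shape stays the same: a cap, a gate, and a remembered best.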

// how it works

Building review into the loop

Reflection inserts a critic between generation and delivery — the output examined against criteria, revised against findings, and only then released.

Generation

The first attempt is produced and treated by the architecture as a draft, not a deliverable.

Criteria Recall

The review's standards are assembled: task requirements, evidence checks, and quality rubrics. Together they form the lens for examination.

Critique

The draft is examined against criteria — flaws, gaps, and unsupported claims surfaced as explicit findings.

Revision

The draft is updated against the findings: corrections made, gaps filled, claims grounded or removed.

Convergence Check

Criteria pass, or the loop repeats within its budget — iteration bounded by caps and pass-fail gates.

Release

The reviewed output is delivered, with the critique history logged as the quality trail of how it got there.
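The six stages can be sketched as a single function, with the stage each line corresponds to marked in comments. This is an illustrative outline, not a reference implementation: `Criterion`, `Release`, and `run_pipeline` are hypothetical names, and plain callables stand in for the model calls.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Criterion:
    name: str
    passes: Callable[[str], bool]  # True when the draft satisfies it


@dataclass
class Release:
    output: str
    passed: bool
    trail: list[dict] = field(default_factory=list)  # one entry per critique


def run_pipeline(task, generate, recall_criteria, revise, budget: int = 2) -> Release:
    """Run generation through release; budget is the max number of revisions (>= 0)."""
    draft = generate(task)                        # Generation
    criteria = recall_criteria(task)              # Criteria Recall
    trail = []
    for iteration in range(budget + 1):
        failed = [c.name for c in criteria if not c.passes(draft)]  # Critique
        trail.append({"iteration": iteration, "failed": failed})
        if not failed or iteration == budget:     # Convergence Check
            break
        draft = revise(draft, failed)             # Revision
    return Release(output=draft, passed=not failed, trail=trail)    # Release


criteria = [
    Criterion("mentions deadline", lambda d: "Friday" in d),
    Criterion("under 60 characters", lambda d: len(d) < 60),
]


def add_deadline(draft: str, failed: list[str]) -> str:
    if "mentions deadline" in failed:
        return draft + " It ships Friday."
    return draft


release = run_pipeline(
    task="status update",
    generate=lambda task: "The report is nearly done.",
    recall_criteria=lambda task: criteria,
    revise=add_deadline,
)
print(release.output)
print(release.passed)
print(release.trail)
```

The `trail` field is the logged critique history: it records which criteria failed at each iteration, so a released output carries its own review record.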

// anatomy

The components teams must understand

Critic Role

The internal reviewer

The configuration examining the draft — same model re-prompted, fresh instance, or separate model, in rising independence.

Review Criteria

The explicit lens

Concrete checks replacing vague improvement — the specificity that turns reflection from cosmetic to corrective.

Critique Artifact

Findings, recorded

The explicit list of flaws and gaps — actionable input for revision and a logged quality trail afterward.

Revision Pass

Findings applied

The draft corrected against the critique — the step where review becomes improvement.

Iteration Bounds

Convergence discipline

Caps and pass-fail gates preventing oscillation and budget burn — loops that end, by design.

External Anchors

Grounding the review

Sources, tests, and validators feeding the critique — the outside truth self-review alone can't supply.
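To illustrate external anchors, here is a small sketch in which a validator contributes findings that self-review missed. The JSON parser from Python's standard library acts as the outside truth; the required keys, `grounded_critique`, and the `oblivious` critic are assumptions made up for this example.

```python
import json


def json_validator(draft: str) -> list[str]:
    """External anchor: the JSON parser, not the model, decides validity."""
    try:
        data = json.loads(draft)
    except json.JSONDecodeError as err:
        return [f"invalid JSON: {err.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be an object"]
    # Hypothetical schema: these two keys are required.
    return [
        f"missing required key: {key}"
        for key in ("title", "summary")
        if key not in data
    ]


def grounded_critique(draft, self_review, anchors) -> list[str]:
    """Merge self-review findings with findings from external anchors."""
    findings = list(self_review(draft))
    for anchor in anchors:
        findings.extend(anchor(draft))
    return findings


def oblivious(draft: str) -> list[str]:
    # Stand-in for a model critic that sees nothing wrong with the draft.
    return []


result = grounded_critique('{"title": "Q3 plan"}', oblivious, [json_validator])
print(result)  # ['missing required key: summary']
```

The self-review returned nothing, yet the critique artifact still contains a finding, because the anchor checks the draft against something the model does not control.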

// strategic implications

What this changes for the business

01 · Quality

Cheap insurance at the moment of creation

Reflection catches errors where they're cheapest — before delivery — and lifts accuracy, completeness, and constraint adherence across task families. For any output with consequences, the 2–3x token cost is among the best-priced reliability available.

02 · Design

Criteria make or break the loop

Vague self-review polishes; explicit checks correct. Invest in the rubric — verifiable criteria, evidence requirements, pass-fail gates — and strengthen independence where stakes rise: fresh critic instances see what authors don't.

03 · Limits

Reflection isn't verification

Self-review inherits the model's blind spots — unknown errors stay unflagged, and polish can masquerade as correction. Anchor critiques in external truth (sources, tests, validators) and keep reflection one layer in the reliability stack, not the stack itself.

// common misconceptions

What Reflection Loop is not

Myth

“Models can't meaningfully review their own work.”

Reality

Critique is empirically easier than creation — models catch flaws in presented drafts they failed to avoid while writing. The asymmetry is the pattern's foundation, and the measured lifts are consistent.

Myth

“More reflection iterations mean better output.”

Reality

Returns concentrate in the first pass or two. Beyond that, loops can oscillate, drift past their best draft, and burn budget. Bounded iteration with pass-fail criteria captures the value and skips the pathology.

Myth

“Reflection makes external review unnecessary.”

Reality

Self-review shares the author's blind spots — it complements grounding, verification, and human gates rather than replacing them. The loop is one reviewer on the committee, not the committee.

// from literacy to leverage

Know the term. Now build the strategy.

Vocabulary is the entry fee. Turning these primitives into pipeline, moats, and margin is the work. That's the conversation.
