// term 74 · Agentic Systems

AI Agent

Autonomous AI Operator

An AI system that perceives its environment, makes decisions, and takes goal-directed actions — using tools, APIs, and multi-step reasoning to complete open-ended work. The agent is the unit of AI labor: not a model answering questions, but a system pursuing outcomes.

Autonomy · Tools · Goal-Directed · Operators

// Anatomy

model + harness

An LLM provides the judgment; orchestration, tools, memory, and guardrails provide everything else an operator needs.

// Span

minutes–days

Task horizons from single workflows to standing responsibilities — far beyond the single-response interactions that preceded them.

// Reliability law

compounding

Per-step success rates multiply across steps: 95% per action is roughly 36% across twenty (0.95^20 ≈ 0.36, assuming independent steps), which is why agent engineering is reliability engineering.

// full definition

What an AI Agent actually is

An agent differs from a chatbot in kind, not degree: it is built to pursue outcomes rather than produce responses. Given a goal — resolve this incident, prepare this analysis, manage this queue — the agent decides what to do, does it, observes what happened, and continues until done or blocked. The language model supplies the per-step judgment; the surrounding system — orchestrator, tools, memory, permissions — turns judgment into an operator capable of touching real systems.

The loop is the architecture. Perceive: the agent assembles its picture from the goal, its memory, and fresh observations. Decide: the model selects the next action against the plan. Act: a tool call executes — a query, an API invocation, a code run, a message. Observe: results return as new information, errors included, feeding the next cycle. This grounding in real feedback is the agent's defining strength: plans collide with reality every step, and reality wins — the agent adjusts rather than narrates.

The engineering discipline is dominated by one piece of arithmetic: errors compound. If steps succeed independently, a 95% per-step success rate yields roughly 36% across twenty steps (0.95^20 ≈ 0.36). That is why production agents are wrapped in reliability machinery: validation between steps, bounded retries, checkpoints that preserve progress, escalation paths when confidence drops, and permission scopes that cap the blast radius of any single wrong action. Capability gets the demos; reliability engineering gets the deployments.
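The compounding arithmetic is easy to check directly. The sketch below is illustrative only: the function names are hypothetical, steps are assumed to succeed or fail independently, and the retry model assumes validation catches every failed attempt, which real systems only approximate.

```python
def task_success(per_step: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    return per_step ** steps


def with_retries(per_step: float, retries: int) -> float:
    """Effective per-step success when a detected failure is retried.

    Assumes attempts fail independently and that every failure is detected.
    """
    return 1.0 - (1.0 - per_step) ** (retries + 1)


def required_per_step(target: float, steps: int) -> float:
    """Per-step success rate needed to reach a target task success rate."""
    return target ** (1.0 / steps)


baseline = task_success(0.95, 20)
retried = task_success(with_retries(0.95, retries=2), 20)
needed = required_per_step(0.90, 20)

print(f"20 steps at 95% per step: {baseline:.3f}")
print(f"same, with up to 2 retries per step: {retried:.4f}")
print(f"per-step rate needed for 90% over 20 steps: {needed:.4f}")
```

Under these idealized assumptions, bounded retries with reliable validation recover most of the lost success rate, which is one way to read the claim that the machinery around the model matters as much as the model.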

Strategically, agents change the unit of AI value from answers to outcomes — priced per completed task and benchmarked against the fully loaded cost of the process they absorb. They also change the management problem: agents occupy roles, with responsibilities, permissions, performance reviews (evaluations), and audit trails. The organizations adopting them well treat agent design as workflow design — decomposing processes, placing human gates at consequence boundaries, and measuring task completion the way they would for any operator, silicon or otherwise.

// how it works

From goal to completed work

An agent runs a continuous loop — perceive, decide, act, observe — with a model supplying judgment and a harness supplying hands, memory, and limits.

Goal & Scope

The agent receives its objective, constraints, permissions, and escalation rules — the contract under which it operates.

Perception

Context assembles — task state, memory, fresh observations — the picture from which the next decision is made.

Decision

The model selects the next action against plan and feedback — judgment applied one step at a time.

Action

A tool executes the choice — query, API call, code run, message — judgment becoming effect in a real system.

Observation

Results and errors return as information — reality's feedback grounding the loop's next iteration.

Completion or Escalation

Done, blocked, or out of bounds — the agent delivers with its audit trail, or hands up to a human with context intact.
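The six stages above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not a reference implementation: `Goal`, `RunResult`, `run_agent`, and the scripted `decide` function are hypothetical names, and the model is replaced by a stub so the example runs without any external service.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Goal:
    """The contract: objective, permitted tools, and a step budget."""
    objective: str
    allowed_tools: set[str]
    max_steps: int = 10


@dataclass
class RunResult:
    status: str  # "done" or "escalated"
    reason: str
    trail: list[dict] = field(default_factory=list)


def run_agent(goal: Goal,
              decide: Callable[[dict], dict],
              tools: dict[str, Callable[..., Any]]) -> RunResult:
    observations: list[str] = []
    trail: list[dict] = []
    for step in range(goal.max_steps):
        # Perception: assemble the picture for this step.
        context = {"objective": goal.objective, "observations": list(observations)}
        # Decision: the model (here a stub) picks the next action.
        action = decide(context)
        name, args = action["tool"], action.get("args", {})
        if name == "finish":
            trail.append({"step": step, "action": action, "observation": None})
            return RunResult("done", args.get("summary", ""), trail)
        if name not in goal.allowed_tools:
            # Out of bounds: hand up to a human with the trail intact.
            trail.append({"step": step, "action": action, "observation": "refused"})
            return RunResult("escalated", f"tool {name!r} is out of scope", trail)
        # Action and observation: errors come back as information.
        try:
            observation = str(tools[name](**args))
        except Exception as exc:
            observation = f"error: {exc}"
        observations.append(observation)
        trail.append({"step": step, "action": action, "observation": observation})
    return RunResult("escalated", "step budget exhausted", trail)


def scripted_decide(context: dict) -> dict:
    """Stand-in for a model: look something up once, then finish."""
    if not context["observations"]:
        return {"tool": "lookup", "args": {"key": "order-42"}}
    return {"tool": "finish", "args": {"summary": context["observations"][-1]}}


tools = {"lookup": lambda key: f"{key}: shipped"}
result = run_agent(Goal("report order status", {"lookup"}), scripted_decide, tools)
print(result.status, "-", result.reason)
```

Every exit path returns the trail, so a completed run and an escalated run both leave the same kind of record for review.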

// anatomy

The components teams must understand

Reasoning Core

The judgment engine

The LLM making per-step decisions — its quality setting the ceiling on task complexity the agent can navigate.

Tool Interface

Hands on systems

The defined action set — search, databases, APIs, code execution — through which decisions become effects.

Memory & State

Continuity machinery

Working context, task state, and durable memory — what keeps long-running work coherent across steps and sessions.

Orchestrator

The loop runner

The harness sequencing perceive-decide-act-observe — retries, timeouts, checkpoints, and the discipline of the cycle.

Permission Envelope

Scoped authority

What the agent may do unsupervised, what needs approval, what's forbidden — autonomy bounded by policy in code.

Audit Trail

The accountability record

Every decision, action, and observation logged — the artifact that makes delegated work reviewable and defensible.
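Two of these components, the permission envelope and the audit trail, are simple enough to sketch together. The example below is an assumption-laden illustration: the class names, the three-way verdict, the default-deny rule, and the action names such as `issue_refund` are all hypothetical choices, not a standard API.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Verdict(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    FORBID = "forbid"


@dataclass
class PermissionEnvelope:
    autonomous: set[str]  # actions the agent may take unsupervised
    gated: set[str]       # actions that need a human approval

    def check(self, action: str) -> Verdict:
        if action in self.autonomous:
            return Verdict.ALLOW
        if action in self.gated:
            return Verdict.NEEDS_APPROVAL
        return Verdict.FORBID  # default-deny: anything unlisted is forbidden


@dataclass
class AuditTrail:
    entries: list[dict] = field(default_factory=list)

    def record(self, action: str, verdict: Verdict, detail: str = "") -> None:
        self.entries.append({
            "ts": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "verdict": verdict.value,
            "detail": detail,
        })

    def to_jsonl(self) -> str:
        return "\n".join(json.dumps(entry) for entry in self.entries)


def attempt(action: str, envelope: PermissionEnvelope, trail: AuditTrail,
            approved_by: Optional[str] = None) -> bool:
    """Return True if the action may proceed; log the decision either way."""
    verdict = envelope.check(action)
    if verdict is Verdict.NEEDS_APPROVAL and approved_by:
        trail.record(action, verdict, f"approved by {approved_by}")
        return True
    trail.record(action, verdict)
    return verdict is Verdict.ALLOW


envelope = PermissionEnvelope(autonomous={"read_ticket"}, gated={"issue_refund"})
trail = AuditTrail()
results = [
    attempt("read_ticket", envelope, trail),
    attempt("issue_refund", envelope, trail),
    attempt("issue_refund", envelope, trail, approved_by="ops-lead"),
    attempt("delete_account", envelope, trail),
]
print(results)
print(trail.to_jsonl())
```

Refusals are logged alongside permitted actions, so the trail shows what the agent tried to do as well as what it did.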

// strategic implications

What this changes for the business

01 · Value

Agents are priced per outcome

The economic unit shifts from tokens to completed tasks, benchmarked against the fully loaded cost of the process absorbed, error handling included. That math can justify far larger budgets than chat ever did, but it demands honest measurement of completion rates, not demo reels.

02 · Reliability

Compounding errors are the engineering problem

Per-step success compounds brutally across long tasks — production agents are mostly reliability machinery around a reasoning core. Evaluate agents on multi-step completion rates under realistic conditions; single-step accuracy flatters every system.

03 · Management

Agents occupy roles, not features

Permissions, responsibilities, evaluations, and audit trails — the management apparatus of delegation applies in full. Define the role before deploying the agent: what it owns, where humans gate, and who answers for its mistakes.
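The per-outcome framing reduces to a small calculation. The sketch below shows one possible way to compute cost per completed task with error handling included. The function name, the cost model, and every number are hypothetical, chosen only to make the arithmetic concrete.

```python
def cost_per_completed_task(attempts: int, completed: int,
                            cost_per_attempt: float,
                            cost_per_failure: float) -> float:
    """Total spend (runs plus cleanup of failed runs) divided by completed tasks."""
    if completed <= 0:
        raise ValueError("no completed tasks: cost per outcome is undefined")
    failed = attempts - completed
    total = attempts * cost_per_attempt + failed * cost_per_failure
    return total / completed


# Hypothetical figures: 1,000 attempts, 800 completed, 0.50 per run,
# and 12.00 of human time to clean up each failed run.
agent_cost = cost_per_completed_task(1000, 800, 0.50, 12.00)
print(f"agent cost per completed task: {agent_cost:.3f}")
```

Here the run cost alone is 0.50 per attempt, but the cost per completed task comes out several times higher once failures are counted, and that loaded figure is the one to compare against the process being replaced.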

// common misconceptions

What an AI Agent is not

Myth

“An agent is a chatbot with plugins.”

Reality

The loop changes the category: goal pursuit, real actions, compounding consequences, and accountability needs that no response-generator carries. The engineering and governance are different disciplines, not extensions.

Myth

“Better models will make agent harnesses unnecessary.”

Reality

Stronger reasoning raises per-step quality; compounding still demands validation, checkpoints, permissions, and audit. The harness is where reliability and accountability live — model progress shifts its emphasis, not its necessity.

Myth

“Agents either work or they don't.”

Reality

Agent performance is a distribution over tasks and conditions — completion rates, escalation rates, error severities. Deployment readiness is a measured threshold per use case, not a binary the demo settles.
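Treating performance as a distribution means summarizing many runs instead of judging one. The sketch below is a minimal illustration with hypothetical names (`Run`, `summarize`, `ready`), a made-up severity scale, and fabricated sample data used only to exercise the functions. The thresholds stand in for whatever a given use case actually requires.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    outcome: str       # "completed", "escalated", or "failed"
    severity: int = 0  # 0 = no harm; higher is worse (hypothetical scale)


def summarize(runs: list[Run]) -> dict:
    if not runs:
        raise ValueError("need at least one run to summarize")
    counts = Counter(run.outcome for run in runs)
    total = len(runs)
    return {
        "completion_rate": counts["completed"] / total,
        "escalation_rate": counts["escalated"] / total,
        "failure_rate": counts["failed"] / total,
        "max_severity": max(run.severity for run in runs),
    }


def ready(summary: dict, min_completion: float, max_severity: int) -> bool:
    """Per-use-case gate: a measured threshold, not a yes/no from a demo."""
    return (summary["completion_rate"] >= min_completion
            and summary["max_severity"] <= max_severity)


# Illustrative sample only: 7 completed, 2 escalated, 1 failed with severity 2.
runs = [Run("completed")] * 7 + [Run("escalated")] * 2 + [Run("failed", severity=2)]
summary = summarize(runs)
print(summary)
print("low-stakes use case ready:", ready(summary, min_completion=0.6, max_severity=2))
print("high-stakes use case ready:", ready(summary, min_completion=0.9, max_severity=0))
```

The same measured distribution can clear the bar for one use case and fail it for another, because the thresholds belong to the use case, not to the agent.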

// from literacy to leverage

Know the term. Now build the strategy.

Vocabulary is the entry fee. Turning these primitives into pipeline, moats, and margin is the work. That's the conversation.
