An AI agent isn't a chatbot with extra steps. The difference is whether the model writes a plan and revises it as the world responds — or just generates the next message. That single distinction is the whole article.
Concretely: an AI agent is software that uses an LLM to (1) decide what to do next, (2) take that action via a tool — an API, a database query, an external service — and (3) adapt its plan based on the result. It is named after the same concept in classical AI: a goal-directed system that perceives an environment and acts within it.
The phrase got hijacked by marketing in 2025–2026, so the definition above matters. If a product calls itself an "AI agent" but only generates text — without taking real actions — it's a chatbot.

The three parts of a minimum viable agent
- A model. Usually an LLM (Claude, GPT, Gemini, Llama). The model produces structured output that says "call this tool with these arguments" or "the goal is met, stop".
- A tool registry. A list of functions the model is allowed to call (e.g. send_email, lookup_customer, schedule_meeting). Each tool has a name, a description, and a parameter schema.
- A loop. Code that takes the model's output, executes the requested tool call, feeds the result back to the model, and asks "what next?" Until the goal is met or a stop condition fires.
Strip away any of these three and it stops being an agent. A model with no tools is a chatbot. Tools without a loop are an API gateway. A loop with no model is a workflow.
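The three parts fit in a few dozen lines. A minimal sketch of the model, the tool registry, and the loop — `call_model` here is a scripted stub standing in for a real LLM API call, and the tool, registry shape, and message format are illustrative, not any vendor's schema:

```python
def lookup_pricing(tier):
    """Example tool: return pricing for a tier (hardcoded for the sketch)."""
    return {"tier": tier, "price": 299, "billing": "monthly"}

# The tool registry: name, description, parameter schema, implementation.
TOOL_REGISTRY = {
    "lookup_pricing": {
        "fn": lookup_pricing,
        "description": "Look up pricing for a product tier",
        "parameters": {"tier": "string"},
    },
}

def call_model(history):
    # Stub model: asks for pricing once, then declares the goal met.
    # A real agent would send `history` to an LLM API here.
    if not any(m["role"] == "tool" for m in history):
        return {"action": "call_tool", "tool": "lookup_pricing",
                "args": {"tier": "Pro"}}
    return {"action": "stop", "reason": "goal met"}

def run_agent(goal, max_steps=10):
    history = [{"role": "user", "content": goal}]
    for _ in range(max_steps):            # stop condition: step budget
        decision = call_model(history)
        if decision["action"] == "stop":  # model says the goal is met
            return history
        tool = TOOL_REGISTRY[decision["tool"]]["fn"]
        result = tool(**decision["args"])
        history.append({"role": "tool", "content": result})  # feed result back
    return history

history = run_agent("What does the Pro tier cost?")
```

Swap the stub for a real model call and real tools, and this is the whole skeleton — everything else in production agents is guardrails around this loop.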
Real-world AI agents we build
Four canonical agent patterns we ship for clients. All share the same shape: narrow scope, 5–15 tools each, a typical loop of 3–8 steps, and a human in the loop on irreversible actions.
| Agent | What it does | Tools | Typical loop |
|---|---|---|---|
| AI Virtual Assistant | Reads inbound email, drafts replies, schedules meetings, routes urgent messages | Gmail/Outlook · Calendar · CRM · Slack | 4–6 steps |
| AI Receptionist (voice) | Answers inbound calls, qualifies, books meetings, sends confirmation | Vapi/Retell · Calendar · CRM | 3–5 steps |
| AI SDR | Researches leads, drafts personalized outreach, runs follow-up cadence, hands warm leads to humans | Enrichment API · Email · Sequencer · LinkedIn | 5–8 steps |
| AI Customer Support | Reads tickets, deflects via KB, routes to right human queue with summary | Helpdesk · Vector search · CRM | 3–6 steps |
How an agent actually runs — a concrete trace
Goal: "Reply to this customer email asking about pricing for the Pro tier."
- Agent: I need the pricing for the Pro tier. → Calls lookup_pricing(tier="Pro").
- Tool returns: { price: 299, billing: "monthly", features: [...] }.
- Agent: I should also check if this customer has any active discounts. → Calls get_customer(email="...").
- Tool returns: { tier: "Free", discount_eligible: true, signup_date: "2025-11-12" }.
- Agent: Customer is on Free tier and discount-eligible. I'll mention the standard Pro price and offer the upgrade discount. → Calls draft_reply(...).
- Tool returns the drafted email. Agent stops, hands off to a human for review.
That's a 3-tool, 5-step agent. Most production agents look exactly like this — short, focused, observable.
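Under the hood, that trace is nothing more than a message list that grows with each turn and is fed back to the model in full. A sketch of the state on the final turn — the field names are illustrative, not any particular vendor's API schema:

```python
# Conversation state the model sees on its last turn of the pricing trace.
trace = [
    {"role": "user", "content": "Reply to this customer email asking about "
                                "pricing for the Pro tier."},
    {"role": "assistant", "tool_call": {"name": "lookup_pricing",
                                        "args": {"tier": "Pro"}}},
    {"role": "tool", "name": "lookup_pricing",
     "content": {"price": 299, "billing": "monthly"}},
    {"role": "assistant", "tool_call": {"name": "get_customer",
                                        "args": {"email": "..."}}},
    {"role": "tool", "name": "get_customer",
     "content": {"tier": "Free", "discount_eligible": True,
                 "signup_date": "2025-11-12"}},
    {"role": "assistant", "tool_call": {"name": "draft_reply",
                                        "args": {"mention_discount": True}}},
]

# Every tool the agent chose to use, recoverable from the trace itself:
tools_used = {m["tool_call"]["name"] for m in trace if "tool_call" in m}
```

This is also why short, focused agents are observable: the full decision history is one small list you can log, replay, and audit.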
What separates real agents from "wrapper" products
Most consumer products labeled "AI agents" in 2026 are one of:
- A chatbot with a single function call (e.g. "ask the AI to schedule a meeting" — the model just calls one calendar API once and stops).
- A workflow that prompts an LLM at one step (e.g. an email tool that uses GPT to draft, then sends — no looping).
- A persona wrapper around a chat UI ("our AI marketing director") with no tool integration at all.
You can spot a real agent by asking: does it loop? Does it correct itself when a tool returns an error? Can it use information from one tool to decide which tool to call next? If the answer is no, it's automation dressed up in agent marketing, not an agent.
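Self-correction, in practice, is mostly plumbing: catch the tool failure and return it as an observation instead of crashing, so the model can see the error and pick a different action. A minimal sketch — the wrapper, the flaky tool, and the retry are all illustrative assumptions, not a specific framework's API:

```python
def execute_tool(registry, name, args):
    """Run a tool; on failure, return the error as data the model can read."""
    try:
        return {"ok": True, "result": registry[name](**args)}
    except Exception as exc:  # surface the failure instead of crashing the loop
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}

def flaky_lookup(tier):
    if tier == "Prro":        # simulate the model passing a bad argument
        raise KeyError("unknown tier 'Prro'")
    return {"price": 299}

registry = {"lookup_pricing": flaky_lookup}

# First attempt fails; the error string goes back into the context.
first = execute_tool(registry, "lookup_pricing", {"tier": "Prro"})
# A real model, seeing the error, would retry with corrected arguments:
second = execute_tool(registry, "lookup_pricing", {"tier": "Pro"})
```

The key design choice: a tool error is just another observation. Products that crash, or silently return nothing, on the first bad tool call fail the "does it correct itself?" test.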
Frameworks for building agents (mid-2026)
- Claude Agent SDK / OpenAI Assistants API — managed, opinionated, fast to ship.
- LangGraph — explicit state graphs, best for multi-step agents with branching.
- CrewAI — multi-agent teams with role-based coordination.
- n8n + LLM nodes — when you want a workflow-bound agent with full visual control.
- Custom (just call the API in a loop) — surprisingly often the right answer for production.
How to evaluate an AI agent
Demos are theater. Real evaluation looks like:
- Define the task and a success criterion (booked meetings, resolved tickets, qualified leads).
- Run the agent on a held-out set of 50–200 real cases.
- Score each run: success / partial / failure.
- Look at the failures specifically. Are they "tool returned weird data" failures or "agent picked the wrong action" failures? The fix is different.
- Track success rate over time as you tune prompts, tools, and the loop.
A production-ready agent for a narrow task usually clears 85%+ success rate. Anything below 70% means the scope is too wide or the tools are too loose.
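The evaluation loop above is small enough to sketch. The `evaluate` harness and case format here are illustrative, not a specific eval framework; each held-out case carries its own success check, and failures are kept aside for the hand inspection step:

```python
from collections import Counter

def evaluate(agent_fn, cases):
    """Score an agent over held-out cases: success / partial / failure."""
    scores = Counter()
    failures = []
    for case in cases:
        outcome = case["check"](agent_fn(case["input"]))
        scores[outcome] += 1
        if outcome == "failure":
            failures.append(case["input"])  # inspect these by hand
    success_rate = scores["success"] / sum(scores.values())
    return success_rate, scores, failures

# Toy agent and cases so the harness runs end-to-end:
toy_agent = lambda x: x * 2
cases = [
    {"input": 1, "check": lambda out: "success" if out == 2 else "failure"},
    {"input": 2, "check": lambda out: "success" if out == 4 else "failure"},
    {"input": 3, "check": lambda out: "success" if out == 7 else "failure"},
]
rate, scores, failures = evaluate(toy_agent, cases)
```

Rerun the same harness after every prompt or tool change and the success-rate trend over time falls out for free.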
What AI agents cannot do (in 2026)
- Open-ended creative or strategic work. Models still can't replace good judgment about what to build or who to hire.
- Tasks requiring real-world physical action without robotics integration.
- Anything where the cost of a wrong action is much higher than the value of a right action (legal filings, large financial transactions, irreversible code deploys without review).
- Long-horizon planning over weeks of work without checkpoints.
Within those constraints, the operational ceiling is high. Most service businesses have 5–15 specific workflows where an agent can cut the time spent by 50–90%.