What is an AI agent?
An AI agent is a program built around a large language model (LLM) that can take actions, not just generate text. A regular chatbot receives a prompt and returns a response. An agent receives a goal, decides what steps are needed, and executes those steps by calling external tools, reading data, or modifying systems.
The key difference: a chatbot answers questions. An agent does work.
Tool calling
Agents interact with the outside world through tool calls. A tool is any function the agent can invoke: a web search, a database query, a code interpreter, an API request, a file write.
The LLM does not execute the tool directly. Instead, it outputs a structured request (typically JSON) describing which tool to call and with what arguments. The host program executes the tool and feeds the result back to the LLM, which then decides what to do next.
Example flow:
- Agent decides it needs to look up a user's order status
- Agent outputs: {"tool": "get_order", "args": {"order_id": "12345"}}
- Host program runs the function and returns the result
- Agent reads the result and formulates a response or takes another action
Tools are defined ahead of time. The agent picks from a fixed set. It cannot invent new tools on the fly.
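The host-side half of this exchange can be sketched in a few lines of Python. The tool name and lookup logic here are hypothetical stand-ins; real frameworks wrap this pattern in more machinery, but the shape is the same:

```python
import json

# Registry of tools the agent may call. Defined ahead of time;
# the model can only pick from this fixed set.
def get_order(order_id: str) -> dict:
    # Hypothetical lookup; a real implementation would query a database.
    return {"order_id": order_id, "status": "shipped"}

TOOLS = {"get_order": get_order}

def dispatch(tool_request: str) -> str:
    """Parse the model's structured request, run the tool, return the result."""
    request = json.loads(tool_request)
    tool = TOOLS[request["tool"]]   # KeyError if the model invents a tool
    result = tool(**request["args"])
    return json.dumps(result)       # this string is fed back into the model's context

reply = dispatch('{"tool": "get_order", "args": {"order_id": "12345"}}')
```

Note that the model never touches `get_order` directly; it only emits the JSON request, and the host decides whether and how to run it.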
The agent loop
The core execution pattern of an agent is a loop:
1. Prompt -- The agent receives a goal or instruction
2. Think -- The LLM reasons about what to do next
3. Act -- The agent calls a tool or produces output
4. Observe -- The agent reads the result of its action
5. Repeat -- Back to step 2 until the goal is met or the agent decides to stop
This loop runs until the agent reaches a terminal condition: the task is done, a maximum number of steps is hit, or an error stops execution. Without a termination condition, agents can loop indefinitely.
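The loop and its termination conditions can be sketched as follows. `llm` and `execute_tool` are stand-ins for a real model client and tool runner, and the message format is illustrative:

```python
MAX_STEPS = 10  # termination guard: without this the loop can run forever

def run_agent(goal: str, llm, execute_tool) -> str:
    """Minimal agent loop: think, act, observe, repeat."""
    history = [{"role": "user", "content": goal}]
    for _ in range(MAX_STEPS):
        decision = llm(history)                 # Think
        if decision["type"] == "final":         # terminal condition: task done
            return decision["content"]
        result = execute_tool(decision)         # Act
        history.append({"role": "tool", "content": result})  # Observe
    return "stopped: step budget exhausted"     # terminal condition: max steps hit

# Stand-in model that makes one tool call and then finishes (demo behavior only).
def demo_llm(history):
    if any(m["role"] == "tool" for m in history):
        return {"type": "final", "content": "goal met"}
    return {"type": "tool", "tool": "noop", "args": {}}

answer = run_agent("check status", demo_llm, lambda decision: "ok")
```

The step budget is the simplest guardrail against infinite loops; production systems usually also cap tokens and wall-clock time.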
Multi-agent systems
A multi-agent system uses more than one agent to complete a task. Each agent may have a different role, set of tools, or area of expertise.
Examples:
- A research agent gathers information, then hands it to a writing agent that drafts a report
- A planning agent breaks a task into subtasks and delegates each to a worker agent
- Multiple agents work in parallel on independent parts of a problem, then a synthesis agent combines the results
Agents in a multi-agent system communicate by passing messages or sharing a common workspace. The main benefit is specialization: each agent can have a focused set of tools and instructions, which tends to produce better results than one agent trying to do everything.
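The research-then-write handoff above reduces to function composition. In this sketch each "agent" is a plain function; in a real system each would be its own agent loop with its own tools and prompt:

```python
# Two specialized agents passing a message: a hedged sketch, not a framework API.
def research_agent(topic: str) -> str:
    # Would run an agent loop with search tools; here it returns placeholder findings.
    return f"findings about {topic}"

def writing_agent(findings: str) -> str:
    # Would run an agent loop with a drafting prompt; here it wraps the findings.
    return f"Report: {findings}"

# Sequential handoff: the first agent's output is the second agent's input.
report = writing_agent(research_agent("context windows"))
```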
MCP (Model Context Protocol)
MCP is a standard protocol for connecting AI agents to external tools and data sources. It was created by Anthropic and is an open specification.
Without MCP, every agent framework defines its own way of describing tools, passing arguments, and returning results. MCP provides a common interface so that a tool built once can be used by any agent that supports the protocol.
MCP defines three core primitives:
- Tools -- Functions the agent can call (e.g., search a database, send an email)
- Resources -- Data the agent can read (e.g., files, documentation, API responses)
- Prompts -- Reusable prompt templates the server can expose to the agent
An MCP server exposes tools and resources. An MCP client (the agent) connects to one or more servers and discovers what is available. This is similar to how a web browser connects to any web server using HTTP.
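MCP messages travel as JSON-RPC 2.0. The sketch below illustrates the rough shape of a tool-discovery exchange; the field set is abbreviated and should not be treated as a complete rendering of the specification:

```python
import json

# Simplified illustration of an MCP-style discovery exchange (JSON-RPC 2.0).
list_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# A server might answer with the tools it exposes, each described by a name,
# a human-readable description, and a JSON Schema for its arguments.
list_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [{
            "name": "get_order",
            "description": "Look up an order by id",
            "inputSchema": {
                "type": "object",
                "properties": {"order_id": {"type": "string"}},
            },
        }]
    },
}

wire = json.dumps(list_request)  # what actually crosses the client/server boundary
```

The point of the schema is that any compliant client can discover and call `get_order` without code written specifically for that server.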
Context windows as a constraint
Every LLM has a context window: a maximum amount of text it can process at once. This includes the system prompt, the conversation history, tool results, and the agent's own reasoning.
When an agent runs a long task, the context window fills up. Once full, older information gets dropped. The agent effectively forgets what it did earlier.
Strategies for managing this:
- Summarization -- Periodically compress the conversation history into a shorter summary
- Retrieval -- Store information externally and fetch only what is relevant for the current step
- Scratchpads -- Write intermediate results to a file or database instead of keeping them in context
- Windowing -- Keep only the most recent N messages and a summary of everything before
Context management is one of the hardest practical problems in building agents. An agent that forgets a constraint mentioned 50 messages ago will violate that constraint.
Orchestration
Orchestration is the layer that controls what agents do and in what order. In a single-agent system, orchestration is the agent loop itself. In a multi-agent system, orchestration decides:
- Which agent runs next
- What information each agent receives
- When to hand off between agents
- When the overall task is complete
Orchestration can be explicit (a fixed pipeline where agent A always runs before agent B) or dynamic (an orchestrator agent that decides at runtime which agent to call). Common patterns include sequential chains, parallel fan-out/fan-in, and hierarchical delegation.
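Two of these patterns, the sequential chain and parallel fan-out/fan-in, can be sketched with agents as stand-in callables:

```python
from concurrent.futures import ThreadPoolExecutor

def sequential(agents, task):
    """Fixed pipeline: each agent's output feeds the next."""
    for agent in agents:
        task = agent(task)
    return task

def fan_out_fan_in(workers, synthesizer, subtasks):
    """Parallel fan-out over independent subtasks, then a synthesis step."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda pair: pair[0](pair[1]),
                                zip(workers, subtasks)))
    return synthesizer(results)

# Demo agents (trivial string transforms standing in for real agent loops).
upper = lambda s: s.upper()
excl = lambda s: s + "!"

chained = sequential([upper, excl], "go")
combined = fan_out_fan_in([upper, upper], " ".join, ["a", "b"])
```

A dynamic orchestrator replaces the fixed `agents` list with an LLM call that chooses the next agent at runtime; the control flow is otherwise the same.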
Agentic drift
When multiple agents work in parallel, they can make conflicting decisions. This is called agentic drift (sometimes called divergence).
Example: two agents are both editing the same codebase. Agent A refactors a function. Agent B, working from the original code, writes new code that calls the old version of that function. When their work is merged, things break.
Drift happens because parallel agents do not share real-time state. Each operates on its own snapshot of the world. Mitigations include locking shared resources, frequent synchronization points, and having a reviewer agent check for conflicts before merging results.
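The synchronization-point mitigation is essentially optimistic concurrency control. In this sketch, each agent records the version of the shared state it read, and a merge from a stale snapshot is rejected:

```python
# Optimistic-concurrency sketch: a merge is rejected if the shared state
# changed underneath the agent that produced it.
class SharedState:
    def __init__(self, content: str):
        self.content = content
        self.version = 0

    def read(self):
        return self.content, self.version

    def merge(self, new_content: str, based_on_version: int) -> bool:
        if based_on_version != self.version:
            return False          # stale snapshot: caller must re-read and retry
        self.content = new_content
        self.version += 1
        return True

state = SharedState("original function")
snapshot, ver = state.read()       # both agents read the same version
state.merge("agent A's refactor", ver)       # succeeds, bumps version to 1
ok = state.merge("agent B's change", ver)    # fails: B worked from a stale snapshot
```

In the codebase example above, this is what forces agent B to re-read agent A's refactor instead of silently calling the old function.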
Guardrails
Guardrails are constraints that prevent agents from doing harmful, unauthorized, or unintended things. Without guardrails, an agent with access to a production database could delete data, an agent with email access could send messages to the wrong people, and an agent with code execution could run destructive commands.
Types of guardrails:
- Input validation -- Reject or sanitize prompts that attempt to override the agent's instructions (prompt injection defense)
- Output filtering -- Check the agent's responses for harmful content, PII leakage, or policy violations before delivering them
- Tool restrictions -- Limit which tools an agent can call, or require human approval for high-risk actions
- Budget limits -- Cap the number of steps, API calls, or tokens an agent can consume
- Sandboxing -- Run code execution tools in isolated environments with no network access or filesystem permissions
Guardrails are not optional. They are a required part of any production agent system.
Observability
Observability means being able to see what an agent is doing and why. Since agents make autonomous decisions, you need logs and traces to understand their behavior after the fact.
Key things to observe:
- Trace of actions -- Every tool call, its arguments, and its result
- Reasoning -- The LLM's chain-of-thought at each step (if available)
- Token usage -- How much context is being consumed and how much each step costs
- Latency -- How long each step takes
- Errors -- Failed tool calls, timeouts, rate limits
- Drift detection -- Whether the agent is staying on task or going off-track
Without observability, debugging an agent that produces wrong results is nearly impossible. You cannot fix what you cannot see.
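A minimal trace can be built by wrapping every tool call, capturing its arguments, outcome, and latency. This is a sketch of the idea, not any particular tracing library's API:

```python
import time

# Trace sketch: record every tool call with arguments, result, and latency.
TRACE = []

def traced(tool_name, fn, **args):
    start = time.monotonic()
    try:
        result = fn(**args)
        status = "ok"
    except Exception as exc:
        result, status = str(exc), "error"   # failed calls are logged, not lost
    TRACE.append({
        "tool": tool_name,
        "args": args,
        "status": status,
        "latency_s": round(time.monotonic() - start, 3),
        "result": result,
    })
    return result

traced("get_order", lambda order_id: {"status": "shipped"}, order_id="12345")
```

After a run, `TRACE` is the trace of actions described above; production systems ship the same records to a logging or tracing backend instead of a list.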
Durable execution
Agents can run for minutes or hours. Servers crash. Networks drop. If an agent loses its state mid-task, it has to start over unless the system supports durable execution.
Durable execution means persisting the agent's state (conversation history, tool results, current step) so that it can resume from where it left off after a crash or restart. This is the same concept as durable workflows in backend engineering (think Temporal, AWS Step Functions, or Vercel Workflow).
Key requirements:
- Checkpoint the agent's state after each step
- Store state in a persistent backend (database, object storage)
- On restart, reload state and continue from the last checkpoint
- Handle idempotency: if a tool call was made but the result was not recorded, decide whether to retry or skip
Without durable execution, long-running agents are fragile. Any interruption means lost work.
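The checkpoint-and-resume cycle can be sketched with a JSON file as the persistent backend. A real system would use a database or object store and handle the idempotency question noted above; the file path here is illustrative:

```python
import json
import os
import tempfile

# Checkpointing sketch: persist agent state after each step, reload on restart.
def save_checkpoint(path, state):
    with open(path, "w") as f:
        json.dump(state, f)

def load_checkpoint(path):
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)        # resume from the last checkpoint
    return {"step": 0, "history": []}  # no checkpoint: fresh start

path = os.path.join(tempfile.gettempdir(), "agent_checkpoint.json")
state = load_checkpoint(path)
state["step"] += 1
state["history"].append("called get_order")  # record the completed step
save_checkpoint(path, state)

resumed = load_checkpoint(path)  # what a restarted process would see
```

Because the checkpoint is written after each step, a crash loses at most the step in flight, and that is exactly where the retry-or-skip idempotency decision applies.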
Chatbot vs. copilot vs. agent
These three terms describe different levels of autonomy:
Chatbot -- Responds to messages. Has no tools. Cannot take actions. It generates text based on a prompt and conversation history. Examples: a customer support bot that answers FAQs, a basic ChatGPT conversation.
Copilot -- Suggests actions but requires human approval. Has access to tools and context (your code, your documents, your email) but operates in an advisory role. The human decides what to accept. Examples: GitHub Copilot suggesting code completions, an AI assistant that drafts emails for you to review and send.
Agent -- Acts autonomously toward a goal. Has tools, makes decisions, and executes actions without asking for permission at every step. The human sets the goal and constraints, then the agent works independently. Examples: an agent that researches a topic and writes a report, an agent that triages and responds to support tickets.
The boundaries are not sharp. Many real systems blend these modes: an agent that acts autonomously on low-risk tasks but escalates to a human for high-risk decisions.
Current limitations
AI agents are useful but far from reliable. Key limitations as of early 2025:
- Hallucination -- Agents still fabricate facts, invent tool arguments that do not exist, and confidently produce wrong answers. This is an inherent property of LLMs, not a bug that will be patched soon.
- Planning failures -- Agents struggle with tasks that require long-horizon planning. They can miss steps, go in circles, or pursue dead-end strategies.
- Fragile tool use -- Small changes in tool descriptions or argument formats can cause agents to misuse tools or fail to call them at all.
- Cost -- Each step in the agent loop costs tokens. Complex tasks with many steps can get expensive fast.
- Latency -- Each LLM call takes time. An agent that needs 20 steps to complete a task means 20 round trips to the model.
- Security -- Agents that process untrusted input are vulnerable to prompt injection, where adversarial text in the input hijacks the agent's behavior.
- Human oversight is still required -- For any task where errors have real consequences (financial transactions, medical advice, legal documents, production deployments), a human needs to review the agent's work before it takes effect.
Agents are a tool for augmenting human work, not replacing human judgment. Build accordingly.
