What is an AI Agent, Really?

If you've used ChatGPT, Claude, or Gemini, you've interacted with what's called a "language model." These are prediction engines. Type something, they predict what words should come next, and you get a response.

But there's a new category of AI system showing up everywhere: agents.

The Shift from Answers to Actions

A language model working alone is like a knowledgeable coworker who starts fresh every day. They don't remember what you talked about yesterday, can't look at your calendar, and can't send an email on your behalf. Every time you start a new chat, it's like meeting them for the first time.

An agent is what happens when you wrap that language model with systems that give it memory, tools, and workflows.

What Actually Makes an Agent

Not every AI tool that calls itself an "agent" actually is one. Here are the five capabilities that separate real agents from fancy chatbots:

1. It gathers its own context

When you ask a chatbot to "prep me for my 2pm meeting," it gives you generic advice because it has no access to your actual situation. You'd have to paste your calendar, summarize past emails, and explain who these people are just to get a useful answer.

An agent finds what's relevant itself. It checks your calendar to see the meeting, reads related emails, reviews past notes about these attendees, and even checks for any instructions you've given it about how you like to prepare. You don't feed it context—it retrieves what it needs based on your request.

This is how agents gradually "read your mind." Like a real assistant who knows your preferences and can apply judgment about what matters, an agent learns what you typically need for different situations and brings that forward without you having to spell it out every time.
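To make the idea concrete, here is a minimal sketch of context gathering. Everything in it is an assumption for illustration: the in-memory CALENDAR, EMAILS, NOTES, and INSTRUCTIONS stand in for real systems, and gather_context is a hypothetical helper, not any product's API. A real agent would make these lookups through tools and let the model decide which ones matter.

```python
# Hypothetical in-memory stand-ins for a calendar, an inbox, saved notes,
# and standing instructions. A real agent would query live systems instead.
CALENDAR = [
    {"time": "14:00", "title": "Roadmap review", "attendees": ["Dana", "Lee"]},
]
EMAILS = [
    {"sender": "Dana", "subject": "Roadmap draft", "body": "Draft attached."},
    {"sender": "Sam", "subject": "Lunch?", "body": "Friday?"},
]
NOTES = {"Dana": "Prefers decisions in writing.", "Lee": "New to the team."}
INSTRUCTIONS = ["For meeting prep, list open questions first."]


def gather_context(meeting_time: str) -> dict:
    """Collect what is relevant to the meeting at `meeting_time`."""
    meeting = next((m for m in CALENDAR if m["time"] == meeting_time), None)
    if meeting is None:
        return {}
    attendees = set(meeting["attendees"])
    return {
        "meeting": meeting,
        # Only emails from people who will be in the room.
        "emails": [e for e in EMAILS if e["sender"] in attendees],
        # Only notes about those same people.
        "notes": {name: NOTES[name] for name in meeting["attendees"] if name in NOTES},
        "instructions": INSTRUCTIONS,
    }


context = gather_context("14:00")
print(context["meeting"]["title"])
print([e["subject"] for e in context["emails"]])
```

The point of the sketch is the direction of flow: the request ("my 2pm meeting") drives the lookups, so the lunch email from Sam never reaches the model.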

2. It takes action instead of just giving advice

Paste your calendar into a chat and a language model can tell you the day looks busy, but it can't actually do anything about it. The tool it would need, the ability to send a reschedule email, isn't available to it.

An agent has access to both kinds of tools: ones that let it gather information (checking calendars, reading files) and ones that let it take action (sending emails, updating documents, creating calendar events). When it sees a conflict, it can suggest a new time and, with your approval, actually send the reschedule email itself.

This is the difference between having visibility into your systems and having the ability to change them.
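One way to picture the two kinds of tools is a small registry that marks which tools change something and asks for approval before running those. This is a sketch under assumptions: check_calendar, send_email, TOOLS, and call_tool are hypothetical names, and the "outbox" is just a list.

```python
from typing import Callable

sent_emails: list[str] = []  # stand-in for a real outbox


def check_calendar() -> list[str]:
    """Read-only tool: gathers information, changes nothing."""
    return ["13:30 Design sync", "14:00 Roadmap review"]


def send_email(body: str) -> str:
    """Action tool: has a side effect in the outside world."""
    sent_emails.append(body)
    return "sent"


# Tool name -> (function, whether it changes something).
TOOLS = {
    "check_calendar": (check_calendar, False),
    "send_email": (send_email, True),
}


def call_tool(name: str, *args, approve: Callable[[str], bool]):
    """Run a tool, asking for approval first if it takes an action."""
    func, changes_something = TOOLS[name]
    if changes_something and not approve(name):
        return "blocked: approval required"
    return func(*args)


# Reading never needs approval; sending does.
print(call_tool("check_calendar", approve=lambda name: False))
print(call_tool("send_email", "Can we move to 3pm?", approve=lambda name: False))
print(call_tool("send_email", "Can we move to 3pm?", approve=lambda name: True))
```

Here the approval callback is a lambda; in practice it would be whatever prompt or button puts the decision in front of you.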

3. It remembers

Language models start every conversation from zero. They don't know who you are or what you prefer.

Gathering context (from section 1) is about going out and finding what's relevant right now—checking your calendar, reading recent emails. Memory is different: it's what the agent already knows about you from past conversations.

Agents maintain memory in two ways:

Session memory: In a long back-and-forth, the agent maintains a continuous record of what you've discussed so far. If you're working on a document together and you mention in message 3 that the tone should be "casual but professional," the agent still knows that in message 23 when you're reviewing the final draft. You don't have to remind it of things you already said.
Persistent memory: Between conversations, the agent stores what it's learned about you—how you like your emails written, what "the usual" means for you, which projects you're focused on. This carries over from one session to the next, so each time you return, it already knows your preferences.
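The two kinds of memory can be sketched in a few lines. This is an illustration under assumptions, not how any particular agent stores things: AgentMemory is a hypothetical class, session memory is a plain list, and persistent memory is a small JSON file in a temporary folder.

```python
import json
import tempfile
from pathlib import Path


class AgentMemory:
    def __init__(self, store_path: Path):
        self.store_path = store_path
        self.session: list[str] = []           # session memory: this conversation only
        self.preferences: dict[str, str] = {}  # persistent memory: survives restarts
        if store_path.exists():
            self.preferences = json.loads(store_path.read_text())

    def hear(self, message: str) -> None:
        """Record a message for the rest of this conversation."""
        self.session.append(message)

    def learn(self, key: str, value: str) -> None:
        """Store a preference and write it to disk for future conversations."""
        self.preferences[key] = value
        self.store_path.write_text(json.dumps(self.preferences))


store = Path(tempfile.mkdtemp()) / "preferences.json"

first = AgentMemory(store)
first.hear("The tone should be casual but professional.")
first.learn("email_tone", "casual but professional")

second = AgentMemory(store)  # a new conversation, started later
print(second.session)        # []
print(second.preferences)    # {'email_tone': 'casual but professional'}
```

The second conversation starts with an empty session but already knows the preferred tone, which is the split the two definitions above describe.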

Memory is selective in a useful way. An agent doesn't store every word you've ever said to it. Instead, it extracts patterns: what you typically want in certain situations, what approaches have worked with you before, what you usually correct or reject. This is why working with an agent over time feels different from working with one that's fresh out of the box. The experienced one has developed instincts about how you think.

4. It manages its own attention

Language models have a limited window of text they can consider at once. In a long working session that window fills up fast.

You've probably experienced this yourself. In a long conversation with ChatGPT or Claude, you may notice the quality drops after a while. It starts contradicting itself, forgetting details from earlier in the chat, or losing the thread of what you were working on. That's the window filling up.

A simple system keeps stuffing in new information until it forgets where it started. An agent compresses: keeping recent details crisp, summarizing older information, and dropping redundant data. This is often the difference between an agent that stays coherent over a long project and one that quietly loses track.
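Here is a rough sketch of that compression step. It is an assumption-heavy toy: compress and summarize are hypothetical helpers, and the summarizer just truncates and joins text, where a real agent would ask the model itself to write the summary.

```python
def summarize(messages: list[str]) -> str:
    """Placeholder summarizer; a real agent would ask the model to do this."""
    return "Summary of earlier conversation: " + "; ".join(m[:30] for m in messages)


def compress(history: list[str], keep_recent: int = 3) -> list[str]:
    """Keep recent messages verbatim, summarize older ones, drop exact repeats."""
    # Drop exact duplicates, keeping the first occurrence and the original order.
    deduped = list(dict.fromkeys(history))
    if len(deduped) <= keep_recent:
        return deduped
    split = len(deduped) - keep_recent
    older, recent = deduped[:split], deduped[split:]
    return [summarize(older)] + recent


history = [
    "Tone: casual but professional",
    "Audience: the sales team",
    "Tone: casual but professional",  # repeated, so it gets dropped
    "Draft v1 is ready",
    "Shorten the intro",
    "Draft v2 is ready",
]
for line in compress(history):
    print(line)
```

Six messages go in and four lines come out: one summary of the older material plus the three most recent messages untouched.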

5. It keeps working until the job is done

A chatbot responds once and waits for you. An agent runs in a loop: observe the situation, decide what to do, take action, check the result, and repeat if needed.

If the first web search doesn't find what it needs, it refines the query and tries again. If a drafted email doesn't match your usual tone, it revises it.
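The loop itself is short when sketched in code. The version below is a toy under stated assumptions: FAKE_INDEX stands in for a search engine, and the "decide" step is reduced to appending the next hint to the query, where a real agent would have the model choose the next move.

```python
from typing import Optional

# Hypothetical search backend: only one exact query returns a result.
FAKE_INDEX = {"python csv module tutorial": "Found: a tutorial on reading CSV files."}


def search(query: str) -> Optional[str]:
    return FAKE_INDEX.get(query)


def run_agent(query: str, hints: list[str], max_steps: int = 5) -> tuple[Optional[str], int]:
    """Search, check the result, refine the query, and repeat until done."""
    hints = list(hints)  # copy so the caller's list isn't modified
    for attempt in range(1, max_steps + 1):
        result = search(query)             # take action
        if result is not None:             # check the result
            return result, attempt
        if not hints:                      # nothing left to try, so stop
            return None, attempt
        query = f"{query} {hints.pop(0)}"  # decide: refine the query, then repeat
    return None, max_steps


result, attempts = run_agent("python csv", hints=["module", "tutorial"])
print(result, attempts)
```

Note the max_steps cap: a loop that keeps working until the job is done also needs a rule for when to give up.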

Agents Working Together

So far we've talked about a single agent helping you. But agents can also work with each other—and even create new agents—to handle complex tasks.

This changes how you think about getting work done with AI. I no longer just ask Claude to draft an email. Instead, I tell it to follow a workflow that spawns multiple agents: one focused on technical accuracy, one on tone and clarity, and one checking that it matches my goals for the conversation. They review each other's work, argue about what matters, and produce something better than any single agent would have written alone. (You can try this workflow already built into bymorning. I wrote more about how it works in Editing AI Writing.)
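A stripped-down sketch of the reviewer idea looks like this. It is not how bymorning or any other product works: the three reviewers are hypothetical functions with hard-coded rules standing in for separate model calls, and they only review the draft independently rather than debating each other.

```python
from concurrent.futures import ThreadPoolExecutor


def accuracy_reviewer(draft: str) -> str:
    return "accuracy: add a source for the 40% figure" if "40%" in draft else "accuracy: ok"


def tone_reviewer(draft: str) -> str:
    return "tone: drop the exclamation marks" if "!" in draft else "tone: ok"


def goal_reviewer(draft: str) -> str:
    return "goal: ok" if "meeting" in draft.lower() else "goal: ask for the meeting"


def review_in_parallel(draft: str, reviewers) -> list[str]:
    """Run every reviewer on the same draft at once; results keep reviewer order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda reviewer: reviewer(draft), reviewers))


draft = "Great news! Signups grew 40%. Can we set up a meeting?"
feedback = review_in_parallel(draft, [accuracy_reviewer, tone_reviewer, goal_reviewer])
print(feedback)
```

Each reviewer looks at one thing only, which is what keeps its feedback focused.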

Agents can also chain together. The first agent researches a topic and writes a brief. A second agent reads the brief and generates a presentation. A third checks the presentation against your brand guidelines. Each does what it's good at, passing work to the next.
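Chaining can be sketched as a list of stages where each one receives the previous stage's output. The agents below are hypothetical placeholder functions that only format strings; in a real setup each would be its own model call with its own instructions.

```python
def research_agent(topic: str) -> str:
    """Stage 1: turn a topic into a brief."""
    return f"Brief on {topic}: three key points."


def slides_agent(brief: str) -> str:
    """Stage 2: turn the brief into a presentation."""
    return f"Slides based on [{brief}]"


def brand_agent(slides: str) -> str:
    """Stage 3: check the slides against a (made-up) brand rule."""
    verdict = "failed" if "synergy" in slides.lower() else "passed"
    return f"{slides} (brand check: {verdict})"


def run_pipeline(topic: str, stages) -> str:
    """Pass the work through each stage in order."""
    work = topic
    for stage in stages:
        work = stage(work)
    return work


final = run_pipeline("remote onboarding", [research_agent, slides_agent, brand_agent])
print(final)
```

Because every stage has the same shape (text in, text out), you can reorder, swap, or add stages without touching the others.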

Here's the counterintuitive part: this approach often beats using the "best" model. Turning up the reasoning or giving a single agent more time to think doesn't always help. In fact, a slow agent with too much time can talk itself into worse answers—overcomplicating simple problems, second-guessing good decisions, or chasing irrelevant tangents.

A swarm of fast, focused agents running in parallel often produces better results than one slow "smarter" agent trying to do everything itself.