Reference

The Components of an AI Agent

Most people who use AI day-to-day are using it through a chat window. You type, it replies. That interaction is useful, but it's also the least powerful shape that modern models can take. To do useful work, models need tools. Systems where models can use tools are called agents.

This article is a practical reference to AI agents. I'm writing it for people using AI day-to-day for their work but aren't software developers or hardcore AI enthusiasts. AI is powerful because it allows people to automate their work, and even some of their judgment, without the rigidity and expertise of programming. I'm writing this guide because using AI still requires some amount of skill, and there are very few plain-language explanations of how agents work so that you can use them effectively.

An LLM vs. an Agent

A Large Lgnguage Model predicts what text should come after other text, and it is extremely good at it. You give it text, it gives you back text. It is good at giving plausible sounding answers, but it doesn't know what day it is, it can't open your inbox, can't save a note for next week, and can't do anything except produce the next stretch of words. An advanced version of LLMs are reasoning models, which have been trained to spend more compute creating intermediate text to simulate thinking, verification, and self-correction before it answers. This is what ChatGPT and Claude are, and why they have gotten a lot better over the last few years.

An agent isn't an LLM. An agent takes an LLM, like ChatGPT, then wraps other software around it to make it even smarter. Given some goal, the agent runs a model (or several) to decide what to look at, what tools to call, how to update its notes, and when it's done. The model is just one part of the agent. It generates text and commands, but the agent is the whole system that does the work.

The main components of an agent are:

  • A system prompt that tells it who it is and how to behave
  • A context window that holds everything it's currently aware of
  • A set of tools it's allowed to call
  • A permission model that controls what it can do on its own
  • A memory layer that survives across conversations
  • A control loop that ties all of this together, turn by turn

Let me walk through each of these.

1. The System Prompt

The system prompt is the agent's job description. It's a piece of text that gets injected at the top of every conversation, invisible to the user, that defines who the agent is, what it should care about, what tone to use, what to avoid, and what shortcuts it's allowed to take.

If you've ever written onboarding docs for a new hire, you already understand most of this. You don't write "be helpful." You write "when a customer emails about billing, escalate to finance; when they ask for a refund under $50, approve it; when they seem upset, don't use exclamation points." The more specific, the more consistent the behavior you get.

A good system prompt answers a few questions cleanly:

  • What is this agent for?
  • Who is it talking to, and what do they value?
  • What does "good work" look like here?
  • What should it never do without asking?

Just like an onboarding document, if it gets too long, the more likely the agent is to forget what's in it. A 3,000-word system prompt sounds thorough, but it's also 3,000 words of signal that the model has to weigh against your actual request on every turn. The best system prompts are short. Every sentence has a reason to be there.

2. Context

Context is what the model is aware of at any given moment. It includes the system prompt, the conversation so far, any documents that have been loaded in, the outputs of any tools that have been called, and whatever notes the agent is carrying forward. It's all one big pile of text, fed into the model every turn.

The thing that surprises people about context is that bigger isn't better. Models advertise longer and longer context windows — 200,000 tokens, a million tokens, and so on — and it's tempting to treat this as free space. It's not.

Think of context the way you'd think of the top of a manager's desk. You can pile every file, every email, and every meeting note on there if you want. The desk is physically big enough. But now the manager has to look at the whole desk every time they make a decision, and most of what's on it isn't relevant to the thing in front of them. The more clutter, the more likely they miss the one thing that matters.

Long contexts degrade in subtle ways. Models pay less attention to information in the middle of a long prompt (a well-documented pattern researchers call "lost in the middle"). Irrelevant material introduces noise that the model treats as signal. API costs scale with token count, and session-ending "run out of context" failures tend to happen right when you're deepest into a problem and least want the wheels to come off.

This is why bymorning is aggressive about keeping context small, and even automatically collapses conersations with agents into a summary if it gets too long. In a long working session with a human assistant, you wouldn't hand them the full transcript of every previous conversation before asking them a new question. You'd say "remember that thing we discussed yesterday? Here's the update." The assistant carries forward a summary, not a recording. Agents should work the same way.

In practice that looks like:

  • Clipping. Long tool outputs (a 4,000-line file, a twenty-message email thread) get truncated to the parts that actually matter.
  • Summarization. Older parts of the conversation get compressed into a shorter, structured note.
  • Deduplication. If the agent reads the same file three times, it doesn't carry three copies.
  • Delegation. Heavy pieces of work get handed off to a subagent so they never land on the main agent's desk in the first place (more on this below).

Context isn't storage, it's working memory.

3. Tools

Tools are what turn a chatbot into an agent. A tool is a named action the agent is allowed to take, like "search the web," "send an email," "read a file," "create a calendar event," or "post to Slack."

Without tools, the model can tell you what it would do. With tools, it can do it.

The pattern is always the same:

  1. The model decides it needs to take an action.
  2. It emits a request to use a tool
  3. The surrounding system checks the request. Is this a real tool? Are the arguments valid? Does this need approval?
  4. If everything checks out, the system runs the tool and feeds the result back to the model as the next piece of context.
  5. The model decides what to do next.

This is the fundamental rhythm of an agent. Think, act, observe, then think again.

Tools make other capabilities possible:

  • Delegation tools let the agent spin up another agent to handle a subtask. This is the single most powerful capability an agent can have, since it can delegate experts to do things without taking up its own context window and time.
  • Search tools let the agent look up information it doesn't already have, either from your files, your emails, or the internet.
  • Memory tools let the agent write and read notes to itself to use later.
  • Integration tools let the agent actually touch the systems you use — Gmail, Slack, GitHub, your calendar, your docs, and your CRM.

Because agents can create and interact with other agents, a common pattern is to have a main agent that spins off teams of other agents. The main agent decides it needs a specialist, issues a call to spin one up with a tightly scoped brief, waits for the answer, and keeps going. The specialist has its own small context, does its own work, and hands back a summary. Most of what it looked at never touches the main agent's desk.

The organizational analogy here is basically the truth. A good operator doesn't do every piece of work themselves. They pull in the right person for the job, give them just enough context to be useful, and get the answer back as a clean one-pager. The main agent is the operator. Subagents are the people pulled in for specific jobs.

4. Permissions

Agents with tools can do things in the real world. That's the whole idea, and also the part that makes people nervous, correctly so.

Permissions are how an agent decides, for any given action, whether to do it, ask first, or refuse. These are critical safety systems. Ppermissions break down into a few layers.

The first layer is allow-listing. An agent can only call tools that have been explicitly given to it. If there's no "delete file" tool on the list, the model can't delete files, no matter how creatively it's asked to.

The second layer is per-action approval. Even among allowed tools, some actions should never happen silently. Sending an email to an investor, posting a tweet, transferring money, deleting something — these should pause and ask.

The third layer is scope. Even when the agent is allowed to read files, it should only be able to read files inside the workspace it's been pointed at. Even when it's allowed to send Slack messages, it should only be allowed to send them as the user who connected the integration.

Think of this the way you'd think about provisioning access for a new contractor. On day one, you give them credentials scoped to exactly what they need for the current engagement — not the whole company's systems, not admin rights, not billing. As the engagement progresses and trust builds, you expand what they can reach. Agents should work the same way. Start them narrow, and only widen the scope when it's earned.

Good agentic systems will have sensible default permissions.

5. Memory

Context is what the agent is holding right now. Memory is what it carries between sessions.

Context is ephemeral, compressed, and tuned for the current turn. Memory is durable, structured, and meant to survive across days and weeks. Memory is usually accessed via a tool.

There are many different ways memory can work.A good memory layer looks more like the notes you keep on a specific person after the fifth time you've worked with them:

  • Stable preferences ("uses short bullet lists, not paragraphs")
  • Durable context about the work ("the Series A raise is blocked on one open diligence item")
  • References to where things live ("the real customer list is in the Notion doc, not the CRM")
  • Feedback you've given ("don't use the word 'leverage' in any external communication")

Those notes get retrieved when they're relevant and ignored when they're not. Surfacing the right memories for a given task is a bit of an art, but agents are becoming better and better at it.

The first time you work with an agent, it knows nothing about you, so everything takes longer. After a few weeks of use, a well-designed memory layer means it already knows how you write, who the important people in your life are, what you don't want to do twice, and what you want escalated. Most of the "this tool gets better over time" feeling comes from this layer, not from the model.

If an agent tool doesn't have a memory layer, it's starting from scratch every conversation. That's good for one-off tasks, but useless for ongoing work.

6. The Loop

The loop is what turns all of the above components into an agent. It's simple to describe but surprisingly hard to get right. Each turn, the agent:

  1. Reads the latest input (user message, tool result, or subagent handoff).
  2. Decides if there a next action to take, or is this done?
  3. If there is, choose the tool and the arguments.
  4. Check permissions. If needed, ask.
  5. Run the action.
  6. Observe the result.
  7. Update memory and compact context if needed.
  8. Go back to step 1.

Agents are loops. The model provides the judgment at steps 2 and 3. Everything else is plumbing. This loop is what lets an agent do multi-step work without constant hand-holding.

In a chat interface, you are the loop. You ask a question, read the answer, decide the next question, ask it, read the next answer, and so on. You're manually doing the work that makes the system useful.

An agent runs that loop on its own, within whatever limits you set, and only comes back to you when it needs a decision you have to make.

That loop runs the same way regardless of which model is inside it, though some models are better at different tasks than others.

What Different LLMs Are Actually Good At

Not all models are good at the same things. Even inside a single provider's lineup, the best model for creative writing isn't always the best model for calling tools reliably, and the best model for tool calling isn't always the best model for reasoning through ambiguous judgment calls.

Broadly, today:

  • Tool-calling ability is about how well the model emits clean, valid, structured requests, and how well it reasons about which tool to use in a given situation. Claude's Sonnet and Opus models and OpenAI's GPT-5 family are strong here. They fail less often on malformed outputs and handle multi-step tool chains more reliably. This is what you want for operational throughput — routing emails, organizing tasks, updating systems, and executing multi-step workflows.
  • Creativity and voice is a different skill. Our experience is that the design of writing agents and their knowledge of your voice and perspective matter far more than the underlying models.
  • Reasoning is the ability to work through ambiguous problems, weigh tradeoffs, and handle situations that don't fit a template. This is where reasoning-optimized models (like Opus with extended thinking, or GPT-5's deep thinking variants) earn their cost. The difference shows up most on judgment calls, not on well-defined tasks. Often reasoning models are used to plan and orchestrate long tasks, and cheaper, faster models are used to execute the work. Reasoning models are not necessarily better - they can overthink smaller tasks and even decide to ignore instructions.
  • Speed and cost matter too. A model that's 95% as capable for a specific task but ten times faster and cheaper is often the right choice, even if a better model exists.

A well-designed agent system doesn't use one model for everything. It routes the right kind of work to the right kind of model. A tight, fast model handles the routine parts. A smarter, slower model gets pulled in for the judgment calls, and faster, more obedient models execute work like searching the web.

Agents vs. Skills

A skill is a saved set of instructions and resources for a specific task, like "write a LinkedIn post in my voice." When you need it, you invoke it, and the agent follows those instructions within the current conversation.

In multi-agent systems, skills are often executed by subagents so that the main agent's context stays focused. This is a very powerful pattern because subagents can be instructed to learn the best practices for a given task on the fly, create its own skill, and execute it. Skills that work can be saved to memory. This is how bymorning operates.

A skill is a page from the playbook. The agent is the person running the play. If a play works well, the agent will remember.

This matters for a specific design decision in bymorning I want to explain — bymorning is built primarily around configurable agents, not around skills.

The reason is that agents have three things skills don't.

First, an agent has a persistent system prompt that travels with it. Every time you invoke a bymorning agent, it arrives already knowing what it is, what it cares about, what its defaults are, and what it won't do. A skill gets loaded into whatever context happens to be running, and has to share space with everything else in that conversation.

Second, an agent has its own context. When bymorning spins up a subagent to research five investors in parallel, each one has its own small, clean context. The research never pollutes the main conversation. You get five crisp briefs back, not five walls of raw notes. With skills, all of that work would happen in the main chat's context, and the main chat would bloat under the weight.

Third, an agent can run a loop. It can take multiple steps, call multiple tools, backtrack, and keep going until it has a real answer. A skill operates within a single invocation of the host conversation — it doesn't carry its own loop.

This is also why bymorning's main chat stays clean. The research, drafting, and triage agents each run in their own context.

Summary

A chat window is one shape a model can take. It's the simplest, and also the least useful for real work.

An agent is a loop around a model. The system prompt tells it who it is and the context gives it working memory. Tools let it act, permissions keep it safe, and memory lets it get better over time. Delegation lets it pull in help without cluttering the main conversation.

The single biggest mental shift, if you've mostly used AI through chat, is this — stop thinking of the model as the product. The model is a brilliant individual contributor. The agent is the organization built around them. Same talent, but now with a role, a workflow, the right access, and someone making sure the work actually ships. That's the difference between a chat window and a system that does real work on your behalf.