Claude Code — Agents Guide

01 — Concept

What Is an Agent?

An agent pursues a goal across many steps — choosing tools, observing results, and deciding what to do next. Where a prompt answers once and a skill applies expert guidance to one task, an agent runs a loop until it's done.

①

Prompt (one shot)

One question, one answer. You decide every next step. Cheapest, fastest, easiest to debug.

②

Skill (expert behavior)

Reusable expert guidance for one task — Claude loads it when the description matches. You still drive the conversation.

③

Agent (autonomous)

Pursues a goal across many steps. The agent itself decides which tool to call next, within boundaries you set.

💡

Reach for an agent only when the number of steps depends on what's discovered along the way, the work runs unattended, or multiple tools must be combined dynamically. Otherwise a skill is cheaper, faster, and easier to debug.

02 — The Agent Loop

Goal → Plan → Act → Observe

Every agent — no matter how complex — runs the same four-step loop. Learn this diagram. If your agent isn't doing all four, it's a skill in disguise.

🎯

Step 1

Goal

What success looks like

→

🗺️

Step 2

Plan

Which tool, which step, in what order

→

⚡

Step 3

Act

Run a tool, get a result

→

🔁

Step 4

Observe

Done? If not, loop back to Plan

🧩

Three ingredients power every agent: tools (what it can do in the world), memory (what it remembers across the loop), and autonomy (how many steps it can run before checking in with you).

03 — Structure

Anatomy of a Subagent File

Subagents in Claude Code live as Markdown files in .claude/agents/. YAML frontmatter declares the agent's identity and tool allow-list; the body is the system prompt that shapes how it behaves inside the loop.

.claude/agents/file-organizer.md


                            ---
                            name: file-organizer
                            description: Scans a folder, groups
                              files by type, and moves them into
                              typed subfolders. Dry-run by default.
                            tools: Read, Glob, Bash
                            ---

                            You are a file-organizer agent.

                            GOAL: organize the folder the user
                            points you at into typed subfolders
                            (images, docs, archives, code, video,
                            unsorted).

                            PLAN before acting. State your plan
                            in one paragraph.
                            ACT one step at a time. After each
                            tool call, ask: am I done?

                            Default to dry-run. Only move files
                            when the user passes execute=true.

                            STOP if you exceed 30 tool calls or
                            8 minutes of work.

                            NEVER:
                            - Move files outside the target folder
                            - Delete anything
                            - Continue past a destructive action
                              without confirmation

①

name

Short identifier. Used when invoking the agent and shown in your agent list.

②

description

One sentence that says when to invoke this agent. Specific verbs, clear scope — the same trigger discipline as a skill description.

③

tools (allow-list)

The agent can only use tools listed here. Anything not listed is unreachable. Smaller is safer — start narrow, widen when needed.

④

System prompt

The agent's persona, goal, plan-then-execute rhythm, stopping conditions, and explicit "never" rules. This is where you encode safety.

04 — Setup

Set Up Your Workshop Project

Subagents live in .claude/agents/ inside whatever project you open. Create a workshop project once, and every agent you build today will land in the same predictable place.

📦

Prereqs

Claude Code installed and signed in · A terminal you're comfortable with · 5–10 minutes per module · Optional: a GitHub PAT and Gmail account for the MCP-powered modules.

1

Create the workshop folder

Make a folder for today's work and open it in Claude Code. All subagents you build will save into .claude/agents/ here.

terminal

mkdir -p ~/agents-workshop && cd ~/agents-workshop
claude

2

Verify the agents directory

Inside the Claude Code session, ask:

Prompt — bootstrap

"Create the directory .claude/agents/ here if it doesn't exist, and list what's inside."

Empty is fine — you'll fill it in Module 2.

3

List available agents

Anytime you want to see which subagents are loaded:

Claude Code chat

/agents

After Module 2 you'll see file-organizer here. By the end of the workshop, you'll have several.

💡

Optional — install an MCP server now

Module 3 uses MCP. To get a head start: claude mcp add github -- npx -y @modelcontextprotocol/server-github (you'll be prompted for a personal access token).

05 — Workshop

Six Modules, Concept to Capstone

Each module builds on the last. Concept → first agent → tools → multi-step planning → production safety → ship something that's yours.

Setup · Before You Start

Install Claude Code

Step-by-step installation for Windows & macOS — Node.js, npm, Claude Code, and API key setup.

↗ Setup Guide

Module 1 · Concept

What Is an Agent?

Prompt vs skill vs agent. The four-step loop. The three ingredients (tools, memory, autonomy). When NOT to build an agent.

⏱ ~30 min

Module 2 · First Agent

File Organizer Subagent

Build your first .claude/agents/ file. Run dry-first, then live. Debug a misfire on purpose. Read every line of the YAML and prompt.

⏱ ~45 min

Module 3 · Tools

Beyond the Filesystem

Built-in tools (Bash, Web*), MCP servers (GitHub, Slack, Calendar), and custom tools via the Agent SDK. Build a PR triage agent.

⏱ ~45 min

Module 4 · Multi-step

Research Brief Agent

Plan-then-execute pattern. Self-review for citation accuracy. Decide when to ask vs proceed. Force the agent to fix its own output.

⏱ ~60 min

Module 5 · Production

Budgets, Hooks & Observability

Cap steps, time, and tool calls. Hooks as guardrails. Idempotency. Read the trace. Decide when humans must stay in the loop.

⏱ ~45 min

Module 6 · Capstone

Pick & Ship

Choose one of five templates — Inbox Triage, Daily Digest, PR Reviewer, Calendar Coordinator, Research Companion — customize, and demo it in 90 seconds.

⏱ ~60 min

Cheat Sheet

Take It Home

Subagent template, tool selection chart, common failure modes & fixes, plus 10 agent ideas to build at home.

⏱ Reference

🤖 Subagent Builder

Module 2

File Organizer — Your First Subagent

A real subagent that organizes your ~/Downloads folder. Build it conversationally, run it dry-first, then live. Debug a misfire on purpose so you never trust an agent blindly.

1

Scaffold the subagent

From inside ~/agents-workshop, paste this into Claude Code:

Prompt 1 — scaffold

"Create a subagent at .claude/agents/file-organizer.md. Its job is to scan a folder I point it to, group files by type (images, docs, archives, code, video, other), and move them into typed subfolders. Allow only Read, Glob, Bash. Default to dry-run mode unless I pass execute=true."

2

Read every line of the file Claude wrote

Open .claude/agents/file-organizer.md. You should be able to follow every line — YAML frontmatter (name, description, tools allow-list), then the system prompt body with goal, plan-then-execute rhythm, dry-run default, and stopping conditions.

📖

Three things to find in the file

1. The tools: line — that's the entire allow-list.
2. The dry-run default — usually phrased as "only execute when execute=true".
3. A stopping condition — max steps, max tool calls, or a time cap.

3

Run it dry-first

Always plan-before-act. The agent should print a move plan and stop, waiting for approval.

Prompt 2 — dry run

"Run the file-organizer subagent against ~/Downloads in dry-run mode. Show me the plan but don't move anything yet."

⚠️

Read the plan before approving

This is the moment that separates "I trust this agent" from "I just lost my files." Always read the plan. Always.

4

Run it for real

Watch the trace. Claude moves files one by one, reporting each move. When it's done, you have a clean Downloads folder.

Prompt 3 — execute

"Looks good. Run it with execute=true."

5

Debug a misfire (on purpose)

Drop a file with no extension into ~/Downloads and re-run the agent. It will misfire — the trace will show why. Find the gap in the system prompt, then fix the agent (not the output):

Prompt 4 — fix the agent

"The agent doesn't handle files without extensions. Update file-organizer.md so unknown files go into an unsorted/ subfolder. Re-run dry-run to verify."

Module 2 is complete when you can explain what each line of the file does, you've run the agent dry then live, and you've debugged at least one misfire by editing the agent itself.

🔧 Tools & MCP

Module 3

Tools Beyond the Filesystem

An agent is only as capable as its tools. Built-in tools cover the basics; MCP servers connect to GitHub, Slack, Calendar, Linear, Notion, Gmail; and custom tools (via the Agent SDK) reach anywhere your code can.

1

Built-in tools you already have

Every Claude Code agent has these out of the box. Pick the smallest set that does the job.

🧰

The starter kit

Read / Write / Edit — files inside the project. Hook-enforced scope.
Glob / Grep — find files and content fast. Read-only.
Bash — anything the shell can do. Highest blast radius — hook it.
WebFetch — read a known URL. Mind rate limits.
WebSearch — find URLs you don't know yet. Always verify with WebFetch.

2

MCP servers — connect to real systems

MCP (Model Context Protocol) lets your agent talk to GitHub, Slack, Calendar, Linear, Notion, Gmail, and dozens more — each one adds a fresh set of tools to the allow-list.

terminal — install GitHub MCP

claude mcp add github -- npx -y @modelcontextprotocol/server-github
# you'll be prompted for a personal access token
# restart Claude Code; new tools (github_list_prs, github_get_pr, …) appear automatically

3

Build a GitHub PR triage agent

Read-only by design. The agent lists PRs and summarizes — never comments, never merges.

Prompt — scaffold the triage agent

"Create .claude/agents/pr-triage.md. It uses the GitHub MCP tools to list open PRs in a repo I name, fetches the title, author, and last update for each, then writes a summary table sorted by staleness. Read-only — no comments, no merges."

Test it on a real repo you maintain. Notice how the tool allow-list keeps the agent honest — even if it tried to merge, the tool isn't there.

4

Custom tools via the Agent SDK

When no MCP server exists for what you need, write a tool. Five lines is enough to reach a private API.

Python — Claude Agent SDK

from claude_agent_sdk import tool, agent

@tool("check_inventory")
def check_inventory(sku: str) -> dict:
    """Returns stock for a given SKU from internal API."""
    return requests.get(f"https://api.acme.internal/sku/{sku}").json()

agent(tools=[check_inventory], goal="Reorder anything below 10 units")

5

Tool selection — three questions to ask

🧭

Before you grant a tool

I/O cost — how slow or expensive is one call?
Blast radius — what's the worst thing it can do if misfired?
Confirmation — should the agent stop and ask before using it?

If a tool fails any of those badly, either don't grant it, or wrap it in a hook (Module 5) so misuse is impossible.

🧠 Plan-then-Execute

Module 4

Research Brief — A Multi-step Agent

The simplest pattern that handles real complexity: first plan, then execute, then self-review. You'll build an agent that decomposes its own work, cites every claim, and refuses to fabricate.

🗺️

Plan-then-execute, in three sentences

Plan — the agent writes the steps it will take, before any tool calls. Execute — runs the plan one step at a time. Self-review — checks the output against the goal, revises if needed.

1

Scaffold the agent

Prompt 1 — scaffold

"Create .claude/agents/research-brief.md. Given a topic, produce a 1-page brief with: 5 sources (URLs, titles, dates), claims with citations [1]...[5], and a one-paragraph synthesis. Tools: WebSearch, WebFetch, Write. Plan-then-execute pattern. Self-review for citation accuracy before returning."

2

Run on a real topic

Prompt 2 — run it

"Use research-brief on the topic 'agentic coding 2025'. Save the output to briefs/agentic-coding-2025.md."

Watch carefully. The agent should:

Print its plan first — search queries, what it expects to find
Execute each search
Read the top hits with WebFetch
Self-review: "did I cite every claim?"
Revise if needed, then write the file

3

Force better citations — fix the agent, not the output

If your brief has uncited claims, the bug is in the system prompt. Tighten the rule once and every future brief inherits the fix.

Prompt — edit the agent

"Update research-brief.md: any claim without a citation marker [N] must be removed during self-review. No exceptions."

4

Decision rules — when to ask vs proceed

Bake these into the agent's system prompt so it behaves predictably without you in the chat.

Ambiguous goal

Ask once at the start, then proceed. Don't keep re-asking mid-loop.

Conflicting sources

Present both with citations. Don't pick a winner.

Cost spike

If the agent passes its tool-call cap (e.g. more than 30 searches), stop and report.

No results

Return a "couldn't find" brief. Never fabricate to fill the page.

5

Run on a second topic — verify behavior is consistent

A good agent runs the same playbook every time. Pick a different topic and re-run. Plan should appear, every claim should carry a marker, the file should land in briefs/.

✅

Module 4 complete when

You have at least two saved briefs · every claim has a citation marker · you can explain plan-then-execute without looking at notes.

🛡️ Production

Module 5

Budgets, Hooks & Observability

Every agent so far has been one you watched. Now make them safe to run while you're not watching — caps on cost, hooks that enforce rules the agent could otherwise bend, idempotent steps, and a trace you can read after the fact.

1

Budgets — caps on cost and runaway loops

Add to any agent's system prompt: "Stop and report if you exceed 30 tool calls or 8 minutes of work." A few sensible defaults:

Max steps

Loop iterations before forced stop. Default 20–50.

Max tool calls

External calls before forced stop. Default 30.

Time cap

Wall-clock. Default 5–10 minutes for interactive agents.

Token cap

Context spend. Set at the project level when you can.

2

Hooks as guardrails

Hooks run shell commands before/after tool calls. Use them to enforce rules the agent could otherwise bend — the harness runs hooks, not the agent, so they cannot be talked around.

Prompt — add a pre-write hook

"Add a PreToolUse hook to my project's .claude/settings.json that blocks any Write or Edit outside ~/agents-workshop/. Hook fails the call if the path doesn't match."

.claude/settings.json — example

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [{
          "type": "command",
          "command": "check-path-allowed.sh"
        }]
      }
    ]
  }
}

3

Idempotency & retries

Make every tool call safe to repeat — agents die mid-run, and the next run should pick up where it stopped, not start over.

♻️

Four idempotency tactics

— Check before write — does the file already exist? Skip or update.
— Use unique IDs — task-id-{date} so repeated runs don't double-create.
— Mark progress — write a .done file after each completed step.
— Resume, don't restart — let the next run skip what's already finished.

4

Observability — read the trace

Every agent run leaves a JSONL trace on disk. When something misfires, share that file — it contains every prompt, every tool call, every output. It is your debugger.

macOS / Linux

# list runs for this project
ls ~/.claude/projects/<project>/

# pretty-print the latest run
cat ~/.claude/projects/<project>/<latest>.jsonl | jq

5

Human-in-the-loop — when to require approval

Some actions cost too much to be wrong. Implement these as hooks — never trust the agent's "I'll be careful" promise alone.

⚠️

Always require human approval for…

— Sending external messages (email, Slack, customer-facing)
— Writing to shared infrastructure (production DB, CI config, prod branches)
— Spending money (paid APIs, charges, deploys)
— Anything destructive (delete, force-push, drop)

6

When to use a hook vs the system prompt

Use a hook

The rule must be unbreakable
Path scope, secret-scrubbing, "no production"
Anything you'd otherwise have to trust the model on

Use the system prompt

Stylistic preferences and output format
Plan-then-execute rhythm
Self-review criteria

🚀 Capstone

Module 6

Pick a Template & Ship

Pick one template. Customize it with your real data. Run it end-to-end. Demo it in 90 seconds. Five templates to choose from — each one is a one-prompt scaffold you'll refine to fit your life.

Template A

Inbox Triage

Reads the last 20 unread Gmail messages, classifies each (reply / archive / forward / wait), drafts replies in your tone — never sends. You ship manually.

Template B

Daily Digest

At 7am, fetches the latest from 3 sources (RSS, HN, a Slack channel), writes a 200-word brief, saves to ~/digest/YYYY-MM-DD.md. Skips quiet days.

Template C

PR Reviewer

Fetches a PR diff, runs your team's checklist (edge cases, tests, naming, perf), produces a comment-ready review. Posts only when you confirm post=true.

Template D

Calendar Coordinator

Reads your next 7 days, finds open 90-minute blocks, drafts deep-work sessions matching priorities you list. Outputs a plan — never modifies the calendar.

Template E

Research Companion

A scaled-up Module 4. Maintains a topics list, refreshes the oldest brief daily, saves history so you see what changed week over week.

Each template is one prompt away. Pick yours and paste:

Template A — Inbox Triage

"Create .claude/agents/inbox-triage.md. Uses the Gmail MCP tools to read the last 20 unread messages, classify each (reply / archive / forward / wait), and for reply candidates draft a response in my tone. Outputs a markdown table — never sends. I send manually."

Template B — Daily Digest

"Create .claude/agents/daily-digest.md. At 7am, fetches the latest from 3 sources I list (RSS feeds, hacker news, a Slack channel), writes a 200-word brief, saves to ~/digest/YYYY-MM-DD.md. Read-only. Skip the day if nothing new."

Template C — PR Reviewer

"Create .claude/agents/pr-reviewer.md. For a given PR URL, fetches the diff, runs through our team checklist (edge cases, tests, naming, performance), and produces a comment-ready review in markdown. Posts to GitHub only if I confirm with post=true."

Template D — Calendar Coordinator

"Create .claude/agents/cal-coordinator.md. Reads my calendar for the next 7 days, finds open 90-minute blocks, drafts proposed deep-work sessions matching priorities I list. Flags any conflicts with existing events. Outputs a plan — does not modify the calendar."

Template E — Research Companion

"Adapt the research-brief agent into .claude/agents/research-companion.md. Maintains a topics list. Daily, picks the topic with the oldest brief and refreshes it. Saves history so I see what changed week over week."

🎤

Your 90-second demo answers

1. What's the goal of your agent?
2. Show the trace of one run.
3. What's one thing it got wrong, and how did you fix it?
4. What would you build next on top of it?

06 — Cheat Sheet

Take This Home

The agent loop, a subagent template, the tool selection chart, and the failure modes you'll see most often. Keep this open while you build.

Subagent file template

.claude/agents/<name>.md

---
name: my-agent
description: One sentence — when to invoke this agent
tools: Read, Write, Bash, WebSearch
---

You are an agent that <goal>.

PLAN before acting. State your plan in one paragraph.
ACT one step at a time. After each tool call, ask yourself:
am I done?
STOP if you exceed 30 tool calls, 8 minutes, or hit anything
ambiguous.

NEVER:
- Write outside <allowed paths>
- Send external messages without explicit approval
- Continue past a destructive action without confirmation

Tool selection chart

Need	Tool	Approval needed?
Read project files	`Read, Glob`	No
Modify project files	`Write, Edit`	Hook-enforced scope
Run shell commands	`Bash`	Yes — high blast radius
Read the web	`WebFetch, WebSearch`	No
Talk to GitHub / Slack / Cal	`MCP server`	Per-action — read free, write gated
Custom internal API	`Agent SDK custom tool`	Define per call

Common failure modes & fixes

Symptom	Likely cause	Fix
Agent loops forever	No stopping condition	Add max-steps + explicit "done" criteria
Agent fabricates results	Tool failed silently	Force the agent to print tool output before claiming success
Agent over-edits	Scope too broad	Tighten path allow-list, add a PreToolUse hook
Slow / expensive runs	Too many web calls	Cap searches, cache fetched URLs in a temp file
Inconsistent output format	Output spec implicit	Add a literal example of the expected output to the system prompt

Quick command reference

Task	How to do it
Create a subagent	"Create `.claude/agents/<name>.md` that …"
List agents in this project	`/agents` in chat
Install an MCP server	`claude mcp add <name> -- <launch command>`
Read a run trace	`cat ~/.claude/projects/<project>/<run>.jsonl \| jq`
Add a guardrail hook	Edit `.claude/settings.json` · `PreToolUse` matcher + command
Run dry-first	Pass `execute=false` (or omit it if your agent defaults to dry-run)

07 — Ideas

10 Agents to Build at Home

Once you've shipped your capstone, these are the next ten to build. Each one is one prompt away — the patterns repeat.

🧾

receipt-sorter

Reads PDFs in ~/Receipts, extracts amounts, builds a monthly CSV

📰

standup-writer

Reads your last 24h of commits + PRs, drafts a Slack standup post

🔖

bookmark-librarian

Periodically de-dupes your browser bookmarks and tags by topic

🛒

deal-alert

Watches a list of products, pings you when prices drop below targets

🕵️

code-archaeology

Given a function, traces its callers and writes a one-pager on its history

📅

meeting-prep

For tomorrow's calendar, drafts 3-bullet prep notes per meeting

📥

inbox-archivist

Auto-archives newsletters, labels promos, surfaces only personal mail

📚

doc-gardener

Scans your repo for stale README/docs, opens PRs with refresh suggestions

✉️

newsletter-drafter

Turns a week of your tweets/posts into a newsletter draft

🧭

onboarding-buddy

For a new repo you cloned, writes a 1-page "where things live" guide

📦

Subagents are version-controllable

The files in .claude/agents/ live on disk. Commit them alongside your codebase. Share them across your team. Treat them like any other source file — diff, review, refactor.

Recap

Key Takeaways

①

Every agent runs the same loop: Goal → Plan → Act → Observe. If your "agent" isn't doing all four, it's a skill in disguise — make it one.

②

Subagents are plain Markdown files in .claude/agents/. YAML frontmatter (name, description, tools allow-list) plus a system prompt. Tighten the allow-list — less is safer.

③

Plan-then-execute beats raw improvisation. Force the agent to write its plan before any tool call, and to self-review before returning. Fix the agent, not the output.

④

Hooks enforce what the system prompt can only request. Use them for path scope, secret-scrubbing, and human approval gates on anything destructive or expensive.

⑤

Read the trace. Every run leaves a JSONL file under ~/.claude/projects/. When something misfires, the trace is your debugger.

Workshop Access

Building AI Agents with Claude Code

What Is an Agent?

Prompt (one shot)

Skill (expert behavior)

Agent (autonomous)

Goal → Plan → Act → Observe

Goal

Plan

Act

Observe

Anatomy of a Subagent File

name

description

tools (allow-list)

System prompt

Set Up Your Workshop Project

Create the workshop folder

Verify the agents directory

List available agents

Six Modules, Concept to Capstone

Install Claude Code

What Is an Agent?

File Organizer Subagent

Beyond the Filesystem

Research Brief Agent

Budgets, Hooks & Observability

Pick & Ship

Take It Home

File Organizer — Your First Subagent

Scaffold the subagent

Read every line of the file Claude wrote

Run it dry-first

Run it for real

Debug a misfire (on purpose)

Tools Beyond the Filesystem

Built-in tools you already have

MCP servers — connect to real systems

Build a GitHub PR triage agent

Custom tools via the Agent SDK

Tool selection — three questions to ask

Research Brief — A Multi-step Agent

Scaffold the agent

Run on a real topic

Force better citations — fix the agent, not the output

Decision rules — when to ask vs proceed

Run on a second topic — verify behavior is consistent

Budgets, Hooks & Observability

Budgets — caps on cost and runaway loops

Hooks as guardrails

Idempotency & retries

Observability — read the trace

Human-in-the-loop — when to require approval

When to use a hook vs the system prompt

Pick a Template & Ship

Inbox Triage

Daily Digest

PR Reviewer

Calendar Coordinator

Research Companion

Take This Home

Subagent file template

Tool selection chart

Common failure modes & fixes

Quick command reference

10 Agents to Build at Home

receipt-sorter

standup-writer

bookmark-librarian

deal-alert

code-archaeology

meeting-prep

inbox-archivist

doc-gardener

newsletter-drafter

onboarding-buddy

Key Takeaways

Now go ship one.

Building AI Agents
with Claude Code