🔒

Workshop Access

Claude Code Agents

Enter the password to continue.

Agents Edition · Claude Code · 5–6 hour workshop

Building AI Agents
with Claude Code

From "what is an agent?" to shipping autonomous agents on your own machine. Six modules of subagents, tools, MCP, plan-then-execute, and the production guardrails that make agents safe to trust.

Get Started
01 — Concept

What Is an Agent?

An agent pursues a goal across many steps — choosing tools, observing results, and deciding what to do next. Where a prompt answers once and a skill applies expert guidance to one task, an agent runs a loop until it's done.

Prompt (one shot)

One question, one answer. You decide every next step. Cheapest, fastest, easiest to debug.

Skill (expert behavior)

Reusable expert guidance for one task — Claude loads it when the description matches. You still drive the conversation.

Agent (autonomous)

Pursues a goal across many steps. The agent itself decides which tool to call next, within boundaries you set.

💡

Reach for an agent only when the number of steps depends on what's discovered along the way, the work runs unattended, or multiple tools must be combined dynamically. Otherwise a skill is cheaper, faster, and easier to debug.

02 — The Agent Loop

Goal → Plan → Act → Observe

Every agent — no matter how complex — runs the same four-step loop. Learn this diagram. If your agent isn't doing all four, it's a skill in disguise.

🎯
Step 1

Goal

What success looks like

🗺️
Step 2

Plan

Which tool, which step, in what order

Step 3

Act

Run a tool, get a result

🔁
Step 4

Observe

Done? If not, loop back to Plan

🧩

Three ingredients power every agent: tools (what it can do in the world), memory (what it remembers across the loop), and autonomy (how many steps it can run before checking in with you).

03 — Structure

Anatomy of a Subagent File

Subagents in Claude Code live as Markdown files in .claude/agents/. YAML frontmatter declares the agent's identity and tool allow-list; the body is the system prompt that shapes how it behaves inside the loop.

.claude/agents/file-organizer.md
--- name: file-organizer description: Scans a folder, groups files by type, and moves them into typed subfolders. Dry-run by default. tools: Read, Glob, Bash --- You are a file-organizer agent. GOAL: organize the folder the user points you at into typed subfolders (images, docs, archives, code, video, unsorted). PLAN before acting. State your plan in one paragraph. ACT one step at a time. After each tool call, ask: am I done? Default to dry-run. Only move files when the user passes execute=true. STOP if you exceed 30 tool calls or 8 minutes of work. NEVER: - Move files outside the target folder - Delete anything - Continue past a destructive action without confirmation

name

Short identifier. Used when invoking the agent and shown in your agent list.

description

One sentence that says when to invoke this agent. Specific verbs, clear scope — the same trigger discipline as a skill description.

tools (allow-list)

The agent can only use tools listed here. Anything not listed is unreachable. Smaller is safer — start narrow, widen when needed.

System prompt

The agent's persona, goal, plan-then-execute rhythm, stopping conditions, and explicit "never" rules. This is where you encode safety.

04 — Setup

Set Up Your Workshop Project

Subagents live in .claude/agents/ inside whatever project you open. Create a workshop project once, and every agent you build today will land in the same predictable place.

📦
Prereqs

Claude Code installed and signed in · A terminal you're comfortable with · 5–10 minutes per module · Optional: a GitHub PAT and Gmail account for the MCP-powered modules.

1

Create the workshop folder

Make a folder for today's work and open it in Claude Code. All subagents you build will save into .claude/agents/ here.

terminal
mkdir -p ~/agents-workshop && cd ~/agents-workshop claude
2

Verify the agents directory

Inside the Claude Code session, ask:

Prompt — bootstrap

"Create the directory .claude/agents/ here if it doesn't exist, and list what's inside."

Empty is fine — you'll fill it in Module 2.

3

List available agents

Anytime you want to see which subagents are loaded:

Claude Code chat
/agents

After Module 2 you'll see file-organizer here. By the end of the workshop, you'll have several.

💡
Optional — install an MCP server now

Module 3 uses MCP. To get a head start: claude mcp add github -- npx -y @modelcontextprotocol/server-github (you'll be prompted for a personal access token).

05 — Workshop

Six Modules, Concept to Capstone

Each module builds on the last. Concept → first agent → tools → multi-step planning → production safety → ship something that's yours.

Setup · Before You Start

Install Claude Code

Step-by-step installation for Windows & macOS — Node.js, npm, Claude Code, and API key setup.

↗ Setup Guide
Module 1 · Concept

What Is an Agent?

Prompt vs skill vs agent. The four-step loop. The three ingredients (tools, memory, autonomy). When NOT to build an agent.

⏱ ~30 min
Module 2 · First Agent

File Organizer Subagent

Build your first .claude/agents/ file. Run dry-first, then live. Debug a misfire on purpose. Read every line of the YAML and prompt.

⏱ ~45 min
Module 3 · Tools

Beyond the Filesystem

Built-in tools (Bash, Web*), MCP servers (GitHub, Slack, Calendar), and custom tools via the Agent SDK. Build a PR triage agent.

⏱ ~45 min
Module 4 · Multi-step

Research Brief Agent

Plan-then-execute pattern. Self-review for citation accuracy. Decide when to ask vs proceed. Force the agent to fix its own output.

⏱ ~60 min
Module 6 · Capstone

Pick & Ship

Choose one of five templates — Inbox Triage, Daily Digest, PR Reviewer, Calendar Coordinator, Research Companion — customize, and demo it in 90 seconds.

⏱ ~60 min
Cheat Sheet

Take It Home

Subagent template, tool selection chart, common failure modes & fixes, plus 10 agent ideas to build at home.

⏱ Reference
🤖 Subagent Builder
Module 2

File Organizer — Your First Subagent

A real subagent that organizes your ~/Downloads folder. Build it conversationally, run it dry-first, then live. Debug a misfire on purpose so you never trust an agent blindly.

1

Scaffold the subagent

From inside ~/agents-workshop, paste this into Claude Code:

Prompt 1 — scaffold

"Create a subagent at .claude/agents/file-organizer.md. Its job is to scan a folder I point it to, group files by type (images, docs, archives, code, video, other), and move them into typed subfolders. Allow only Read, Glob, Bash. Default to dry-run mode unless I pass execute=true."

2

Read every line of the file Claude wrote

Open .claude/agents/file-organizer.md. You should be able to follow every line — YAML frontmatter (name, description, tools allow-list), then the system prompt body with goal, plan-then-execute rhythm, dry-run default, and stopping conditions.

📖
Three things to find in the file

1. The tools: line — that's the entire allow-list.
2. The dry-run default — usually phrased as "only execute when execute=true".
3. A stopping condition — max steps, max tool calls, or a time cap.

3

Run it dry-first

Always plan-before-act. The agent should print a move plan and stop, waiting for approval.

Prompt 2 — dry run

"Run the file-organizer subagent against ~/Downloads in dry-run mode. Show me the plan but don't move anything yet."

⚠️
Read the plan before approving

This is the moment that separates "I trust this agent" from "I just lost my files." Always read the plan. Always.

4

Run it for real

Watch the trace. Claude moves files one by one, reporting each move. When it's done, you have a clean Downloads folder.

Prompt 3 — execute

"Looks good. Run it with execute=true."

5

Debug a misfire (on purpose)

Drop a file with no extension into ~/Downloads and re-run the agent. It will misfire — the trace will show why. Find the gap in the system prompt, then fix the agent (not the output):

Prompt 4 — fix the agent

"The agent doesn't handle files without extensions. Update file-organizer.md so unknown files go into an unsorted/ subfolder. Re-run dry-run to verify."

Module 2 is complete when you can explain what each line of the file does, you've run the agent dry then live, and you've debugged at least one misfire by editing the agent itself.

🔧 Tools & MCP
Module 3

Tools Beyond the Filesystem

An agent is only as capable as its tools. Built-in tools cover the basics; MCP servers connect to GitHub, Slack, Calendar, Linear, Notion, Gmail; and custom tools (via the Agent SDK) reach anywhere your code can.

1

Built-in tools you already have

Every Claude Code agent has these out of the box. Pick the smallest set that does the job.

🧰
The starter kit

Read / Write / Edit — files inside the project. Hook-enforced scope.
Glob / Grep — find files and content fast. Read-only.
Bash — anything the shell can do. Highest blast radius — hook it.
WebFetch — read a known URL. Mind rate limits.
WebSearch — find URLs you don't know yet. Always verify with WebFetch.

2

MCP servers — connect to real systems

MCP (Model Context Protocol) lets your agent talk to GitHub, Slack, Calendar, Linear, Notion, Gmail, and dozens more — each one adds a fresh set of tools to the allow-list.

terminal — install GitHub MCP
claude mcp add github -- npx -y @modelcontextprotocol/server-github # you'll be prompted for a personal access token # restart Claude Code; new tools (github_list_prs, github_get_pr, …) appear automatically
3

Build a GitHub PR triage agent

Read-only by design. The agent lists PRs and summarizes — never comments, never merges.

Prompt — scaffold the triage agent

"Create .claude/agents/pr-triage.md. It uses the GitHub MCP tools to list open PRs in a repo I name, fetches the title, author, and last update for each, then writes a summary table sorted by staleness. Read-only — no comments, no merges."

Test it on a real repo you maintain. Notice how the tool allow-list keeps the agent honest — even if it tried to merge, the tool isn't there.

4

Custom tools via the Agent SDK

When no MCP server exists for what you need, write a tool. Five lines is enough to reach a private API.

Python — Claude Agent SDK
from claude_agent_sdk import tool, agent @tool("check_inventory") def check_inventory(sku: str) -> dict: """Returns stock for a given SKU from internal API.""" return requests.get(f"https://api.acme.internal/sku/{sku}").json() agent(tools=[check_inventory], goal="Reorder anything below 10 units")
5

Tool selection — three questions to ask

🧭
Before you grant a tool

I/O cost — how slow or expensive is one call?
Blast radius — what's the worst thing it can do if misfired?
Confirmation — should the agent stop and ask before using it?

If a tool fails any of those badly, either don't grant it, or wrap it in a hook (Module 5) so misuse is impossible.

🧠 Plan-then-Execute
Module 4

Research Brief — A Multi-step Agent

The simplest pattern that handles real complexity: first plan, then execute, then self-review. You'll build an agent that decomposes its own work, cites every claim, and refuses to fabricate.

🗺️
Plan-then-execute, in three sentences

Plan — the agent writes the steps it will take, before any tool calls. Execute — runs the plan one step at a time. Self-review — checks the output against the goal, revises if needed.

1

Scaffold the agent

Prompt 1 — scaffold

"Create .claude/agents/research-brief.md. Given a topic, produce a 1-page brief with: 5 sources (URLs, titles, dates), claims with citations [1]...[5], and a one-paragraph synthesis. Tools: WebSearch, WebFetch, Write. Plan-then-execute pattern. Self-review for citation accuracy before returning."

2

Run on a real topic

Prompt 2 — run it

"Use research-brief on the topic 'agentic coding 2025'. Save the output to briefs/agentic-coding-2025.md."

Watch carefully. The agent should:

  1. Print its plan first — search queries, what it expects to find
  2. Execute each search
  3. Read the top hits with WebFetch
  4. Self-review: "did I cite every claim?"
  5. Revise if needed, then write the file
3

Force better citations — fix the agent, not the output

If your brief has uncited claims, the bug is in the system prompt. Tighten the rule once and every future brief inherits the fix.

Prompt — edit the agent

"Update research-brief.md: any claim without a citation marker [N] must be removed during self-review. No exceptions."

4

Decision rules — when to ask vs proceed

Bake these into the agent's system prompt so it behaves predictably without you in the chat.

Ambiguous goal

Ask once at the start, then proceed. Don't keep re-asking mid-loop.

Conflicting sources

Present both with citations. Don't pick a winner.

Cost spike

If the agent passes its tool-call cap (e.g. more than 30 searches), stop and report.

No results

Return a "couldn't find" brief. Never fabricate to fill the page.

5

Run on a second topic — verify behavior is consistent

A good agent runs the same playbook every time. Pick a different topic and re-run. Plan should appear, every claim should carry a marker, the file should land in briefs/.

Module 4 complete when

You have at least two saved briefs · every claim has a citation marker · you can explain plan-then-execute without looking at notes.

🛡️ Production
Module 5

Budgets, Hooks & Observability

Every agent so far has been one you watched. Now make them safe to run while you're not watching — caps on cost, hooks that enforce rules the agent could otherwise bend, idempotent steps, and a trace you can read after the fact.

1

Budgets — caps on cost and runaway loops

Add to any agent's system prompt: "Stop and report if you exceed 30 tool calls or 8 minutes of work." A few sensible defaults:

Max steps

Loop iterations before forced stop. Default 20–50.

Max tool calls

External calls before forced stop. Default 30.

Time cap

Wall-clock. Default 5–10 minutes for interactive agents.

Token cap

Context spend. Set at the project level when you can.

2

Hooks as guardrails

Hooks run shell commands before/after tool calls. Use them to enforce rules the agent could otherwise bend — the harness runs hooks, not the agent, so they cannot be talked around.

Prompt — add a pre-write hook

"Add a PreToolUse hook to my project's .claude/settings.json that blocks any Write or Edit outside ~/agents-workshop/. Hook fails the call if the path doesn't match."

.claude/settings.json — example
{ "hooks": { "PreToolUse": [ { "matcher": "Write|Edit", "hooks": [{ "type": "command", "command": "check-path-allowed.sh" }] } ] } }
3

Idempotency & retries

Make every tool call safe to repeat — agents die mid-run, and the next run should pick up where it stopped, not start over.

♻️
Four idempotency tactics

Check before write — does the file already exist? Skip or update.
Use unique IDstask-id-{date} so repeated runs don't double-create.
Mark progress — write a .done file after each completed step.
Resume, don't restart — let the next run skip what's already finished.

4

Observability — read the trace

Every agent run leaves a JSONL trace on disk. When something misfires, share that file — it contains every prompt, every tool call, every output. It is your debugger.

macOS / Linux
# list runs for this project ls ~/.claude/projects/<project>/ # pretty-print the latest run cat ~/.claude/projects/<project>/<latest>.jsonl | jq
5

Human-in-the-loop — when to require approval

Some actions cost too much to be wrong. Implement these as hooks — never trust the agent's "I'll be careful" promise alone.

⚠️
Always require human approval for…

— Sending external messages (email, Slack, customer-facing)
— Writing to shared infrastructure (production DB, CI config, prod branches)
— Spending money (paid APIs, charges, deploys)
— Anything destructive (delete, force-push, drop)

6

When to use a hook vs the system prompt

Use a hook

  • The rule must be unbreakable
  • Path scope, secret-scrubbing, "no production"
  • Anything you'd otherwise have to trust the model on

Use the system prompt

  • Stylistic preferences and output format
  • Plan-then-execute rhythm
  • Self-review criteria
🚀 Capstone
Module 6

Pick a Template & Ship

Pick one template. Customize it with your real data. Run it end-to-end. Demo it in 90 seconds. Five templates to choose from — each one is a one-prompt scaffold you'll refine to fit your life.

Template A

Inbox Triage

Reads the last 20 unread Gmail messages, classifies each (reply / archive / forward / wait), drafts replies in your tone — never sends. You ship manually.

Template B

Daily Digest

At 7am, fetches the latest from 3 sources (RSS, HN, a Slack channel), writes a 200-word brief, saves to ~/digest/YYYY-MM-DD.md. Skips quiet days.

Template C

PR Reviewer

Fetches a PR diff, runs your team's checklist (edge cases, tests, naming, perf), produces a comment-ready review. Posts only when you confirm post=true.

Template D

Calendar Coordinator

Reads your next 7 days, finds open 90-minute blocks, drafts deep-work sessions matching priorities you list. Outputs a plan — never modifies the calendar.

Each template is one prompt away. Pick yours and paste:

Template A — Inbox Triage

"Create .claude/agents/inbox-triage.md. Uses the Gmail MCP tools to read the last 20 unread messages, classify each (reply / archive / forward / wait), and for reply candidates draft a response in my tone. Outputs a markdown table — never sends. I send manually."

Template B — Daily Digest

"Create .claude/agents/daily-digest.md. At 7am, fetches the latest from 3 sources I list (RSS feeds, hacker news, a Slack channel), writes a 200-word brief, saves to ~/digest/YYYY-MM-DD.md. Read-only. Skip the day if nothing new."

Template C — PR Reviewer

"Create .claude/agents/pr-reviewer.md. For a given PR URL, fetches the diff, runs through our team checklist (edge cases, tests, naming, performance), and produces a comment-ready review in markdown. Posts to GitHub only if I confirm with post=true."

Template D — Calendar Coordinator

"Create .claude/agents/cal-coordinator.md. Reads my calendar for the next 7 days, finds open 90-minute blocks, drafts proposed deep-work sessions matching priorities I list. Flags any conflicts with existing events. Outputs a plan — does not modify the calendar."

Template E — Research Companion

"Adapt the research-brief agent into .claude/agents/research-companion.md. Maintains a topics list. Daily, picks the topic with the oldest brief and refreshes it. Saves history so I see what changed week over week."

🎤
Your 90-second demo answers

1. What's the goal of your agent?
2. Show the trace of one run.
3. What's one thing it got wrong, and how did you fix it?
4. What would you build next on top of it?

06 — Cheat Sheet

Take This Home

The agent loop, a subagent template, the tool selection chart, and the failure modes you'll see most often. Keep this open while you build.

Subagent file template

.claude/agents/<name>.md
--- name: my-agent description: One sentence — when to invoke this agent tools: Read, Write, Bash, WebSearch --- You are an agent that <goal>. PLAN before acting. State your plan in one paragraph. ACT one step at a time. After each tool call, ask yourself: am I done? STOP if you exceed 30 tool calls, 8 minutes, or hit anything ambiguous. NEVER: - Write outside <allowed paths> - Send external messages without explicit approval - Continue past a destructive action without confirmation

Tool selection chart

Need Tool Approval needed?
Read project files Read, Glob No
Modify project files Write, Edit Hook-enforced scope
Run shell commands Bash Yes — high blast radius
Read the web WebFetch, WebSearch No
Talk to GitHub / Slack / Cal MCP server Per-action — read free, write gated
Custom internal API Agent SDK custom tool Define per call

Common failure modes & fixes

Symptom Likely cause Fix
Agent loops forever No stopping condition Add max-steps + explicit "done" criteria
Agent fabricates results Tool failed silently Force the agent to print tool output before claiming success
Agent over-edits Scope too broad Tighten path allow-list, add a PreToolUse hook
Slow / expensive runs Too many web calls Cap searches, cache fetched URLs in a temp file
Inconsistent output format Output spec implicit Add a literal example of the expected output to the system prompt

Quick command reference

Task How to do it
Create a subagent "Create .claude/agents/<name>.md that …"
List agents in this project /agents in chat
Install an MCP server claude mcp add <name> -- <launch command>
Read a run trace cat ~/.claude/projects/<project>/<run>.jsonl | jq
Add a guardrail hook Edit .claude/settings.json · PreToolUse matcher + command
Run dry-first Pass execute=false (or omit it if your agent defaults to dry-run)
07 — Ideas

10 Agents to Build at Home

Once you've shipped your capstone, these are the next ten to build. Each one is one prompt away — the patterns repeat.

🧾

receipt-sorter

Reads PDFs in ~/Receipts, extracts amounts, builds a monthly CSV

📰

standup-writer

Reads your last 24h of commits + PRs, drafts a Slack standup post

🔖

bookmark-librarian

Periodically de-dupes your browser bookmarks and tags by topic

🛒

deal-alert

Watches a list of products, pings you when prices drop below targets

🕵️

code-archaeology

Given a function, traces its callers and writes a one-pager on its history

📅

meeting-prep

For tomorrow's calendar, drafts 3-bullet prep notes per meeting

📥

inbox-archivist

Auto-archives newsletters, labels promos, surfaces only personal mail

📚

doc-gardener

Scans your repo for stale README/docs, opens PRs with refresh suggestions

✉️

newsletter-drafter

Turns a week of your tweets/posts into a newsletter draft

🧭

onboarding-buddy

For a new repo you cloned, writes a 1-page "where things live" guide

📦
Subagents are version-controllable

The files in .claude/agents/ live on disk. Commit them alongside your codebase. Share them across your team. Treat them like any other source file — diff, review, refactor.

Recap

Key Takeaways

Every agent runs the same loop: Goal → Plan → Act → Observe. If your "agent" isn't doing all four, it's a skill in disguise — make it one.

Subagents are plain Markdown files in .claude/agents/. YAML frontmatter (name, description, tools allow-list) plus a system prompt. Tighten the allow-list — less is safer.

Plan-then-execute beats raw improvisation. Force the agent to write its plan before any tool call, and to self-review before returning. Fix the agent, not the output.

Hooks enforce what the system prompt can only request. Use them for path scope, secret-scrubbing, and human approval gates on anything destructive or expensive.

Read the trace. Every run leaves a JSONL file under ~/.claude/projects/. When something misfires, the trace is your debugger.

Now go ship one.

Download the full workshop guide PDF and run the six modules end-to-end.

Download Workshop PDF