Building Agents

Why isolation matters

Skills solve the knowledge problem — getting the right instructions to Claude at the right time. But they do not solve the context problem. When a skill loads, its content enters your current conversation. When Claude follows that skill and reads 30 files, writes analysis, and produces a report, all of that output lives in your conversation too.

For quick tasks, this is fine. For heavy tasks (comprehensive code reviews, codebase exploration, test generation, documentation audits) it is a serious problem. Your context window fills with intermediate work product, and by the time the task is done, Claude may have lost track of what you were working on before you asked it to review that PR.

Agents solve this. An agent is a separate Claude instance with its own context window, its own system prompt, its own tools, and optionally its own copy of the repository. When you delegate a task to an agent, it works in complete isolation. Your main conversation sees only the final result.

The mental model: an agent is a chef. You give them a brief, they go into their own kitchen, use their own tools, make their own decisions, and bring back a finished dish. You do not stand in their kitchen. You do not lend them yours.

Think about the last time Claude's context window felt 'full' or Claude started forgetting earlier context. What were you doing?

Anatomy of an agent file

Agents live in .claude/agents/ as markdown files with YAML frontmatter. The markdown body becomes the agent's system prompt — the instructions it follows in its isolated context.

.claude/agents/
├── code-reviewer/AGENT.md
├── explorer/AGENT.md
└── test-writer/AGENT.md

Here is a complete agent file:

---
name: code-reviewer
description: Reviews code changes for quality, security, and standards compliance. Returns a structured report without modifying any files.
tools: Read Grep Glob
model: claude-sonnet-4-6
permission-mode: plan
skills:
  - review-standards
  - security-checklist
---

# Code Reviewer

You are a senior code reviewer. Your job is to analyse code changes
and produce a structured review report. You cannot edit files.

## Review Process

1. Read the diff to understand what changed
2. For each changed file, read the full file for context
3. Check against the preloaded review standards and security checklist
4. Produce a report grouped by severity

## Output Format

### Critical (must fix before merge)
- [file:line] Description of issue

### Warning (should fix)
- [file:line] Description of issue

### Suggestion (optional improvement)
- [file:line] Description of suggestion

### Summary
One paragraph assessing overall change quality.

Everything below the frontmatter --- becomes the agent's system prompt. The agent sees these instructions from its first moment of execution. It does not see your main conversation history.
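To make the split between frontmatter and system prompt concrete, here is a minimal Python sketch that separates the two parts of an agent file. It assumes the file starts with a `---` line and that the frontmatter contains only flat `key: value` pairs and `- item` lists, as in the example above. The function name `split_agent_file` is hypothetical, and this illustrates the file layout only; it is not how Claude Code itself parses agent files.

```python
def split_agent_file(text):
    """Split an agent file into (frontmatter dict, system prompt string).

    Assumption: frontmatter sits between two '---' lines and holds only
    flat 'key: value' pairs or '- item' lists. Real YAML is richer.
    """
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("agent file must start with a '---' line")
    # Locate the closing '---' that ends the frontmatter.
    end = next(
        (i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"),
        None,
    )
    if end is None:
        raise ValueError("frontmatter is missing its closing '---' line")

    frontmatter, current_key = {}, None
    for line in lines[1:end]:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("- ") and isinstance(frontmatter.get(current_key), list):
            # List item belonging to the most recent key (for example, skills).
            frontmatter[current_key].append(stripped[2:].strip())
        else:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            # An empty value means a list follows on the next lines.
            frontmatter[current_key] = value.strip() if value.strip() else []

    # Everything after the closing delimiter is the system prompt.
    body = "\n".join(lines[end + 1:]).strip()
    return frontmatter, body


example = """---
name: code-reviewer
tools: Read Grep Glob
skills:
  - review-standards
  - security-checklist
---

# Code Reviewer

You are a senior code reviewer.
"""

meta, system_prompt = split_agent_file(example)
print(meta["name"])
print(meta["skills"])
print(system_prompt.splitlines()[0])
```

The body returned here corresponds to what the agent receives as instructions; none of the frontmatter keys appear in it.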

The fields that shape agent behaviour

Agent frontmatter controls capabilities and constraints. Here are the fields that matter most.

name — The agent's identifier. Claude uses this internally when deciding to delegate.

description — How Claude decides when to delegate work to this agent. Like skill descriptions, front-load the key specialisation. Claude reads these descriptions and matches them against incoming tasks.

tools — The most powerful constraint. This field restricts which tools the agent can access:

tools: Read Grep Glob           # Read-only: can search and read, cannot edit
tools: Read Grep Glob Edit Bash # Read-write: can read, edit, and run commands
tools: Bash                     # Bash only: for scripted operations

If omitted, the agent has access to all tools. Restricting tools is one of the best reasons to use agents. A code reviewer that can only read files cannot accidentally introduce changes. A documentation agent that cannot run Bash cannot accidentally execute destructive commands.
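The effect of the tools field can be modelled as a simple allowlist. The sketch below is illustrative only: `ALL_TOOLS` is a hypothetical tool set limited to the tools named in this module, and the functions `effective_tools` and `can_modify_files` are invented for this example rather than taken from Claude Code.

```python
# Hypothetical tool set: only the tools named in this module.
ALL_TOOLS = {"Read", "Grep", "Glob", "Edit", "Bash"}


def effective_tools(tools_field=None):
    """Return the set of tools an agent may use.

    Assumption: the field is a whitespace-separated string, as in the
    examples above, and omitting it grants every tool.
    """
    if tools_field is None:
        return set(ALL_TOOLS)
    return set(tools_field.split()) & ALL_TOOLS


def can_modify_files(tools_field=None):
    """An agent can change files only if it holds Edit or Bash."""
    return bool(effective_tools(tools_field) & {"Edit", "Bash"})


print(can_modify_files("Read Grep Glob"))  # False
print(can_modify_files("Read Grep Glob Edit Bash"))  # True
print(can_modify_files())  # True: field omitted, so all tools are available
```

Note that Bash counts as a write-capable tool here, because shell commands can modify files even without Edit.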

model — Override the session model. Use claude-haiku-4-5 for fast, lightweight agents (exploration, simple analysis). Use claude-opus-4-6 for complex reasoning tasks. This lets you optimise cost and speed per agent.

permission-mode — Controls how the agent handles tool permission prompts:

| Mode | Behaviour |
| --- | --- |
| `default` | Same as your session: prompts for each tool use |
| `plan` | Agent proposes actions, you approve the plan before execution |
| `acceptEdits` | File edits auto-approved, other tools still prompt |
| `dontAsk` | All tool uses auto-approved |
| `bypassPermissions` | No prompts, no restrictions (use with caution) |

For read-only agents, dontAsk is safe because they cannot make changes. For agents that edit files, plan is a good default — the agent proposes a plan, you approve, then it executes.
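As a way to reason about the modes, the following sketch encodes the behaviour table above as a lookup function. It is a simplification under stated assumptions: `EDIT_TOOLS` is a guess at which tools count as "file edits", the return values `prompt`, `allow`, and `plan` are labels made up for this example, and the real permission system may behave differently.

```python
# Assumption: only Edit counts as a "file edit" for acceptEdits.
EDIT_TOOLS = {"Edit"}


def permission_decision(mode, tool):
    """Return 'prompt', 'allow', or 'plan' for a tool call under a mode.

    Mirrors the behaviour table in this module; not the real implementation.
    """
    if mode == "default":
        return "prompt"  # every tool use asks for approval
    if mode == "plan":
        return "plan"  # the action is proposed and waits for plan approval
    if mode == "acceptEdits":
        return "allow" if tool in EDIT_TOOLS else "prompt"
    if mode in ("dontAsk", "bypassPermissions"):
        return "allow"
    raise ValueError(f"unknown permission mode: {mode}")


for mode in ("default", "plan", "acceptEdits", "dontAsk"):
    print(mode, permission_decision(mode, "Edit"), permission_decision(mode, "Bash"))
```

Reading the output row by row shows why acceptEdits suits a documentation agent: edits go through, while anything else still asks first.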

skills — Preloads skills into the agent's context at startup. Full skill content is injected, not just descriptions. This is covered in depth in Module 6.

persistent-memory — If true, the agent maintains its own memory across sessions, stored in ~/.claude/projects/<project>/agents/<agent-name>/memory/. Useful for agents that build up project knowledge over time.

allowed-subagents — Restricts which sub-agents this agent can spawn. Prevents unbounded delegation chains.

paths — Glob patterns for auto-delegation matching. Claude only considers delegating to this agent when working with files matching these patterns.
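Glob matching for the paths field can be approximated with Python's standard `fnmatch` module. This is a sketch under assumptions: the patterns shown are hypothetical, `agent_matches` is an invented helper, and `fnmatchcase` lets `*` match `/` as well, which may not be the matching rule Claude actually applies.

```python
from fnmatch import fnmatchcase


def agent_matches(path_patterns, file_path):
    """True if file_path matches any glob pattern from the paths field.

    Assumption: shell-style matching where '*' also matches '/'.
    """
    return any(fnmatchcase(file_path, pattern) for pattern in path_patterns)


patterns = ["src/auth/*", "*.test.ts"]  # hypothetical patterns

print(agent_matches(patterns, "src/auth/login.ts"))  # True
print(agent_matches(patterns, "src/api/users.test.ts"))  # True
print(agent_matches(patterns, "docs/README.md"))  # False
```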

Agents in their own universe

The isolation: worktree option gives an agent its own copy of the repository on a separate git branch. The agent can write files, run tests, modify configurations, even break things — and nothing touches your working directory.

Worktree isolation applies when Claude spawns the agent via the Agent tool with isolation: "worktree". The agent operates in the worktree, and any changes it makes live on a separate branch that you can review, merge, or discard.

Why this matters for test generation: A test-writing agent needs to write test files and run them to verify they pass. Without worktree isolation, it writes into your working directory — potentially conflicting with your uncommitted changes. With worktree isolation, it writes and runs tests in its own copy, returning only the verified results.

Why this matters for parallel work: You can spawn multiple agents in separate worktrees simultaneously. Each one works on a different branch without interfering with the others. Agent A builds the API endpoint, Agent B writes tests, Agent C updates documentation — all in parallel, all isolated.

Lifecycle:

1. Agent spawns in a new worktree (automatic git worktree add)
2. Agent does its work on the worktree's branch
3. Agent completes and returns results
4. If no changes were made, worktree is automatically cleaned up
5. If changes were made, worktree path and branch name are returned so you can review
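The lifecycle above can be approximated by hand with standard git commands. The sketch below is an illustration, not what Claude Code runs internally: `run_in_worktree` and its `work` callable are hypothetical, paths are assumed to be absolute, and "changes were made" is simplified to "the worktree has uncommitted changes" (commits on the branch would need an extra check).

```python
import subprocess


def run_in_worktree(repo, path, branch, work, run=subprocess.run):
    """Create a worktree, run work(path), and clean up if nothing changed.

    'work' stands in for the agent's task. 'run' is injectable so the
    flow can be exercised without a real repository.
    """
    # Step 1: create a new worktree on a new branch.
    run(["git", "-C", repo, "worktree", "add", "-b", branch, path], check=True)

    # Steps 2-3: the agent does its work and produces a result.
    result = work(path)

    # Steps 4-5: check the worktree for uncommitted changes.
    status = run(
        ["git", "-C", path, "status", "--porcelain"],
        check=True,
        capture_output=True,
        text=True,
    )
    if status.stdout.strip():
        # Changes exist: hand back the location so they can be reviewed.
        return {"result": result, "worktree": path, "branch": branch}

    # No changes: remove the worktree. The branch itself is left in place;
    # deleting it would take a separate 'git branch -d'.
    run(["git", "-C", repo, "worktree", "remove", path], check=True)
    return {"result": result, "worktree": None, "branch": None}
```

Giving each agent its own `path` and `branch` is what keeps parallel agents from interfering with one another: each call operates on a separate checkout.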

You want Claude to generate tests for a new feature. The tests need to import the feature code, run, and pass. Which agent configuration handles this safely?

Three agents you should build

Agent 1: The Explorer

The most immediately useful agent. Before starting work on an unfamiliar part of your codebase, delegate exploration to an agent:

---
name: explorer
description: Maps unfamiliar codebase areas by reading files, tracing dependencies, and returning a structured architecture summary
tools: Read Grep Glob
model: claude-sonnet-4-6
permission-mode: dontAsk
---

# Codebase Explorer

You are a codebase explorer. Your job is to understand a specific
area of the codebase and return a clear, structured summary.

## Process
1. Start by reading the entry point files
2. Trace imports and dependencies
3. Map the data flow
4. Identify key patterns and conventions

## Output Format
- **Architecture**: How the components fit together
- **Key Files**: Most important files and their roles
- **Data Flow**: How data moves through the system
- **Patterns**: Conventions and patterns used
- **Dependencies**: External dependencies and their purpose

You ask Claude to "explore the authentication system" and it delegates to this agent. The agent reads 30+ files in its own context, maps the architecture, and returns a clean summary. Your conversation gets the summary — not the 30 files of raw content.

Agent 2: The Debugger

---
name: debugger
description: Investigates bugs by reading code, tracing execution paths, checking logs, and identifying root causes with minimal fix recommendations
tools: Read Grep Glob Bash
model: claude-opus-4-6
permission-mode: plan
---

# Production Debugger

You are a specialist debugger. Given a bug report, your job
is to identify the root cause and recommend a minimal fix.

## Process
1. Understand the expected vs actual behaviour
2. Identify the code path involved
3. Read relevant source files and tests
4. Trace the execution path to find where it diverges
5. Check for recent changes that might have caused the regression

## Output Format
- **Root Cause**: What specifically is wrong and where
- **Evidence**: Files and lines that demonstrate the issue
- **Minimal Fix**: The smallest change that resolves the issue
- **Risk Assessment**: What else might this fix affect

Agent 3: The Documentation Writer

---
name: doc-writer
description: Generates or updates documentation by reading source code and producing accurate, well-structured markdown docs
tools: Read Grep Glob Edit
model: claude-sonnet-4-6
permission-mode: acceptEdits
---

# Documentation Writer

You generate and update documentation based on current source code.
Never document speculative or planned features — only what exists.

## Process
1. Read the source code to understand current behaviour
2. Check existing docs for outdated information
3. Write or update documentation that matches the code

## Standards
- Use present tense ("The function returns..." not "will return")
- Include code examples from actual source when possible
- Document parameters, return values, and error cases
- Keep explanations concise — developers are the audience

Module 5 — Final Assessment

What is the fundamental difference between a skill and an agent in terms of context?

You set an agent's tools to 'Read Grep Glob' and deliberately leave out Edit. A colleague asks you why. What is the reason?

An agent with persistent-memory: true stores its memory where?

You want to run three agents simultaneously: one building an API endpoint, one writing tests, and one updating docs. What prevents them from conflicting?