Skills Inside Agents

The knowledge gap in isolated agents

You now know how to build skills (on-demand knowledge) and agents (isolated execution). Each is powerful on its own. But there is a problem when you use agents alone.

An agent starts with an almost empty context. It has only its system prompt from AGENT.md, the project's CLAUDE.md, and whatever you tell it in the delegation prompt. It does not have your team's code review checklist. It does not know your testing conventions. It does not know that your React components use named exports and co-located test files.

So what happens? The agent starts exploring. It reads your codebase to figure out conventions. It infers patterns from existing code. This takes time, consumes its context window, and the inferences are not always correct. The agent might produce a code review that checks for things your team does not care about and misses things your team considers critical.

Skills fix this. When you preload skills into an agent using the skills field in the agent's frontmatter, the full content of those skills is injected into the agent's context at startup. The agent starts with institutional knowledge already loaded — your standards, your conventions, your checklists — before it reads a single file.

This is the difference between hiring a contractor who arrives on day one and needs two weeks to learn your processes, and hiring a contractor who has already read your operations manual.

Think about a task you regularly delegate or wish you could delegate. What knowledge would the person handling it need to have before starting?

The skills field in agent frontmatter

The mechanics are straightforward. In your agent's AGENT.md frontmatter, list the skills you want preloaded:

---
name: code-reviewer
description: Reviews code changes against team standards
tools: Read Grep Glob
skills:
  - review-standards
  - security-checklist
---

When this agent spawns, Claude Code:

1. Creates a new context window for the agent
2. Loads the agent's system prompt (the markdown body)
3. Loads the project's CLAUDE.md
4. Loads the full content of the review-standards and security-checklist skills
5. Begins executing the delegated task
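The loading order above can be pictured as plain string assembly. The sketch below is a simplified mental model, not Claude Code's actual implementation: the function name `build_agent_context`, the section labels, and the sample strings are all hypothetical.

```python
def build_agent_context(system_prompt, claude_md, skills, task):
    """Assemble an agent's starting context in the order described above.

    `skills` maps skill name -> full SKILL.md body. All names here are
    illustrative; this is a mental model, not Claude Code's internals.
    """
    parts = [
        ("system prompt", system_prompt),
        ("CLAUDE.md", claude_md),
    ]
    # Preloaded skills are injected in full, not as one-line descriptions.
    for name, body in skills.items():
        parts.append((f"skill: {name}", body))
    # The delegated task comes last, after all standing knowledge.
    parts.append(("task", task))
    return "\n\n".join(f"## {label}\n{text}" for label, text in parts)


context = build_agent_context(
    system_prompt="You are a senior code reviewer.",
    claude_md="Use pnpm. TypeScript strict mode.",
    skills={
        "review-standards": "Critical: hardcoded secrets.",
        "security-checklist": "Validate all public input.",
    },
    task="Review the changes on this branch.",
)
print(context)
```

The point of the model is the ordering: by the time the agent sees the task, the standards are already in its context.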

The key difference from normal skill behaviour: in your main session, Claude loads only skill descriptions (short text used for matching). Agents with the skills field get the full content — every instruction, every checklist item, every template. This is more expensive in tokens but ensures the agent has complete domain knowledge from the start.
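To get a feel for the token trade-off, you can compare the size of a skill's description with the size of its full body. This is a back-of-the-envelope sketch: the 4-characters-per-token ratio is a rough assumption, not a real tokenizer, and the skill body is a stand-in.

```python
def rough_tokens(text, chars_per_token=4):
    """Very rough token estimate; the 4-chars-per-token ratio is an assumption."""
    return max(1, len(text) // chars_per_token)


skill = {
    "description": "Team code review criteria and severity definitions",
    # Stand-in for a full SKILL.md body of roughly 2,000 characters.
    "body": "- Missing error handling on async operations\n" * 45,
}

# Main session: only the description is loaded for matching.
main_session_cost = rough_tokens(skill["description"])
# Agent with the skills field: the whole body is loaded at startup.
agent_cost = rough_tokens(skill["body"])

print(f"description only: ~{main_session_cost} tokens")
print(f"full content:     ~{agent_cost} tokens")
```

The absolute numbers are unreliable, but the ratio shows why preloading is worth reserving for skills the agent needs on every run.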

Design implication: Skills you plan to preload into agents should be written as complete, standalone reference documents. They should not assume the reader has access to your conversation history or earlier context. Write them as if they are the only instructions someone has for doing that job.
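One way to check a skill for hidden dependencies on conversation context is to scan it for phrases that point outside the document. The sketch below is a heuristic only: the phrase list and the function name `find_context_leaks` are hypothetical and should be tuned to how your team writes.

```python
import re

# Phrases that suggest the skill leans on context the agent will not have.
# This list is an illustrative starting point, not an exhaustive rule set.
CONTEXT_DEPENDENT = [
    r"\bas (?:we )?discussed\b",
    r"\bas mentioned (?:earlier|above|before)\b",
    r"\bsee (?:the )?(?:conversation|chat)\b",
    r"\blike last time\b",
]


def find_context_leaks(skill_text):
    """Return (line_number, line) pairs that reference outside context."""
    hits = []
    for number, line in enumerate(skill_text.splitlines(), start=1):
        if any(re.search(p, line, re.IGNORECASE) for p in CONTEXT_DEPENDENT):
            hits.append((number, line.strip()))
    return hits


draft = """# Review Standards
Use the severity levels as discussed.
Hardcoded secrets block the merge.
"""
print(find_context_leaks(draft))
# [(2, 'Use the severity levels as discussed.')]
```

A hit does not prove the skill is broken, but each one is a sentence an isolated agent cannot resolve.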

Pattern 1: The Restricted Reviewer

This is the most immediately useful pattern: a code review agent that can only read code, loaded with your team's exact review criteria.

The agent file (.claude/agents/reviewer/AGENT.md):

---
name: reviewer
description: Reviews code changes for quality, security, and convention compliance using team-specific review criteria. Cannot modify files.
tools: Read Grep Glob
model: claude-sonnet-4-6
permission-mode: dontAsk
skills:
  - code-review-standards
  - security-checklist
---

# Code Reviewer

You are a senior code reviewer for this project. You have been
preloaded with two skills containing the team's review standards
and security checklist. Use them as your evaluation criteria.

## Process
1. Identify all changed files
2. Read each changed file in full (not just the diff — you need context)
3. Evaluate every change against the preloaded standards
4. Check every change against the security checklist
5. Produce a structured report

## Rules
- Do not raise style issues that conflict with the preloaded standards
- Do not suggest patterns that are not already used in this codebase
- Severity must match the standards: if the standards list hardcoded secrets under Critical, a hardcoded secret is Critical, not a Warning
- If unsure about a convention, check three existing files for precedent before flagging

The review standards skill (.claude/skills/code-review-standards/SKILL.md):

---
name: code-review-standards
description: Team code review criteria and severity definitions
user-invocable: false
---

# Code Review Standards

## Critical (blocks merge)
- Type safety violations: implicit any, unsafe type assertions
- Missing error handling on async operations
- Direct database queries outside the data access layer
- Hardcoded secrets or credentials
- Missing input validation on public-facing endpoints

## Warning (should fix before merge)
- Functions exceeding 50 lines
- Missing TypeScript return type annotations on exported functions
- Console.log statements (use structured logger)
- Magic numbers without named constants

## Suggestion (optional)
- Variable naming improvements
- Code organisation within files
- Documentation additions

Why this works: The agent cannot edit your code (tools: Read Grep Glob only). It reviews against your actual criteria (preloaded skills), not generic best practices. It runs in its own context, so reading 30 changed files does not pollute your conversation. And because the skills are version-controlled files, every developer on your team gets the same review criteria.
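The read-only guarantee depends entirely on the tools line, so it is worth checking mechanically. The sketch below is a minimal check under stated assumptions: the set of write-capable tools is taken from the tool names used in this lesson, and the helper names are hypothetical. An agent with no tools line is treated as not read-only, on the assumption that it may fall back to a broader default set.

```python
# Tools that can change files or run commands. The set is an assumption
# based on the tool names used in this lesson; extend it for your setup.
WRITE_CAPABLE = {"Edit", "Write", "Bash"}


def parse_tools(frontmatter):
    """Extract tool names from a `tools:` line (space or comma separated)."""
    for line in frontmatter.splitlines():
        if line.startswith("tools:"):
            value = line.split(":", 1)[1]
            return [t for t in value.replace(",", " ").split() if t]
    return []


def is_read_only(frontmatter):
    """True when the agent declares tools and none of them can write."""
    tools = parse_tools(frontmatter)
    return bool(tools) and WRITE_CAPABLE.isdisjoint(tools)


reviewer = """name: reviewer
tools: Read Grep Glob
"""
test_writer = """name: test-writer
tools: Read Grep Glob Edit Write Bash
"""
print(is_read_only(reviewer))     # True
print(is_read_only(test_writer))  # False
```

A check like this could run in CI so that nobody widens the reviewer's tool list without the change being noticed.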

Pattern 2: The Codebase Navigator

Every developer has experienced this: you need to work on an unfamiliar part of the codebase. Before writing a single line of code, you need to understand how that area is structured — the key files, the data flow, the patterns in use. If you do this exploration in your main conversation, the context fills with file contents that are useful for understanding but useless for implementation.

The Codebase Navigator is an exploration agent preloaded with an architecture skill that describes your project structure. Without the skill, the agent would spend its first actions figuring out where things are. With it, the agent already knows your directory structure, naming conventions, and key patterns — it goes straight to the relevant code.

The agent file (.claude/agents/navigator/AGENT.md):

---
name: navigator
description: Explores and maps unfamiliar areas of the codebase, returning a structured architecture summary with key files, data flow, and patterns
tools: Read Grep Glob
model: claude-sonnet-4-6
permission-mode: dontAsk
skills:
  - project-architecture
---

# Codebase Navigator

You are a codebase exploration specialist. You already know the
project's high-level architecture from the preloaded skill. Use
that knowledge to navigate efficiently.

## Process
1. Start from the entry point most relevant to the requested area
2. Trace imports and dependencies (you know the directory conventions)
3. Map the data flow from input to output
4. Identify patterns specific to this area vs project-wide patterns

## Output Format

### Area: [name of the area explored]

**Entry Points:** List the 3-5 most important files

**Data Flow:**
1. [Input] → [Processing] → [Output]

**Key Patterns:**
- Pattern name: where it's used and why

**Dependencies:**
- Internal: which other areas of the codebase this connects to
- External: which packages/services this depends on

**Gotchas:**
- Anything surprising or non-obvious about this area

The architecture skill (.claude/skills/project-architecture/SKILL.md):

---
name: project-architecture
description: Project directory structure, naming conventions, and key architectural patterns
user-invocable: false
---

# Project Architecture

## Directory Structure
- src/app/ — Next.js App Router pages and layouts
- src/components/ — React components (co-located with tests)
- src/api/ — API route handlers
- src/lib/ — Shared utilities and business logic
- convex/ — Convex backend (schema, mutations, queries)
- content/ — MDX content files

## Naming Conventions
- Components: PascalCase files, named exports
- Utilities: camelCase files, named exports
- API routes: route.ts in directory matching the URL path
- Tests: __tests__/[name].test.tsx co-located with source

## Key Patterns
- Server components by default, 'use client' only when needed
- Data fetching in server components, passed as props
- Convex for real-time data, API routes for external integrations

Adapt this skill to describe your actual project. The more specific the architecture skill, the more efficient the navigator agent becomes.
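If you are starting the architecture skill from scratch, a short script can produce the raw directory list for you to annotate. This is a sketch, not a complete generator: the function name `directory_outline`, the depth limit, and the skip list are assumptions, and the descriptions still have to be written by someone who knows the project.

```python
import tempfile
from pathlib import Path


def directory_outline(root, max_depth=2, skip=(".git", "node_modules")):
    """List directories under `root` as bullet lines for a skill draft."""
    root = Path(root)
    lines = []

    def walk(folder, depth):
        if depth > max_depth:
            return
        for child in sorted(p for p in folder.iterdir() if p.is_dir()):
            if child.name in skip:
                continue
            rel = child.relative_to(root).as_posix()
            # The description is left as a placeholder for a human to fill in.
            lines.append(f"- {rel}/ (TODO: describe)")
            walk(child, depth + 1)

    walk(root, 1)
    return "\n".join(lines)


# Demo on a throwaway tree so the example does not depend on your repo.
with tempfile.TemporaryDirectory() as tmp:
    for folder in ("src/app", "src/components", "convex", "node_modules/react"):
        (Path(tmp) / folder).mkdir(parents=True)
    outline = directory_outline(tmp)

print("## Directory Structure")
print(outline)
```

To use it on a real project, call `directory_outline(".")` from the repository root and paste the result into the skill as a first draft.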

Pattern 3: The Safe Test Writer

Test generation is one of the most compelling agent use cases because it combines three needs: domain knowledge (your testing conventions), execution freedom (writing files, running tests), and safety (do not break the developer's working directory).

The Safe Test Writer runs in a worktree, preloaded with your testing conventions and component patterns. It writes test files, runs them, iterates on failures, and returns verified, passing tests on a separate branch.

The agent file (.claude/agents/test-writer/AGENT.md):

---
name: test-writer
description: Generates tests for new or modified code, running them in an isolated worktree to verify they pass before returning results
tools: Read Grep Glob Edit Write Bash
model: claude-sonnet-4-6
permission-mode: acceptEdits
isolation: worktree  # assumption: your Claude Code version supports this frontmatter field
skills:
  - testing-conventions
  - component-patterns
---

# Test Writer

You write tests in an isolated worktree. You can freely create
files, run tests, and iterate. Nothing you do affects the main
working directory.

## Process
1. Read the source code to understand what needs testing
2. Review the preloaded testing conventions for structure and patterns
3. Write test files following those conventions exactly
4. Run the tests
5. If tests fail, read the errors, fix the tests, and run again
6. Repeat until all tests pass
7. Report which files you created and their test coverage

## Rules
- Follow the naming and structure from testing-conventions exactly
- Test happy paths first, then error cases, then edge cases
- Mock external dependencies, never internal modules
- Each test should test one behaviour — no multi-assertion tests
- If a test requires complex setup, that is a sign the source code
  may need refactoring — note this in your report, do not refactor

The testing conventions skill (.claude/skills/testing-conventions/SKILL.md):

---
name: testing-conventions
description: Team testing patterns, assertion style, and structure conventions
user-invocable: false
---

# Testing Conventions

## Structure
- Test files: __tests__/[source-name].test.ts(x)
- Describe blocks: one per function or component
- Test names: "should [expected behaviour] when [condition]"

## Assertion Style
- Use expect().toBe() for primitives
- Use expect().toEqual() for objects and arrays
- Use expect().toThrow() for error cases
- Never use expect().toBeTruthy() — be specific about what you expect

## Mocking
- Mock external HTTP calls with msw
- Mock database with in-memory fixtures
- Never mock the module under test
- Reset all mocks in afterEach

## Running
- pnpm test — run all tests
- pnpm test [path] — run specific test file
- pnpm test --coverage — run with coverage report

The combination of worktree isolation + preloaded skills means the agent can freely experiment, iterate, and verify — while producing tests that match your team's conventions from the first draft.
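Because the conventions in the skill are concrete, the agent's output can also be checked against them after the fact. The sketch below covers only the file path and test name rules from the testing-conventions skill; the regular expressions and the function name `check_test` are hypothetical simplifications.

```python
import re

# Conventions taken from the testing-conventions skill above:
#   file path:  __tests__/[source-name].test.ts(x)
#   test name:  "should [expected behaviour] when [condition]"
TEST_PATH = re.compile(r"(^|/)__tests__/[^/]+\.test\.tsx?$")
TEST_NAME = re.compile(r"^should .+ when .+$")


def check_test(path, name):
    """Return a list of convention problems for one test (empty if none)."""
    problems = []
    if not TEST_PATH.search(path):
        problems.append(f"path does not match __tests__/*.test.ts(x): {path}")
    if not TEST_NAME.match(name):
        problems.append(f"name is not 'should ... when ...': {name}")
    return problems


print(check_test(
    "src/components/__tests__/Button.test.tsx",
    "should call onClick when the button is pressed",
))  # []

for problem in check_test("src/components/Button.spec.tsx", "handles click"):
    print(problem)
```

A check like this is a safety net for reviewing the agent's branch. The preloaded skill remains the primary way the conventions get applied.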

Pattern 4: The Parallel Pipeline

This is the most ambitious pattern. You write one prompt describing a feature. Claude's main thread creates an implementation plan. Then it spawns multiple agents simultaneously — each in its own worktree, each preloaded with the same shared skills for consistency.

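The setup: four agents built on a shared set of skills.

Shared skills:

- coding-standards: naming conventions, error handling, patterns (preloaded by all four agents)
- project-architecture: directory structure, key patterns, conventions (preloaded by every agent except the test writer, which loads testing-conventions instead)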

Agent 1: API Builder — builds the backend endpoint

---
name: api-builder
tools: Read Grep Glob Edit Write Bash
skills: [coding-standards, project-architecture]
---
Build the API endpoint following project conventions.
Run the linter after creating files.

Agent 2: Frontend Builder — builds the UI component

---
name: frontend-builder
tools: Read Grep Glob Edit Write Bash
skills: [coding-standards, project-architecture]
---
Build the React component following project conventions.
Ensure it handles loading, error, and success states.

Agent 3: Test Writer — generates tests for both

---
name: test-writer
tools: Read Grep Glob Edit Write Bash
skills: [coding-standards, testing-conventions]
---
Generate tests for both the API endpoint and the frontend component.
Run all tests and iterate until they pass.

Agent 4: Doc Writer — updates documentation

---
name: doc-writer
tools: Read Grep Glob Edit
skills: [coding-standards, project-architecture]
---
Update the relevant documentation to reflect the new feature.
Document the API endpoint, component props, and usage examples.

Four agents, four worktrees, one set of standards. Each agent already knows your coding conventions and architecture before it writes a single line. The code they produce independently is consistent because they share the same skills.
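You can confirm which skills really are common to every agent by intersecting their skills lists. The sketch below copies the skills lines from the four agent files above; the tiny inline-list parser is a hypothetical helper that handles only the `[a, b]` form used here, not general YAML.

```python
import re

# Frontmatter `skills:` lines copied from the four agents above.
AGENTS = {
    "api-builder": "skills: [coding-standards, project-architecture]",
    "frontend-builder": "skills: [coding-standards, project-architecture]",
    "test-writer": "skills: [coding-standards, testing-conventions]",
    "doc-writer": "skills: [coding-standards, project-architecture]",
}


def parse_inline_skills(line):
    """Parse an inline list such as `skills: [a, b]` into a set of names."""
    match = re.match(r"skills:\s*\[(.*)\]\s*$", line)
    if not match:
        return set()
    return {item.strip() for item in match.group(1).split(",") if item.strip()}


skill_sets = {name: parse_inline_skills(line) for name, line in AGENTS.items()}

# Skills every agent preloads: the common baseline for consistency.
shared_by_all = set.intersection(*skill_sets.values())
print(sorted(shared_by_all))  # ['coding-standards']

# Agents missing a given skill, useful when a skill is meant to be universal.
missing_arch = sorted(
    name for name, skills in skill_sets.items()
    if "project-architecture" not in skills
)
print(missing_arch)  # ['test-writer']
```

Here coding-standards is the one skill all four agents share, and the test writer trades project-architecture for testing-conventions. If you want the architecture knowledge to be universal as well, add it to that agent's list.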

After completion, you have four branches to review. The API endpoint matches your patterns. The component matches your patterns. The tests match your testing conventions. The docs describe the feature using the same standards and architecture the other agents followed. Merge them in sequence.

In the parallel pipeline pattern, what ensures consistency across four independent agents?


Module 6 — Final Assessment

In your main session, Claude loads skill descriptions. When an agent preloads a skill via the skills field, what does it load instead?

You build a code review agent with tools: Read Grep Glob and preloaded review standards. A teammate asks: 'Why not just put the review standards in the agent's system prompt?' What is the advantage of using a skill?

An agent without preloaded skills needs to understand your codebase conventions. How does it typically learn them?

In Pattern 3 (Safe Test Writer), the agent runs in a worktree with full tools. What happens if it writes a broken test file?

You have 5 agents that all need to follow the same coding standards. One approach is to copy the standards into each agent's system prompt. What is the problem with this?