# Creating a new agent
QA Orchestra ships with 10 agents covering common quality-engineering questions. Your team might need more — a compliance reviewer for regulated industries, a performance reviewer that knows your SLOs, a security-focused reviewer for sensitive endpoints. Agents are standalone Markdown files. No SDK, no build step, no registration. If you can edit Markdown, you can write an agent.
## Anatomy of an agent file

Every agent lives at `.claude/agents/<agent-name>.md` and has two parts: YAML frontmatter and a body.
### 1. YAML frontmatter

```yaml
---
name: compliance-reviewer
description: Compares diffs against regulatory compliance requirements (SOC 2, HIPAA, GDPR) and flags gaps
model: opus
tools: Read, Glob, Grep, Bash
---
```
| Field | Purpose | Rules |
|---|---|---|
| `name` | Agent identifier, used in `@agent-name` invocations and the Agent tool's `subagent_type`. | Must exactly match the filename (without `.md`). The structural linter enforces this. |
| `description` | One-line summary shown in autocomplete and the Claude Code agent picker. | Start with a verb. Be concrete. "Compares diffs against AC" beats "Reviews code". |
| `model` | Which Claude model family the agent runs on. | `sonnet`, `opus`, or `haiku`. Use `opus` for heavy reasoning (functional review, release analysis), `sonnet` for most work, `haiku` for fast formatting or routing. |
| `tools` | Comma-separated list of tools the agent is allowed to use. | Start with `Read, Glob, Grep, Bash`. Add `Write`, `Edit` only if the agent creates or modifies files. Add `Agent` only if it dispatches to sub-agents. |
### 2. The body

Every agent file must have three sections. The structural linter (`scripts/validate-agents.mjs`) enforces their presence:

- `## Role` — one paragraph describing who the agent is and what they do. Reference `context/CONTEXT.md` and the `context/annotations/` directory where relevant. Be explicit about scope: what the agent does *not* do is often as important as what it does.
- `## Output format` — the exact shape of the agent's output. If it writes to `qa-output/`, describe the file and its top-level structure. If it returns inline results, show a template.
- `## Rules` — non-negotiables. What the agent must and must never do. "Never fabricate findings", "always cite file:line", "stop and ask on ambiguity instead of assuming."
Most agents also include free-form intermediate sections: `## Prerequisites`, `## Step 1 — Gather inputs`, `## Analysis framework`, `## Severity definitions`. These are optional — add them when "Role + Rules" alone isn't enough scaffolding for a complex task.
### The trigger block

By convention, every agent body starts with a trigger block right after the H1 heading, stating what the agent reads and what it writes:
> **Trigger**: A code diff and compliance rules need to be compared.
> **Reads**: Diff + `context/CONTEXT.md` + compliance policies.
> **Writes**: `qa-output/compliance-review.md`
The linter does not enforce this, but downstream agents rely on it. The bug-reporter agent knows `qa-output/functional-review.md` exists because functional-reviewer's trigger block says it writes there. Keep the convention consistent.
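Putting the conventions together, a body skeleton for a hypothetical compliance-reviewer might look like this (all content below is illustrative, not a shipped agent):

```markdown
# Compliance Reviewer

> **Trigger**: A code diff and compliance rules need to be compared.
> **Reads**: Diff + `context/CONTEXT.md` + compliance policies.
> **Writes**: `qa-output/compliance-review.md`

## Role

You compare code diffs against the compliance policies that apply to this
project (read them from `context/CONTEXT.md`). You do not review functional
correctness; that is functional-reviewer's job.

## Output format

Write `qa-output/compliance-review.md`: one section per flagged gap, each
citing file:line and the policy clause it violates.

## Rules

- Never fabricate findings.
- Always cite file:line.
- Stop and ask on ambiguity instead of assuming.
```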
## The fast path: copy an existing agent

The easiest way to start is to copy the agent whose shape is closest to what you want, then edit.
| You're building… | Start from | Why |
|---|---|---|
| A new kind of reviewer (compliance, security, performance) | `functional-reviewer.md` | Already has the AC-vs-diff comparison framework, severity handling, and gap-report structure. Swap `## Role` for your domain. |
| A new scenario or test generator | `test-scenario-designer.md` | Already has the happy / negative / boundary / edge coverage framework. |
| A routing or planning agent | `orchestrator.md` | Already shows the pattern for reading inputs, deciding what to run, and producing a plan file. |
| A browser-driven agent | `browser-validator.md` | Already wired to Chrome MCP with the per-scenario loop and evidence-capture conventions. |
| A structured transformer (findings → reports, scenarios → tests) | `bug-reporter.md` or `automation-writer.md` | Already has the "one input = one output" pattern and chaining conventions. |
Copy the file, rename it, update `name` in the frontmatter to match the new filename, and rewrite `## Role` for your domain. Keep `## Output format` and `## Rules` — they usually only need small tweaks.
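As a sketch, deriving a hypothetical compliance-reviewer from functional-reviewer is a copy, a rename, and one frontmatter edit. The first two lines only create a stand-in source file so the example is self-contained; in the real repo you would start from the shipped agent:

```shell
# Stand-in for the real .claude/agents/functional-reviewer.md (so this sketch runs anywhere).
mkdir -p .claude/agents
printf -- '---\nname: functional-reviewer\nmodel: opus\n---\n' > .claude/agents/functional-reviewer.md

# Copy, rename, and point `name` at the new filename (GNU sed shown).
cp .claude/agents/functional-reviewer.md .claude/agents/compliance-reviewer.md
sed -i 's/^name: functional-reviewer$/name: compliance-reviewer/' .claude/agents/compliance-reviewer.md
grep '^name:' .claude/agents/compliance-reviewer.md   # prints: name: compliance-reviewer
```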
Do not invent new top-level sections just to be different. The linter tolerates extra sections, but contributors and downstream agents appreciate consistency. If you need a new kind of structure, open an issue first and discuss.
## Validation

Before submitting a new agent, run the structural linter locally:

```shell
node scripts/validate-agents.mjs
```
The linter checks:

- YAML frontmatter has `name`, `description`, `model`, `tools`
- Filename matches the frontmatter `name`
- `model` is one of `sonnet`, `opus`, `haiku`
- Body has an H1 heading
- Body has `## Role`, `## Output format`, and `## Rules` sections
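Those checks boil down to a handful of string and regex tests. A minimal sketch of the idea (illustrative only; the real `scripts/validate-agents.mjs` may differ in detail):

```javascript
// Sketch of the structural checks; not the shipped linter.
function validateAgent(filename, source) {
  const errors = [];
  const match = source.match(/^---\n([\s\S]*?)\n---\n([\s\S]*)$/);
  if (!match) return ["missing YAML frontmatter"];
  const [, yaml, body] = match;
  // Parse simple `key: value` frontmatter lines.
  const fields = {};
  for (const line of yaml.split("\n")) {
    const i = line.indexOf(":");
    if (i > 0) fields[line.slice(0, i)] = line.slice(i + 1).trim();
  }
  for (const key of ["name", "description", "model", "tools"])
    if (!fields[key]) errors.push(`frontmatter missing "${key}"`);
  if (fields.name && fields.name !== filename.replace(/\.md$/, ""))
    errors.push("name does not match filename");
  if (fields.model && !["sonnet", "opus", "haiku"].includes(fields.model))
    errors.push("model must be sonnet, opus, or haiku");
  if (!/^# /m.test(body)) errors.push("body missing H1 heading");
  for (const section of ["## Role", "## Output format", "## Rules"])
    if (!body.includes(`\n${section}`)) errors.push(`body missing "${section}" section`);
  return errors;
}
```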
Passing the linter means the file is structurally valid. It does not guarantee the agent produces good output — that's what the reference tests under `tests/` are for. The same lint runs in CI on every push and PR via `.github/workflows/lint.yml`.
## Documenting the agent

After adding the file, update three things:

1. `README.md` — add a row to the appropriate tier table.
   - Tier 1 if the agent is a standalone daily driver (answers one question, runs independently, produces a Markdown file a human can paste into a ticket).
   - Tier 2 if the agent is part of the live-validation chain (needs another agent to have run first, e.g. needs a running app).
   - Tier 3 if the agent is an orchestration or niche supporting agent.
2. `CLAUDE.md` — add a row to the agent map with the agent's Reads and Writes contract. This is what downstream agents consult to know the chain.
3. `AGENTS.md` — only if the agent has cross-agent dependencies worth calling out in the behavioral rules.
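The agent-map row in CLAUDE.md is the trigger block's contract restated in table form. A hypothetical row for a compliance-reviewer (your CLAUDE.md's column names may differ):

```markdown
| Agent | Reads | Writes |
|---|---|---|
| compliance-reviewer | Diff + `context/CONTEXT.md` + compliance policies | `qa-output/compliance-review.md` |
```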
## Testing your new agent

Add a worked example under `tests/<your-agent-name>/`:

- `input-*.md` — realistic input(s) the agent will consume. For a reviewer, a sample diff + sample AC. For a generator, a sample AC or finding. Keep it small — a 10-line diff and a 5-AC ticket usually exercise the interesting paths.
- `expected-output.md` — what a "good enough" output looks like. Not a string-match target. A human reading both files should be able to say "yes, this shape is what I'd expect on this input".
See `tests/README.md` and the existing examples under `tests/functional-reviewer/` and `tests/test-scenario-designer/` for the pattern.
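For the hypothetical compliance-reviewer, the reference directory might look like this (file names after `input-` are your choice):

```
tests/compliance-reviewer/
├── input-diff.md          # a small diff touching a sensitive field
├── input-policy.md        # the policy excerpt the diff must respect
└── expected-output.md     # a gap report citing file:line for each finding
```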
LLMs aren't deterministic, and string-equality tests on LLM output produce brittle failures on every run. The reference-example pattern is slower (it needs a human to glance at the diff), but it actually catches regressions in the shape of an agent's answer — missing sections, missing AC verdicts, missing evidence citations — which is what usually breaks when someone edits a Role section carelessly.
## Stack-agnostic design

QA Orchestra agents never hardcode framework-specific details. All project-specific commands, URLs, file paths, and conventions come from `context/CONTEXT.md` (see how context works). When writing a new agent:

- Do not hardcode `npm`, `pytest`, `cargo`, or any toolchain. Read the run command from CONTEXT.md's Automation Framework section.
- Do not hardcode repo names or paths. Read them from the Repositories section.
- Do not hardcode health-check URLs, content markers, or log sentinels. Read them from Health Check.
- Do not hardcode severity definitions, AC format, or terminology. Read them from Project Management and Preferences.

If your agent needs something CONTEXT.md doesn't define, add it to the schema at `context/CONTEXT.schema.md` as an optional section, and the user who installs your agent will populate it for their stack.
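For instance, if a hypothetical compliance-reviewer needs to know which policies apply, the optional section added to `context/CONTEXT.schema.md` might look like this (the section name and fields are illustrative):

```markdown
## Compliance (optional)

- **Frameworks**: which regimes apply (e.g. SOC 2, HIPAA, GDPR)
- **Policy location**: where the policy documents live in the repo
```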
## Final checklist

- Agent file at `.claude/agents/<name>.md` with valid YAML frontmatter
- Body has `## Role`, `## Output format`, and `## Rules` sections
- Trigger block at the top naming Reads / Writes
- `node scripts/validate-agents.mjs` passes locally
- Row added to the appropriate tier in `README.md`
- Row added to the agent map in `CLAUDE.md`
- Reference example in `tests/<name>/` with realistic input + expected output
- Commit message is one line, descriptive, imperative (e.g. `feat: add compliance-reviewer agent`)
Open a PR against `main`. CI will run the structural linter automatically. Once it passes and a maintainer approves, your agent is in.
## Next

- How context works — the `CONTEXT.md` schema your new agent will read from
- The learning loop — how your agent should contribute to `context/annotations/`
- Existing agents on GitHub — read real examples before copying