What is an AI agent harness?

An agent harness is the configuration layer that turns a base AI coding agent like Claude Code into a specialized development environment. It consists of CLAUDE.md project instructions, skills (reusable prompt templates), hooks (shell scripts triggered by lifecycle events), memory (persistent file-based context), and subagents (isolated child processes for parallel work). The harness constrains, extends, and orchestrates the agent's behavior without modifying its source code.

What is the difference between skills and hooks in Claude Code?

Skills are prompt templates invoked by the user or agent (like slash commands) that inject context and instructions. Hooks are shell scripts triggered automatically by lifecycle events (PreToolUse, PostToolUse, etc.) that run outside the model's control — the model cannot skip or modify them. Skills tell the agent what to do; hooks enforce what must happen regardless of what the agent decides.

How do subagents work in Claude Code?

Subagents are isolated child processes spawned via the Agent tool. Each subagent gets its own context window, can run in the background, and can optionally use git worktrees for file isolation. They communicate results back to the parent through a single return message. Subagents enable parallel research, code review, testing, and multi-agent deliberation patterns.

How does memory work across Claude Code sessions?

Claude Code uses file-based memory stored in ~/.claude/projects/ /memory/. A MEMORY.md index file is automatically loaded into every conversation. Individual memory files use YAML frontmatter with type (user, feedback, project, reference), name, and description fields. Memory persists across sessions and conversations, enabling the agent to retain user preferences, project context, and learned corrections.

When should I use a skill vs a hook vs a subagent?

Use skills for reusable workflows the user or agent invokes on demand (code review, deployment, translation). Use hooks for mandatory enforcement that must happen regardless of agent behavior (security checks, formatting, logging). Use subagents for parallel or isolated work that would clutter the main context (research, testing, background tasks). Use memory for information that must persist across sessions.

Agent Architecture: Building AI-Powered Development Harnesses

# The complete system for building production AI agent harnesses. Skills, hooks, memory, subagents, multi-agent orchestration, and the patterns that make AI coding agents reliable infrastructure.

words: 21755 read_time: 109m updated: 2026-06-10 00:00

TL;DR: Claude Code is not a chat box with file access. It is a programmable runtime with 29 documented lifecycle events, each hookable with shell scripts the model cannot skip. Stack hooks into dispatchers, dispatchers into skills, skills into agents, agents into workflows, and you get an autonomous development harness that enforces constraints, delegates work, persists memory across sessions, and orchestrates multi-agent deliberation. Claude Code v2.1.147 added the off-by-default Workflow tool (CLAUDE_CODE_WORKFLOWS=1), moving deterministic multi-agent orchestration from pure userland scripts toward a first-party runtime primitive; v2.1.149 reinforces the same lesson from the security side with PowerShell permission-bypass fixes and a git-worktree sandbox allowlist fix. Hooks and evidence gates still own correctness.⁵² ⁵³ This guide covers every layer of that stack: from a single hook to a 10-agent consensus system. Zero frameworks required. All bash and JSON.

Andrej Karpathy coined a term for what grows around an LLM agent: claws. The hooks, scripts, and orchestration that let the agent grip the world outside its context window.¹ Most developers treat AI coding agents as interactive assistants. They type a prompt, watch it edit a file, and move on. That framing caps productivity at whatever you can personally oversee.

The infrastructure mental model is different: an AI coding agent is a programmable runtime with an LLM kernel. Every action the model takes passes through hooks you control. You define policies, not prompts. The model operates within your infrastructure the same way a web server operates within nginx rules. You do not sit at nginx and type requests. You configure it, deploy it, and monitor it.

The distinction matters because infrastructure compounds. A hook that blocks credentials in bash commands protects every session, every agent, every autonomous run. A skill that encodes your evaluation rubric applies consistently whether you invoke it or an agent does. An agent that reviews code for security runs the same checks whether you are watching or not.²

Key Takeaways

Hooks guarantee execution; prompts do not. Use hooks for linting, formatting, security checks, and anything that must run every time regardless of model behavior. Exit code 2 blocks actions. Exit code 1 only warns.³
Skills encode domain expertise that auto-activates. The description field determines everything. Claude uses LLM reasoning (not keyword matching) to decide when to apply a skill.⁴
Subagents prevent context bloat. Isolated context windows for exploration and analysis keep the main session lean. Run independent subagents in parallel, and use agent teams when workers need sustained coordination.⁵
Memory lives in the filesystem. Files persist across context windows. CLAUDE.md, MEMORY.md, rules directories, and handoff documents form a structured external memory system.⁶
Multi-agent deliberation catches blind spots. Single agents cannot challenge their own assumptions. Two independent agents with different evaluation priorities catch structural failures that quality gates cannot address.⁷
The harness pattern is the system. CLAUDE.md, hooks, skills, agents, and memory are not independent features. They compose into a deterministic layer between you and the model that scales with automation.

How to Use This Guide

Experience	Start Here	Then Explore
Using Claude Code daily, want more	The Harness Pattern	Skills System, Hook Architecture
Building autonomous workflows	Subagent Patterns	Multi-Agent Orchestration, Production Patterns
Evaluating agent architecture	Why Agent Architecture Matters	Decision Framework, Security Considerations
Setting up a team harness	CLAUDE.md Design	Hook Architecture, Quick Reference Card

Each section builds on the previous. The Decision Framework at the end provides a lookup table for choosing the right mechanism for each problem type.

Five-Minute Golden Path

Before the deep dive, here is the shortest path from zero to a working harness. One hook, one skill, one subagent, one outcome.

Step 1: Create a security hook (2 minutes)

Create .claude/hooks/block-secrets.sh:

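#!/bin/bash
# PreToolUse hook: block Bash commands that appear to contain credentials.
INPUT=$(cat)   # hook payload arrives as JSON on stdin
CMD=$(printf '%s' "$INPUT" | jq -r '.tool_input.command // empty')
# Case-insensitive match on common secret markers. The pattern is deliberately
# broad: "sk-" also matches words like "task-", so expect some false positives.
if printf '%s' "$CMD" | grep -qEi '(AKIA|sk-|ghp_|password=)'; then
    echo "BLOCKED: Potential secret in command" >&2
    exit 2   # exit 2 blocks the tool call; stderr is fed back to Claude
fi
exit 0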

Wire it in .claude/settings.json:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [{ "type": "command", "command": ".claude/hooks/block-secrets.sh" }]
      }
    ]
  }
}

Result: Every bash command Claude runs is now screened for leaked credentials. The model cannot skip this check.
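Before relying on the gate, it helps to feed it hand-written payloads. The sketch below assumes jq is installed and that the Step 1 script is saved at .claude/hooks/block-secrets.sh; the run_hook helper, the HOOK variable, and both sample commands are made up for illustration. The chmod matters because settings.json invokes the script directly, so it must be executable.

```shell
#!/bin/bash
# Smoke-test a PreToolUse hook by piping hand-written payloads into it.
# HOOK defaults to the Step 1 path; override it to test a different script.
HOOK="${HOOK:-.claude/hooks/block-secrets.sh}"

# run_hook COMMAND_STRING -> prints the hook's exit code.
# jq builds the payload so any quotes in COMMAND_STRING are escaped correctly.
run_hook() {
    jq -n --arg cmd "$1" '{tool_name: "Bash", tool_input: {command: $cmd}}' \
        | "$HOOK" 2>/dev/null
    echo "$?"
}

if [ -f "$HOOK" ]; then
    chmod +x "$HOOK"   # the settings.json entry runs the file directly
    safe=$(run_hook 'npm test')
    leaky=$(run_hook 'git push https://ghp_exampletoken@github.com/org/repo.git')
    echo "npm test          -> exit $safe"
    echo "push with a token -> exit $leaky"
else
    echo "hook not found at $HOOK" >&2
fi
```

With the Step 1 script in place, the first line should report exit 0 and the second exit 2. Any other result usually means jq is missing or the script is not executable.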

Step 2: Create a code review skill (1 minute)

Create .claude/skills/reviewer/SKILL.md. In the YAML frontmatter, set name to reviewer, description to "Review code for security issues, bugs, and quality problems. Use when examining changes, reviewing PRs, or auditing code.", and allowed-tools to Read, Grep, Glob. In the body, add a checklist: SQL injection, XSS, hardcoded secrets, missing error handling, functions over 50 lines.

Result: Claude auto-activates this expertise whenever a request involves reviewing, checking, or auditing code. It matches on meaning rather than exact keywords.
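Step 2 can be done in one paste. This is a minimal sketch that writes the file described above; the frontmatter values and checklist items come from this step, while the SKILL_ROOT variable and the "Review Checklist" heading are placeholders chosen for the example.

```shell
#!/bin/bash
# Scaffold the Step 2 skill. SKILL_ROOT defaults to the project-level skills directory.
SKILL_ROOT="${SKILL_ROOT:-.claude/skills}"
mkdir -p "$SKILL_ROOT/reviewer"

# The quoted delimiter ('EOF') stops the shell from expanding anything in the body.
cat > "$SKILL_ROOT/reviewer/SKILL.md" <<'EOF'
---
name: reviewer
description: Review code for security issues, bugs, and quality problems. Use when examining changes, reviewing PRs, or auditing code.
allowed-tools: Read, Grep, Glob
---

# Review Checklist

- SQL injection
- XSS
- Hardcoded secrets
- Missing error handling
- Functions over 50 lines
EOF

echo "wrote $SKILL_ROOT/reviewer/SKILL.md"
```

Set SKILL_ROOT to ~/.claude/skills to install the same skill for all your projects instead of this repository only.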

Step 3: Spawn a subagent (30 seconds)

In any Claude Code session, ask Claude to review the last 3 commits for security issues using a separate agent. Claude spawns an Explore agent that reads the diff, applies your review skill, and returns a summary. Your main context stays clean.

What you now have

A three-layer harness: a deterministic security gate (hook), domain expertise that auto-activates (skill), and isolated analysis that protects your context (subagent). Every section below expands one of these three layers.

Why Agent Architecture Matters

Simon Willison frames the current moment around a single observation: writing code is cheap now.⁸ Correct. But the corollary is that verification is now the expensive part. Cheap code without verification infrastructure produces bugs at scale. The investment that pays off is not a better prompt. It is the system around the model that catches what the model misses.

Three forces make agent architecture necessary:

Context windows are finite and lossy. Every file read, tool output, and conversation turn consumes tokens. Microsoft Research and Salesforce tested 15 LLMs across 200,000+ simulated conversations and found a 39% average performance drop from single-turn to multi-turn interaction.⁹ The degradation starts in as few as two turns and follows a predictable curve: precise multi-file edits in the first 30 minutes degrade into single-file tunnel vision by minute 90. Longer context windows do not fix this. The same study’s “Concat” condition (full conversation as a single prompt) achieved 95.1% of single-turn performance with identical content. The degradation comes from turn boundaries, not token limits.

Model behavior is probabilistic, not deterministic. Telling Claude “always run Prettier after editing files” works roughly 80% of the time.³ The model might forget, prioritize speed, or decide the change is “too small.” For compliance, security, and team standards, 80% is not acceptable. Hooks guarantee execution: every Edit or Write triggers your formatter, every time, no exceptions. Deterministic beats probabilistic.

Single perspectives miss multi-dimensional problems. A single agent reviewing an API endpoint checked authentication, validated input sanitization, and verified CORS headers. Clean bill of health. A second agent, prompted separately as a penetration tester, found the endpoint accepted unbounded query parameters that could trigger denial-of-service through database query amplification.⁷ The first agent never checked because nothing in its evaluation framework treated query complexity as a security surface. That gap is structural. No amount of prompt engineering fixes it.

Agent architecture addresses all three: hooks enforce deterministic constraints, subagents manage context isolation, and multi-agent orchestration provides independent perspectives. Together they form the harness.

The Harness Pattern

The harness is not a framework. It is a pattern: a composable set of files, scripts, and conventions that wrap an AI coding agent in deterministic infrastructure. The components:

┌──────────────────────────────────────────────────────────────┐
│                      THE HARNESS PATTERN                      │
├──────────────────────────────────────────────────────────────┤
│  ORCHESTRATION                                                │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐             │
│  │   Agent     │  │   Agent    │  │  Consensus │             │
│  │   Teams     │  │  Spawning  │  │  Validation│             │
│  └────────────┘  └────────────┘  └────────────┘             │
│  Multi-agent deliberation, parallel research, voting          │
├──────────────────────────────────────────────────────────────┤
│  EXTENSION LAYER                                              │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐    │
│  │  Skills   │  │  Hooks   │  │  Memory  │  │  Agents  │    │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘    │
│  Domain expertise, deterministic gates, persistent state,     │
│  specialized subagents                                        │
├──────────────────────────────────────────────────────────────┤
│  INSTRUCTION LAYER                                            │
│  ┌──────────────────────────────────────────────────────┐    │
│  │     CLAUDE.md  +  .claude/rules/  +  MEMORY.md       │    │
│  └──────────────────────────────────────────────────────┘    │
│  Project context, operational policy, cross-session memory    │
├──────────────────────────────────────────────────────────────┤
│  CORE LAYER                                                   │
│  ┌──────────────────────────────────────────────────────┐    │
│  │           Main Conversation Context (LLM)             │    │
│  └──────────────────────────────────────────────────────┘    │
│  Your primary interaction; finite context; costs money        │
└──────────────────────────────────────────────────────────────┘

Instruction Layer: CLAUDE.md files and rules directories define what the agent knows about your project. They load automatically at session start and after every compaction. This is the agent’s long-term architectural memory.

Extension Layer: Skills provide domain expertise that auto-activates based on context. Hooks provide deterministic gates that fire on every matching tool call. Memory files persist state across sessions. Custom agents provide specialized subagent configurations.

Orchestration Layer: Multi-agent patterns coordinate independent agents for research, review, and deliberation. Spawn budgets prevent runaway recursion. Consensus validation ensures quality.

The key insight: most users work entirely in the Core Layer, watching context bloat and costs climb. Power users configure the Instruction and Extension layers, then use the Core Layer only for orchestration and final decisions.²

Managed vs. Self-Hosted Harnesses (April 2026)

Throughout early 2026, the “build your own harness” path was the only real option. In April 2026, that changed. Anthropic shipped Claude Managed Agents in public beta (April 8): harness loop + tool execution + sandbox container + state persistence as a REST API, billed at standard token rates plus $0.08/session-hour. OpenAI’s Agents SDK update (April 16) formalized the same split: harness and compute as separate layers, with native sandbox providers (Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, Vercel) and snapshot/rehydrate for surviving container loss.²³ ²⁴

The deeper SDK surface for the OpenAI side landed in openai-agents Python v0.14.0 (released April 15, 2026; announced April 16): a SandboxAgent subclass of Agent with default_manifest, sandbox instructions, and capabilities; a Manifest describing the fresh-workspace contract (files, dirs, local files, Git repos, env, users, mounts); a SandboxRunConfig for per-run wiring of sandbox client, live session injection, manifest overrides, snapshots, and materialization concurrency limits. Built-in capabilities cover shell access, filesystem editing, image inspection, skills, sandbox memory, and compaction. Sandbox memory persists extracted lessons across runs and progressively discloses them; workspaces support local files, Git repo entries, and remote mounts (S3, R2, GCS, Azure Blob, S3 Files); snapshots are portable across providers. Backends: UnixLocalSandboxClient, DockerSandboxClient, and hosted clients for Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, and Vercel via optional extras.²⁴

For Python projects that want to embed the Claude Code runtime as a library — between “shell out to claude” and “REST API to Managed Agents” — claude-agent-sdk-python is the third option. The April 28-29 series (v0.1.69 → v0.1.71) bumped the bundled CLI to v2.1.123, raised the floor on the mcp dependency to >=1.19.0 (older versions silently dropped CallToolResult returns from in-process MCP tools, leaving the model with a validation-error blob), and brought SandboxNetworkConfig to schema parity with the TypeScript SDK (allowedDomains, deniedDomains, allowManagedDomainsOnly, allowMachLookup).³⁰

If your harness includes a voice or realtime layer, openai-agents-python v0.17.0 (May 8, 2026) updated RealtimeAgent to default to gpt-realtime-2.⁴¹ Existing realtime sessions pick up the new default automatically; pin the previous model explicitly if you need to hold the old behavior for evaluation.

The architectural fork is now real:

Dimension	Self-hosted harness (this guide’s default)	Managed harness (Claude Managed Agents / OpenAI Agents SDK)
Operational burden	You run everything	Vendor runs loop, sandbox, state
Customization	Total — your hooks, your skills, your memory	Bounded — vendor-defined extension points
Cost model	Token + self-hosted compute	Token + runtime-hour premium
State durability	You design it	Vendor checkpoints across disconnects
Agent team orchestration	Build your own	Vendor-provided multi-agent coordination

When to pick which: self-hosted remains right for teams that already have infrastructure muscle, want skills/hooks they control, or are optimizing a specific workflow deeply. Managed is right for teams without dedicated platform engineers, when time-to-value matters more than customization, or when agent runs need to survive laptop closures reliably without you building that persistence layer. The two are compatible — you can run a self-hosted harness that delegates specific long-running tasks to Managed Agents via its REST API.

What the Harness Looks Like on Disk

~/.claude/
├── CLAUDE.md                    # Personal global instructions
├── settings.json                # User-level hooks and permissions
├── skills/                      # Personal skills (44+)
│   ├── code-reviewer/SKILL.md
│   ├── security-auditor/SKILL.md
│   └── api-designer/SKILL.md
├── agents/                      # Custom subagent definitions
│   ├── security-reviewer.md
│   └── code-explorer.md
├── rules/                       # Categorized rule files
│   ├── security.md
│   ├── testing.md
│   └── git-workflow.md
├── hooks/                       # Hook scripts
│   ├── validate-bash.sh
│   ├── auto-format.sh
│   └── recursion-guard.sh
├── configs/                     # JSON configuration
│   ├── recursion-limits.json
│   └── deliberation-config.json
├── state/                       # Runtime state
│   ├── recursion-depth.json
│   └── agent-lineage.json
├── handoffs/                    # Session handoff documents
│   └── deliberation-prd-7.md
└── projects/                    # Per-project memory
    └── {project}/memory/MEMORY.md

.claude/                         # Project-level (in repo)
├── CLAUDE.md                    # Project instructions
├── settings.json                # Project hooks
├── skills/                      # Team-shared skills
├── agents/                      # Team-shared agents
└── rules/                       # Project rules

Every file in this structure serves a purpose. The ~/.claude/ tree is personal infrastructure that applies to all projects. The .claude/ tree in each repository is project-specific and shared via git. Together, they form the complete harness.
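The project-level half of that layout can be created in a few lines. This is a sketch rather than an official scaffold: the ROOT variable is an assumption, hooks/ is added because Step 1 stores its script in .claude/hooks/ even though the project-level tree above does not list it, and the seeded settings.json holds only an empty hooks object.

```shell
#!/bin/bash
# Create the project-level skeleton. Existing files are left untouched.
ROOT="${ROOT:-.claude}"
mkdir -p "$ROOT/skills" "$ROOT/agents" "$ROOT/rules" "$ROOT/hooks"

# Seed settings.json with an empty hooks object only if it does not exist yet.
[ -f "$ROOT/settings.json" ] || printf '{\n  "hooks": {}\n}\n' > "$ROOT/settings.json"

# Seed an empty CLAUDE.md the same way.
[ -f "$ROOT/CLAUDE.md" ] || : > "$ROOT/CLAUDE.md"

ls -A "$ROOT"
```

Because every step checks before writing, the script is safe to rerun in a repository that already has part of the structure.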

Skills System

Skills are model-invoked extensions. Claude discovers and applies them automatically based on context, without you explicitly calling them.⁴ The moment you catch yourself re-explaining the same context across sessions is the moment you should build a skill.

When to Build a Skill

Situation	Build a…	Why
You paste the same checklist every session	Skill	Domain expertise that auto-activates
You run the same command sequence explicitly	Slash command	User-invoked action with predictable trigger
You need isolated analysis that shouldn’t pollute context	Subagent	Separate context window for focused work
You need a one-time prompt with specific instructions	Nothing	Just type it. Not everything needs abstraction.

Skills are for knowledge Claude always has available. Slash commands are for actions you explicitly trigger. If you are deciding between the two, ask: “Should Claude apply this automatically, or should I decide when to run it?”

Creating a Skill

Skills live in four possible locations, from broadest to narrowest scope:⁴

Scope	Location	Applies to
Enterprise	Managed settings	All users in organization
Personal	`~/.claude/skills/<name>/SKILL.md`	All your projects
Project	`.claude/skills/<name>/SKILL.md`	This project only
Plugin	`<plugin>/skills/<name>/SKILL.md`	Where plugin is enabled

Every skill requires a SKILL.md file with YAML frontmatter:

---
name: code-reviewer
description: Review code for security vulnerabilities, performance issues,
  and best practice violations. Use when examining code changes, reviewing
  PRs, analyzing code quality, or when asked to review, audit, or check code.
allowed-tools: Read, Grep, Glob
---

# Code Review Expertise

## Security Checks
When reviewing code, verify:

### Input Validation
- All user input sanitized before database operations
- Parameterized queries (no string interpolation in SQL)
- Output encoding for rendered HTML content

### Authentication
- Session tokens validated on every protected endpoint
- Permission checks before data mutations
- No hardcoded credentials or API keys in source

Frontmatter Reference

Field	Required	Purpose
`name`	Yes	Unique identifier (lowercase, hyphens, max 64 chars)
`description`	Yes	Discovery trigger (max 1024 chars). Claude uses this to decide when to apply the skill
`allowed-tools`	No	Restrict Claude’s capabilities (e.g., `Read, Grep, Glob` for read-only)
`disable-model-invocation`	No	Prevents auto-activation; skill only activates via `/skill-name`
`user-invocable`	No	Set `false` to hide from the `/` menu entirely
`model`	No	Override which model to use when the skill is active
`context`	No	Set to `fork` to run in isolated context window
`agent`	No	Run as a subagent with its own isolated context
`hooks`	No	Define lifecycle hooks scoped to this skill
`$ARGUMENTS`	No	String substitution used in the skill body rather than a frontmatter key: replaced with user’s input after `/skill-name`
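The two required fields have limits that are easy to check mechanically. The sketch below validates name and description against the table above. The helper names frontmatter_field and check_skill are hypothetical, digits are allowed in names as an assumption, and the parser only understands plain key: value lines with indented continuation lines, not full YAML.

```shell
#!/bin/bash
# frontmatter_field FILE KEY -> value of KEY, with indented continuation lines joined.
frontmatter_field() {
    awk -v key="$2" '
        NR == 1 && $0 == "---"  { in_fm = 1; next }
        in_fm && $0 == "---"    { exit }
        !in_fm                  { exit }
        index($0, key ":") == 1 { grab = 1; sub(/^[^:]*:[ ]*/, ""); val = $0; next }
        grab && /^[ \t]/        { sub(/^[ \t]+/, ""); val = val " " $0; next }
        { grab = 0 }
        END { print val }
    ' "$1"
}

# check_skill FILE -> exit 0 if name and description satisfy the documented limits.
check_skill() {
    name=$(frontmatter_field "$1" name)
    desc=$(frontmatter_field "$1" description)
    status=0
    # name: lowercase letters, digits (assumed), hyphens; 1 to 64 characters.
    if ! printf '%s' "$name" | grep -qE '^[a-z0-9-]{1,64}$'; then
        echo "$1: invalid name '$name'" >&2
        status=1
    fi
    # description: required, at most 1024 characters.
    if [ -z "$desc" ] || [ "${#desc}" -gt 1024 ]; then
        echo "$1: description missing or over 1024 characters (${#desc})" >&2
        status=1
    fi
    return "$status"
}

# Check every skill under SKILLS_DIR (defaults to the project-level directory).
for f in "${SKILLS_DIR:-.claude/skills}"/*/SKILL.md; do
    [ -f "$f" ] || continue
    check_skill "$f" && echo "ok: $f"
done
```

A check like this fits in CI or a pre-commit step, where a rejected name is cheaper to catch than a skill that silently fails to load.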

The Description Field Is Everything

At session start, Claude Code extracts every skill’s name and description and injects them into Claude’s context. When you send a message, Claude uses language model reasoning to decide if any skill is relevant. Independent analysis of the Claude Code source confirms the mechanism: skill descriptions are injected into an available_skills section of the system prompt, and the model uses standard language understanding to select relevant skills.¹⁰

Bad description:

description: Helps with code

Effective description:

description: Review code for security vulnerabilities, performance issues,
  and best practice violations. Use when examining code changes, reviewing
  PRs, analyzing code quality, or when asked to review, audit, or check code.

The effective description includes: what it does (review code for specific issue types), when to use it (examining changes, PRs, quality analysis), and trigger phrases (review, audit, check) that users naturally type.

Context Budget

All skill descriptions share a context budget that scales dynamically at 1% of the context window, with a fallback of 8,000 characters.⁴ If you have many skills, keep each description concise and put the key use case first. You can override the budget via the SLASH_COMMAND_TOOL_CHAR_BUDGET environment variable,¹¹ but the better fix is shorter, more precise descriptions. Run /context during a session to check whether any skills are being excluded.
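To see roughly how close a skills directory is to that ceiling, you can total the description lengths yourself. This sketch compares the total against the 8,000-character fallback, or against SLASH_COMMAND_TOOL_CHAR_BUDGET when it is set. It does not model the dynamic 1% figure, and how Claude Code actually counts characters is not specified here, so treat the result as an estimate. The function names are hypothetical.

```shell
#!/bin/bash
# BUDGET uses the env override if present, otherwise the 8,000-character fallback.
BUDGET="${SLASH_COMMAND_TOOL_CHAR_BUDGET:-8000}"

# description_chars FILE -> character count of the frontmatter description,
# counting one joining space for each indented continuation line.
description_chars() {
    awk '
        NR == 1 && $0 == "---" { in_fm = 1; next }
        !in_fm || $0 == "---"  { exit }
        /^description:/        { grab = 1; sub(/^description:[ ]*/, ""); n += length($0); next }
        grab && /^[ \t]/       { sub(/^[ \t]+/, ""); n += length($0) + 1; next }
        { grab = 0 }
        END { print n + 0 }
    ' "$1"
}

# budget_report DIR -> one line per skill, then the total; exit 0 if within BUDGET.
budget_report() {
    total=0
    for f in "$1"/*/SKILL.md; do
        [ -f "$f" ] || continue
        n=$(description_chars "$f")
        total=$((total + n))
        printf '%6d  %s\n' "$n" "$f"
    done
    printf 'total %d of %d characters\n' "$total" "$BUDGET"
    [ "$total" -le "$BUDGET" ]
}

budget_report "${SKILLS_DIR:-.claude/skills}" || echo "over budget: shorten descriptions" >&2
```

The per-skill lines make it easy to spot the one or two descriptions that consume most of the budget.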

Supporting Files and Organization

Skills can reference additional files in the same directory:

~/.claude/skills/code-reviewer/
├── SKILL.md                    # Required: frontmatter + core expertise
├── SECURITY_PATTERNS.md        # Referenced: detailed vulnerability patterns
└── PERFORMANCE_CHECKLIST.md    # Referenced: optimization guidelines

Reference them from SKILL.md with relative links. Claude reads these files on-demand when the skill activates. Keep SKILL.md under 500 lines and move detailed reference material to supporting files.¹²

Project skills (.claude/skills/ in the repo root) are shared via version control:⁴

mkdir -p .claude/skills/domain-expert
# ... write SKILL.md ...
git add .claude/skills/
git commit -m "feat: add domain-expert skill for payment processing rules"
git push

When teammates pull, they get the skill automatically. No installation, no configuration. This is the most effective way to standardize expertise across a team.

Skills as a Prompt Library

Beyond single-purpose skills, the directory structure works as an organized prompt library:

~/.claude/skills/
├── code-reviewer/          # Activates on: review, audit, check
├── api-designer/           # Activates on: design API, endpoint, schema
├── sql-analyst/            # Activates on: query, database, migration
├── deploy-checker/         # Activates on: deploy, release, production
└── incident-responder/     # Activates on: error, failure, outage, debug

Each skill encodes a different facet of your expertise. Together, they form a knowledge base that Claude draws from automatically based on context. A junior developer gets senior-level guidance without asking for it.

Skills Compose with Hooks

Skills can define their own hooks in frontmatter that activate only while the skill runs. This creates domain-specific behavior that does not pollute other sessions:²

---
name: deploy-checker
description: Verify deployment readiness. Use when preparing to deploy,
  release, or push to production.
hooks:
  PreToolUse:
    - matcher: Bash
      hooks:
        - type: command
          command: "bash -c 'INPUT=$(cat); CMD=$(echo \"$INPUT\" | jq -r \".tool_input.command\"); if echo \"$CMD\" | grep -qE \"deploy|release|publish\"; then echo \"DEPLOYMENT COMMAND DETECTED. Running pre-flight checks.\" >&2; fi'"
---

Philosophy skills auto-activate via SessionStart hooks, injecting quality constraints into every session without explicit invocation. The skill itself is knowledge. The hook is enforcement. Together, they form a policy layer.

Common Skill Mistakes

Too-broad descriptions. A git-rebase-helper skill that activates on any git-related prompt (rebases, merges, cherry-picks, even git status) pollutes context on 80% of sessions. The fix is either tightening the description or adding disable-model-invocation: true and requiring explicit /skill-name invocation.⁴

Too many skills competing for budget. More skills means more descriptions competing for the 1% context budget. If you notice skills not activating, check /context for excluded ones. Prioritize fewer, well-described skills over many vague ones.

Critical information buried in supporting files. Claude reads SKILL.md immediately but only accesses supporting files when needed. If critical information is in a supporting file, Claude might not find it. Put essential information in SKILL.md directly.⁴

SDK Skill Surface (May 8, 2026)

Self-hosted harnesses on claude-agent-sdk-python v0.1.77+ should use the skills option on ClaudeAgentOptions to declare available skills, not the legacy "Skill" value in allowed_tools.³⁷ The "Skill" shorthand is deprecated and the dedicated option gives Claude Code more structured information about which skills are available. The CLI bundled with v0.1.77 is v2.1.133.

Plugin and Skill Convergence in `.claude/skills/` (May 29, 2026)

Skills have always loaded from a project’s .claude/skills/ directory. Claude Code v2.1.157 extends that directory to plugins: a plugin placed in .claude/skills/ now loads automatically with no marketplace registration, and claude plugin init <name> scaffolds a fresh one there with the manifest and SKILL.md already wired.⁵⁸ That closes the gap between the two project-tooling shapes that used to live in different places — a bare skill committed straight to the repo, versus a plugin that bundles a skill plus hooks plus an MCP server but previously needed a marketplace to install. The practical effect for harness design: project-scoped tooling no longer needs a registry detour to ship — write it, commit it, and teammates get the same surface on git pull. Plugins still own the bundled-installable use case (hooks + skills + MCP servers + agents in one ZIP); the change is that a project no longer has to stand up a marketplace just to load one from its own tree.

Hiding the Bundled Surface as Governance (June 8, 2026)

Skills are capability, and capability is attack surface. Claude Code v2.1.169 adds a disableBundledSkills setting (and the matching CLAUDE_CODE_DISABLE_BUNDLED_SKILLS environment variable) that hides the bundled skills, workflows, and built-in slash commands from the model entirely.⁶⁰ For a hardened or regulated harness, this is a deliberate attack-surface reduction: an operator who has audited and approved a specific set of project and personal skills can suppress everything Anthropic ships in the box, so the model only ever reasons over the surface the operator vetted. Treat it the same way you treat a tool allowlist — the default is broad capability, and turning the default off is a governance decision, not a convenience toggle.

Hook Architecture

Hooks are shell commands triggered by Claude Code lifecycle events.³ They run outside the LLM as plain scripts, not prompts interpreted by the model. The model wants to run rm -rf /? A 10-line bash script checks the command against a blocklist and rejects it before the shell ever sees it. The hook fires whether the model wants it to or not.

Available Events

Claude Code exposes 29 documented lifecycle events across eight categories as of this guide update. The event list grows with releases, so treat the reference docs as the source of truth and check the cheat sheet for the current full table before wiring production hooks:¹³

Category	Events	Can Block?
Session	`SessionStart`, `Setup`, `SessionEnd`	No
User / completion	`UserPromptSubmit`, `UserPromptExpansion`, `Stop`, `StopFailure`, `TeammateIdle`	Prompt/expansion/stop/idle can block; `StopFailure` cannot
Tool	`PreToolUse`, `PermissionRequest`, `PermissionDenied`, `PostToolUse`, `PostToolUseFailure`, `PostToolBatch`	Pre/permission/batch can block; post events cannot
Subagent / task	`SubagentStart`, `SubagentStop`, `TaskCreated`, `TaskCompleted`	Stop/task events can block; start cannot
Context	`PreCompact`, `PostCompact`, `InstructionsLoaded`	`PreCompact` can block; post/load cannot
Filesystem / workspace	`CwdChanged`, `FileChanged`, `WorktreeCreate`, `WorktreeRemove`	Worktree creation can block; others cannot
Configuration / notification	`ConfigChange`, `Notification`	Config changes can block except policy settings; notifications cannot
MCP	`Elicitation`, `ElicitationResult`	Yes

Exit Code Semantics

Exit codes determine whether hooks block actions:³

Exit Code	Meaning	Action
0	Success	Operation proceeds. Stdout shown in verbose mode.
2	Blocking error	Operation stops. Stderr becomes error message fed to Claude.
1, 3, etc.	Non-blocking error	Operation continues. Stderr shown in verbose mode only (Ctrl+O).

Critical: Every security hook must use exit 2, not exit 1. Exit 1 is a non-blocking warning. The dangerous command still executes. This is the most common hook mistake across teams.¹⁴
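Because the exit 1 mistake is silent, it is worth auditing for. The sketch below encodes the table above as a function and scans a hooks directory for scripts that exit with 1. Both function names and the HOOK_DIR variable are hypothetical, and the scan is a plain text match, so it will also flag an exit 1 that is intentional.

```shell
#!/bin/bash
# classify_exit CODE -> how a hook exit code is treated, per the table above.
classify_exit() {
    case "$1" in
        0) echo "proceed" ;;
        2) echo "block" ;;
        *) echo "warn-only" ;;
    esac
}

# find_soft_exits DIR -> print file:line:text for every "exit 1" in DIR/*.sh.
# The trailing /dev/null forces grep to print file names even for a single script.
find_soft_exits() {
    grep -nE '(^|[^[:alnum:]_])exit[[:space:]]+1([^0-9]|$)' "$1"/*.sh /dev/null 2>/dev/null
}

for code in 0 1 2; do
    echo "exit $code -> $(classify_exit "$code")"
done

find_soft_exits "${HOOK_DIR:-.claude/hooks}" || echo "no 'exit 1' found in hook scripts"
```

Any line the scan reports in a security hook is a candidate for changing to exit 2.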

Hook Configuration

Hooks live in settings files. Use the project-level file (.claude/settings.json) for shared hooks and the user-level file (~/.claude/settings.json) for personal hooks:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": ".claude/hooks/validate-bash.sh"
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "bash -c 'FILE_PATH=$(jq -r \".tool_input.file_path // empty\"); if [[ \"$FILE_PATH\" == *.py ]]; then black --quiet \"$FILE_PATH\" 2>/dev/null; fi'"
          }
        ]
      }
    ]
  }
}

The matcher field filters an event-specific value. For tool events, it matches tool_name values such as Bash, Edit, Write, Read, Glob, Grep, MCP tool names like mcp__server__tool, or * for all tools. Simple names and |-separated lists are exact matches; values with other characters are JavaScript regular expressions. Some events do not support matchers and always fire when configured.¹³

Hook Input/Output Protocol

Hooks receive JSON on stdin with full context:

{
  "tool_name": "Bash",
  "tool_input": {
    "command": "npm test",
    "description": "Run test suite"
  },
  "session_id": "abc-123",
  "agent_id": "main",
  "agent_type": "main"
}

For advanced control, PreToolUse hooks can output JSON to modify tool input, inject context, or make permission decisions. Use the hookSpecificOutput wrapper — the older top-level decision/reason format is deprecated for PreToolUse:

{
  "hookSpecificOutput": {
    "hookEventName": "PreToolUse",
    "permissionDecision": "allow",
    "permissionDecisionReason": "Command validated and modified",
    "updatedInput": {
      "command": "npm test -- --coverage --ci"
    },
    "additionalContext": "Note: This database has a 5-second query timeout."
  }
}
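A hook that emits this structure can build it programmatically rather than by string concatenation. The Python sketch below assumes only the field names shown in the example above; the helper name pre_tool_use_output is hypothetical.

```python
import json
from typing import Any, Dict, Optional


def pre_tool_use_output(
    decision: str,
    reason: str,
    updated_input: Optional[Dict[str, Any]] = None,
    additional_context: Optional[str] = None,
) -> str:
    """Serialize a PreToolUse decision inside the hookSpecificOutput wrapper."""
    body: Dict[str, Any] = {
        "hookEventName": "PreToolUse",
        "permissionDecision": decision,
        "permissionDecisionReason": reason,
    }
    # Optional fields are omitted rather than sent as null.
    if updated_input is not None:
        body["updatedInput"] = updated_input
    if additional_context is not None:
        body["additionalContext"] = additional_context
    return json.dumps({"hookSpecificOutput": body})
```

The hook would print the returned string to stdout and exit with code 0, since exit code 2 is reserved for the blocking-error path described earlier.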

Three Types of Guarantees

Before writing any hook, ask: what kind of guarantee do I need?¹⁴

Formatting guarantees ensure consistency after the fact. PostToolUse hooks on Write/Edit run your formatter after every file change. The model’s output does not matter because the formatter normalizes everything.

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "bash -c 'FILE_PATH=$(jq -r \".tool_input.file_path // empty\"); if [[ \"$FILE_PATH\" == *.py ]]; then black --quiet \"$FILE_PATH\" 2>/dev/null; elif [[ \"$FILE_PATH\" == *.js ]] || [[ \"$FILE_PATH\" == *.ts ]]; then npx prettier --write \"$FILE_PATH\" 2>/dev/null; fi'"
          }
        ]
      }
    ]
  }
}

Safety guarantees prevent dangerous actions before they execute. PreToolUse hooks on Bash inspect commands and block destructive patterns with exit code 2:

#!/bin/bash
# validate-bash.sh — block dangerous commands
INPUT=$(cat)
CMD=$(echo "$INPUT" | jq -r '.tool_input.command')

if echo "$CMD" | grep -qE "rm\s+-rf\s+/|git\s+push\s+(-f|--force)\s+(origin\s+)?main|git\s+reset\s+--hard|DROP\s+TABLE"; then
    echo "BLOCKED: Dangerous command detected: $CMD" >&2
    exit 2
fi

Quality guarantees validate state at decision points. PreToolUse hooks on git commit commands run your linter or test suite and block the commit if quality checks fail:

#!/bin/bash
# quality-gate.sh — lint before commit
INPUT=$(cat)
CMD=$(echo "$INPUT" | jq -r '.tool_input.command')

if echo "$CMD" | grep -qE "^git\s+commit"; then
    if ! LINT_OUTPUT=$(ruff check . --select E,F,W 2>&1); then
        echo "LINT FAILED -- fix before committing:" >&2
        echo "$LINT_OUTPUT" >&2
        exit 2
    fi
fi

Hook Types Beyond Shell Commands

Claude Code supports five hook types:¹³

Command hooks (type: "command") run shell scripts. Fast, deterministic, no token cost.

MCP tool hooks (type: "mcp_tool") call a tool on an already-connected MCP server. Use them when validation logic already lives behind an MCP boundary and does not need a separate shell script.

Prompt hooks (type: "prompt") send a single-turn prompt to a fast Claude model. The model returns { "ok": true } to allow or { "ok": false, "reason": "..." } to block. Use for nuanced evaluation that regex cannot express.

Agent hooks (type: "agent") spawn a subagent with tool access (Read, Grep, Glob) for multi-turn verification. They are experimental; prefer command hooks for production gates and reserve agent hooks for checks that genuinely require inspecting actual files or test output:

{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "agent",
            "prompt": "Verify all unit tests pass. Run the test suite and check results. $ARGUMENTS",
            "timeout": 120
          }
        ]
      }
    ]
  }
}

As of Claude Code v2.1.140, agent hook input includes subagent_type, which lets a shared hook distinguish a security-reviewer run from an explorer or generic worker without guessing from prompt text.⁴⁹

HTTP hooks (type: "http"), available in v2.1.63+, send the event’s JSON input as a POST request to a URL and receive JSON back. They are not supported for SessionStart events. Use them for webhooks, external notification services, or API-based validation:

{
  "hooks": {
    "PostToolUse": [
      {
        "hooks": [
          {
            "type": "http",
            "url": "https://your-webhook.example.com/hook",
            "headers": { "Authorization": "Bearer $WEBHOOK_TOKEN" },
            "allowedEnvVars": ["WEBHOOK_TOKEN"],
            "timeout": 10
          }
        ]
      }
    ]
  }
}

Async Hooks

Hooks can run in the background without blocking execution. Add async: true for non-critical operations like notifications and logging:¹³

{
  "type": "command",
  "command": ".claude/hooks/notify-slack.sh",
  "async": true
}

Use async for notifications, telemetry, and backups. Never use async for formatting, validation, or anything that must complete before the next action.

Dispatchers Over Independent Hooks

Seven hooks firing on the same event, each reading stdin and running independently, create race conditions. When two hooks write to the same JSON state file concurrently, one write can truncate the other, and every downstream hook that parses the file then breaks.²

The fix: one dispatcher per event that runs hooks sequentially from cached stdin:

#!/bin/bash
# dispatcher.sh — run hooks sequentially with cached stdin
INPUT=$(cat)
HOOK_DIR="$HOME/.claude/hooks/pre-tool-use.d"

for hook in "$HOOK_DIR"/*.sh; do
    [ -x "$hook" ] || continue
    echo "$INPUT" | "$hook"
    EXIT_CODE=$?
    if [ "$EXIT_CODE" -eq 2 ]; then
        exit 2  # Propagate block
    fi
done
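Sequential dispatch removes the concurrency, but state files are still safer when each write is atomic. A minimal Python sketch of the write-to-temp-then-rename technique follows; the function name and the state file layout are assumptions for illustration.

```python
import json
import os
import tempfile
from typing import Any, Dict


def write_state_atomically(path: str, state: Dict[str, Any]) -> None:
    """Write JSON to a temp file in the target directory, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace is atomic on POSIX when both paths share a filesystem,
        # so a concurrent reader sees either the old file or the new one.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Atomic replacement prevents truncated files. It does not prevent lost updates when two writers start from the same old value, which is why the dispatcher still matters.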

Debugging Hooks

Five techniques for debugging hooks that fail silently:¹⁴

Test scripts independently. Pipe sample JSON: echo '{"tool_input":{"command":"git commit -m test"}}' | bash your-hook.sh
Use stderr for debug output. Exit code 2 stderr is fed back to Claude as an error message. Non-blocking stderr (exit 1, 3, etc.) appears only in verbose mode (Ctrl+O).
Watch for jq failures. Wrong JSON paths return null silently. Test jq expressions against real tool input.
Verify exit codes. A PreToolUse hook that uses exit 1 provides zero enforcement while appearing to work.
Keep hooks fast. Hooks run synchronously unless marked async, so a slow hook stalls every matching action. Keep all hooks under 2 seconds, ideally under 500ms.
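The first and fourth techniques combine naturally into a small test harness. The Python sketch below pipes a sample payload to a hook command and reports the exit code; run_hook and is_blocking are hypothetical helper names, and the default timeout simply mirrors the 2-second guideline.

```python
import json
import subprocess
import sys
from typing import Any, Dict, List, Tuple


def run_hook(
    command: List[str], payload: Dict[str, Any], timeout: float = 2.0
) -> Tuple[int, str]:
    """Pipe a sample payload to a hook command; return (exit_code, stderr)."""
    result = subprocess.run(
        command,
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired for a slow hook
    )
    return result.returncode, result.stderr


def is_blocking(exit_code: int) -> bool:
    """Only exit code 2 blocks; other non-zero codes are non-blocking errors."""
    return exit_code == 2
```

For a shell hook, a call might look like run_hook(["bash", "your-hook.sh"], {"tool_input": {"command": "git commit -m test"}}), followed by a check that is_blocking returns the value the gate is supposed to produce.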

SDK-Side Hook Event Streaming

Self-hosted harnesses built on claude-agent-sdk-python (v0.1.74+, May 6, 2026) can subscribe to hook events directly from the message stream rather than going through shell-script callbacks.³⁶ Set include_hook_events=True on ClaudeAgentOptions, and HookEventMessage objects (PreToolUse, PostToolUse, Stop, and others) are yielded from the same iterator as assistant messages and tool results. This mirrors the TypeScript SDK’s includeHookEvents option; the bundled CLI was bumped to v2.1.129 in the same release.

The event-stream pattern is the right fit when your harness already lives in Python and you want hook signals in the same control flow as model output. The shell-script hook contract (exit codes, stdin JSON, dispatchers) remains the right answer for harnesses that compose multiple tools, share hooks across Claude Code and Codex, or need exit-code semantics for blocking.

Effort and Session Provenance (May 7-8, 2026)

Two additions in Claude Code v2.1.132 and v2.1.133 give hooks and subprocesses better signal about their execution context:³⁸³⁹

effort.level in hook input. Hooks now receive an effort.level JSON field on the same input that carries tool_input and session_id. The same value is exported as the $CLAUDE_EFFORT env var, so Bash commands can read it without parsing JSON. Use this to scale hook cost with effort tier: skip expensive validation on low, run the full security gate on xhigh or max.
CLAUDE_CODE_SESSION_ID env var on Bash subprocesses. Bash tool subprocesses now see the same session_id value the hooks see, exposed as CLAUDE_CODE_SESSION_ID. This closes the provenance gap for tools that log per-session state and were previously unable to correlate subprocess events with hook events.

Both signals are available without code changes; existing hooks that ignore the new fields keep working.
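Scaling hook cost with effort tier can be as small as one branch. The Python sketch below reads effort.level from the hook input; the check names are hypothetical, and treating every level other than low as a full gate is an assumption chosen to fail safe, since only low, xhigh, and max are named above.

```python
from typing import Any, Dict, List

# Hypothetical check names; substitute whatever the harness actually runs.
CHEAP_CHECKS = ["deny-list"]
FULL_CHECKS = ["deny-list", "secret-scan", "dependency-audit"]


def checks_for(payload: Dict[str, Any]) -> List[str]:
    """Choose which checks to run from the effort.level field of the hook input."""
    effort = payload.get("effort") or {}
    level = effort.get("level")
    if level == "low":
        return list(CHEAP_CHECKS)
    # Missing or unrecognized levels get the full gate (assumption: fail safe).
    return list(FULL_CHECKS)
```

A hook that ignores the field entirely behaves like the fall-through branch here, which matches the note that existing hooks keep working.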

`autoMode.hard_deny` and v2.1.136 Hook/Plugin Fixes (May 8, 2026)

Claude Code v2.1.136 added a new hard-deny tier to auto mode and fixed a cluster of plugin and MCP issues that affected long-running harnesses:⁴⁰

settings.autoMode.hard_deny. Auto mode classifier rules that block unconditionally, regardless of user intent or allow exceptions. This sits above the existing allow/deny matchers as a non-negotiable governance lever. Use it for rules that must never be overridden (force-push to main, secret-bearing files, production database access) even when an operator has approved the broader category in their personal settings.
MCP servers no longer disappear after /clear. Servers configured in .mcp.json, plugins, and claude.ai connectors had been silently dropping out of the active set after a /clear in the VS Code extension, JetBrains plugin, and Agent SDK. The fix lands in v2.1.136. If you saw “MCP server X went missing mid-session,” this was the cause.
MCP OAuth refresh-token loss on concurrent refresh. Users with several remote MCP servers should no longer need daily re-authentication. Concurrent refresh writes were overwriting each other.
Plan mode now blocks file writes correctly. A matching Edit(...) allow rule was bypassing plan-mode write protection. Plan mode is now enforced regardless of allow rules.
Plugin Stop and UserPromptSubmit hooks no longer fail mid-session. Cache cleanup was deleting plugin-version files still in use by the running session, breaking these two hook events specifically. The fix keeps in-use versions pinned.
skills entry in plugin.json. Setting skills was hiding the plugin’s default skills/ directory. Now the entry composes correctly, and pointing it at a file path raises an explicit error instead of failing silently.
CLAUDE_ENV_FILE SessionStart hook env vars going stale. Vars exported by SessionStart hooks via CLAUDE_ENV_FILE were going stale after /resume or /clear. Fixed in v2.1.136. Sessions now re-source the env file on these events.

For governance harnesses, the operationally interesting line items are autoMode.hard_deny (new lever) and the MCP-disappearance fix (silent failure that broke long sessions). Everything else is a quality-of-life cleanup.

Structured Hook Arguments and Block Continuation (May 11, 2026)

Claude Code v2.1.139 added two hook details that matter for production harnesses: an args: string[] exec form for command hooks, and continueOnBlock for PostToolUse hooks.⁴²⁴⁴ Prefer args when a hook needs dynamic values or path placeholders. It spawns the command directly without a shell, which removes a whole class of quoting and injection mistakes.

Use continueOnBlock when a PostToolUse hook should feed its rejection reason back to Claude and continue the turn instead of ending the flow. Treat it as an operator-experience feature, not a security bypass. A blocking gate should still block the unsafe outcome.

The same release passes CLAUDE_PROJECT_DIR to MCP stdio servers and lets plugin configs reference ${CLAUDE_PROJECT_DIR} in commands.⁴² MCP tools should resolve project-relative paths from that value rather than from whichever process working directory happened to launch the server.

Claude Code v2.1.140 is mostly a reliability release for harness operators: it fixes ConfigChange hooks not firing on settings changes, closes edge cases where disableAllHooks and allowManagedHooksOnly did not compose correctly across settings levels, and stops permission dialogs from exposing unintended environment variables returned by hook results.⁴⁹ That makes the existing governance patterns in this section more dependable; it does not require a new hook architecture.

Claude Code v2.1.141 adds a hook-output terminalSequence field for desktop notifications, window titles, and bells without a controlling terminal.⁵⁰ Treat that as operator signaling, not enforcement. Security and quality gates should still communicate failures through the normal blocking contract: structured hook output plus the exit behavior that prevents the unsafe action. The same release adds claude agents --cwd <path> for scoping Agent View to one directory, CLAUDE_CODE_PLUGIN_PREFER_HTTPS for plugin installs in environments without GitHub SSH keys, and ANTHROPIC_WORKSPACE_ID for workload-identity federation rules that cover more than one workspace.⁵⁰ Those are architecture details for team harnesses: narrower operational views, fewer plugin-install assumptions, and explicit enterprise token scoping.

Claude Code v2.1.142 is more important for background-session orchestration than for hook semantics.⁵¹ claude agents can now dispatch background sessions with explicit directory, settings, MCP, plugin, permission, model, and effort flags instead of depending on wrapper state. Fast mode now defaults to Opus 4.7; pin CLAUDE_CODE_OPUS_4_6_FAST_MODE_OVERRIDE=1 only if a harness has measured dependence on Opus 4.6’s behavior. Root-level plugin SKILL.md discovery and plugin-provided LSP visibility reduce packaging ambiguity. Fixes to MCP_TOOL_TIMEOUT, pre-existing background-session worktrees, daemon sleep/wake and post-upgrade cleanup, and plugin cache cleanup close reliability gaps that otherwise look like orchestration bugs.

Stop-hook steering, cross-session authority, and multi-agent v2 (June 2026)

Several changes from early June matter for harness and multi-agent design.⁵⁹

Stop/SubagentStop hooks gained a steering channel. As of Claude Code v2.1.163, a Stop or SubagentStop hook can return hookSpecificOutput.additionalContext to hand Claude feedback and keep the turn going, without the response being labeled a hook error. Before this, a Stop hook’s only real lever was the exit-2 block, which reads as an error and counts toward the consecutive-block cap. For a quality-gate harness this is the cleaner primitive: a Stop hook that detects “you said done but the tests are red” can now inject “here is what is still failing, continue” instead of hard-blocking. Use the block for genuine stop conditions and additionalContext for “not done yet, here is why.”
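As a rough sketch of that split, the Python helper below returns steering output when tests are still failing and nothing otherwise. The helper name, the message wording, and the hookEventName value of "Stop" (assumed by analogy with the PreToolUse wrapper) are not confirmed by the release notes cited here.

```python
import json
from typing import List


def stop_hook_output(failing_tests: List[str]) -> str:
    """Return steering JSON when work remains, or an empty string to allow the stop."""
    if not failing_tests:
        return ""
    feedback = "Not done yet. Still failing: " + ", ".join(failing_tests)
    return json.dumps(
        {
            "hookSpecificOutput": {
                # Assumption: the event name follows the PreToolUse wrapper convention.
                "hookEventName": "Stop",
                "additionalContext": feedback,
            }
        }
    )
```

How the list of failing tests is gathered is left to the harness; a genuine stop condition would still use the exit-2 block rather than this channel.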

Cross-session messaging no longer carries borrowed authority. v2.1.166 hardened the multi-session case: messages relayed via SendMessage from another Claude session no longer carry the originating user’s authority, so a receiving session refuses relayed permission requests and auto mode blocks them. If your orchestration has agents message each other, treat an inbound message as untrusted data, not as an authenticated instruction. This is the same principle the security section applies to tool output, extended to inter-agent messaging.

Model resilience became a first-class setting. The fallbackModel setting now chains up to three backup models, tried in order when the primary is overloaded or unavailable, and a turn auto-retries once on the fallback for unexpected non-retryable API errors. For a long-running autonomous harness, this turns a transient primary-model outage into a graceful degradation rather than a dropped run. claude agents --json also added a waitingFor field (v2.1.162) that surfaces what a blocked background session is waiting on, such as a permission prompt — an observability win for any coordinator polling a fleet of agents.

Safe mode for clean-room governance and troubleshooting. Claude Code v2.1.169 adds a --safe-mode flag (and the matching CLAUDE_CODE_SAFE_MODE environment variable) that starts a session with every customization disabled at once: CLAUDE.md, plugins, skills, hooks, and MCP servers.⁶⁰ This is the inverse of the harness — a deliberate clean-room. Use it to answer the question every operator eventually asks: “is this behavior coming from the model, or from something I configured?” When a hook misfires, a skill activates when it should not, or an MCP server poisons context, --safe-mode gives you a known-empty baseline to diff against. It is also a governance primitive: a way to run the bare model with none of the persistent authority your harness normally grants, which matters when you need to reproduce a result without any operator-defined scaffolding influencing it.

A note on model tiers. This guide treats Opus 4.8 as Claude Code’s agentic default: the model that runs autonomous harnesses unless you select otherwise. On June 9, 2026, Anthropic launched Claude Fable 5 (claude-fable-5), a new tier above Opus described as its most powerful model, a “Mythos-class” system made safe for general use, selectable in Claude Code v2.1.170 via /model claude-fable-5.⁶⁰ Opus 4.8 remains the agentic default; reach for the higher tier deliberately, on the decisions where raw reasoning depth justifies the cost, not as a blanket setting for a fleet.

Codex shipped multi-agent v2. Codex CLI v0.137.0 keeps the runtime choice with each thread, exposes cleaner follow-up and metadata defaults for spawned agents (hide_spawn_agent_metadata now defaults to true), and propagates raw parent events to child listeners. Its subagent model stays explicit: built-in default/worker/explorer agent types, TOML-defined custom agents, and concurrency controls (agents.max_threads default 6, agents.max_depth default 1). The same release adds a v1 skills extension with per-turn skill-catalog resolution and new thread-start/turn-error lifecycle contributor events, narrowing the gap with Claude Code’s hook/skill surface while keeping the kernel-sandbox posture as the default boundary. Codex v0.138.0–v0.139.0 then hardened multi-agent v2 for production: inter-agent message payloads are now encrypted, a v2 agent config catalog plus an agent-residency LRU manage which agents stay resident, and concurrency is counted by active execution rather than by spawned threads, so idle agents no longer consume a slot.⁶¹ The lifecycle API matured too — close_agent was renamed interrupt_agent (v0.139.0) to reflect that it interrupts a running agent rather than merely closing a handle — and MCP startup warnings raised by a subagent now stay scoped to the owning thread instead of duplicating up into the parent’s transcript.⁶¹ For anyone building Codex-side orchestration, these are the difference between a demo and a fleet: encrypted message transport, bounded residency, execution-counted concurrency, and warnings that do not leak across the thread boundary.

Memory and Context

Every AI conversation operates within a finite context window. As the conversation grows, the system compresses earlier turns to make room for new content. The compression is lossy. Architectural decisions documented in turn 3 may not survive to turn 15.⁹

The Three Mechanisms of Multi-Turn Collapse

The MSR/Salesforce study identified three independent mechanisms, each requiring a different intervention:⁹

Mechanism	What Happens	Intervention
Context compression	Earlier information discarded to fit new content	State checkpointing to filesystem
Reasoning coherence loss	Model contradicts its own earlier decisions across turns	Fresh-context iteration (Ralph loop)
Coordination failure	Multiple agents hold different state snapshots	Shared state protocols between agents

Strategy 1: Filesystem as Memory

The most reliable memory across context boundaries lives in the filesystem. Claude Code reads CLAUDE.md and memory files at the start of every session and after every compaction.⁶

~/.claude/
├── configs/           # 14 JSON configs (thresholds, rules, budgets)
│   ├── deliberation-config.json
│   ├── recursion-limits.json
│   └── consensus-profiles.json
├── hooks/             # 95 lifecycle event handlers
├── skills/            # 44 reusable knowledge modules
├── state/             # Runtime state (recursion depth, agent lineage)
├── handoffs/          # 49 multi-session context documents
├── docs/              # 40+ system documentation files
└── projects/          # Per-project memory directories
    └── {project}/memory/
        └── MEMORY.md  # Always loaded into context

The MEMORY.md file captures errors, decisions, and patterns across sessions. When you discover that ((VAR++)) fails with set -e in bash when VAR is 0, you record it. Three sessions later, when you encounter a similar integer edge case in Python, the MEMORY.md entry surfaces the pattern.¹⁵

Auto Memory (v2.1.32+): Claude Code automatically records and recalls project context. As you work, Claude writes observations to ~/.claude/projects/{project-path}/memory/MEMORY.md. Auto memory loads the first 200 lines into your system prompt at session start. Keep it concise and link to separate topic files for detailed notes.⁶

Memory curation over memory volume (May 2026): A recent arXiv preprint on LLM-agent cooperation frames expanded recall as a possible failure mode: in the authors’ experiments, longer visible history degraded cooperation in 18 of 28 model-game settings.⁴⁸ Treat this as a design warning, not a finished law. The production rule is already clear enough: keep MEMORY.md short, link out to details, and put decision-ready summaries in handoffs. Raw transcript dumps, tool logs, and long recall feeds belong in searchable storage, not automatically in the active prompt.
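Because only the first 200 lines of MEMORY.md are loaded, its length is something a hook or CI step can check mechanically. A minimal Python sketch, with the function name and return shape chosen for illustration:

```python
from typing import Tuple

AUTO_MEMORY_LINE_LIMIT = 200  # lines loaded at session start, per the text above


def memory_budget(text: str, limit: int = AUTO_MEMORY_LINE_LIMIT) -> Tuple[int, int]:
    """Return (lines_loaded, lines_dropped) for the contents of a MEMORY.md file."""
    total = len(text.splitlines())
    loaded = min(total, limit)
    return loaded, total - loaded
```

A non-zero second value means some entries never reach the prompt and are candidates for moving into linked topic files.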

Strategy 2: Proactive Compaction

Claude Code’s /compact command summarizes the conversation and frees context space while preserving key decisions, file contents, and task state.¹⁵

When to compact:

- After completing a distinct subtask (feature implemented, bug fixed)
- Before starting a new area of the codebase
- When Claude starts repeating or forgetting earlier context
- Roughly every 25-30 minutes during intensive sessions

Custom compaction instructions in CLAUDE.md:

# Summary Instructions
When using compact, focus on:
- Recent code changes
- Test results
- Architecture decisions made this session

Compaction protects the conversation; the /cd command (Claude Code v2.1.169) protects the prompt cache. It moves a session to a new working directory mid-stream without breaking the cache that has accumulated over the turn.⁶⁰ Before this, changing directories meant a fresh session and a cold cache. For a long-running session that pivots from one repository to a sibling — common in monorepo and multi-service work — /cd keeps the expensive cached prefix intact while repointing the filesystem context.

Strategy 3: Session Handoffs

For tasks spanning multiple sessions, create handoff documents that capture the full state:

## Handoff: Deliberation Infrastructure PRD-7
**Status:** Hook wiring complete, 81 Python unit tests passing
**Files changed:** hooks/post-deliberation.sh, hooks/deliberation-pride-check.sh
**Decision:** Placed post-deliberation in PostToolUse:Task, pride-check in Stop
**Blocked:** Spawn budget model needs inheritance instead of depth increment
**Next:** PRD-8 integration tests in tests/test_deliberation_lib.py

The Status/Files/Decision/Blocked/Next structure gives the successor session full context at minimal token cost. A new session, whether resumed with claude -c (continue) or started by reading the handoff document, can go straight to implementation.¹⁵
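Because the structure is fixed, a handoff can be generated rather than typed. A minimal Python sketch, where render_handoff and its parameter names are hypothetical:

```python
from typing import List


def render_handoff(
    title: str,
    status: str,
    files: List[str],
    decision: str,
    blocked: str,
    next_step: str,
) -> str:
    """Render a handoff document in the Status/Files/Decision/Blocked/Next layout."""
    lines = [
        f"## Handoff: {title}",
        f"**Status:** {status}",
        f"**Files changed:** {', '.join(files)}",
        f"**Decision:** {decision}",
        f"**Blocked:** {blocked}",
        f"**Next:** {next_step}",
    ]
    return "\n".join(lines) + "\n"
```

Writing the result into the handoffs directory at the end of a session keeps the format consistent from one session to the next.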

Strategy 4: Fresh-Context Iteration (The Ralph Loop)

For sessions exceeding 60-90 minutes, spawn a fresh Claude instance per iteration. State persists through the filesystem, not through conversational memory. Each iteration gets the full context budget:¹⁶

Iteration 1: [200K tokens] -> writes code, creates files, updates state
Iteration 2: [200K tokens] -> reads state from disk, continues
Iteration 3: [200K tokens] -> reads updated state, continues
...
Iteration N: [200K tokens] -> reads final state, verifies criteria

Compare with a single long session:

Minute 0:   [200K tokens available] -> productive
Minute 30:  [150K tokens available] -> somewhat productive
Minute 60:  [100K tokens available] -> degraded
Minute 90:  [50K tokens available]  -> significantly degraded
Minute 120: [compressed, lossy]     -> errors accumulate

The fresh-context-per-iteration approach spends 15-20% overhead on the orient step (reading state files, scanning git history) in exchange for full cognitive resources in every iteration.¹⁶ The cost-benefit calculation: for sessions under 60 minutes, a single conversation is more efficient; beyond 90 minutes, fresh-context iteration produces higher-quality output despite the overhead.
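The loop itself is small. In the Python sketch below, run_iteration is a hypothetical callable standing in for one fresh agent process (for example, a non-interactive CLI invocation), and the state file schema with iteration and done keys is an assumption.

```python
import json
import os
from typing import Any, Callable, Dict


def load_state(path: str) -> Dict[str, Any]:
    """Read loop state from disk; a missing file means the first iteration."""
    if not os.path.exists(path):
        return {"iteration": 0, "done": False}
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def ralph_loop(
    run_iteration: Callable[[Dict[str, Any]], Dict[str, Any]],
    state_path: str,
    max_iterations: int = 10,
) -> Dict[str, Any]:
    """Run fresh-context iterations until the state says done or the cap is hit."""
    state = load_state(state_path)
    while not state.get("done") and state.get("iteration", 0) < max_iterations:
        # Each call stands in for a brand-new agent process: it receives only
        # what was persisted, never the previous iteration's conversation.
        state = run_iteration(dict(state))
        state["iteration"] = state.get("iteration", 0) + 1
        with open(state_path, "w", encoding="utf-8") as handle:
            json.dump(state, handle)
        # Re-read from disk so the next iteration sees exactly what was saved.
        state = load_state(state_path)
    return state
```

The iteration cap plays the same role as the spawn limits discussed later: the loop stops even if the completion criterion is never met.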

Strategy 5: Managed Memory Curation (Dreaming)

Anthropic’s Claude Managed Agents added Dreaming as a Research Preview on May 6, 2026.³⁵ Per Anthropic: “Dreaming is a scheduled process that reviews your agent sessions and memory stores, extracts patterns, and curates memories so your agents improve over time.”³⁵

Dreaming runs in the background between sessions, not on the critical path. It complements rather than replaces the filesystem-as-memory pattern: your MEMORY.md file remains the load-bearing surface; Dreaming writes curated memory entries into the Managed Agents memory store, which the agent reads at session start. The two patterns coexist for harnesses that mix self-hosted filesystem state with managed-side curation.

	Filesystem Memory	Dreaming (Managed)
Where memory lives	Your repo, version-controlled	Anthropic-managed memory store
When it updates	You write entries by hand or via hooks	Background process between sessions
What it captures	Decisions, errors, patterns you flag	Patterns extracted from session history
Best for	Project-specific institutional knowledge	Cross-session pattern discovery you would not catch by hand

Dreaming is in Research Preview, so behavior may change. The session-handoffs and CLAUDE.md patterns documented above remain the authoritative memory mechanism for self-hosted harnesses.

The Anti-Patterns

Reading entire files when you need 10 lines. A single 2,000-line file read consumes 15,000-20,000 tokens. Use line offsets: Read file.py offset=100 limit=20 saves the vast majority of that cost.¹⁵

Keeping verbose error output in context. After a long debugging session, your context can hold 40+ stack traces from failed iterations. A single /compact after fixing the bug frees that dead weight.

Starting every session by reading every file. Let Claude Code’s glob and grep tools find relevant files on demand, saving 100,000+ tokens of unnecessary pre-loading.¹⁵
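The cost of the first anti-pattern follows from simple proportions. Assuming tokens are spread evenly across lines (a simplification), this Python sketch estimates what a 20-line read costs compared with the full 2,000-line file; the function names are illustrative.

```python
def partial_read_tokens(full_file_tokens: int, total_lines: int, lines_read: int) -> float:
    """Estimate tokens for a partial read, assuming tokens are spread evenly across lines."""
    return full_file_tokens * lines_read / total_lines


def savings_fraction(total_lines: int, lines_read: int) -> float:
    """Fraction of the full-read cost avoided by reading only lines_read lines."""
    return 1 - lines_read / total_lines


# Figures from the text: a 2,000-line file costing 15,000-20,000 tokens.
low = partial_read_tokens(15_000, 2_000, 20)   # 150.0
high = partial_read_tokens(20_000, 2_000, 20)  # 200.0
```

Under that assumption, a 20-line read costs roughly 150 to 200 tokens, about 1% of the full read.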

Subagent Patterns

Subagents are specialized Claude instances that handle complex tasks independently. They start with a clean context (no pollution from the main conversation), operate with specified tools, and return results as summaries. The exploration results do not bloat your main conversation; only the conclusions return.⁵

Built-In Subagent Types

Type	Model	Mode	Tools	Use For
Explore	Haiku (fast)	Read-only	Glob, Grep, Read, safe bash	Codebase exploration, finding files
General-purpose	Inherits	Full read/write	All available	Complex research + modification
Plan	Inherits (or Opus)	Read-only	Read, Glob, Grep, Bash	Planning before execution

Creating Custom Subagents

Define subagents in .claude/agents/ (project) or ~/.claude/agents/ (personal):

---
name: security-reviewer
description: Expert security code reviewer. Use PROACTIVELY after any code
  changes to authentication, authorization, or data handling.
tools: Read, Grep, Glob, Bash
model: opus
permissionMode: plan
---

You are a senior security engineer reviewing code for vulnerabilities.

When invoked:
1. Identify the files that were recently changed
2. Analyze for OWASP Top 10 vulnerabilities
3. Check for secrets, hardcoded credentials, SQL injection
4. Report findings with severity levels and remediation steps

Focus on actionable security findings, not style issues.

Subagent Configuration Fields

Field	Required	Purpose
`name`	Yes	Unique identifier (lowercase + hyphens)
`description`	Yes	When to invoke (include “PROACTIVELY” to encourage auto-delegation)
`tools`	No	Comma-separated. Inherits all tools if omitted. Supports `Agent(agent_type)` to restrict spawnable agents
`disallowedTools`	No	Tools to deny, removed from inherited or specified list
`model`	No	`sonnet`, `opus`, `haiku`, `inherit` (default: `inherit`)
`permissionMode`	No	`default`, `acceptEdits`, `delegate`, `dontAsk`, `bypassPermissions`, `plan`
`maxTurns`	No	Maximum agentic turns before the subagent stops
`memory`	No	Persistent memory scope: `user`, `project`, `local`
`skills`	No	Auto-load skill content into subagent context at startup. As of v2.1.133, subagents also discover project, user, and plugin skills via the `Skill` tool the same way the parent session does. Earlier versions silently dropped these from subagent context.³⁹
`hooks`	No	Lifecycle hooks scoped to this subagent’s execution
`background`	No	Always run as background task
`isolation`	No	Set to `worktree` for isolated git worktree copy

Worktree Isolation

Subagents can operate in temporary git worktrees, providing a complete isolated copy of the repository:⁵

---
name: experimental-refactor
description: Attempt risky refactoring in isolation
isolation: worktree
tools: Read, Write, Edit, Bash, Grep, Glob
---

You have an isolated copy of the repository. Make changes freely.
If the refactoring succeeds, the changes can be merged back.
If it fails, the worktree is discarded with no impact on the main branch.

Worktree isolation is essential for experimental work that might break the codebase.

Parallel Subagents

Use parallel subagents for independent research tasks that do not need to coordinate with each other:⁵

> Have three explore agents search in parallel:
> 1. Authentication code
> 2. Database models
> 3. API routes

Each agent runs in its own context window, finds relevant code, and returns a summary. The main context stays clean.

The Recursion Guard

Without spawn limits, agents delegate to agents that delegate to agents, each one losing context and burning tokens. The recursion guard pattern enforces budgets:¹⁶

#!/bin/bash
# recursion-guard.sh — enforce spawn budget
CONFIG_FILE="${HOME}/.claude/configs/recursion-limits.json"
STATE_FILE="${HOME}/.claude/state/recursion-depth.json"

MAX_DEPTH=2
MAX_CHILDREN=5
DELIB_SPAWN_BUDGET=2
DELIB_MAX_AGENTS=12

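# Read current depth (a missing or unreadable state file counts as depth 0)
current_depth=$(jq -r '.depth // 0' "$STATE_FILE" 2>/dev/null)
current_depth=${current_depth:-0}

if [[ "$current_depth" -ge "$MAX_DEPTH" ]]; then
    echo "BLOCKED: Maximum recursion depth ($MAX_DEPTH) reached" >&2
    exit 2
fi

# Increment depth using safe arithmetic (not ((VAR++)) with set -e)
new_depth=$((current_depth + 1))
# Seed the state file if it is missing or empty so jq has valid JSON to update
mkdir -p "$(dirname "$STATE_FILE")"
[ -s "$STATE_FILE" ] || echo '{"depth": 0}' > "$STATE_FILE"
# Write to a temp file and rename only on success, so a failed jq never clobbers state
jq --argjson d "$new_depth" '.depth = $d' "$STATE_FILE" > "${STATE_FILE}.tmp" \
    && mv "${STATE_FILE}.tmp" "$STATE_FILE"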

Critical lesson: Use spawn budgets, not just depth limits. Depth-based limits track parent-child chains (blocked at depth 3) but miss width: 23 agents at depth 1 is still “depth 1.” A spawn budget tracks total active children per parent, capped at a configurable maximum. The budget model maps to the actual failure mode (too many total agents) rather than a proxy metric (too many nesting levels).⁷
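A spawn budget is straightforward to model. The Python sketch below tracks active children per parent in memory; the class name, method names, and string identifiers are hypothetical, and a real hook would persist the counts to a state file between invocations.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class SpawnBudget:
    """Track active children per parent and refuse spawns over the cap."""

    def __init__(self, max_children: int = 5) -> None:
        self.max_children = max_children
        self._active: DefaultDict[str, Set[str]] = defaultdict(set)

    def try_spawn(self, parent_id: str, child_id: str) -> bool:
        """Record the child and return True, or return False if the budget is spent."""
        children = self._active[parent_id]
        if len(children) >= self.max_children:
            return False  # a hook would translate this into exit code 2
        children.add(child_id)
        return True

    def finish(self, parent_id: str, child_id: str) -> None:
        """Free the slot so the budget counts active agents, not total spawned."""
        self._active[parent_id].discard(child_id)
```

Unlike the depth counter, this refuses the sixth concurrent child at depth 1, which is the width case the depth limit misses.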

Recursive delegation is now first-party. As of Claude Code v2.1.172 (June 10, 2026), sub-agents can spawn their own sub-agents, nesting up to 5 levels deep; previously, delegation was effectively limited to one level.⁶² This makes the recursion guard above more important, not less: the platform now permits exactly the agents-delegating-to-agents chains that burn context and tokens, so the spawn budget and depth cap are what keep a 5-level tree from fanning out into hundreds of active agents. Treat 5 levels as a ceiling the platform allows, not a default to reach for.

Agent Teams (Research Preview)

Agent Teams coordinate multiple Claude Code instances that work independently, communicate via a shared mailbox and task list, and can challenge each other’s findings:⁵

Component	Role
Team lead	Main session that creates the team, spawns teammates, coordinates work
Teammates	Separate Claude Code instances working on assigned tasks
Task list	Shared work items that teammates claim and complete (file-locked)
Mailbox	Messaging system for inter-agent communication

Enable with: export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1

When to use agent teams vs subagents:

	Subagents	Agent Teams
Communication	Report results back only	Teammates message each other directly
Coordination	Main agent manages all work	Shared task list with self-coordination
Best for	Focused tasks where only result matters	Complex work requiring discussion and collaboration
Token cost	Lower	Higher (each teammate = separate context window)

Agent View and Goal Loops (May 2026)

Claude Code v2.1.139 added Agent View, a research-preview interface started with claude agents that shows running, blocked, and completed Claude Code sessions from one screen.⁴²,⁴³ The official docs frame it as a way to dispatch and manage many sessions, see what each session is doing, and identify which ones need operator input.⁴³ This gives multi-agent work an operations view that final summaries cannot provide.

Use Agent View when promoting a subagent or team pattern: inspect which sessions are blocked, which are still running, and whether the work distribution matches the intended architecture. Do not treat it as proof of quality. It is observability; tests, review gates, and evidence reports still decide whether the work is sound.

The same release added /goal, which sets a completion condition and lets Claude continue across turns until the condition is met, including interactive, -p, and Remote Control use.⁴² Treat /goal as a session-scoped completion loop, not a substitute for deterministic gates. It is useful for keeping an agent focused on a target, but tests, citation checks, deploy checks, and security hooks should remain command- or script-backed where failure must block.

Workflow Tool (v2.1.147+)

Claude Code v2.1.147 adds an off-by-default Workflow tool for deterministic multi-agent orchestration. Enable it with CLAUDE_CODE_WORKFLOWS=1.⁵² Architecturally, this is important because it gives Claude Code a first-party orchestration primitive for flows that previously required custom dispatch scripts, mailbox state, and subagent coordination conventions.

Do not delete the harness around it. A Workflow can structure execution, but it does not replace your safety model. Keep PreToolUse and PostToolUse hooks as the blocking layer, keep spawn budgets or workflow step budgets to prevent runaway width, keep filesystem state auditable, and keep final evidence reports outside the model’s self-assessment. In practice: use Workflow for orchestration shape; use hooks, tests, and review gates for truth.

Multi-Agent Orchestration

Single-agent AI systems have a structural blind spot: they cannot challenge their own assumptions.⁷ Multi-agent deliberation forces independent evaluation from multiple perspectives before any decision locks.

Cross-tool orchestration (April 2026): Google open-sourced Scion on April 7 — a multi-agent hypervisor that runs Claude Code, Gemini CLI, and other “deep agents” as concurrent processes, each with isolated container, git worktree, and credentials. Runs local, hub, or Kubernetes. Explicit philosophy: “isolation over constraints” — agents run with high autonomy inside boundaries enforced at the infrastructure layer, not in the prompt.²⁵ This directly extends the subagent-isolation argument across different tool vendors. If your workflow spans Claude and OpenAI models, Scion is the first real reference implementation for cross-tool subagents with per-agent worktree + credential isolation.

Debate is not a silver bullet: The M3MAD-Bench research cluster (early 2026) found that multi-agent debate plateaus and can be subverted by misleading consensus — valid arguments lose when other agents confidently assert the wrong answer.²⁶ Tool-MAD improves this by giving each agent heterogeneous tool access and using Faithfulness/Relevance scores in the judge stage. If you’re building debate-style orchestration, invest in (a) tool heterogeneity per agent and (b) quantitative judge scoring rather than assuming more agents = better answers.

Managed Multiagent Orchestration and Outcomes (Public Beta)

If you don’t want to build the deliberation infrastructure described below, there is a managed alternative: Multiagent Orchestration entered Public Beta in Claude Managed Agents on May 6, 2026.³⁵ Per Anthropic: “When there is too much work for a single agent to do well, multiagent orchestration lets a lead agent break the job into pieces and delegate each one to a specialist with its own model, prompt, and tools.”³⁵ Specialists “work in parallel on a shared filesystem and contribute to the lead agent’s overall context.”³⁵

Tracing comes in the box. Per Anthropic: “you can also trace every step in the Claude Console: which agent did what, in what order, and why, giving you full visibility into how your task was delegated and executed.”³⁵

The companion Public Beta feature is Outcomes. Per Anthropic: “you write a rubric describing what success looks like and the agent works toward it. A separate grader evaluates the output against your criteria in its own context window, so it isn’t influenced by the agent’s reasoning.”³⁵ This is the managed-service version of the two-gate validation pattern documented later in this section: the rubric replaces the hand-written gate, the separate grader replaces the consensus validator.

	Self-Hosted Deliberation (this section)	Managed Multiagent + Outcomes
Specialist routing	You write the spawn logic	Lead agent breaks the job into pieces
Validation	Two-gate hooks + consensus scoring	Rubric + grader in separate context
Tracing	You instrument it	Claude Console
Best for	Patterns that need full control or specific tool composition	Standard delegation patterns where the validation rubric is the contract
Pricing	Token + harness cost only	Standard tokens plus the Managed Agents session-hour rate (April 8 launch base; see ²³)

Self-hosted deliberation remains the right answer when the validation needs to integrate with your own hook surface (PreToolUse blocking, exit-code semantics, custom dispatchers) or when the harness must run without external dependencies. Managed Multiagent is the right answer when standard delegation plus rubric grading is the contract you actually need.

Minimum Viable Deliberation

Start with 2 agents and 1 rule: agents must evaluate independently before seeing each other’s work.⁷

Decision arrives
  |
  v
Confidence check: is this risky, ambiguous, or irreversible?
  |
  +-- NO  -> Single agent decides (normal flow)
  |
  +-- YES -> Spawn 2 agents with different system prompts
             Agent A: "Argue FOR this approach"
             Agent B: "Argue AGAINST this approach"
             |
             v
             Compare findings
             |
             +-- Agreement with different reasoning -> Proceed
             +-- Genuine disagreement -> Investigate the conflict
             +-- Agreement with same reasoning -> Suspect herding

This pattern covers 80% of the value. Everything else adds incremental improvement.
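The comparison step at the bottom of the flow can be sketched as follows. It is a simplified illustration: the `Finding` type and the exact-match test on a reasoning label are assumptions, and a real harness would need a better similarity measure than string equality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    verdict: str    # "proceed" or "reject"
    reasoning: str  # short label for the main argument, e.g. "rollback cost"


def compare_findings(a: Finding, b: Finding) -> str:
    """Map two independent findings onto the three outcomes in the flow above."""
    if a.verdict != b.verdict:
        return "investigate"      # genuine disagreement
    if a.reasoning.strip().lower() == b.reasoning.strip().lower():
        return "suspect_herding"  # same verdict reached by the same argument
    return "proceed"              # same verdict reached by independent arguments
```

The one rule still applies: both findings must be produced before either agent sees the other's output, otherwise the herding branch cannot be trusted.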

The Confidence Trigger

Not every task needs deliberation. A confidence scoring module evaluates four dimensions:¹⁷

Ambiguity - Does the query have multiple valid interpretations?
Domain complexity - Does it require specialized knowledge?
Stakes - Is the decision reversible?
Context dependency - Does it require understanding the broader system?

The score maps to three levels:

Level	Threshold	Action
HIGH	0.85+	Proceed without deliberation
MEDIUM	0.70-0.84	Proceed with confidence note logged
LOW	Below 0.70	Trigger full multi-agent deliberation

The threshold adapts by task type. Security decisions require 0.85 consensus. Documentation changes need only 0.50. This prevents over-engineering simple tasks while ensuring risky decisions get scrutiny.⁷
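A minimal sketch of the routing, using the thresholds from the table. The function names are illustrative, and the 0.70 default consensus threshold for task types the text does not mention is an assumption.

```python
def confidence_level(score: float) -> str:
    """Map a confidence score in [0, 1] to the levels in the table above."""
    if score >= 0.85:
        return "HIGH"
    if score >= 0.70:
        return "MEDIUM"
    return "LOW"


ACTIONS = {
    "HIGH": "proceed",
    "MEDIUM": "proceed_and_log",
    "LOW": "deliberate",
}

# Task-adaptive consensus thresholds. Only the security and documentation
# values come from the text; the 0.70 default is an assumption.
CONSENSUS_THRESHOLDS = {"security": 0.85, "documentation": 0.50}
DEFAULT_CONSENSUS_THRESHOLD = 0.70


def consensus_threshold(task_type: str) -> float:
    return CONSENSUS_THRESHOLDS.get(task_type, DEFAULT_CONSENSUS_THRESHOLD)


def route(score: float) -> str:
    """Return the action for a given confidence score."""
    return ACTIONS[confidence_level(score)]
```

How the four dimensions combine into a single score is not specified here, so the sketch takes the score as an input rather than guessing at a formula.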

The State Machine

Seven states (six on the main path plus FAILED), each gated by the previous:⁷

IDLE -> RESEARCH -> DELIBERATION -> RANKING -> PRD_GENERATION -> COMPLETE
                                                                    |
                                                              (or FAILED)
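The diagram translates into a small transition function. This is a sketch under one assumption that the diagram does not settle: any non-terminal phase whose gate fails moves to FAILED. The `Phase` enum and `advance` function are illustrative names.

```python
from enum import Enum


class Phase(Enum):
    IDLE = "IDLE"
    RESEARCH = "RESEARCH"
    DELIBERATION = "DELIBERATION"
    RANKING = "RANKING"
    PRD_GENERATION = "PRD_GENERATION"
    COMPLETE = "COMPLETE"
    FAILED = "FAILED"


MAIN_PATH = [
    Phase.IDLE,
    Phase.RESEARCH,
    Phase.DELIBERATION,
    Phase.RANKING,
    Phase.PRD_GENERATION,
    Phase.COMPLETE,
]
TERMINAL = {Phase.COMPLETE, Phase.FAILED}


def advance(current: Phase, gate_passed: bool) -> Phase:
    """Move one step along the main path, or to FAILED if the gate did not pass."""
    if current in TERMINAL:
        raise ValueError(f"{current.value} is terminal")
    if not gate_passed:
        return Phase.FAILED  # assumption: failure is possible from any phase
    return MAIN_PATH[MAIN_PATH.index(current) + 1]
```

Because the only forward move is to the next entry in `MAIN_PATH`, a phase cannot be skipped, which is what "gated by the previous" requires.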

RESEARCH: Independent agents investigate the topic. Each agent gets a different persona (Technical Architect, Security Analyst, Performance Engineer, and others). Context isolation ensures agents cannot see each other’s findings during research.

DELIBERATION: Agents see all research findings and generate alternatives. The Debate agent identifies conflicts. The Synthesis agent combines non-contradictory findings.

RANKING: Each agent scores every proposed approach across 5 weighted dimensions:

Dimension	Weight
Impact	0.25
Quality	0.25
Feasibility	0.20
Reusability	0.15
Risk	0.15
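A worked version of the weighted ranking, as a sketch. The weights come from the table; the 0-10 score scale, the convention that a higher risk score means a safer approach, and averaging across agents are assumptions made for the example.

```python
from statistics import mean

WEIGHTS = {
    "impact": 0.25,
    "quality": 0.25,
    "feasibility": 0.20,
    "reusability": 0.15,
    "risk": 0.15,
}


def weighted_score(scores: dict) -> float:
    """Combine one agent's per-dimension scores into a single number."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS)


def rank_approaches(ballots: dict) -> list:
    """ballots maps approach name -> list of per-agent score dicts.

    Returns (name, mean weighted score) pairs, best first.
    """
    totals = {
        name: mean([weighted_score(ballot) for ballot in agent_ballots])
        for name, agent_ballots in ballots.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

For one agent scoring impact 9, quality 7, feasibility 6, reusability 5, and risk 8, the weighted score is 2.25 + 1.75 + 1.20 + 0.75 + 1.20 = 7.15.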

The Two-Gate Validation Architecture

Two validation gates catch problems at different stages:⁷

Gate 1: Consensus Validation (PostToolUse hook). Runs immediately after each deliberation agent completes:

1. Phase must have reached at least RANKING
2. Minimum 2 agents completed (configurable)
3. Consensus score meets the task-adaptive threshold
4. If any agent dissented, concerns must be documented

Gate 2: Pride Check (Stop hook). Runs before the session can close:

1. Diverse methods: multiple unique personas represented
2. Contradiction transparency: dissents have documented reasons
3. Complexity handling: at least 2 alternatives generated
4. Consensus confidence: classified as strong (above 0.85) or moderate (0.70-0.84)
5. Improvement evidence: final confidence exceeds initial confidence

Two hooks at different lifecycle points match how failures actually occur: some are instant (bad score) and some are gradual (low diversity, missing dissent documentation).⁷
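The four Gate 1 checks can be sketched as a pure function that a PostToolUse hook script could call. The `DeliberationState` shape and field names are hypothetical, and the 0.70 default threshold is an assumption; a hook wrapper would exit 2 when the returned list is non-empty.

```python
from dataclasses import dataclass, field

PHASE_ORDER = ["IDLE", "RESEARCH", "DELIBERATION", "RANKING", "PRD_GENERATION", "COMPLETE"]


@dataclass
class DeliberationState:
    phase: str
    agents_completed: int
    consensus_score: float
    # agent name -> documented concern; an empty string means undocumented
    dissents: dict = field(default_factory=dict)


def consensus_gate(state: DeliberationState, threshold: float = 0.70, min_agents: int = 2) -> list:
    """Return the failed checks; an empty list means the gate passes."""
    failures = []
    reached_ranking = (
        state.phase in PHASE_ORDER
        and PHASE_ORDER.index(state.phase) >= PHASE_ORDER.index("RANKING")
    )
    if not reached_ranking:
        failures.append(f"phase {state.phase} has not reached RANKING")
    if state.agents_completed < min_agents:
        failures.append(f"only {state.agents_completed} agents completed, need {min_agents}")
    if state.consensus_score < threshold:
        failures.append(f"consensus {state.consensus_score:.2f} below threshold {threshold:.2f}")
    undocumented = sorted(a for a, concern in state.dissents.items() if not concern.strip())
    if undocumented:
        failures.append(f"undocumented dissent from: {', '.join(undocumented)}")
    return failures
```

Returning every failure rather than stopping at the first gives the agent a complete list to fix in one pass.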

Why Agreement Is Dangerous

Charlan Nemeth studied minority dissent from 1986 through her 2018 book In Defense of Troublemakers. Groups with dissenters make better decisions than groups that reach quick agreement. The dissenter does not need to be right. The act of disagreement forces the majority to examine assumptions they would otherwise skip.¹⁸

Wu et al. tested whether LLM agents can genuinely debate and found that without structural incentives for disagreement, agents converge toward the most confident-sounding initial response regardless of correctness.¹⁹ Liang et al. identified the root cause as “Degeneration-of-Thought”: once an LLM establishes confidence in a position, self-reflection cannot generate novel counterarguments, making multi-agent evaluation structurally necessary.²⁰

Independence is the critical design constraint. Two agents evaluating the same deployment strategy with visibility into each other’s findings produced scores of 0.45 and 0.48. Same agents without visibility: 0.45 and 0.72. The gap between 0.48 and 0.72 is the cost of herding.⁷

Detecting Fake Agreement

A conformity detection module tracks patterns suggesting agents are agreeing without genuine evaluation:⁷

Score clustering: Every agent scoring within 0.3 points on a 10-point scale signals shared context contamination rather than independent assessment. When five agents evaluating an authentication refactor all scored security risk between 7.1 and 7.4, re-running with fresh context isolation spread the scores to 5.8-8.9.

Boilerplate dissent: Agents copying each other’s concern language rather than generating independent objections.

Absent minority perspectives: Unanimous approval from personas with conflicting priorities (a Security Analyst and a Performance Engineer rarely agree on everything).

The conformity detector catches the obvious cases (roughly 10-15% of deliberations where agents converge too quickly). For the remaining 85-90%, the consensus and pride check gates provide sufficient validation.
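Two of the three signals are easy to approximate. The sketch below is illustrative only: the 0.3 window comes from the text, while the minimum agent count, the 0.9 similarity cutoff, and the use of `difflib` for text similarity are assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations


def scores_clustered(scores: list, window: float = 0.3, min_agents: int = 3) -> bool:
    """True when every score on the 10-point scale falls within `window` of the others."""
    if len(scores) < min_agents:
        return False
    # Small tolerance so that a spread of exactly 0.3 survives float rounding.
    return max(scores) - min(scores) <= window + 1e-9


def boilerplate_dissent(concerns: list, similarity: float = 0.9) -> bool:
    """True when any two agents' concerns are near-identical text."""
    normalized = [" ".join(c.lower().split()) for c in concerns]
    return any(
        SequenceMatcher(None, a, b).ratio() >= similarity
        for a, b in combinations(normalized, 2)
    )
```

Scores spanning 7.1 to 7.4 trip the clustering check; the re-run spread of 5.8 to 8.9 does not.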

What Didn’t Work in Deliberation

Free-form debate rounds. Three rounds of back-and-forth text for a database indexing discussion produced 7,500 tokens of debate. Round 1: genuine disagreement. Round 2: restated positions. Round 3: identical arguments in different words. Structured dimension scoring replaced free-form debate, dropping cost by 60% while improving ranking quality.⁷

Single validation gate. The first implementation ran one validation hook at session end. An agent completed deliberation with a 0.52 consensus score (below threshold), then continued on unrelated tasks for 20 minutes before the session-end hook flagged the failure. Splitting into two gates (one at task completion, one at session end) catches score failures as soon as they occur and leaves the slower, session-wide checks for the end.⁷

Cost of Deliberation

Each research agent processes roughly 5,000 tokens of context and generates 2,000-3,000 tokens of findings. With 3 agents, that is 15,000-24,000 additional tokens per decision. With 10 agents, roughly 50,000-80,000 tokens.⁷

At current Opus pricing, a 3-agent deliberation costs approximately $0.68-0.90. A 10-agent deliberation costs $2.25-3.00. The system triggers deliberation on roughly 10% of decisions, so the amortized cost of 10-agent deliberation across all decisions is $0.23-0.30 per decision. Whether that is worth it depends on what a bad decision costs.
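The arithmetic behind these figures can be checked with a short sketch. It rests on one assumption about how the token ranges were derived: the low end counts context tokens only, and the high end adds the largest findings estimate. The function names are illustrative, and no per-token price is assumed.

```python
def token_range(agents: int, context_tokens: int = 5_000, findings_tokens: tuple = (2_000, 3_000)) -> tuple:
    """Estimate (low, high) additional tokens for one deliberation.

    Assumption: low counts context only; high adds the upper findings estimate.
    """
    low = agents * context_tokens
    high = agents * (context_tokens + findings_tokens[1])
    return low, high


def amortized_cost(cost_range: tuple, trigger_rate: float = 0.10) -> tuple:
    """Spread the cost of one deliberation over all decisions, triggered or not."""
    low, high = cost_range
    return round(low * trigger_rate, 3), round(high * trigger_rate, 3)
```

`token_range(3)` and `token_range(10)` reproduce the 15,000-24,000 and 50,000-80,000 ranges, and `amortized_cost((2.25, 3.00))` gives roughly $0.23-0.30 for the 10-agent case.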

When to Deliberate

Deliberate	Skip
Security architecture	Documentation typos
Database schema design	Variable renaming
API contract changes	Log message updates
Deployment strategies	Comment rewording
Dependency upgrades	Test fixture updates

CLAUDE.md Design

CLAUDE.md is operational policy for an AI agent, not a README for humans.²¹ The agent does not need to understand why you use conventional commits. It needs to know the exact command to run and what “done” looks like.

The Precedence Hierarchy

Location	Scope	Shared	Use Case
Enterprise managed settings	Organization	All users	Company standards
`./CLAUDE.md` or `./.claude/CLAUDE.md`	Project	Via git	Team context
`~/.claude/CLAUDE.md`	User	All projects	Personal preferences
`./CLAUDE.local.md`	Project-local	Never	Personal project notes
`.claude/rules/*.md`	Project rules	Via git	Categorized policies
`~/.claude/rules/*.md`	User rules	All projects	Personal policies

Rules files load automatically and provide structured context without cluttering CLAUDE.md.⁶

What Gets Ignored

These patterns reliably produce no observable change in agent behavior:²¹

Prose paragraphs without commands. “We value clean, well-tested code” is documentation, not operations. The agent reads it and proceeds to write code without tests because there is no actionable instruction.

Ambiguous directives. “Be careful with database migrations” is not a constraint. “Run alembic check before applying migrations. Abort if downgrade path is missing.” is.

Contradictory priorities. “Move fast and ship quickly” plus “Ensure comprehensive test coverage” plus “Keep runtime under 5 minutes” plus “Run full integration tests before every commit.” The agent cannot satisfy all four simultaneously and defaults to skipping verification.²¹

Style guides without enforcement. “Follow the Google Python Style Guide” without ruff check --select D gives the agent no mechanism to verify compliance.

What Works

Command-first instructions:

## Build and Test Commands
- Install: `pip install -r requirements.txt`
- Lint: `ruff check . --fix`
- Format: `ruff format .`
- Test: `pytest -v --tb=short`
- Type check: `mypy app/ --strict`
- Full verify: `ruff check . && ruff format --check . && pytest -v`

Closure definitions:

## Definition of Done
A task is complete when ALL of the following pass:
1. `ruff check .` exits 0
2. `pytest -v` exits 0 with no failures
3. `mypy app/ --strict` exits 0
4. Changed files have been staged and committed
5. Commit message follows conventional format: `type(scope): description`

Task-organized sections:

## When Writing Code
- Run `ruff check .` after every file change
- Add type hints to all new functions

## When Reviewing Code
- Check for security issues: `bandit -r app/`
- Verify test coverage: `pytest --cov=app --cov-fail-under=80`

## When Releasing
- Update version in `pyproject.toml`
- Run full suite: `pytest -v && ruff check . && mypy app/ --strict`

Escalation rules:

## When Blocked
- If tests fail after 3 attempts: stop and report the failing test with full output
- If a dependency is missing: check `requirements.txt` first, then ask
- Never: delete files to resolve errors, force push, or skip tests

Writing Order

If starting from scratch, add sections in this priority order:²¹

Build and test commands (the agent needs these before it can do anything useful)
Definition of done (prevents false completions)
Escalation rules (prevents destructive workarounds)
Task-organized sections (reduces irrelevant instruction parsing)
Directory scoping (monorepos: keeps service instructions isolated)

Skip style preferences until the first four are working.

File Imports

Reference other files within CLAUDE.md:

See @README.md for project overview
Coding standards: @docs/STYLE_GUIDE.md
API documentation: @docs/API.md
Personal preferences: @~/.claude/preferences.md

Import syntax: relative (@docs/file.md), absolute (@/absolute/path.md), or home directory (@~/.claude/file.md). Maximum depth: 5 levels of imports.⁶

Cross-Tool Instruction Compatibility

AGENTS.md is an open standard recognized by most major AI coding tools; Claude Code is the notable exception and reads CLAUDE.md instead.²¹ If your team uses multiple tools, write AGENTS.md as the canonical source and mirror relevant sections to tool-specific files:

Tool	Native File	Reads AGENTS.md?
Codex CLI	AGENTS.md	Yes (native)
Cursor	`.cursor/rules`	Yes (native)
GitHub Copilot	`.github/copilot-instructions.md`	Yes (native)
Amp	AGENTS.md	Yes (native)
Windsurf	`.windsurfrules`	Yes (native)
Claude Code	CLAUDE.md	No (separate format)

The patterns in AGENTS.md (command-first, closure-defined, task-organized) work in any instruction file regardless of tool. Do not maintain parallel instruction sets that drift apart. Write one authoritative source and mirror.

Codex Parity Notes

Codex now has first-class equivalents for the major harness layers, but the migration is a pattern translation, not a file copy. Codex reads AGENTS.md before work begins, layering global guidance from ~/.codex with project and nested repository instructions.³¹ Codex skills use the same SKILL.md mental model with progressive disclosure: Codex starts with the skill name, description, and file path, then loads the full skill only when it decides to use it.³² Codex also has native hooks, plugin-bundled hooks, managed hooks, MCP support, and explicit subagent workflows.³³,³⁴

Codex v0.138.0–v0.139.0 hardened that AGENTS.md discovery for non-trivial workspaces: loading now routes through the environment’s filesystem abstraction and preserves logical paths during the discovery walk, so the right file is selected even when the workspace is a remote filesystem or a symlinked tree.⁶¹ This matters whenever your canonical AGENTS.md is the authoritative source and the agent is operating over a mounted, container-materialized, or symlinked checkout — the cases where a naive path walk silently picks the wrong instruction file or none at all. If you mirror one authoritative AGENTS.md across services, treat this as the floor for trusting that the file the agent actually loaded is the one you wrote.

The practical mapping:

Claude Code harness layer	Codex equivalent	Migration rule
`CLAUDE.md` / `.claude/rules/`	`AGENTS.md` / nested `AGENTS.override.md`	Keep commands and completion rules canonical; split only when directory scope genuinely differs
`.claude/skills/<name>/SKILL.md`	`.agents/skills/<name>/SKILL.md` or plugin skill	Port reusable workflows, but rewrite descriptions for Codex’s activation wording and budget
`.claude/settings.json` hooks	Codex `config.toml`, plugin hooks, or managed requirements hooks	Port deterministic gates first; test each hook with real tool events before enabling broadly
`.claude/agents/*.md`	`~/.codex/agents/.toml`, `.codex/agents/.toml`, or built-in `worker` / `explorer`	Port only agents with repeated value; prefer explicit delegation because Codex subagents are explicit
Plugins	Codex plugins	Use plugins as the distribution unit after local hooks and skills are proven

The important difference: Claude subagents can be selected automatically from descriptions, while Codex currently documents subagent workflows as explicit. That makes skills and hooks the right default for always-on harness behavior in Codex; subagents are for deliberate parallel work, review, and exploration.

Testing Your Instructions

Verify the agent actually reads and follows your instructions:

# Check active instructions
claude --print "What instructions are you following for this project?"

# Verify specific rules are active
claude --print "What is your definition of done?"

The acid test: Ask the agent to explain your build commands. If it cannot reproduce them verbatim, the instructions are either too verbose (content pushed out of context), too vague (agent cannot extract actionable instructions), or not being discovered. GitHub’s analysis of 2,500 repositories found that vagueness causes most failures.²¹

Production Patterns

Opus 4.7 Long-Horizon Patterns (April 2026)

Claude Opus 4.7 (April 16, 2026) shipped with specific capabilities that change what a harness needs to defend against:²⁹

Tool-failure resilience: Opus 4.7 continues through tool failures that halted Opus 4.6 sessions. You can reduce — but not eliminate — defensive retry wrappers in subagent code. Keep the hook-level guards; trim the in-prompt “if the tool fails, try again three times” scaffolding.
xhigh effort tier (Opus-4.7 only): Sits between high and max. Recommended default for coding and agentic workloads. On long-running subagents, xhigh meaningfully outperforms high with sub-proportional token cost. max remains the right choice for single-shot hard reasoning; xhigh is better for sustained tasks.
Token-budget ceiling: Configurable per agent run via output_config.task_budget (beta header task-budgets-2026-03-13). The model sees a running countdown and gracefully scopes work to the budget instead of running out unexpectedly. Use for agentic loops where you want predictable token spend without sacrificing quality on short prompts.
Implicit-need awareness: First Claude model to pass “implicit-need” tests — recognizing when the user’s literal request underspecifies what they actually need. This makes CLAUDE.md’s “clarifying rules” section less necessary. If your CLAUDE.md is 200 lines of “also consider X when the user asks for Y” guardrails, prune the ones that are now covered natively.

Worktree Base, Sandbox Paths, and Admin Settings (May 7, 2026)

Claude Code v2.1.133 adds four admin-tier settings worth knowing about for production harnesses:³⁹

Setting	Values	What it does
`worktree.baseRef`	`fresh` (default) | `head`	New worktrees branch from `origin/<default>` again. This reverts the default introduced in v2.1.128, which branched from local `HEAD`. Set `worktree.baseRef: "head"` if your team relies on unpushed commits being available in new worktrees.
`sandbox.bwrapPath`	absolute path	Pin the Bubblewrap binary location on Linux/WSL hosts where it is not on `$PATH` or where you ship a vendored version.
`sandbox.socatPath`	absolute path	Same idea for the `socat` binary used by sandbox networking.
`parentSettingsBehavior`	`'first-wins'` (default) | `'merge'`	Admin-tier control over how SDK `managedSettings` compose with parent enterprise/team settings. `'merge'` lets a child session inherit and extend; `'first-wins'` keeps the parent authoritative.

The worktree.baseRef revert is the one to flag for users: agents that relied on the v2.1.128-v2.1.132 behavior (worktrees branching from local HEAD) lose access to unpushed work in fresh worktrees unless they opt back in.

OTel Feedback Survey for Enterprise Observability (May 8, 2026)

Claude Code v2.1.136 added CLAUDE_CODE_ENABLE_FEEDBACK_SURVEY_FOR_OTEL to re-enable the in-session quality survey for enterprises capturing the responses through OpenTelemetry.⁴⁰ If your org sinks OTel events to a central observability stack, this env var puts the survey back into the data path so quality signal flows through the same pipeline as latency and error metrics. Treat it as opt-in: the default keeps the survey suppressed, which is correct for non-OTel deployments.

The Quality Loop

A mandatory review process for all non-trivial changes:

1. Implement - Write the code
2. Review - Re-read every line. Catch typos, logic errors, unclear sections
3. Evaluate - Run the evidence gate. Check patterns, edge cases, test coverage
4. Refine - Fix every issue. Never defer to “later”
5. Zoom Out - Check integration points, imports, adjacent code for regressions
6. Repeat - If any evidence gate criterion fails, return to step 4
7. Report - List what changed, how verified, cite specific evidence

The Evidence Gate

“I believe” and “it should” are not evidence. Cite file paths, test output, or specific code.

Criterion	Required Evidence
Follows codebase patterns	Name the pattern and file where it exists
Simplest working solution	Explain what simpler alternatives were rejected and why
Edge cases handled	List specific edge cases and how each is handled
Tests pass	Paste test output showing 0 failures
No regressions	Name the files/features checked
Solves the actual problem	State user’s need and how this addresses it

If you cannot produce evidence for any row, return to Refine.²²

Human Merge Authority

A May 2026 arXiv study of 29,585 AI-agent pull-request lifecycles separates operational agency from merge governance.⁴⁷ The useful architecture lesson is simple: agents can start work, carry branches forward, open PRs, review work, and summarize risk, while merge authority remains a separate governance boundary.

Make that boundary explicit in the harness. Let agents prepare PRs and collect evidence; require human approval for merges, releases, and destructive repository operations unless the organization has a separately audited automation policy. Where automation executes a merge, preserve logs that distinguish the executor from the human or policy that authorized it.

Error Handling Patterns

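Atomic file writes. Multiple agents writing to the same state file simultaneously corrupts JSON. Write to a uniquely named temporary file in the same directory, then mv it into place. On the same filesystem mv is a rename, which the OS performs atomically, so readers see either the old file or the new one and never a partial write.¹⁷

# Atomic state update: unique temp file, then rename into place
tmp=$(mktemp "${STATE_FILE}.XXXXXX")
jq --argjson d "$new_depth" '.depth = $d' "$STATE_FILE" > "$tmp" && mv "$tmp" "$STATE_FILE"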

State corruption recovery. If state gets corrupted, the recovery pattern recreates from safe defaults rather than crashing:¹⁶

if ! jq -e '.depth' "$RECURSION_STATE_FILE" &>/dev/null; then
    # Corrupted state file, recreate with safe defaults
    echo '{"depth": 0, "agent_id": "root", "parent_id": null}' > "$RECURSION_STATE_FILE"
    echo "- Recursion state recovered (was corrupted)"
fi
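The same two patterns, atomic replacement and recovery from safe defaults, look like this in Python for harness code that is not written in bash. It is a sketch: the function names are illustrative, and the default state simply mirrors the JSON in the recovery snippet above.

```python
import json
import os
import tempfile

SAFE_DEFAULTS = {"depth": 0, "agent_id": "root", "parent_id": None}


def write_state_atomic(path: str, state: dict) -> None:
    """Write JSON to a temp file in the same directory, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(state, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp, path)  # atomic when source and target share a filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def load_state(path: str) -> dict:
    """Load state, recreating it from safe defaults if missing or corrupted."""
    try:
        with open(path) as handle:
            state = json.load(handle)
        if not isinstance(state, dict) or "depth" not in state:
            raise ValueError("state file has no depth field")
        return state
    except (OSError, ValueError):  # json.JSONDecodeError is a ValueError
        defaults = dict(SAFE_DEFAULTS)
        write_state_atomic(path, defaults)
        return defaults
```

Atomic rename prevents torn files, but it does not prevent lost updates when two agents read, modify, and write concurrently; that needs a lock or a single writer.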

The ((VAR++)) bash trap. ((VAR++)) returns exit code 1 when VAR is 0 because 0++ evaluates to 0, which bash treats as false. With set -e enabled, this kills the script. Use VAR=$((VAR + 1)) instead.¹⁶

Blast Radius Classification

Classify every agent action by blast radius and gate accordingly:²

Classification	Examples	Gate
Local	File writes, test runs, linting	Auto-approve
Shared	Git commits, branch creation	Warn + proceed
External	Git push, API calls, deployments	Require human approval

Remote Control (connecting to local Claude Code from any browser or mobile app) turns the “External” gate from a blocking wait into an async notification. The agent keeps working on the next task while you review the previous one from your phone.²
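A classifier for shell commands might look like the sketch below. The command prefixes are examples chosen to match the table, not a complete policy, and two design choices are assumptions: unknown commands and anything containing shell operators fall through to the strictest gate.

```python
import shlex
from enum import Enum


class Gate(Enum):
    AUTO_APPROVE = "auto-approve"
    WARN = "warn + proceed"
    HUMAN_APPROVAL = "require human approval"


# Illustrative prefixes only; a real policy would be longer and project-specific.
EXTERNAL = [("git", "push"), ("curl",), ("kubectl",)]
SHARED = [("git", "commit"), ("git", "branch")]
LOCAL = [("pytest",), ("ruff",), ("mypy",)]

SHELL_OPERATORS = ";|&`$("


def _matches(tokens: list, prefixes: list) -> bool:
    return any(tuple(tokens[: len(prefix)]) == prefix for prefix in prefixes)


def classify(command: str) -> Gate:
    """Map a shell command to the gate for its blast radius."""
    # Chained or substituted commands could hide an external action, so escalate.
    if any(ch in command for ch in SHELL_OPERATORS):
        return Gate.HUMAN_APPROVAL
    tokens = shlex.split(command)
    if _matches(tokens, EXTERNAL):
        return Gate.HUMAN_APPROVAL
    if _matches(tokens, SHARED):
        return Gate.WARN
    if _matches(tokens, LOCAL):
        return Gate.AUTO_APPROVE
    return Gate.HUMAN_APPROVAL  # unknown commands get the strictest gate
```

Defaulting to the strictest gate means a gap in the rule list costs an approval prompt rather than an unreviewed deployment.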

Task Specification for Autonomous Runs

Effective autonomous tasks include three elements: objective, completion criteria, and context pointers:¹⁶

OBJECTIVE: Implement multi-agent deliberation with consensus validation.

COMPLETION CRITERIA:
- All tests in tests/test_deliberation_lib.py pass (81 tests)
- post-deliberation.sh validates consensus above 70% threshold
- recursion-guard.sh enforces spawn budget (max 12 agents)
- No Python type errors (mypy clean)

CONTEXT:
- Follow patterns in lib/deliberation/state_machine.py
- Consensus thresholds in configs/deliberation-config.json
- Spawn budget model: agents inherit budget, not increment depth

Criteria must be machine-verifiable: test pass/fail, linter output, HTTP status codes, file existence checks. An early task that asked the agent to “write tests that pass” produced assert True and assert 1 == 1. Technically correct. Practically worthless.¹⁶

Criteria Quality	Example	Outcome
Vague	“Tests pass”	Agent writes trivial tests
Measurable but incomplete	“Tests pass AND coverage >80%”	Tests cover lines but test nothing meaningful
Comprehensive	“All tests pass AND coverage >80% AND no type errors AND linter clean AND each test class tests a distinct module”	Production-quality output
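Machine-verifiable criteria reduce to exit codes, which makes them easy to check outside the model. A minimal sketch, assuming each criterion is a command whose exit status decides pass or fail; the `run_criteria` name and the criteria dictionary are illustrative.

```python
import subprocess


def run_criteria(criteria: dict) -> dict:
    """Run each criterion's command and record whether it exited 0.

    criteria maps a label to an argv list, for example
    {"tests": ["pytest", "-v"], "types": ["mypy", "app/", "--strict"]}.
    """
    results = {}
    for label, argv in criteria.items():
        try:
            completed = subprocess.run(argv, capture_output=True, text=True)
            results[label] = completed.returncode == 0
        except FileNotFoundError:
            results[label] = False  # a missing tool counts as a failed criterion
    return results


def all_passed(results: dict) -> bool:
    return bool(results) and all(results.values())
```

Exit codes catch failures but not trivial tests, so a criterion such as "each test class tests a distinct module" still needs its own check or a reviewer.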

Failure Modes to Watch For

Failure Mode	Description	Prevention
Shortcut Spiral	Skipping quality loop steps to finish faster	Evidence gate requires proof for each criterion
Confidence Mirage	“I’m confident” without running verification	Ban hedging language in completion reports
Phantom Verification	Claiming tests pass without running them this session	Stop hook runs tests independently
Deferred Debt	TODO/FIXME/HACK in committed code	PreToolUse hook on git commit scans diff
Filesystem Pollution	Dead-end artifacts from abandoned iterations	Cleanup step in completion criteria
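The Deferred Debt check is the simplest of these to automate. The sketch below scans a unified diff for markers on added lines only; the function name is illustrative, and wiring it into a PreToolUse hook on git commit is left to the harness.

```python
import re

DEBT_MARKER = re.compile(r"\b(TODO|FIXME|HACK)\b")


def find_deferred_debt(diff: str) -> list:
    """Return added lines in a unified diff that contain TODO, FIXME, or HACK."""
    hits = []
    for line in diff.splitlines():
        # Added lines start with "+"; "+++" is the file header, not content.
        if line.startswith("+") and not line.startswith("+++"):
            if DEBT_MARKER.search(line):
                hits.append(line[1:].strip())
    return hits
```

Checking only added lines lets a commit that removes an old TODO pass, while one that introduces a new marker is blocked.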

A Concrete Session Trace

A session trace from an autonomous run processing a PRD with 5 stories:²

SessionStart fires. Dispatcher injects: current date, project detection, philosophy constraints, cost tracking initialization. Five hooks, 180ms total.
Agent reads the PRD, plans the first story. UserPromptSubmit fires. Dispatcher injects: active project context, session drift baseline.
Agent calls Bash to run tests. PreToolUse:Bash fires. Credentials check, sandbox validation, project detection. 90ms. Tests run. PostToolUse:Bash fires: activity heartbeat logged, drift check.
Agent calls Write to create a file. PreToolUse:Write fires: file scope check. PostToolUse:Write fires: lint check, commit tracking.
Agent finishes the story. Stop fires. Quality gate checks: did the agent cite evidence? Hedging language? TODO comments in the diff? If any check fails, the hook exits 2, which blocks the stop, and the agent continues working.
Independent verification: A fresh agent runs the test suite without trusting the previous agent’s self-report.
Three code review agents spawn in parallel. Each reviews the diff independently. If any reviewer flags CRITICAL, the story goes back in the queue.
Story passes. Next story loads. The cycle repeats for all 5 stories.

Total hooks fired across 5 stories: ~340. Total time in hooks: ~12 seconds. That overhead prevented three credential leaks, one destructive command, and two incomplete implementations in a single overnight run.

Case Study: Overnight PRD Processing

A production harness processed 12 PRDs (47 stories) across 8 overnight sessions. Metrics compare the first 4 PRDs (minimal harness: CLAUDE.md only) against the last 8 (full harness: hooks, skills, quality gates, multi-agent review).

Metric	Minimal (4 PRDs)	Full Harness (8 PRDs)	Change
Credential leaks	2 leaked to git	7 blocked pre-commit	Reactive to preventive
Destructive commands	1 force-push to main	4 blocked	Exit 2 enforcement
False completion rate	35% failed tests	4%	Evidence gate + Stop hook
Revision rounds/story	2.1	0.8	Skills + quality loop
Context degradation	6 incidents	1 incident	Filesystem memory
Token overhead	0%	~3.2%	Negligible
Hook time/story	0s	~2.4s	Negligible

The two credential leaks required rotating API keys and auditing downstream services: roughly 4 hours of incident response. The harness overhead that prevented the equivalent was 2.4 seconds of bash per story. The false completion rate dropped from 35% to 4% because the Stop hook independently ran tests before allowing the agent to report done.

Security Considerations

The Five Principles of Trustworthy Agents (Anthropic, April 2026)

Anthropic published a formal framework for agent trustworthiness on April 9, 2026.²⁷ The five principles parallel — and extend — the Evidence Gate thinking in this guide:

Principle	What it means	How this harness satisfies it
Human control	Meaningful human override at every decision point	Hooks gate tool calls; PreCompact blocking; Auto Mode classifier as check-layer
Value alignment	Agent actions track user intent, not adjacent goals	CLAUDE.md as explicit intent specification; skills as capability scoping
Security	Resistance to adversarial inputs and prompt injection	Sandbox + deny-rules + input validation at the hook layer
Transparency	Auditable records of decisions and actions	Hook logging; session transcripts; skill-invocation traces
Privacy	Appropriate data handling and governance	Credential env-var scrubbing; secret detection at hook layer

Anthropic also donated MCP to the Linux Foundation’s Agentic AI Foundation, joining AGENTS.md (now jointly stewarded with OpenAI, Google, Cursor, Factory, Sourcegraph). Agent interoperability standards are now vendor-neutral.²⁷

Skill sandbox tooling: For teams that treat skills as an attack surface, Permiso’s SandyClaw (launched April 2, 2026) runs skills in a dedicated sandbox and delivers evidence-backed verdicts from Sigma/YARA/Nova/Snort detection. First product in the skill-sandbox category.²⁸

The Sandbox

Claude Code supports an optional sandbox mode (enabled via settings.json or the /sandbox command) that restricts network access and filesystem operations using OS-level isolation (seatbelt on macOS, bubblewrap on Linux). When enabled, the sandbox prevents the model from making arbitrary network requests or accessing files outside the project directory. Without sandboxing, Claude Code uses a permission-based model where you approve or deny individual tool calls.¹³

May 2026 security floor. Claude Code v2.1.149 fixed a PowerShell working-directory permission bypass, several PowerShell allow-rule and stale-variable permission-analysis gaps, and a git-worktree sandbox write-allowlist bug that covered the full main repository root instead of only shared git internals.⁵³ If your harness allows PowerShell or worktree-isolated agents, treat v2.1.149+ as the floor and keep shell rules narrow. Broad PowerShell(*) and all-repo write exceptions are orchestration shortcuts, not safety boundaries.

OpenAI Agents SDK sandbox lockdown (v0.17.0, May 8, 2026). On the OpenAI side, openai-agents-python v0.17.0 tightened a parallel boundary: LocalFile.src and LocalDir.src are now constrained to within the materialization base_dir (the SDK process current working directory when the manifest is applied), unless the source is explicitly granted via Manifest.extra_path_grants with SandboxPathGrant.⁴¹ Relative local sources resolve from base_dir; absolute paths must already sit inside it or carry a grant. This closes a local artifact boundary issue: prior versions allowed manifests to pull arbitrary host paths into a sandbox workspace. Migration: declare trusted host roots at the manifest level with SandboxPathGrant(path=..., read_only=True) for read-only mounts. Treat extra_path_grants as trusted application configuration; never populate grants from model output or untrusted manifest input.
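
The containment rule described above (sources must sit inside `base_dir` unless a root is explicitly granted) can be illustrated with plain `pathlib`. This is a sketch of the idea only, not the SDK's code: `is_source_allowed` and its `grants` tuple are hypothetical stand-ins for the manifest-level grants, and POSIX-style paths are assumed.

```python
from pathlib import Path


def is_source_allowed(src: str, base_dir: str, grants: tuple[str, ...] = ()) -> bool:
    """Allow src only if it resolves inside base_dir or an explicitly granted root."""
    base = Path(base_dir).resolve()
    # Relative sources resolve from base_dir; an absolute src replaces base entirely.
    candidate = (base / src).resolve()
    roots = [base, *(Path(g).resolve() for g in grants)]
    # resolve() collapses ".." segments, so traversal out of a root is caught here.
    return any(candidate.is_relative_to(root) for root in roots)
```

The grants argument is deliberately a value the application passes in. Consistent with the guidance above, it should come from trusted configuration and never from model output.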

OpenAI Agents SDK follow-up floor (v0.17.3). The 0.17.1-0.17.3 line added more sandbox and session hardening: archive extraction limits, GitRepo subpath validation, clearer sandbox-provider errors, mountpoint credentials kept out of sandbox commands, rejection of relative sandbox workspace roots, and Vercel-sandbox terminal-state handling.⁵⁴ If you are using OpenAI-hosted or provider-backed sandboxes rather than only Claude Code hooks, treat 0.17.3 as the current floor for the patterns in this section.

Permission Boundaries

The permission system gates operations at multiple levels:

Level	Controls	Example
Tool permissions	Which tools can be used	Restrict subagent to Read, Grep, Glob
File permissions	Which files can be modified	Block writes to `.env`, `credentials.json`
Command permissions	Which bash commands can run	Block `rm -rf`, `git push --force`
Network permissions	Which domains can be accessed	Allowlist for MCP server connections
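
As a rough illustration of the command-permission row, a PreToolUse hook could run a deny-list check like the one below before a Bash call executes. The patterns and the `is_blocked` helper are hypothetical, and regex deny rules are easy to evade through shell indirection, so treat this as one layer alongside the sandbox rather than a complete control.

```python
import re

# Hypothetical deny patterns mirroring the table's examples.
DENY_PATTERNS = [
    # rm with both -r and -f in one flag group, in either order (-rf, -fr, -Rf is not covered).
    re.compile(r"\brm\s+(-[a-zA-Z]*r[a-zA-Z]*f|-[a-zA-Z]*f[a-zA-Z]*r)\b"),
    # git push with --force or a standalone -f flag.
    re.compile(r"\bgit\s+push\b.*(--force\b|\s-f\b)"),
]


def is_blocked(command: str) -> bool:
    """Return True if the command matches any deny pattern."""
    return any(pattern.search(command) for pattern in DENY_PATTERNS)
```

A hook script would call `is_blocked` on the command string from its input and exit with code 2 when it returns True.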

Prompt Injection Defense

Skills and hooks provide defense-in-depth against prompt injection:

Skills with tool restrictions prevent a compromised prompt from gaining write access:

allowed-tools: Read, Grep, Glob

PreToolUse hooks validate every tool call regardless of how the model was prompted:

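# Block credential file access regardless of prompt.
# Hook input arrives as JSON on stdin; this assumes the path is at .tool_input.file_path and jq is installed.
FILE_PATH=$(jq -r '.tool_input.file_path // empty')
if printf '%s\n' "$FILE_PATH" | grep -qE '\.(env|pem|key|credentials)$'; then
    echo "BLOCKED: Sensitive file access" >&2
    exit 2
fi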

Subagent isolation limits blast radius. A subagent with permissionMode: plan cannot make changes even if its prompt is compromised.

Agent Logs and Guardrails Are Security Surfaces

Two May 2026 advisories reinforce a pattern: agent infrastructure creates new places for sensitive content and executable policy to leak or escape. GitHub Advisory GHSA-f3jg-756w-gm35 covers a Gryph Agents payload-filter issue where sensitive tool-payload content could remain in local SQLite logs under default logging behavior.⁴⁵ OSV GHSA-wxxx-gvqv-xp7p covers a LiteLLM custom-code guardrail sandbox escape in an admin-protected proxy endpoint.⁴⁶

The production rule: treat agent transcripts, tool payloads, SQLite logs, and guardrail execution as sensitive infrastructure. Redact before persistence, apply retention limits, and keep custom guardrail code sandboxed and reviewable. A prompt-level “do not log secrets” rule is not enough; the logging and guardrail path needs deterministic tests.
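
A minimal sketch of "redact before persistence", assuming log entries are plain strings and that secrets look like bearer tokens or `key=value` assignments. The patterns and the `redact` helper are illustrative, not taken from either advisory, and a real deployment would extend them for its own credential formats.

```python
import re

# Hypothetical secret shapes; extend for the credentials your stack actually uses.
SECRET_PATTERNS = [
    re.compile(r"(?i)(bearer\s+)[A-Za-z0-9._\-]+"),
    re.compile(r"(?i)((?:api[_-]?key|token|password)\s*[=:]\s*)\S+"),
]


def redact(text: str) -> str:
    """Replace secret-looking values with a placeholder before the text is stored."""
    for pattern in SECRET_PATTERNS:
        # Keep the label (group 1) so the log stays readable; drop only the value.
        text = pattern.sub(r"\1[REDACTED]", text)
    return text
```

Because the function is pure, it is straightforward to cover with the deterministic tests the rule calls for, for example asserting that `redact("API_KEY=sk-12345 ls")` no longer contains the key.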

Hook Security

HTTP hooks that interpolate environment variables into headers require an explicit allowedEnvVars list to prevent arbitrary environment variable exfiltration:¹³

{
  "type": "http",
  "url": "https://api.example.com/notify",
  "headers": {
    "Authorization": "Bearer $MY_TOKEN"
  },
  "allowedEnvVars": ["MY_TOKEN"]
}
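
To make the allowlist behavior concrete, here is a sketch of header interpolation that expands only allowlisted variables. It is not Claude Code's implementation: `interpolate_headers` is a hypothetical helper, only the `$VAR` form is handled, and the sketch assumes that references to unlisted variables resolve to an empty string.

```python
import re

VAR = re.compile(r"\$(\w+)")


def interpolate_headers(headers: dict, allowed: list, env: dict) -> dict:
    """Expand $VAR references only for allowlisted names; others become empty."""
    def expand(match):
        name = match.group(1)
        # Assumption: unlisted variables are dropped rather than left as literal text.
        return env.get(name, "") if name in allowed else ""
    return {key: VAR.sub(expand, value) for key, value in headers.items()}
```

With this shape, a header that references a variable outside the list, such as a cloud secret key, never carries that value to the remote endpoint.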

The Human-Agent Division of Responsibility

Security in agent architectures requires a clear division between human and agent responsibilities:¹⁷

Human Responsibility	Agent Responsibility
Problem definition	Pipeline execution
Confidence thresholds	Execution within thresholds
Consensus requirements	Consensus computation
Quality gate criteria	Quality gate enforcement
Error analysis	Error detection
Architecture decisions	Architecture options
Domain context injection	Documentation generation

The pattern: humans own decisions that require organizational context, ethical judgment, or strategic direction. Agents own decisions that require computational search across large possibility spaces. Hooks enforce the boundary.

Recursive Hook Enforcement

Hooks fire for subagent actions too.¹³ If Claude spawns a subagent via the Agent tool, your PreToolUse and PostToolUse hooks execute for every tool the subagent uses. Without recursive hook enforcement, a subagent could bypass your safety gates. The SubagentStop event lets you run cleanup or validation when a subagent completes.

This is not optional. An agent that spawns a subagent without your security hooks is an agent that can force-push to main, read credential files, or run destructive commands while your gates watch a main conversation that is doing nothing.

Cost as Architecture

Cost is an architectural decision, not an operational afterthought.² Three levels:

Token level. System prompt compression. Remove tutorial code examples (the model knows the APIs), collapse duplicate rules across files, and replace explanations with constraints. “Reject tool calls matching sensitive paths” does the same work as a 15-line explanation of why credentials should not be read.

Agent level. Fresh spawns over long conversations. Each story in an autonomous run gets a new agent with a clean context. The context never balloons because each agent starts fresh. Briefing instead of memory: models execute a clear briefing better than they navigate 30 steps of accumulated context.

Architecture level. CLI-first over MCP when the operation is stateless. A claude --print call for a one-shot evaluation costs less and adds no connection overhead. MCP makes sense when the tool needs persistent state or streaming.

Decision Framework

When to use each mechanism:

Problem	Use	Why
Format code after every edit	PostToolUse hook	Must happen every time, deterministically
Block dangerous bash commands	PreToolUse hook	Must block before execution, exit code 2
Apply security review patterns	Skill	Domain expertise that auto-activates on context
Explore codebase without polluting context	Explore subagent	Isolated context, returns summary only
Run experimental refactoring safely	Worktree-isolated subagent	Changes can be discarded if they fail
Review code from multiple perspectives	Parallel subagents or Agent Team	Independent evaluation prevents blind spots
Decide on irreversible architecture	Multi-agent deliberation	Confidence trigger + consensus validation
Persist decisions across sessions	MEMORY.md	Filesystem survives context boundaries
Share team standards	Project CLAUDE.md + .claude/rules/	Git-distributed, loads automatically
Define project build/test commands	CLAUDE.md	Command-first instructions the agent can verify
Run long autonomous development	Ralph loop (fresh-context iteration)	Full context budget per iteration, filesystem state
Notify Slack when session ends	Async Stop hook	Non-blocking, does not slow the session
Validate quality before commit	PreToolUse hook on git commit	Block the commit if lint/tests fail
Enforce completion criteria	Stop hook	Prevent agent from stopping before task is done

Skills vs Hooks vs Subagents

Dimension	Skills	Hooks	Subagents
Invocation	Automatic (LLM reasoning)	Deterministic (event-driven)	Explicit or auto-delegated
Guarantee	Probabilistic (model decides)	Deterministic (always fires)	Deterministic (isolated context)
Context cost	Injected into main context	Zero (runs outside LLM)	Separate context window
Token cost	Description budget (1% of window, fallback 8,000 characters)	Zero	Full context per subagent
Best for	Domain expertise	Policy enforcement	Focused work, exploration

FAQ

How many hooks is too many?

Performance, not count, is the constraint. Each hook runs synchronously, so total hook execution time adds to every matched tool call. 95 hooks across user-level and project-level settings run without noticeable latency when each hook completes in under 200ms. The threshold to watch: if a PostToolUse hook adds more than 500ms to every file edit, the session feels sluggish. Profile your hooks with time before deploying them.¹⁴
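
One way to profile a hook outside a live session is to run it once with a sample payload and time it. The sketch below assumes the hook is an executable that reads JSON on stdin; `time_hook`, the empty default payload, and the 200ms default budget (taken from the figure above) are illustrative choices.

```python
import subprocess
import sys
import time


def time_hook(command: list, payload: str = "{}", budget_ms: float = 200.0) -> dict:
    """Run a hook command once with a JSON payload on stdin and time it."""
    start = time.perf_counter()
    result = subprocess.run(command, input=payload, text=True, capture_output=True)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return {
        "exit_code": result.returncode,
        "ms": elapsed_ms,
        "within_budget": elapsed_ms <= budget_ms,
    }


if __name__ == "__main__":
    # Stand-in hook: a Python one-liner that reads stdin and exits 0.
    print(time_hook([sys.executable, "-c", "import sys; sys.stdin.read()"]))
```

A single run includes process start-up noise, so repeating the measurement a few times and looking at the slowest run gives a more honest number.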

Can hooks block Claude Code from running a command?

Yes. PreToolUse hooks block any tool action by exiting with code 2. Claude Code cancels the pending action and shows the hook’s stderr output to the model. Claude sees the rejection reason and suggests a safer alternative. Exit 1 is a non-blocking warning where the action still proceeds.³

Where should I put hook configuration files?

Hook configurations go in .claude/settings.json for project-level hooks (committed to your repository, shared with your team) or ~/.claude/settings.json for user-level hooks (personal, applied to every project). Project-level hooks take precedence when both exist. Use absolute paths for script files to avoid working-directory issues.¹⁴

Does every decision need deliberation?

No. The confidence module scores decisions across four dimensions (ambiguity, complexity, stakes, context dependency). Only decisions scoring below 0.70 overall confidence trigger deliberation, roughly 10% of total decisions. Documentation fixes, variable renames, and routine edits skip deliberation entirely. Security architecture, database schema changes, and irreversible deployments trigger it consistently.⁷
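
The threshold logic can be sketched as follows. Only the four dimension names and the 0.70 cutoff come from the text above. The 0 to 1 risk scale, the unweighted mean, and the names `Decision` and `needs_deliberation` are assumptions made for illustration, since the guide does not state how the module aggregates its scores.

```python
from dataclasses import dataclass

THRESHOLD = 0.70  # from the guide: below this overall confidence, deliberate


@dataclass
class Decision:
    # Each dimension is a 0..1 risk score (higher means riskier); an assumed convention.
    ambiguity: float
    complexity: float
    stakes: float
    context_dependency: float

    def confidence(self) -> float:
        # Assumed aggregation: confidence is one minus the unweighted mean risk.
        risk = (self.ambiguity + self.complexity + self.stakes + self.context_dependency) / 4
        return 1.0 - risk


def needs_deliberation(decision: Decision) -> bool:
    return decision.confidence() < THRESHOLD
```

Under these assumptions a variable rename scored at 0.1 on every dimension skips deliberation, while a schema change with high stakes falls below the cutoff and triggers it.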

How do I test a system designed to produce disagreement?

Test both success paths and failure paths. Success: agents disagree productively and reach consensus. Failure: agents converge too quickly, never converge, or exceed spawn budgets. End-to-end tests simulate each scenario with deterministic agent responses, verifying that both validation gates catch every documented failure mode. A production deliberation system runs 141 tests across three layers: 48 bash integration tests, 81 Python unit tests, and 12 end-to-end pipeline simulations.⁷

What is the latency impact of deliberation?

A 3-agent deliberation adds 30-60 seconds of wall-clock time (agents run sequentially through the Agent tool). A 10-agent deliberation adds 2-4 minutes. The consensus and pride check hooks each run in under 200ms. The primary bottleneck is LLM inference time per agent, not orchestration overhead.⁷

How long should a CLAUDE.md file be?

Keep each section under 50 lines and the total file under 150 lines. Long files get truncated by context windows, so front-load the most critical instructions: commands and closure definitions before style preferences.²¹
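
These limits are easy to check mechanically. The sketch below treats every line starting with `#` as a Markdown heading, which is a simplification (a `#` comment inside a code block would be miscounted), and `claude_md_violations` is a hypothetical helper name.

```python
def claude_md_violations(text: str, max_total: int = 150, max_section: int = 50) -> list[str]:
    """Report the file and any heading-delimited sections that exceed the limits."""
    lines = text.splitlines()
    violations = []
    if len(lines) > max_total:
        violations.append(f"file has {len(lines)} lines (limit {max_total})")
    title, count = "(preamble)", 0
    for line in lines + ["# <end>"]:  # sentinel heading flushes the last section
        if line.startswith("#"):
            if count > max_section:
                violations.append(f"section '{title}' has {count} lines (limit {max_section})")
            title, count = line.lstrip("#").strip(), 0
        else:
            count += 1
    return violations
```

Run against a project's CLAUDE.md in CI or a pre-commit step, an empty list means the file is within both limits.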

Can this work with tools other than Claude Code?

The architectural principles (hooks as deterministic gates, skills as domain expertise, subagents as isolated contexts, filesystem as memory) apply conceptually to any agentic system. The specific implementation uses Claude Code’s lifecycle events, matcher patterns, and Agent tool. AGENTS.md carries the same patterns to Codex, Cursor, Copilot, Amp, and Windsurf.²¹ The harness pattern is tool-agnostic even if the implementation details are tool-specific.

Quick Reference Card

Hook Configuration

{
  "hooks": {
    "PreToolUse": [{"matcher": "Bash", "hooks": [{"type": "command", "command": "script.sh"}]}],
    "PostToolUse": [{"matcher": "Write|Edit", "hooks": [{"type": "command", "command": "format.sh"}]}],
    "Stop": [{"matcher": "", "hooks": [{"type": "agent", "prompt": "Verify tests pass. $ARGUMENTS"}]}],
    "SessionStart": [{"matcher": "", "hooks": [{"type": "command", "command": "setup.sh"}]}]
  }
}
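
To show how a configuration like the one above selects hooks, here is a sketch of matcher resolution. It is an approximation for reasoning about configs, not Claude Code's own logic: `matching_hooks` is a hypothetical helper, and it assumes an empty matcher applies to everything and that non-empty matchers behave like regular expressions matched against the whole tool name.

```python
import re


def matching_hooks(config: dict, event: str, tool_name: str) -> list:
    """Collect hook entries whose matcher applies to tool_name for the given event."""
    selected = []
    for group in config.get("hooks", {}).get(event, []):
        matcher = group.get("matcher", "")
        # Assumption: empty matcher matches all; otherwise treat it as a full-match regex.
        if matcher == "" or re.fullmatch(matcher, tool_name):
            selected.extend(group.get("hooks", []))
    return selected
```

For the sample configuration, an `Edit` call would select `format.sh` under `PostToolUse`, while a `Read` call would select nothing.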

Skill Frontmatter

---
name: my-skill
description: What it does and when to use it. Include trigger phrases.
allowed-tools: Read, Grep, Glob
---

Subagent Definition

---
name: my-agent
description: When to invoke. Include PROACTIVELY for auto-delegation.
tools: Read, Grep, Glob, Bash
model: opus
permissionMode: plan
---

Instructions for the subagent.
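
For linting definitions like the two above, a small frontmatter reader is often enough. This sketch handles only flat `key: value` lines between `---` delimiters and is not a YAML parser. `parse_frontmatter` is a hypothetical helper, and nested or multi-line values are out of scope.

```python
def parse_frontmatter(text: str):
    """Split a skill or subagent file into (fields, body) using simple key: value lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    end = lines.index("---", 1)  # raises ValueError if the closing delimiter is missing
    fields = {}
    for line in lines[1:end]:
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return fields, body
```

A repository check could use the parsed fields to confirm, for example, that every subagent declared with `permissionMode: plan` lists only read-oriented tools.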

Exit Codes

Code	Meaning	Use For
0	Success	Allow the operation
2	Block	Security gates, quality gates
1	Non-blocking warning	Logging, advisory messages

Key Commands

Command	Purpose
`/compact`	Compress context, preserve decisions
`/context`	View context allocation and active skills
`/agents`	Manage subagents
`/goal <condition>`	Keep Claude working toward a completion condition
`claude agents`	Open Agent View for running, blocked, and completed sessions
`CLAUDE_CODE_WORKFLOWS=1`	Enable the Workflow tool for deterministic multi-agent orchestration
`claude -c`	Continue most recent session
`claude --print`	One-shot CLI invocation (no conversation)
`# <note>`	Add note to memory file
`/memory`	View and manage auto-memory

File Locations

Path	Purpose
`~/.claude/CLAUDE.md`	Personal global instructions
`.claude/CLAUDE.md`	Project instructions (git-shared)
`.claude/settings.json`	Project hooks and permissions
`~/.claude/settings.json`	User hooks and permissions
`~/.claude/skills/<name>/SKILL.md`	Personal skills
`.claude/skills/<name>/SKILL.md`	Project skills (git-shared)
`~/.claude/agents/<name>.md`	Personal subagent definitions
`.claude/agents/<name>.md`	Project subagent definitions
`.claude/rules/*.md`	Project rule files
`~/.claude/rules/*.md`	User rule files
`~/.claude/projects/{path}/memory/MEMORY.md`	Auto-memory

Changelog

Date	Change
2026-06-10	Guide v1.18: Recursive sub-agents (Claude Code v2.1.172). Added a note to the Recursion Guard subsection: Claude Code sub-agents can now spawn their own sub-agents, nesting up to 5 levels deep — where delegation was previously effectively one level (v2.1.172, June 10). Reframed the userland spawn-budget/depth-cap pattern as the control that keeps a 5-level tree from fanning out, with 5 levels treated as a platform ceiling rather than a default.
2026-06-09	Guide v1.17: Claude Code v2.1.169–v2.1.170 + Codex v0.138.0–v0.139.0 governance and multi-agent-v2 hardening. Wove five verified harness-architecture changes into the body. Skills System gained a “Hiding the Bundled Surface as Governance” subsection: the `disableBundledSkills` setting (and `CLAUDE_CODE_DISABLE_BUNDLED_SKILLS` env var) hides bundled skills, workflows, and built-in slash commands from the model as a deliberate attack-surface reduction (v2.1.169). The June Hook-Architecture subsection added the `--safe-mode` flag (and `CLAUDE_CODE_SAFE_MODE`), which starts a session with all customizations disabled — CLAUDE.md, plugins, skills, hooks, MCP — for clean-room troubleshooting and governance (v2.1.169), plus a model-tier note: Anthropic’s Claude Fable 5 (`claude-fable-5`) launched June 9 as a Mythos-class tier above Opus, selectable via `/model claude-fable-5` in v2.1.170, with Opus 4.8 remaining Claude Code’s agentic default. Memory and Context added the `/cd` command (v2.1.169), which moves a session to a new working directory without breaking the mid-session prompt cache. Multi-Agent Orchestration / Codex Parity hardened for production: `close_agent` renamed `interrupt_agent` (v0.139.0), encrypted inter-agent message payloads, a v2 agent config catalog, agent-residency LRU, and concurrency counted by active execution (v0.138.0), AGENTS.md discovery routed through environment filesystems with preserved logical paths for correct file selection on remote/symlinked workspaces (v0.138.0/v0.139.0), and subagent MCP-startup warnings scoped to the owning thread instead of duplicating into the parent (v0.139.0).
2026-06-08	Guide v1.16: June agent-architecture patterns from Claude Code v2.1.162–v2.1.166 + Codex v0.137.0. Added the “Stop-hook steering, cross-session authority, and multi-agent v2” subsection covering four harness-relevant changes: (1) `Stop`/`SubagentStop` hooks can return `hookSpecificOutput.additionalContext` to inject “not done yet, here is why” feedback and continue the turn without a hook-error block (v2.1.163); (2) cross-session messaging hardened so `SendMessage`-relayed messages from another session no longer carry the originating user’s authority — treat inbound inter-agent messages as untrusted data (v2.1.166); (3) the `fallbackModel` setting chains up to three backup models with a one-shot fallback retry on non-retryable API errors, and `claude agents --json` adds a `waitingFor` field for fleet observability (v2.1.162/166); (4) Codex multi-agent v2 (v0.137.0) keeps the runtime with each thread, defaults `hide_spawn_agent_metadata` to true, propagates parent events to child listeners, and adds a v1 skills extension with per-turn catalog resolution and thread-start/turn-error lifecycle contributor events. No spec change to AGENTS.md (still Agentic-AI-Foundation-stewarded, no versioned changelog).
2026-05-31	Guide v1.15: Claude Code v2.1.157 + Hermes v0.15.1/v0.15.2 patches. Added the “Plugin and Skill Convergence in `.claude/skills/`” subsection: Claude Code v2.1.157 makes any folder in a project’s `.claude/skills/` directory auto-load as a plugin without marketplace registration, and `claude plugin init <name>` scaffolds a fresh plugin with manifest + SKILL.md there. The harness implication is real: small-scope project tooling no longer pays the manifest tax to live in version control; plugins still own the bundled-installable ZIP shape. Same release ships `EnterWorktree` mid-session switching between Claude-managed worktrees and leaves background worktrees unlocked after the agent finishes so `git worktree remove`/`prune` work cleanly. Hermes Agent v0.15.1 (May 29) is the same-day Velocity hotfix: dashboard 401 reload-loop fix on loopback mode, Docker now requires explicit `HERMES_DASHBOARD_INSECURE=1`, MCP bare commands (`npx`, `npm`, `node`) resolve in Docker, Skills page restored, Kanban workers respond to SIGTERM cleanly, Skills.sh catalog grew 858 → 19,932 entries via sitemap. Hermes v0.15.2 (May 29) is a packaging-only hotfix that bundles `plugin.yaml` manifests in wheel and sdist distributions.
2026-05-28	Guide v1.14: Claude Code v2.1.152-v2.1.154 + Codex v0.134.0-v0.135.0 + Hermes v0.15.0 architecture-pattern pass. Claude Code shifted defaults and added orchestration primitives: Opus 4.8 is now the default with high effort by default and a new `/effort xhigh`; dynamic workflows orchestrate tens to hundreds of agents in the background via `/workflows`; lean system prompt is now default for all models except Haiku/Sonnet/Opus 4.7-and-earlier; the new `MessageDisplay` hook event lets hooks transform or hide assistant text as it is displayed; `disallowed-tools` in skill/command frontmatter removes tools while the skill is active; `/reload-skills` re-scans skill directories without restart; `SessionStart` hooks can return `reloadSkills: true` and set `hookSpecificOutput.sessionTitle`; `--fallback-model` switches mid-session when the primary is missing; auto mode no longer requires opt-in consent; `pluginSuggestionMarketplaces` managed setting allowlists org marketplaces for context-aware suggestions; `claude agents` accepts `! <command>` background-shell sessions; plugins can declare `defaultEnabled: false`; stdio MCP subprocess env now includes `CLAUDE_CODE_SESSION_ID` and `CLAUDECODE=1`. Codex v0.134.0 made `--profile` the primary profile selector across CLI, TUI permissions, and sandbox flows (legacy configs rejected with migration guidance), added local conversation-history search, improved MCP setup with per-server environment targeting and OAuth for streamable HTTP servers, and let read-only MCP tools run concurrently when they advertise `readOnlyHint`; v0.135.0 added richer `codex doctor` diagnostics, `/status` remote details, vim text-object editing, named permission profiles in `/permissions`, and `Sandbox` presets in the Python SDK. Hermes Agent v0.15.0 (May 28) ships the Velocity release: `run_agent.py` refactored 76% across 14 modules, multi-agent Kanban v2 with auto-decomposition and swarm topology, Bitwarden Secrets Manager replacing per-provider keys with one bootstrap token, Promptware defense against Brainworm-class prompt injection at three security chokepoints, skill bundles, a TUI session orchestrator for multi-session management in one terminal, and a 4,500× faster `session_search` with the LLM dependency removed. Harness-architecture implications: the named-profile pattern (Codex `--profile`, Claude Code `pluginSuggestionMarketplaces`) is becoming the standard configuration primitive for multi-tenant agent runtimes; concurrent read-only MCP tools (Codex `readOnlyHint`) are the right pattern to fan out non-mutating context fetches; the `MessageDisplay` hook gives operators a first-class transformation surface that wasn’t reachable from `PostToolUse` or `Stop`; and the lean-system-prompt default removes the long-standing trade-off between operator-defined context and provider scaffolding.
2026-05-24	Guide v1.13: Claude Code v2.1.150 + OpenAI Agents SDK v0.17.3 security/currentness pass. Local `claude --version` returned `2.1.144 (Claude Code)` while npm latest for `@anthropic-ai/claude-code` returned `2.1.150` and GitHub latest release returned `v2.1.150`. Added v2.1.149 harness guidance for PowerShell permission-bypass fixes, PowerShell allow-rule/stale-variable permission-analysis fixes, and the git-worktree sandbox write-allowlist fix; noted that v2.1.150 is internal infrastructure only with no announced user-facing changes. PyPI latest for `openai-agents` returned `0.17.3`, so the OpenAI sandbox section now notes 0.17.1-0.17.3 follow-up hardening for archive extraction, GitRepo subpaths, sandbox credentials, relative workspace roots, and provider terminal-state handling.⁵³⁵⁴
2026-05-21	Guide v1.12: Claude Code v2.1.147 Workflow pass. Local `claude --version` returned `2.1.144 (Claude Code)` while npm latest for `@anthropic-ai/claude-code` returned `2.1.147`. Added the off-by-default `Workflow` tool as a first-party deterministic multi-agent orchestration primitive and clarified that hooks, tests, review gates, spawn budgets, and evidence reports remain the correctness boundary.⁵²
2026-05-15	Guide v1.11: Claude Code v2.1.142 background-session and plugin reliability pass. Local `claude --version` returned `2.1.141 (Claude Code)` while npm latest for `@anthropic-ai/claude-code` returned `2.1.142`. Added operator guidance for new `claude agents` dispatch flags, Opus 4.7 Fast-mode default, root-level plugin `SKILL.md` discovery, plugin LSP visibility, `MCP_TOOL_TIMEOUT` remote HTTP/SSE behavior, and background-session / daemon / plugin-cache reliability fixes.⁵¹
2026-05-14	Guide v1.10: Claude Code v2.1.141 operator-signaling and scoping pass. Local `claude --version` returned `2.1.141 (Claude Code)` and npm latest for `@anthropic-ai/claude-code` returned `2.1.141`. Added hook guidance for `terminalSequence` as operator signaling rather than enforcement, noted `claude agents --cwd <path>` for directory-scoped Agent View, and documented the architecture impact of `CLAUDE_CODE_PLUGIN_PREFER_HTTPS` plus `ANTHROPIC_WORKSPACE_ID` for plugin installation and workload-identity federation scoping.⁵⁰
2026-05-13	Guide v1.9: Claude Code v2.1.140 reliability pass. Local `claude --version` returned `2.1.140 (Claude Code)`. Added `subagent_type` to agent-hook guidance and updated the hook governance section for v2.1.140 fixes to `ConfigChange`, `disableAllHooks`, `allowManagedHooksOnly`, permission-dialog env-var display, custom style reset after settings sync, Windows Git Bash native-package fallback, and `/scroll-speed` behavior.⁴⁹
2026-05-11	Guide v1.8: Claude Code v2.1.139 currentness pass + focused agent-security/memory scan. Verified local `claude --version` as 2.1.139 and added v2.1.139 operational changes: Agent View via `claude agents`, `/goal` completion loops, command-hook `args`, `PostToolUse` `continueOnBlock`, MCP `CLAUDE_PROJECT_DIR`, and OpenTelemetry active-time fix.⁴²⁴³⁴⁴ Added memory-curation warning from “The Memory Curse” arXiv preprint, human merge-authority guidance from the PR-lifecycle arXiv preprint, and agent-log/guardrail security guidance from the Gryph Agents and LiteLLM advisories.⁴⁵⁴⁶⁴⁷⁴⁸ Fixed stale Skills vs Hooks vs Subagents token-budget row from 2% to the current 1% / 8,000-character skill-description budget.
2026-05-09	Guide v1.7: Day-3 follow-up on Claude Code v2.1.136 + openai-agents-python v0.17.0. Added `autoMode.hard_deny` and v2.1.136 hook/plugin fixes subsection to Hook Architecture covering the new unconditional-block tier, MCP-disappears-after-`/clear` fix across VS Code/JetBrains/Agent SDK, MCP OAuth refresh-token loss on concurrent refresh, plan-mode write-block fix when `Edit(...)` allow rule matched, plugin `Stop`/`UserPromptSubmit` cache-cleanup race, `skills` entry hiding default `skills/` dir, and `CLAUDE_ENV_FILE` SessionStart-hook env vars going stale after `/resume`/`/clear`.⁴⁰ Added OTel Feedback Survey subsection to Production Patterns covering `CLAUDE_CODE_ENABLE_FEEDBACK_SURVEY_FOR_OTEL`.⁴⁰ Extended The Sandbox subsection with openai-agents-python v0.17.0 lockdown: `LocalFile.src` / `LocalDir.src` constrained to within `base_dir` unless granted via `Manifest.extra_path_grants` with `SandboxPathGrant`.⁴¹ Added RealtimeAgent default-model note (`gpt-realtime-2`) to Managed vs. Self-Hosted Harnesses.⁴¹ Changelog-only: Claude Code v2.1.137 (Win VSCode activation fix), v2.1.138 (internal fixes); `claude-agent-sdk-python` v0.1.78 (CLI v2.1.136 bundle), v0.1.79 (CLI v2.1.137 bundle), v0.1.80 (CLI v2.1.138 bundle).
2026-05-08	Guide v1.6: Day-2 follow-up on Claude Code v2.1.132/v2.1.133 + SDK v0.1.77. Added SDK Skill Surface subsection to Skills System covering the `skills` option on `ClaudeAgentOptions` and the deprecation of `"Skill"` in `allowed_tools`.³⁷ Added Effort and Session Provenance subsection to Hook Architecture covering the new `effort.level` JSON field + `$CLAUDE_EFFORT` env var on hook input, and the `CLAUDE_CODE_SESSION_ID` env var on Bash subprocesses.³⁸³⁹ Added Subagent skill discovery fix to the Subagent Configuration Fields table (subagents now discover project, user, and plugin skills via the `Skill` tool, silently dropped before v2.1.133).³⁹ Added Worktree Base, Sandbox Paths, and Admin Settings subsection to Production Patterns covering `worktree.baseRef` (breaking-default revert back to `origin/<default>` from local `HEAD`), `sandbox.bwrapPath`, `sandbox.socatPath`, and `parentSettingsBehavior`.³⁹
2026-05-07	Guide v1.5: Claude Managed Agents, May 6 SF expansion. Added Strategy 5 (Managed Memory Curation: Dreaming, Research Preview) to Memory and Context with table contrasting filesystem-as-memory vs. Dreaming.³⁵ Added Managed Multiagent Orchestration (Public Beta) and Outcomes (Public Beta) at the top of Multi-Agent Orchestration with verbatim Anthropic quotes on shared-filesystem specialists and Claude Console tracing, plus a comparison table vs. self-hosted deliberation. Added SDK-side hook event streaming subsection covering `claude-agent-sdk-python` v0.1.74’s `include_hook_events` and `HookEventMessage`.³⁶ Changelog-only: Claude Code v2.1.124-v2.1.131 (`claude project purge`, `--dangerously-skip-permissions` for project dirs, `skill_activated` `invocation_trigger`, PostToolUse format-on-save fix, PreToolUse JSON+exit-2 blocking fix, `skillOverrides` settings); `claude-agent-sdk-python` v0.1.72 (CLI 2.1.126), v0.1.73 (`session_store_flush`), v0.1.75 (CLI 2.1.131), v0.1.76 (`api_error_status`); openai-agents-python v0.15.0-v0.16.1 with v0.16.0 (May 7) defaulting to gpt-5.4-mini, removing the implicit `max_turns` ceiling, and adding SDK-side tool execution concurrency.
2026-05-07	Guide v1.4: Refreshed Claude Code hook and skill mechanics against current official docs and local runtime evidence (`claude --version` 2.1.132, `codex --version` returned `codex-cli 0.128.0`). Updated the hook surface from 22/26+ to 29 documented events, fixed skill-description budget from 2%/16,000 to 1%/8,000, changed hook-type count from four to five with `mcp_tool`, removed the unsupported fixed “10 parallel subagents” claim, and added a public-safe Codex parity section covering AGENTS.md, skills, hooks, plugins, and explicit subagent workflows.
2026-04-29	Guide v1.3: Expanded the OpenAI Agents SDK coverage in the Managed vs. Self-Hosted Harnesses section with the named SDK surface from `openai-agents` Python v0.14.0 (April 15) — `SandboxAgent`, `Manifest`, `SandboxRunConfig`, sandbox memory with progressive disclosure, workspace mounts (S3/R2/GCS/Azure), portable snapshots, and the local/Docker/hosted client backends (Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, Vercel). Replaced the secondary Help Net Security citation with the primary v0.14.0 release-notes citation. Added a short note on `claude-agent-sdk-python` v0.1.69-v0.1.71 (April 28-29) as the third self-hosted option (embed Claude Code runtime as a Python library): bundled Claude CLI bumped to v2.1.123, raised `mcp` dependency floor to `>=1.19.0` (older versions silently dropped `CallToolResult` from in-process MCP tools), Trio nursery cancellation fix, and `SandboxNetworkConfig` allowlist-field parity with TS SDK. v0.14.7-v0.14.8 SDK refinements documented in `[^58]`.
2026-04-25	Guide v1.2: Google Cloud Next 2026 (April 22-24) — Vertex AI rebranded to Gemini Enterprise Agent Platform; Agentspace absorbed into unified Gemini Enterprise; Workspace Studio (no-code agent builder); 200+ models in Model Garden including Anthropic Claude; partner agents from Box, Workday, Salesforce, ServiceNow; ADK v1.0 stable across four languages; Project Mariner (web-browsing agent); managed MCP servers with Apigee as API-to-agent bridge; A2A protocol v1.0 in production at 150 organizations. Microsoft Agent Framework 1.0 (April 2026): stable APIs, LTS commitment, full MCP support, .NET + Python. The browser-based DevUI that visualizes agent execution and tool calls in real time ships as a preview alongside the 1.0 stable surface. Salesforce Headless 360 (April 15, TDX): every Salesforce capability (CRM, service, marketing, ecommerce) exposed as API/MCP tool/CLI command so agents like Claude Code, Cursor, and Codex can build on the platform without a browser. (TDX 2026 ran April 15-16; the Headless 360 announcement is dated April 15.) MetaComp StableX KYA (April 21): Know Your Agent governance framework for regulated financial services (payments, compliance, wealth) — first of its kind from a licensed financial institution; available across Claude, Claude Code, OpenClaw, and other compatible AI platforms. Claude Managed Agents pricing: $0.08 per session-hour while a session is running, with no runtime charge while idle — on top of normal Claude model token rates. (Per Anthropic’s Claude pricing page; the public-beta launch was April 8, 2026.) Memory for Managed Agents entered public beta on April 23, 2026 under the `managed-agents-2026-04-01` beta header. All Managed Agents endpoints now require this beta header.
2026-04-16	Guide v1.1: Added Managed vs. Self-Hosted Harnesses section covering Claude Managed Agents (April 8 beta) and OpenAI Agents SDK harness/compute separation (April 16). Added Scion cross-tool multi-agent hypervisor (April 7, Google). Documented M3MAD-Bench debate plateau finding. Added The Five Principles of Trustworthy Agents (Anthropic, April 9) + MCP/AGENTS.md Linux Foundation governance. Permiso SandyClaw skill-sandbox reference. New Opus 4.7 Long-Horizon Patterns: tool-failure resilience, `xhigh` effort tier, token-budget ceiling (`task_budget` beta), implicit-need awareness reducing CLAUDE.md scaffolding.
2026-03-24	Initial publication

References

Andrej Karpathy on “claws” as a new layer on top of LLM agents. HN discussion (406 points, 917 comments). ↩
Author’s implementation. 84 hooks, 48 skills, 19 agents, ~15,000 lines of orchestration. Documented in Claude Code as Infrastructure. ↩↩↩↩↩↩↩↩
Anthropic, “Claude Code Hooks: Exit Codes.” code.claude.com/docs/en/hooks. Exit 0 allows, exit 2 blocks, exit 1 warns for most events; WorktreeCreate is stricter. ↩↩↩↩↩
Anthropic, “Extend Claude with Skills.” code.claude.com/docs/en/skills. Skill structure, frontmatter fields, LLM-based matching, and 1% / 8,000-character description budget. ↩↩↩↩↩↩↩
Anthropic, “Claude Code Sub-agents.” code.claude.com/docs/en/sub-agents. Isolated context, worktree support, agent teams. ↩↩↩↩↩
Anthropic, “Claude Code Documentation.” docs.anthropic.com/en/docs/claude-code. Memory files, CLAUDE.md, auto-memory. ↩↩↩↩↩
Author’s multi-agent deliberation system. 10 research personas, 7-phase state machine, 141 tests. Documented in Multi-Agent Deliberation. ↩↩↩↩↩↩↩↩↩↩↩↩↩↩↩↩↩
Simon Willison, “Writing code is cheap now.” Agentic Engineering Patterns. ↩
Laban, Philippe, et al., “LLMs Get Lost In Multi-Turn Conversation,” arXiv:2505.06120, May 2025. Microsoft Research and Salesforce. 15 LLMs, 200,000+ conversations, 39% average performance drop. ↩↩↩
Mikhail Shilkov, “Inside Claude Code Skills: Structure, Prompts, Invocation.” mikhail.io. Independent analysis of skill discovery, context injection, and available_skills prompt section. ↩
Claude Code Source, SLASH_COMMAND_TOOL_CHAR_BUDGET. github.com/anthropics/claude-code. ↩
Anthropic, “Skill Authoring Best Practices.” platform.claude.com. 500-line limit, supporting files, naming conventions. ↩
Anthropic, “Claude Code Hooks: Lifecycle Events.” code.claude.com/docs/en/hooks. 29 documented lifecycle events, hook types, matcher behavior, async hooks, HTTP hooks, prompt hooks, agent hooks, and MCP tool hooks. ↩↩↩↩↩↩↩
Author’s Claude Code hooks tutorial. 5 production hooks from scratch. Documented in Claude Code Hooks Tutorial. ↩↩↩↩↩
Author’s context window management across 50 sessions. Documented in Context Window Management. ↩↩↩↩↩
Author’s Ralph Loop implementation. Fresh-context iteration with filesystem state, spawn budgets. Documented in The Ralph Loop. ↩↩↩↩↩↩↩
Author’s deliberation system architecture. 3,500 lines of Python, 12 modules, confidence trigger, consensus validation. Documented in Building AI Systems: From RAG to Agents. ↩↩↩
Nemeth, Charlan, In Defense of Troublemakers: The Power of Dissent in Life and Business, Basic Books, 2018. ↩
Wu, H., Li, Z., and Li, L., “Can LLM Agents Really Debate?” arXiv:2511.07784, 2025. ↩
Liang, T. et al., “Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate,” EMNLP 2024. ↩
Author’s AGENTS.md analysis across real-world repositories. Documented in AGENTS.md Patterns. See also: GitHub Blog, “How to Write a Great agents.md: Lessons from Over 2,500 Repositories.” ↩↩↩↩↩↩↩↩
Author’s quality loop and evidence gate methodology. Part of the Jiro Craftsmanship system. ↩
Anthropic, “Claude Managed Agents Overview”. Public beta launched April 8, 2026. Harness-as-a-service with session checkpointing, bundled sandbox, REST API. Pricing: standard tokens + $0.08/session-hour. Beta header managed-agents-2026-04-01. ↩↩
OpenAI, “openai-agents Python v0.14.0 release notes”. Released April 15, 2026; announcement covered April 16. Introduces the Sandbox Agents SDK surface as a beta layer over the existing Agent / Runner flow: SandboxAgent, Manifest (workspace contract), SandboxRunConfig, capabilities (shell, filesystem editing, image inspection, skills, sandbox memory, compaction), workspace mounts (local, Git, remote: S3, R2, GCS, Azure Blob, S3 Files), portable snapshots with path normalization and symlink preservation, and run-state serialization for resume. Backends: UnixLocalSandboxClient, DockerSandboxClient, and hosted clients for Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, Vercel via optional extras. The April 16 announcement summarized at Help Net Security. ↩↩
Google Cloud, “Scion: Multi-Agent Hypervisor”. Open-sourced April 7, 2026. Orchestrates Claude Code, Gemini CLI, and other deep agents as isolated processes with per-agent container, git worktree, and credentials. Local/hub/Kubernetes deployment modes. InfoQ coverage. ↩
Multi-agent debate research cluster, Q1–Q2 2026. Wu et al., “Can LLM Agents Really Debate?” (arXiv 2511.07784); M3MAD-Bench — multi-model multi-agent debate benchmark showing performance plateaus and susceptibility to misleading consensus; Tool-MAD — heterogeneous tool assignment per agent + Faithfulness/Relevance judge scores. ↩
Anthropic, “Our framework for developing safe and trustworthy agents”. April 9, 2026. Five principles: human control, value alignment, security, transparency, privacy. MCP donation to Linux Foundation’s Agentic AI Foundation. ↩↩
Permiso Security, “SandyClaw: First Dynamic Sandbox for AI Agent Skills”. April 2, 2026. Skill execution sandbox with Sigma/YARA/Nova/Snort detection and evidence-backed verdicts. ↩
Anthropic, “Introducing Claude Opus 4.7”. April 16, 2026. Long-horizon agent improvements: 3× SWE-Bench production task resolution vs Opus 4.6, tool-failure resilience, xhigh effort tier, task budgets (beta), implicit-need awareness. See also What’s new in Opus 4.7 for Messages API breaking changes. ↩
Composite reference — OpenAI openai-agents-python v0.14.7 (April 28, 2026) and v0.14.8 (April 29, 2026); Anthropic claude-agent-sdk-python v0.1.69 (April 28), v0.1.70 (April 28), and v0.1.71 (April 29). v0.14.7 highlights: tool_name/call_id convenience properties on tool items, raised Phase 2 memory consolidation turn limit, GPT-5.5 aliases for sandbox compaction, tar/zip member validation tightening, symlink rejection on LocalFile sources, removal of unset fields from Responses API calls. v0.14.8 highlights: preserve MCP re-export import errors, delimit sandbox prompt-instruction sections. claude-agent-sdk-python v0.1.69 added docstrings to ClaudeAgentOptions fields and bumped the bundled CLI to v2.1.121; v0.1.70 raised the mcp dependency floor to >=1.19.0 (older versions silently dropped CallToolResult returns from in-process MCP tool handlers), fixed Trio nursery corruption on early cancellation when iterating query() with options.stderr set (spawn_detached() now used for the stderr reader), and bumped the bundled CLI to v2.1.122; v0.1.71 added domain-allowlist fields (allowedDomains, deniedDomains, allowManagedDomainsOnly, allowMachLookup) to SandboxNetworkConfig for parity with the TypeScript schema, and bumped the bundled CLI to v2.1.123. ↩
OpenAI, “Custom instructions with AGENTS.md”. Codex reads global and project AGENTS.md / AGENTS.override.md files before work, merges root-to-current-directory guidance, and caps project docs by project_doc_max_bytes. ↩
OpenAI, “Agent Skills”. Codex skills use SKILL.md, progressive disclosure, explicit $skill invocation, and implicit activation from descriptions. ↩
OpenAI, “Codex Hooks”. Codex hooks support command hooks in config, plugin hooks, managed hooks, matchers for supported events, stdin JSON input, and JSON output fields. ↩
OpenAI, “Codex Subagents” and “Codex CLI 0.128.0 changelog”. Codex supports explicit parallel subagent workflows, built-in default, worker, and explorer agents, custom TOML agents, inherited sandbox policy, plugin-bundled hooks, hook enablement state, and persisted /goal workflows in 0.128.0. ↩
Anthropic, “New in Claude Managed Agents”. May 6, 2026. Dreaming (Research Preview): scheduled background process that reviews agent sessions and memory stores, extracts patterns, and curates memories. Outcomes (Public Beta): rubric-based evaluation in which a separate grader scores output against the rubric in its own context window so it is not influenced by the agent’s reasoning. Multiagent Orchestration (Public Beta): lead agent delegates pieces of a job to specialists, each with its own model, prompt, and tools; specialists work in parallel on a shared filesystem and contribute to the lead agent’s overall context, with full per-step tracing in the Claude Console. ↩↩↩↩↩↩↩↩
Anthropic, claude-agent-sdk-python v0.1.74. May 6, 2026. Adds include_hook_events to ClaudeAgentOptions; when set, hook events (PreToolUse, PostToolUse, Stop, others) are emitted by the CLI and yielded from the message stream as HookEventMessage, mirroring the TypeScript SDK’s includeHookEvents. Bundled Claude CLI bumped to v2.1.129. ↩↩
Anthropic, claude-agent-sdk-python v0.1.77. May 8, 2026. Deprecates the "Skill" value in allowed_tools in favor of a dedicated skills option on ClaudeAgentOptions, gives Claude Code more structured signal about available skills, improves error messages on Command failed exceptions, and bundles Claude CLI v2.1.133. ↩↩
Anthropic, Claude Code v2.1.132. May 6, 2026. Adds CLAUDE_CODE_SESSION_ID env var on Bash tool subprocesses (matches the session_id hooks already see), CLAUDE_CODE_DISABLE_ALTERNATE_SCREEN to keep conversation in native scrollback, refreshed /tui fullscreen startup banner (lower memory, mouse support, auto-copy on selection), and roughly twenty bug fixes spanning SIGINT graceful shutdown, surrogate emoji --resume corruption, plan-mode --permission-mode flag, Indic and ZWJ cursor handling, NFD vim ops, paste-starts-with-/ swallow, MCP unbounded memory, MCP tools/list retry, Bedrock + Vertex ENABLE_PROMPT_CACHING_1H 400, and statusline context_window showing cumulative tokens. ↩↩
Anthropic, Claude Code v2.1.133. May 7, 2026. Hooks now receive effort.level JSON input + $CLAUDE_EFFORT env var (also readable from Bash commands). Subagents discover project, user, and plugin skills via the Skill tool (regression fix). New admin settings: worktree.baseRef (fresh | head) reverts the worktree base back to origin/<default> after v2.1.128’s switch to local HEAD; sandbox.bwrapPath and sandbox.socatPath pin sandbox binaries on Linux/WSL; parentSettingsBehavior ('first-wins' | 'merge') controls how SDK managedSettings compose with parent settings. Other fixes: parallel-session 401-after-refresh-token-race, drive-root allow-rule scoping, MCP OAuth proxy/mTLS support, Remote Control stop/interrupt completing cancel, cross-session /effort leakage, --remote-control listed in --help. ↩↩↩↩↩↩
Anthropic, Claude Code v2.1.136. May 8, 2026. Adds settings.autoMode.hard_deny for auto-mode classifier rules that block unconditionally regardless of user intent or allow exceptions, and CLAUDE_CODE_ENABLE_FEEDBACK_SURVEY_FOR_OTEL to re-enable the in-session quality survey for enterprises capturing responses through OpenTelemetry. Operator-impact fixes: MCP servers from .mcp.json, plugins, and claude.ai connectors silently disappearing after /clear in VS Code, JetBrains, and Agent SDK; MCP OAuth refresh tokens being lost on concurrent refresh; plan mode not blocking file writes when a matching Edit(...) allow rule existed; plugin Stop/UserPromptSubmit hooks failing when cache cleanup deleted a still-running version; skills entry in plugin.json hiding the plugin’s default skills/ directory; CLAUDE_ENV_FILE SessionStart-hook env vars going stale after /resume or /clear. Plus roughly thirty additional polish and reliability fixes spanning the TUI, autocomplete, and terminal rendering. Companion releases: v2.1.137 (May 9, VSCode extension Windows activation fix), v2.1.138 (May 9, internal fixes); claude-agent-sdk-python v0.1.78, v0.1.79, and v0.1.80 bumped the bundled Claude CLI to v2.1.136, v2.1.137, and v2.1.138 respectively. ↩↩↩↩
OpenAI, openai-agents-python v0.17.0. May 8, 2026. RealtimeAgent defaults to gpt-realtime-2. Sandbox local-source materialization now constrains LocalFile.src and LocalDir.src to within the manifest base_dir (the SDK process current working directory when the manifest is applied) unless the source is explicitly granted via Manifest.extra_path_grants with SandboxPathGrant. Relative local sources resolve from base_dir; absolute sources must already be inside it or under an explicit grant. Migration: declare trusted host roots at the manifest level, preferably read-only. Treat extra_path_grants as trusted application configuration; do not populate from model output or untrusted manifest input. Also includes a Responses context-management extra_args collision fix. ↩↩↩↩
Anthropic, Claude Code v2.1.139. May 2026. Current-session local evidence on May 11, 2026: claude --version returned 2.1.139 (Claude Code). Release notes add Agent View (claude agents), /goal, hook args: string[], continueOnBlock for PostToolUse, CLAUDE_PROJECT_DIR for MCP stdio servers, plugin command interpolation for ${CLAUDE_PROJECT_DIR}, and fixes including claude_code.active_time.total OpenTelemetry emission in --print mode. ↩↩↩↩↩
Anthropic, “Manage multiple agents with agent view”. Agent View docs describe dispatching and managing many Claude Code sessions from one screen, seeing what each session is doing, and identifying sessions that need operator input. The page identifies Agent View as Research Preview and documents local session limitations. ↩↩↩
Anthropic, “Claude Code Hooks”. Hook docs covering command-hook fields, PreToolUse, PostToolUse, exit-code behavior, hook input/output, and direct slash-command expansion paths. ↩↩
GitHub Advisory Database, GHSA-f3jg-756w-gm35 / CVE-2026-45046. “Gryph Agents Payload Filter Fails to Strip Tool Payload for Sensitive Content.” Published May 2026; describes sensitive file-write payload content remaining in local SQLite logs under default logging behavior and fixed in Gryph v0.7.0. ↩↩
OSV, GHSA-wxxx-gvqv-xp7p / CVE-2026-40217. “LiteLLM has a sandbox escape in custom-code guardrail.” Published May 11, 2026; describes an admin-protected POST /guardrails/test_custom_code endpoint running user-supplied Python in a hand-rolled sandbox and recommends upgrading or blocking the endpoint if unable to upgrade. ↩↩
Young Jo (seph) Chung and Safwat Hassan, “Collaborator or Assistant? How AI Coding Agents Partition Work Across Pull Request Lifecycles”, arXiv:2605.08017v1, May 2026. The abstract reports analysis of 29,585 PR lifecycles across OpenAI, Copilot, Devin, Cursor, and Claude Code, distinguishing operational agency from merge governance. ↩↩
Jiayuan Liu et al., “The Memory Curse: How Expanded Recall Erodes Cooperative Intent in LLM Agents”, arXiv:2605.08060v1, May 2026. The abstract reports experiments across 7 LLMs and 4 games over 500 rounds where expanded accessible history degraded cooperation in 18 of 28 model-game settings. ↩↩
Anthropic, Claude Code v2.1.140. May 12, 2026. Adds subagent_type to agent hook input and fixes ConfigChange hooks, disableAllHooks, allowManagedHooksOnly, permission-dialog env-var display from hook results, custom style resets after settings updates, native package resolution fallback on Windows Git Bash, and /scroll-speed. ↩↩↩
Anthropic, Claude Code v2.1.141. May 13, 2026. Adds terminalSequence to hook JSON output for desktop notifications, window titles, and bells; CLAUDE_CODE_PLUGIN_PREFER_HTTPS for HTTPS plugin-source cloning; ANTHROPIC_WORKSPACE_ID for workload identity federation workspace scoping; claude agents --cwd <path> for Agent View directory filtering; /feedback session attachment options for the last 24 hours or 7 days; and related agent, background-job, hook, MCP, Remote Control, permission-dialog, and terminal-rendering fixes. Current-session verification on May 14, 2026: claude --version returned 2.1.141 (Claude Code) and npm view @anthropic-ai/claude-code version dist-tags.latest time.modified --json returned latest 2.1.141. ↩↩↩
Anthropic, Claude Code v2.1.142. May 14, 2026. Adds claude agents dispatch flags for background sessions (--add-dir, --settings, --mcp-config, --plugin-dir, --permission-mode, --model, --effort, --dangerously-skip-permissions), changes Fast mode to Opus 4.7 by default with CLAUDE_CODE_OPUS_4_6_FAST_MODE_OVERRIDE=1 as the pinning override, surfaces root-level plugin SKILL.md files as skills when no skills/ directory exists, shows plugin-provided LSP servers in plugin details, warns before replacing an existing GitHub App connection, and fixes MCP_TOOL_TIMEOUT, background-session worktree, daemon sleep/wake, post-upgrade daemon cleanup, plugin cache, and Agent View reliability issues. Current-session verification on May 15, 2026: claude --version returned 2.1.141 (Claude Code) and npm latest returned 2.1.142. ↩↩
Anthropic, Claude Code v2.1.147. May 21, 2026. Adds the off-by-default Workflow tool for deterministic multi-agent orchestration (CLAUDE_CODE_WORKFLOWS=1), pinned background sessions, /code-review [effort] --comment replacing /simplify, REPL and Workflow sandbox hardening, auto-updater diagnostics, large-diff rendering improvements, prompt-history deduplication, and fixes for enterprise login restrictions, PowerShell behavior, MCP pagination, Agent View, plugins, hook conditions, pasted text, and stripped-image loops. Current-session verification on May 21, 2026: claude --version returned 2.1.144 (Claude Code) and npm view @anthropic-ai/claude-code version dist-tags.latest time.modified --json returned latest 2.1.147 with time.modified 2026-05-21T20:38:35.053Z. ↩↩↩
Anthropic, Claude Code v2.1.148, v2.1.149, v2.1.150, and Claude Code CHANGELOG. v2.1.148 fixes a Bash exit-code regression from v2.1.147. v2.1.149 adds per-category limit usage to /usage, /diff keyboard scrolling, GFM task-list rendering, and Enterprise allowAllClaudeAiMcps; harness-relevant fixes include PowerShell cd permission bypasses, PowerShell prefix/wildcard and stale-variable permission analysis, git-worktree sandbox write-allowlist scope, Bash find vnode exhaustion on macOS, managed-settings approval freezes, otelHeadersHelper path-space diagnostics, and Remote Control session rename sync. v2.1.150 is internal infrastructure only. Current-session verification on May 24, 2026: local claude --version returned 2.1.144 (Claude Code) while npm latest returned 2.1.150 with time.modified 2026-05-23T04:03:10.243Z; GitHub latest release returned v2.1.150 published 2026-05-23T04:03:51Z. ↩↩↩
OpenAI, openai-agents-python v0.17.1, v0.17.2, and v0.17.3. v0.17.1 adds sandbox-provider error details, archive extraction limits, GitRepo subpath validation, and tracing/session/realtime fixes. v0.17.2 fixes Conversations reasoning persistence, local approval rejection reasons, AsyncSQLiteSession settings, and realtime unknown-tool behavior. v0.17.3 keeps mountpoint credentials out of sandbox commands, rejects relative sandbox workspace roots, handles terminal Vercel sandbox states, and fixes output-schema, guardrail, runtime, and memory import edge cases. Current-session verification on May 24, 2026: python3 -m pip index versions openai-agents returned latest 0.17.3; GitHub latest release returned v0.17.3 published 2026-05-19T01:27:36Z. ↩↩
Claude Code Changelog (canonical), v2.1.152 release notes, v2.1.153 release notes, v2.1.154 release notes. v2.1.152 (May 27) adds the MessageDisplay hook event, disallowed-tools in skill/command frontmatter, /reload-skills, SessionStart hook reloadSkills and sessionTitle outputs, /code-review --fix apply-to-working-tree, pluginSuggestionMarketplaces managed setting, removal of auto-mode opt-in, and --fallback-model mid-session switching. v2.1.153 (May 28) makes /model save as the new-session default with s for session-only, adds skipLfs to plugin marketplaces, surfaces COLUMNS/LINES in status-line env, and persists macOS background-agent Privacy & Security grants. v2.1.154 (May 28) makes Opus 4.8 the default with high effort by default and a new /effort xhigh, introduces dynamic workflows via /workflows, makes Fast mode on Opus 4.8 available at 2× rate for 2.5× speed, defaults the lean system prompt for all models except Haiku/Sonnet/Opus 4.7-and-earlier, lets claude agents accept ! <command> for background-shell sessions, lets plugins declare defaultEnabled: false, passes CLAUDE_CODE_SESSION_ID and CLAUDECODE=1 to stdio MCP subprocess env, and deprecates CLAUDE_CODE_OPUS_4_6_FAST_MODE_OVERRIDE (removed June 1). ↩
Codex Changelog (OpenAI Developers) and openai/codex releases. Codex CLI 0.134.0 (May 26, 2026) added local conversation-history search, made --profile the primary profile selector across CLI/TUI/sandbox flows with legacy-config migration, improved MCP setup with per-server environment targeting plus OAuth for streamable HTTP servers, made connector tool schemas more reliable by preserving local $ref/$defs and compacting oversized schemas before exposure, and enabled concurrent execution of read-only MCP tools advertising readOnlyHint. Codex CLI 0.135.0 (May 28, 2026) added richer codex doctor diagnostics, surfaced remote connection details and server version in /status, added vim text-object editing with improved word/line-end behavior and configurable interrupt-turn, made /permissions understand named permission profiles, packaged a bundled patched zsh helper across supported macOS and Linux, and added friendly Sandbox presets to the Python SDK for thread and turn APIs. ↩
Hermes Agent v0.15.0 release notes. “The Velocity release.” 1,302 commits, 747 merged PRs, 321 community contributors. run_agent.py refactored 76% (16,083 → 3,821 lines across 14 modules). Multi-agent Kanban platform with auto-decomposition, swarm topology, per-task model overrides, scheduled tasks, and worktree management. session_search redesigned 4,500× faster with the LLM dependency removed. Promptware defense against Brainworm-class prompt injection at three security chokepoints. Bitwarden Secrets Manager integration replacing per-provider keys with a single bootstrap token. Skill bundles for loading multiple skills with one slash command. TUI session orchestrator for multi-session management in one terminal. Krea 2 and FAL image-generation providers; xAI integration round (web-search plugin, OAuth upstream, retired-model detection, natural TTS pauses). ↩
Claude Code v2.1.157 release notes and the Claude Code Changelog (canonical). May 29, 2026. Plugins placed in a project’s .claude/skills/ directory now load automatically without requiring a marketplace; claude plugin init <name> scaffolds a fresh plugin in that directory; /plugin gained argument autocomplete. Also: EnterWorktree can switch between Claude-managed worktrees mid-session, background worktrees are left unlocked after the agent finishes so git worktree remove/prune work cleanly, and tool_decision telemetry events include tool_parameters when OTEL_LOG_TOOL_DETAILS=1. Also includes bug fixes for unprocessable images (now degrade to text placeholders), sandbox network permission prompts in auto/bypass mode, background-session retire-on-park, and terminal rendering across tmux / VS Code / Cursor / Windsurf. ↩
Claude Code Changelog (canonical) and Codex CLI v0.137.0 release notes, June 2026. Claude Code v2.1.162 (June 3) added waitingFor to claude agents --json; v2.1.163 (June 4) added hookSpecificOutput.additionalContext for Stop/SubagentStop non-error feedback; v2.1.166 (June 6) hardened cross-session SendMessage authority (relayed messages no longer carry user authority) and added the fallbackModel setting (up to three fallbacks, one-shot retry on non-retryable errors). Codex CLI v0.137.0 (June 4) shipped multi-agent v2 (runtime-with-thread, hide_spawn_agent_metadata default true, parent→child event propagation), a v1 skills extension with per-turn catalog resolution, and thread-start/turn-error lifecycle contributor events; the Codex subagents docs confirm default/worker/explorer agent types and agents.max_threads/max_depth concurrency controls. AGENTS.md (agents.md) publishes no versioned spec change. Current-session verification June 8, 2026. ↩
Anthropic, Claude Code v2.1.169 release notes and v2.1.170 release notes, June 8–9, 2026. v2.1.169 adds the disableBundledSkills setting plus CLAUDE_CODE_DISABLE_BUNDLED_SKILLS (hides bundled skills, workflows, and built-in slash commands from the model); the --safe-mode flag plus CLAUDE_CODE_SAFE_MODE (starts a session with all customizations disabled: CLAUDE.md, plugins, skills, hooks, and MCP servers); and the /cd command (moves a session to a new working directory without breaking the prompt cache). v2.1.170 makes Claude Fable 5 (claude-fable-5) selectable via /model claude-fable-5, with Opus 4.8 remaining Claude Code’s agentic default. Model-tier launch: Anthropic, “Claude Fable 5”, June 9, 2026 — a “Mythos-class” tier above Opus, described as Anthropic’s most powerful model made safe for general use. ↩↩↩↩
OpenAI, Codex CLI rust-v0.138.0 release notes (June 8, 2026) and rust-v0.139.0 release notes (June 9, 2026). v0.138.0 hardens multi-agent v2 with encrypted inter-agent message payloads, a v2 agent config catalog, an agent-residency LRU, and concurrency counted by active execution rather than spawned threads. v0.139.0 renames the close_agent lifecycle API to interrupt_agent, and scopes subagent MCP startup warnings to the owning thread so they no longer duplicate into the parent. AGENTS.md discovery is hardened across both releases: loading routes through environment filesystems and preserves logical paths during discovery, ensuring correct file selection for remote and symlinked workspaces. ↩↩↩
Anthropic, Claude Code v2.1.172 release notes (June 10, 2026). Sub-agents can now spawn their own sub-agents, with recursive delegation supported up to 5 levels deep; previously delegation was effectively one level. ↩

