Architecture
Warden has three entry paths and one review pipeline.
The CLI starts from local git state or explicit targets. Pull request reviews
start from a GitHub event payload. Scheduled reviews start from cron workflows
and configured paths. All three paths build a review context, resolve
warden.toml, and run skills through the same analysis engine.
Changed code -> Event context -> Config and trigger resolution -> Skill tasks -> File and hunk preparation -> Main skill analysis agents -> Finding post-processing agents -> Reports, comments, checks, and logs
Entry Points
| Entry point | What it provides |
|---|---|
| CLI | Local repository path, git range or file targets, terminal mode, optional JSONL output. |
| GitHub Action | Webhook payload, PR metadata, GitHub API client, workflow inputs, check and review permissions. |
| Scheduled review | Repository context, configured paths, and schedule triggers. |
The entry point decides where context comes from and where results go. It does not change what a skill means.
Review Context
Warden normalizes input into an event context:
- repository owner, name, and local checkout path
- pull request title, body, base SHA, head SHA, and changed files when present
- file status, patch text, and diff context source
- event type and action for trigger matching
Local runs synthesize this context from git and file targets. GitHub runs build it from the event payload and GitHub API data.
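As a rough illustration of the normalized shape described above, the sketch below models an event context with plain dataclasses. Every class, field, and function name here is hypothetical and chosen for readability; Warden's actual types may differ.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ChangedFile:
    path: str
    status: str           # e.g. "added", "modified", "removed"
    patch: Optional[str]  # unified diff text, when available
    diff_source: str      # where the diff context came from (assumed labels)


@dataclass
class PullRequest:
    title: str
    body: str
    base_sha: str
    head_sha: str


@dataclass
class EventContext:
    owner: str
    name: str
    checkout_path: str
    event_type: str                 # used for trigger matching
    action: Optional[str] = None    # used for trigger matching
    pull_request: Optional[PullRequest] = None
    files: list[ChangedFile] = field(default_factory=list)


def local_context(checkout_path: str, files: list[ChangedFile]) -> EventContext:
    """Synthesize a context for a local run, which has no PR metadata."""
    # "local" and "repo" are placeholder values for this sketch only.
    return EventContext(
        owner="local",
        name="repo",
        checkout_path=checkout_path,
        event_type="local",
        files=files,
    )
```

Both entry styles end up with the same structure, so later stages do not need to know whether the context came from git or from a webhook payload.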
Config and Triggers
warden.toml decides which skills are eligible to run. Warden loads config,
resolves local, built-in, and remote skill roots, then matches each configured
trigger against the event context.
Trigger matching answers three questions:
| Question | Controlled by |
|---|---|
| Should this skill run for this event? | type, actions, local run mode, and schedule settings. |
| Which files should it see? | paths, ignorePaths, and defaults. |
| How should results behave? | failOn, reportOn, maxFindings, requestChanges, failCheck, and confidence thresholds. |
In GitHub Actions, org-level base config and repository config can be layered. The base config is the enforced baseline. Repository config can add local coverage without weakening base skills.
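The first two trigger questions can be sketched as two small functions. This is a minimal illustration, not Warden's matcher: the `Trigger` class is hypothetical, treating an empty `actions` or `paths` list as "match everything" is an assumption, and Python's `fnmatch` stands in for whatever glob semantics Warden really uses (note that `fnmatch`'s `*` also matches `/`).

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch
from typing import Optional


@dataclass
class Trigger:
    type: str
    actions: list[str] = field(default_factory=list)       # assumed: empty = any action
    paths: list[str] = field(default_factory=list)         # assumed: empty = all files
    ignore_paths: list[str] = field(default_factory=list)


def should_run(trigger: Trigger, event_type: str, action: Optional[str]) -> bool:
    """Question 1: should this skill run for this event?"""
    if trigger.type != event_type:
        return False
    return not trigger.actions or action in trigger.actions


def visible_files(trigger: Trigger, changed: list[str]) -> list[str]:
    """Question 2: which changed files should the skill see?"""
    selected = []
    for path in changed:
        included = not trigger.paths or any(fnmatch(path, p) for p in trigger.paths)
        ignored = any(fnmatch(path, p) for p in trigger.ignore_paths)
        if included and not ignored:
            selected.append(path)
    return selected
```

Keeping the event check separate from the file filter mirrors the table: a trigger can match the event and still hand the skill an empty file list.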
Skill Tasks
A matched trigger becomes a skill task. Each task contains:
- the resolved SKILL.md
- the filtered review context
- model, runtime, max turns, chunking, and verification options
- output thresholds for failure and reporting
Warden launches matched skills in parallel. A shared semaphore gates file-level analysis so multiple skills can be active while total model concurrency stays bounded.
See Runner for concurrency settings.
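The shared-semaphore idea can be illustrated with `asyncio`. In this sketch all skills start at once, but every file-level analysis must acquire the same semaphore, so the number of in-flight model calls never exceeds the limit. The function names are hypothetical, and the sleep is only a stand-in for a model call.

```python
import asyncio


async def analyze_file(skill, path, gate, stats):
    # The gate is shared by every skill, so it bounds total model concurrency.
    async with gate:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # stand-in for a model call
        stats["active"] -= 1
    return f"{skill}:{path}"


async def run_skill(skill, files, gate, stats):
    # Files within one skill are scheduled together and queue on the gate.
    return await asyncio.gather(*(analyze_file(skill, f, gate, stats) for f in files))


async def run_all(skills, files, limit):
    gate = asyncio.Semaphore(limit)
    stats = {"active": 0, "peak": 0}
    # Skills are launched in parallel; only the gate limits real work.
    results = await asyncio.gather(*(run_skill(s, files, gate, stats) for s in skills))
    return results, stats["peak"]
```

Calling `asyncio.run(run_all(["a", "b", "c"], ["x.py", "y.py"], 2))` schedules six file analyses, while the recorded peak of concurrent analyses stays at or below two.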
Diff Preparation
Before a model sees code, Warden prepares the diff:
- Parse each changed file patch.
- Classify files as per-hunk, whole-file, or skipped by chunking rules.
- Split large hunks and coalesce nearby hunks.
- Expand each hunk with surrounding file context.
- Group hunks by file.
The unit of main analysis is a hunk with context. Files run in parallel when allowed by the runner, while hunks inside a file run in order.
See Chunking for file pattern modes and coalescing settings.
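Two of the preparation steps, coalescing nearby hunks and expanding a hunk with surrounding context, are easy to sketch on line ranges. The `Hunk` class and the `max_gap` and `context` parameters below are assumptions made for illustration; the real settings are described in the chunking documentation.

```python
from dataclasses import dataclass


@dataclass
class Hunk:
    start: int  # first changed line, 1-indexed
    end: int    # last changed line, inclusive


def coalesce(hunks: list[Hunk], max_gap: int) -> list[Hunk]:
    """Merge hunks whose gap to the previous hunk is at most max_gap lines."""
    merged: list[Hunk] = []
    for h in sorted(hunks, key=lambda h: h.start):
        if merged and h.start - merged[-1].end <= max_gap:
            merged[-1] = Hunk(merged[-1].start, max(merged[-1].end, h.end))
        else:
            merged.append(Hunk(h.start, h.end))
    return merged


def expand(hunk: Hunk, context: int, file_length: int) -> Hunk:
    """Widen a hunk by `context` lines on each side, clamped to the file."""
    return Hunk(max(1, hunk.start - context), min(file_length, hunk.end + context))
```

For example, `coalesce([Hunk(10, 12), Hunk(14, 20), Hunk(100, 105)], max_gap=5)` yields two hunks, one covering lines 10 to 20 and one covering 100 to 105.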
Agent Lanes
Warden uses model-backed agents in several lanes. They share the selected runtime, but can use different configured models.
| Lane | Model field | Purpose |
|---|---|---|
| Main analysis | defaults.agent.model, skill model, or trigger model | Runs the skill prompt against each prepared hunk. |
| Auxiliary | defaults.auxiliary.model | Repairs malformed structured output, verifies findings, checks suggested fixes, deduplicates against existing comments, and evaluates fix attempts. |
| Synthesis | defaults.synthesis.model | Merges findings that describe the same root cause across multiple locations. |
The main analysis agent is the skill itself: Warden builds a system prompt from
SKILL.md, adds the changed-code context, and asks for structured findings.
Auxiliary agents are narrower. They do not decide what the skill should care about. They keep the output usable: parse it, verify it, merge duplicates, and remove unsafe suggested fixes.
Finding Pipeline
Each hunk analysis returns candidate findings plus usage and error metadata. Warden then runs the shared post-processing pipeline:
- Extract and validate JSON findings.
- Drop findings outside the analyzed hunk range.
- Deduplicate identical findings from the same skill run.
- Verify candidates with a second read-only repo-aware pass unless disabled.
- Merge same-root-cause findings across locations.
- Validate suggested fixes deterministically and, when available, semantically.
- Build a SkillReport with findings, skipped files, hunk failures, usage, and duration.
If every hunk fails for the same systemic reason, Warden stops early and reports the provider, auth, or model-selector failure instead of pretending the code was clean.
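The deterministic parts of this pipeline can be sketched without any model calls. The example below covers the range filter, exact-duplicate removal, and the systemic-failure check; the `Finding` fields and function names are hypothetical, and the check here inspects errors after the fact rather than stopping a run early.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Finding:
    path: str
    line: int
    title: str


def in_range(findings: list[Finding], path: str, start: int, end: int) -> list[Finding]:
    """Keep only findings located inside the analyzed hunk range."""
    return [f for f in findings if f.path == path and start <= f.line <= end]


def dedupe(findings: list[Finding]) -> list[Finding]:
    """Drop identical findings while preserving first-seen order."""
    seen: set[Finding] = set()
    unique = []
    for f in findings:
        if f not in seen:
            seen.add(f)
            unique.append(f)
    return unique


def systemic_failure(hunk_errors: list[Optional[str]]) -> Optional[str]:
    """Return the shared reason if every hunk failed the same way, else None."""
    if hunk_errors and None not in hunk_errors and len(set(hunk_errors)) == 1:
        return hunk_errors[0]
    return None
```

A result of `None` from `systemic_failure` means at least one hunk succeeded or the failures differ, so an empty finding list can be trusted as a real result rather than a masked outage.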
Output Paths
Local and CI runs consume the same SkillReport shape, then render it for the
current surface.
| Surface | Output |
|---|---|
| Terminal | Live skill progress, filtered findings, summary, optional interactive fixes. |
| JSONL | Incremental chunks, skill reports, and final summary for warden runs. |
| GitHub Checks | One core Warden check plus per-skill checks when running on pull requests. |
| GitHub Reviews | Inline comments, deduplicated findings, optional change requests. |
GitHub runs do extra PR hygiene after posting new findings. Warden fetches existing comments, suppresses duplicates, evaluates whether follow-up commits fixed prior findings, resolves stale Warden comments when safe, and can dismiss a previous Warden change request when blocking findings are gone.
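A simplified view of two hygiene decisions is sketched below: suppressing findings that already have a comment, and deciding whether a prior change request can be dismissed. Exact key matching is an assumption made for illustration only; per the lane table, Warden's real deduplication uses an auxiliary model. The dictionary keys and function names are hypothetical.

```python
def suppress_duplicates(findings: list[dict], existing_comments: list[dict]) -> list[dict]:
    """Drop findings that match an existing comment on path, line, and title."""
    seen = {(c["path"], c["line"], c["title"]) for c in existing_comments}
    return [f for f in findings if (f["path"], f["line"], f["title"]) not in seen]


def can_dismiss_change_request(blocking_findings: list[dict]) -> bool:
    """A previous change request is only dismissable once nothing blocks."""
    return len(blocking_findings) == 0
```

With one existing comment at `a.py` line 3 titled "leak", a new finding with the same path, line, and title is filtered out, while a finding on a different line is kept.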
Runtime Boundary
The runtime adapter is the boundary between Warden and model execution.
| Runtime | Role |
|---|---|
| pi | Default runtime for main, auxiliary, and synthesis model calls through Pi. |
| claude | Claude Code runtime for repo-aware execution, with API key or local Claude auth. |
Everything before the runtime boundary is deterministic orchestration: configuration, trigger matching, diff parsing, chunking, concurrency, reporting, and GitHub state management. Everything after it is model-backed review work scoped by the active skill and lane.
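One way to picture this boundary is as a small interface that orchestration code calls without knowing which runtime sits behind it. The sketch below is an assumption about shape only: the `Runtime` protocol, its `run` signature, and the `{"findings": [...]}` payload are hypothetical, and the fake runtime exists so the deterministic side can be exercised without a model.

```python
import json
from typing import Protocol


class Runtime(Protocol):
    """Hypothetical adapter interface; real runtimes would call a model."""

    def run(self, lane: str, model: str, prompt: str) -> str: ...


class FakeRuntime:
    """Deterministic stand-in that returns a canned structured response."""

    def __init__(self, payload: dict):
        self.payload = payload
        self.calls: list[tuple[str, str]] = []

    def run(self, lane: str, model: str, prompt: str) -> str:
        self.calls.append((lane, model))
        return json.dumps(self.payload)


def review_hunk(runtime: Runtime, model: str, prompt: str) -> list[dict]:
    # Everything on this side of runtime.run() is deterministic orchestration.
    raw = runtime.run("main", model, prompt)
    return json.loads(raw)["findings"]
```

Swapping `FakeRuntime` for a real adapter would change only what happens inside `run`, which is the property the boundary is meant to provide.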
For model configuration details, see Models and Runtimes.