Dr. Zero - Autonomous Repository Improvement Plugin

Status: Experimental Alpha (Internal Use Only)

Version: 10.17.1

Dr. Zero is a Claude Code plugin providing autonomous repository improvement through dual-scoring curriculum learning and multi-agent swarm coordination. Scoring is based on the GRPO/HRPO framework (arXiv:2601.07055):

HRPO scores the proposer (format compliance + difficulty calibration, reward (0.5 * format + difficulty) / 1.5 — rescaled to [0, 1] from paper Eq. 4)
GRPO scores the solver (binary acceptance-test reward)

Full user documentation lives under docs/plugins/drzero/ (see Documentation below for the quadrant map).

Quick Start

Installation

# Add otc-awesome-llm marketplace
claude plugin marketplace add /path/to/otc-awesome-llm

# Install Dr. Zero plugin
claude plugin install drzero@otc-awesome-llm

# Verify installation
claude plugin list

Basic Usage

# Run autonomous improvement (default: 3 iterations, 3 tasks/iteration)
/drzero:drzero

# Health check (confirm plugin is loaded)
/drzero:drzero-ping

# Check session status
/drzero:drzero-status

# View/edit configuration
/drzero:drzero-config

Swarm Coordination Commands

# Hierarchical coordination (Orchestrator + 16 domain agents)
/drzero:drzero-swarm "Implement user authentication"

# Democratic debate for architecture decisions
/drzero:drzero-council "Should we use microservices or monolith?"

# Centralized governance with mandatory quality gates
/drzero:drzero-citadel "Deploy payment processing changes"

# Peer-to-peer parallel work (no central orchestrator)
/drzero:drzero-unity "Fix all linting errors"

# Simplified execution (or ruthless optimization with --evil)
/drzero:drzero-morty "Update README"

# Parallel variant implementations for A/B comparison
/drzero:drzero-cronenberg "Try 3 different approaches to caching"

# Cross-repo coordination
/drzero:drzero-portal-gun "Update authentication across all microservices"

# Minimal viable solutions under extreme constraints
/drzero:drzero-pickle "Minimal changes to pass CI"

Swarm Modes at a Glance

The eight coordination modes each ship as their own command (full contracts are in the command reference):

Mode	Command	Coordination model	Best for
Swarm	`/drzero:drzero-swarm`	Hierarchical (orchestrator + 16 specialists)	Clear tasks needing parallel domain expertise
Council	`/drzero:drzero-council`	Democratic debate among orchestrator variants	Architecture decisions, design trade-offs
Citadel	`/drzero:drzero-citadel`	Centralized governance, mandatory quality gates	Production deploys, compliance-bound changes
Unity	`/drzero:drzero-unity`	Peer-to-peer, no central orchestrator	Embarrassingly parallel work (lint, renames)
Morty	`/drzero:drzero-morty`	Single agent, no routing (`--evil` for aggressive optimization)	Obvious or mechanical changes
Cronenberg	`/drzero:drzero-cronenberg`	N parallel variant implementations, compare and pick	A/B testing competing approaches
Portal Gun	`/drzero:drzero-portal-gun`	Cross-repo coordinator	Multi-repository changes, dependency rollouts
Pickle	`/drzero:drzero-pickle`	Single constrained implementer	Minimal solutions in locked-down environments

Two-Phase Architecture

flowchart TD
    user([User / autonomous scan]) -->|prompt or repo findings| proposer

    subgraph phase1 ["Phase 1 — Dual-Scoring Curriculum Refinement"]
        proposer[Proposer agent] -->|WorkItems| solver[Solver agent]
        solver -->|"attempt facts (exit codes, diffs)"| stopHook[SubagentStop hook]
        stopHook -->|"dr0.scoring: HRPO proposer + GRPO solver"| session[("/tmp/drzero_session.json")]
        session -->|"success rate vs ~50% target"| proposer
    end

    preHook[PreToolUse hook] -.->|"validates scope_boundary + acceptance_test"| solver

    session -->|refined, domain-tagged prompts| orchestrator

    subgraph phase2 ["Phase 2 — Orchestrator Agent Swarm"]
        orchestrator[Orchestration dispatcher] -->|domain routing| specialists["16 domain specialist agents"]
        specialists --> gates[Quality gates / security review]
        gates --> results[Tested changes + PR]
    end

Scoring in both phases is computed exclusively by the dr0 Python package via the SubagentStop hook — agents never self-report scores. Deep-dive diagrams (HRPO loop, swarm sequence, checkpoint lifecycle) live in dr0/docs/architecture-diagrams.md.

Phase 1: Dual-Scoring Curriculum Learning (arXiv:2601.07055)

Proposer analyzes CI failures, lint errors, type errors, and documentation gaps to generate WorkItems (improvement tasks).
Solver attempts each WorkItem, producing code patches and running acceptance tests.
HRPO scores the proposer: evaluates format compliance and difficulty calibration of the generated WorkItems using the reward (0.5 * format + difficulty) / 1.5 (rescaled to [0, 1] from paper Eq. 4).
GRPO scores the solver: binary acceptance-test success, the repository-task proxy for exact-match reward.
Difficulty auto-adjusts to target a 50% success rate.
Output: Refined, high-quality prompts for Phase 2.

Anti-hallucination rule: All GRPO and HRPO scores are computed exclusively by the dr0 Python package. Scores are never self-reported by agents.
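
As a rough illustration of the two reward shapes described above, the sketch below restates them in Python. The function names are hypothetical and are not the dr0.scoring API. It also assumes that the format and difficulty components are each already normalized to [0, 1], and that an acceptance test passes when its exit code is 0.

```python
def hrpo_proposer_reward(format_score: float, difficulty_score: float) -> float:
    """Proposer reward: (0.5 * format + difficulty) / 1.5, which stays in [0, 1]."""
    # Assumption: both components arrive already normalized to [0, 1].
    for name, value in (("format_score", format_score),
                        ("difficulty_score", difficulty_score)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return (0.5 * format_score + difficulty_score) / 1.5


def grpo_solver_reward(acceptance_test_exit_code: int) -> float:
    """Solver reward: 1.0 when the acceptance test exits with 0, else 0.0."""
    # Assumption: exit code 0 is the only passing outcome.
    return 1.0 if acceptance_test_exit_code == 0 else 0.0
```

Dividing by 1.5 is what keeps the proposer reward in [0, 1]: the largest possible numerator is 0.5 * 1 + 1 = 1.5.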

Phase 2: Agent Swarm Execution

Orchestrator (Rick Sanchez) coordinates 16 domain specialist agents via the Task tool.
Domain agents execute domain-specific tasks (potentially in parallel).
Security/Cerberus reviews code through three lenses: security, quality, savage.
Output: Production-ready changes with tests passing.

Session State

${XDG_RUNTIME_DIR:-$HOME/.cache}/drzero/session-<id>.json -- per-session transient state, written by hooks during execution. <id> is $DRZERO_SESSION_ID if set, otherwise the hook process PID. The drzero/ directory is per-user, mode 0700, and ownership-checked (CR-1).
.drzero-state.json -- persistent validated state, survives across sessions.
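
The path rule above can be sketched in Python as follows. This is a minimal illustration, not the hook's actual code: the function names are hypothetical, an empty DRZERO_SESSION_ID is treated the same as an unset one (an assumption), and the ownership check relies on os.getuid, so it only applies on Unix-like systems.

```python
import os
from pathlib import Path


def session_state_path(env: dict[str, str], pid: int) -> Path:
    """Mirror ${XDG_RUNTIME_DIR:-$HOME/.cache}/drzero/session-<id>.json."""
    # Shell ':-' falls back when the variable is unset OR empty, hence 'or'.
    base = env.get("XDG_RUNTIME_DIR") or os.path.join(env["HOME"], ".cache")
    # Assumption: an empty DRZERO_SESSION_ID counts as "not set".
    session_id = env.get("DRZERO_SESSION_ID") or str(pid)
    return Path(base) / "drzero" / f"session-{session_id}.json"


def is_private_dir(path: Path) -> bool:
    """True if path is a directory owned by the current user with mode 0700."""
    st = path.stat()
    return (
        path.is_dir()
        and st.st_uid == os.getuid()
        and (st.st_mode & 0o777) == 0o700
    )
```

A caller would typically pass dict(os.environ) and os.getpid(), then refuse to write if is_private_dir returns False for the drzero/ directory.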

Configuration

Create drzero.yml in the project root, or ~/.claude/drzero.yml for user-wide defaults.

The drzero.yml format is shared between Claude and Codex, but several values are runtime-sensitive. The example below uses Claude-oriented values (backend: claude, orchestrator rick, quality reviewer cerberus). Codex runs use backend: codex with the orchestration/security domain identifiers (see plugins/drzero/assets/drzero.yml.example).

Config precedence, highest first (both runtimes):

./drzero.yml
~/.claude/drzero.yml
built-in defaults

version: '1.0'

dr_zero:
  max_iterations: 5
  tasks_per_iteration: 3
  proposer:
    persona: healthcare # conservative, patient-safety focused
  solver:
    backend: claude # Claude runtime; Codex examples use `codex`
    temperature: 0.7
  terminal:
    all_tests_pass: true
    lint_clean: true
    coverage_threshold: 80

agent_swarm:
  orchestrator:
    agent: rick # Claude orchestrator; Codex uses `orchestration`
    quality_reviewer: cerberus # Claude reviewer; Codex uses `security`
    definition_of_done:
      - tests_pass
      - lint_clean
      - docs_updated
      - security_cleared

All 20 Agents (16 Domain Specialists + 4 Meta/Bridge Agents)

The plugin ships exactly 20 agent files — 16 domain specialists, the orchestration coordinator, the two Phase 1 curriculum-learning agents, and one bridge agent. The contract is defined in agents/CLAUDE.md (a directory-conventions document, not an agent) and enforced by tests; installation bundles are declared in agents-manifest.json.

Dr. Zero uses a domain-filename convention: each agent is named by its domain ({domain}.md), enabling SDK-native precedence resolution without custom discovery logic.

Meta and Bridge Agents (4)

Domain	Agent File	Role
orchestration	orchestration.md	Orchestrator (Rick Sanchez) -- coordinates Phase 2 agent swarm; never a work domain
proposer	proposer.md	Generates WorkItems in Phase 1 (scored by HRPO)
solver	solver.md	Attempts WorkItems in Phase 1 (scored by GRPO)
security-remediation	security-remediation.md	Bridge agent: secplat findings → portal-gun cross-repo remediation

Domain Specialists (16)

Domain	Agent File	Focus Area
architecture	architecture.md	System design, domain boundaries, interfaces, rollout strategy
backend	backend.md	REST/GraphQL APIs, microservices, message queues, caching
compliance	compliance.md	HIPAA, SOC2, PCI, policy-as-code
database	database.md	PostgreSQL, MongoDB, query optimization
devops	devops.md	CI/CD pipelines, automation, tooling
documentation	documentation.md	MkDocs, Diataxis framework
frontend	frontend.md	React, Vue, Angular, Storybook
gitops	gitops.md	Semantic-release, workflow distribution across 130+ repos
implementation	implementation.md	Production code (Ansible, Terraform, GitHub Actions, Python)
infrastructure	infrastructure.md	IaC module structure, environment baselines, deployment topology
monitoring	monitoring.md	Dynatrace, Splunk, Azure Monitor
networking	networking.md	VPC/VNET, DNS, load balancers, CDN, security groups, routing
performance	performance.md	Load testing, caching strategies, profiling, bottleneck analysis
secrets	secrets.md	Consul Vault, CyberArk, Venafi certificates, credential rotation
security	security.md	Three-headed review (security, quality, savage) -- Cerberus
testing	testing.md	Molecule, terratest, pytest, GitHub Actions validation

Key insight: Filename = domain = agent name. The SDK resolves precedence by filename match alone -- no domain registry or custom discovery needed.

The canonical 16-domain taxonomy (scopes, artifacts, AI-DLC stage mapping) is defined in drzero-domain-mapping.md §1. The plugin deliberately ships no reviewer agents — review tiers are plugin-external and composed by consumers via quality_gates (see agents/CLAUDE.md).

Skills

Five skills ship with the plugin under skills/, each loadable on demand:

Skill	Purpose
domain-agent-routing	Understand the 16 domain specializations, geometric priority matrix theory, and runtime agent discovery mechanism
drzero-curriculum-learning	Understand dual-scoring curriculum learning (HRPO proposer + GRPO solver, arXiv:2601.07055) and the proposer-solver architecture behind Phase 1
pr-ci-monitoring	GitHub PR/CI monitoring strategies, auto-merge workflows, and integration with gh CLI or GitHub MCP server
rick-swarm-integration	Orchestrator swarm coordination patterns, Phase 2 work execution, and multi-agent parallel task distribution
security-review-protocol	Security review quality gates, the three-headed review system (security, quality, savage), and definition-of-done criteria

Hook System

Two hooks implement the plugin's fail-closed validation and anti-hallucination scoring. Full contract: docs/plugins/drzero/reference/ref-hooks.md; dr0-side background: dr0/docs/hook-architecture.md.

PreToolUse — validates Dr. Zero agent inputs before the Task tool spawns them: required WorkItem fields, scope-boundary defenses (empty/wildcard/absolute/traversal paths rejected), and acceptance-test command whitelisting with shell-metacharacter blocking. Validation failure blocks the invocation (exit 1) before any tokens are spent.
SubagentStop — captures proposer/solver output when a Task completes and computes deterministic scores via dr0.scoring: HRPO proposer reward (format + difficulty calibration) and GRPO solver reward (binary acceptance-test outcome). Results are written to /tmp/drzero_session.json under file locking, with provenance (scored_by, scored_at). Model-supplied scores are ignored; scoring failures write null, never a fabricated number.
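
To make the SubagentStop behaviour concrete, here is a minimal sketch of how a score record could be assembled. It is not the hook's implementation: build_score_record, the "score" and "error" keys, and the "dr0.scoring" provenance value are assumptions for illustration; only scored_by and scored_at are named in this README. File locking is omitted for brevity.

```python
from datetime import datetime, timezone
from typing import Callable


def build_score_record(agent_output: dict, scorer: Callable[[dict], float]) -> dict:
    """Build one session entry; any score the model supplied is discarded."""
    # Drop a model-supplied score so the scorer only ever sees attempt facts.
    facts = {key: value for key, value in agent_output.items() if key != "score"}
    record = {
        "scored_by": "dr0.scoring",  # assumed provenance value
        "scored_at": datetime.now(timezone.utc).isoformat(),
    }
    try:
        record["score"] = scorer(facts)
    except Exception as exc:
        # Scoring failure: store None (JSON null), never a made-up number.
        record["score"] = None
        record["error"] = repr(exc)
    return record
```

The design point is that the only number that can reach the session file comes from the scorer callable; if the scorer raises, the record carries null plus the error.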

A standalone validator, hooks/validate-state-file.py, checks .drzero-state.json before Phase 2 execution.

Hook registration

Hooks are registered via hooks/hooks.json, which is auto-loaded by Claude Code through the "hooks" key in .claude-plugin/plugin.json. The lib/setup-dr0.sh step only installs the dr0 pip package — hook wiring requires no manual action. After installing the plugin, run /reload-plugins and confirm the status line reports hooks > 0; if it reports 0, the scoring pipeline will not run and the Issue #304 anti-hallucination guard will HALT every session.

Loader note: Claude Code resolves plugin.json from the marketplace source (~/.claude/plugins/marketplaces/.../), not the per-version cache (~/.claude/plugins/cache/.../). Editing the cache copy will not change /reload-plugins output — re-sync the marketplace or reinstall the plugin.

Integration with AI-DLC

AI-DLC plans; Dr. Zero executes. The AI-DLC plugin's inception/effort workflow produces units of work tagged with canonical domain slugs, and /ai-dlc:effort hands them to Dr. Zero as a domain_routing: YAML payload (per-unit primary_domain, parallel domains, dependencies, quality_gates, drzero_review_covers, artifact_paths). The orchestration dispatcher consumes that payload via a six-step procedure (H1–H6): detect handoff context, parse the payload, resolve dependency order, dispatch each unit to its domain specialists, report effort-level completion, and stay idempotent on resume.
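
The "resolve dependency order" step can be pictured as a topological sort. The sketch below assumes a simplified payload shape (a mapping from a hypothetical unit id to a dict with a dependencies list); the real domain_routing: schema is defined in drzero-domain-mapping.md and may differ.

```python
from graphlib import CycleError, TopologicalSorter


def dispatch_order(units: dict[str, dict]) -> list[str]:
    """Return unit ids so that every unit comes after its dependencies."""
    deps = {uid: set(unit.get("dependencies", [])) for uid, unit in units.items()}
    # Reject references to units that are not part of the payload.
    unknown = {d for ds in deps.values() for d in ds} - deps.keys()
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # args[1] holds the nodes that form the detected cycle.
        raise ValueError(f"dependency cycle: {exc.args[1]}") from exc


if __name__ == "__main__":
    payload = {
        "api": {"primary_domain": "backend", "dependencies": ["schema"]},
        "schema": {"primary_domain": "database", "dependencies": []},
        "docs": {"primary_domain": "documentation", "dependencies": ["api"]},
    }
    print(dispatch_order(payload))
```

Units with no dependency relationship between them can appear in any relative order, which is what leaves room for parallel dispatch.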

On return, the dispatcher emits Reviewer Status updates and — when a unit claims review coverage — a predicate_verdicts: block, closing the loop back into AI-DLC's effort state. Canonical contracts: drzero-domain-mapping.md and drzero-orchestrator-procedure.md; conceptual overview: docs/plugins/ai-dlc/explanation/drzero-integration.md.

Agent Override System (3-Level Precedence)

Dr. Zero leverages the Claude Code SDK's native precedence mechanism:

.claude/agents/ (repo) > ~/.claude/agents/ (user) > plugin/agents/ (bundled)

The SDK automatically resolves which agent to use based on filename matching.

How It Works

When the Orchestrator routes a WorkItem with domain: "testing", it invokes:

Task(agent="testing")  # Just the domain name

The SDK then checks these locations in order and uses the first match:

.claude/agents/testing.md (repo-specific) -- use if exists
~/.claude/agents/testing.md (user-wide) -- use if exists
plugin/agents/testing.md (bundled) -- fallback

Override Example: Custom Testing Agent

Create .claude/agents/testing.md in your project:

---
name: testing
description: 'Custom test agent for our project-specific validation'
---

# Custom Testing Agent

You are the testing specialist for this project.

## Project-Specific Test Patterns

- Use our custom pytest fixtures in tests/conftest.py
- Follow our test naming convention: test_{feature}_{scenario}
- Always include integration tests for API endpoints

When Dr. Zero routes domain: "testing", your custom agent is used automatically.

Security Features

Scope boundary validation: The PreToolUse hook rejects path traversal (../../../etc/passwd), absolute paths, and CAT attacks (empty or glob/wildcard scopes) before any solver runs
Command whitelisting: Validates acceptance_test commands against a whitelist of 28 safe tools (pytest, ruff, terraform plan, etc.), blocks shell metacharacters, and prevents mutating subcommands (e.g., terraform apply)
Untrusted-input fencing: /drzero validates user input before processing — mode-specific length limits, prompt-injection pattern blocking, and shell-metacharacter escaping, with the sanitized copy persisted atomically to /tmp/drzero_input_validation.json
Anti-hallucination scoring: All GRPO/HRPO scores are computed by the dr0 Python package via the SubagentStop hook -- agents never self-report scores, model-supplied scores are discarded, and scoring failures write null with error provenance (see Issue #304)
Trusted-path installer: lib/setup-dr0.sh only executes the dr0 installer from a trusted checkout ($OTC_AWESOME_LLM_ROOT or plugin-relative) — never from the current working directory, so an untrusted repo cannot ship a crafted installer (see lib/README.md)
Security review protocol: The security-review-protocol skill defines the three-headed review gates (security, quality, savage) and definition-of-done criteria applied by the security agent
Cross-repo remediation bridge: The security-remediation agent turns secplat findings into coordinated portal-gun remediation across affected repositories
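
The scope-boundary and command-whitelisting checks listed above can be sketched as two small predicates. This is an illustration under stated assumptions, not the PreToolUse hook itself: the tool set is a three-item subset of the real 28-tool whitelist, and the metacharacter set and the way mutating subcommands are represented are guesses.

```python
import shlex
from pathlib import PurePosixPath

ALLOWED_TOOLS = {"pytest", "ruff", "terraform"}   # illustrative subset only
MUTATING_SUBCOMMANDS = {("terraform", "apply")}   # hypothetical representation
SHELL_METACHARACTERS = set(";&|`$<>(){}\n")       # assumed set


def scope_is_safe(scope: str) -> bool:
    """Reject empty, wildcard, absolute, and path-traversal scopes."""
    if not scope.strip():
        return False
    if any(ch in scope for ch in "*?["):
        return False
    path = PurePosixPath(scope)
    if path.is_absolute():
        return False
    return ".." not in path.parts


def command_is_safe(command: str) -> bool:
    """Allow only whitelisted tools, with no shell metacharacters."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:  # e.g. an unbalanced quote
        return False
    if not argv or argv[0] not in ALLOWED_TOOLS:
        return False
    return not any((argv[0], arg) in MUTATING_SUBCOMMANDS for arg in argv[1:])
```

Both functions fail closed: anything that cannot be parsed or recognized is rejected rather than passed through.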

Requirements

Python 3.11+
Git repository
Claude Code CLI
dr0 Python package (provides GRPO/HRPO scoring math)
Local CI tools: pytest, ruff, mypy, bandit

Troubleshooting

"Proposer agent not found"

Ensure proposer.md exists in one of the three discovery paths:

.claude/agents/proposer.md (project-level override)
~/.claude/agents/proposer.md (user-level override)
plugin/agents/proposer.md (bundled fallback)

"Phase 1 not converging"

Lower the target success rate or raise the iteration limit in drzero.yml:

dr_zero:
  target_success_rate: 0.4 # Lower threshold (default: 0.5)
  max_iterations: 5 # More iterations
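
This README does not spell out how difficulty is moved toward the target, so the following is only one plausible reading, written as a simple proportional controller. The function name, the step size, and the [0, 1] difficulty scale are all assumptions, not the dr0 implementation.

```python
def next_difficulty(
    current: float,
    successes: int,
    attempts: int,
    target_success_rate: float = 0.5,
    step: float = 0.1,
) -> float:
    """Nudge difficulty up when the solver succeeds too often, down otherwise."""
    if attempts == 0:
        return current  # nothing observed yet, leave difficulty unchanged
    success_rate = successes / attempts
    # Above target: tasks are too easy, so raise difficulty (and vice versa).
    adjusted = current + step * (success_rate - target_success_rate)
    return min(1.0, max(0.0, adjusted))
```

Under this reading, lowering target_success_rate means a given observed success rate pushes difficulty up sooner, so the proposer settles on harder tasks.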

"Orchestrator not found"

Dr. Zero falls back to sequential execution when the Orchestrator cannot be found. To use the Orchestrator:

Check /help for available agents
Verify the orchestration.md agent exists in a discovery path
Confirm name: orchestration in the agent frontmatter

Agent Override Not Working

If your custom agent is not being picked up:

Filename must match the domain exactly:

~/.claude/agents/testing.md     -- correct for "testing" domain
~/.claude/agents/my-tester.md   -- wrong, SDK will not find this

Frontmatter name must match the filename (without .md):
```
---
name: testing
---
```

Check the /agents command to confirm which source wins:

/agents
# testing (project)  [overrides plugin]
# testing (user)     [overrides plugin]
# testing (plugin)   [bundled default]

Verify precedence order:
- Project (.claude/agents/) overrides user (~/.claude/agents/)
- User overrides plugin (bundled agents)
- If an agent appears in multiple locations, the highest-precedence copy wins

Common mistakes:

Agent file not named after the domain (e.g., koji.md instead of testing.md)
Frontmatter name does not match filename
Directory typo: .claude/agent/ (missing trailing s)

Handling Conflicts (Swarm Agents Modifying Same Files)

When multiple agents modify overlapping files in Phase 2:

Resolution strategies:

Sequential execution (safest):

# drzero.yml
agent_swarm:
  max_parallel: 1

File-based locking (automatic): the Orchestrator sequences agents that work on the same files; no configuration is required.

Manual resolution:

git status
# edit conflicted files
git add <resolved-file>
/drzero:drzero --resume

Prevention:

Use smaller, focused WorkItems (fewer file overlaps)
Prefer domain specialists with clear boundaries
Use /drzero:drzero-council for architecture decisions before implementation
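
To show why smaller WorkItems reduce conflicts, here is one way overlapping work could be sequenced. This README does not describe the Orchestrator's actual algorithm, so the greedy batching below, the function name, and the input shape (a hypothetical WorkItem id mapped to the set of files it touches) are assumptions.

```python
def sequence_batches(work_items: dict[str, set[str]]) -> list[list[str]]:
    """Greedily pack items into batches whose members touch disjoint files."""
    batches: list[tuple[list[str], set[str]]] = []
    for item_id, files in work_items.items():
        for members, claimed in batches:
            if claimed.isdisjoint(files):
                # No file overlap with this batch, so it can run alongside it.
                members.append(item_id)
                claimed.update(files)
                break
        else:
            # Overlaps with every existing batch: start a new, later batch.
            batches.append(([item_id], set(files)))
    return [members for members, _ in batches]


if __name__ == "__main__":
    items = {
        "fix-auth": {"auth.py"},
        "fix-lint": {"utils.py"},
        "refactor-auth": {"auth.py", "session.py"},
    }
    print(sequence_batches(items))
```

Items inside one batch could run in parallel, while batches run one after another. The sketch guarantees only that a batch is conflict-free; it does not preserve any other ordering between items.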

Rolling Back Changes

Dr. Zero creates git stashes before each phase:

git stash list | grep drzero

Roll back Phase 2 only (keep Phase 1 refinements):

git reset --hard HEAD~1
git stash pop stash@{0}

Full session revert:

git reset --hard <commit-before-drzero>
git stash clear  # Warning: drops every stash entry, not only Dr. Zero's

Checkpoint configuration:

# drzero.yml
dr_zero:
  checkpoints:
    enabled: true
    frequency: per-phase  # or: per-iteration, per-task
    auto_stash: true

SDK Precedence Not Working As Expected

# Verify which agent source wins
/agents testing
# Shows: testing (plugin) OR testing (user) OR testing (project)

# Check Claude Code version
claude --version

# Verify plugin installation
claude plugin list

If precedence is still broken, file an issue with:

Claude Code version
Agent file locations and their frontmatter
/agents command output
Dr. Zero session logs

Documentation

User documentation follows the Diataxis quadrants under docs/plugins/drzero/:

Tutorial: Your first autonomous session
How-to: Create a custom domain agent
Reference: Command reference · Hook reference
Explanation: Domain routing · HRPO/GRPO curriculum learning

Plugin and dr0 internals:

Paper Alignment: references/drzero-paper-2601.07055.md (arXiv:2601.07055)
Scoring runtime: dr0/README.md (canonical dr0.scoring surface)
Architecture: dr0/docs/architecture.md and dr0/docs/architecture-diagrams.md
Configuration Schema: dr0/docs/configuration-schema.md
Hook internals: docs/drzero-hooks-architecture.md

Notes

All agent invocations use the Task tool (no subprocess calls)
Context is preserved throughout both phases
HRPO scores the proposer; GRPO scores the solver -- never the reverse (arXiv:2601.07055 Figure 2)
Scores are computed by the dr0 Python package, never self-reported
Session state: ${XDG_RUNTIME_DIR:-$HOME/.cache}/drzero/session-<id>.json (per-session, transient) and .drzero-state.json (persistent)
Checkpoints use git stash for safe rollback

Contributing

See the main repository CONTRIBUTING.md and CLAUDE.md.

License

Internal Use Only - Optum Tech Compute
