
drzero

otc-awesome-llm is the Optum LLM library providing version-controlled prompts, chatmodes, instructions, and agent modes for infrastructure operations via native IDE integrations

v11.3.0
Claude Code

By Thomas Hudak ([email protected])

Plugin Structure

  • Agents: 21
  • Skills: 0
  • Commands: 12
  • Hooks: 6
  • Rules: 0

Installation

Install this plugin using the Claude Code CLI:

claude plugin install drzero@otc-awesome-llm

Verification

After installation, verify the plugin is loaded:

claude plugin list

Documentation

Dr. Zero - Autonomous Repository Improvement Plugin

Status: Experimental Alpha (Internal Use Only)
Version: 10.17.1

Dr. Zero is a Claude Code plugin providing autonomous repository improvement through dual-scoring curriculum learning and multi-agent swarm coordination. Scoring is based on the GRPO/HRPO framework (arXiv:2601.07055):

  • HRPO scores the proposer (format compliance + difficulty calibration; reward (0.5 * format + difficulty) / 1.5, rescaled to [0, 1] from paper Eq. 4)
  • GRPO scores the solver (binary acceptance-test reward)

Full user documentation lives under docs/plugins/drzero/ (see Documentation below for the quadrant map).

Quick Start

Installation

# Add otc-awesome-llm marketplace
claude plugin marketplace add /path/to/otc-awesome-llm

# Install Dr. Zero plugin
claude plugin install drzero@otc-awesome-llm

# Verify installation
claude plugin list

Basic Usage

# Run autonomous improvement (default: 3 iterations, 3 tasks/iteration)
/drzero:drzero

# Health check (confirm plugin is loaded)
/drzero:drzero-ping

# Check session status
/drzero:drzero-status

# View/edit configuration
/drzero:drzero-config

Swarm Coordination Commands

# Hierarchical coordination (Orchestrator + 16 domain agents)
/drzero:drzero-swarm "Implement user authentication"

# Democratic debate for architecture decisions
/drzero:drzero-council "Should we use microservices or monolith?"

# Centralized governance with mandatory quality gates
/drzero:drzero-citadel "Deploy payment processing changes"

# Peer-to-peer parallel work (no central orchestrator)
/drzero:drzero-unity "Fix all linting errors"

# Simplified execution (or ruthless optimization with --evil)
/drzero:drzero-morty "Update README"

# Parallel variant implementations for A/B comparison
/drzero:drzero-cronenberg "Try 3 different approaches to caching"

# Cross-repo coordination
/drzero:drzero-portal-gun "Update authentication across all microservices"

# Minimal viable solutions under extreme constraints
/drzero:drzero-pickle "Minimal changes to pass CI"

Swarm Modes at a Glance

The 8 coordination modes, each shipped as its own command (full contracts in the command reference):

| Mode | Command | Coordination model | Best for |
|------|---------|--------------------|----------|
| Swarm | /drzero:drzero-swarm | Hierarchical (orchestrator + 16 specialists) | Clear tasks needing parallel domain expertise |
| Council | /drzero:drzero-council | Democratic debate among orchestrator variants | Architecture decisions, design trade-offs |
| Citadel | /drzero:drzero-citadel | Centralized governance, mandatory quality gates | Production deploys, compliance-bound changes |
| Unity | /drzero:drzero-unity | Peer-to-peer, no central orchestrator | Embarrassingly parallel work (lint, renames) |
| Morty | /drzero:drzero-morty | Single agent, no routing (--evil for aggressive optimization) | Obvious or mechanical changes |
| Cronenberg | /drzero:drzero-cronenberg | N parallel variant implementations, compare and pick | A/B testing competing approaches |
| Portal Gun | /drzero:drzero-portal-gun | Cross-repo coordinator | Multi-repository changes, dependency rollouts |
| Pickle | /drzero:drzero-pickle | Single constrained implementer | Minimal solutions in locked-down environments |

Two-Phase Architecture

flowchart TD
    user([User / autonomous scan]) -->|prompt or repo findings| proposer

    subgraph phase1 ["Phase 1: Dual-Scoring Curriculum Refinement"]
        proposer[Proposer agent] -->|WorkItems| solver[Solver agent]
        solver -->|"attempt facts (exit codes, diffs)"| stopHook[SubagentStop hook]
        stopHook -->|"dr0.scoring: HRPO proposer + GRPO solver"| session[("/tmp/drzero_session.json")]
        session -->|"success rate vs ~50% target"| proposer
    end

    preHook[PreToolUse hook] -.->|"validates scope_boundary + acceptance_test"| solver

    session -->|refined, domain-tagged prompts| orchestrator

    subgraph phase2 ["Phase 2: Orchestrator Agent Swarm"]
        orchestrator[Orchestration dispatcher] -->|domain routing| specialists["16 domain specialist agents"]
        specialists --> gates[Quality gates / security review]
        gates --> results[Tested changes + PR]
    end

Scoring in both phases is computed exclusively by the dr0 Python package via the SubagentStop hook; agents never self-report scores. Deep-dive diagrams (HRPO loop, swarm sequence, checkpoint lifecycle) live in dr0/docs/architecture-diagrams.md.

Phase 1: Dual-Scoring Curriculum Learning (arXiv:2601.07055)

  1. Proposer analyzes CI failures, lint errors, type errors, and documentation gaps to generate WorkItems (improvement tasks).
  2. Solver attempts each WorkItem, producing code patches and running acceptance tests.
  3. HRPO scores the proposer: evaluates format compliance and difficulty calibration of the generated WorkItems using the reward (0.5 * format + difficulty) / 1.5 (rescaled to [0, 1] from paper Eq. 4).
  4. GRPO scores the solver: binary acceptance-test success, the repository-task proxy for exact-match reward.
  5. Difficulty auto-adjusts to target a 50% success rate.
  6. Output: Refined, high-quality prompts for Phase 2.
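The scoring steps above can be sketched in a few lines of Python. This is a minimal illustration, not the dr0 package's implementation: it assumes the format and difficulty scores are already normalized to [0, 1], and the function names and the proportional difficulty update (with its `step` parameter) are hypothetical.

```python
def hrpo_proposer_reward(format_score: float, difficulty_score: float) -> float:
    """Rescaled proposer reward: (0.5 * format + difficulty) / 1.5, in [0, 1]."""
    for name, value in (("format_score", format_score),
                        ("difficulty_score", difficulty_score)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    # The maximum of the numerator is 0.5 + 1.0 = 1.5, so dividing by 1.5 caps the reward at 1.
    return (0.5 * format_score + difficulty_score) / 1.5


def grpo_solver_reward(acceptance_exit_code: int) -> float:
    """Binary solver reward: 1.0 only when the acceptance test exits with 0."""
    return 1.0 if acceptance_exit_code == 0 else 0.0


def adjust_difficulty(difficulty: float, success_rate: float,
                      target: float = 0.5, step: float = 0.1) -> float:
    """Hypothetical update rule: raise difficulty when the solver succeeds
    more often than the target, lower it otherwise, clamped to [0, 1]."""
    updated = difficulty + step * (success_rate - target)
    return min(1.0, max(0.0, updated))
```

With these definitions, a perfectly formatted WorkItem with ideal difficulty calibration scores 1.0, and a session sitting exactly at the 50% target leaves the difficulty unchanged.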

Anti-hallucination rule: All GRPO and HRPO scores are computed exclusively by the dr0 Python package. Scores are never self-reported by agents.

Phase 2: Agent Swarm Execution

  • Orchestrator (Rick Sanchez) coordinates 16 domain specialist agents via Task tool.
  • Domain agents execute domain-specific tasks (potentially in parallel).
  • Security/Cerberus reviews code through three lenses: security, quality, savage.
  • Output: Production-ready changes with tests passing.

Session State

  • ${XDG_RUNTIME_DIR:-$HOME/.cache}/drzero/session-<id>.json -- per-session transient state, written by hooks during execution. <id> is $DRZERO_SESSION_ID if set, otherwise the hook process PID. The drzero/ directory is per-user, mode 0700, and ownership-checked (CR-1).
  • .drzero-state.json -- persistent validated state, survives across sessions.
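For readers scripting around the transient state file, the path expression above can be mirrored in Python. This is a sketch under stated assumptions: `session_state_path` is a hypothetical helper, an empty `DRZERO_SESSION_ID` is treated the same as an unset one, and the directory mode and ownership checks described above are not reproduced.

```python
import os
import posixpath
from typing import Mapping, Optional


def session_state_path(env: Mapping[str, str], pid: Optional[int] = None) -> str:
    """Mirror ${XDG_RUNTIME_DIR:-$HOME/.cache}/drzero/session-<id>.json."""
    # Shell `${VAR:-default}` falls back when VAR is unset *or* empty, hence `or`.
    base = env.get("XDG_RUNTIME_DIR") or posixpath.join(env.get("HOME", ""), ".cache")
    # Assumption: an empty DRZERO_SESSION_ID counts as "not set".
    session_id = env.get("DRZERO_SESSION_ID") or str(os.getpid() if pid is None else pid)
    return posixpath.join(base, "drzero", f"session-{session_id}.json")
```

Passing `os.environ` resolves the path for the current process; passing a plain dict makes the fallback behavior easy to check in isolation.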

Configuration

Create drzero.yml in project root or ~/.claude/drzero.yml for user defaults.

The drzero.yml format is shared between Claude and Codex, but several values are runtime-sensitive. The example below uses Claude-oriented values (backend: claude, orchestrator rick, quality reviewer cerberus). Codex runs use backend: codex with the orchestration/security domain identifiers (see plugins/drzero/assets/drzero.yml.example).

Config precedence (both runtimes):

  1. ./drzero.yml
  2. ~/.claude/drzero.yml
  3. built-in defaults

version: '1.0'

dr_zero:
  max_iterations: 5
  tasks_per_iteration: 3
  proposer:
    persona: healthcare # conservative, patient-safety focused
  solver:
    backend: claude # Claude runtime; Codex examples use `codex`
    temperature: 0.7
  terminal:
    all_tests_pass: true
    lint_clean: true
    coverage_threshold: 80

agent_swarm:
  orchestrator:
    agent: rick # Claude orchestrator; Codex uses `orchestration`
    quality_reviewer: cerberus # Claude reviewer; Codex uses `security`
    definition_of_done:
      - tests_pass
      - lint_clean
      - docs_updated
      - security_cleared

All 20 Agents (16 Domain Specialists + 4 Meta/Bridge Agents)

The plugin ships exactly 20 agent files: 16 domain specialists, the orchestration coordinator, the two Phase 1 curriculum-learning agents, and one bridge agent. The contract is defined in agents/CLAUDE.md (a directory-conventions document, not an agent) and enforced by tests; installation bundles are declared in agents-manifest.json.

Dr. Zero uses a domain-filename convention: each agent is named by its domain ({domain}.md), enabling SDK-native precedence resolution without custom discovery logic.

Meta and Bridge Agents (4)

| Domain | Agent File | Role |
|--------|------------|------|
| orchestration | orchestration.md | Orchestrator (Rick Sanchez) -- coordinates Phase 2 agent swarm; never a work domain |
| proposer | proposer.md | Generates WorkItems in Phase 1 (scored by HRPO) |
| solver | solver.md | Attempts WorkItems in Phase 1 (scored by GRPO) |
| security-remediation | security-remediation.md | Bridge agent: secplat findings → portal-gun cross-repo remediation |

Domain Specialists (16)

| Domain | Agent File | Focus Area |
|--------|------------|------------|
| architecture | architecture.md | System design, domain boundaries, interfaces, rollout strategy |
| backend | backend.md | REST/GraphQL APIs, microservices, message queues, caching |
| compliance | compliance.md | HIPAA, SOC2, PCI, policy-as-code |
| database | database.md | PostgreSQL, MongoDB, query optimization |
| devops | devops.md | CI/CD pipelines, automation, tooling |
| documentation | documentation.md | MkDocs, Diataxis framework |
| frontend | frontend.md | React, Vue, Angular, Storybook |
| gitops | gitops.md | Semantic-release, workflow distribution across 130+ repos |
| implementation | implementation.md | Production code (Ansible, Terraform, GitHub Actions, Python) |
| infrastructure | infrastructure.md | IaC module structure, environment baselines, deployment topology |
| monitoring | monitoring.md | Dynatrace, Splunk, Azure Monitor |
| networking | networking.md | VPC/VNET, DNS, load balancers, CDN, security groups, routing |
| performance | performance.md | Load testing, caching strategies, profiling, bottleneck analysis |
| secrets | secrets.md | Consul Vault, CyberArk, Venafi certificates, credential rotation |
| security | security.md | Three-headed review (security, quality, savage) -- Cerberus |
| testing | testing.md | Molecule, terratest, pytest, GitHub Actions validation |

Key insight: Filename = domain = agent name. The SDK resolves precedence by filename match alone -- no domain registry or custom discovery needed.

The canonical 16-domain taxonomy (scopes, artifacts, AI-DLC stage mapping) is defined in drzero-domain-mapping.md §1. The plugin deliberately ships no reviewer agents; review tiers are plugin-external and composed by consumers via quality_gates (see agents/CLAUDE.md).

Skills

Five skills ship with the plugin under skills/, each loadable on demand:

| Skill | Purpose |
|-------|---------|
| domain-agent-routing | Understand the 16 domain specializations, geometric priority matrix theory, and runtime agent discovery mechanism |
| drzero-curriculum-learning | Understand dual-scoring curriculum learning (HRPO proposer + GRPO solver, arXiv:2601.07055) and the proposer-solver architecture behind Phase 1 |
| pr-ci-monitoring | GitHub PR/CI monitoring strategies, auto-merge workflows, and integration with gh CLI or GitHub MCP server |
| rick-swarm-integration | Orchestrator swarm coordination patterns, Phase 2 work execution, and multi-agent parallel task distribution |
| security-review-protocol | Security review quality gates, the three-headed review system (security, quality, savage), and definition-of-done criteria |

Hook System

Two hooks implement the plugin's fail-closed validation and anti-hallucination scoring. Full contract: docs/plugins/drzero/reference/ref-hooks.md; dr0-side background: dr0/docs/hook-architecture.md.

  • PreToolUse -- validates Dr. Zero agent inputs before the Task tool spawns them: required WorkItem fields, scope-boundary defenses (empty/wildcard/absolute/traversal paths rejected), and acceptance-test command whitelisting with shell-metacharacter blocking. Validation failure blocks the invocation (exit 1) before any tokens are spent.
  • SubagentStop -- captures proposer/solver output when a Task completes and computes deterministic scores via dr0.scoring: HRPO proposer reward (format + difficulty calibration) and GRPO solver reward (binary acceptance-test outcome). Results are written to /tmp/drzero_session.json under file locking, with provenance (scored_by, scored_at). Model-supplied scores are ignored; scoring failures write null, never a fabricated number.
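The SubagentStop contract (ignore model-supplied scores, record provenance, write null on failure) might look roughly like the sketch below. Only `scored_by` and `scored_at` come from the description above; the `facts`, `score`, and `error` keys, the `build_score_record` helper, and the scorer callback are hypothetical, and file locking is omitted.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict


def build_score_record(agent_output: Dict[str, Any],
                       scorer: Callable[[Dict[str, Any]], float],
                       scored_by: str = "dr0.scoring") -> Dict[str, Any]:
    """Score captured facts deterministically and attach provenance."""
    # Drop any score the model tried to report about itself.
    facts = {key: value for key, value in agent_output.items() if key != "score"}
    record: Dict[str, Any] = {
        "facts": facts,
        "scored_by": scored_by,
        "scored_at": datetime.now(timezone.utc).isoformat(),
    }
    try:
        record["score"] = scorer(facts)
    except Exception as exc:
        # A failed scoring run is recorded as None (JSON null), never a made-up number.
        record["score"] = None
        record["error"] = repr(exc)
    return record
```

The key design point is that the score field is always derived from the captured facts, so a solver claiming `"score": 0.99` has no influence on what is stored.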

A standalone validator, hooks/validate-state-file.py, checks .drzero-state.json before Phase 2 execution.

Hook registration

Hooks are registered via hooks/hooks.json, which is auto-loaded by Claude Code through the "hooks" key in .claude-plugin/plugin.json. The lib/setup-dr0.sh step only installs the dr0 pip package; hook wiring requires no manual action. After installing the plugin, run /reload-plugins and confirm the status line reports hooks > 0; if it reports 0, the scoring pipeline will not run and the Issue #304 anti-hallucination guard will HALT every session.

Loader note: Claude Code resolves plugin.json from the marketplace source (~/.claude/plugins/marketplaces/.../), not the per-version cache (~/.claude/plugins/cache/.../). Editing the cache copy will not change /reload-plugins output; re-sync the marketplace or reinstall the plugin.

Integration with AI-DLC

AI-DLC plans; Dr. Zero executes. The AI-DLC plugin's inception/effort workflow produces units of work tagged with canonical domain slugs, and /ai-dlc:effort hands them to Dr. Zero as a domain_routing: YAML payload (per-unit primary_domain, parallel domains, dependencies, quality_gates, drzero_review_covers, artifact_paths). The orchestration dispatcher consumes that payload via a six-step procedure (H1-H6): detect handoff context, parse the payload, resolve dependency order, dispatch each unit to its domain specialists, report effort-level completion, and stay idempotent on resume.

On return, the dispatcher emits Reviewer Status updates and, when a unit claims review coverage, a predicate_verdicts: block, closing the loop back into AI-DLC's effort state. Canonical contracts: drzero-domain-mapping.md and drzero-orchestrator-procedure.md; conceptual overview: docs/plugins/ai-dlc/explanation/drzero-integration.md.
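The "resolve dependency order" step of the handoff procedure is, in essence, a topological sort over the per-unit dependencies. The sketch below uses Python's standard graphlib module; the mapping shape (unit id to list of dependency ids) and the `dispatch_order` name are assumptions for illustration, not the payload schema from the canonical contracts.

```python
from graphlib import CycleError, TopologicalSorter
from typing import List, Mapping, Sequence


def dispatch_order(dependencies: Mapping[str, Sequence[str]]) -> List[str]:
    """Return unit ids so that every unit comes after the units it depends on."""
    # TopologicalSorter expects {node: iterable of predecessors}.
    sorter = TopologicalSorter(dict(dependencies))
    try:
        return list(sorter.static_order())
    except CycleError as exc:
        # A cyclic payload cannot be dispatched in any order.
        raise ValueError(f"circular unit dependencies: {exc.args[1]}") from exc
```

For example, with hypothetical units where `api` depends on `schema` and `docs` depends on `api`, `schema` is dispatched first and `docs` last. Units with no ordering constraint between them are candidates for parallel dispatch.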

Agent Override System (3-Level Precedence)

Dr. Zero leverages the Claude Code SDK's native precedence mechanism:

.claude/agents/ (repo) > ~/.claude/agents/ (user) > plugin/agents/ (bundled)

The SDK automatically resolves which agent to use based on filename matching.

How It Works

When Orchestrator routes a WorkItem with domain: "testing", it invokes:

Task(agent="testing")  # Just the domain name

The SDK automatically checks:

  1. .claude/agents/testing.md (repo-specific) -- use if exists
  2. ~/.claude/agents/testing.md (user-wide) -- use if exists
  3. plugin/agents/testing.md (bundled) -- fallback
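The lookup order above amounts to "first matching file wins". The following sketch imitates that behavior for illustration only; the real resolution happens inside the SDK, and `resolve_agent` and its `search_roots` parameter are hypothetical names.

```python
from pathlib import Path
from typing import Optional, Sequence


def resolve_agent(domain: str, search_roots: Sequence[Path]) -> Optional[Path]:
    """Return the first {domain}.md found, checking roots in precedence order."""
    for root in search_roots:
        candidate = root / f"{domain}.md"
        if candidate.is_file():
            return candidate
    return None  # no repo, user, or bundled agent for this domain
```

Calling it with the roots ordered as repo, user, plugin (for example `[Path(".claude/agents"), Path.home() / ".claude/agents", Path("plugin/agents")]`) reproduces the three-level precedence described above.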

Override Example: Custom Testing Agent

Create .claude/agents/testing.md in your project:

---
name: testing
description: 'Custom test agent for our project-specific validation'
---

# Custom Testing Agent

You are the testing specialist for this project.

## Project-Specific Test Patterns

- Use our custom pytest fixtures in tests/conftest.py
- Follow our test naming convention: test_{feature}_{scenario}
- Always include integration tests for API endpoints

When Dr. Zero routes domain: "testing", your custom agent is used automatically.

Security Features

  • Scope boundary validation: The PreToolUse hook rejects path traversal (../../../etc/passwd), absolute paths, and CAT attacks (empty or glob/wildcard scopes) before any solver runs
  • Command whitelisting: Validates acceptance_test commands against a whitelist of 28 safe tools (pytest, ruff, terraform plan, etc.), blocks shell metacharacters, and prevents mutating subcommands (e.g., terraform apply)
  • Untrusted-input fencing: /drzero validates user input before processing, applying mode-specific length limits, prompt-injection pattern blocking, and shell-metacharacter escaping, with the sanitized copy persisted atomically to /tmp/drzero_input_validation.json
  • Anti-hallucination scoring: All GRPO/HRPO scores are computed by the dr0 Python package via the SubagentStop hook -- agents never self-report scores, model-supplied scores are discarded, and scoring failures write null with error provenance (see Issue #304)
  • Trusted-path installer: lib/setup-dr0.sh only executes the dr0 installer from a trusted checkout ($OTC_AWESOME_LLM_ROOT or plugin-relative), never from the current working directory, so an untrusted repo cannot ship a crafted installer (see lib/README.md)
  • Security review protocol: The security-review-protocol skill defines the three-headed review gates (security, quality, savage) and definition-of-done criteria applied by the security agent
  • Cross-repo remediation bridge: The security-remediation agent turns secplat findings into coordinated portal-gun remediation across affected repositories
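To make the first two defenses concrete, here is a minimal sketch of scope-boundary and acceptance-test checks. It is not the PreToolUse hook's code: the tool set is a small illustrative subset rather than the real 28-tool whitelist, and the function names, metacharacter set, and wildcard set are assumptions.

```python
import shlex
from pathlib import PurePosixPath
from typing import Optional

# Illustrative subset only; the real whitelist is not reproduced here.
SAFE_TOOLS = {"pytest", "ruff", "terraform"}
MUTATING_SUBCOMMANDS = {("terraform", "apply")}
SHELL_METACHARACTERS = set(";&|`$<>(){}\n")
WILDCARD_CHARACTERS = set("*?[]")


def scope_violation(scope: str) -> Optional[str]:
    """Return a reason string if the scope boundary is unsafe, else None."""
    if not scope.strip():
        return "empty"
    if WILDCARD_CHARACTERS & set(scope):
        return "wildcard"
    path = PurePosixPath(scope)
    if path.is_absolute():
        return "absolute"
    if ".." in path.parts:
        return "traversal"
    return None


def command_allowed(command: str) -> bool:
    """Accept only whitelisted, non-mutating tools with no shell metacharacters."""
    if SHELL_METACHARACTERS & set(command):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:  # e.g. unbalanced quotes
        return False
    if not argv or argv[0] not in SAFE_TOOLS:
        return False
    subcommand = argv[1] if len(argv) > 1 else ""
    return (argv[0], subcommand) not in MUTATING_SUBCOMMANDS
```

Under these assumptions, `terraform plan` passes while `terraform apply` and `pytest; rm -rf /` are rejected, and a scope of `../../../etc/passwd` is reported as a traversal.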

Requirements

  • Python 3.11+
  • Git repository
  • Claude Code CLI
  • dr0 Python package (provides GRPO/HRPO scoring math)
  • Local CI tools: pytest, ruff, mypy, bandit

Troubleshooting

"Proposer agent not found"

Ensure agents are in one of the three discovery paths:

  1. .claude/agents/proposer.md (project-level override)
  2. ~/.claude/agents/proposer.md (user-level override)
  3. plugin/agents/proposer.md (bundled fallback)

"Phase 1 not converging"

Adjust target success rate in drzero.yml:

dr_zero:
  target_success_rate: 0.4 # Lower threshold (default: 0.5)
  max_iterations: 5 # More iterations

"Orchestrator not found"

Dr. Zero falls back to sequential execution when the Orchestrator agent cannot be found. To use Orchestrator:

  1. Check /help for available agents
  2. Verify the orchestration.md agent exists in a discovery path
  3. Confirm name: orchestration in the agent frontmatter

Agent Override Not Working

If your custom agent is not being picked up:

  1. Filename must match the domain exactly:

    ~/.claude/agents/testing.md     -- correct for "testing" domain
    ~/.claude/agents/my-tester.md   -- wrong, SDK will not find this
    
  2. Frontmatter name must match the filename (without .md):

    ---
    name: testing
    ---
    
  3. Check the /agents command to confirm which source wins:

    /agents
    # testing (project)  [overrides plugin]
    # testing (user)     [overrides plugin]
    # testing (plugin)   [bundled default]
    
  4. Verify precedence order:

    • Project (.claude/agents/) overrides user (~/.claude/agents/)
    • User overrides plugin (bundled agents)
    • If an agent appears in multiple locations, the highest-precedence copy wins

Common mistakes:

  • Agent file not named after the domain (e.g., koji.md instead of testing.md)
  • Frontmatter name does not match filename
  • Directory typo: .claude/agent/ (missing trailing s)

Handling Conflicts (Swarm Agents Modifying Same Files)

When multiple agents modify overlapping files in Phase 2:

Resolution strategies:

  1. Sequential execution (safest):

    # drzero.yml
    agent_swarm:
      max_parallel: 1
    
  2. File-based locking (automatic): Orchestrator automatically sequences agents working on the same files.

  3. Manual resolution:

    git status
    # edit conflicted files
    git add <resolved-file>
    /drzero:drzero --resume
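One way to picture how agents touching the same files can be sequenced is to group WorkItems into waves whose file sets do not overlap. This greedy sketch is purely illustrative: it is not the Orchestrator's algorithm, and `plan_waves` and the item-to-files mapping are hypothetical.

```python
from typing import Dict, List, Set


def plan_waves(work_items: Dict[str, Set[str]]) -> List[List[str]]:
    """Greedily group items so that no two items in a wave share a file."""
    waves: List[List[str]] = []
    claimed: List[Set[str]] = []  # files already taken by each wave
    for item, files in work_items.items():
        for wave, wave_files in zip(waves, claimed):
            if wave_files.isdisjoint(files):
                wave.append(item)
                wave_files |= files
                break
        else:
            # Every existing wave conflicts, so start a new (later) wave.
            waves.append([item])
            claimed.append(set(files))
    return waves
```

Items in the same wave can run in parallel, and waves run one after another; setting max_parallel: 1 corresponds to the degenerate case of one item per wave.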
    

Prevention:

  • Use smaller, focused WorkItems (fewer file overlaps)
  • Prefer domain specialists with clear boundaries
  • Use /drzero:drzero-council for architecture decisions before implementation

Rolling Back Changes

Dr. Zero creates git stashes before each phase:

git stash list | grep drzero

Roll back Phase 2 only (keep Phase 1 refinements):

git reset --hard HEAD~1
git stash pop stash@{0}

Full session revert:

git reset --hard <commit-before-drzero>
git stash clear

Checkpoint configuration:

# drzero.yml
dr_zero:
  checkpoints:
    enabled: true
    frequency: per-phase  # or: per-iteration, per-task
    auto_stash: true

SDK Precedence Not Working As Expected

# Verify which agent source wins
/agents testing
# Shows: testing (plugin) OR testing (user) OR testing (project)

# Check Claude Code version
claude --version

# Verify plugin installation
claude plugin list

If precedence is still broken, file an issue with:

  • Claude Code version
  • Agent file locations and their frontmatter
  • /agents command output
  • Dr. Zero session logs

Documentation

User documentation follows the Diataxis quadrants under docs/plugins/drzero/:

Plugin and dr0 internals:

Notes

  • All agent invocations use the Task tool (no subprocess calls)
  • Context is preserved throughout both phases
  • HRPO scores the proposer; GRPO scores the solver -- never the reverse (arXiv:2601.07055 Figure 2)
  • Scores are computed by the dr0 Python package, never self-reported
  • Session state: ${XDG_RUNTIME_DIR:-$HOME/.cache}/drzero/session-<id>.json (per-session, transient) and .drzero-state.json (persistent)
  • Checkpoints use git stash for safe rollback

Contributing

See the main repository CONTRIBUTING.md and CLAUDE.md.

License

Internal Use Only - Optum Tech Compute