Writing Playbooks

Playbooks encode write-capable operational procedures. They are declared in agent.yaml, registered via the ADK, and always require human confirmation before execution.


Playbooks vs Skills

A Playbook is the write-complement to a Skill:

  • A Skill detects that SSH is open to 0.0.0.0/0 (read, advisory)
  • A Playbook closes that ingress rule (write, requires approval)

Both are declared in the same agent.yaml. The ADK enforces the read/write split at registration time.
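As a rough illustration of what a registration-time read/write check could look like, here is a minimal sketch. It is not the ADK's implementation: the function name `check_read_write_split` is hypothetical, and it assumes tool classes follow the `_read` / `_write` suffix convention used in the examples on this page.

```python
def check_read_write_split(asset_kind: str, tool_classes: list[str]) -> list[str]:
    """Return a list of violations; an empty list means the asset passes.

    Assumption: a tool class whose name ends in "_write" mutates resources.
    """
    violations = []
    for tool_class in tool_classes:
        is_write = tool_class.endswith("_write")
        # Skills are read-only and advisory, so any write tool class is rejected.
        if asset_kind == "skill" and is_write:
            violations.append(
                f"skill may not use write tool class {tool_class!r}"
            )
    return violations
```

Under these assumptions, a Skill that lists `security_group_write` would be rejected, while a Playbook listing the same class would pass this particular check and remain subject to approval at execution time.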


The agent.yaml playbook section

yaml
playbooks:
  can_trigger: true                    # can this agent initiate playbook execution?
  can_generate_candidate: true         # can Code Agent generate one dynamically if none found?
  owned_playbook_ids:
    - security.remediate_public_exposure
    - security.lock_public_storage

  # Read-before-write context — data the playbook needs before executing write tools
  context:
    context_descriptions:
      - "security group config — ingress rules before write execution"
      - "public exposure inventory — resources targeted for remediation"
    supported_context_types:
      - public_exposure_inventory
      - security_group_config
    declared_context_builders:
      - security.public_exposure_context

  target_language: python              # python | bash | terraform

Each owned_playbook_id requires a corresponding playbook.yaml in the agent package.
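The one-to-one requirement between `owned_playbook_ids` and playbook files can be checked before running the ADK. The sketch below assumes the YAML files have already been parsed into dictionaries (for example with a YAML library of your choice); `missing_playbook_files` is a hypothetical helper, not an ADK function.

```python
def missing_playbook_files(owned_ids: list[str], playbook_docs: list[dict]) -> list[str]:
    """Return owned playbook IDs that have no matching playbook.yaml document."""
    # Collect the playbook_id declared by each parsed playbook.yaml.
    found = {doc.get("playbook_id") for doc in playbook_docs}
    # Preserve the order used in agent.yaml so the report is easy to read.
    return [pid for pid in owned_ids if pid not in found]


owned = [
    "security.remediate_public_exposure",
    "security.lock_public_storage",
]
docs = [{"playbook_id": "security.remediate_public_exposure"}]
missing = missing_playbook_files(owned, docs)
```

With the sample data above, `missing` contains only `security.lock_public_storage`, the ID that has no playbook document.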


playbook.yaml — full schema

Each playbook is a separate file. This is the complete schema with all fields annotated:

yaml
# ── Identity ──────────────────────────────────────────────────────────────────
schema_version: 1                               # required — always 1
playbook_id: security.remediate_public_exposure # required — globally unique, format: domain.name
name: remediate-public-exposure                 # required — kebab-case identifier
display_name: Remediate Public Network Exposure # required — user-facing name
status: active                                  # required — active | deprecated

owner:                                          # required
  team: security-engineering
  contact: security-engineering@company

purpose: >                                      # required — one sentence
  Remove public ingress rules from security groups flagged for unrestricted access.

description: >                                  # required — fuller explanation
  Executes targeted security group rule modifications to close unrestricted ingress
  on port 22 (SSH), port 3389 (RDP), and port 0 (all traffic) where the source CIDR
  is 0.0.0.0/0 or ::/0. Requires a pre-execution snapshot and produces full Evidence.

# ── Classification ─────────────────────────────────────────────────────────────
classification:
  domain: security                              # required — must exist in Domain Registry
  trigger_type: finding                         # finding | explicit | scheduled
  trigger_conditions:                           # required if trigger_type: finding
    finding_category: network_exposure
    severity: [HIGH, CRITICAL]
  safety_class: remediation                     # advisory | remediation | destructive

# ── Context (read-before-write) ────────────────────────────────────────────────
context:
  context_descriptions:
    - "current ingress rules for targeted security groups"
    - "list of resources flagged in the triggering Finding"
  supported_context_types:
    - security_group_config                     # estate snapshot of SG rules before execution
    - public_exposure_inventory                 # resources from the triggering Finding
  declared_context_builders:
    - security.security_group_context
    - security.public_exposure_context

# ── Tool access ────────────────────────────────────────────────────────────────
tool_access:
  readonly_tools:                               # for pre-execution reads
    allowed_tool_classes:
      - security_group_read
    execution_locations: [client]
  write_tools:                                  # for execution steps
    allowed_tool_classes:
      - security_group_write
    execution_locations: [client]
    requires_human_approval: true               # always true for write tools — enforced

# ── Steps ──────────────────────────────────────────────────────────────────────
steps:
  - step_id: snapshot_before                   # required — take before state
    type: read
    tool_class: security_group_read
    description: Capture current ingress rules for all targeted security groups
    required: true
    on_failure: abort                          # abort | skip | continue

  - step_id: validate_targets
    type: read
    tool_class: security_group_read
    description: Confirm targeted security groups exist and match Finding resource IDs
    depends_on: [snapshot_before]
    required: true
    on_failure: abort

  - step_id: revoke_public_ingress
    type: write
    tool_class: security_group_write
    description: Revoke unrestricted ingress rules (0.0.0.0/0, ::/0) on targeted ports
    depends_on: [validate_targets]
    required: true
    on_failure: abort
    rollback_step: rollback_ingress            # if this step fails mid-execution

  - step_id: verify_after
    type: read
    tool_class: security_group_read
    description: Confirm ingress rules are removed — snapshot after state
    depends_on: [revoke_public_ingress]
    required: true
    on_failure: continue                       # verification failure does not undo the change

  - step_id: rollback_ingress                  # only executed on failure of revoke_public_ingress
    type: write
    tool_class: security_group_write
    description: Restore original ingress rules from before-snapshot
    depends_on: [snapshot_before]
    required: false
    is_rollback: true

# ── Safety ─────────────────────────────────────────────────────────────────────
policy:
  requires_human_review_for: [revoke_public_ingress]  # step IDs that need approval
  blast_radius:
    restarts_required: false
    downtime_expected: false
    reversible: true
    rollback_available: true
  prohibited_actions:
    - delete_security_group
    - modify_vpc_routing

# ── Evidence ───────────────────────────────────────────────────────────────────
evidence:
  captures: [before_state, after_state, api_calls, approval_record]
  output_type: execution_record

# ── Versioning ─────────────────────────────────────────────────────────────────
versioning:
  version: "1.0.0"

Safety classes

| Class | Description | Requires human review |
| --- | --- | --- |
| advisory | Produces analysis only, no mutations | No |
| remediation | Modifies resources, reversible | Yes, always |
| destructive | Deletes or irreversibly modifies resources | Yes, with explicit confirmation step |

WARNING

destructive playbooks require a double-confirmation step in the UI. Users must type the resource identifier to proceed. This cannot be bypassed.
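
The review rules for the three safety classes can be summarised as a small gate function. This is a sketch under stated assumptions, not the platform's code: `ReviewPolicy`, `REVIEW_POLICY`, and `may_execute` are hypothetical names, and the typed-identifier comparison stands in for the UI's double-confirmation step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewPolicy:
    human_review: bool        # an approver must confirm before execution
    typed_confirmation: bool  # the user must also type the resource identifier


# Mirrors the safety class table: advisory needs no review, remediation needs
# approval, destructive needs approval plus a typed confirmation.
REVIEW_POLICY = {
    "advisory": ReviewPolicy(human_review=False, typed_confirmation=False),
    "remediation": ReviewPolicy(human_review=True, typed_confirmation=False),
    "destructive": ReviewPolicy(human_review=True, typed_confirmation=True),
}


def may_execute(safety_class: str, approved: bool,
                typed_identifier: str, resource_identifier: str) -> bool:
    """Return True only if every review requirement for the class is met."""
    policy = REVIEW_POLICY[safety_class]
    if policy.human_review and not approved:
        return False
    if policy.typed_confirmation and typed_identifier != resource_identifier:
        return False
    return True
```

For example, `may_execute("destructive", True, "", "sg-123")` returns `False` because the identifier was not typed, even though the run was approved (`sg-123` is a placeholder identifier).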


Execution plan rules

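Steps in a Playbook form a directed acyclic graph (DAG). Each step's depends_on and on_failure fields control ordering and failure handling: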

depends_on: []           → starts immediately (runs in parallel with other independent steps)
depends_on: [step_a]     → waits for step_a to complete before starting
on_failure: abort        → halt the entire playbook on this step's failure
on_failure: skip         → continue without this step's output
on_failure: continue     → record the failure, proceed

Steps with empty depends_on run in parallel. The platform resolves the execution order from the dependency graph at runtime.
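
To make the ordering rule concrete, the sketch below groups steps into "waves" that could run in parallel, using Python's standard `graphlib` module. It only illustrates dependency resolution and is not the platform's scheduler. `execution_waves` is a hypothetical helper, and excluding `is_rollback` steps from the normal plan is an assumption based on the note that rollback steps run only on failure.

```python
from graphlib import TopologicalSorter


def execution_waves(steps: list[dict]) -> list[list[str]]:
    """Group step IDs into waves; steps in the same wave have no unmet dependencies."""
    known = {step["step_id"] for step in steps}
    graph = {}
    for step in steps:
        if step.get("is_rollback"):
            continue  # assumption: rollback steps are not part of the normal plan
        deps = step.get("depends_on", [])
        unknown = [dep for dep in deps if dep not in known]
        if unknown:
            raise ValueError(f"{step['step_id']} depends on unknown steps: {unknown}")
        graph[step["step_id"]] = set(deps)

    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the steps are not a DAG
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

Applied to the five steps of `security.remediate_public_exposure`, this yields four single-step waves in the order `snapshot_before`, `validate_targets`, `revoke_public_ingress`, `verify_after`. Two steps that both declare `depends_on: []` would land together in the first wave.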


Registering a playbook via the ADK

bash
# From the agent package directory
adk validate .           # validates agent.yaml + all playbook.yaml files offline

# Expected:
# ✓ identity: valid
# ✓ playbooks: 2 playbooks validated
#   ✓ security.remediate_public_exposure — steps: 5, safety_class: remediation
#   ✓ security.lock_public_storage — steps: 3, safety_class: remediation
# Validation passed.

adk register .           # registers all assets into the Context Engine

INFO

adk register is atomic: either all assets are registered or none are. A failed registration leaves the Context Engine unchanged.
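
The all-or-nothing behaviour can be pictured as "validate and stage everything first, commit last". The sketch below models the Context Engine as a plain dictionary purely for illustration. `register_all` and its validation rules (an ID must be present, and no duplicates within a batch) are assumptions, not the ADK's actual checks.

```python
import copy


def register_all(store: dict, assets: list[dict]) -> None:
    """Register every asset into store, or raise and leave store untouched."""
    staged = {}
    for asset in assets:
        asset_id = asset.get("playbook_id")
        if not asset_id:
            raise ValueError("asset has no playbook_id")
        if asset_id in staged:
            raise ValueError(f"duplicate playbook_id in batch: {asset_id}")
        staged[asset_id] = copy.deepcopy(asset)
    # Reached only if every asset passed validation, so the commit is one step.
    store.update(staged)
```

Because `store` is modified only after the loop completes, a batch that contains one invalid asset changes nothing.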


Dynamically generated playbooks

If no registered Playbook matches a user's intent, the Code Agent can generate a candidate Playbook dynamically, but only if can_generate_candidate: true is set in the agent's agent.yaml.

Generated playbooks follow the same schema and safety rules. They are presented to the user for review before execution. They are not registered permanently unless the user explicitly saves them.
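
The lookup-then-generate flow described above might be sketched as follows. All names here are hypothetical: `resolve_playbook` is not an ADK function, the registry is modelled as a dictionary keyed by playbook ID, and `generate` stands in for whatever the Code Agent does to produce a candidate.

```python
from typing import Callable, Optional


def resolve_playbook(
    playbook_id: str,
    registry: dict,
    can_generate_candidate: bool,
    generate: Callable[[str], dict],
) -> tuple[Optional[dict], bool]:
    """Return (playbook, is_generated). A generated candidate is never saved here."""
    if playbook_id in registry:
        return registry[playbook_id], False
    if not can_generate_candidate:
        return None, False
    # The candidate still has to pass user review before it can execute, and
    # it is added to the registry only if the user explicitly saves it.
    return generate(playbook_id), True
```

Returning the `is_generated` flag separately keeps the registry untouched and lets the caller route generated candidates through review.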


Escher — Agentic CloudOps by Tessell