
Playbooks Overview

Implementation status

The shipped Playbook Agent is v3-playbook-agent-go. The v4-playbook-agent-go repo exists but is currently empty — v4 design is in the architecture spec; v4 implementation has not yet started. State machines, safety classes, and the lifecycle below are described from the v3 contract plus the v4 spec; read the actual v3-playbook-agent-go README for what's running today.

Playbooks are write-capable operational procedures. They turn Findings into executed, evidenced Runs, with human approval enforced before every write step unless the Playbook declares the automated safety class.


The lifecycle

A Playbook is the reusable procedure that drives this lifecycle. It encodes the steps, safety rules, context requirements, and evidence expectations for a class of remediation. One Playbook can be instantiated as many Bundles and Runs across different resources, accounts, and environments.
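The one-to-many relationship between a Playbook and its Bundles can be sketched as follows. This is a minimal illustration only: the `Playbook` and `Bundle` classes, their fields, and the `instantiate` helper are hypothetical names, not the schema used by the Playbook Agent.

```python
from dataclasses import dataclass
from itertools import count

_bundle_ids = count(1)  # hypothetical ID source, for illustration only


@dataclass(frozen=True)
class Playbook:
    """Reusable procedure; field names are assumptions, not the real schema."""
    name: str
    steps: tuple[str, ...]


@dataclass
class Bundle:
    """One instantiation of a Playbook against a single target."""
    bundle_id: int
    playbook: Playbook
    resource: str
    account: str
    environment: str


def instantiate(playbook, targets):
    """Create one Bundle per (resource, account, environment) target."""
    return [Bundle(next(_bundle_ids), playbook, *target) for target in targets]


pb = Playbook("rotate-access-key", ("snapshot", "rotate", "verify"))
bundles = instantiate(pb, [
    ("key-a", "acct-1", "prod"),
    ("key-b", "acct-2", "staging"),
])
```

Each Bundle here holds a reference to the same Playbook object, so the procedure is defined once and reused across targets.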


Skills vs Playbooks

| | Skills | Playbooks |
|---|---|---|
| Operation | Read-only observation | Write operations (create, update, delete) |
| Cloud mutations | Never permitted | Permitted after human approval (no approval gate for the automated safety class) |
| Human approval | Not required | Required before every write step (except for the automated safety class) |
| Output | Finding, Report, Triage | Run + Evidence |
| Registered via | agent.yaml → skills[] | agent.yaml → playbooks[] |
| Tool access | tool_class: readonly only | tool_class: write permitted |
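The tool-access rule in the table can be illustrated with a small check over a parsed agent.yaml. This is a sketch under assumptions: only `skills[]`, `playbooks[]`, and the `tool_class` values `readonly` and `write` come from the table above. The `tools` list, the `name` keys, and `validate_tool_access` are hypothetical.

```python
# Stand-in for a parsed agent.yaml. The "tools" and "name" keys are assumed.
AGENT = {
    "skills": [
        {"name": "list-unused-volumes",
         "tools": [{"name": "describe_volumes", "tool_class": "readonly"}]},
    ],
    "playbooks": [
        {"name": "delete-unused-volumes",
         "tools": [{"name": "describe_volumes", "tool_class": "readonly"},
                   {"name": "delete_volume", "tool_class": "write"}]},
    ],
}


def validate_tool_access(agent):
    """Return one error per Skill tool that is not tool_class: readonly.

    Playbooks are not checked because write tools are permitted for them.
    """
    errors = []
    for skill in agent.get("skills", []):
        for tool in skill.get("tools", []):
            if tool["tool_class"] != "readonly":
                errors.append(
                    f"skill {skill['name']!r} uses non-readonly tool {tool['name']!r}"
                )
    return errors
```

Running `validate_tool_access(AGENT)` on the sample above returns an empty list. Moving `delete_volume` into a Skill would produce one error.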

Playbook triggers

| Trigger type | Description |
|---|---|
| finding | Automatically surfaced when a matching Finding is created (matched on trigger_conditions[]) |
| explicit | User directly invokes the Playbook by name in the UI |
| api | Triggered programmatically via the Playbook Agent API |
| scheduled | Time-based trigger, defined in the Playbook's schedule block (e.g. nightly cleanup) |
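As a rough illustration of the finding trigger, the sketch below matches a Finding against each Playbook's `trigger_conditions[]`. The condition format is an assumption: each condition is treated as a dict of field/value pairs, and a Playbook matches when any one condition is fully satisfied. The actual matching semantics are defined by the Playbook schema, not by this example.

```python
def matches(finding, trigger_conditions):
    """True if every field of at least one condition equals the Finding's value."""
    return any(
        all(finding.get(key) == value for key, value in condition.items())
        for condition in trigger_conditions
    )


def playbooks_for_finding(finding, playbooks):
    """Names of Playbooks with a finding trigger whose conditions match."""
    return [
        pb["name"]
        for pb in playbooks
        if "finding" in pb["triggers"] and matches(finding, pb["trigger_conditions"])
    ]


# Hypothetical Playbooks and Finding, for illustration only.
PLAYBOOKS = [
    {"name": "delete-unused-volumes", "triggers": ["finding", "explicit"],
     "trigger_conditions": [{"type": "unused_volume"}]},
    {"name": "nightly-cleanup", "triggers": ["scheduled"],
     "trigger_conditions": []},
    {"name": "close-open-port", "triggers": ["finding"],
     "trigger_conditions": [{"type": "open_port", "severity": "high"}]},
]

finding = {"type": "unused_volume", "severity": "low"}
surfaced = playbooks_for_finding(finding, PLAYBOOKS)
```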

Safety classes

Every Playbook declares a safety_class that controls how the Platform Framework handles human interaction:

| safety_class | Behaviour |
|---|---|
| advisory | Escher presents the Bundle and recommends action. User runs it manually when ready. |
| supervised | Escher presents the Bundle and requires explicit approval before each write step. |
| automated | Escher executes automatically with no approval gate. Reserved for reversible, idempotent operations. |

WARNING

automated Playbooks should only be used for operations that are fully reversible and have no blast radius, for example applying a tag. Any network, IAM, or data operation must use supervised.
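The safety classes and the warning above could be enforced with a check along these lines. This is a sketch, not the Platform Framework's implementation: `SafetyClass`, `SENSITIVE_DOMAINS`, and both functions are hypothetical, and tagging each Playbook with the domains it touches is an assumption made for the example.

```python
from enum import Enum


class SafetyClass(Enum):
    ADVISORY = "advisory"
    SUPERVISED = "supervised"
    AUTOMATED = "automated"


# Domains the warning singles out. The lowercase labels are assumed.
SENSITIVE_DOMAINS = {"network", "iam", "data"}


def requires_approval_gate(safety_class):
    """Only automated Playbooks run without a human approval gate."""
    return safety_class is not SafetyClass.AUTOMATED


def validate_safety_class(safety_class, domains):
    """Reject automated Playbooks that touch network, IAM, or data operations."""
    sensitive = SENSITIVE_DOMAINS & set(domains)
    if safety_class is SafetyClass.AUTOMATED and sensitive:
        raise ValueError(
            f"automated is not allowed for {sorted(sensitive)}; use supervised"
        )
```

For example, `validate_safety_class(SafetyClass.AUTOMATED, {"tagging"})` passes, while the same call with `{"iam"}` raises `ValueError`. The check only rejects automated. It does not force advisory Playbooks onto supervised, which is a narrower reading of the warning.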


Safety guarantees

Every Playbook execution:

  1. Requires human approval before any write step executes (for advisory and supervised classes)
  2. Captures a before-state snapshot of every targeted resource prior to the first write
  3. Produces Evidence for every step — API call inputs/outputs, before/after state, timestamps
  4. Supports pause and abort — a running Playbook can be stopped mid-execution; already-executed steps are preserved in Evidence
  5. Defines rollback steps for all destructive operations — Escher executes the rollback automatically on failure
  6. Validates against a pinned EstateView — if the estate has drifted since Bundle creation, the Bundle is flagged stale before execution begins
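The guarantees above can be sketched as a single run loop over an in-memory estate. Everything here is a simplified assumption for illustration: the `bundle` and `estate` dict layouts, the `approve` callback, the `StaleBundleError` exception, and the status strings are hypothetical. The real Playbook Agent may flag a stale Bundle differently rather than raising.

```python
import copy
from datetime import datetime, timezone


class StaleBundleError(Exception):
    """Raised in this sketch when the estate drifted since Bundle creation."""


def _record(step_name, outcome, resources):
    # Evidence entry: step, outcome, state after the event, UTC timestamp.
    return {"step": step_name, "outcome": outcome,
            "state_after": copy.deepcopy(resources),
            "at": datetime.now(timezone.utc).isoformat()}


def run_bundle(bundle, estate, approve):
    # Guarantee 6: validate against the pinned EstateView before executing.
    if bundle["pinned_estate_version"] != estate["version"]:
        raise StaleBundleError("estate drifted since Bundle creation")
    resources = estate["resources"]
    before_state = copy.deepcopy(resources)  # guarantee 2: before-state snapshot
    evidence, done = [], []

    def result(status):
        return {"status": status, "evidence": evidence, "before_state": before_state}

    for step in bundle["steps"]:
        # Guarantees 1 and 4: approval gate; declining aborts, Evidence is kept.
        if not approve(step["name"]):
            evidence.append(_record(step["name"], "aborted", resources))
            return result("aborted")
        try:
            step["apply"](resources)
        except Exception:
            evidence.append(_record(step["name"], "failed", resources))
            # Guarantee 5: roll back completed steps in reverse order.
            # The failing step itself is assumed to have changed nothing.
            for prev in reversed(done):
                prev["rollback"](resources)
                evidence.append(_record(prev["name"], "rolled_back", resources))
            return result("rolled_back")
        done.append(step)
        evidence.append(_record(step["name"], "applied", resources))  # guarantee 3
    return result("succeeded")


def add_tag(resources):
    resources["vol-1"]["tags"]["owner"] = "team-a"


def remove_tag(resources):
    resources["vol-1"]["tags"].pop("owner", None)


def failing_step(resources):
    raise RuntimeError("simulated API failure")


estate = {"version": 7, "resources": {"vol-1": {"tags": {}}}}
bundle = {"pinned_estate_version": 7, "steps": [
    {"name": "tag-volume", "apply": add_tag, "rollback": remove_tag},
    {"name": "resize-volume", "apply": failing_step, "rollback": lambda r: None},
]}
result = run_bundle(bundle, estate, approve=lambda step_name: True)
```

In this run the second step fails, so the first step's rollback removes the tag and the estate returns to its before-state. The Evidence list still records the applied, failed, and rolled-back events.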

Playbook vs Code Agent generation

Playbooks can come from two sources:

| Source | When |
|---|---|
| Registered Playbook | A domain team has authored and registered a Playbook via adk register. This is preferred: the Playbook has been reviewed and tested. |
| Code Agent generation | No registered Playbook matches the request. The Code Agent generates a candidate Playbook on-the-fly, grounded in the Agent Registry domain boundary. is_candidate: true is set; it is not persisted until a human promotes it. |

Candidate Playbooks go through the same approval flow as registered Playbooks. They can be promoted to the Playbook collection (becoming is_candidate: false) after review.
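The two sources and the promotion step might be modelled roughly as below. This is a sketch under stated assumptions: the in-memory `REGISTRY` dict stands in for the Playbook collection, `generate_candidate` is a placeholder for the Code Agent, and the `reviewed_by` argument is hypothetical. Only the `is_candidate` flag comes from the text above.

```python
# Hypothetical stand-in for the Playbook collection, keyed by Playbook name.
REGISTRY = {
    "rotate-access-key": {"name": "rotate-access-key", "is_candidate": False,
                          "steps": ["snapshot", "rotate", "verify"]},
}


def generate_candidate(request):
    """Placeholder for Code Agent generation; returns an unpersisted candidate."""
    return {"name": request, "is_candidate": True, "steps": []}


def resolve_playbook(request, registry):
    """Prefer a registered Playbook; otherwise fall back to a candidate."""
    registered = registry.get(request)
    if registered is not None:
        return registered
    return generate_candidate(request)  # not written to the registry


def promote(candidate, registry, reviewed_by):
    """Persist a reviewed candidate with is_candidate set to False."""
    if not reviewed_by:
        raise ValueError("a human reviewer is required to promote a candidate")
    promoted = {**candidate, "is_candidate": False}
    registry[promoted["name"]] = promoted
    return promoted


candidate = resolve_playbook("tag-orphaned-disks", REGISTRY)
```

Until `promote` is called, `candidate` exists only in memory and the registry is unchanged.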


Writing vs executing

| Task | Where to go |
|---|---|
| Author a new Playbook | Writing Playbooks: full schema reference + annotated example |
| Run an existing Playbook | Executing Playbooks: Bundle review, approval, monitoring |
| Review run history and evidence | Evidence & Reports: audit trail, compliance export |


Escher — Agentic CloudOps by Tessell