Schema Reference

Developer Reference

This page covers internal implementation details. It is not included in the User Guide.

Complete JSON schema for all 13 Context Engine collections. These are the authoritative data structures populated by adk register and read at runtime by the Analysis Agent, Playbook Agent, and Gateway.


Collection map

| # | Collection | Formula variable | Phase accessed | Embedding |
|---|---|---|---|---|
| 2.2 | Skill | Skill(x) | 2C Step 1 (search) + Phase 3 (full) | purpose + description + capabilities + context_descriptions |
| 2.4 | Tool | Tool_readonly / Tool_write | Phase 6 + 5A State 7 | none (filter by tool_id / tool_class) |
| 2.5 | Guardrail | Guardrail | Phase 7 + 5A State 7 | none (filter only) |
| 2.6 | Domain Lens | DomainLens | Phase 7 (RAG) | source ref only (CE handles RAG internally) |
| 2.7 | Template | Template | Phase 7 (AA resolves once) | none (filter only) |
| 2.8 | Playbook | Playbook(p) | 2C write Step 1 + Phase 3 + 5A | trigger_conditions + name |
| 2.10 | Tag Store | tags (internal) | All phases (request-scoped) | none |
| 2.11 | Agent Registry | Agent manifest | Phase 2C/2W miss only | capabilities[] |
| 2.12 | Domain Expert Graph | ExpertGraph | Phase 7 (advanced, on-demand) | none (graph traversal) |
| 2.13 | Domain Registry | | ADK registration only | none (exact match) |
| 2.14 | Evidence (design intent) | | All artifact stages (Report Item / Finding / Plan / Run) | none (direct lookup by id; indexed by attached_to.*) |

Intent (2.1) and Context Builder (2.3) are deprecated. Cloud Knowledge (2.9) is merged into Domain Lens (§2.6) via content_type: "cloud_service".

Evidence (2.14) is design intent — not yet implemented in the running CE API. See §2.14 below and the Evidence & Reports design spec. The collection name, storage location (CE collection vs separate write-optimised store), and indexing strategy are pending engineering sign-off.


2.2 Skill

Formula variable: Skill(x)
Phase: 2C Step 1 (semantic search) → Phase 3 (full manifest retrieval)
Collection: escher_skills_<tenant_id> → fallback escher_skills_global

json
{
  "_id": "security.detect_public_ingress",
  "_embedding": [0.034, -0.121, 0.095, "...768 dims"],
  "_created_at": "2024-01-15T10:00:00Z",
  "_updated_at": "2024-01-15T10:00:00Z",
  "_version": "0.1.0",

  "skill_id":    "security.detect_public_ingress",
  "name":        "detect_public_ingress",
  "domain":      "security",
  "tier":        "basic",
  "tenant_id":   null,
  "status":      "active",

  "display_name":  "Detect Public Network Exposure",
  "description":   "Identifies EC2 security groups, NACLs, and ALBs with rules that allow unrestricted inbound access (0.0.0.0/0 or ::/0).",
  "purpose":       "find cloud resources exposed to the public internet through misconfigured network controls",

  "capabilities": [
    "detect security groups with 0.0.0.0/0 ingress rules",
    "identify public-facing load balancers without WAF",
    "flag NACLs with overly permissive allow rules",
    "rank exposure findings by severity"
  ],
  "capability_id": "security.network_exposure_detection",

  "context": {
    "context_descriptions": [
      "current security group configurations and their ingress/egress rules",
      "list of internet-facing load balancers and their WAF associations"
    ],
    "supported_context_types": [
      "public_exposure_inventory",
      "resource_scope_summary",
      "environment_scope"
    ]
  },

  "output_type": "finding",

  "execution_plan": {
    "steps": [
      {
        "step_id": "step_1",
        "type":    "read",
        "tool_id": "aws.describe_security_groups",
        "description": "Fetch all security groups in scope",
        "cache_ttl_seconds": 300
      },
      {
        "step_id": "step_2",
        "type":    "read",
        "tool_id": "aws.describe_load_balancers",
        "description": "Fetch load balancers and WAF associations",
        "cache_ttl_seconds": 300
      }
    ]
  },

  "approach_hints": [
    "focus on rules allowing 0.0.0.0/0 — these are definitive exposure",
    "flag ::/0 IPv6 as equivalent severity",
    "exclude expected public-facing resources (CDN origins) if tagged appropriately"
  ]
}

Embedding strategy:

Combined vector from: purpose + description + display_name + capability_id + capabilities[] + context_descriptions[]
Searched at Phase 2C Step 1 (routing decision: narrow / broad / miss)
Full document fetched at Phase 3 using skill_id exact match

Tenant scope: tenant → global. Tenant skills take priority; global skills are the fallback.
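
The tenant-then-global lookup order can be sketched as follows. This is a minimal illustration, not the CE implementation: the in-memory `store` dict stands in for the real collections, and `candidate_collections` and `resolve_skill` are hypothetical helper names.

```python
from typing import Optional

def candidate_collections(base: str, tenant_id: Optional[str]) -> list[str]:
    """Return collection names in lookup order: tenant first, then global."""
    names = []
    if tenant_id:
        names.append(f"{base}_{tenant_id}")
    names.append(f"{base}_global")
    return names

def resolve_skill(store: dict, skill_id: str, tenant_id: Optional[str]) -> Optional[dict]:
    """Return the first document found for skill_id, preferring the tenant collection."""
    for name in candidate_collections("escher_skills", tenant_id):
        doc = store.get(name, {}).get(skill_id)
        if doc is not None:
            return doc
    return None

# Hypothetical in-memory stand-in for the CE collections.
store = {
    "escher_skills_global": {
        "security.detect_public_ingress": {"tier": "basic", "tenant_id": None},
    },
    "escher_skills_acme_corp": {
        "security.detect_public_ingress": {"tier": "advanced", "tenant_id": "acme_corp"},
    },
}

# A tenant with its own copy gets the tenant document; any other tenant gets the global one.
tenant_doc = resolve_skill(store, "security.detect_public_ingress", "acme_corp")
global_doc = resolve_skill(store, "security.detect_public_ingress", "other_tenant")
```

The same ordering would apply to the other collections that declare a `<tenant_id>` → global fallback (guardrails, domain lens, templates, playbooks).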


2.4 Tool

Formula variable: Tool_readonly / Tool_write
Phase: Phase 6 (readonly tools for context) + Phase 5A State 7 (write tools for playbook execution)
Collection: escher_tools_global

json
{
  "_id": "aws.describe_security_groups",
  "_created_at": "2024-01-15T10:00:00Z",

  "tool_id":      "aws.describe_security_groups",
  "name":         "describe_security_groups",
  "provider":     "aws",
  "service":      "ec2",
  "tool_class":   "readonly",
  "domain":       "security",

  "description": "Returns all EC2 security groups in the given region and account, including their ingress and egress rules.",

  "parameters": [
    {
      "name":        "region",
      "type":        "string",
      "required":    true,
      "description": "AWS region code",
      "example":     "us-east-1"
    },
    {
      "name":        "profile",
      "type":        "string",
      "required":    true,
      "description": "AWS CLI profile name from the estate",
      "example":     "prod_infra"
    },
    {
      "name":        "filters",
      "type":        "object",
      "required":    false,
      "description": "Optional filter map passed to describe-security-groups"
    }
  ],

  "output_schema": {
    "type":   "array",
    "items":  { "type": "object" },
    "description": "List of SecurityGroup objects from the AWS EC2 API"
  },

  "execution_target": "client",
  "language":         "python",
  "requires_approval": false
}

tool_class values:

| Value | When used | requires_approval |
|---|---|---|
| readonly | Phase 6: fetches context data | false |
| write | Phase 5A State 7: modifies cloud resources | true if safety_class: supervised |
| internal | CE internal use only (not exposed to AA) | |

2.5 Guardrail

Formula variable: Guardrail
Phase: Phase 7 (AA) + Phase 5A State 7 (playbook execution gate)
Collection: escher_guardrails_<tenant_id> → fallback escher_guardrails_global

json
{
  "_id": "security.no_public_write_without_approval",
  "_created_at": "2024-01-15T10:00:00Z",

  "guardrail_id": "security.no_public_write_without_approval",
  "domain":       "security",
  "scope":        "domain",
  "applies_to":   ["security.*"],

  "rule": "Any write operation that modifies security group ingress rules, S3 bucket policies, or IAM role trust policies requires explicit human approval before execution.",

  "enforcement": "hard_block",
  "message":     "This change affects your public exposure surface. Review the proposed modification before approving.",

  "exceptions": [
    {
      "condition": "dry_run == true",
      "allow": true
    }
  ],

  "tenant_id": null
}

Scope resolution hierarchy (AA resolves in order, applies all matches):

skill-level  →  agent-level  →  domain-level  →  global

enforcement values:

| Value | Effect |
|---|---|
| hard_block | AA cannot proceed; returns guardrail message to user |
| soft_warn | AA includes warning in output but proceeds |
| require_approval | Playbook step pauses for human approval |
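
The "applies all matches" rule can be illustrated with a short sketch. Several things here are assumptions rather than documented behaviour: `applies_to` patterns are treated as shell-style globs, the `skill`, `agent`, and `global` scope labels are assumed to mirror the documented `domain` value, `global.audit_all_writes` is a made-up guardrail, and `exceptions` are not evaluated.

```python
from fnmatch import fnmatchcase

# Order taken from the scope resolution hierarchy; labels other than
# "domain" are assumed, not confirmed by the schema.
SCOPE_ORDER = ["skill", "agent", "domain", "global"]

def matching_guardrails(guardrails, target_id):
    """Return every guardrail whose applies_to pattern matches target_id,
    ordered skill -> agent -> domain -> global."""
    hits = [
        g for g in guardrails
        if any(fnmatchcase(target_id, pattern) for pattern in g["applies_to"])
    ]
    return sorted(hits, key=lambda g: SCOPE_ORDER.index(g["scope"]))

def is_blocked(guardrails, target_id):
    """True if any matching guardrail uses hard_block enforcement."""
    return any(
        g["enforcement"] == "hard_block"
        for g in matching_guardrails(guardrails, target_id)
    )

guardrails = [
    # Hypothetical global guardrail, for illustration only.
    {"guardrail_id": "global.audit_all_writes", "scope": "global",
     "applies_to": ["*"], "enforcement": "soft_warn"},
    {"guardrail_id": "security.no_public_write_without_approval", "scope": "domain",
     "applies_to": ["security.*"], "enforcement": "hard_block"},
]

matched = [g["guardrail_id"] for g in matching_guardrails(guardrails, "security.detect_public_ingress")]
```

Because every match is applied rather than only the first, a permissive guardrail at one scope cannot mask a `hard_block` at another.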

2.6 Domain Lens

Formula variable: DomainLens
Phase: Phase 7 (AA, advanced tier, on-demand RAG)
Collection: escher_domain_lens_<tenant_id> → fallback escher_domain_lens_global

json
{
  "_id": "security.soc2_trust_principles",
  "_created_at": "2024-01-15T10:00:00Z",
  "_content_hash": "sha256:abc123...",

  "lens_id":      "security.soc2_trust_principles",
  "domain":       "security",
  "content_type": "framework",
  "tier":         "advanced",
  "tenant_id":    null,

  "title":       "SOC 2 Trust Services Criteria",
  "description": "AICPA SOC 2 Trust Services Criteria — CC series controls for security, availability, processing integrity, confidentiality, and privacy.",

  "source": {
    "type": "url",
    "ref":  "https://internal-docs.escher.internal/frameworks/soc2-tsc-2017.pdf"
  },

  "tags": ["soc2", "compliance", "audit", "cc-series"]
}

content_type values:

| Value | Description |
|---|---|
| framework | Compliance frameworks (SOC2, PCI-DSS, ISO 27001, HIPAA, GDPR) |
| cloud_service | AWS/Azure/GCP service concepts (merged from Cloud Knowledge §2.9) |
| policy | Internal policies and runbooks |
| reference | General domain reference documents |

CE owns the RAG infrastructure. At registration time, CE fetches source.ref, chunks the content (512-token overlapping windows), embeds the chunks, and indexes them internally. The Domain Lens document holds only the reference and content_hash for change detection — CE re-indexes automatically if the hash changes on re-registration.
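
The chunking and change-detection steps described above might look roughly like this. It is a sketch under stated assumptions: the 64-token overlap is an invented value (the source gives only the 512-token window), tokens are passed in as a plain list, and `chunk_tokens`, `content_hash`, and `needs_reindex` are hypothetical names.

```python
import hashlib

def chunk_tokens(tokens, window=512, overlap=64):
    """Split tokens into windows of `window` tokens.
    Consecutive windows share `overlap` tokens (overlap size is an assumption)."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must be in [0, window)")
    step = window - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks

def content_hash(data: bytes) -> str:
    """Hash fetched content in the 'sha256:<hex>' form used by _content_hash."""
    return "sha256:" + hashlib.sha256(data).hexdigest()

def needs_reindex(stored_hash, fetched: bytes) -> bool:
    """Re-index only when the fetched content no longer matches the stored hash."""
    return stored_hash != content_hash(fetched)

tokens = list(range(1000))      # stand-in for a tokenized document
chunks = chunk_tokens(tokens)   # windows start at 0, 448, 896
```

Comparing hashes on re-registration keeps unchanged sources from being re-chunked and re-embedded.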


2.7 Template

Formula variable: Template
Phase: Phase 7 (AA resolves once; UI Agent receives at Phase 9, no second CE call)
Collection: escher_templates_<tenant_id> → fallback escher_templates_global

json
{
  "_id": "security.finding.public_exposure",
  "_created_at": "2024-01-15T10:00:00Z",

  "template_id":   "security.finding.public_exposure",
  "domain":        "security",
  "output_type":   "finding",
  "skill_id":      "security.detect_public_ingress",
  "tier":          "basic",
  "tenant_id":     null,

  "structure": {
    "title":         "{{resource_type}} publicly exposed via {{rule_type}}",
    "severity":      "{{severity}}",
    "summary":       "{{resource_id}} in {{region}} has an ingress rule allowing {{cidr}} on port {{port}}.",
    "impact":        "{{impact_description}}",
    "recommendation": "Restrict the ingress rule to known CIDR ranges or remove it if not required.",
    "evidence_refs": ["{{evidence_id}}"]
  },

  "rendering_hints": {
    "severity_color": {
      "critical": "#FF0000",
      "high":     "#FA504A",
      "medium":   "#FFA500",
      "low":      "#FFD700",
      "info":     "#808080"
    }
  }
}

Resolution fallback chain:

skill_id match → agent match → domain match → global default

Templates have no embedding — resolution is exact match only.
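
Once a template is resolved, the `{{name}}` placeholders in its `structure` block have to be filled. The source does not say which component does this or how, so the following is only an illustrative sketch: `render` is a hypothetical helper, and leaving unknown placeholders untouched is an assumed policy.

```python
import re

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")

def render(value, variables):
    """Recursively substitute {{name}} placeholders in strings, lists, and dicts.
    Placeholders with no matching variable are left as-is (assumed policy)."""
    if isinstance(value, str):
        return PLACEHOLDER.sub(
            lambda m: str(variables.get(m.group(1), m.group(0))), value
        )
    if isinstance(value, list):
        return [render(item, variables) for item in value]
    if isinstance(value, dict):
        return {key: render(item, variables) for key, item in value.items()}
    return value

structure = {
    "title": "{{resource_type}} publicly exposed via {{rule_type}}",
    "severity": "{{severity}}",
    "impact": "{{impact_description}}",
    "evidence_refs": ["{{evidence_id}}"],
}

# Example values are invented for illustration.
variables = {
    "resource_type": "Security group",
    "rule_type": "ingress rule",
    "severity": "high",
    "evidence_id": "ev_42",
}

rendered = render(structure, variables)
```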


2.8 Playbook

Formula variable: Playbook(p)
Phase: 2C write Step 1 (semantic search) + Phase 3 + 5A (full manifest)
Collection: escher_playbooks_<tenant_id> → fallback escher_playbooks_global

json
{
  "_id": "security.remediate_public_exposure",
  "_embedding": [0.021, -0.045, "...768 dims"],
  "_version": "0.1.0",

  "playbook_id":   "security.remediate_public_exposure",
  "name":          "Remediate Public Exposure",
  "domain":        "security",
  "tier":          "basic",
  "is_candidate":  false,
  "generated_by":  "registered",
  "tenant_id":     null,

  "trigger_conditions": [
    "public_exposure_finding",
    "open_security_group_detected",
    "internet_facing_resource_unprotected"
  ],

  "context": {
    "context_descriptions": [
      "security group config — ingress rules before write execution",
      "public exposure inventory — resources targeted for remediation"
    ],
    "supported_context_types": [
      "public_exposure_inventory",
      "security_group_config"
    ],
    "declared_context_builders": [
      "security.public_exposure_context"
    ]
  },

  "steps": [
    {
      "step_id":    "step_1",
      "name":       "Fetch current exposure surface",
      "type":       "readonly",
      "tool_id":    "aws.describe_public_ingress_surface",
      "required":   true,
      "on_failure": "stop"
    },
    {
      "step_id":    "step_2",
      "name":       "Lock overly permissive security groups",
      "type":       "write",
      "tool_id":    "aws.lock_security_group",
      "required":   true,
      "on_failure": "rollback"
    }
  ],

  "scripts": [
    {
      "step_id":    "step_1",
      "language":   "python",
      "body":       "import boto3\nsession = boto3.Session(profile_name='${user.profile}', region_name='${user.region}')\nec2 = session.client('ec2')\nreturn ec2.describe_security_groups()",
      "params_used": ["profile", "region"]
    },
    {
      "step_id":    "step_2",
      "language":   "python",
      "body":       "import boto3\nsession = boto3.Session(profile_name='${user.profile}', region_name='${user.region}')\nec2 = session.client('ec2')\nec2.revoke_security_group_ingress(GroupId='${sg_id}', IpPermissions=rules)",
      "params_used": ["profile", "region"]
    }
  ],

  "target_language": "python",

  "mandatory_parameters": [
    { "name": "region",  "type": "string",  "description": "AWS region to remediate",  "example": "us-east-1" },
    { "name": "profile", "type": "string",  "description": "AWS CLI profile",           "example": "prod_infra" }
  ],

  "optional_parameters": [
    { "name": "dry_run", "type": "boolean", "description": "Run without making changes", "default": false }
  ],

  "rollback_steps": [
    { "step_id": "rollback_2", "script_id": "step_2_rollback", "reverses": "step_2" }
  ],

  "approach_hints": [
    "prefer least-privilege remediation",
    "avoid broad policy changes"
  ],

  "execution_timeout": 600,
  "rollback_support": true,

  "safety": {
    "safety_class":               "supervised",
    "requires_human_review_for":  ["step_2"]
  },

  "evidence_requirements": [
    "security_group_state_before",
    "security_group_state_after",
    "api_calls_made"
  ]
}

Embedding strategy:

trigger_conditions[] joined as text + name + domain → single dense vector
Phase 2C write Step 1: semantic search → returns playbook_id + confidence
Phase 3 / 5A: direct lookup by playbook_id → full document
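
The two-stage pattern above (semantic search, then lookup by id) can be sketched with plain cosine similarity. Treating the similarity score as the returned confidence is an assumption, as are the helper names and the toy two-dimensional vectors; the real vectors are 768-dimensional and produced by an embedding model not shown here.

```python
import math

def embedding_text(playbook):
    """Join trigger_conditions, name, and domain into the text that gets embedded."""
    return " ".join(playbook["trigger_conditions"] + [playbook["name"], playbook["domain"]])

def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def search(query_vec, playbooks):
    """Return (playbook_id, confidence) for the closest playbook embedding."""
    best = max(playbooks, key=lambda p: cosine(query_vec, p["_embedding"]))
    return best["playbook_id"], cosine(query_vec, best["_embedding"])

# Toy index; the second playbook_id is hypothetical.
playbooks = [
    {"playbook_id": "security.remediate_public_exposure", "_embedding": [1.0, 0.0]},
    {"playbook_id": "cost.rightsize_instances", "_embedding": [0.0, 1.0]},
]

playbook_id, confidence = search([0.9, 0.1], playbooks)
```

The returned `playbook_id` is then used for the exact-match fetch of the full document at Phase 3 / 5A.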

2.10 Tag Store

Internal only: not exposed to the Framework directly.
Phase: Written on every request; read by AA at Phase 7 for guardrail/template resolution.
Collection: escher_tag_store (TTL auto-delete on expires_at)

json
{
  "_id": "req_abc123",
  "expires_at": "2024-01-15T11:00:00Z",

  "request_id":   "req_abc123",
  "tenant_id":    "acme_corp",
  "session_id":   "sess_xyz789",
  "user_id":      "user_456",
  "tier":         "advanced",

  "flow":         "skill",
  "skill_id":     "security.detect_public_ingress",
  "playbook_id":  null,
  "domain":       "security",
  "persona":      "devops",
  "output_type":  "finding",

  "execution_location": "client",
  "owner_agent_id":     "domain.security.exposure"
}

flow values and what they write:

| flow | Fields written |
|---|---|
| skill | skill_id, owner_agent_id, domain, tier, persona, execution_location, output_type |
| read_broad | skill_ids[], domain, tier, persona |
| rag_search | rag_params, provider, domain, persona |
| knowledge | domain, persona (required so AA can resolve guardrails + template at Phase 7) |
| write | playbook_id, domain, tier, persona, tool_ids, param_set |
| code_gen_read / code_gen_write / code_gen_direct | domain, persona |

WARNING

domain and persona must be written for every flow — including knowledge flow. AA uses both at Phase 7 to call /resolve/guardrails and /resolve/template.
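
A write-time check for this rule could look like the sketch below. The per-flow field lists are copied from the table above; `missing_fields` is a hypothetical helper, and treating a `null` value as missing is an assumption.

```python
# Fields each flow is expected to write, per the flow table.
FLOW_FIELDS = {
    "skill": ["skill_id", "owner_agent_id", "domain", "tier", "persona",
              "execution_location", "output_type"],
    "read_broad": ["skill_ids", "domain", "tier", "persona"],
    "rag_search": ["rag_params", "provider", "domain", "persona"],
    "knowledge": ["domain", "persona"],
    "write": ["playbook_id", "domain", "tier", "persona", "tool_ids", "param_set"],
    "code_gen_read": ["domain", "persona"],
    "code_gen_write": ["domain", "persona"],
    "code_gen_direct": ["domain", "persona"],
}

def missing_fields(record):
    """Return the sorted names of expected fields that are absent or null.
    domain and persona are always required, whatever the flow."""
    flow = record.get("flow")
    if flow not in FLOW_FIELDS:
        raise ValueError(f"unknown flow: {flow!r}")
    required = set(FLOW_FIELDS[flow]) | {"domain", "persona"}
    return sorted(name for name in required if record.get(name) is None)

skill_record = {
    "flow": "skill",
    "skill_id": "security.detect_public_ingress",
    "playbook_id": None,
    "domain": "security",
    "tier": "advanced",
    "persona": "devops",
    "output_type": "finding",
    "execution_location": "client",
    "owner_agent_id": "domain.security.exposure",
}

# A knowledge-flow record that forgot persona would be flagged.
incomplete = {"flow": "knowledge", "domain": "security"}
```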


2.11 Agent Registry

Formula variable: Agent manifest + capabilities (Code Agent grounding only)
Phase: Phase 2C/2W miss only (not searched on every prompt)
Collection: escher_agent_registry_global

json
{
  "_id": "domain.security.exposure",
  "_embedding": [0.034, -0.121, 0.095, "...768 dims"],

  "agent_id":     "domain.security.exposure",
  "name":         "exposure",
  "display_name": "Security Exposure Agent",
  "agent_type":   "domain",
  "domain":       "security",
  "tier_support": ["basic", "advanced"],
  "status":       "active",
  "tenant_id":    null,

  "capabilities": [
    "detect public ingress and network exposure risks",
    "detect public storage access and open S3 buckets",
    "rank and prioritize exposure findings by severity",
    "suggest basic remediation paths for exposure risks"
  ],
  "capabilities_embedding": [0.034, -0.121, "...768 dims"],

  "supported_context_types": [
    "public_exposure_inventory",
    "resource_scope_summary",
    "environment_scope"
  ],

  "skill_refs": [
    "security.detect_public_ingress",
    "security.detect_public_storage_access",
    "security.rank_basic_exposure_findings"
  ],

  "composition": {
    "usable_in_profiles":    ["hero_admin", "cspm_deep"],
    "compatible_agents":     ["domain.security.remediation_planning"],
    "conflicts_with_agents": []
  },

  "version":   "0.1.0"
}

When is Agent Registry searched?

Skill collection   → primary routing (every prompt at Phase 2C Step 1)
Agent Registry     → Code Agent grounding (Phase 2C/2W miss only)
                     → provides capabilities + context boundary
                     → Code Agent generates skill or playbook within this boundary

supported_context_types is derived at ADK registration time as the union of all skill supported_context_types across the agent's skill_refs.
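
The union derivation can be sketched as below. The function name and the `skills_by_id` lookup are hypothetical; the field paths follow the Skill schema in §2.2, and preserving first-seen order is a choice made here for stable output, not something the source specifies.

```python
def derive_supported_context_types(skill_refs, skills_by_id):
    """Union of context.supported_context_types across the referenced skills,
    de-duplicated while keeping first-seen order."""
    seen = []
    for skill_id in skill_refs:
        skill = skills_by_id[skill_id]
        for context_type in skill["context"]["supported_context_types"]:
            if context_type not in seen:
                seen.append(context_type)
    return seen

# Context types for the second skill are invented for illustration.
skills_by_id = {
    "security.detect_public_ingress": {
        "context": {"supported_context_types": [
            "public_exposure_inventory", "resource_scope_summary", "environment_scope"]},
    },
    "security.detect_public_storage_access": {
        "context": {"supported_context_types": [
            "public_exposure_inventory", "environment_scope"]},
    },
}

agent_context_types = derive_supported_context_types(
    ["security.detect_public_ingress", "security.detect_public_storage_access"],
    skills_by_id,
)
```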


2.12 Domain Expert Graph

Formula variable: ExpertGraph
Phase: Phase 7, advanced tier only, on-demand (AA decides at runtime)
Graph: escher_domain_expert_graph (per-domain partitions)

Node types

| Node Type | Key fields | Example |
|---|---|---|
| Control | id, domain, framework, title | SOC2 CC6.1 |
| Requirement | id, domain, description | encryption_at_rest |
| EvidenceType | id, domain, collection_method | KMS_rotation_logs |
| ResourceType | id, provider, service | aws.kms.key |
| Risk | id, domain, severity | missing_encryption |
| Remediation | id, domain, approach | enable_kms_encryption |

Edge types

| Edge | From → To | Meaning |
|---|---|---|
| requires | Control → Requirement | Control demands this requirement |
| evidenced_by | Requirement → EvidenceType | Requirement satisfied by this evidence |
| collected_via | EvidenceType → ResourceType | Evidence collected from this resource type |
| is_a | Risk → Risk category | Risk classification |
| remediated_by | Risk → Remediation | Risk is addressed by this remediation |

Example chain

SOC2_CC6.1  ──requires──▶  encryption_at_rest
                ──evidenced_by──▶  KMS_rotation_logs
                    ──collected_via──▶  aws.kms.key

missing_encryption  ──is_a──▶  data_risk
                    ──remediated_by──▶  enable_kms_encryption

Node JSON — Control example

json
{
  "id":        "soc2.cc6.1",
  "type":      "Control",
  "domain":    "security",
  "framework": "SOC2",
  "title":     "Logical and Physical Access Controls CC6.1",
  "tenant_id": null
}

Domain teams author expert_graph/*.yaml in their ADK package. ADK validates referential integrity and writes nodes + edges at adk register. Advanced tier packages only.
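
A referential-integrity check and a simple edge-following traversal over the example chain might look like this. The in-memory tuples stand in for whatever the `expert_graph/*.yaml` files contain (their layout is not documented here), and `dangling_edges` and `follow` are hypothetical helpers, not ADK functions.

```python
# Nodes and edges taken from the example chain above.
NODES = {
    "SOC2_CC6.1", "encryption_at_rest", "KMS_rotation_logs", "aws.kms.key",
    "missing_encryption", "data_risk", "enable_kms_encryption",
}

EDGES = [
    ("SOC2_CC6.1", "requires", "encryption_at_rest"),
    ("encryption_at_rest", "evidenced_by", "KMS_rotation_logs"),
    ("KMS_rotation_logs", "collected_via", "aws.kms.key"),
    ("missing_encryption", "is_a", "data_risk"),
    ("missing_encryption", "remediated_by", "enable_kms_encryption"),
]

def dangling_edges(nodes, edges):
    """Return edges whose source or target is not a declared node."""
    return [edge for edge in edges if edge[0] not in nodes or edge[2] not in nodes]

def follow(edges, start, labels):
    """Follow the given edge labels in order from `start`;
    return the set of nodes reached after the last hop."""
    frontier = {start}
    for label in labels:
        frontier = {dst for src, lab, dst in edges if lab == label and src in frontier}
    return frontier

# Which resource types yield evidence for SOC2 CC6.1?
resources = follow(EDGES, "SOC2_CC6.1", ["requires", "evidenced_by", "collected_via"])
```

Rejecting a package when `dangling_edges` is non-empty is one way registration could enforce that every edge points at a declared node.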


2.13 Domain Registry

Phase: ADK registration only (not called at runtime)
Collection: escher_domain_registry_global

json
{
  "_id": "security",
  "domain_id":    "security",
  "display_name": "Security",
  "description":  "Cloud security posture, exposure detection, and threat analysis",
  "domain_type":  "domain",
  "status":       "active",
  "tenant_id":    null
}

Seeded domains (present at platform init):

| domain_id | display_name | domain_type |
|---|---|---|
| platform | Platform | platform |
| security | Security | domain |
| compliance | Compliance | domain |
| cost | Cost Optimization | domain |
| performance | Performance | domain |
| reliability | Reliability | domain |

ADK validates classification.domain in every agent, skill, tool, and domain lens against this registry before accepting registration.


2.14 Evidence (design intent — not yet implemented)

WARNING

This section documents the target Evidence model decided in the 2026-05-13 design review. The Evidence collection does not exist in the running CE API (context_engine_api.py) today. Storage location, retention, and signing are pending engineering sign-off. Source spec: /playbooks/evidence-reports and audit-2026-05-07/evidence-design-spec.md in the repo.

Purpose: Stores immutable, typed proof for every claim in a Report Item, Finding, Plan, Bundle, or Run. Each record carries native cloud-console deep links so any claim can be verified in the source-of-truth cloud UI with one click.

Collection: escher_evidence_<tenant_id> (proposed)
Mutability: append-only, never overwritten
Indexes: primary on id; secondary on each attached_to.* field, on (tenant_id, captured_at), and on (source.system, source.native_id)

json
{
  "_id": "ev_01HX...",
  "schema_version": 1,
  "_created_at": "2026-05-13T09:14:22Z",

  "type": "cloudtrail_event",
  "captured_at": "2026-05-13T09:14:22Z",
  "estate_view_id": "ev_42",

  "source": {
    "system": "aws.cloudtrail",
    "region": "us-east-1",
    "account_id": "123456789012",
    "tenant_id": null,
    "native_id": "e-1234-5678-90ab",
    "api_call": "lookup-events"
  },

  "console_links": [
    {
      "label": "View event in CloudTrail",
      "href": "https://us-east-1.console.aws.amazon.com/cloudtrail/home?region=us-east-1#/events/e-1234-5678-90ab"
    }
  ],

  "payload": {
    "EventId": "e-1234-5678-90ab",
    "EventName": "UpdateAccessKey",
    "EventTime": "2026-05-13T09:13:00Z",
    "Username": "bot-deploy",
    "...": "..."
  },

  "summary": "IAM user 'bot-deploy' rotated its access key at 09:13 UTC.",

  "attached_to": [
    { "report_item_id": "ri_..." },
    { "finding_id": "fnd_..." }
  ],

  "tenant_id": "acme_corp",
  "redaction_class": "standard"
}

Type enum (closed set — see Evidence & Reports for the full list):

cloudtrail_event · azure_activity_log · config_snapshot · iam_policy · billing_line · metric_point · log_line · deployment_record · pr_record · commit_record · approval_record · api_call · manual_note

console_links[] requirement: at least one entry per record. Generated at capture time by a pure function over (type, source, native_id). AWS Console and Azure Portal URL patterns are documented in Evidence & Reports.
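
A pure link-builder in this spirit might look like the sketch below. Only the CloudTrail URL shape is taken from the example record above; the dispatch table, function names, and error handling are assumptions, and the patterns for the other types live in Evidence & Reports rather than here.

```python
def cloudtrail_event_links(source):
    """Build the CloudTrail deep link from source.region and source.native_id,
    following the URL shape shown in the example record."""
    region = source["region"]
    native_id = source["native_id"]
    href = (
        f"https://{region}.console.aws.amazon.com/cloudtrail/home"
        f"?region={region}#/events/{native_id}"
    )
    return [{"label": "View event in CloudTrail", "href": href}]

# Hypothetical dispatch table: one builder per evidence type.
LINK_BUILDERS = {
    "cloudtrail_event": cloudtrail_event_links,
}

def console_links(evidence_type, source):
    """Pure function: the same (type, source) always yields the same links.
    Raises if no link can be produced, since each record needs at least one."""
    builder = LINK_BUILDERS.get(evidence_type)
    if builder is None:
        raise ValueError(f"no console link builder for type {evidence_type!r}")
    links = builder(source)
    if not links:
        raise ValueError("at least one console link is required per record")
    return links

source = {
    "system": "aws.cloudtrail",
    "region": "us-east-1",
    "account_id": "123456789012",
    "native_id": "e-1234-5678-90ab",
}

links = console_links("cloudtrail_event", source)
```

Keeping the builder free of I/O means links can be regenerated or verified later from the stored record alone.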

Embedding strategy: none. Evidence is retrieved by direct lookup (id) or filter (attached_to.*, type, captured_at range, source.system). Not searched semantically — the citing artifact (Finding, Report Item) already carries the semantic signal.


Appendix — Design decisions

Why Skill is searched first at Phase 2C (not Agent Registry)

Skill already has purpose + description + capability_id embeddings — exactly what Phase 2C needs for routing. Agent Registry is only called on a Skill miss to provide Code Agent with domain boundaries for generation.

Skill collection   → primary routing — Phase 2C Step 1
Agent Registry     → Code Agent grounding — Phase 2C/2W miss only

Why Guardrails and Templates have no embedding

Guardrails are retrieved by exact scope hierarchy (skill → agent → domain → global). Semantic search would add noise and risk returning the wrong guardrails. Templates work the same way: resolution keys on output_type and falls back through skill_id → agent_id → domain by exact match, so there is no ambiguity.

Why Context Builder collection is deprecated

Context collection fields are now embedded directly in the Skill execution_plan.steps and context block. The Skill document is self-contained. Phase 6 reads it directly — no separate CE call, no separate collection.

Before: Skill → context_builder_ids → Phase 6 CE /resolve/context → Context Builder
After:  Skill → execution_plan.steps + context block → Phase 6 direct

Escher — Agentic CloudOps by Tessell