Schema Reference

Developer Reference

This page covers internal implementation details. It is not included in the User Guide.

Complete JSON schema for the 11 Context Engine collections listed in the collection map below (10 live collections plus Evidence, which is design intent only). These are the authoritative data structures populated by adk register and read at runtime by the Analysis Agent, Playbook Agent, and Gateway.

Collection map

#	Collection	Formula variable	Phase accessed	Embedding
2.2	Skill	`Skill(x)`	2C Step 1 (search) + Phase 3 (full)	purpose + description + capabilities + context_descriptions
2.4	Tool	`Tool_readonly` / `Tool_write`	Phase 6 + 5A State 7	none — filter by tool_id / tool_class
2.5	Guardrail	`Guardrail`	Phase 7 + 5A State 7	none — filter only
2.6	Domain Lens	`DomainLens`	Phase 7 (RAG)	source ref only — CE handles RAG internally
2.7	Template	`Template`	Phase 7 (AA resolves once)	none — filter only
2.8	Playbook	`Playbook(p)`	2C write Step 1 + Phase 3 + 5A	trigger_conditions + name
2.10	Tag Store	`tags` (internal)	All phases (request-scoped)	none
2.11	Agent Registry	Agent manifest	Phase 2C/2W miss only	capabilities[]
2.12	Domain Expert Graph	`ExpertGraph`	Phase 7 (advanced, on-demand)	none — graph traversal
2.13	Domain Registry	—	ADK registration only	none — exact match
2.14	Evidence (design intent)	—	All artifact stages (Report Item / Finding / Plan / Run)	none — direct lookup by `id`; indexed by `attached_to.*`

~~Intent (2.1)~~ and ~~Context Builder (2.3)~~ are deprecated. Cloud Knowledge (2.9) is merged into Domain Lens (§2.6) via content_type: "cloud_service".
⚠ Evidence (2.14) is design intent — not yet implemented in the running CE API. See §2.14 below and the Evidence & Reports design spec. The collection name, storage location (CE collection vs separate write-optimised store), and indexing strategy are pending engineering sign-off.

2.2 Skill

Formula variable: Skill(x)
Phase: 2C Step 1 (semantic search) → Phase 3 (full manifest retrieval)
Collection: escher_skills_<tenant_id> → fallback escher_skills_global

json

{
  "_id": "security.detect_public_ingress",
  "_embedding": [0.034, -0.121, 0.095, "...768 dims"],
  "_created_at": "2024-01-15T10:00:00Z",
  "_updated_at": "2024-01-15T10:00:00Z",
  "_version": "0.1.0",

  "skill_id":    "security.detect_public_ingress",
  "name":        "detect_public_ingress",
  "domain":      "security",
  "tier":        "basic",
  "tenant_id":   null,
  "status":      "active",

  "display_name":  "Detect Public Network Exposure",
  "description":   "Identifies EC2 security groups, NACLs, and ALBs with rules that allow unrestricted inbound access (0.0.0.0/0 or ::/0).",
  "purpose":       "find cloud resources exposed to the public internet through misconfigured network controls",

  "capabilities": [
    "detect security groups with 0.0.0.0/0 ingress rules",
    "identify public-facing load balancers without WAF",
    "flag NACLs with overly permissive allow rules",
    "rank exposure findings by severity"
  ],
  "capability_id": "security.network_exposure_detection",

  "context": {
    "context_descriptions": [
      "current security group configurations and their ingress/egress rules",
      "list of internet-facing load balancers and their WAF associations"
    ],
    "supported_context_types": [
      "public_exposure_inventory",
      "resource_scope_summary",
      "environment_scope"
    ]
  },

  "output_type": "finding",

  "execution_plan": {
    "steps": [
      {
        "step_id": "step_1",
        "type":    "read",
        "tool_id": "aws.describe_security_groups",
        "description": "Fetch all security groups in scope",
        "cache_ttl_seconds": 300
      },
      {
        "step_id": "step_2",
        "type":    "read",
        "tool_id": "aws.describe_load_balancers",
        "description": "Fetch load balancers and WAF associations",
        "cache_ttl_seconds": 300
      }
    ]
  },

  "approach_hints": [
    "focus on rules allowing 0.0.0.0/0 — these are definitive exposure",
    "flag ::/0 IPv6 as equivalent severity",
    "exclude expected public-facing resources (CDN origins) if tagged appropriately"
  ]
}

Embedding strategy:

Combined vector from: purpose + description + display_name + capability_id + capabilities[] + context_descriptions[]
Searched at Phase 2C Step 1 (routing decision: narrow / broad / miss)
Full document fetched at Phase 3 using skill_id exact match
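
The field list above can be turned into embedding input with a small helper. This is a minimal sketch, not the CE implementation: the function name `skill_embedding_text` and the newline separator are assumptions, and the embedding model call itself is left out.

```python
from typing import Any


def skill_embedding_text(skill: dict[str, Any]) -> str:
    """Join the embedded Skill fields into one string, in the order listed above.

    Hypothetical helper: the separator and field order are assumptions.
    """
    context = skill.get("context", {})
    parts = [
        skill.get("purpose", ""),
        skill.get("description", ""),
        skill.get("display_name", ""),
        skill.get("capability_id", ""),
        *skill.get("capabilities", []),
        *context.get("context_descriptions", []),
    ]
    # Drop empty fields so missing optional values do not add blank lines.
    return "\n".join(p for p in parts if p)


if __name__ == "__main__":
    sample = {
        "purpose": "find cloud resources exposed to the public internet",
        "display_name": "Detect Public Network Exposure",
        "capabilities": ["rank exposure findings by severity"],
    }
    print(skill_embedding_text(sample))
```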

Tenant scope: tenant → global. Tenant skills take priority; global skills are the fallback.
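
The tenant → global fallback can be sketched as an ordered list of collection names that is tried until a document is found. The helper names and the in-memory `store` dict are assumptions standing in for the real CE storage layer.

```python
from typing import Any, Optional


def candidate_collections(base: str, tenant_id: Optional[str]) -> list[str]:
    """Return collection names in lookup order: tenant first, then global."""
    names = []
    if tenant_id:
        names.append(f"{base}_{tenant_id}")
    names.append(f"{base}_global")
    return names


def find_document(
    base: str,
    doc_id: str,
    tenant_id: Optional[str],
    store: dict[str, dict[str, Any]],
) -> Optional[dict[str, Any]]:
    """Look up doc_id in the tenant collection, falling back to global.

    `store` maps collection name -> {doc_id: document}; it is a stand-in
    for the real backend, which this page does not describe.
    """
    for name in candidate_collections(base, tenant_id):
        doc = store.get(name, {}).get(doc_id)
        if doc is not None:
            return doc
    return None
```

The same ordering would apply to the other collections that declare a `<tenant_id>` → global fallback (Guardrail, Domain Lens, Template, Playbook).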

2.4 Tool

Formula variable: Tool_readonly / Tool_write
Phase: Phase 6 (readonly tools for context) + Phase 5A State 7 (write tools for playbook execution)
Collection: escher_tools_global

json

{
  "_id": "aws.describe_security_groups",
  "_created_at": "2024-01-15T10:00:00Z",

  "tool_id":      "aws.describe_security_groups",
  "name":         "describe_security_groups",
  "provider":     "aws",
  "service":      "ec2",
  "tool_class":   "readonly",
  "domain":       "security",

  "description": "Returns all EC2 security groups in the given region and account, including their ingress and egress rules.",

  "parameters": [
    {
      "name":        "region",
      "type":        "string",
      "required":    true,
      "description": "AWS region code",
      "example":     "us-east-1"
    },
    {
      "name":        "profile",
      "type":        "string",
      "required":    true,
      "description": "AWS CLI profile name from the estate",
      "example":     "prod_infra"
    },
    {
      "name":        "filters",
      "type":        "object",
      "required":    false,
      "description": "Optional filter map passed to describe-security-groups"
    }
  ],

  "output_schema": {
    "type":   "array",
    "items":  { "type": "object" },
    "description": "List of SecurityGroup objects from the AWS EC2 API"
  },

  "execution_target": "client",
  "language":         "python",
  "requires_approval": false
}

tool_class values:

Value	When used	`requires_approval`
`readonly`	Phase 6 — fetches context data	`false`
`write`	Phase 5A State 7 — modifies cloud resources	`true` if `safety_class: supervised`
`internal`	CE internal use only — not exposed to AA	—
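
The table can be read as a small decision function. This sketch assumes that a `write` tool under any `safety_class` other than `supervised` does not require approval, which the table implies but does not state; the function name is hypothetical.

```python
from typing import Optional


def requires_approval(tool_class: str, safety_class: Optional[str] = None) -> Optional[bool]:
    """Derive requires_approval from tool_class, following the table above.

    Returns None for `internal` tools, for which the table lists no value.
    """
    if tool_class == "readonly":
        return False
    if tool_class == "write":
        # Assumption: only the supervised safety class forces approval.
        return safety_class == "supervised"
    if tool_class == "internal":
        return None
    raise ValueError(f"unknown tool_class: {tool_class!r}")
```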

2.5 Guardrail

Formula variable: Guardrail
Phase: Phase 7 (AA) + Phase 5A State 7 (playbook execution gate)
Collection: escher_guardrails_<tenant_id> → fallback escher_guardrails_global

json

{
  "_id": "security.no_public_write_without_approval",
  "_created_at": "2024-01-15T10:00:00Z",

  "guardrail_id": "security.no_public_write_without_approval",
  "domain":       "security",
  "scope":        "domain",
  "applies_to":   ["security.*"],

  "rule": "Any write operation that modifies security group ingress rules, S3 bucket policies, or IAM role trust policies requires explicit human approval before execution.",

  "enforcement": "hard_block",
  "message":     "This change affects your public exposure surface. Review the proposed modification before approving.",

  "exceptions": [
    {
      "condition": "dry_run == true",
      "allow": true
    }
  ],

  "tenant_id": null
}

Scope resolution hierarchy (AA resolves in order, applies all matches):

skill-level  →  agent-level  →  domain-level  →  global

enforcement values:

Value	Effect
`hard_block`	AA cannot proceed; returns guardrail message to user
`soft_warn`	AA includes warning in output but proceeds
`require_approval`	Playbook step pauses for human approval
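
A minimal sketch of how the scope hierarchy and enforcement values might combine. It assumes `applies_to` entries are glob patterns (as `security.*` in the sample suggests) and that a global guardrail uses `*`; it does not evaluate `exceptions`. All function names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Any

# Resolution order from the hierarchy above: most specific scope first.
SCOPE_ORDER = ["skill", "agent", "domain", "global"]


def matching_guardrails(guardrails: list[dict[str, Any]], target_id: str) -> list[dict[str, Any]]:
    """Return every guardrail whose applies_to pattern matches target_id,
    ordered skill -> agent -> domain -> global (all matches are kept)."""
    matches = [
        g for g in guardrails
        if any(fnmatchcase(target_id, pattern) for pattern in g.get("applies_to", []))
    ]
    return sorted(matches, key=lambda g: SCOPE_ORDER.index(g["scope"]))


def is_blocked(guardrails: list[dict[str, Any]], target_id: str) -> bool:
    """True if any matching guardrail is a hard_block, so AA cannot proceed."""
    return any(
        g["enforcement"] == "hard_block"
        for g in matching_guardrails(guardrails, target_id)
    )
```

Because all matches apply, a single `hard_block` at any scope is enough to stop the request in this sketch, regardless of softer guardrails elsewhere in the hierarchy.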

2.6 Domain Lens

Formula variable: DomainLens
Phase: Phase 7 (AA, advanced tier, on-demand RAG)
Collection: escher_domain_lens_<tenant_id> → fallback escher_domain_lens_global

json

{
  "_id": "security.soc2_trust_principles",
  "_created_at": "2024-01-15T10:00:00Z",
  "_content_hash": "sha256:abc123...",

  "lens_id":      "security.soc2_trust_principles",
  "domain":       "security",
  "content_type": "framework",
  "tier":         "advanced",
  "tenant_id":    null,

  "title":       "SOC 2 Trust Services Criteria",
  "description": "AICPA SOC 2 Trust Services Criteria — CC series controls for security, availability, processing integrity, confidentiality, and privacy.",

  "source": {
    "type": "url",
    "ref":  "https://internal-docs.escher.internal/frameworks/soc2-tsc-2017.pdf"
  },

  "tags": ["soc2", "compliance", "audit", "cc-series"]
}

content_type values:

Value	Description
`framework`	Compliance frameworks (SOC2, PCI-DSS, ISO 27001, HIPAA, GDPR)
`cloud_service`	AWS/Azure/GCP service concepts (merged from Cloud Knowledge §2.9)
`policy`	Internal policies and runbooks
`reference`	General domain reference documents

CE owns the RAG infrastructure. At registration time, CE fetches source.ref, chunks the content (512-token overlapping windows), embeds the chunks, and indexes them internally. The Domain Lens document holds only the reference and content_hash for change detection — CE re-indexes automatically if the hash changes on re-registration.
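
The registration-time steps can be sketched as follows. The 512-token window comes from the text; the overlap of 64 tokens is an assumed value (the page does not give one), tokens are represented as a plain list, and the function names are hypothetical.

```python
import hashlib
from typing import Sequence


def content_hash(content: bytes) -> str:
    """Hash fetched content in the `sha256:<hex>` form used by _content_hash."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def needs_reindex(stored_hash: str, content: bytes) -> bool:
    """Re-index only when the content behind source.ref has changed."""
    return stored_hash != content_hash(content)


def chunk_tokens(tokens: Sequence, size: int = 512, overlap: int = 64) -> list:
    """Split tokens into overlapping windows of at most `size` tokens.

    `overlap` is an assumption; consecutive windows share that many tokens.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the content
    return chunks
```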

2.7 Template

Formula variable: Template
Phase: Phase 7 (AA resolves once; UI Agent receives at Phase 9, with no second CE call)
Collection: escher_templates_<tenant_id> → fallback escher_templates_global

json

{
  "_id": "security.finding.public_exposure",
  "_created_at": "2024-01-15T10:00:00Z",

  "template_id":   "security.finding.public_exposure",
  "domain":        "security",
  "output_type":   "finding",
  "skill_id":      "security.detect_public_ingress",
  "tier":          "basic",
  "tenant_id":     null,

  "structure": {
    "title":         "{{resource_type}} publicly exposed via {{rule_type}}",
    "severity":      "{{severity}}",
    "summary":       "{{resource_id}} in {{region}} has an ingress rule allowing {{cidr}} on port {{port}}.",
    "impact":        "{{impact_description}}",
    "recommendation": "Restrict the ingress rule to known CIDR ranges or remove it if not required.",
    "evidence_refs": ["{{evidence_id}}"]
  },

  "rendering_hints": {
    "severity_color": {
      "critical": "#FF0000",
      "high":     "#FA504A",
      "medium":   "#FFA500",
      "low":      "#FFD700",
      "info":     "#808080"
    }
  }
}

Resolution fallback chain:

skill_id match → agent match → domain match → global default

Templates have no embedding — resolution is exact match only.
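
The fallback chain could look like the sketch below. It assumes a template is bound at its most specific populated field (`skill_id`, then a hypothetical `agent_id` field not shown in the sample document, then `domain`), and that a template with none of these set is the global default. Function names are hypothetical.

```python
from typing import Any, Optional

BINDING_FIELDS = ("skill_id", "agent_id", "domain")


def binding_level(template: dict[str, Any]) -> str:
    """Return the most specific field the template is bound to, or 'global'."""
    for field in BINDING_FIELDS:
        if template.get(field):
            return field
    return "global"


def resolve_template(
    templates: list[dict[str, Any]],
    output_type: str,
    skill_id: Optional[str],
    agent_id: Optional[str],
    domain: Optional[str],
) -> Optional[dict[str, Any]]:
    """Walk skill -> agent -> domain -> global and return the first exact match."""
    candidates = [t for t in templates if t.get("output_type") == output_type]
    wanted = {"skill_id": skill_id, "agent_id": agent_id, "domain": domain}
    for level in BINDING_FIELDS:
        for t in candidates:
            if binding_level(t) == level and t[level] == wanted[level]:
                return t
    for t in candidates:
        if binding_level(t) == "global":
            return t
    return None
```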

2.8 Playbook

Formula variable: Playbook(p)
Phase: 2C write Step 1 (semantic search) + Phase 3 + 5A (full manifest)
Collection: escher_playbooks_<tenant_id> → fallback escher_playbooks_global

json

{
  "_id": "security.remediate_public_exposure",
  "_embedding": [0.021, -0.045, "...768 dims"],
  "_version": "0.1.0",

  "playbook_id":   "security.remediate_public_exposure",
  "name":          "Remediate Public Exposure",
  "domain":        "security",
  "tier":          "basic",
  "is_candidate":  false,
  "generated_by":  "registered",
  "tenant_id":     null,

  "trigger_conditions": [
    "public_exposure_finding",
    "open_security_group_detected",
    "internet_facing_resource_unprotected"
  ],

  "context": {
    "context_descriptions": [
      "security group config — ingress rules before write execution",
      "public exposure inventory — resources targeted for remediation"
    ],
    "supported_context_types": [
      "public_exposure_inventory",
      "security_group_config"
    ],
    "declared_context_builders": [
      "security.public_exposure_context"
    ]
  },

  "steps": [
    {
      "step_id":    "step_1",
      "name":       "Fetch current exposure surface",
      "type":       "readonly",
      "tool_id":    "aws.describe_public_ingress_surface",
      "required":   true,
      "on_failure": "stop"
    },
    {
      "step_id":    "step_2",
      "name":       "Lock overly permissive security groups",
      "type":       "write",
      "tool_id":    "aws.lock_security_group",
      "required":   true,
      "on_failure": "rollback"
    }
  ],

  "scripts": [
    {
      "step_id":    "step_1",
      "language":   "python",
      "body":       "import boto3\nsession = boto3.Session(profile_name='${user.profile}', region_name='${user.region}')\nec2 = session.client('ec2')\nreturn ec2.describe_security_groups()",
      "params_used": ["profile", "region"]
    },
    {
      "step_id":    "step_2",
      "language":   "python",
      "body":       "import boto3\nsession = boto3.Session(profile_name='${user.profile}', region_name='${user.region}')\nec2 = session.client('ec2')\nec2.revoke_security_group_ingress(GroupId='${sg_id}', IpPermissions=rules)",
      "params_used": ["profile", "region"]
    }
  ],

  "target_language": "python",

  "mandatory_parameters": [
    { "name": "region",  "type": "string",  "description": "AWS region to remediate",  "example": "us-east-1" },
    { "name": "profile", "type": "string",  "description": "AWS CLI profile",           "example": "prod_infra" }
  ],

  "optional_parameters": [
    { "name": "dry_run", "type": "boolean", "description": "Run without making changes", "default": false }
  ],

  "rollback_steps": [
    { "step_id": "rollback_2", "script_id": "step_2_rollback", "reverses": "step_2" }
  ],

  "approach_hints": [
    "prefer least-privilege remediation",
    "avoid broad policy changes"
  ],

  "execution_timeout": 600,
  "rollback_support": true,

  "safety": {
    "safety_class":               "supervised",
    "requires_human_review_for":  ["step_2"]
  },

  "evidence_requirements": [
    "security_group_state_before",
    "security_group_state_after",
    "api_calls_made"
  ]
}

Embedding strategy:

trigger_conditions[] joined as text + name + domain → single dense vector
Phase 2C write Step 1: semantic search → returns playbook_id + confidence
Phase 3 / 5A: direct lookup by playbook_id → full document
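
The two-stage access pattern (semantic search, then direct lookup) can be illustrated with toy vectors. This is a sketch only: it assumes cosine similarity is the ranking metric and doubles as the returned confidence, uses a brute-force scan instead of a vector index, and all function names are hypothetical.

```python
import math
from typing import Any, Sequence


def playbook_embedding_text(playbook: dict[str, Any]) -> str:
    """Join trigger_conditions, name and domain into the text that is embedded."""
    parts = [*playbook.get("trigger_conditions", []), playbook.get("name", ""), playbook.get("domain", "")]
    return " ".join(p for p in parts if p)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_playbooks(query_vec: Sequence[float], playbooks: list[dict[str, Any]]) -> tuple[str, float]:
    """Stage 1: return (playbook_id, confidence) for the closest playbook."""
    best = max(playbooks, key=lambda p: cosine(query_vec, p["_embedding"]))
    return best["playbook_id"], cosine(query_vec, best["_embedding"])


def get_playbook(playbook_id: str, playbooks: list[dict[str, Any]]) -> dict[str, Any]:
    """Stage 2: direct lookup of the full document by playbook_id."""
    return next(p for p in playbooks if p["playbook_id"] == playbook_id)
```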

2.10 Tag Store

Internal only; not exposed to the Framework directly.
Phase: Written on every request; read by AA at Phase 7 for guardrail/template resolution.
Collection: escher_tag_store (TTL auto-delete on expires_at)

json

{
  "_id": "req_abc123",
  "expires_at": "2024-01-15T11:00:00Z",

  "request_id":   "req_abc123",
  "tenant_id":    "acme_corp",
  "session_id":   "sess_xyz789",
  "user_id":      "user_456",
  "tier":         "advanced",

  "flow":         "skill",
  "skill_id":     "security.detect_public_ingress",
  "playbook_id":  null,
  "domain":       "security",
  "persona":      "devops",
  "output_type":  "finding",

  "execution_location": "client",
  "owner_agent_id":     "domain.security.exposure"
}

flow values and what they write:

flow	Fields written
`skill`	skill_id, owner_agent_id, domain, tier, persona, execution_location, output_type
`read_broad`	skill_ids[], domain, tier, persona
`rag_search`	rag_params, provider, domain, persona
`knowledge`	domain, persona — required so AA can resolve guardrails + template at Phase 7
`write`	playbook_id, domain, tier, persona, tool_ids, param_set
`code_gen_read` / `code_gen_write` / `code_gen_direct`	domain, persona

WARNING

domain and persona must be written for every flow — including knowledge flow. AA uses both at Phase 7 to call /resolve/guardrails and /resolve/template.
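
A write-time check for the rule above could look like this sketch. The field sets are copied from the flow table; the function name and the choice to treat `null` as missing are assumptions.

```python
from typing import Any

_CODE_GEN = {"domain", "persona"}

# Fields each flow is expected to write, taken from the flow table above.
FLOW_FIELDS: dict[str, set[str]] = {
    "skill": {"skill_id", "owner_agent_id", "domain", "tier", "persona",
              "execution_location", "output_type"},
    "read_broad": {"skill_ids", "domain", "tier", "persona"},
    "rag_search": {"rag_params", "provider", "domain", "persona"},
    "knowledge": {"domain", "persona"},
    "write": {"playbook_id", "domain", "tier", "persona", "tool_ids", "param_set"},
    "code_gen_read": _CODE_GEN,
    "code_gen_write": _CODE_GEN,
    "code_gen_direct": _CODE_GEN,
}


def missing_tag_fields(tag: dict[str, Any]) -> list[str]:
    """Return the sorted list of expected fields that are absent or null."""
    flow = tag.get("flow")
    if flow not in FLOW_FIELDS:
        raise ValueError(f"unknown flow: {flow!r}")
    return sorted(f for f in FLOW_FIELDS[flow] if tag.get(f) is None)
```

Every entry in `FLOW_FIELDS` contains both `domain` and `persona`, so a tag that omits either one is reported for any flow.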

2.11 Agent Registry

Formula variable: Agent manifest + capabilities (Code Agent grounding only)
Phase: Phase 2C miss + 2W miss only (not searched on every prompt)
Collection: escher_agent_registry_global

json

{
  "_id": "domain.security.exposure",
  "_embedding": [0.034, -0.121, 0.095, "...768 dims"],

  "agent_id":     "domain.security.exposure",
  "name":         "exposure",
  "display_name": "Security Exposure Agent",
  "agent_type":   "domain",
  "domain":       "security",
  "tier_support": ["basic", "advanced"],
  "status":       "active",
  "tenant_id":    null,

  "capabilities": [
    "detect public ingress and network exposure risks",
    "detect public storage access and open S3 buckets",
    "rank and prioritize exposure findings by severity",
    "suggest basic remediation paths for exposure risks"
  ],
  "capabilities_embedding": [0.034, -0.121, "...768 dims"],

  "supported_context_types": [
    "public_exposure_inventory",
    "resource_scope_summary",
    "environment_scope"
  ],

  "skill_refs": [
    "security.detect_public_ingress",
    "security.detect_public_storage_access",
    "security.rank_basic_exposure_findings"
  ],

  "composition": {
    "usable_in_profiles":    ["hero_admin", "cspm_deep"],
    "compatible_agents":     ["domain.security.remediation_planning"],
    "conflicts_with_agents": []
  },

  "version":   "0.1.0"
}

When is Agent Registry searched?

Skill collection   → primary routing (every prompt at Phase 2C Step 1)
Agent Registry     → Code Agent grounding (Phase 2C/2W miss only)
                     → provides capabilities + context boundary
                     → Code Agent generates skill or playbook within this boundary

supported_context_types is derived at ADK registration time as the union of all skill supported_context_types across the agent's skill_refs.
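
The derivation is a plain set union. A minimal sketch, assuming skills are available as a dict keyed by `skill_id` and that first-seen order is preserved for readability (the page does not specify ordering); the function name is hypothetical.

```python
from typing import Any


def derive_supported_context_types(
    skill_refs: list[str],
    skills_by_id: dict[str, dict[str, Any]],
) -> list[str]:
    """Union of context.supported_context_types over the agent's skill_refs."""
    seen: list[str] = []
    for skill_id in skill_refs:
        context = skills_by_id[skill_id].get("context", {})
        for context_type in context.get("supported_context_types", []):
            if context_type not in seen:  # keep first occurrence only
                seen.append(context_type)
    return seen
```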

2.12 Domain Expert Graph

Formula variable: ExpertGraph
Phase: Phase 7, advanced tier only, on-demand (AA decides at runtime)
Graph: escher_domain_expert_graph (per-domain partitions)

Node types

Node Type	Key fields	Example
`Control`	`id`, `domain`, `framework`, `title`	SOC2 CC6.1
`Requirement`	`id`, `domain`, `description`	encryption_at_rest
`EvidenceType`	`id`, `domain`, `collection_method`	KMS_rotation_logs
`ResourceType`	`id`, `provider`, `service`	aws.kms.key
`Risk`	`id`, `domain`, `severity`	missing_encryption
`Remediation`	`id`, `domain`, `approach`	enable_kms_encryption

Edge types

Edge	From → To	Meaning
`requires`	Control → Requirement	Control demands this requirement
`evidenced_by`	Requirement → EvidenceType	Requirement satisfied by this evidence
`collected_via`	EvidenceType → ResourceType	Evidence collected from this resource type
`is_a`	Risk → Risk category	Risk classification
`remediated_by`	Risk → Remediation	Risk is addressed by this remediation

Example chain

SOC2_CC6.1  ──requires──▶  encryption_at_rest
                ──evidenced_by──▶  KMS_rotation_logs
                    ──collected_via──▶  aws.kms.key

missing_encryption  ──is_a──▶  data_risk
                    ──remediated_by──▶  enable_kms_encryption
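
The example chain can be walked with a few lines of code. This sketch stores edges as `(from, edge_type, to)` tuples in memory; the real graph store and its query interface are not described on this page, so the representation and the `follow` helper are assumptions.

```python
# Edges from the example chain above, as (from_node, edge_type, to_node).
EDGES = [
    ("SOC2_CC6.1", "requires", "encryption_at_rest"),
    ("encryption_at_rest", "evidenced_by", "KMS_rotation_logs"),
    ("KMS_rotation_logs", "collected_via", "aws.kms.key"),
    ("missing_encryption", "is_a", "data_risk"),
    ("missing_encryption", "remediated_by", "enable_kms_encryption"),
]


def follow(start: str, edge_types: list[str], edges=EDGES) -> set[str]:
    """Follow the given edge types in order and return the nodes reached."""
    frontier = {start}
    for edge_type in edge_types:
        frontier = {
            dst for src, etype, dst in edges
            if src in frontier and etype == edge_type
        }
    return frontier


if __name__ == "__main__":
    # Which resource types provide evidence for SOC2 CC6.1?
    print(follow("SOC2_CC6.1", ["requires", "evidenced_by", "collected_via"]))
```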

Node JSON — Control example

json

{
  "id":        "soc2.cc6.1",
  "type":      "Control",
  "domain":    "security",
  "framework": "SOC2",
  "title":     "Logical and Physical Access Controls CC6.1",
  "tenant_id": null
}

Domain teams author expert_graph/*.yaml in their ADK package. ADK validates referential integrity and writes nodes + edges at adk register. Advanced tier packages only.

2.13 Domain Registry

Phase: ADK registration only (not called at runtime)
Collection: escher_domain_registry_global

json

{
  "_id": "security",
  "domain_id":    "security",
  "display_name": "Security",
  "description":  "Cloud security posture, exposure detection, and threat analysis",
  "domain_type":  "domain",
  "status":       "active",
  "tenant_id":    null
}

Seeded domains (present at platform init):

domain_id	display_name	domain_type
`platform`	Platform	`platform`
`security`	Security	`domain`
`compliance`	Compliance	`domain`
`cost`	Cost Optimization	`domain`
`performance`	Performance	`domain`
`reliability`	Reliability	`domain`

ADK validates classification.domain in every agent, skill, tool, and domain lens against this registry before accepting registration.
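
A minimal sketch of that registration check, using the seeded domains as the registry. The manifest shape (`classification.domain` as a nested key) follows the sentence above; the function name and error type are assumptions, not ADK's actual behaviour.

```python
from typing import Any

# Seeded domain_ids from the table above.
SEEDED_DOMAINS = frozenset(
    {"platform", "security", "compliance", "cost", "performance", "reliability"}
)


def validate_domain(manifest: dict[str, Any], registry=SEEDED_DOMAINS) -> str:
    """Return classification.domain if it is registered, else raise ValueError."""
    domain = manifest.get("classification", {}).get("domain")
    if domain not in registry:
        raise ValueError(f"unknown domain {domain!r}; registration rejected")
    return domain
```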

2.14 Evidence (design intent — not yet implemented)

WARNING

This section documents the target Evidence model decided in the 2026-05-13 design review. The Evidence collection does not exist in the running CE API (context_engine_api.py) today. Storage location, retention, and signing are pending engineering sign-off. Source spec: /playbooks/evidence-reports and audit-2026-05-07/evidence-design-spec.md in the repo.

Purpose: Stores immutable, typed proof for every claim in a Report Item, Finding, Plan, Bundle, or Run. Each record carries native cloud-console deep links so any claim can be verified in the source-of-truth cloud UI with one click.

Collection: escher_evidence_<tenant_id> (proposed)
Mutability: append-only, never overwritten
Indexes: primary on id, secondary on each attached_to.* field, on (tenant_id, captured_at), on (source.system, source.native_id)

json

{
  "_id": "ev_01HX...",
  "schema_version": 1,
  "_created_at": "2026-05-13T09:14:22Z",

  "type": "cloudtrail_event",
  "captured_at": "2026-05-13T09:14:22Z",
  "estate_view_id": "ev_42",

  "source": {
    "system": "aws.cloudtrail",
    "region": "us-east-1",
    "account_id": "123456789012",
    "tenant_id": null,
    "native_id": "e-1234-5678-90ab",
    "api_call": "lookup-events"
  },

  "console_links": [
    {
      "label": "View event in CloudTrail",
      "href": "https://us-east-1.console.aws.amazon.com/cloudtrail/home?region=us-east-1#/events/e-1234-5678-90ab"
    }
  ],

  "payload": {
    "EventId": "e-1234-5678-90ab",
    "EventName": "UpdateAccessKey",
    "EventTime": "2026-05-13T09:13:00Z",
    "Username": "bot-deploy",
    "...": "..."
  },

  "summary": "IAM user 'bot-deploy' rotated its access key at 09:13 UTC.",

  "attached_to": [
    { "report_item_id": "ri_..." },
    { "finding_id": "fnd_..." }
  ],

  "tenant_id": "acme_corp",
  "redaction_class": "standard"
}

Type enum (closed set — see Evidence & Reports for the full list):

cloudtrail_event · azure_activity_log · config_snapshot · iam_policy · billing_line · metric_point · log_line · deployment_record · pr_record · commit_record · approval_record · api_call · manual_note

console_links[] requirement: at least one entry per record. Generated at capture time by a pure function over (type, source, native_id). AWS Console and Azure Portal URL patterns are documented in Evidence & Reports.
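
A sketch of such a pure function for one type. The CloudTrail URL pattern is inferred solely from the sample record above, not from the Evidence & Reports spec; the function name is hypothetical and every other type is left unimplemented.

```python
from typing import Any


def console_links(evidence_type: str, source: dict[str, Any], native_id: str) -> list[dict[str, str]]:
    """Build console_links[] from (type, source, native_id) with no side effects."""
    if evidence_type == "cloudtrail_event" and source.get("system") == "aws.cloudtrail":
        region = source["region"]
        # Pattern inferred from the sample record; treat it as an assumption.
        href = (
            f"https://{region}.console.aws.amazon.com/cloudtrail/home"
            f"?region={region}#/events/{native_id}"
        )
        return [{"label": "View event in CloudTrail", "href": href}]
    raise NotImplementedError(f"no console link pattern for type {evidence_type!r}")
```

Because the function depends only on its arguments, the same record always yields the same links, which fits the append-only, never-overwritten model.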

Embedding strategy: none. Evidence is retrieved by direct lookup (id) or filter (attached_to.*, type, captured_at range, source.system). Not searched semantically — the citing artifact (Finding, Report Item) already carries the semantic signal.

Appendix — Design decisions

Why Skill is searched first at Phase 2C (not Agent Registry)

Skill already has purpose + description + capability_id embeddings — exactly what Phase 2C needs for routing. Agent Registry is only called on a Skill miss to provide Code Agent with domain boundaries for generation.

Skill collection   → primary routing — Phase 2C Step 1
Agent Registry     → Code Agent grounding — Phase 2C/2W miss only

Why Guardrails and Templates have no embedding

Guardrails are retrieved by exact scope hierarchy (skill → agent → domain → global). Semantic search would add noise and risk returning the wrong guardrails. Templates work the same way: resolution keys on output_type and walks skill_id → agent_id → domain by exact match, so there is no ambiguity.

Why Context Builder collection is deprecated

Context collection fields are now embedded directly in the Skill execution_plan.steps and context block. The Skill document is self-contained. Phase 6 reads it directly — no separate CE call, no separate collection.

Before: Skill → context_builder_ids → Phase 6 CE /resolve/context → Context Builder
After:  Skill → execution_plan.steps + context block → Phase 6 direct

Next steps

Context Engine — REST API that serves these collections
agent.yaml Reference — How to author agent packages
ADK Reference — adk register populates these collections
