CI/CD

Developer Reference

This page covers internal implementation details. It is not included in the User Guide.

Escher uses v3-escher-deployment as the central CI/CD orchestration layer. All 63 repos in escher-dbai are built and deployed through a single shared pipeline using GitHub Actions with OIDC-based AWS authentication.


Pipeline overview

Git push / tag push
        │
        ▼
GitHub Actions  (v3-escher-deployment/.github/workflows/)
        │
        ├─ detect changed services ──▶ matrix build (parallel)
        │                                  │
        │                          build → test → ECR push
        │                                  │
        ├─ tag convention ─────────────────┼──────────────────────────────────
        │   v*-dev  → deploy-dev           │
        │   v*-stg  → deploy-staging       │
        │   v*      → deploy-prod          │
        │                                  ▼
        └─ deploy ──────────────▶  ECS update-service (per component)
                                           │
                                           ▼
                                   Slack notification
                                  (success / failure)

Repository structure

v3-escher-deployment/
  .github/workflows/
    deploy-dev.yml         ← tag v*-dev triggers this
    deploy-staging.yml     ← tag v*-stg triggers this
    deploy-prod.yml        ← tag v* (no suffix) triggers this
    build-matrix.yml       ← reusable — parallel image builds
    e2e-integration.yml    ← runs v3-testing-framework after deploy
    release-notes.yml      ← runs v2-release-notes on prod deploy
  deploy.py                ← Python orchestrator (called by workflows)
  services/                ← per-service configs (image names, health paths)
  environments/            ← dev / staging / prod variable files
  scripts/
    build.sh               ← docker buildx build + ECR push
    migrate.sh             ← Asset Store schema migration
    smoke-test.sh          ← post-deploy health checks

Branch and tag conventions

| Tag pattern | Deploys to | Example |
|---|---|---|
| v*-dev | Development | v2.16.0-dev |
| v*-stg | Staging | v2.16.0-stg |
| v* (no suffix) | Production | v2.16.0 |
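
One detail worth keeping in mind: in a GitHub Actions tag filter, the glob `v*` also matches `v2.16.0-dev` and `v2.16.0-stg`, so the production trigger has to exclude the suffixed tags in some way (for example with negative patterns such as `!v*-dev`, or with an explicit check). How deploy-prod.yml does this is not shown on this page. The sketch below illustrates one such explicit check. The function name `environment_for_tag` and the strict `vMAJOR.MINOR.PATCH` tag shape are assumptions for illustration, not part of the pipeline.

```python
import re
from typing import Optional

# Assumed tag shape: v<major>.<minor>.<patch> with an optional -dev or -stg suffix.
_TAG_RE = re.compile(r"^v\d+\.\d+\.\d+(?:-(dev|stg))?$")

# Maps the captured suffix (or None when there is no suffix) to an environment name.
_SUFFIX_TO_ENV = {"dev": "dev", "stg": "staging", None: "prod"}


def environment_for_tag(tag: str) -> Optional[str]:
    """Return 'dev', 'staging', or 'prod' for a release tag, or None if the tag is not recognized."""
    match = _TAG_RE.match(tag)
    if match is None:
        return None
    return _SUFFIX_TO_ENV[match.group(1)]


if __name__ == "__main__":
    for example in ("v2.16.0-dev", "v2.16.0-stg", "v2.16.0", "v2.16.0-rc1"):
        print(example, "->", environment_for_tag(example))
```

Returning None for unrecognized tags lets a caller fail fast instead of silently deploying an unexpected tag to production.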

Branch naming for component repos:

| Branch | Purpose |
|---|---|
| main | Stable production code |
| develop | Integration branch |
| feature/TICKET-desc | Feature work |
| fix/TICKET-desc | Bug fixes |

AWS authentication (OIDC)

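The pipeline uses GitHub Actions OIDC, so no long-lived AWS credentials are stored in GitHub Secrets. Each workflow assumes a deployment role through a trust policy that validates the GitHub token's audience (aud) and subject (sub) claims. The policy shown below pins sub to the v3-escher-deployment repository and accepts any ref or environment within it. Jobs that assume the role need the id-token: write permission so that GitHub issues the OIDC token.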

yaml
# Shared OIDC auth step (in every deploy workflow)
- name: Configure AWS credentials
  uses: aws-actions/configure-aws-credentials@v4
  with:
    role-to-assume: arn:aws:iam::ACCOUNT:role/escher-github-deploy
    role-session-name: github-${{ github.run_id }}
    aws-region: us-west-1

IAM trust policy on the escher-github-deploy role:

json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::ACCOUNT:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
        },
        "StringLike": {
          "token.actions.githubusercontent.com:sub": "repo:escher-dbai/v3-escher-deployment:*"
        }
      }
    }
  ]
}
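
To make the effect of the StringLike condition concrete, here is a minimal sketch that approximates IAM wildcard matching (`*` for any run of characters, `?` for exactly one) and applies the policy's pattern to a few example sub claims. The helper `iam_string_like` is hypothetical and only mimics the matching rule; AWS evaluates the real policy. The example claims follow the formats GitHub documents for ref-based and environment-based jobs.

```python
import re


def iam_string_like(pattern: str, value: str) -> bool:
    """Approximate IAM StringLike: '*' matches any run of characters, '?' matches exactly one."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


# Pattern copied from the trust policy above.
TRUST_PATTERN = "repo:escher-dbai/v3-escher-deployment:*"

# Illustrative sub claims (the ref names are examples, not real refs).
tag_sub = "repo:escher-dbai/v3-escher-deployment:ref:refs/tags/v2.16.0-dev"
environment_sub = "repo:escher-dbai/v3-escher-deployment:environment:dev"
other_repo_sub = "repo:escher-dbai/v2-gateway-service-java:ref:refs/heads/main"

if __name__ == "__main__":
    for sub in (tag_sub, environment_sub, other_repo_sub):
        print(sub, "->", iam_string_like(TRUST_PATTERN, sub))
```

Because the pattern ends in `:*`, any branch, tag, or environment in v3-escher-deployment can assume the role, while workflows in other repositories cannot. A tighter pattern (for example one restricted to `environment:prod`) would narrow this further.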

Parallel matrix build

Component images are built in parallel using a GitHub Actions matrix strategy. Each matrix entry builds, tests, and pushes its own Docker image independently.

yaml
# build-matrix.yml (excerpt)
strategy:
  matrix:
    service:
      - name: gateway
        repo: v2-gateway-service-java
        dockerfile: Dockerfile
        ecr_repo: escher-gateway
      - name: analysis-agent
        repo: v2-analysis-agent-go
        dockerfile: Dockerfile
        ecr_repo: escher-analysis-agent
      - name: playbook-agent
        repo: v2-playbook-agent-go
        dockerfile: Dockerfile
        ecr_repo: escher-playbook-agent
      - name: context-engine
        repo: v4-context-engine
        dockerfile: Dockerfile
        ecr_repo: escher-context-engine
      - name: asset-store
        repo: v4-asset-store-go
        dockerfile: Dockerfile
        ecr_repo: escher-asset-store
      - name: autoresolver
        repo: v2-autoresolver-java
        dockerfile: Dockerfile
        ecr_repo: escher-autoresolver
      - name: integrations-agent
        repo: v4-integrations-agent-go
        dockerfile: Dockerfile
        ecr_repo: escher-integrations-agent
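
As a rough illustration of where each matrix entry's image ends up, the sketch below builds the ECR image URI from an `ecr_repo` value and a tag. The account ID is a placeholder, and tagging images with the Git tag is an assumption based on the `tag` input passed to build-matrix.yml; scripts/build.sh may name images differently.

```python
def ecr_image_uri(account_id: str, region: str, ecr_repo: str, tag: str) -> str:
    """Build a private ECR image URI in the standard registry/repository:tag form."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{ecr_repo}:{tag}"


# ecr_repo values copied from the matrix excerpt above.
ECR_REPOS = [
    "escher-gateway",
    "escher-analysis-agent",
    "escher-playbook-agent",
    "escher-context-engine",
    "escher-asset-store",
    "escher-autoresolver",
    "escher-integrations-agent",
]

if __name__ == "__main__":
    # "123456789012" is a placeholder account ID, not a real one.
    for repo in ECR_REPOS:
        print(ecr_image_uri("123456789012", "us-west-1", repo, "v2.16.0-dev"))
```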

Skip toggles allow deploying a subset of services:

yaml
# Input params on workflow_dispatch (manual trigger)
inputs:
  skip_gateway:          { type: boolean, default: false }
  skip_analysis_agent:   { type: boolean, default: false }
  skip_playbook_agent:   { type: boolean, default: false }
  skip_context_engine:   { type: boolean, default: false }
  skip_asset_store:      { type: boolean, default: false }
  skip_autoresolver:     { type: boolean, default: false }
  skip_integrations:     { type: boolean, default: false }
  skip_e2e:              { type: boolean, default: false }
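
The following sketch shows one way the toggles could be turned into the list of services to deploy. It is an assumption about how a consumer such as deploy.py might interpret the inputs, not the actual implementation. Note that the input for integrations-agent is named `skip_integrations` (without an `_agent` suffix), so a plain name transformation would miss it; the explicit mapping below avoids that.

```python
from typing import Dict, List, Mapping

# Service names copied from the build matrix.
SERVICES: List[str] = [
    "gateway",
    "analysis-agent",
    "playbook-agent",
    "context-engine",
    "asset-store",
    "autoresolver",
    "integrations-agent",
]

# Explicit mapping from service name to its workflow_dispatch input.
SKIP_INPUT_FOR_SERVICE: Dict[str, str] = {
    "gateway": "skip_gateway",
    "analysis-agent": "skip_analysis_agent",
    "playbook-agent": "skip_playbook_agent",
    "context-engine": "skip_context_engine",
    "asset-store": "skip_asset_store",
    "autoresolver": "skip_autoresolver",
    "integrations-agent": "skip_integrations",
}


def services_to_deploy(inputs: Mapping[str, bool]) -> List[str]:
    """Return the services whose skip toggle is absent or false, in matrix order."""
    return [
        service
        for service in SERVICES
        if not inputs.get(SKIP_INPUT_FOR_SERVICE[service], False)
    ]


if __name__ == "__main__":
    print(services_to_deploy({"skip_gateway": True, "skip_integrations": True}))
```

Treating a missing input as false matches the defaults above, so a tag push (which carries no inputs) deploys every service.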

Deploy dev workflow (annotated)

yaml
# .github/workflows/deploy-dev.yml
name: Deploy — Dev

on:
  push:
    tags: ['v*-dev']
  workflow_dispatch:
    inputs:
      # skip_* toggles as above

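permissions:
  id-token: write   # lets jobs request the OIDC token used to assume the AWS role
  contents: read    # lets jobs check out this repository

jobs: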
  build:
    uses: ./.github/workflows/build-matrix.yml   # parallel image builds
    with:
      environment: dev
      tag: ${{ github.ref_name }}
    secrets: inherit

  deploy:
    needs: build
    runs-on: ubuntu-latest
    environment: dev
    steps:
      - uses: actions/checkout@v4   # deploy.py and scripts/ live in this repository

      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ vars.DEPLOY_ROLE_ARN }}
          aws-region: us-west-1

      - name: Deploy services (ordered)
        run: python deploy.py --env dev --tag ${{ github.ref_name }}
        # deploy.py deploys in dependency order:
        # 1. asset-store  (no deps)
        # 2. context-engine  (needs asset-store healthy)
        # 3. analysis-agent, playbook-agent, integrations-agent  (parallel)
        # 4. gateway  (needs agents healthy)
        # 5. autoresolver  (independent)

      - name: Smoke test
        run: ./scripts/smoke-test.sh dev

      - name: Notify Slack
        if: always()
        uses: slackapi/slack-github-action@v1.26.0
        with:
          payload: |
            {
              "text": "${{ job.status == 'success' && '✅' || '❌' }} Deploy dev ${{ github.ref_name }} — ${{ job.status }}",
              "channel": "#escher-deployments"
            }
        env:
          SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}
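
The dependency order described in the workflow comments can be sketched as a topological sort that groups services into waves, where every service in a wave can be deployed in parallel. This is a minimal illustration using Python's standard `graphlib` module, not the contents of deploy.py. The edges from the three agents to context-engine are an assumption; the comments only state that context-engine needs asset-store and that gateway needs the agents.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Each service maps to the set of services that must be healthy before it deploys.
DEPENDS_ON: Dict[str, Set[str]] = {
    "asset-store": set(),
    "context-engine": {"asset-store"},
    "analysis-agent": {"context-engine"},      # assumed dependency
    "playbook-agent": {"context-engine"},      # assumed dependency
    "integrations-agent": {"context-engine"},  # assumed dependency
    "gateway": {"analysis-agent", "playbook-agent", "integrations-agent"},
    "autoresolver": set(),
}


def deployment_waves(depends_on: Dict[str, Set[str]]) -> List[List[str]]:
    """Group services into waves; every service in a wave has all of its dependencies in earlier waves."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(deployment_waves(DEPENDS_ON), start=1):
        print(number, wave)
```

With these edges, autoresolver lands in the first wave next to asset-store rather than last. Since it is independent, its position does not affect correctness, which is consistent with the comments listing it as a separate final step.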

E2E integration

After every successful deploy to dev or staging, the pipeline triggers the v3 testing framework:

yaml
# e2e-integration.yml
- name: Run E2E suite
  run: |
    cd v3-testing-framework
    python run_tests.py --env ${{ inputs.environment }} --suite smoke

E2E results are posted to the #escher-test-results Slack channel.


Release notes generation

On production deploys, the pipeline runs v2-release-notes to generate and publish release notes:

yaml
- name: Generate release notes
  run: |
    cd v2-release-notes
    python generate_release_notes.py \
      --repos gateway,analysis-agent,context-engine,asset-store \
      --from-tag ${{ env.PREV_TAG }} \
      --to-tag ${{ github.ref_name }} \
      --publish-slack

The tool uses Claude Sonnet 4.5 to summarize commits across repos into human-readable release notes, then posts them to #escher-releases using Slack Block Kit.
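
The step above reads `PREV_TAG` from the environment, but this page does not show how it is set. One plausible approach, sketched here as an assumption, is to pick the highest unsuffixed version tag that is lower than the tag being released. The function name `previous_prod_tag` is hypothetical.

```python
import re
from typing import Iterable, Optional, Tuple

# Production tags only: vMAJOR.MINOR.PATCH with no -dev or -stg suffix.
_PROD_TAG_RE = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def _version_key(tag: str) -> Optional[Tuple[int, ...]]:
    """Parse a production tag into a tuple of ints, or return None for any other tag."""
    match = _PROD_TAG_RE.match(tag)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def previous_prod_tag(tags: Iterable[str], current: str) -> Optional[str]:
    """Return the highest production tag below `current`, or None if there is none."""
    current_key = _version_key(current)
    if current_key is None:
        raise ValueError(f"not a production tag: {current!r}")
    best_key = None
    best_tag = None
    for tag in tags:
        key = _version_key(tag)
        if key is None or key >= current_key:
            continue  # skip suffixed tags and anything not older than current
        if best_key is None or key > best_key:
            best_key, best_tag = key, tag
    return best_tag


if __name__ == "__main__":
    all_tags = ["v2.9.0", "v2.15.0", "v2.15.1", "v2.16.0-dev", "v2.16.0-stg", "v2.16.0"]
    print(previous_prod_tag(all_tags, "v2.16.0"))
```

Comparing parsed integer tuples rather than raw strings matters here: as strings, `v2.9.0` sorts after `v2.15.1`, which would select the wrong starting point for the release notes.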


Cutting a release

bash
# 1. Merge all feature branches to develop
# 2. Test on staging (v*-stg tag)
git tag v2.16.0-stg
git push origin v2.16.0-stg

# 3. After staging validation, cut production release
git tag v2.16.0
git push origin v2.16.0

# Pipeline automatically:
# - Builds all images
# - Deploys in dependency order
# - Runs E2E smoke suite
# - Generates + publishes release notes
# - Posts to #escher-releases

Environment variable management

| Variable type | Where stored |
|---|---|
| AWS resource ARNs | GitHub Environment variables (per env) |
| Secrets (API keys, tokens) | AWS Secrets Manager, injected through the ECS task definition's secrets field (valueFrom) |
| Non-secret config | environments/dev.env, staging.env, prod.env in v3-escher-deployment |

The only credential kept in GitHub Secrets is SLACK_BOT_TOKEN, which is used for notifications. All runtime secrets are fetched from AWS Secrets Manager at task start.
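
For reference, ECS injects Secrets Manager values through the `secrets` list of a container definition, where each entry pairs an environment variable name with a `valueFrom` ARN. The sketch below builds that fragment in Python. The variable names and ARNs are made-up placeholders, and the helper `container_secrets` is hypothetical rather than part of deploy.py.

```python
import json
from typing import Dict, List


def container_secrets(secret_arns: Dict[str, str]) -> List[Dict[str, str]]:
    """Build the 'secrets' list of an ECS container definition from env-var name to secret ARN."""
    return [
        {"name": name, "valueFrom": arn}
        for name, arn in sorted(secret_arns.items())
    ]


if __name__ == "__main__":
    # Placeholder names and ARNs for illustration only.
    example = {
        "GATEWAY_API_KEY": "arn:aws:secretsmanager:us-west-1:123456789012:secret:example/gateway-api-key",
        "DB_PASSWORD": "arn:aws:secretsmanager:us-west-1:123456789012:secret:example/db-password",
    }
    print(json.dumps({"secrets": container_secrets(example)}, indent=2))
```

With this layout the task definition holds only ARNs, never secret values; ECS resolves them when the task starts, which requires the task execution role to have read access to those secrets.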


Escher — Agentic CloudOps by Tessell