CI/CD
Developer Reference
This page covers internal implementation details. It is not included in the User Guide.
Escher uses v3-escher-deployment as the central CI/CD orchestration layer. All 63 repos in escher-dbai are built and deployed through a single shared pipeline using GitHub Actions with OIDC-based AWS authentication.
Pipeline overview
Git push / tag push
        │
        ▼
GitHub Actions (v3-escher-deployment/.github/workflows/)
        │
        ├─ detect changed services ──▶ matrix build (parallel)
        │                                  │
        │                       build → test → ECR push
        │                                  │
        ├─ tag convention ─────────────────┼──────────────────────────────────
        │    v*-dev → deploy-dev           │
        │    v*-stg → deploy-staging       │
        │    v*     → deploy-prod          │
        │                                  ▼
        └─ deploy ──────────────▶ ECS update-service (per component)
                                           │
                                           ▼
                                  Slack notification
                                  (success / failure)

Repository structure
v3-escher-deployment/
  .github/workflows/
    deploy-dev.yml        ← tag v*-dev triggers this
    deploy-staging.yml    ← tag v*-stg triggers this
    deploy-prod.yml       ← tag v* (no suffix) triggers this
    build-matrix.yml      ← reusable workflow: parallel image builds
    e2e-integration.yml   ← runs v3-testing-framework after deploy
    release-notes.yml     ← runs v2-release-notes on prod deploy
  deploy.py               ← Python orchestrator (called by workflows)
  services/               ← per-service configs (image names, health paths)
  environments/           ← dev / staging / prod variable files
  scripts/
    build.sh              ← docker buildx build + ECR push
    migrate.sh            ← Asset Store schema migration
    smoke-test.sh         ← post-deploy health checks

Branch and tag conventions
| Tag pattern | Deploys to | Example |
|---|---|---|
| v*-dev | Development | v2.16.0-dev |
| v*-stg | Staging | v2.16.0-stg |
| v* (no suffix) | Production | v2.16.0 |
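
The tag convention above can be sketched as a small resolver. This is an illustration only, not code from deploy.py: the function name and the assumption that tags always have the shape vMAJOR.MINOR.PATCH are hypothetical.

```python
import re
from typing import Optional

# Assumed tag shape: "v" + MAJOR.MINOR.PATCH, optionally followed by -dev or -stg.
_TAG_RE = re.compile(r"^v\d+\.\d+\.\d+(?:-(dev|stg))?$")

# Suffix captured by the regex (None means "no suffix") -> target environment.
_ENV_BY_SUFFIX = {"dev": "dev", "stg": "staging", None: "prod"}


def environment_for_tag(tag: str) -> Optional[str]:
    """Return the target environment for a release tag, or None if it is not one."""
    match = _TAG_RE.match(tag)
    if match is None:
        return None
    return _ENV_BY_SUFFIX[match.group(1)]


if __name__ == "__main__":
    for tag in ("v2.16.0-dev", "v2.16.0-stg", "v2.16.0", "v2.16.0-rc1"):
        print(tag, "->", environment_for_tag(tag))
```

A strict pattern like this matters because a bare v* glob also matches v2.16.0-dev and v2.16.0-stg, so the production trigger presumably has to exclude the suffixed tags in some way.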
Branch naming for component repos:
| Branch | Purpose |
|---|---|
| main | Stable production code |
| develop | Integration branch |
| feature/TICKET-desc | Feature work |
| fix/TICKET-desc | Bug fixes |
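
A branch-name check along these lines could run as a pre-push hook or a CI lint. It is a sketch under assumptions: the table does not define what TICKET or desc look like, so the pattern below (an uppercase project key, a number, then lowercase hyphenated words) and the function name are hypothetical.

```python
import re
from typing import Optional

# Assumption: TICKET looks like "ESC-123" and desc is lowercase words joined by hyphens.
_WORK_BRANCH_RE = re.compile(
    r"^(feature|fix)/[A-Z][A-Z0-9]*-\d+-[a-z0-9]+(?:-[a-z0-9]+)*$"
)

# Purposes copied from the branch table.
_PURPOSE = {
    "main": "Stable production code",
    "develop": "Integration branch",
    "feature": "Feature work",
    "fix": "Bug fixes",
}


def branch_purpose(branch: str) -> Optional[str]:
    """Return the purpose of a branch, or None if its name breaks the convention."""
    if branch in ("main", "develop"):
        return _PURPOSE[branch]
    match = _WORK_BRANCH_RE.match(branch)
    return _PURPOSE[match.group(1)] if match else None


if __name__ == "__main__":
    for name in ("main", "feature/ESC-123-add-retry", "hotfix/urgent"):
        print(name, "->", branch_purpose(name))
```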
AWS authentication (OIDC)
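The pipeline uses GitHub Actions OIDC, so no long-lived AWS credentials are stored in GitHub Secrets. Each workflow assumes a deployment role whose trust policy checks the GitHub token's audience (aud) and subject (sub) claims. The sub claim carries the repository plus the ref or environment that triggered the run; the policy below pins the repository to v3-escher-deployment and, through its trailing wildcard, accepts any ref or environment within it.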
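
To make the claim check concrete, here is a minimal sketch of how the two conditions in the trust policy on this page behave. It is not how AWS implements evaluation: fnmatchcase only approximates IAM's StringLike (IAM does not treat square brackets as special), condition values are assumed to be single strings, and the function name is hypothetical.

```python
from fnmatch import fnmatchcase

# Condition block copied from the escher-github-deploy trust policy.
CONDITION = {
    "StringEquals": {
        "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
    },
    "StringLike": {
        "token.actions.githubusercontent.com:sub": "repo:escher-dbai/v3-escher-deployment:*"
    },
}


def trust_policy_allows(claims: dict, condition: dict = CONDITION) -> bool:
    """Approximate the policy check: every condition key must match its claim."""
    for key, expected in condition.get("StringEquals", {}).items():
        claim_name = key.rsplit(":", 1)[1]  # "...com:aud" -> "aud"
        if claims.get(claim_name) != expected:
            return False
    for key, pattern in condition.get("StringLike", {}).items():
        claim_name = key.rsplit(":", 1)[1]
        if not fnmatchcase(claims.get(claim_name, ""), pattern):
            return False
    return True


if __name__ == "__main__":
    token = {
        "aud": "sts.amazonaws.com",
        "sub": "repo:escher-dbai/v3-escher-deployment:ref:refs/tags/v2.16.0-dev",
    }
    print(trust_policy_allows(token))
```

Because the pattern ends in a wildcard, any branch, tag, or environment in v3-escher-deployment passes; a token minted for a different repository does not.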
# Shared OIDC auth step (in every deploy workflow)
- name: Configure AWS credentials
  uses: aws-actions/configure-aws-credentials@v4
  with:
    role-to-assume: arn:aws:iam::ACCOUNT:role/escher-github-deploy
    role-session-name: github-${{ github.run_id }}
    aws-region: us-west-1

IAM trust policy on the escher-github-deploy role:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::ACCOUNT:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
        },
        "StringLike": {
          "token.actions.githubusercontent.com:sub": "repo:escher-dbai/v3-escher-deployment:*"
        }
      }
    }
  ]
}

Parallel matrix build
Component images are built in parallel using a GitHub Actions matrix strategy. Each matrix entry builds, tests, and pushes its own Docker image independently.
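
As a rough model of what the matrix fans out to, the sketch below lists the image reference each entry would push, honoring the skip_* inputs documented later in this section. Several details are assumptions rather than facts from the pipeline: that images are tagged with the Git tag, the registry host passed in, and the helper names.

```python
from typing import Dict, List

# Service name -> ECR repository, as listed in build-matrix.yml.
ECR_REPOS = {
    "gateway": "escher-gateway",
    "analysis-agent": "escher-analysis-agent",
    "playbook-agent": "escher-playbook-agent",
    "context-engine": "escher-context-engine",
    "asset-store": "escher-asset-store",
    "autoresolver": "escher-autoresolver",
    "integrations-agent": "escher-integrations-agent",
}

# The integrations toggle does not follow the skip_<service_name> pattern.
_SKIP_OVERRIDES = {"integrations-agent": "skip_integrations"}


def skip_input_for(service: str) -> str:
    """Name of the workflow_dispatch input that skips this service."""
    return _SKIP_OVERRIDES.get(service, "skip_" + service.replace("-", "_"))


def images_to_build(inputs: Dict[str, bool], registry: str, tag: str) -> List[str]:
    """Image references for every service whose skip toggle is not set."""
    return [
        f"{registry}/{repo}:{tag}"
        for service, repo in ECR_REPOS.items()
        if not inputs.get(skip_input_for(service), False)
    ]


if __name__ == "__main__":
    # Placeholder registry host; a real ECR host has the form
    # <account-id>.dkr.ecr.<region>.amazonaws.com
    for image in images_to_build({"skip_gateway": True}, "registry.example", "v2.16.0-dev"):
        print(image)
```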
# build-matrix.yml (excerpt)
strategy:
  matrix:
    service:
      - name: gateway
        repo: v2-gateway-service-java
        dockerfile: Dockerfile
        ecr_repo: escher-gateway
      - name: analysis-agent
        repo: v2-analysis-agent-go
        dockerfile: Dockerfile
        ecr_repo: escher-analysis-agent
      - name: playbook-agent
        repo: v2-playbook-agent-go
        dockerfile: Dockerfile
        ecr_repo: escher-playbook-agent
      - name: context-engine
        repo: v4-context-engine
        dockerfile: Dockerfile
        ecr_repo: escher-context-engine
      - name: asset-store
        repo: v4-asset-store-go
        dockerfile: Dockerfile
        ecr_repo: escher-asset-store
      - name: autoresolver
        repo: v2-autoresolver-java
        dockerfile: Dockerfile
        ecr_repo: escher-autoresolver
      - name: integrations-agent
        repo: v4-integrations-agent-go
        dockerfile: Dockerfile
        ecr_repo: escher-integrations-agent

Skip toggles allow deploying a subset of services:
# Input params on workflow_dispatch (manual trigger)
inputs:
  skip_gateway:         { type: boolean, default: false }
  skip_analysis_agent:  { type: boolean, default: false }
  skip_playbook_agent:  { type: boolean, default: false }
  skip_context_engine:  { type: boolean, default: false }
  skip_asset_store:     { type: boolean, default: false }
  skip_autoresolver:    { type: boolean, default: false }
  skip_integrations:    { type: boolean, default: false }
  skip_e2e:             { type: boolean, default: false }

Deploy dev workflow (annotated)
# .github/workflows/deploy-dev.yml
name: Deploy — Dev
on:
  push:
    tags: ['v*-dev']
  workflow_dispatch:
    inputs:
      # skip_* toggles as above
jobs:
  build:
    uses: ./.github/workflows/build-matrix.yml   # parallel image builds
    with:
      environment: dev
      tag: ${{ github.ref_name }}
    secrets: inherit
  deploy:
    needs: build
    runs-on: ubuntu-latest
    environment: dev
    permissions:
      id-token: write   # required so the job can request the OIDC token
      contents: read    # required by actions/checkout
    steps:
      - uses: actions/checkout@v4   # deploy.py and scripts/ live in this repo
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ vars.DEPLOY_ROLE_ARN }}
          aws-region: us-west-1
      - name: Deploy services (ordered)
        run: python deploy.py --env dev --tag ${{ github.ref_name }}
        # deploy.py deploys in dependency order:
        # 1. asset-store (no deps)
        # 2. context-engine (needs asset-store healthy)
        # 3. analysis-agent, playbook-agent, integrations-agent (parallel)
        # 4. gateway (needs agents healthy)
        # 5. autoresolver (independent)
      - name: Smoke test
        run: ./scripts/smoke-test.sh dev
      - name: Notify Slack
        if: always()
        uses: slackapi/slack-github-action@v1.26.0
        with:
          payload: |
            {
              "text": "${{ job.status == 'success' && '✅' || '❌' }} Deploy dev ${{ github.ref_name }} — ${{ job.status }}",
              "channel": "#escher-deployments"
            }
        env:
          SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}

E2E integration
After every successful deploy to dev or staging, the pipeline triggers the v3 testing framework:
# e2e-integration.yml
- name: Run E2E suite
  run: |
    cd v3-testing-framework
    python run_tests.py --env ${{ inputs.environment }} --suite smoke

E2E results are posted to the #escher-test-results Slack channel.
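
The gating rule for this step can be written down as a small predicate. This is a sketch, not the workflow's actual condition: it assumes the skip_e2e toggle suppresses the suite, and both function names are hypothetical.

```python
from typing import List

# Environments where the E2E suite runs after a deploy, per this section.
E2E_ENVIRONMENTS = {"dev", "staging"}


def should_run_e2e(environment: str, deploy_succeeded: bool, skip_e2e: bool = False) -> bool:
    """True only for a successful dev/staging deploy that did not opt out."""
    return deploy_succeeded and not skip_e2e and environment in E2E_ENVIRONMENTS


def e2e_command(environment: str) -> List[str]:
    """Argument list matching the run_tests.py invocation shown above."""
    return ["python", "run_tests.py", "--env", environment, "--suite", "smoke"]


if __name__ == "__main__":
    if should_run_e2e("dev", deploy_succeeded=True):
        print(" ".join(e2e_command("dev")))
```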
Release notes generation
On production deploys, the pipeline runs v2-release-notes to generate and publish release notes:
- name: Generate release notes
  run: |
    cd v2-release-notes
    python generate_release_notes.py \
      --repos gateway,analysis-agent,context-engine,asset-store \
      --from-tag ${{ env.PREV_TAG }} \
      --to-tag ${{ github.ref_name }} \
      --publish-slack

The tool uses Claude Sonnet 4.5 to summarize commits across repos into human-readable release notes, then posts them to #escher-releases using Slack Block Kit.
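
For readers unfamiliar with Block Kit, the message is a JSON list of blocks. The sketch below shows one plausible layout: a header block, then one section per repo. The layout, the function name, and the sample notes are illustrative assumptions, not the output of generate_release_notes.py.

```python
import json
from typing import Dict, List

MAX_SECTION_CHARS = 3000  # Block Kit limit for the text of a section block


def release_notes_blocks(version: str, notes_by_repo: Dict[str, List[str]]) -> List[dict]:
    """Build a Block Kit layout: a header, then one mrkdwn section per repo."""
    blocks: List[dict] = [
        {"type": "header", "text": {"type": "plain_text", "text": f"Escher {version}"}}
    ]
    for repo, notes in notes_by_repo.items():
        bullets = "\n".join(f"• {note}" for note in notes)
        text = f"*{repo}*\n{bullets}"[:MAX_SECTION_CHARS]  # truncate oversized sections
        blocks.append({"type": "section", "text": {"type": "mrkdwn", "text": text}})
    return blocks


if __name__ == "__main__":
    # Sample notes are made up for illustration.
    sample = {
        "gateway": ["Tightened request timeout handling"],
        "asset-store": ["Added a schema migration", "Reduced startup time"],
    }
    print(json.dumps(release_notes_blocks("v2.16.0", sample), indent=2, ensure_ascii=False))
```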
Cutting a release
# 1. Merge all feature branches to develop
# 2. Test on staging (v*-stg tag)
git tag v2.16.0-stg
git push origin v2.16.0-stg
# 3. After staging validation, cut production release
git tag v2.16.0
git push origin v2.16.0
# Pipeline automatically:
# - Builds all images
# - Deploys in dependency order
# - Runs E2E smoke suite
# - Generates + publishes release notes
# - Posts to #escher-releases

Environment variable management
| Variable type | Where stored |
|---|---|
| AWS resource ARNs | GitHub Environment variables (per env) |
| Secrets (API keys, tokens) | AWS Secrets Manager, injected through the secrets / valueFrom fields of the ECS task definition |
| Non-secret config | environments/dev.env, staging.env, prod.env in v3-escher-deployment |
No secrets are stored in GitHub Secrets (except SLACK_BOT_TOKEN for notifications). All runtime secrets are fetched from AWS Secrets Manager at task start.
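
The split between plain config and secrets shows up directly in an ECS container definition: plain values go under environment, while secrets are listed by ARN under secrets with valueFrom, so ECS resolves them at task start. The sketch below assumes the environments/*.env files use simple KEY=VALUE lines; the helper names, variable names, and ARN are placeholders.

```python
from typing import Dict


def parse_env_file(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blank lines and # comments (assumed format)."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip()
    return values


def container_definition(name: str, image: str,
                         config: Dict[str, str],
                         secret_arns: Dict[str, str]) -> dict:
    """Container definition fragment: config inline, secrets by reference only."""
    return {
        "name": name,
        "image": image,
        "environment": [{"name": k, "value": v} for k, v in sorted(config.items())],
        "secrets": [{"name": k, "valueFrom": arn} for k, arn in sorted(secret_arns.items())],
    }


if __name__ == "__main__":
    config = parse_env_file("# dev settings\nLOG_LEVEL=debug\nREGION=us-west-1\n")
    secrets = {"API_KEY": "arn:aws:secretsmanager:us-west-1:ACCOUNT:secret:example"}
    print(container_definition("gateway", "registry.example/escher-gateway:v2.16.0-dev",
                               config, secrets))
```

Only the ARN appears in the task definition, so the secret value itself never passes through the workflow or the repository.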
Next steps
- ECS Deployment — ECS cluster, task definitions, and IAM roles
- Docker Compose — Local development without CI/CD
- Changelog — Release history