Release management

Build once. Promote everywhere.

The same binary runs in every environment. crane copies images at the registry level, and the digest is verified after every promotion.

1 · Pipeline architecture

Forge uses a split pipeline. CI builds and pushes to Dev ECR. Classic Release handles promotion. Argo CD reconciles the cluster from the Ledger.

CI · YAML CI pipeline
Version-controlled in Git (Conveyor repo). Builds the image, runs Trivy scans (filesystem + image), and pushes to Dev ECR with an immutable Build.BuildId tag. No latest tag.

CD · Classic Release
Configured in the Azure DevOps UI (not in Git; an accepted trade-off). Visual stage promotion: Dev → QA → Prod. One-click rollback. Approval gates per stage. Calls version-controlled scripts from conveyor/scripts/.

Git · Argo CD (GitOps sync)
Per-cluster Argo CD watches the Ledger. When the pipeline updates image.tag, Argo CD syncs automatically (dev/QA) or waits for a manual sync (prod).
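The split between automatic and manual sync can be illustrated with a small Python sketch that builds an Argo CD Application manifest as a dict. The apiVersion, kind, spec.syncPolicy.automated and spec.source.helm.parameters fields are standard Argo CD Application fields. Everything else is an assumption made for illustration: the repo URL, the path layout, the namespace naming, the prune/selfHeal settings, and the idea that the Ledger sets image.tag as a Helm parameter.

```python
from typing import Any

# Per the text: dev and QA sync automatically, prod waits for a manual sync.
AUTO_SYNC_ENVS = {"dev", "qa"}


def build_application(service: str, env: str, image_tag: str) -> dict[str, Any]:
    """Return an Argo CD Application manifest as a plain dict.

    Repo URL, path layout and namespaces are hypothetical placeholders.
    """
    app: dict[str, Any] = {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": f"{service}-{env}", "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": "https://example.invalid/ledger.git",  # hypothetical
                "path": f"services/{service}/{env}",  # hypothetical layout
                "targetRevision": "main",
                # Assumption: the image tag is exposed as a Helm parameter.
                "helm": {"parameters": [{"name": "image.tag", "value": image_tag}]},
            },
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": service,
            },
        },
    }
    if env in AUTO_SYNC_ENVS:
        # An `automated` block enables auto-sync; omitting it leaves sync manual.
        app["spec"]["syncPolicy"] = {"automated": {"prune": True, "selfHeal": True}}
    return app
```

With this shape, a prod manifest simply carries no syncPolicy.automated block, so a change to image.tag shows up as OutOfSync until an engineer syncs it.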

| Component | Type | In Git? | Purpose |
|---|---|---|---|
| application-ci-pipeline.yml | YAML template | Yes | Build, Trivy scan, push to Dev ECR |
| Classic Release pipeline | ADO UI | No | Promote artifacts: Dev → QA → Prod |
| copy-artifact.sh | Script | Yes | Cross-account image copy via crane |
| update-ledger.sh | Script | Yes | Update image.tag in Ledger manifest |
| ec2-deploy-pipeline.yml | YAML template | Yes | EC2 deployment via SSM (Dagster, etc.) |

2 · Promotion flow

01 · CI build

Developer pushes to main. Trivy filesystem scan → language build (.NET / Python) → Docker build → Trivy image scan → push to Dev ECR with tag Build.BuildId.

02 · Dev stage (auto)

Classic Release auto-triggers. Runs update-ledger.sh on the dev manifest. Argo CD syncs within 30 seconds.

03 · QA stage (click-to-promote)

Runs copy-artifact.sh to copy the image from Dev ECR to QA ECR via crane (digest verified), then runs update-ledger.sh on the QA manifest. Argo CD syncs automatically.

04 · Prod stage (manual + approval)

Requires platform-engineer approval (no self-approve). Copies the artifact to Prod ECR and updates the Ledger. Argo CD does not auto-sync; an engineer syncs manually after verification.

Same binary everywhere: the Docker image built once in CI is the exact binary that runs in all environments. crane copies at the registry level — no Docker daemon, no re-build. Digest is verified after every copy (SOC 2 compliant).
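The copy-then-verify step can be sketched in Python as follows. This is an illustration of the idea, not the contents of copy-artifact.sh, which is not shown here. It assumes the crane CLI is on the PATH and already authenticated against both registries, and it uses only the `crane copy` and `crane digest` subcommands. The function names and the injectable copy/digest parameters are hypothetical, added so the logic can be exercised without a registry.

```python
import subprocess
from typing import Callable


def crane_digest(image: str) -> str:
    """Return the manifest digest of an image via `crane digest`."""
    result = subprocess.run(
        ["crane", "digest", image], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def crane_copy(src: str, dst: str) -> None:
    """Copy an image registry-to-registry via `crane copy` (no Docker daemon)."""
    subprocess.run(["crane", "copy", src, dst], check=True)


def promote(
    src: str,
    dst: str,
    copy: Callable[[str, str], None] = crane_copy,
    digest: Callable[[str], str] = crane_digest,
) -> str:
    """Copy src to dst and fail unless both references resolve to the same digest."""
    expected = digest(src)
    copy(src, dst)
    actual = digest(dst)
    if actual != expected:
        raise RuntimeError(f"digest mismatch: {src}={expected} {dst}={actual}")
    return actual
```

Because the digest is a hash of the image manifest, an equal digest on both sides means the destination holds the same image that CI built, not a rebuild.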

3 · Key policies

Immutable tags
Build.BuildId only. No latest. ECR repos configured with immutable tags. The CI template enforces this — no override.
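A tag check of this kind could look like the Python sketch below. It is not the CI template's actual enforcement, which is not shown here. It assumes Build.BuildId is the numeric Azure DevOps build ID, and the function name validate_tag is hypothetical.

```python
import re

# Assumption: Build.BuildId is a positive integer rendered without leading zeros.
_BUILD_ID = re.compile(r"[1-9][0-9]*")


def validate_tag(tag: str) -> str:
    """Return the tag unchanged if it looks like a build ID, else raise ValueError."""
    if tag == "latest":
        raise ValueError("the 'latest' tag is not allowed")
    if not _BUILD_ID.fullmatch(tag):
        raise ValueError(f"tag {tag!r} is not a numeric build ID")
    return tag
```

Immutable tags on the ECR repo remain the backstop: even if a check like this were bypassed, pushing over an existing tag would be rejected by the registry.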
No ECR auto-create
ECR repos are provisioned by the Platform team via Blueprint. If the target repo does not exist, the pipeline fails with a clear error that points developers to #platform-support.
Ledger as source of truth
The Ledger contains Argo CD Application manifests for every service in every environment. Pipeline updates image.tag via update-ledger.sh with optimistic locking and retry.
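The optimistic-locking-with-retry pattern can be sketched in Python as below. This illustrates the pattern only; update-ledger.sh itself is not shown here. The in-memory ledger stands in for the Git repo, with a revision counter playing the role of "the branch moved since I read it". The manifest layout (a Helm-style `name: image.tag` / `value:` pair) and all class and function names are assumptions.

```python
import re
from dataclasses import dataclass


class Conflict(Exception):
    """Raised when the Ledger changed between read and write."""


@dataclass
class InMemoryLedger:
    """Stand-in for the Ledger repo: manifest text plus a revision counter."""

    content: str
    revision: int = 0

    def read(self) -> tuple[str, int]:
        return self.content, self.revision

    def write(self, content: str, expected_revision: int) -> None:
        # Compare-and-swap: reject the write if someone else committed first.
        if expected_revision != self.revision:
            raise Conflict(f"expected revision {expected_revision}, found {self.revision}")
        self.content = content
        self.revision += 1


# Assumed manifest shape: "- name: image.tag" followed by "value: <tag>".
_TAG = re.compile(r"(name:[ \t]*image\.tag[ \t]*\n[ \t]*value:[ \t]*)\S+")


def set_image_tag(manifest: str, tag: str) -> str:
    """Return the manifest with the image.tag value replaced."""
    updated, count = _TAG.subn(lambda m: f'{m.group(1)}"{tag}"', manifest, count=1)
    if count != 1:
        raise ValueError("image.tag parameter not found in manifest")
    return updated


def update_ledger(ledger: InMemoryLedger, tag: str, attempts: int = 3) -> int:
    """Read, modify, write; on conflict re-read and retry. Returns attempts used."""
    for attempt in range(1, attempts + 1):
        manifest, revision = ledger.read()
        try:
            ledger.write(set_image_tag(manifest, tag), revision)
            return attempt
        except Conflict:
            continue  # someone else wrote first: re-read and try again
    raise Conflict(f"gave up after {attempts} attempts")
```

Re-reading before each retry matters: the update is re-applied on top of whatever the other writer committed, so concurrent promotions of different services do not overwrite each other.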
OIDC authentication
No static credentials. Pipelines authenticate via OIDC to a hub role in security-tooling, then assume ConveyorDeployRole-<env> in the target account.

4 · Environment configuration

| Environment | CI trigger | Promotion | Approval | Argo CD sync |
|---|---|---|---|---|
| Dev | Push to main | Automatic | None | Auto |
| QA | - | Click-to-promote or auto | Optional (QA lead) | Auto |
| Prod-Wealth | - | Manual + approval | Required (no self-approve) | Manual only |
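The per-environment rules can be written down as data, which makes the approval rule easy to reason about. The Python sketch below is illustrative only: in practice the gates are enforced by Classic Release approvals, not by code like this. The StagePolicy type, the environment keys and may_promote are hypothetical names, and the check does not model the platform-engineer role requirement.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StagePolicy:
    promotion: str
    approval: str  # "none", "optional" or "required"
    auto_sync: bool


POLICIES = {
    "dev": StagePolicy("automatic", "none", True),
    "qa": StagePolicy("click-to-promote or auto", "optional", True),
    "prod-wealth": StagePolicy("manual + approval", "required", False),
}


def may_promote(env: str, requester: str, approver: Optional[str]) -> bool:
    """Return True if a promotion to env may proceed.

    Where approval is required, an approver must exist and must not be
    the person who requested the promotion (no self-approve).
    """
    policy = POLICIES[env]
    if policy.approval != "required":
        return True
    return approver is not None and approver != requester
```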

5 · EC2 deployments

EC2 services (e.g. Dagster) use ec2-deploy-pipeline.yml instead of Ledger / Argo CD:

  1. CI builds and pushes to ECR (same template as EKS services).
  2. The deploy template sends docker-compose.yml to the EC2 instance via SSM Run Command.
  3. The instance runs forge-refresh-env, which pulls config from SSM Parameter Store.
  4. IMAGE_TAG (the Build ID from the pipeline) is written to .env.
  5. docker-compose pull fetches the new image, then docker-compose up -d restarts the service.

EC2s are targeted by tag (e.g. forge:dagster-enabled=true). No SSH — all operations go through SSM.
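Step 4 above, writing IMAGE_TAG to .env, can be sketched in Python as a pure text transformation. This is an assumption about how the update might be done, not the deploy template's actual logic. The function name set_env_var is hypothetical, and the sketch handles only simple KEY=value lines.

```python
def set_env_var(env_text: str, key: str, value: str) -> str:
    """Return env_text with KEY=value set, replacing an existing line or appending."""
    out = []
    replaced = False
    for line in env_text.splitlines():
        if line.startswith(f"{key}="):
            out.append(f"{key}={value}")  # overwrite the existing entry
            replaced = True
        else:
            out.append(line)  # keep unrelated settings untouched
    if not replaced:
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"
```

Replacing the line in place keeps the config pulled by forge-refresh-env intact. For example, set_env_var("DB_HOST=db\nIMAGE_TAG=100\n", "IMAGE_TAG", "101") changes only the IMAGE_TAG line.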

6 · Rollback strategy

EKS services

| Method | Speed | Use when |
|---|---|---|
| Classic Release: redeploy previous stage | Minutes | Standard rollback |
| Argo CD UI: click Rollback | Seconds | Emergency; prod has no auto-sync, so it stays rolled back |
| Ledger PR: revert image.tag | Minutes | Formalise the rollback in Git |
| git revert on Ledger | Minutes | Nuclear: revert an entire promotion |

EC2 services

| Method | Speed | Use when |
|---|---|---|
| Classic Release: redeploy previous stage | Minutes | Standard rollback |
| SSM: manual docker-compose pull with old tag | Minutes | Emergency |