
CI/CD Architecture

This document is a technical deep dive into Infrastream's built-in Continuous Integration and Continuous Deployment (CI/CD) architecture. It is intended for platform engineers and architects who need to understand the internal design, data flows, and security mechanisms behind the pipeline.

For the user-facing guide on branching, versioning, and deployment tracks, see The Internal Developer Platform.


System Overview

The CI/CD system is composed of three primary services, each with distinct responsibilities and trust boundaries:


Component Architecture

The CD Manager

The CD Manager (internal/cd/manager.go in svc-portal) is the orchestration hub of the deployment pipeline. It runs inside the Infrastream Portal and manages the full deployment lifecycle.

Core Functions:

Function | Role
HandlePush() | Entry point for CI build notifications. Matches the buildDefinitionId + container to an Application manifest, creates a PreRelease deployment, and kicks off the pipeline.
ProcessPipeline() | State machine evaluator. Checks approval gates, determines the next stage, and calls executePromotion() when conditions are met.
executePromotion() | Writes the resolved version to the target DeploymentConfig YAML files, commits to git, and pushes. For Release tracks, also handles RC image tagging via the Releaser.
HandleDeploymentStatus() | Receives status webhooks from the Engine after reconciliation. Updates the stage record and re-invokes ProcessPipeline() to evaluate the next step.

Design Decisions:

  • The CD Manager is stateless — all deployment state is in Spanner. The manager can be restarted or scaled horizontally without losing context.
  • ProcessPipeline() is idempotent — calling it multiple times for the same deployment/stage produces the same result. This allows safe retries.
  • Git operations use a managed service account with push access to the manifest repository. The commit message includes the deployment ID for auditability.

The Releaser

The Releaser (internal/cd/releaser/) handles Release-specific operations that require Artifact Registry and GitHub API access:

Function | Role
CreateReleaseBranches() | For each app in the release, creates a release/<appSet>/<bundle> branch in the source repository, pointing at the git tag of the selected semver.
tagImagesAsRCForRelease() | Tags each app's container image in Artifact Registry with an immutable RC tag (<appSet>-<bundle>-rc.<N>). Counts existing RC tags to determine the increment.
tagImageForProduction() | On pipeline completion, creates a clean production tag by stripping the -rc.N suffix. If the clean tag already exists, creates a hotfix variant.

Why tag AR before writing manifests? The RC tag must exist in Artifact Registry before it is referenced in a DeploymentConfig. If the Engine reconciles a manifest with a version that doesn't exist as a container tag, the deployment will fail with an image pull error. By tagging AR first, every version in a manifest is always resolvable.
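
As an illustration of the production tagging step described for tagImageForProduction(), the following is a minimal sketch rather than the actual Releaser code. The function name productionTag, the in-memory existing map, and the "-hotfix.<N>" naming are assumptions made for this example; the source only states that the -rc.N suffix is stripped and that a hotfix variant is created when the clean tag already exists.

```go
package main

import (
	"fmt"
	"regexp"
)

// rcSuffix matches a trailing "-rc.<N>" on a release candidate tag.
var rcSuffix = regexp.MustCompile(`-rc\.\d+$`)

// productionTag strips the RC suffix from rcTag. If the clean tag is already
// taken, it falls back to a hotfix variant. The "-hotfix.<N>" format is an
// assumption for this sketch, not a documented naming scheme.
func productionTag(rcTag string, existing map[string]bool) string {
	clean := rcSuffix.ReplaceAllString(rcTag, "")
	if !existing[clean] {
		return clean
	}
	// Tags are immutable, so look for the first unused hotfix number.
	for n := 1; ; n++ {
		candidate := fmt.Sprintf("%s-hotfix.%d", clean, n)
		if !existing[candidate] {
			return candidate
		}
	}
}

func main() {
	existing := map[string]bool{"infrastream-cloud-2026.11": true}
	fmt.Println(productionTag("infrastream-cloud-2026.10-rc.2", existing))
	fmt.Println(productionTag("infrastream-cloud-2026.11-rc.0", existing))
}
```

Because existing tags are only read and never moved, the sketch stays consistent with the rule that a tag must exist in Artifact Registry before any manifest references it.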

The Engine Runner

The Engine (infrastream-runner) is a Cloud Run Job that executes the actual infrastructure reconciliation. It is triggered by a git push to the manifest repository and executes a deterministic lifecycle:

Scoped Execution: When triggered by a CD promotion, the engine only processes the manifests affected by the version change, minimizing execution time and blast radius. The scope is determined by the changed files in the git commit.
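
A rough sketch of how a scope could be derived from the changed files in a commit is shown below. It is not the engine's implementation: the function name manifestsInScope, the manifests/ root directory, and the rule that every .yaml or .yml file under that root counts as a manifest are all assumptions for illustration.

```go
package main

import (
	"fmt"
	"path"
	"sort"
	"strings"
)

// manifestsInScope returns the unique manifest files touched by a commit.
// Assumption: any ".yaml" or ".yml" file under manifestRoot is a manifest.
func manifestsInScope(changedFiles []string, manifestRoot string) []string {
	root := strings.TrimSuffix(manifestRoot, "/") + "/"
	seen := map[string]bool{}
	var scope []string
	for _, f := range changedFiles {
		ext := path.Ext(f)
		if !strings.HasPrefix(f, root) || (ext != ".yaml" && ext != ".yml") {
			continue // not a manifest, so it cannot widen the scope
		}
		if !seen[f] {
			seen[f] = true
			scope = append(scope, f)
		}
	}
	sort.Strings(scope) // stable order keeps runs deterministic
	return scope
}

func main() {
	changed := []string{
		"manifests/prod/svc-portal.yaml",
		"README.md",
		"manifests/prod/svc-portal.yaml",
	}
	fmt.Println(manifestsInScope(changed, "manifests"))
}
```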

The Token Broker

The Token Broker (internal/token/broker.go in svc-portal) provides short-lived GitHub tokens to CI workflows that need cross-repository access. This is critical for:

  • Pulling private Go modules during builds
  • Accessing shared workflow definitions in the managed workflows repository
  • Reading private Docker base images from other organization repositories

Security model:

  • The Token Broker validates the OIDC token from the GitHub Actions runner
  • It verifies the requesting repository is authorized to access the target resources
  • It issues a GitHub App Installation Token scoped to the specific repositories needed
  • Tokens have a maximum lifetime of 1 hour and are never persisted
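
The authorization and lifetime rules above can be sketched as follows. This is a simplified model under stated assumptions, not the broker's code: the Claims and TokenGrant types, the allow-list shape, and the authorize function are hypothetical, OIDC signature verification is assumed to have happened already, and the call that mints the GitHub App Installation Token is left out.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// maxTokenTTL mirrors the documented 1 hour maximum lifetime.
const maxTokenTTL = time.Hour

// Claims holds the subset of an already verified OIDC token used here.
type Claims struct {
	Repository string // requesting repository, e.g. "org/svc-portal"
}

// TokenGrant describes the token that would be requested; nothing is persisted.
type TokenGrant struct {
	Repositories []string
	TTL          time.Duration
}

// authorize checks every requested repository against the allow-list
// (hypothetical shape: requester -> permitted targets) and caps the lifetime.
func authorize(c Claims, requested []string, allow map[string][]string, ttl time.Duration) (TokenGrant, error) {
	if len(requested) == 0 {
		return TokenGrant{}, errors.New("no target repositories requested")
	}
	permitted := map[string]bool{}
	for _, r := range allow[c.Repository] {
		permitted[r] = true
	}
	for _, r := range requested {
		if !permitted[r] {
			return TokenGrant{}, fmt.Errorf("%q may not access %q", c.Repository, r)
		}
	}
	if ttl <= 0 || ttl > maxTokenTTL {
		ttl = maxTokenTTL
	}
	return TokenGrant{Repositories: requested, TTL: ttl}, nil
}

func main() {
	allow := map[string][]string{"org/svc-portal": {"org/go-shared"}}
	grant, err := authorize(Claims{Repository: "org/svc-portal"}, []string{"org/go-shared"}, allow, 3*time.Hour)
	fmt.Println(grant.Repositories, grant.TTL, err)
}
```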

Data Model

Spanner Tables

All deployment state is stored in Google Cloud Spanner, providing strong consistency and global replication.

PreRelease Track

Table | Primary Key | Purpose
PreReleaseDeployments | DeploymentID | Deployment metadata: version, status, stage index, entity ID (Application state ID), build definition reference
PreReleaseStages | DeploymentID, StageIndex | Per-stage execution records: commit SHA, run ID, start/end timestamps, status

Release Track

Table | Primary Key | Purpose
ReleaseDeployments | DeploymentID | Bundle metadata: version (bundle name), status, entity ID (ApplicationSet state ID), last deployed versions (JSON map)
ReleaseStages | DeploymentID, StageIndex | Per-stage execution records
ReleaseBundleApps | DeploymentID, AppName | Per-app version mapping within the bundle

Shared Tables

Table | Primary Key | Purpose
DeploymentApprovals | DeploymentID, ApprovalID | Immutable, append-only approval ledger. Records actor, action (APPROVE/REJECT), stage index, timestamp, and optional comment.
DeploymentChangelogs | DeploymentID, ChangelogID | Auto-generated changelogs from conventional commit history between versions

Deployment Status Flow


Workflow Generation Architecture

The Computer Pipeline

The CI workflow generation is powered by the engine's Computer system — a phase that runs during the Compute stage of the engine lifecycle. The key components are:

  1. GithubRepositoryComputer — Reads the strategy field and computes a BranchConfig map. Each entry defines: target patterns, release type, allowed merge methods, required reviewers, and status checks.

  2. BuildDefinitionComputer — Consumes the BranchConfig from the parent repository and the BuildDefinition spec. For each branch config, it creates a BranchConfiguration that the workflow builder uses to generate the appropriate trigger and versioning logic.

  3. InternalWorkflowBuilder — A typed builder interface. Each build type (Golang, Docker, Flutter, etc.) implements this interface. The builder generates:

    • Push workflows — Triggered on push to matching branches. Run the full pipeline including versioning, build, test, containerize, and deploy notification.
    • Pull request workflows — Triggered on PRs targeting matching branches. Run analysis, test, and build stages only.

Builder Interface

Every build type implements the InternalWorkflowBuilder interface:

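type InternalWorkflowBuilder interface {
    // Branch configs this builder generates workflows for.
    BranchConfigurations() map[string]*BranchConfiguration
    // Adds the build-type-specific stages to the workflow.
    Pipeline(workflow *ManagedWorkflow)
    // Adds stages for builds that trigger downstream workflows.
    DownstreamPipeline(workflow *ManagedWorkflow)
    // Reports whether PR-specific workflows are generated.
    SupportsPullRequestWorkflow() bool
}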

Method | Purpose
BranchConfigurations() | Returns the set of branch configs this builder will generate workflows for. Filtered from the repository-level config based on what's relevant for the build type.
Pipeline() | Populates the managed workflow with the build-type-specific stages (test, build, containerize, etc.).
DownstreamPipeline() | For builds that trigger downstream workflows (e.g., a library publish that triggers consumer rebuilds).
SupportsPullRequestWorkflow() | Whether this build type generates PR-specific workflows (most do).
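
To make the contract concrete, here is a minimal sketch of a type satisfying the interface. The interface itself is copied from this document; everything else is assumed for illustration: BranchConfiguration and ManagedWorkflow are reduced to single-field stand-ins because their real definitions are not shown here, and exampleGoBuilder is a hypothetical build type.

```go
package main

import "fmt"

// Simplified stand-ins; the real types are not shown in this document.
type BranchConfiguration struct {
	ReleaseType string
}

type ManagedWorkflow struct {
	Stages []string
}

type InternalWorkflowBuilder interface {
	BranchConfigurations() map[string]*BranchConfiguration
	Pipeline(workflow *ManagedWorkflow)
	DownstreamPipeline(workflow *ManagedWorkflow)
	SupportsPullRequestWorkflow() bool
}

// exampleGoBuilder is a hypothetical builder used only for this sketch.
type exampleGoBuilder struct {
	branches map[string]*BranchConfiguration
}

func (b *exampleGoBuilder) BranchConfigurations() map[string]*BranchConfiguration {
	return b.branches
}

// Pipeline appends the build-type-specific stages.
func (b *exampleGoBuilder) Pipeline(w *ManagedWorkflow) {
	w.Stages = append(w.Stages, "test", "build", "containerize")
}

// DownstreamPipeline is a no-op: this example triggers nothing downstream.
func (b *exampleGoBuilder) DownstreamPipeline(w *ManagedWorkflow) {}

func (b *exampleGoBuilder) SupportsPullRequestWorkflow() bool { return true }

// Compile-time check that the example satisfies the interface.
var _ InternalWorkflowBuilder = (*exampleGoBuilder)(nil)

func main() {
	b := &exampleGoBuilder{branches: map[string]*BranchConfiguration{
		"main": {ReleaseType: "LATEST"},
	}}
	w := &ManagedWorkflow{}
	b.Pipeline(w)
	fmt.Println(w.Stages, b.SupportsPullRequestWorkflow())
}
```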

Managed Workflow Structure

Each generated managed workflow contains these common components:

Component | Purpose
Initialize | Checkout, validate conventional commits, compute version, determine skip flag
Stage Jobs | Build-type-specific jobs (test, build, publish, etc.) with matrix support
Tag | Create git tag on successful build (push workflows only)
Containerize | Multi-arch Docker build with build-args injection
Publish | Push to Artifact Registry, create BinAuthz attestation
Release | Create GitHub Release with artifacts and changelog
Notify | Publish PubSub notification to CD Manager

Security Hardening

Generated managed workflows include multiple security layers:

  1. SHA-pinned actions — Every uses: reference is pinned to a full commit SHA, not a mutable tag
  2. Minimal permissions — Each job declares only the GitHub token permissions it needs
  3. Workload Identity Federation — GCP authentication via OIDC, no service account keys
  4. Read-only checkout — PR workflows use persist-credentials: false
  5. Isolated secrets — Build secrets are injected via the TOKEN_BROKER service, never stored in GitHub
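
The SHA-pinning rule in item 1 lends itself to a simple check. The sketch below is illustrative only and is not part of the workflow generator: isPinned is a hypothetical helper, and accepting local "./" references (which carry no ref to pin) is an assumption of this example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// fullSHA matches a complete 40-character lowercase hex commit SHA.
var fullSHA = regexp.MustCompile(`^[0-9a-f]{40}$`)

// isPinned reports whether a `uses:` reference ends in a full commit SHA.
// Assumption: local references ("./...") have no ref and are accepted.
func isPinned(uses string) bool {
	if strings.HasPrefix(uses, "./") {
		return true
	}
	i := strings.LastIndex(uses, "@")
	if i < 0 {
		return false // no ref at all
	}
	return fullSHA.MatchString(uses[i+1:])
}

func main() {
	refs := []string{
		"actions/checkout@v4",
		"actions/checkout@0123456789abcdef0123456789abcdef01234567",
	}
	for _, ref := range refs {
		fmt.Printf("%s pinned=%v\n", ref, isPinned(ref))
	}
}
```

A mutable tag such as v4 fails the check because the tag can later be pointed at different code, whereas a full SHA cannot.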

Version Resolution Deep-Dive

The compute-version Script

The version computation logic runs in the Initialize stage of every push workflow. It is the single source of truth for determining what version a build produces.

Algorithm:

1. Find the merged PR that produced this push commit
2. Read PR labels:
   - "release/major" → MAJOR bump
   - "release/minor" → MINOR bump
   - (none) → PATCH bump (default)
3. Find the latest non-pre-release semver tag reachable from HEAD
4. Apply the branch's release type:
   - LATEST → bump(base_tag, level) → v2.5.0
   - ALPHA → base_tag + "-alpha." + count → v2.4.12-alpha.3
   - BETA → base_tag + "-beta." + count → v2.4.12-beta.0
   - HOTFIX → base_tag + "-hotfix." + count → v2.4.12-hotfix.1
   - NONE → skip (no tag)
5. Check if the tag already exists → if yes, set skip=true
6. Output: target_version, skip, release_type
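
Steps 2, 4, and 5 of the algorithm can be sketched as below. This is not the compute-version script itself: the function names are invented for the example, the base tag and set of existing tags are passed in rather than read from git, and count is a plain parameter because the document does not say how the pre-release counter is derived.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// levelFromLabels maps PR labels to a bump level (step 2).
func levelFromLabels(labels []string) string {
	level := "PATCH"
	for _, l := range labels {
		switch l {
		case "release/major":
			return "MAJOR"
		case "release/minor":
			level = "MINOR"
		}
	}
	return level
}

// bump applies a MAJOR, MINOR, or PATCH bump to a "vX.Y.Z" tag.
func bump(base, level string) (string, error) {
	parts := strings.Split(strings.TrimPrefix(base, "v"), ".")
	if len(parts) != 3 {
		return "", fmt.Errorf("not a semver tag: %q", base)
	}
	n := make([]int, 3)
	for i, p := range parts {
		v, err := strconv.Atoi(p)
		if err != nil {
			return "", fmt.Errorf("not a semver tag: %q", base)
		}
		n[i] = v
	}
	switch level {
	case "MAJOR":
		n[0], n[1], n[2] = n[0]+1, 0, 0
	case "MINOR":
		n[1], n[2] = n[1]+1, 0
	default:
		n[2]++
	}
	return fmt.Sprintf("v%d.%d.%d", n[0], n[1], n[2]), nil
}

// computeVersion applies the release type (step 4) and the skip check (step 5).
// count is supplied by the caller; its derivation is not specified here.
func computeVersion(base, releaseType, level string, count int, existing map[string]bool) (string, bool, error) {
	var target string
	switch releaseType {
	case "LATEST":
		t, err := bump(base, level)
		if err != nil {
			return "", false, err
		}
		target = t
	case "ALPHA", "BETA", "HOTFIX":
		target = fmt.Sprintf("%s-%s.%d", base, strings.ToLower(releaseType), count)
	case "NONE":
		return "", true, nil // no tag is produced
	default:
		return "", false, fmt.Errorf("unknown release type %q", releaseType)
	}
	return target, existing[target], nil
}

func main() {
	level := levelFromLabels([]string{"release/minor"})
	fmt.Println(computeVersion("v2.4.12", "LATEST", level, 0, nil))
	fmt.Println(computeVersion("v2.4.12", "ALPHA", "PATCH", 3, nil))
}
```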

Release Candidate Resolution

When a Release deployment is promoted, the system resolves per-app RC versions:

1. Parse dep.LastDeployedVersions → {"svc-portal": "v2.4.13", "svc-cloud-portal": "v0.26.2"}
2. For each app:
   a. Resolve BuildDefinition and container name
   b. Derive release tag base: "<appSet>-<bundle>" → "infrastream-cloud-2026.11"
   c. Query AR for existing RC tags matching "infrastream-cloud-2026.11-rc.*"
   d. Increment: count(existing) → rcCount
   e. Create tag: "infrastream-cloud-2026.11-rc.<rcCount>"
   f. Point tag to same image digest as the semver tag
3. Store computed RC tag on dep.Version
4. Write RC tag to spec.version in each DeploymentConfig
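
Steps 2b through 2e reduce to a prefix count. The sketch below assumes the existing tags have already been listed from Artifact Registry and are passed in as a slice; nextRCTag is a name made up for this example, and the registry query and the tag creation are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// nextRCTag derives "<appSet>-<bundle>-rc.<N>", where N is the number of
// existing RC tags that share the same "<appSet>-<bundle>" base.
func nextRCTag(appSet, bundle string, existingTags []string) string {
	prefix := fmt.Sprintf("%s-%s-rc.", appSet, bundle)
	count := 0
	for _, t := range existingTags {
		if strings.HasPrefix(t, prefix) {
			count++
		}
	}
	return fmt.Sprintf("%s%d", prefix, count)
}

func main() {
	existing := []string{
		"v2.4.13",
		"infrastream-cloud-2026.11-rc.0",
		"infrastream-cloud-2026.11-rc.1",
	}
	fmt.Println(nextRCTag("infrastream-cloud", "2026.11", existing))
}
```

Counting only works as a numbering scheme because RC tags are never deleted; if a tag could disappear, the count would collide with a surviving tag.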

Tag Immutability Guarantee

All tags in Artifact Registry are immutable — they are never moved, overwritten, or deleted. This provides:

  • Auditability — Complete history of every promotion attempt (rc.0, rc.1, rc.2, ...)
  • Rollback safety — Any previous version can be re-deployed by referencing its tag
  • Reproducibility — The same tag always resolves to the same image digest

Approval System Architecture

Gate Evaluation

The ProcessPipeline() function evaluates stage gates before each promotion. Gates are defined in the ReleaseTrack manifest and can include:

  • Auto-approve — The stage proceeds immediately (typical for pre-release environments)
  • Manual approval — Requires explicit sign-off from an authorized actor
  • Policy-based — Evaluates conditions (e.g., all PreRelease stages must be SUCCESS before Release initiation)

Approval Ledger

Approvals are stored in the DeploymentApprovals table — an immutable, append-only ledger:

Field | Type | Description
DeploymentID | STRING | Links to the PreRelease or Release deployment
ApprovalID | STRING | Unique identifier for the approval action
StageIndex | INT64 | Which pipeline stage the approval applies to
Actor | STRING | The identity of the approver (email or service account)
Action | STRING | APPROVE or REJECT
Timestamp | TIMESTAMP | Server-side commit timestamp
Comment | STRING | Optional comment from the approver

Approvals cannot be modified or deleted. A REJECT action halts the pipeline. To resume, a new approval with APPROVE must be added.
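
One way to read the halt-and-resume rule is that the most recent entry for a stage decides whether it may proceed. The following in-memory sketch models that reading; it is an interpretation, not the Spanner-backed implementation, and the Ledger type and its methods are hypothetical.

```go
package main

import "fmt"

// Approval carries the ledger fields relevant to gate evaluation.
type Approval struct {
	StageIndex int64
	Actor      string
	Action     string // "APPROVE" or "REJECT"
}

// Ledger is an in-memory stand-in for the DeploymentApprovals table.
// It exposes Append only: entries are never modified or deleted.
type Ledger struct {
	entries []Approval
}

// Append validates the action and adds a new entry to the end of the ledger.
func (l *Ledger) Append(a Approval) error {
	if a.Action != "APPROVE" && a.Action != "REJECT" {
		return fmt.Errorf("invalid action %q", a.Action)
	}
	l.entries = append(l.entries, a)
	return nil
}

// StageApproved reports whether the latest entry for the stage is APPROVE.
// A stage with no entries is treated as not approved.
func (l *Ledger) StageApproved(stage int64) bool {
	approved := false
	for _, e := range l.entries {
		if e.StageIndex == stage {
			approved = e.Action == "APPROVE"
		}
	}
	return approved
}

func main() {
	l := &Ledger{}
	_ = l.Append(Approval{StageIndex: 1, Actor: "a@example.com", Action: "REJECT"})
	fmt.Println(l.StageApproved(1))
	_ = l.Append(Approval{StageIndex: 1, Actor: "b@example.com", Action: "APPROVE"})
	fmt.Println(l.StageApproved(1))
}
```

The rejection stays in the ledger after the later approval, which is what keeps the audit trail complete.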


Build-Time Secrets Management

Secrets Injection Flow

Secret Types

Secret | Source | Scope | Lifetime
GCP credentials | Workload Identity Federation | Per-job OIDC token | ~1 hour
GitHub tokens | Token Broker | Scoped to specific repositories | ~1 hour
Registry credentials | Derived from GCP WIF | Push access to Artifact Registry | Per-step
KMS signing keys | GCP IAM | BinAuthz attestation signing | Per-operation
Build secrets | GitHub Actions secrets (org-level) | Injected as env vars | Per-job

Cross-Registry Authentication

For builds that need to pull base images from private registries or push to multiple registries, the workflow uses a multi-step authentication flow:

  1. WIF authentication — Obtain GCP credentials via Workload Identity Federation
  2. Docker login — Authenticate to Artifact Registry using the WIF token
  3. Registry enumeration — The BuildDefinition specifies which registries the container should be pushed to
  4. Multi-push — The same image is pushed to all configured registries in parallel
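
The parallel push in step 4 follows a common fan-out pattern, sketched here with the standard library only. The actual push is abstracted behind pushFunc, a hypothetical callback, since the real step runs inside the generated workflow rather than in Go; pushAll and errPushFailed are likewise names invented for this example.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"sync"
)

// errPushFailed is a placeholder error used by the demo below.
var errPushFailed = errors.New("push failed")

// pushFunc pushes one image to one registry (hypothetical stand-in).
type pushFunc func(registry, image string) error

// pushAll pushes the same image to every registry in parallel and
// reports all failures together instead of stopping at the first one.
func pushAll(registries []string, image string, push pushFunc) error {
	errs := make([]error, len(registries)) // one slot per goroutine, no locking needed
	var wg sync.WaitGroup
	for i, r := range registries {
		wg.Add(1)
		go func(i int, r string) {
			defer wg.Done()
			errs[i] = push(r, image)
		}(i, r)
	}
	wg.Wait()

	var failed []string
	for i, err := range errs {
		if err != nil {
			failed = append(failed, fmt.Sprintf("%s: %v", registries[i], err))
		}
	}
	if len(failed) > 0 {
		return errors.New(strings.Join(failed, "; "))
	}
	return nil
}

func main() {
	push := func(registry, image string) error {
		if registry == "registry-b.example" {
			return errPushFailed
		}
		return nil
	}
	err := pushAll([]string{"registry-a.example", "registry-b.example"}, "svc-portal:v2.4.13", push)
	fmt.Println(err)
}
```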

Container Metadata & Attestation

Image Labeling

Every container image built by the pipeline includes standardized OCI labels:

Label | Value | Example
org.opencontainers.image.source | Repository URL | https://github.com/pvotal-tech/svc-portal
org.opencontainers.image.version | Semver tag | v2.4.13
org.opencontainers.image.revision | Git commit SHA | abc123def456
org.opencontainers.image.created | Build timestamp | 2026-05-12T14:00:00Z
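
For illustration, the label set in the table could be assembled like this. The helper ociLabels and its use of Unix seconds for the build time are assumptions of the sketch; the label keys are the ones listed above, and the timestamp is formatted as RFC 3339 in UTC to match the example column.

```go
package main

import (
	"fmt"
	"time"
)

// ociLabels builds the four standardized labels for an image.
// createdUnix is the build time in Unix seconds (a choice made for this sketch).
func ociLabels(repoURL, version, revision string, createdUnix int64) map[string]string {
	return map[string]string{
		"org.opencontainers.image.source":   repoURL,
		"org.opencontainers.image.version":  version,
		"org.opencontainers.image.revision": revision,
		"org.opencontainers.image.created":  time.Unix(createdUnix, 0).UTC().Format(time.RFC3339),
	}
}

func main() {
	built := time.Date(2026, time.May, 12, 14, 0, 0, 0, time.UTC).Unix()
	labels := ociLabels("https://github.com/pvotal-tech/svc-portal", "v2.4.13", "abc123def456", built)
	for k, v := range labels {
		fmt.Printf("%s=%s\n", k, v)
	}
}
```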

Binary Authorization

After a container is pushed, the workflow creates a Binary Authorization attestation:

  1. The attestor is provisioned by the engine as part of the BuildDefinition executor
  2. The attestation is signed using a Cloud KMS key
  3. Target GCP projects enforce a Binary Authorization policy that requires a valid attestation
  4. Containers without attestation are rejected at admission time (deploy-time, not build-time)

This ensures that only containers built through the official pipeline can run in production environments.


Error Handling & Recovery

Pipeline Failures

Failure Point | Behavior | Recovery
CI build failure | No PubSub notification sent | Fix code, push again
PubSub delivery failure | Message is retried (at-least-once) | Idempotent HandlePush()
Manifest commit failure | Deployment stays in PENDING | Retry via Portal UI
Engine reconciliation failure | Status webhook reports FAILED | Fix manifest, re-trigger
AR tagging failure | Release initiation fails | Retry release initiation

Idempotency Guarantees

  • HandlePush() — Deduplicates by buildDefinitionId + container + version. Same notification processed twice produces the same deployment.
  • executePromotion() — Checks if the target manifest already has the desired version before committing. No-op if already correct.
  • tagImagesAsRCForRelease() — Counts existing RC tags before creating a new one. If the exact RC tag already exists, it is not recreated.
  • Engine execution — The Plan phase produces NO_ACTION for resources that are already in the desired state.
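
The HandlePush() deduplication rule can be modeled with a composite key. The sketch below uses an in-memory map in place of Spanner and is only meant to show the shape of the guarantee; PushEvent, Store, and the generated deployment IDs are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// PushEvent carries the three fields that form the dedup key.
type PushEvent struct {
	BuildDefinitionID string
	Container         string
	Version           string
}

// Store is an in-memory stand-in for the deployment tables.
type Store struct {
	mu    sync.Mutex
	byKey map[PushEvent]string // dedup key -> deployment ID
}

func NewStore() *Store {
	return &Store{byKey: map[PushEvent]string{}}
}

// HandlePush returns the deployment for the event, creating it only once.
// A redelivered notification yields the same ID and created == false.
func (s *Store) HandlePush(e PushEvent) (id string, created bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if existing, ok := s.byKey[e]; ok {
		return existing, false
	}
	id = fmt.Sprintf("dep-%d", len(s.byKey)+1) // illustrative ID scheme
	s.byKey[e] = id
	return id, true
}

func main() {
	s := NewStore()
	e := PushEvent{BuildDefinitionID: "bd-1", Container: "svc-portal", Version: "v2.4.13"}
	fmt.Println(s.HandlePush(e))
	fmt.Println(s.HandlePush(e)) // at-least-once redelivery
}
```

Using the struct itself as the map key avoids ambiguity that string concatenation could introduce when a field contains the separator.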