CI/CD Architecture
This document provides a comprehensive technical deep-dive into Infrastream's built-in Continuous Integration and Continuous Deployment architecture. It is intended for platform engineers and architects who need to understand the internal design, data flows, and security mechanisms that power the pipeline.
For the user-facing guide on branching, versioning, and deployment tracks, see The Internal Developer Platform.
System Overview
The CI/CD system is composed of three primary services, each with distinct responsibilities and trust boundaries.
Component Architecture
The CD Manager
The CD Manager (internal/cd/manager.go in svc-portal) is the orchestration hub of the deployment pipeline. It runs inside the Infrastream Portal and manages the full deployment lifecycle.
Core Functions:
| Function | Role |
|---|---|
| `HandlePush()` | Entry point for CI build notifications. Matches the buildDefinitionId + container to an Application manifest, creates a PreRelease deployment, and kicks off the pipeline. |
| `ProcessPipeline()` | State machine evaluator. Checks approval gates, determines the next stage, and calls executePromotion() when conditions are met. |
| `executePromotion()` | Writes the resolved version to the target DeploymentConfig YAML files, commits to git, and pushes. For Release tracks, also handles RC image tagging via the Releaser. |
| `HandleDeploymentStatus()` | Receives status webhooks from the Engine after reconciliation. Updates the stage record and re-invokes ProcessPipeline() to evaluate the next step. |
Design Decisions:
- The CD Manager is stateless: all deployment state is in Spanner. The manager can be restarted or scaled horizontally without losing context.
- `ProcessPipeline()` is idempotent: calling it multiple times for the same deployment/stage produces the same result. This allows safe retries.
- Git operations use a managed service account with push access to the manifest repository. The commit message includes the deployment ID for auditability.
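To illustrate why an idempotent evaluator makes retries safe, the following minimal sketch models the decision step as a pure function of stored state. The `Stage` and `Action` types, the `Decide` function, and the exact status strings are hypothetical and are not taken from manager.go.

```go
package main

import "fmt"

// Stage is a hypothetical, simplified view of one stored stage record.
type Stage struct {
	Status   string // "PENDING", "SUCCESS", or "FAILED" (assumed status values)
	Approved bool   // whether the stage's approval gate is currently satisfied
}

// Action is a hypothetical result of one evaluation pass.
type Action string

const (
	Wait    Action = "WAIT"
	Promote Action = "PROMOTE"
	Halt    Action = "HALT"
	Done    Action = "DONE"
)

// Decide returns the next action and the index of the stage it applies to.
// It only reads its input, so repeated calls on the same state agree.
func Decide(stages []Stage) (Action, int) {
	for i, s := range stages {
		switch {
		case s.Status == "SUCCESS":
			continue // stage finished, look at the next one
		case s.Status == "FAILED":
			return Halt, i
		case s.Approved:
			return Promote, i
		default:
			return Wait, i
		}
	}
	return Done, len(stages)
}

func main() {
	stages := []Stage{{Status: "SUCCESS"}, {Status: "PENDING", Approved: true}}
	first, i := Decide(stages)
	second, j := Decide(stages) // a retry sees the same stored state
	fmt.Println(first, i, second, j)
}
```

Because the decision depends only on what is stored, a duplicate webhook or a restarted manager re-derives the same action instead of advancing the pipeline twice.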
The Releaser
The Releaser (internal/cd/releaser/) handles Release-specific operations that require Artifact Registry and GitHub API access:
| Function | Role |
|---|---|
| `CreateReleaseBranches()` | For each app in the release, creates a release/<appSet>/<bundle> branch in the source repository, pointing at the git tag of the selected semver. |
| `tagImagesAsRCForRelease()` | Tags each app's container image in Artifact Registry with an immutable RC tag (<appSet>-<bundle>-rc.<N>). Counts existing RC tags to determine the increment. |
| `tagImageForProduction()` | On pipeline completion, creates a clean production tag by stripping the -rc.N suffix. If the clean tag already exists, creates a hotfix variant. |
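The production-tag derivation described for tagImageForProduction() can be sketched as follows. This is an illustration only: the `ProductionTag` function is hypothetical, and the `-hotfix.N` naming of the fallback variant is an assumption, since the exact format of the hotfix variant is not specified here.

```go
package main

import (
	"fmt"
	"regexp"
)

// rcSuffix matches a trailing "-rc.N" release-candidate suffix.
var rcSuffix = regexp.MustCompile(`-rc\.\d+$`)

// ProductionTag strips the "-rc.N" suffix from rcTag. If the clean tag is
// already taken, it returns the first free "<clean>-hotfix.N" variant.
// ASSUMPTION: the "-hotfix.N" naming is illustrative, not confirmed.
func ProductionTag(rcTag string, existing map[string]bool) string {
	clean := rcSuffix.ReplaceAllString(rcTag, "")
	if !existing[clean] {
		return clean
	}
	for n := 1; ; n++ {
		candidate := fmt.Sprintf("%s-hotfix.%d", clean, n)
		if !existing[candidate] {
			return candidate
		}
	}
}

func main() {
	existing := map[string]bool{"infrastream-cloud-2026.11": true}
	fmt.Println(ProductionTag("infrastream-cloud-2026.11-rc.2", existing))
}
```

Never reusing an occupied tag name keeps the derivation compatible with the rule that existing tags are not moved or overwritten.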
Why tag AR before writing manifests? The RC tag must exist in Artifact Registry before it is referenced in a DeploymentConfig. If the Engine reconciles a manifest with a version that doesn't exist as a container tag, the deployment will fail with an image pull error. By tagging AR first, every version in a manifest is always resolvable.
The Engine Runner
The Engine (infrastream-runner) is a Cloud Run Job that executes the actual infrastructure reconciliation. It is triggered by a git push to the manifest repository and executes a deterministic lifecycle.
Scoped Execution: When triggered by a CD promotion, the engine only processes the manifests affected by the version change, minimizing execution time and blast radius. The scope is determined by the changed files in the git commit.
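As a sketch of how a scope could be derived from a commit's changed files, the function below filters a file list (for example, the output of `git diff --name-only`) down to YAML manifests. The function name and the example paths are hypothetical, and treating every changed YAML file as an in-scope manifest is a simplifying assumption.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
)

// ScopeFromChangedFiles returns the sorted, de-duplicated set of YAML files
// touched by a commit. ASSUMPTION: every changed YAML file is a manifest.
func ScopeFromChangedFiles(changed []string) []string {
	seen := map[string]bool{}
	var scope []string
	for _, f := range changed {
		ext := filepath.Ext(f)
		if ext != ".yaml" && ext != ".yml" {
			continue // not a manifest, ignore
		}
		if !seen[f] {
			seen[f] = true
			scope = append(scope, f)
		}
	}
	sort.Strings(scope)
	return scope
}

func main() {
	changed := []string{"apps/svc-portal/deploy.yaml", "README.md", "apps/svc-portal/deploy.yaml"}
	fmt.Println(ScopeFromChangedFiles(changed))
}
```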
The Token Broker
The Token Broker (internal/token/broker.go in svc-portal) provides short-lived GitHub tokens to CI workflows that need cross-repository access. This is critical for:
- Pulling private Go modules during builds
- Accessing shared workflow definitions in the managed workflows repository
- Reading private Docker base images from other organization repositories
Security model:
- The Token Broker validates the OIDC token from the GitHub Actions runner
- It verifies the requesting repository is authorized to access the target resources
- It issues a GitHub App Installation Token scoped to the specific repositories needed
- Tokens have a maximum lifetime of 1 hour and are never persisted
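The authorization and lifetime rules above can be sketched roughly as follows. This is not the broker's code: `Authorize`, `Grant`, and the allow-list shape are hypothetical, and the caller identity is assumed to come from an OIDC token that has already been verified (verification itself is omitted).

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// maxTokenLifetime mirrors the one-hour ceiling stated in the security model.
const maxTokenLifetime = time.Hour

// Grant is a hypothetical description of what the broker agrees to issue.
type Grant struct {
	Repositories []string
	Lifetime     time.Duration
}

// Authorize checks every requested repository against the caller's allow-list
// and caps the lifetime at one hour. Nothing is stored by this function.
func Authorize(caller string, requested []string, allow map[string][]string, ttl time.Duration) (Grant, error) {
	if len(requested) == 0 {
		return Grant{}, errors.New("no repositories requested")
	}
	allowed := map[string]bool{}
	for _, r := range allow[caller] {
		allowed[r] = true
	}
	for _, r := range requested {
		if !allowed[r] {
			return Grant{}, fmt.Errorf("%s is not authorized for %s", caller, r)
		}
	}
	if ttl <= 0 || ttl > maxTokenLifetime {
		ttl = maxTokenLifetime
	}
	return Grant{Repositories: requested, Lifetime: ttl}, nil
}

func main() {
	allow := map[string][]string{"org/svc-a": {"org/shared-lib"}}
	g, err := Authorize("org/svc-a", []string{"org/shared-lib"}, allow, 2*time.Hour)
	fmt.Println(g, err)
}
```

The grant lists only the repositories that were requested and approved, which matches the idea of scoping the installation token to the specific repositories needed.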
Data Model
Spanner Tables
All deployment state is stored in Google Cloud Spanner, providing strong consistency and global replication.
PreRelease Track
| Table | Primary Key | Purpose |
|---|---|---|
| `PreReleaseDeployments` | DeploymentID | Deployment metadata: version, status, stage index, entity ID (Application state ID), build definition reference |
| `PreReleaseStages` | DeploymentID, StageIndex | Per-stage execution records: commit SHA, run ID, start/end timestamps, status |
Release Track
| Table | Primary Key | Purpose |
|---|---|---|
| `ReleaseDeployments` | DeploymentID | Bundle metadata: version (bundle name), status, entity ID (ApplicationSet state ID), last deployed versions (JSON map) |
| `ReleaseStages` | DeploymentID, StageIndex | Per-stage execution records |
| `ReleaseBundleApps` | DeploymentID, AppName | Per-app version mapping within the bundle |
Shared Tables
| Table | Primary Key | Purpose |
|---|---|---|
| `DeploymentApprovals` | DeploymentID, ApprovalID | Immutable, append-only approval ledger. Records actor, action (APPROVE/REJECT), stage index, timestamp, and optional comment. |
| `DeploymentChangelogs` | DeploymentID, ChangelogID | Auto-generated changelogs from conventional commit history between versions |
Deployment Status Flow
Workflow Generation Architecture
The Computer Pipeline
The CI workflow generation is powered by the engine's Computer system — a phase that runs during the Compute stage of the engine lifecycle. The key components are:
- `GithubRepositoryComputer`: Reads the `strategy` field and computes a `BranchConfig` map. Each entry defines target patterns, release type, allowed merge methods, required reviewers, and status checks.
- `BuildDefinitionComputer`: Consumes the `BranchConfig` from the parent repository and the `BuildDefinition` spec. For each branch config, it creates a `BranchConfiguration` that the workflow builder uses to generate the appropriate trigger and versioning logic.
- `InternalWorkflowBuilder`: A typed builder interface. Each build type (Golang, Docker, Flutter, etc.) implements this interface. The builder generates:
  - Push workflows: Triggered on push to matching branches. Run the full pipeline, including versioning, build, test, containerize, and deploy notification.
  - Pull request workflows: Triggered on PRs targeting matching branches. Run analysis, test, and build stages only.
Builder Interface
Every build type implements the InternalWorkflowBuilder interface:
type InternalWorkflowBuilder interface {
	BranchConfigurations() map[string]*BranchConfiguration
	Pipeline(workflow *ManagedWorkflow)
	DownstreamPipeline(workflow *ManagedWorkflow)
	SupportsPullRequestWorkflow() bool
}
| Method | Purpose |
|---|---|
| `BranchConfigurations()` | Returns the set of branch configs this builder will generate workflows for. Filtered from the repository-level config based on what's relevant for the build type. |
| `Pipeline()` | Populates the managed workflow with the build-type-specific stages (test, build, containerize, etc.). |
| `DownstreamPipeline()` | For builds that trigger downstream workflows (e.g., a library publish that triggers consumer rebuilds). |
| `SupportsPullRequestWorkflow()` | Whether this build type generates PR-specific workflows (most do). |
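To make the interface concrete, here is a minimal sketch of what one build type's implementation might look like. The `BranchConfiguration` and `ManagedWorkflow` fields, the `golangBuilder` type, and the stage names are hypothetical stand-ins; the real types are not shown in this document.

```go
package main

import "fmt"

// BranchConfiguration and ManagedWorkflow are hypothetical stand-ins for the
// real types, reduced to the fields this sketch needs.
type BranchConfiguration struct {
	Pattern     string
	ReleaseType string
}

type ManagedWorkflow struct {
	Stages []string
}

type InternalWorkflowBuilder interface {
	BranchConfigurations() map[string]*BranchConfiguration
	Pipeline(workflow *ManagedWorkflow)
	DownstreamPipeline(workflow *ManagedWorkflow)
	SupportsPullRequestWorkflow() bool
}

// golangBuilder is an illustrative builder for a Go service.
type golangBuilder struct {
	branches map[string]*BranchConfiguration
}

func (b *golangBuilder) BranchConfigurations() map[string]*BranchConfiguration {
	return b.branches
}

// Pipeline appends the build-type-specific stages to the workflow.
func (b *golangBuilder) Pipeline(w *ManagedWorkflow) {
	w.Stages = append(w.Stages, "test", "build", "containerize")
}

// DownstreamPipeline is a no-op here: this build type triggers nothing downstream.
func (b *golangBuilder) DownstreamPipeline(w *ManagedWorkflow) {}

func (b *golangBuilder) SupportsPullRequestWorkflow() bool { return true }

// Compile-time check that golangBuilder satisfies the interface.
var _ InternalWorkflowBuilder = (*golangBuilder)(nil)

func main() {
	b := &golangBuilder{branches: map[string]*BranchConfiguration{
		"main": {Pattern: "main", ReleaseType: "LATEST"},
	}}
	w := &ManagedWorkflow{}
	b.Pipeline(w)
	fmt.Println(w.Stages, b.SupportsPullRequestWorkflow())
}
```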
Managed Workflow Structure
Each generated managed workflow contains these common components:
| Component | Purpose |
|---|---|
| Initialize | Checkout, validate conventional commits, compute version, determine skip flag |
| Stage Jobs | Build-type-specific jobs (test, build, publish, etc.) with matrix support |
| Tag | Create git tag on successful build (push workflows only) |
| Containerize | Multi-arch Docker build with build-args injection |
| Publish | Push to Artifact Registry, create BinAuthz attestation |
| Release | Create GitHub Release with artifacts and changelog |
| Notify | Publish PubSub notification to CD Manager |
Security Hardening
Generated managed workflows include multiple security layers:
- SHA-pinned actions: Every `uses:` reference is pinned to a full commit SHA, not a mutable tag
- Minimal permissions: Each job declares only the GitHub token permissions it needs
- Workload Identity Federation: GCP authentication via OIDC, no service account keys
- Read-only checkout: PR workflows use `persist-credentials: false`
- Isolated secrets: Build secrets are injected via the `TOKEN_BROKER` service, never stored in GitHub
Version Resolution Deep-Dive
The compute-version Script
The version computation logic runs in the Initialize stage of every push workflow. It is the single source of truth for determining what version a build produces.
Algorithm:
1. Find the merged PR that produced this push commit
2. Read PR labels:
- "release/major" → MAJOR bump
- "release/minor" → MINOR bump
- (none) → PATCH bump (default)
3. Find the latest non-pre-release semver tag reachable from HEAD
4. Apply the branch's release type:
- LATEST → bump(base_tag, level) → v2.5.0
- ALPHA → base_tag + "-alpha." + count → v2.4.12-alpha.3
- BETA → base_tag + "-beta." + count → v2.4.12-beta.0
- HOTFIX → base_tag + "-hotfix." + count → v2.4.12-hotfix.1
- NONE → skip (no tag)
5. Check if the tag already exists → if yes, set skip=true
6. Output: target_version, skip, release_type
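Steps 2 and 4 of the algorithm can be sketched in Go as below. This is an illustration, not the compute-version script itself: `ComputeVersion` is a hypothetical name, and the pre-release `count` is passed in by the caller because how it is derived is not specified above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ComputeVersion applies a bump level and release type to a base tag of the
// form "vMAJOR.MINOR.PATCH". An empty result means "skip, no tag" (NONE).
func ComputeVersion(baseTag, level, releaseType string, count int) (string, error) {
	parts := strings.Split(strings.TrimPrefix(baseTag, "v"), ".")
	if len(parts) != 3 {
		return "", fmt.Errorf("invalid base tag %q", baseTag)
	}
	var n [3]int
	for i, p := range parts {
		v, err := strconv.Atoi(p)
		if err != nil || v < 0 {
			return "", fmt.Errorf("invalid base tag %q", baseTag)
		}
		n[i] = v
	}
	switch releaseType {
	case "LATEST":
		switch level {
		case "major":
			n = [3]int{n[0] + 1, 0, 0}
		case "minor":
			n = [3]int{n[0], n[1] + 1, 0}
		default: // no label means a PATCH bump
			n[2]++
		}
		return fmt.Sprintf("v%d.%d.%d", n[0], n[1], n[2]), nil
	case "ALPHA", "BETA", "HOTFIX":
		suffix := strings.ToLower(releaseType)
		return fmt.Sprintf("v%d.%d.%d-%s.%d", n[0], n[1], n[2], suffix, count), nil
	case "NONE":
		return "", nil
	}
	return "", fmt.Errorf("unknown release type %q", releaseType)
}

func main() {
	for _, rt := range []string{"LATEST", "ALPHA", "HOTFIX"} {
		v, err := ComputeVersion("v2.4.12", "minor", rt, 1)
		fmt.Println(rt, v, err)
	}
}
```

Step 5 (checking whether the resulting tag already exists) would sit on top of this function and set the skip flag when it does.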
Release Candidate Resolution
When a Release deployment is promoted, the system resolves per-app RC versions:
1. Parse dep.LastDeployedVersions → {"svc-portal": "v2.4.13", "svc-cloud-portal": "v0.26.2"}
2. For each app:
a. Resolve BuildDefinition and container name
b. Derive release tag base: "<appSet>-<bundle>" → "infrastream-cloud-2026.11"
c. Query AR for existing RC tags matching "infrastream-cloud-2026.11-rc.*"
d. Increment: count(existing) → rcCount
e. Create tag: "infrastream-cloud-2026.11-rc.<rcCount>"
f. Point tag to same image digest as the semver tag
3. Store computed RC tag on dep.Version
4. Write RC tag to spec.version in each DeploymentConfig
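Steps 2b through 2e amount to a count-based tag derivation, sketched below. `NextRCTag` is a hypothetical helper; the list of existing tags is assumed to have been fetched from Artifact Registry beforehand, which is not shown.

```go
package main

import (
	"fmt"
	"strings"
)

// NextRCTag derives "<appSet>-<bundle>-rc.<N>", where N is the number of RC
// tags that already exist for the same base. The first candidate is rc.0.
func NextRCTag(appSet, bundle string, existingTags []string) string {
	prefix := appSet + "-" + bundle + "-rc."
	count := 0
	for _, t := range existingTags {
		if strings.HasPrefix(t, prefix) {
			count++
		}
	}
	return fmt.Sprintf("%s%d", prefix, count)
}

func main() {
	existing := []string{"v2.4.13", "infrastream-cloud-2026.11-rc.0", "infrastream-cloud-2026.11-rc.1"}
	fmt.Println(NextRCTag("infrastream-cloud", "2026.11", existing))
}
```

Counting works as an increment only because RC tags are never deleted; if tags could be removed, the count could collide with an existing tag.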
Tag Immutability Guarantee
All tags in Artifact Registry are immutable — they are never moved, overwritten, or deleted. This provides:
- Auditability: Complete history of every promotion attempt (`rc.0`, `rc.1`, `rc.2`, ...)
- Rollback safety: Any previous version can be re-deployed by referencing its tag
- Reproducibility: The same tag always resolves to the same image digest
Approval System Architecture
Gate Evaluation
The ProcessPipeline() function evaluates stage gates before each promotion. Gates are defined in the ReleaseTrack manifest and can include:
- Auto-approve: The stage proceeds immediately (typical for pre-release environments)
- Manual approval: Requires explicit sign-off from an authorized actor
- Policy-based: Evaluates conditions (e.g., all PreRelease stages must be `SUCCESS` before Release initiation)
Approval Ledger
Approvals are stored in the DeploymentApprovals table — an immutable, append-only ledger:
| Field | Type | Description |
|---|---|---|
| `DeploymentID` | STRING | Links to the PreRelease or Release deployment |
| `ApprovalID` | STRING | Unique identifier for the approval action |
| `StageIndex` | INT64 | Which pipeline stage the approval applies to |
| `Actor` | STRING | The identity of the approver (email or service account) |
| `Action` | STRING | APPROVE or REJECT |
| `Timestamp` | TIMESTAMP | Server-side commit timestamp |
| `Comment` | STRING | Optional comment from the approver |
Approvals cannot be modified or deleted. A REJECT action halts the pipeline. To resume, a new approval with APPROVE must be added.
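One way to read an append-only ledger is to let the most recent entry for a stage decide whether its gate is open. The sketch below assumes exactly that: the `Approval` struct mirrors the table's fields, while `GateOpen` and the "last entry wins" rule are assumptions for illustration, with the slice taken to be in append order.

```go
package main

import (
	"fmt"
	"time"
)

// Approval mirrors the DeploymentApprovals fields listed in the table.
type Approval struct {
	DeploymentID string
	ApprovalID   string
	StageIndex   int64
	Actor        string
	Action       string // "APPROVE" or "REJECT"
	Timestamp    time.Time
	Comment      string
}

// GateOpen reports whether the latest ledger entry for the given deployment
// and stage is an APPROVE. ASSUMPTION: ledger is in append order, so the
// last matching entry is the most recent one.
func GateOpen(ledger []Approval, deploymentID string, stage int64) bool {
	open := false
	for _, a := range ledger {
		if a.DeploymentID == deploymentID && a.StageIndex == stage {
			open = a.Action == "APPROVE"
		}
	}
	return open
}

func main() {
	ledger := []Approval{
		{DeploymentID: "dep-1", ApprovalID: "a1", StageIndex: 2, Action: "REJECT", Timestamp: time.Now()},
		{DeploymentID: "dep-1", ApprovalID: "a2", StageIndex: 2, Action: "APPROVE", Timestamp: time.Now()},
	}
	fmt.Println(GateOpen(ledger, "dep-1", 2))
}
```

Under this reading, a rejection is never erased: resuming the pipeline adds a later APPROVE row, and both rows remain in the audit trail.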
Build-Time Secrets Management
Secrets Injection Flow
Secret Types
| Secret | Source | Scope | Lifetime |
|---|---|---|---|
| GCP credentials | Workload Identity Federation | Per-job OIDC token | ~1 hour |
| GitHub tokens | Token Broker | Scoped to specific repositories | ~1 hour |
| Registry credentials | Derived from GCP WIF | Push access to Artifact Registry | Per-step |
| KMS signing keys | GCP IAM | BinAuthz attestation signing | Per-operation |
| Build secrets | GitHub Actions secrets (org-level) | Injected as env vars | Per-job |
Cross-Registry Authentication
For builds that need to pull base images from private registries or push to multiple registries, the workflow uses a multi-step authentication flow:
- WIF authentication: Obtain GCP credentials via Workload Identity Federation
- Docker login: Authenticate to Artifact Registry using the WIF token
- Registry enumeration: The `BuildDefinition` specifies which registries the container should be pushed to
- Multi-push: The same image is pushed to all configured registries in parallel
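The multi-push step is a fan-out over registries. The generated workflows do this with workflow steps rather than Go, so the following is only a sketch of the pattern, using a hypothetical `PushFunc` hook in place of a real registry client and placeholder registry names.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrPushFailed is a hypothetical sentinel used by the example push hook.
var ErrPushFailed = errors.New("push failed")

// PushFunc is a hypothetical hook that pushes one image to one registry.
type PushFunc func(registry, image string) error

// PushAll pushes the same image to every registry concurrently and returns
// the combined errors (nil when every push succeeded).
func PushAll(registries []string, image string, push PushFunc) error {
	errs := make([]error, len(registries))
	var wg sync.WaitGroup
	for i, r := range registries {
		wg.Add(1)
		go func(i int, r string) {
			defer wg.Done()
			if err := push(r, image); err != nil {
				errs[i] = fmt.Errorf("push to %s: %w", r, err)
			}
		}(i, r)
	}
	wg.Wait()
	return errors.Join(errs...) // requires Go 1.20 or later
}

func main() {
	push := func(registry, image string) error {
		if registry == "registry-b.example" {
			return ErrPushFailed
		}
		return nil
	}
	err := PushAll([]string{"registry-a.example", "registry-b.example"}, "svc-portal:v2.4.13", push)
	fmt.Println(err)
}
```

Each goroutine writes only to its own slot in `errs`, so no mutex is needed, and one failing registry does not prevent the others from being attempted.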
Container Metadata & Attestation
Image Labeling
Every container image built by the pipeline includes standardized OCI labels:
| Label | Value | Example |
|---|---|---|
| `org.opencontainers.image.source` | Repository URL | https://github.com/pvotal-tech/svc-portal |
| `org.opencontainers.image.version` | Semver tag | v2.4.13 |
| `org.opencontainers.image.revision` | Git commit SHA | abc123def456 |
| `org.opencontainers.image.created` | Build timestamp | 2026-05-12T14:00:00Z |
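As a small illustration, the label set in the table can be assembled like this. `ImageLabels` and `exampleBuildTime` are hypothetical names; the label keys and example values come from the table, and formatting the timestamp as RFC 3339 in UTC is inferred from the example value shown.

```go
package main

import (
	"fmt"
	"time"
)

// exampleBuildTime matches the example timestamp shown in the table.
var exampleBuildTime = time.Date(2026, time.May, 12, 14, 0, 0, 0, time.UTC)

// ImageLabels builds the standardized OCI label set for one image.
func ImageLabels(repoURL, version, revision string, created time.Time) map[string]string {
	return map[string]string{
		"org.opencontainers.image.source":   repoURL,
		"org.opencontainers.image.version":  version,
		"org.opencontainers.image.revision": revision,
		"org.opencontainers.image.created":  created.UTC().Format(time.RFC3339),
	}
}

func main() {
	labels := ImageLabels("https://github.com/pvotal-tech/svc-portal", "v2.4.13", "abc123def456", exampleBuildTime)
	for k, v := range labels {
		fmt.Printf("%s=%s\n", k, v)
	}
}
```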
Binary Authorization
After a container is pushed, the workflow creates a Binary Authorization attestation:
- The attestor is provisioned by the engine as part of the `BuildDefinition` executor
- The attestation is signed using a Cloud KMS key
- Target GCP projects enforce a Binary Authorization policy that requires a valid attestation
- Containers without attestation are rejected at admission time (deploy-time, not build-time)
This ensures that only containers built through the official pipeline can run in production environments.
Error Handling & Recovery
Pipeline Failures
| Failure Point | Behavior | Recovery |
|---|---|---|
| CI build failure | No PubSub notification sent | Fix code, push again |
| PubSub delivery failure | Message is retried (at-least-once) | Idempotent HandlePush() |
| Manifest commit failure | Deployment stays in PENDING | Retry via Portal UI |
| Engine reconciliation failure | Status webhook reports FAILED | Fix manifest, re-trigger |
| AR tagging failure | Release initiation fails | Retry release initiation |
Idempotency Guarantees
- `HandlePush()`: Deduplicates by `buildDefinitionId` + `container` + `version`. The same notification processed twice produces the same deployment.
- `executePromotion()`: Checks whether the target manifest already has the desired version before committing. No-op if already correct.
- `tagImagesAsRCForRelease()`: Counts existing RC tags before creating a new one. If the exact RC tag already exists, it is not recreated.
- Engine execution: The Plan phase produces `NO_ACTION` for resources that are already in the desired state.
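The deduplication behavior described for HandlePush() can be sketched with an in-memory map standing in for Spanner. The `Store` type, the `dep-N` ID format, and the method signature are hypothetical; only the dedup key (build definition, container, version) comes from the text above.

```go
package main

import "fmt"

// PushNotification carries the three fields that form the dedup key.
type PushNotification struct {
	BuildDefinitionID string
	Container         string
	Version           string
}

// Store is a hypothetical in-memory stand-in for the deployment tables.
// It is not safe for concurrent use; a real store would rely on transactions.
type Store struct {
	deployments map[PushNotification]string
	next        int
}

func NewStore() *Store {
	return &Store{deployments: map[PushNotification]string{}}
}

// HandlePush returns the deployment ID for a notification, creating a new
// deployment only the first time a given key is seen.
func (s *Store) HandlePush(n PushNotification) (id string, created bool) {
	if existing, ok := s.deployments[n]; ok {
		return existing, false // redelivery: reuse the existing deployment
	}
	s.next++
	id = fmt.Sprintf("dep-%d", s.next)
	s.deployments[n] = id
	return id, true
}

func main() {
	s := NewStore()
	n := PushNotification{BuildDefinitionID: "bd-1", Container: "svc-portal", Version: "v2.4.13"}
	fmt.Println(s.HandlePush(n))
	fmt.Println(s.HandlePush(n)) // at-least-once redelivery of the same message
}
```

This is what makes at-least-once PubSub delivery safe: a redelivered message maps to the deployment that already exists instead of starting a second pipeline.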