Blue-Green Deployments with GitHub Actions on Kubernetes
Blue-green deployment is one of the simplest ways to reduce release risk in Kubernetes. Instead of replacing pods in place, you run two environments, blue and green, with only one receiving live traffic. Your deployment pipeline ships the new version to the inactive color, verifies it, and then flips a Kubernetes Service selector. If something breaks, rollback is just another selector change.
This tutorial shows a practical setup using GitHub Actions and Kubernetes. The example assumes a single application and one production Service, but the pattern works just as well behind an ingress or service mesh. Lint your YAML before committing it; indentation mistakes are one of the most common causes of broken CI runs.
Key Takeaway
The safest implementation is to treat blue and green as separate Deployments with identical labels except for color. Your production Service always targets one color at a time, which makes promotion and rollback operationally trivial.
Prerequisites
- A Kubernetes cluster and a namespace for the application.
- A container image already published to a registry such as GHCR or Docker Hub.
- A GitHub repository with Actions enabled.
- kubectl access from CI, usually via a stored kubeconfig or cloud-auth step.
- An app that exposes a readiness endpoint so Kubernetes can verify healthy pods before traffic shifts.
In this walkthrough, the application name is demo-app, the namespace is production, and the image is published as ghcr.io/acme/demo-app.
Implementation Steps
1. Create separate blue and green Deployments
Start with two Deployment manifests that are operationally identical except for the color label and the resource name. Only one color will receive traffic at a time, but both can exist in the cluster. That gives you a warm standby during release.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app-blue
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo-app
      color: blue
  template:
    metadata:
      labels:
        app: demo-app
        color: blue
    spec:
      containers:
        - name: demo-app
          image: ghcr.io/acme/demo-app:stable
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
Create the green version with the same structure, changing demo-app-blue to demo-app-green and color: blue to color: green. You can keep both manifests in k8s/ and template only the image tag at deploy time.
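If you prefer not to hand-maintain two near-identical manifests, the green one can be derived from the blue one at deploy time. The sketch below uses plain sed; the k8s/ paths and the IMAGE_TAG variable are illustrative assumptions, and the stub input file exists only to make the example self-contained.

```shell
# Derive the green manifest from the blue one with simple text substitution.
IMAGE_TAG="v1.2.3"   # in CI this would come from the pipeline, e.g. the commit SHA
mkdir -p k8s
# Stub standing in for the real k8s/deployment-blue.yaml from step 1.
printf 'name: demo-app-blue\ncolor: blue\nimage: ghcr.io/acme/demo-app:stable\n' \
  > k8s/deployment-blue.yaml
sed -e 's/demo-app-blue/demo-app-green/' \
    -e 's/color: blue/color: green/' \
    -e "s/:stable/:${IMAGE_TAG}/" \
    k8s/deployment-blue.yaml > k8s/deployment-green.yaml
cat k8s/deployment-green.yaml
```

For real manifests, a templating tool such as kustomize or Helm is more robust than sed, but for a two-color setup this level of substitution is often enough.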
2. Add a stable Service that points to one color
The production Service should select the active color. At launch, point it to blue. Promotion later becomes a patch operation, not a rebuild of the Service object.
apiVersion: v1
kind: Service
metadata:
  name: demo-app
  namespace: production
spec:
  selector:
    app: demo-app
    color: blue
  ports:
    - port: 80
      targetPort: 8080
This is the core of the pattern: clients keep using the same DNS name and the same Service, while the selector controls which Deployment receives traffic.
3. Decide the inactive color during deployment
Your workflow needs to inspect the Service and determine which color is currently live. If the Service points to blue, the inactive target is green. If it points to green, deploy to blue. That logic can be implemented in a small shell step.
ACTIVE_COLOR=$(kubectl -n production get svc demo-app -o jsonpath='{.spec.selector.color}')
if [ "$ACTIVE_COLOR" = "blue" ]; then
  TARGET_COLOR="green"
else
  TARGET_COLOR="blue"
fi
echo "ACTIVE_COLOR=$ACTIVE_COLOR" >> $GITHUB_ENV
echo "TARGET_COLOR=$TARGET_COLOR" >> $GITHUB_ENV
By reading live cluster state, the pipeline becomes stateless. You do not need a separate release flag in GitHub or a manually updated environment variable.
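One edge case the step above does not cover: on the very first deploy, the Service may not exist yet or may lack a color selector, so ACTIVE_COLOR comes back empty. A small sketch of the same flip logic as a function, with an explicit bootstrap default:

```shell
# Resolve the color to deploy to, given the currently active color.
# Mirrors the if/else above, plus a default for the bootstrap case.
resolve_target_color() {
  active="$1"  # value of .spec.selector.color; may be empty on first deploy
  case "$active" in
    blue)  echo "green" ;;
    green) echo "blue" ;;
    *)     echo "blue" ;;   # no Service or no color yet: start with blue
  esac
}

resolve_target_color blue    # prints: green
resolve_target_color ""      # prints: blue
```

The empty-input branch behaves the same as the original if/else, but making it explicit documents the bootstrap path for the next reader.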
4. Build the GitHub Actions workflow
The workflow below checks out code, authenticates to Kubernetes, resolves the target color, updates the image for the inactive Deployment, waits for readiness, and switches the Service selector. It then annotates the Service so the last promotion is visible in cluster metadata.
name: deploy-blue-green

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up kubectl
        uses: azure/setup-kubectl@v4
      - name: Write kubeconfig
        run: |
          mkdir -p $HOME/.kube
          echo "${{ secrets.KUBECONFIG }}" > $HOME/.kube/config
          chmod 600 $HOME/.kube/config
      - name: Resolve active and target colors
        run: |
          ACTIVE_COLOR=$(kubectl -n production get svc demo-app -o jsonpath='{.spec.selector.color}')
          if [ "$ACTIVE_COLOR" = "blue" ]; then
            TARGET_COLOR="green"
          else
            TARGET_COLOR="blue"
          fi
          echo "ACTIVE_COLOR=$ACTIVE_COLOR" >> $GITHUB_ENV
          echo "TARGET_COLOR=$TARGET_COLOR" >> $GITHUB_ENV
      - name: Deploy new image to inactive color
        run: |
          kubectl -n production set image deployment/demo-app-${TARGET_COLOR} \
            demo-app=ghcr.io/acme/demo-app:${{ github.sha }}
          kubectl -n production rollout status deployment/demo-app-${TARGET_COLOR} --timeout=180s
      - name: Switch production Service
        run: |
          kubectl -n production patch svc demo-app \
            -p '{"spec":{"selector":{"app":"demo-app","color":"'"${TARGET_COLOR}"'"}}}'
          kubectl -n production annotate svc demo-app \
            deployed-sha=${{ github.sha }} promoted-color=${TARGET_COLOR} --overwrite
If you deploy to a managed cluster such as EKS, GKE, or AKS, replace the kubeconfig step with the platform's official auth action. The deployment logic stays the same.
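For example, on EKS the kubeconfig-from-secret step could be replaced with short-lived OIDC credentials. The snippet below is a sketch: the role ARN secret name, region, and cluster name are placeholders for your own values.

```yaml
- name: Configure AWS credentials
  uses: aws-actions/configure-aws-credentials@v4
  with:
    role-to-assume: ${{ secrets.AWS_DEPLOY_ROLE_ARN }}   # placeholder secret name
    aws-region: us-east-1                                # your cluster's region
- name: Update kubeconfig for EKS
  run: aws eks update-kubeconfig --region us-east-1 --name my-cluster  # placeholder cluster name
```

Note that OIDC-based auth also requires `id-token: write` in the job's permissions block.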
5. Add a manual approval or smoke-test gate
For higher-risk systems, do not switch traffic immediately after pods become ready. Insert a smoke-test step against the inactive Deployment first, or use a protected GitHub environment for approval. A simple in-cluster check can hit the pod IPs or a preview Service.
kubectl -n production expose deployment demo-app-${TARGET_COLOR} \
  --name=demo-app-preview --port=8080 --target-port=8080 --dry-run=client -o yaml | kubectl apply -f -
PREVIEW_IP=$(kubectl -n production get svc demo-app-preview -o jsonpath='{.spec.clusterIP}')
# curl -fsS makes the step fail on HTTP error responses instead of always exiting 0
kubectl -n production run curl --rm -i --restart=Never \
  --image=curlimages/curl:8.7.1 -- \
  -fsS http://${PREVIEW_IP}:8080/healthz
# Remove the preview Service once the check has served its purpose
kubectl -n production delete svc demo-app-preview --ignore-not-found
Once that check passes, switch the production Service. This extra gate is often enough to catch bad config, missing migrations, or broken startup dependencies before customers see them.
Verification and Expected Output
After the workflow runs, verify three things: the target Deployment is healthy, the Service points to the new color, and the application responds correctly.
kubectl -n production get deploy -l app=demo-app
kubectl -n production get svc demo-app -o yaml
kubectl -n production get endpoints demo-app
kubectl -n production rollout status deployment/demo-app-green
Expected signals:
- The inactive Deployment updates to the new image tag and rollout status reports "successfully rolled out".
- The Service selector changes from color: blue to color: green, or the reverse.
- The Endpoints object contains only pods from the promoted color.
- User traffic continues without a visible interruption because the Service name never changes.
If you need to roll back, patch the Service back to the previous color:
kubectl -n production patch svc demo-app \
  -p '{"spec":{"selector":{"app":"demo-app","color":"blue"}}}'
That rollback path is why blue-green deployment remains attractive for teams that want low operational complexity without a full progressive delivery stack.
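Rollback can also be wrapped in its own manually triggered workflow, so anyone on call can flip traffic from the GitHub UI. A sketch, assuming the same cluster auth steps as the deploy workflow:

```yaml
name: rollback-blue-green
on:
  workflow_dispatch:
    inputs:
      color:
        description: "Color to route traffic back to"
        required: true
        type: choice
        options: [blue, green]
jobs:
  rollback:
    runs-on: ubuntu-latest
    steps:
      # Authenticate to the cluster here, exactly as in the deploy workflow.
      - name: Switch Service selector
        run: |
          kubectl -n production patch svc demo-app \
            -p '{"spec":{"selector":{"app":"demo-app","color":"'"${{ inputs.color }}"'"}}}'
```

Pairing this with a protected GitHub environment keeps the rollback path audited without slowing it down.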
Troubleshooting Top 3
1. Traffic switched, but requests fail immediately
The usual cause is weak readiness checks. If Kubernetes marks pods ready before the app can serve real traffic, the Service starts routing too early. Tighten the readinessProbe, verify startup dependencies, and ensure your health endpoint exercises critical paths rather than returning a static 200.
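As a sketch, a tighter probe configuration might look like the following; the thresholds are illustrative and depend on your app's startup profile:

```yaml
readinessProbe:
  httpGet:
    path: /healthz   # should exercise real dependencies, not return a static 200
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 3   # drop the pod from endpoints after 3 consecutive failures
startupProbe:           # gives slow-starting apps time without loosening readiness
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30
  periodSeconds: 5
```

A startupProbe lets you keep the readinessProbe strict for steady-state traffic while still tolerating a slow cold start.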
2. The workflow cannot talk to the cluster
This is typically an authentication problem inside GitHub Actions. Confirm that the kubeconfig secret is valid, the CI identity still has namespace access, and the cluster endpoint is reachable from GitHub-hosted runners. For cloud-native auth, prefer short-lived credentials over long-lived static kubeconfigs.
3. Old and new versions behave differently behind one Service
This is often a compatibility issue, not a Kubernetes issue. If blue and green depend on different schema versions, shared cache keys, or incompatible session formats, traffic switching becomes risky. Keep contracts backward compatible, or split the release into database-first and application-second phases. If logs contain sensitive payloads during debugging, mask or sanitize them before sharing with teammates.
What's Next
Once basic blue-green deployments are stable, extend the pattern rather than replacing it too early. Add automatic smoke tests, promotion approvals, deployment notifications, and metrics checks around error rate and latency. If you need gradual exposure instead of an instant switch, the next step is canary deployment with an ingress controller or service mesh.
The important design principle is to keep the production switch small and reversible. In Kubernetes, that usually means changing routing metadata, not rebuilding infrastructure. With GitHub Actions handling the orchestration and a Service selector controlling traffic, you get a release flow that is easy to reason about, quick to verify, and fast to roll back under pressure.