GitHub-hosted runners cost $0.008 per minute (Linux, 2 vCPU / 7 GB). If your team runs 5,000 minutes of CI per day, that’s $1,200/month. For a startup with a moderately active engineering team, the CI bill can be one of the largest single cloud cost line items.

The standard alternative — persistent self-hosted runners — solves the cost problem but introduces uptime management: servers that are always on, patching to handle, capacity to predict. On bursty workloads (PR-heavy development, release days), they’re either under-provisioned (queued builds) or over-provisioned (idle machines you’re paying for).

jit-runners is our answer: GitHub Actions runners that provision on demand when a workflow starts, use EC2 Spot for 60–80% cost reduction versus on-demand, and terminate when the job finishes.

The architecture

GitHub Actions workflow queued
        │
        ▼
GitHub webhook (workflow_job: queued)
        │
        ▼
API Gateway → Lambda (Go)
        │
        ├── EC2 RunInstances (Spot, user data: GitHub runner install + register)
        │
        ▼
EC2 instance comes online
        │
        ├── Registers with GitHub as ephemeral runner
        │
        ▼
GitHub assigns job to runner
        │
        ├── Job executes
        │
        ▼
workflow_job: completed webhook
        │
        └── Lambda terminates instance (or instance self-terminates on idle)

Three components:

  1. Lambda function (Go) — receives webhook events, provisions runners, handles termination
  2. EC2 Spot instances — ephemeral runners with JIT registration tokens
  3. GitHub App — handles webhook delivery and runner registration tokens

Why Lambda + Go

Lambda is the right compute for this use case:

  • Event-driven — webhook fires, Lambda runs, done. No polling, no scheduler.
  • Cost — Lambda invocations for webhook handling are essentially free (well within the free tier)
  • Scale — 1,000 concurrent PRs? Lambda scales horizontally without configuration

Go is the right language for this Lambda:

  • Cold start: compiled Go binaries typically cold-start in tens of milliseconds on Lambda. Python and Node.js work fine too, but Go starts consistently fast.
  • Single binary — no runtime dependencies, trivial deployment (GOARCH=amd64 GOOS=linux go build)
  • AWS SDK v2 — the official Go SDK is well-maintained and performant

The Lambda function is small: parse webhook event, call EC2 RunInstances with a user data script, handle errors. The whole thing fits comfortably in a single file.
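
As a rough illustration of the "parse webhook event" step, here is a minimal sketch that uses only the standard library. The `decide` function and `decision` type are hypothetical names invented for this example, not the project's actual code. The sketch assumes jobs are routed by the "spot" label and that runners are named "spot-<instance-id>", as in the user data script later in this post.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// workflowJobEvent holds the subset of GitHub's workflow_job webhook payload
// that this sketch needs.
type workflowJobEvent struct {
	Action      string `json:"action"`
	WorkflowJob struct {
		ID         int64    `json:"id"`
		Labels     []string `json:"labels"`
		RunnerName string   `json:"runner_name"`
	} `json:"workflow_job"`
}

// decision is a hypothetical result type: what the handler should do next.
type decision struct {
	Op         string // "provision", "terminate", or "ignore"
	InstanceID string // set only when Op == "terminate"
}

// decide parses a raw webhook body and picks an operation.
// wantLabel is the label that routes a job to Spot runners.
func decide(body []byte, wantLabel string) (decision, error) {
	var ev workflowJobEvent
	if err := json.Unmarshal(body, &ev); err != nil {
		return decision{}, fmt.Errorf("parse workflow_job event: %w", err)
	}
	switch ev.Action {
	case "queued":
		for _, l := range ev.WorkflowJob.Labels {
			if l == wantLabel {
				return decision{Op: "provision"}, nil
			}
		}
	case "completed":
		// Assumption: runners are named "spot-<instance-id>", so the
		// instance to terminate can be recovered from the runner name.
		if id, ok := strings.CutPrefix(ev.WorkflowJob.RunnerName, "spot-"); ok && id != "" {
			return decision{Op: "terminate", InstanceID: id}, nil
		}
	}
	return decision{Op: "ignore"}, nil
}

func main() {
	body := []byte(`{"action":"queued","workflow_job":{"id":1,"labels":["self-hosted","linux","spot"]}}`)
	d, err := decide(body, "spot")
	fmt.Println(d.Op, err)
}
```

A real handler would then call EC2 RunInstances or TerminateInstances based on the returned operation. Keeping the decision separate from the AWS calls makes the routing logic easy to unit test.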

EC2 Spot for runners

Spot instances offer unused EC2 capacity at a 60–80% discount versus on-demand. The risk: AWS can reclaim that capacity with two minutes’ notice.

For CI runners, this risk is manageable:

  • Spot interruption = job failure = job re-run. GitHub Actions does not retry failed jobs automatically, so an interrupted job has to be re-run manually or by automation (see the section on handling Spot interruptions below)
  • Short job duration — most CI jobs finish in 5–15 minutes. Spot interruption probability over that window is low.
  • Diversified instance types — using a Spot Fleet or instance type diversification reduces interruption frequency

Our default configuration: c5.2xlarge (8 vCPU / 16 GB) as primary, c5a.2xlarge as fallback. Average instance cost: ~$0.10/hour on Spot versus ~$0.34/hour on-demand, in line with the ~70% discount used in the cost comparison.
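
The primary/fallback ordering can be sketched as a loop over instance types. In this minimal example, `launchFunc` is a hypothetical stand-in for the real EC2 RunInstances call, and the fake launcher in `main` exists only to show the fallback path. None of these names come from the project itself.

```go
package main

import (
	"errors"
	"fmt"
)

// launchFunc is a hypothetical stand-in for an EC2 RunInstances call that
// requests a Spot instance of the given type and returns its instance ID.
type launchFunc func(instanceType string) (instanceID string, err error)

// launchWithFallback tries each instance type in order and returns the first
// instance that launches. Errors from failed attempts are joined for logging.
func launchWithFallback(types []string, launch launchFunc) (string, error) {
	if len(types) == 0 {
		return "", errors.New("no instance types configured")
	}
	var errs []error
	for _, t := range types {
		id, err := launch(t)
		if err == nil {
			return id, nil
		}
		errs = append(errs, fmt.Errorf("%s: %w", t, err))
	}
	return "", errors.Join(errs...)
}

func main() {
	// Fake launcher for illustration: pretend the primary type has no Spot capacity.
	fake := func(instanceType string) (string, error) {
		if instanceType == "c5.2xlarge" {
			return "", errors.New("InsufficientInstanceCapacity")
		}
		return "i-0123456789abcdef0", nil
	}
	id, err := launchWithFallback([]string{"c5.2xlarge", "c5a.2xlarge"}, fake)
	fmt.Println(id, err)
}
```

An EC2 Fleet or launch template with multiple instance types could do the same thing on the AWS side. The loop is simply the easiest form to reason about inside a small Lambda.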

The user data script

When the EC2 instance starts, user data handles runner setup:

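#!/bin/bash
set -euo pipefail

# Pin a runner release: asset names embed the version, so a wildcard URL will not resolve.
RUNNER_VERSION="2.319.1"  # example value; use a current release

# config.sh refuses to run as root (the user data default) unless this is set.
export RUNNER_ALLOW_RUNASROOT=1

# Install runner
mkdir -p /actions-runner && cd /actions-runner
curl -fsSL "https://github.com/actions/runner/releases/download/v${RUNNER_VERSION}/actions-runner-linux-x64-${RUNNER_VERSION}.tar.gz" | tar -xz

# ec2-metadata prints "instance-id: i-..."; keep only the ID
INSTANCE_ID="$(ec2-metadata --instance-id | cut -d' ' -f2)"

# Register as ephemeral runner
./config.sh \
  --url "https://github.com/ORG_NAME" \
  --token "REGISTRATION_TOKEN" \
  --name "spot-${INSTANCE_ID}" \
  --labels "self-hosted,linux,spot" \
  --ephemeral \
  --unattended

# Run (exits after one job when --ephemeral)
./run.sh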

The --ephemeral flag is key: the runner deregisters automatically after completing one job. No cleanup needed, no state to manage between runs.

The registration token (REGISTRATION_TOKEN) is fetched by the Lambda function via the GitHub API just before calling RunInstances, then injected into user data. Tokens expire after one hour, which leaves ample time for a newly provisioned instance to boot and register.
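
The injection step described above could look roughly like this on the Lambda side. It is a minimal sketch using `text/template`. The template is a shortened version of the script (the install step is omitted), and `runnerParams`, `renderScript`, and `encodeUserData` are hypothetical names for illustration. The base64 step reflects the fact that the EC2 RunInstances API expects user data to be base64-encoded.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"errors"
	"fmt"
	"text/template"
)

// userDataTmpl is a shortened version of the user data script; the runner
// install step is omitted to keep the example small.
var userDataTmpl = template.Must(template.New("userdata").Parse(`#!/bin/bash
set -euo pipefail
cd /actions-runner
./config.sh --url "https://github.com/{{.Org}}" --token "{{.Token}}" --labels "{{.Labels}}" --ephemeral --unattended
./run.sh
`))

// runnerParams holds the values substituted into the script (hypothetical type).
type runnerParams struct {
	Org    string
	Token  string
	Labels string
}

// renderScript fills the template with the organization, token, and labels.
func renderScript(p runnerParams) (string, error) {
	if p.Org == "" || p.Token == "" {
		return "", errors.New("org and token are required")
	}
	var buf bytes.Buffer
	if err := userDataTmpl.Execute(&buf, p); err != nil {
		return "", fmt.Errorf("render user data: %w", err)
	}
	return buf.String(), nil
}

// encodeUserData base64-encodes the script for the RunInstances UserData field.
func encodeUserData(script string) string {
	return base64.StdEncoding.EncodeToString([]byte(script))
}

func main() {
	script, err := renderScript(runnerParams{
		Org:    "example-org",
		Token:  "EXAMPLE_TOKEN",
		Labels: "self-hosted,linux,spot",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(encodeUserData(script))
}
```

Because the token ends up in instance user data, it is worth treating user data as sensitive and relying on the token's short lifetime rather than on secrecy alone.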

Handling Spot interruptions gracefully

GitHub Actions doesn’t automatically retry jobs on runner failure. To handle Spot interruptions:

  1. Termination notice poller: user data starts a background process that polls the EC2 metadata endpoint for termination notices. When a notice arrives, it sends SIGTERM to the runner process.
  2. Runner cancels job: on SIGTERM the runner stops and the job is marked as canceled, which leaves it eligible for a re-run.
  3. Lambda re-provisions: when the canceled job is re-run (manually or through automation that calls GitHub’s re-run API), a new workflow_job: queued event fires and the Lambda provisions a fresh instance.

This adds ~2 minutes of latency on interruption (termination notice → cancel → new instance provision → runner online). For most CI workloads this is acceptable; for time-critical jobs, use on-demand instances.
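
The termination notice poller from step 1 can be sketched in Go as follows. This is an illustrative version, not the project's sidecar. It assumes IMDSv2 (a session token is requested first), a 5-second poll interval, and that the runner's PID is passed as the only argument. The spot/instance-action endpoint returns 404 until an interruption is scheduled.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"strconv"
	"syscall"
	"time"
)

const imds = "http://169.254.169.254/latest"

// instanceAction mirrors the JSON returned by spot/instance-action.
type instanceAction struct {
	Action string    `json:"action"`
	Time   time.Time `json:"time"`
}

// parseNotice interprets one response from the instance-action endpoint:
// 404 means nothing is scheduled, 200 carries a JSON notice.
func parseNotice(status int, body []byte) (*instanceAction, error) {
	switch status {
	case http.StatusNotFound:
		return nil, nil
	case http.StatusOK:
		var a instanceAction
		if err := json.Unmarshal(body, &a); err != nil {
			return nil, fmt.Errorf("decode instance-action: %w", err)
		}
		return &a, nil
	default:
		return nil, fmt.Errorf("unexpected IMDS status %d", status)
	}
}

// fetchNotice does one IMDSv2 round trip: get a session token, then query.
func fetchNotice(c *http.Client) (*instanceAction, error) {
	req, _ := http.NewRequest(http.MethodPut, imds+"/api/token", nil)
	req.Header.Set("X-aws-ec2-metadata-token-ttl-seconds", "60")
	resp, err := c.Do(req)
	if err != nil {
		return nil, err
	}
	token, err := io.ReadAll(resp.Body)
	resp.Body.Close()
	if err != nil || resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("IMDS token request failed (status %d): %v", resp.StatusCode, err)
	}
	req, _ = http.NewRequest(http.MethodGet, imds+"/meta-data/spot/instance-action", nil)
	req.Header.Set("X-aws-ec2-metadata-token", string(token))
	resp, err = c.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return parseNotice(resp.StatusCode, body)
}

// watch polls until a notice arrives, then sends SIGTERM to the runner process.
func watch(pid int, client *http.Client, every time.Duration) error {
	proc, err := os.FindProcess(pid)
	if err != nil {
		return err
	}
	for {
		notice, err := fetchNotice(client)
		switch {
		case err != nil:
			fmt.Fprintln(os.Stderr, "poll failed:", err)
		case notice != nil:
			fmt.Printf("Spot %s scheduled for %s; stopping runner\n",
				notice.Action, notice.Time.Format(time.RFC3339))
			return proc.Signal(syscall.SIGTERM)
		}
		time.Sleep(every)
	}
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: poller <runner-pid>")
		return
	}
	pid, err := strconv.Atoi(os.Args[1])
	if err != nil {
		fmt.Println("invalid pid:", err)
		return
	}
	if err := watch(pid, &http.Client{Timeout: 2 * time.Second}, 5*time.Second); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

Polling every 5 seconds leaves most of the two-minute warning available for the runner to shut down. A shell loop around curl would work equally well if you prefer to keep everything in user data.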

Cost comparison

For a team running 1,000 CI minutes/day:

Approach                                Monthly cost
GitHub-hosted runners                   $240
EC2 on-demand (c5.2xlarge)              ~$180
EC2 Spot (c5.2xlarge, ~70% discount)    ~$55

The savings scale linearly with usage. At 5,000 minutes/day: GitHub-hosted → $1,200/month, Spot → ~$275/month.
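
To check the arithmetic against your own usage, here is a small sketch of the calculation. It assumes a 30-day month, the $0.008/minute GitHub-hosted rate, ~$0.34/hour on-demand for c5.2xlarge, and a 70% Spot discount. It models compute time only, so its on-demand and Spot figures come out slightly below the table's, which presumably include overhead such as boot time or storage.

```go
package main

import "fmt"

// monthlyCost returns the cost of running minutesPerDay of CI for the given
// number of days at a per-minute price.
func monthlyCost(minutesPerDay, pricePerMinute float64, days int) float64 {
	return minutesPerDay * pricePerMinute * float64(days)
}

func main() {
	const (
		githubPerMin   = 0.008     // GitHub-hosted Linux runner, per minute
		onDemandPerMin = 0.34 / 60 // c5.2xlarge on-demand, ~$0.34/hour
		spotDiscount   = 0.70      // assumed average Spot discount
		days           = 30
	)
	spotPerMin := onDemandPerMin * (1 - spotDiscount)

	for _, minutes := range []float64{1000, 5000} {
		fmt.Printf("%.0f min/day: GitHub $%.0f, on-demand $%.0f, Spot $%.0f\n",
			minutes,
			monthlyCost(minutes, githubPerMin, days),
			monthlyCost(minutes, onDemandPerMin, days),
			monthlyCost(minutes, spotPerMin, days))
	}
}
```

Swap in your own minutes per day and regional prices to see where your break-even sits.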

The break-even on engineering time to set up jit-runners is typically under a week of CI spend.

What jit-runners handles

The open source jit-runners project provides:

  • Lambda function (Go) for webhook handling and instance provisioning
  • Terraform module for Lambda, API Gateway, IAM roles, and security groups
  • GitHub App configuration guide
  • User data templates for Ubuntu and Amazon Linux 2
  • Spot interruption handler sidecar

The deployment guide is in the repository README. Setup takes about 30 minutes if you have AWS credentials and GitHub App access.


jit-runners is open source under MIT. If you hit a Spot interruption rate that’s causing real problems, open an issue — there are several approaches to mitigation we haven’t implemented yet.