How It Works

Lifecycle overview

GitLab Runner
  -> config
  -> prepare
       -> lock state
       -> resolve image
       -> deduplicate image pull
       -> create or reclaim VM
       -> start VM
       -> wait for RUNNING
       -> enable SSH
       -> wait for SSH and toolchain
  -> run
       -> load state
       -> reconnect SSH if needed
       -> stream script into VM
  -> cleanup
       -> disable SSH
       -> stop and delete VM
       -> remove state and lock files

config

The config stage returns metadata GitLab Runner needs:

  • build and cache directories
  • shell and hostname details
  • selected executor settings

The stage is intentionally fast and derives most of its values from environment variables.
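
As a rough illustration of what such a stage can emit, the sketch below prints the JSON document that GitLab Runner reads from a custom executor's config stage. The field names follow the GitLab custom executor documentation; the directory paths, hostname pattern, and driver name are assumptions made up for this example, not the executor's real values.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// driverInfo identifies the executor in the job log.
type driverInfo struct {
	Name    string `json:"name"`
	Version string `json:"version"`
}

// configOutput mirrors the JSON document GitLab Runner reads from the
// config stage's stdout.
type configOutput struct {
	BuildsDir         string     `json:"builds_dir"`
	CacheDir          string     `json:"cache_dir"`
	BuildsDirIsShared bool       `json:"builds_dir_is_shared"`
	Hostname          string     `json:"hostname"`
	Shell             string     `json:"shell"`
	Driver            driverInfo `json:"driver"`
}

// buildConfig derives the metadata from the environment only, which keeps
// the stage fast. Every concrete value here is an illustrative assumption.
func buildConfig(getenv func(string) string) configOutput {
	jobID := getenv("CUSTOM_ENV_CI_JOB_ID") // job variables arrive with a CUSTOM_ENV_ prefix
	return configOutput{
		BuildsDir:         "/builds", // hypothetical path inside the VM
		CacheDir:          "/cache",  // hypothetical path inside the VM
		BuildsDirIsShared: false,
		Hostname:          "jeballto-job-" + jobID, // hypothetical naming scheme
		Shell:             "bash",
		Driver:            driverInfo{Name: "example-executor", Version: "0.0.0"},
	}
}

func main() {
	enc := json.NewEncoder(os.Stdout)
	enc.SetIndent("", "  ")
	if err := enc.Encode(buildConfig(os.Getenv)); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Passing `getenv` as a parameter keeps the function free of global state, so it can be exercised with a fake environment.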

prepare

This is where most of the operational complexity lives.

1. Precondition checks

The executor loads config, parses the GitLab job context, and makes sure a Jeballto token is available.

2. State lock

The executor acquires a file lock keyed by runner ID and job ID. The lock prevents overlapping prepare, run, and cleanup processes for the same job from trampling shared state.
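
One common way to build such a lock on Unix-like hosts is an advisory `flock` on a per-job file. The sketch below assumes that approach; the source does not say which locking primitive `internal/state/store.go` uses, and the file naming and the `tryLock`/`unlock` helpers are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// lockPath derives the lock file name from the runner and job IDs, so every
// stage of the same job contends for the same file. The naming is illustrative.
func lockPath(dir, runnerID, jobID string) string {
	return filepath.Join(dir, fmt.Sprintf("runner-%s-job-%s.lock", runnerID, jobID))
}

// tryLock takes an exclusive advisory lock without blocking. It fails with
// syscall.EWOULDBLOCK when another open file description holds the lock.
func tryLock(path string) (*os.File, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0o600)
	if err != nil {
		return nil, err
	}
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX|syscall.LOCK_NB); err != nil {
		f.Close()
		return nil, err
	}
	return f, nil
}

// unlock releases the lock and closes the file.
func unlock(f *os.File) error {
	defer f.Close()
	return syscall.Flock(int(f.Fd()), syscall.LOCK_UN)
}

// demo shows a second attempt being refused while the first holder is
// active, then succeeding once the lock has been released.
func demo() (contended bool, err error) {
	dir, err := os.MkdirTemp("", "job-locks")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)

	path := lockPath(dir, "r1", "42")
	held, err := tryLock(path)
	if err != nil {
		return false, err
	}
	if second, err2 := tryLock(path); err2 == nil {
		unlock(second) // not expected: the first holder is still active
	} else {
		contended = errors.Is(err2, syscall.EWOULDBLOCK)
	}
	if err := unlock(held); err != nil {
		return contended, err
	}
	again, err := tryLock(path)
	if err != nil {
		return contended, err
	}
	return contended, unlock(again)
}

func main() {
	contended, err := demo()
	fmt.Println("second attempt refused:", contended, "error:", err)
}
```

The kernel drops a `flock` when its holder exits, so a crashed stage cannot leave the job locked; only the lock file itself remains for cleanup to remove.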

3. Image resolution

The executor resolves the image from the job context, env vars, and defaults. See Image Policy for the exact precedence.
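
A first-non-empty lookup is one simple way to express this kind of precedence. The sketch below is only an assumption about the shape of the logic: the candidate order, the `EXAMPLE_EXECUTOR_IMAGE` variable, and the default image name are hypothetical, and Image Policy remains the authority on the real order.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// resolveImage returns the first candidate that is not blank. Callers list
// candidates from highest to lowest precedence.
func resolveImage(candidates ...string) (string, error) {
	for _, c := range candidates {
		if image := strings.TrimSpace(c); image != "" {
			return image, nil
		}
	}
	return "", errors.New("no image configured for this job")
}

func main() {
	image, err := resolveImage(
		os.Getenv("CUSTOM_ENV_CI_JOB_IMAGE"), // image from the job definition, as exposed by GitLab Runner
		os.Getenv("EXAMPLE_EXECUTOR_IMAGE"),  // hypothetical executor-level override
		"example/base:latest",                // hypothetical built-in default
	)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("resolved image:", image)
}
```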

4. Image pull deduplication

If multiple jobs need the same image:

  • one process acquires the per-image lock
  • that process pulls the image if needed
  • other processes wait instead of triggering duplicate pulls
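
The pattern above is "lock, re-check, then pull". A minimal sketch follows, assuming a blocking `flock` on a per-image lock file. Goroutines stand in for separate executor processes, and the `present` and `pull` callbacks are hypothetical placeholders for the real image cache check and API call.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
	"sync"
	"syscall"
)

// imageLockPath hashes the image reference so that any reference maps to a
// safe file name. The scheme is illustrative.
func imageLockPath(dir, image string) string {
	sum := sha256.Sum256([]byte(image))
	return filepath.Join(dir, "image-"+hex.EncodeToString(sum[:8])+".lock")
}

// ensureImage blocks on the per-image lock, then pulls only if the image is
// still missing. Later lock holders see the image present and skip the pull.
func ensureImage(dir, image string, present func(string) bool, pull func(string) error) error {
	f, err := os.OpenFile(imageLockPath(dir, image), os.O_CREATE|os.O_RDWR, 0o600)
	if err != nil {
		return err
	}
	defer f.Close() // closing the descriptor releases the lock
	for {
		err = syscall.Flock(int(f.Fd()), syscall.LOCK_EX)
		if err != syscall.EINTR {
			break // retry only when a signal interrupted the wait
		}
	}
	if err != nil {
		return err
	}
	if present(image) {
		return nil
	}
	return pull(image)
}

// simulate runs several concurrent jobs against one image and reports how
// many pulls actually happened.
func simulate(jobs int) (pulls int, err error) {
	dir, err := os.MkdirTemp("", "image-locks")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	var mu sync.Mutex
	cached := map[string]bool{}
	present := func(img string) bool {
		mu.Lock()
		defer mu.Unlock()
		return cached[img]
	}
	pull := func(img string) error {
		mu.Lock()
		defer mu.Unlock()
		pulls++
		cached[img] = true
		return nil
	}

	var wg sync.WaitGroup
	errs := make([]error, jobs)
	for i := 0; i < jobs; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			errs[i] = ensureImage(dir, "example/base:latest", present, pull)
		}(i)
	}
	wg.Wait()
	for _, e := range errs {
		if e != nil {
			return pulls, e
		}
	}
	return pulls, nil
}

func main() {
	pulls, err := simulate(5)
	fmt.Println("pulls:", pulls, "error:", err)
}
```

The re-check after acquiring the lock is the important part: without it, every waiter would pull again as soon as it got its turn.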

5. VM creation or reclamation

The executor creates a new VM with a deterministic name. If creation succeeds but state persistence fails, the next attempt can reclaim the same VM by name.
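
Because the name is a pure function of the runner and job IDs, a retry can find an orphaned VM without any saved state. The sketch below illustrates the idea with an in-memory fake. The `vmAPI` interface, the naming scheme, and the lookup-before-create order are assumptions for this example and do not describe the real Jeballto client.

```go
package main

import (
	"fmt"
	"sync"
)

// VM is a minimal stand-in for the API's VM record.
type VM struct{ ID, Name string }

// vmAPI is a hypothetical slice of a VM API, not the real client.
type vmAPI interface {
	FindByName(name string) (VM, bool, error)
	Create(name string) (VM, error)
}

// vmName is deterministic: the same runner and job always yield the same name.
func vmName(runnerID, jobID string) string {
	return fmt.Sprintf("runner-%s-job-%s", runnerID, jobID)
}

// createOrReclaim reuses a VM left behind by an earlier attempt, or creates
// a new one when none exists under that name.
func createOrReclaim(api vmAPI, name string) (vm VM, reclaimed bool, err error) {
	existing, found, err := api.FindByName(name)
	if err != nil {
		return VM{}, false, err
	}
	if found {
		return existing, true, nil
	}
	vm, err = api.Create(name)
	return vm, false, err
}

// fakeAPI keeps VMs in memory so the sketch runs without a server.
type fakeAPI struct {
	mu  sync.Mutex
	vms map[string]VM
}

func newFakeAPI() *fakeAPI { return &fakeAPI{vms: map[string]VM{}} }

func (f *fakeAPI) FindByName(name string) (VM, bool, error) {
	f.mu.Lock()
	defer f.mu.Unlock()
	vm, ok := f.vms[name]
	return vm, ok, nil
}

func (f *fakeAPI) Create(name string) (VM, error) {
	f.mu.Lock()
	defer f.mu.Unlock()
	if _, exists := f.vms[name]; exists {
		return VM{}, fmt.Errorf("vm %q already exists", name)
	}
	vm := VM{ID: fmt.Sprintf("vm-%d", len(f.vms)+1), Name: name}
	f.vms[name] = vm
	return vm, nil
}

func main() {
	api := newFakeAPI()
	name := vmName("r1", "42")
	first, reclaimed, _ := createOrReclaim(api, name)
	fmt.Println(first.ID, "reclaimed:", reclaimed)
	// A second attempt, as after a failed state write, finds the same VM.
	second, reclaimed, _ := createOrReclaim(api, name)
	fmt.Println(second.ID, "reclaimed:", reclaimed)
}
```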

6. Resource update

When an image is supplied, resource overrides are patched after VM creation and before start. This mirrors the current Jeballto API behavior.

7. Start and capacity wait

The executor starts the VM. If the host is at capacity, it waits and retries until a slot opens or the prepare timeout expires.
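
A deadline-bound retry loop is a natural fit for this step. The sketch below assumes the API reports a full host through a distinguishable error; `errHostFull`, the fixed retry interval, and the `simulate` helper are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errHostFull is a hypothetical sentinel for "no free VM slot on the host".
var errHostFull = errors.New("host at capacity")

// startWithRetry calls start until it succeeds, fails for a reason other
// than capacity, or ctx (the prepare deadline) expires.
func startWithRetry(ctx context.Context, interval time.Duration, start func(context.Context) error) error {
	for {
		err := start(ctx)
		if !errors.Is(err, errHostFull) {
			return err // success, or a failure that waiting will not fix
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("no capacity before the prepare deadline: %w", ctx.Err())
		case <-time.After(interval):
		}
	}
}

// simulate runs the loop against a fake host that reports "full" for the
// first fullFor attempts and returns how many attempts were made.
func simulate(fullFor int, timeout time.Duration) (attempts int, err error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	err = startWithRetry(ctx, time.Millisecond, func(context.Context) error {
		attempts++
		if attempts <= fullFor {
			return errHostFull
		}
		return nil
	})
	return attempts, err
}

func main() {
	attempts, err := simulate(2, time.Second)
	fmt.Println("attempts:", attempts, "error:", err)
}
```

Only the capacity error is retried, so unrelated failures surface immediately instead of consuming the whole timeout.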

8. SSH readiness

Once the VM reaches RUNNING, the executor enables SSH forwarding, waits for SSH readiness, then probes the VM for required tools.
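
The two waits in this step can be sketched as a port poll followed by a shell probe. Everything below is an assumption for illustration: a successful TCP connect is only a rough proxy for SSH readiness, the `waitForTCP` and `probeScript` helpers are hypothetical, and the source does not list which tools the executor requires.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"strings"
	"time"
)

// waitForTCP polls addr until it accepts a connection or ctx expires.
func waitForTCP(ctx context.Context, addr string, interval time.Duration) error {
	d := net.Dialer{Timeout: 2 * time.Second}
	for {
		conn, err := d.DialContext(ctx, "tcp", addr)
		if err == nil {
			return conn.Close()
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("ssh port %s not ready: %w", addr, ctx.Err())
		case <-time.After(interval):
		}
	}
}

// probeScript builds a POSIX shell snippet that fails on the first missing
// tool. Tool names are assumed to be trusted, simple identifiers.
func probeScript(tools []string) string {
	var b strings.Builder
	for _, t := range tools {
		fmt.Fprintf(&b, "command -v %s >/dev/null 2>&1 || { echo \"missing tool: %s\" >&2; exit 1; }\n", t, t)
	}
	return b.String()
}

// demo waits on a local listener that stands in for the forwarded SSH port.
func demo() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	defer ln.Close()
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	return waitForTCP(ctx, ln.Addr().String(), 50*time.Millisecond)
}

func main() {
	fmt.Println("port wait error:", demo())
	fmt.Print(probeScript([]string{"git", "bash"})) // hypothetical tool list
}
```

The generated script would be run inside the VM over SSH; a non-zero exit tells prepare that the image lacks something the job needs.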

run

The run stage:

  • loads the saved state
  • rehydrates SSH info if needed
  • re-enables SSH when forwarding has been lost
  • pipes the GitLab job script into the VM with OpenSSH
  • maps the result to build-failure or system-failure semantics
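
GitLab Runner tells a custom executor which exit codes to use through the `BUILD_FAILURE_EXIT_CODE` and `SYSTEM_FAILURE_EXIT_CODE` environment variables, and OpenSSH exits with 255 when the connection itself fails. The sketch below combines the two facts; the exact mapping and the fallback values are assumptions and may differ from what `internal/runner/ssh.go` does.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// sshTransportFailure is the status ssh itself exits with when an error
// occurred, as opposed to the remote command's own status.
const sshTransportFailure = 255

// envInt reads an integer from the environment, or returns fallback.
func envInt(getenv func(string) string, key string, fallback int) int {
	if n, err := strconv.Atoi(getenv(key)); err == nil {
		return n
	}
	return fallback
}

// stageExitCode translates the ssh exit status into the code the run stage
// should exit with. The fallbacks 1 and 2 are illustrative only.
func stageExitCode(sshExit int, getenv func(string) string) int {
	switch sshExit {
	case 0:
		return 0
	case sshTransportFailure:
		// The script may never have run: report an infrastructure problem.
		return envInt(getenv, "SYSTEM_FAILURE_EXIT_CODE", 2)
	default:
		// The script ran and failed: report a job failure.
		return envInt(getenv, "BUILD_FAILURE_EXIT_CODE", 1)
	}
}

func main() {
	for _, code := range []int{0, 1, 255} {
		fmt.Printf("ssh exit %d -> stage exit %d\n", code, stageExitCode(code, os.Getenv))
	}
}
```

One limitation of this mapping is that a job script which itself exits with 255 cannot be told apart from a transport failure.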

cleanup

Cleanup is best-effort, but it aims to leave no residue:

  • disable SSH forwarding early
  • stop and delete the VM
  • remove the job state JSON
  • remove lock files

If cleanup cannot find a VM ID, it still removes local state so the next run is not poisoned by stale files.
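
Best-effort cleanup usually means running every step regardless of earlier failures and reporting the failures together. The sketch below shows that shape under stated assumptions: the `step` type, the `errNoVMID` sentinel, and the file names are hypothetical, and `errors.Join` requires Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// errNoVMID is a hypothetical sentinel for state that never recorded a VM.
var errNoVMID = errors.New("no VM ID in saved state")

// step is one named cleanup action.
type step struct {
	name string
	run  func() error
}

// runCleanup executes every step even if earlier ones fail. It returns the
// failures joined into one error, or nil when every step succeeded.
func runCleanup(steps []step) error {
	var errs []error
	for _, s := range steps {
		if err := s.run(); err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", s.name, err))
		}
	}
	return errors.Join(errs...)
}

// removeIfPresent treats an already-missing file as success.
func removeIfPresent(path string) error {
	if err := os.Remove(path); err != nil && !errors.Is(err, fs.ErrNotExist) {
		return err
	}
	return nil
}

func main() {
	vmID := "" // simulate state that never recorded a VM ID
	err := runCleanup([]step{
		{"disable ssh", func() error { return nil }},
		{"stop and delete vm", func() error {
			if vmID == "" {
				return errNoVMID
			}
			return nil
		}},
		// Local files are still removed although the VM step failed.
		{"remove state", func() error { return removeIfPresent("job-42.json") }},
		{"remove lock", func() error { return removeIfPresent("job-42.lock") }},
	})
	fmt.Println("cleanup result:", err)
}
```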

Important code areas

  • internal/app/app.go contains the stage orchestration
  • internal/api/client.go wraps the Jeballto API
  • internal/gitlab/context.go resolves GitLab environment and images
  • internal/state/store.go manages job state and lock files
  • internal/runner/ssh.go handles OpenSSH execution