How It Works¶
Lifecycle overview¶
GitLab Runner
  -> config
  -> prepare
       -> resolve image
       -> lock state
       -> deduplicate image pull
       -> create or reclaim VM
       -> start VM
       -> wait for RUNNING
       -> enable SSH
       -> wait for SSH and toolchain
  -> run
       -> load state
       -> reconnect SSH if needed
       -> stream script into VM
  -> cleanup
       -> disable SSH
       -> stop and delete VM
       -> remove state and lock files
config¶
The config stage returns metadata GitLab Runner needs:
- build and cache directories
- shell and hostname details
- selected executor settings
It is intentionally fast and mostly environment-driven.
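To make the shape of that metadata concrete, here is a minimal sketch. It assumes the executor follows GitLab Runner's custom executor convention of printing a JSON document to stdout during the config stage. The JSON keys follow that convention; the paths, hostname pattern, driver name, and the `buildConfig` helper are hypothetical and not taken from this project.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// driverInfo identifies the executor to GitLab Runner.
type driverInfo struct {
	Name    string `json:"name"`
	Version string `json:"version"`
}

// configOutput mirrors the JSON document a custom executor prints in its config stage.
type configOutput struct {
	BuildsDir         string     `json:"builds_dir"`
	CacheDir          string     `json:"cache_dir"`
	BuildsDirIsShared bool       `json:"builds_dir_is_shared"`
	Hostname          string     `json:"hostname"`
	Driver            driverInfo `json:"driver"`
}

// buildConfig derives everything from the job ID; all values are illustrative.
func buildConfig(jobID string) configOutput {
	return configOutput{
		BuildsDir: "/tmp/builds/" + jobID,
		CacheDir:  "/tmp/cache/" + jobID,
		Hostname:  "job-" + jobID,
		Driver:    driverInfo{Name: "example-executor", Version: "0.0.0"},
	}
}

func main() {
	// GitLab Runner exposes job variables to custom executors with a CUSTOM_ENV_ prefix.
	out, err := json.Marshal(buildConfig(os.Getenv("CUSTOM_ENV_CI_JOB_ID")))
	if err != nil {
		fmt.Fprintln(os.Stderr, "config:", err)
		return
	}
	fmt.Println(string(out))
}
```

Because nothing here touches the network or the VM host, a stage shaped like this stays fast, which matches the intent described above.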
prepare¶
This is where most of the operational complexity lives.
1. Precondition checks¶
The executor loads config, parses the GitLab job context, and makes sure a Jeballto token is available.
2. State lock¶
The executor acquires a file lock keyed by runner and job ID. This prevents overlapping prepare, run, and cleanup processes for the same job from corrupting shared state.
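The source does not say which locking primitive is used. As a portable stand-in, the sketch below treats an exclusively created file as the lock. The `lockPath` and `acquire` helpers and the file naming scheme are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// lockPath derives one lock file per (runner, job) pair.
func lockPath(dir, runnerID, jobID string) string {
	return filepath.Join(dir, fmt.Sprintf("runner-%s-job-%s.lock", runnerID, jobID))
}

// acquire creates the lock file exclusively, retrying until the timeout expires.
// It returns a release function that removes the file.
func acquire(path string, timeout time.Duration) (func() error, error) {
	deadline := time.Now().Add(timeout)
	for {
		f, err := os.OpenFile(path, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o600)
		if err == nil {
			f.Close()
			return func() error { return os.Remove(path) }, nil
		}
		if !errors.Is(err, os.ErrExist) {
			return nil, err // unexpected I/O error, not contention
		}
		if time.Now().After(deadline) {
			return nil, fmt.Errorf("lock %s: timed out after %s", path, timeout)
		}
		time.Sleep(50 * time.Millisecond)
	}
}

func main() {
	path := lockPath(os.TempDir(), "r1", "42")
	release, err := acquire(path, 2*time.Second)
	if err != nil {
		fmt.Println("lock failed:", err)
		return
	}
	defer release()
	fmt.Println("holding", path)
}
```

A lock file of this kind survives a crashed process, so a real implementation would also need a policy for stale locks. An OS-level advisory lock avoids that problem at the cost of portability.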
3. Image resolution¶
The executor resolves the image from the job context, env vars, and defaults. See Image Policy for the exact precedence.
4. Image pull deduplication¶
If multiple jobs need the same image:
- one process acquires the per-image lock
- that process pulls the image if needed
- other processes wait instead of triggering duplicate pulls
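The three bullets above describe a per-key lock with a check inside the critical section. The executor does this across processes; the sketch below shows the same idea inside one process with goroutines, which is an analogy and not the project's code. The `imageCache` type and the image name are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// imageCache serializes pulls per image so that concurrent callers share one pull.
type imageCache struct {
	mu     sync.Mutex
	locks  map[string]*sync.Mutex
	pulled map[string]bool
	pulls  int // number of real pulls performed, kept for demonstration
}

func newImageCache() *imageCache {
	return &imageCache{locks: map[string]*sync.Mutex{}, pulled: map[string]bool{}}
}

// lockFor returns the mutex dedicated to one image, creating it on first use.
func (c *imageCache) lockFor(image string) *sync.Mutex {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.locks[image] == nil {
		c.locks[image] = &sync.Mutex{}
	}
	return c.locks[image]
}

// ensure pulls the image at most once. Later callers block on the per-image
// lock, then find the image already present and return without pulling.
func (c *imageCache) ensure(image string, pull func(string) error) error {
	l := c.lockFor(image)
	l.Lock()
	defer l.Unlock()

	c.mu.Lock()
	done := c.pulled[image]
	c.mu.Unlock()
	if done {
		return nil
	}
	if err := pull(image); err != nil {
		return err
	}
	c.mu.Lock()
	c.pulled[image] = true
	c.pulls++
	c.mu.Unlock()
	return nil
}

func main() {
	cache := newImageCache()
	var wg sync.WaitGroup
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = cache.ensure("example-image", func(string) error { return nil })
		}()
	}
	wg.Wait()
	fmt.Println("pulls:", cache.pulls) // prints "pulls: 1"
}
```

The important detail is that the "already pulled?" check happens after the per-image lock is taken. Checking before locking would let two waiters both decide a pull is needed.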
5. VM creation or reclamation¶
The executor creates a new VM with a deterministic name. If creation succeeds but state persistence fails, the next attempt can reclaim the same VM by name.
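A deterministic name is what makes reclamation possible: a retry can compute the same name and adopt the VM an earlier attempt left behind. The sketch below illustrates this with a hypothetical `vmAPI` interface and an in-memory fake. It is not the Jeballto client, and whether the real executor looks up the name before or after a failed create is not stated in the source.

```go
package main

import (
	"errors"
	"fmt"
)

var errNameTaken = errors.New("vm name already exists")

// vmAPI is a hypothetical subset of a VM API, used only for this illustration.
type vmAPI interface {
	Create(name string) (id string, err error)
	FindByName(name string) (id string, found bool)
}

// vmName is deterministic: the same runner and job always map to the same name.
func vmName(runnerID, jobID string) string {
	return fmt.Sprintf("gitlab-runner-%s-job-%s", runnerID, jobID)
}

// createOrReclaim creates the VM, or adopts an existing VM with the same name
// that a previous attempt created before its state could be saved.
func createOrReclaim(api vmAPI, name string) (string, error) {
	id, err := api.Create(name)
	if err == nil {
		return id, nil
	}
	if errors.Is(err, errNameTaken) {
		if existing, ok := api.FindByName(name); ok {
			return existing, nil
		}
	}
	return "", err
}

// fakeAPI is an in-memory stand-in for a VM host.
type fakeAPI struct{ vms map[string]string }

func (f *fakeAPI) Create(name string) (string, error) {
	if _, ok := f.vms[name]; ok {
		return "", errNameTaken
	}
	id := fmt.Sprintf("vm-%d", len(f.vms)+1)
	f.vms[name] = id
	return id, nil
}

func (f *fakeAPI) FindByName(name string) (string, bool) {
	id, ok := f.vms[name]
	return id, ok
}

func main() {
	api := &fakeAPI{vms: map[string]string{}}
	name := vmName("r1", "42")
	first, _ := createOrReclaim(api, name)  // creates the VM
	second, _ := createOrReclaim(api, name) // reclaims it by name
	fmt.Println(first, second)
}
```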
6. Resource update¶
When an image is supplied, resource overrides are patched after VM creation and before start. This mirrors the current Jeballto API behavior.
7. Start and capacity wait¶
The executor starts the VM. If the host is at capacity, it waits and retries until a slot opens or the prepare timeout is exhausted.
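The wait-and-retry behavior can be sketched as a loop bounded by a time budget. The `errHostFull` sentinel, the `startWithRetry` helper, and the fixed retry interval are assumptions; the source does not describe how the API signals a full host or how the retry interval is chosen.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errHostFull is a hypothetical sentinel for "no free VM slot on the host".
var errHostFull = errors.New("host at capacity")

// startWithRetry calls start until it succeeds, fails with a non-capacity
// error, or the next retry would exceed the remaining time budget.
func startWithRetry(budget, interval time.Duration, start func() error) error {
	deadline := time.Now().Add(budget)
	for {
		err := start()
		if err == nil || !errors.Is(err, errHostFull) {
			return err // success, or an error that retrying will not fix
		}
		if time.Now().Add(interval).After(deadline) {
			return fmt.Errorf("no capacity within %s: %w", budget, err)
		}
		time.Sleep(interval)
	}
}

func main() {
	attempts := 0
	err := startWithRetry(time.Second, 10*time.Millisecond, func() error {
		attempts++
		if attempts < 3 {
			return errHostFull // simulate a full host for the first two tries
		}
		return nil
	})
	fmt.Println("attempts:", attempts, "err:", err)
}
```

In this sketch only capacity errors are retried. Any other start failure is returned immediately so that a misconfigured VM does not consume the whole prepare timeout.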
8. SSH readiness¶
Once the VM reaches RUNNING, the executor enables SSH forwarding, waits for SSH readiness, then probes the VM for required tools.
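The source does not list the required tools or show how the probe is run. One plausible shape, sketched below, is a small POSIX shell snippet sent over SSH that fails at the first missing tool. The `probeScript` helper and the tool names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// probeScript builds a POSIX shell snippet that exits non-zero at the first
// missing tool. Tool names are assumed to be trusted constants, so they are
// not shell-quoted here.
func probeScript(tools []string) string {
	var b strings.Builder
	for _, t := range tools {
		fmt.Fprintf(&b, "command -v %s >/dev/null 2>&1 || { echo 'missing tool: %s' >&2; exit 1; }\n", t, t)
	}
	return b.String()
}

func main() {
	// The tool list is illustrative only.
	fmt.Print(probeScript([]string{"git", "bash"}))
}
```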
run¶
The run stage:
- loads the saved state
- rehydrates SSH info if needed
- re-enables SSH when forwarding has been lost
- pipes the GitLab job script into the VM with OpenSSH
- maps the result to build-failure or system-failure semantics
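The last step can be illustrated with a small classifier. GitLab Runner passes the exit codes it expects to custom executors in `BUILD_FAILURE_EXIT_CODE` and `SYSTEM_FAILURE_EXIT_CODE`, and OpenSSH returns the remote command's status or 255 when `ssh` itself fails. How this executor actually draws the line is not stated in the source, so the rule below, along with the `classify` helper, the fallback codes, and the `errTransport` sentinel, is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"strconv"
)

// errTransport stands for any failure that happened before the script could run.
var errTransport = errors.New("ssh transport failure")

// exitCodeFromEnv reads an exit code GitLab Runner passes in the environment.
func exitCodeFromEnv(name string, fallback int) int {
	if n, err := strconv.Atoi(os.Getenv(name)); err == nil {
		return n
	}
	return fallback
}

// classify maps the error from the ssh process to the executor's exit code.
// A normal non-zero exit other than 255 is treated as the job script failing;
// everything else is treated as an infrastructure problem.
func classify(err error) int {
	if err == nil {
		return 0
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		if code := exitErr.ExitCode(); code >= 0 && code != 255 {
			return exitCodeFromEnv("BUILD_FAILURE_EXIT_CODE", 1)
		}
	}
	return exitCodeFromEnv("SYSTEM_FAILURE_EXIT_CODE", 2)
}

func main() {
	// Stand-in for the ssh invocation: a local shell that exits like a failing job script.
	err := exec.Command("sh", "-c", "exit 3").Run()
	fmt.Println("script failure maps to:", classify(err))
	fmt.Println("transport failure maps to:", classify(errTransport))
}
```

The distinction matters because GitLab Runner treats the two outcomes differently: a build failure marks the job as failed, while a system failure is reported as a problem with the runner and may be retried.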
cleanup¶
Cleanup is best-effort, but it aims to leave no residue:
- disable SSH forwarding early
- stop and delete the VM
- remove the job state JSON
- remove lock files
If cleanup cannot find a VM ID, it still removes local state so the next run is not poisoned by stale files.
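Best-effort cleanup means every step is attempted even when an earlier one fails, and steps that need a VM ID are skipped when none is known. A minimal sketch of that control flow follows; the `step` type and step names are hypothetical, and `errors.Join` requires Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
)

// step is one cleanup action; needsVM marks steps that require a VM ID.
type step struct {
	name    string
	needsVM bool
	run     func() error
}

// cleanup runs every applicable step even when earlier ones fail,
// then reports all failures together (nil when nothing failed).
func cleanup(vmID string, steps []step) error {
	var errs []error
	for _, s := range steps {
		if s.needsVM && vmID == "" {
			continue // nothing to do on the host, but local steps still run
		}
		if err := s.run(); err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", s.name, err))
		}
	}
	return errors.Join(errs...)
}

func main() {
	steps := []step{
		{name: "disable ssh", needsVM: true, run: func() error { return errors.New("forwarding already gone") }},
		{name: "delete vm", needsVM: true, run: func() error { return nil }},
		{name: "remove state", run: func() error { return nil }},
	}
	fmt.Println(cleanup("vm-1", steps)) // VM steps run; the failure is reported, later steps still execute
	fmt.Println(cleanup("", steps))     // only local state removal runs
}
```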
Important code areas¶
- internal/app/app.go contains the stage orchestration
- internal/api/client.go wraps the Jeballto API
- internal/gitlab/context.go resolves GitLab environment and images
- internal/state/store.go manages job state and lock files
- internal/runner/ssh.go handles OpenSSH execution