
How It Works

End-to-end flow

queued workflow_job
  -> controller receives event or poll result
  -> state entry created to deduplicate the job
  -> label selects image and sizing profile
  -> Jeballto Agent health check
  -> create VM
  -> start VM
  -> wait for RUNNING
  -> enable SSH
  -> wait for SSH
  -> verify runner files exist
  -> request JIT runner config from GitHub
  -> launch runner inside VM
  -> verify runner is online
  -> wait for completion
  -> disable SSH, delete VM, remove runner, clear state

Event sources

Poll mode

The controller periodically lists queued workflow_job records for one repository. It also re-checks active jobs so cleanup still happens even without webhook delivery.

Webhook mode

The controller exposes /webhook and validates the workflow_job payload signature with github.webhook_secret.
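GitHub signs each delivery by putting `"sha256=" + hex(HMAC-SHA256(secret, body))` in the X-Hub-Signature-256 header. A minimal sketch of that check (the function names are illustrative, not the ones in internal/github/webhook.go):

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sign computes the value GitHub sends in X-Hub-Signature-256:
// "sha256=" followed by the hex HMAC-SHA256 of the raw request body.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return "sha256=" + hex.EncodeToString(mac.Sum(nil))
}

// validSignature compares the received header against the expected
// value in constant time to avoid timing side channels.
func validSignature(secret, body []byte, header string) bool {
	return hmac.Equal([]byte(sign(secret, body)), []byte(header))
}

func main() {
	secret := []byte("value of github.webhook_secret")
	body := []byte(`{"action":"queued"}`)
	header := sign(secret, body)
	fmt.Println(validSignature(secret, body, header))                 // true
	fmt.Println(validSignature([]byte("wrong secret"), body, header)) // false
}
```

Note that the HMAC must be computed over the raw body bytes, before any JSON parsing.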

Both mode

Both sources can observe the same job. The state store prevents duplicate provisioning.
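A sketch of that deduplication, assuming a simple in-memory store keyed by workflow_job ID (the real state store may be persistent; the types here are illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

// stateStore deduplicates provisioning across event sources:
// whichever source claims a workflow_job ID first wins.
type stateStore struct {
	mu   sync.Mutex
	jobs map[int64]bool
}

func newStateStore() *stateStore {
	return &stateStore{jobs: make(map[int64]bool)}
}

// claim returns true only for the first caller to see this job ID.
func (s *stateStore) claim(jobID int64) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.jobs[jobID] {
		return false
	}
	s.jobs[jobID] = true
	return true
}

// release removes the record once cleanup finishes.
func (s *stateStore) release(jobID int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.jobs, jobID)
}

func main() {
	s := newStateStore()
	fmt.Println(s.claim(42)) // true: first source provisions
	fmt.Println(s.claim(42)) // false: duplicate event is a no-op
}
```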

Provisioning pipeline

The controller performs these major steps:

  1. verify the Jeballto Agent is healthy
  2. create the VM from the configured OCI image
  3. if the selected label overrides CPU, memory, or disk, patch those resources before start
  4. start the VM and wait for RUNNING
  5. enable SSH and wait for readiness
  6. probe for run.sh, config.sh, bash, and git
  7. obtain a JIT runner configuration from GitHub
  8. start the runner inside the VM
  9. verify the runner shows as online

If JIT configuration is not supported, the controller falls back to registration-token mode and still verifies that the runner appears online.
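The JIT-then-fallback decision in step 7 can be sketched as follows. Both `getJIT` and `getRegToken` are hypothetical stand-ins for the GitHub API calls, and the sentinel error is an assumption about how "not supported" is reported:

```go
package main

import (
	"errors"
	"fmt"
)

// errJITUnsupported is a hypothetical sentinel for a GitHub API
// response indicating JIT runner configuration is unavailable.
var errJITUnsupported = errors.New("JIT config not supported")

// runnerCredentials prefers a JIT runner configuration and falls back
// to a registration token only when JIT is unsupported; any other
// error aborts provisioning.
func runnerCredentials(getJIT, getRegToken func() (string, error)) (string, error) {
	cred, err := getJIT()
	if err == nil {
		return cred, nil
	}
	if !errors.Is(err, errJITUnsupported) {
		return "", err
	}
	return getRegToken()
}

func main() {
	jitFails := func() (string, error) { return "", errJITUnsupported }
	regToken := func() (string, error) { return "reg-token", nil }
	cred, err := runnerCredentials(jitFails, regToken)
	fmt.Println(cred, err) // reg-token <nil>
}
```

In either mode the controller's final check is the same: the runner must appear online before the job is considered provisioned.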

Completion handling

When the workflow job reaches the completed state, the controller:

  • disables SSH forwarding
  • force-deletes the VM
  • removes the GitHub runner if it knows the runner ID
  • deletes the local state record

VMs are also created with ephemeral: true, so the agent can auto-delete them after stop or error as a safety net. Controller-driven cleanup is still the primary teardown path while a VM is running.

Garbage collection

GC catches cleanup misses caused by:

  • controller restart
  • lost webhook delivery
  • incomplete local state cleanup

It periodically scans for stale state and idle VMs, then cleans them up according to the configured thresholds.

Important code areas

  • internal/controller/controller.go ties together modes, workers, and cleanup
  • internal/controller/poller.go handles poll-mode detection and completion checks
  • internal/github/webhook.go validates and parses webhook events
  • internal/config/config.go validates mode-specific settings
  • internal/ssh/runner.go starts the guest runner process