This system provides intelligent, dynamic scaling of ephemeral GitHub Actions runners based on workload demand. It replaces static, single-runner setups with a sophisticated orchestrator that automatically manages runner lifecycle.
- Intelligent Queue Monitoring: Continuously monitors GitHub Actions queue length
- Auto-scaling: Automatically provisions/deprovisions runners based on demand
- Configurable Thresholds: Customizable scale-up/down triggers
- Pool Management: Maintains a minimum pool of always-available runners
- Ephemeral Runners: Runners are created on-demand and destroyed when idle
- Docker-in-Docker: Full Docker support for containerized workflows
- Network Isolation: Dedicated Docker networks for security
- Resource Management: Automatic cleanup of containers and volumes
- REST API: Full management API for status, metrics, and control
- Prometheus Metrics: Built-in metrics collection for monitoring
- Structured Logging: Comprehensive logging with correlation IDs
- Health Checks: Container health monitoring and self-healing
- GitHub PAT Integration: Uses Personal Access Token for API access
- Graceful Shutdown: Proper cleanup and runner unregistration
- Error Recovery: Automatic retry logic and error handling
- Signal Handling: Proper process management and termination
```
┌─────────────────────────────────────────────────────────────────┐
│                      GitHub API Integration                     │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                      Orchestrator Container                     │
│  ┌───────────────┐  ┌──────────────┐  ┌─────────────────────┐   │
│  │ Queue Monitor │  │ Runner Pool  │  │ Metrics & Scaling   │   │
│  │               │  │ Manager      │  │                     │   │
│  └───────────────┘  └──────────────┘  └─────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Docker Daemon / Docker API                    │
└─────────────────────────────────────────────────────────────────┘
                                 │
                 ┌───────────────┼───────────────┐
                 ▼               ▼               ▼
         ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
         │   Runner-1   │ │   Runner-2   │ │   Runner-N   │
         │ (Ephemeral)  │ │ (Ephemeral)  │ │ (Ephemeral)  │
         └──────────────┘ └──────────────┘ └──────────────┘
```
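Deployment-wise, the orchestrator runs as a single container that talks to the host Docker daemon to create and destroy runner containers. As a rough sketch only (not the project's actual `docker-compose.yml`; the build path, port, and mounts here are assumptions):

```yaml
# Illustrative sketch -- the real compose file ships with the repository.
services:
  orchestrator:
    build: ./orchestrator            # assumption: orchestrator code lives in ./orchestrator
    env_file: .env                   # GITHUB_TOKEN, scaling thresholds, etc.
    ports:
      - "8080:8080"                  # REST API / metrics endpoint used in the examples below
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock   # lets the orchestrator drive the Docker API
    restart: unless-stopped
```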
- Docker and Docker Compose
- GitHub Personal Access Token with appropriate permissions
- Access to GitHub repository/organization
```bash
# Clone the repository
git clone <your-repo-url>
cd self-hosted-github-action-runner

# Copy and configure environment
cp .env.example .env
# Edit .env with your settings

# Build and start the orchestrator
# (Optional) Build the custom runner image used by the orchestrator and then start services.
# Build using the repository root as the build context so files like `daemon.json`
# (located at the repo root) are available to the Dockerfile. The Dockerfile is
# taken from `runner-image/Dockerfile`.
docker build -t apex-runner:local -f runner-image/Dockerfile . || true
docker compose up -d --build
```

Edit `.env`:
```bash
# GitHub Configuration - REQUIRED
GITHUB_TOKEN=your_github_personal_access_token
GITHUB_ORG=your-organization   # OR use GITHUB_REPO
GITHUB_REPO=owner/repo-name    # Alternative to ORG

# Scaling Configuration
MIN_RUNNERS=2            # Always maintain this many runners
MAX_RUNNERS=10           # Never exceed this many runners
SCALE_UP_THRESHOLD=3     # Scale up when queue length >= this
SCALE_DOWN_THRESHOLD=1   # Scale down when queue length <= this
```

```bash
# View orchestrator logs
docker logs -f orchestrator

# Check status via API
curl http://localhost:8080/api/v1/status

# View Prometheus metrics
curl http://localhost:8080/api/v1/metrics

# Access web dashboard (if configured)
open http://localhost:8080/docs
```
```
GET    /api/v1/status                    # Orchestrator status, runner counts, queue info
GET    /api/v1/runners                   # List all runners (Docker + GitHub)
POST   /api/v1/runners/scale-up          # Manually trigger scale up
POST   /api/v1/runners/scale-down        # Manually trigger scale down
DELETE /api/v1/runners/{runner_id}       # Remove a specific runner
GET    /api/v1/runners/{runner_id}/logs  # Get logs from a specific runner
GET    /api/v1/metrics                   # Prometheus-style metrics
```
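For example, the management endpoints above can be driven with `curl` (`<runner_id>` is a placeholder for an ID returned by `GET /api/v1/runners`):

```bash
# Manually add a runner
curl -X POST http://localhost:8080/api/v1/runners/scale-up

# Manually retire a runner
curl -X POST http://localhost:8080/api/v1/runners/scale-down

# Inspect and remove a specific runner
curl http://localhost:8080/api/v1/runners
curl http://localhost:8080/api/v1/runners/<runner_id>/logs
curl -X DELETE http://localhost:8080/api/v1/runners/<runner_id>
```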
Configuration reference (environment variables and defaults):

- `GITHUB_TOKEN`: Personal Access Token (required)
- `GITHUB_ORG`: Organization name (for org-level runners)
- `GITHUB_REPO`: Repository in the format "owner/repo" (alternative to `GITHUB_ORG`)
- `MIN_RUNNERS`: Minimum runners to maintain (default: 2)
- `MAX_RUNNERS`: Maximum runners allowed (default: 10)
- `SCALE_UP_THRESHOLD`: Queue length that triggers scale up (default: 3)
- `SCALE_DOWN_THRESHOLD`: Queue length that triggers scale down (default: 1)
- `IDLE_TIMEOUT`: Seconds before idle runners are terminated (default: 300)
- `POLL_INTERVAL`: Seconds between GitHub API polls (default: 30)
- `LOG_LEVEL`: Logging level (default: INFO)
- `STRUCTURED_LOGGING`: Enable structured JSON logging (default: true)
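To make the scaling variables concrete, here is a simplified sketch of the decision the orchestrator makes on each poll. It is illustrative only; the function and type names are invented for the example, not the project's actual code:

```python
# Illustrative only: a simplified version of the scaling rule implied by the
# thresholds above. Names (decide_scaling, QueueState, ...) are hypothetical.
from dataclasses import dataclass

@dataclass
class QueueState:
    queued_jobs: int      # jobs waiting in the GitHub Actions queue
    active_runners: int   # runners currently registered

def decide_scaling(state: QueueState,
                   min_runners: int = 2,
                   max_runners: int = 10,
                   scale_up_threshold: int = 3,
                   scale_down_threshold: int = 1) -> int:
    """Return the change in runner count: positive = scale up, negative = scale down."""
    if state.queued_jobs >= scale_up_threshold and state.active_runners < max_runners:
        return 1   # add a runner, up to MAX_RUNNERS
    if state.queued_jobs <= scale_down_threshold and state.active_runners > min_runners:
        return -1  # retire an idle runner, but keep MIN_RUNNERS warm
    return 0       # within thresholds: hold steady

# Example: 4 queued jobs, 2 active runners -> scale up by one
print(decide_scaling(QueueState(queued_jobs=4, active_runners=2)))  # 1
```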
The orchestrator includes built-in Prometheus metrics:
```yaml
# prometheus.yml included in the setup
- job_name: 'orchestrator'
  static_configs:
    - targets: ['orchestrator:8080']
  metrics_path: '/api/v1/metrics'
```

Configure custom labels for your runners:

```yaml
environment:
  ORCHESTRATOR_RUNNER_LABELS: "docker-dind,linux,self-hosted,my-custom-label"
```
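Workflows then target these labels with `runs-on`. A hypothetical workflow file, assuming the label set configured above:

```yaml
# .github/workflows/ci.yml (illustrative)
name: ci
on: push
jobs:
  build:
    runs-on: [self-hosted, linux, docker-dind, my-custom-label]
    steps:
      - uses: actions/checkout@v4
      - run: docker info   # Docker-in-Docker support is available on these runners
```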
### Customizing the Runner Image
This project now builds and uses a local runner image by default (`apex-runner:local`).
You can customize the runner image by editing `runner-image/Dockerfile` and adding
any packages or tools your workflows need. The `setup-orchestrator.sh` script will
attempt to build `apex-runner:local` from `runner-image/` during setup. If you prefer
to use a remote image, set `ORCHESTRATOR_RUNNER_IMAGE` in `.env` to the desired
image (e.g. `ghcr.io/yourorg/runner:tag`).
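For example, extra tooling can be baked into the local image by appending a layer to `runner-image/Dockerfile`. This is a sketch only; it assumes a Debian/Ubuntu-based base image, and the packages are placeholders:

```dockerfile
# runner-image/Dockerfile -- illustrative addition only
# Install extra tools your workflows need (example packages, adjust as required)
RUN apt-get update \
    && apt-get install -y --no-install-recommends jq shellcheck \
    && rm -rf /var/lib/apt/lists/*
```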
For multiple orchestrator instances, enable Redis coordination:

```yaml
environment:
  REDIS_URL: "redis://redis:6379/0"
```
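The URL above assumes a `redis` service is reachable on the same Docker network; a minimal sketch of such a service (not part of the shipped compose file unless you add it):

```yaml
services:
  redis:
    image: redis:7-alpine
    restart: unless-stopped
```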
- Orchestrator won't start

  ```bash
  # Check logs
  docker logs orchestrator

  # Verify GitHub token permissions
  curl -H "Authorization: token YOUR_TOKEN" https://api.github.com/user
  ```

- Runners not registering

  ```bash
  # Check runner logs
  docker logs $(docker ps -q --filter label=managed-by=runner-orchestrator)

  # Verify network connectivity
  docker exec -it orchestrator ping github.com
  ```

- Scaling not working

  ```bash
  # Check queue monitoring
  curl http://localhost:8080/api/v1/status | jq '.queue'

  # Manually trigger scaling
  curl -X POST http://localhost:8080/api/v1/runners/scale-up
  ```
Enable debug logging:
```yaml
environment:
  ORCHESTRATOR_LOG_LEVEL: DEBUG
```

To run the orchestrator locally:

```bash
cd orchestrator
pip install -r requirements.txt
python main.py
```

```bash
# Test GitHub API connectivity
python -c "
import asyncio
from src.github_client import GitHubClient
client = GitHubClient('your-token', org='your-org')
print(asyncio.run(client.get_runners()))
"
```
Example scaling profiles, from light to heavy workloads:

```yaml
MIN_RUNNERS: 1
MAX_RUNNERS: 5
SCALE_UP_THRESHOLD: 2
POLL_INTERVAL: 60
```

```yaml
MIN_RUNNERS: 2
MAX_RUNNERS: 10
SCALE_UP_THRESHOLD: 3
POLL_INTERVAL: 30
```

```yaml
MIN_RUNNERS: 5
MAX_RUNNERS: 20
SCALE_UP_THRESHOLD: 5
POLL_INTERVAL: 15
```

Compared with the previous single-runner setup:

- Single Container → Orchestrator + Ephemeral Runners
- Manual Scaling → Automatic Scaling
- Static Configuration → Dynamic Management
- Simple Entrypoint → Full API & Monitoring
- Backup existing setup
- Stop old runners: `docker stop my-self-hosted-runner`
- Deploy orchestrator: Follow the Quick Start guide
- Update workflow labels: Use `self-hosted,orchestrated` labels
- Monitor and tune: Adjust scaling parameters
This system completely replaces the previous static runner setup. If you were using the v1.0 system:
- Static → Dynamic: Runners are now ephemeral and auto-scale
- Single Runner → Pool: Maintains multiple runners automatically
- Manual → Automated: No more manual runner registration
- Limited → Scalable: Scales from 0 to configurable maximum
- Stop old runners: `docker compose down` (if using the old setup)
- Update configuration: Use the new `.env` format (see the Configuration section)
- Deploy orchestrator: `docker compose up -d --build`
- Update workflows: No changes needed; existing workflows work automatically

Key file changes:

- `docker-compose.yml` → Now orchestrator-only
- `.env.example` → Simplified environment variables
- `Dockerfile` → Now builds the orchestrator image
- `setup-orchestrator.sh` → Updated for the new file structure
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.