What to Do When You Can't Rely on Public CI Runners

What self-hosted runners give you that public ones don’t, and how to make the self-hosted part a whole lot easier.

James Walker, Software Engineer
CI/CD pipelines enhance the software delivery cycle by automating key tasks such as code tests, artifact builds and deployments to production infrastructure. Runner processes are responsible for executing your builds and sending the results to your CI/CD server. Simple projects may use a single type of runner, while others can require multiple configurations to build against different target architectures, operating systems and software dependencies.

Runners have privileged access to your source code, the processes started by your pipelines and their outputs. Consequently, runners must be properly secured to prevent intrusion and maintain compliance, all while supporting optimal DevOps velocity.

The choice of public or self-hosted runners is one of the main factors affecting CI/CD performance and security. This article will analyze the pros and cons of these options and compare how well they support configurability, isolation, privacy and compliance.

Comparing Public vs. Self-Hosted Runners for High-Performance, High-Security CI/CD Pipelines

Runners are the compute nodes that run your CI/CD jobs. Each one hosts an agent process provided by your CI/CD server. When a new pipeline is created, the server assigns the pipeline's jobs to an available runner; the runner then executes your scripts and uploads their output back to the CI/CD server.
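The assignment step can be illustrated with a minimal Python sketch. It assumes a simplified model in which each job lists the tags it requires and the server picks the first idle runner whose tags cover them. The `Runner`, `Job` and `assign_job` names are hypothetical and do not correspond to any particular CI/CD service's API; real schedulers also handle queuing, priorities and concurrency limits.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Runner:
    name: str
    tags: frozenset[str]   # capabilities the runner advertises (hypothetical)
    busy: bool = False


@dataclass
class Job:
    name: str
    required_tags: frozenset[str]


def assign_job(job: Job, runners: list[Runner]) -> Optional[Runner]:
    """Return the first idle runner whose tags cover the job's requirements."""
    for runner in runners:
        if not runner.busy and job.required_tags <= runner.tags:
            runner.busy = True  # the runner is now occupied by this job
            return runner
    return None  # no match: a real server would leave the job queued


runners = [
    Runner("shared-small", frozenset({"linux", "x86_64"})),
    Runner("gpu-box", frozenset({"linux", "x86_64", "gpu"})),
]
job = Job("train-model", frozenset({"linux", "gpu"}))
chosen = assign_job(job, runners)
print(chosen.name if chosen else "queued")
```

In this model, a job that needs a capability no idle runner offers simply waits, which is the situation the hardware restrictions discussed later can create.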

Public runners are managed in the cloud by the CI/CD service they belong to. You don't need to manually configure any infrastructure or install any software to use them. This provides a simplified CI/CD setup experience but can create performance and security problems.

Self-hosted runners reside on your own infrastructure. You supply the compute nodes and provision them with your CI/CD service's agent software. This gives you full control over the environments in which your jobs run, at the expense of more complicated runner configuration and maintenance.

Public Runners

Public runners are shared resources operated by your CI/CD provider. Cloud CI/CD services such as GitHub Actions and GitLab CI/CD provide managed runners that are hosted as virtual machines in a variety of different sizes. When you create CI/CD pipelines, they're automatically assigned to a ready-to-use runner.

Pros and Cons of Performance and Hardware Configurability

Public runners are highly scalable, as you can easily provision new instances with minimal delay. Autoscaling options allow your runner fleet to dynamically scale with your CI/CD pipeline usage, which helps keep performance consistent as demand grows. Although you can't directly configure the underlying hardware, the provider handles upgrades to new runner agent releases and operating systems, and you can usually move to a different hardware performance tier just by changing the runner tag that you use.

However, because public runners depend on shared resources, CI pipeline performance can be inconsistent and unpredictable. Noisy neighbors (other users assigned to the same resources who run demanding workloads) could affect your pipelines, depending on how the service provider has configured its infrastructure.
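One way to put a number on this inconsistency is to compare the spread of run times for the same job across runner pools. The sketch below uses the coefficient of variation (standard deviation divided by mean); the duration figures are invented purely for illustration and are not measurements from any real service.

```python
from statistics import mean, stdev


def coefficient_of_variation(durations_s: list[float]) -> float:
    """Sample standard deviation divided by mean; higher means less predictable."""
    return stdev(durations_s) / mean(durations_s)


# Hypothetical durations (in seconds) of the same job on two runner pools.
shared_pool = [310.0, 295.0, 480.0, 305.0, 620.0]
dedicated_pool = [300.0, 305.0, 298.0, 302.0, 301.0]

print(f"shared:    {coefficient_of_variation(shared_pool):.2f}")
print(f"dedicated: {coefficient_of_variation(dedicated_pool):.2f}")
```

Tracking a figure like this over a few weeks of real pipeline history gives a more grounded view of whether noisy neighbors are actually affecting your builds.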

In addition, you're restricted to the hardware choices the provider offers; for example, if you need a particular CPU, OS and GPU configuration, then there might not be a shared runner that is suitable. This is frequently an issue for intensive pipelines such as those created by AI and ML projects.

Pros and Cons of Security Vulnerabilities and Attack Vectors

Public runners are continually maintained by the CI/CD service provider, typically with support from specialized security teams. They normally come preconfigured for an acceptable security baseline, with clear guidance available on how to further strengthen defenses. If a new vulnerability is detected in the runner agent or the host's operating system, the provider can deploy new patched instances that automatically protect you.

However, shared infrastructure can be vulnerable to isolation breakouts and environment cross-contamination. For example, a compromised pipeline running in another user's account could theoretically gain access to your own resources. In addition, the lack of direct hardware control means it can be challenging to independently audit runner security or implement custom security controls. This may render shared runners unsuitable for organizations subject to specific security standards.

Pros and Cons of Data Privacy and Access Control

Because the service provider manages public runner instances, you benefit from a simplified approach to access management that's integrated into the platform. You can have a high degree of confidence that only authorized users can interact with your runners and the shared infrastructure they run on. In addition, providers often include best practice privacy mechanisms, such as automatic encryption for your data as it's transferred to and stored on the runner.

However, keep in mind that your pipeline data is held on third-party servers that other users could potentially access if there's a vulnerability at the shared infrastructure level. There's constant potential for mass data leaks should the platform's security be compromised. It's also possible that insiders at the service provider could access your data without your knowledge.

Pros and Cons of Compliance with Industry Regulations

CI/CD services often hold industry-standard compliance certifications for data management and security. The service provider should regularly conduct compliance audits to detect and address any risks. This can simplify your own compliance work if you're targeting the same frameworks as the provider.

On the downside, public runners are inherently tied to a one-size-fits-all compliance model that may not meet the needs of every industry. When you're subject to specific compliance standards, shared infrastructure may prove incompatible with your requirements or make it challenging to verify alignment. Without direct hardware access, you have limited ability to enforce custom compliance policies.

Pros and Cons of Infrastructure Control and Isolation

Public runners place the responsibility for infrastructure management on your CI/CD provider. This means less infrastructure administration overhead for your teams, in addition to quick setup and scaling options that are integrated into the platform. High availability, fault tolerance and redundancy are typically included, so your pipelines are less likely to be blocked by a runner outage.

The flip side is limited control of the public runner infrastructure and an inability to enforce strict isolation controls. If you need to install specific software on the runner hosts, public runners will be unsuitable because host-level access isn't available. You're restricted to managing the virtualized environments that your CI/CD jobs are executed in.

Self-Hosted Runners

Self-hosted runners are an advanced option for teams that need more control over their CI/CD pipelines. This is usually for the purposes of performance, security and compliance, where administrators are prepared to take on the burden of managing runners to obtain isolation improvements. Operating your own runners on dedicated compute resources ensures they'll only ever handle your CI/CD jobs.

To self-host a runner, you'll need to choose a CI/CD service that supports this workflow, then follow the provider's instructions to install its runner application on each of your hardware nodes. This process can be automated at scale using infrastructure as code (IaC) tools such as Ansible and Terraform.

Pros and Cons of Performance and Hardware Configurability

Self-hosted runners give you dependable performance because they run jobs on hardware that's owned by or reserved for you. This also provides nearly limitless configurability, as you can provision the specific hardware types, OS releases and software dependencies your jobs require. Because you have host-level access to the infrastructure, you can precisely manage network settings such as proxies, VPNs and firewalls to safely integrate with your other services, including those on private networks that would be inaccessible from public runners.

The negative side of hardware configurability with self-hosted runners is that it can be harder to maintain performance over time since you need to regularly upgrade your runner fleet with new hardware. It's also challenging to scale self-hosted runners, so your jobs may be more easily bottlenecked by performance problems and resource exhaustion. If a job urgently needs a new runner configuration, it might not be possible to fulfill it until you've sourced new hardware or cloud resources and provisioned another runner instance.

In addition, overall operating costs can be higher because you have to pay for your dedicated resources, even when pipelines aren't actively using them.
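The cost trade-off can be framed as a break-even calculation. The sketch below assumes a simplified model where public runners are billed per minute of use and a dedicated runner has a fixed monthly cost regardless of utilization. Both prices are hypothetical placeholders, not quotes from any provider.

```python
def monthly_public_cost(minutes_used: float, price_per_minute: float) -> float:
    """Usage-based cost: you only pay for the minutes your jobs consume."""
    return minutes_used * price_per_minute


def break_even_minutes(dedicated_monthly_cost: float, price_per_minute: float) -> float:
    """Monthly job minutes above which the dedicated runner becomes cheaper."""
    return dedicated_monthly_cost / price_per_minute


# Hypothetical prices chosen only to make the arithmetic easy to follow.
PRICE_PER_MINUTE = 0.01    # public runner, per job minute
DEDICATED_MONTHLY = 400.0  # dedicated runner, paid whether busy or idle

threshold = break_even_minutes(DEDICATED_MONTHLY, PRICE_PER_MINUTE)
for minutes in (10_000, 40_000, 80_000):
    public = monthly_public_cost(minutes, PRICE_PER_MINUTE)
    cheaper = "public" if public < DEDICATED_MONTHLY else "dedicated (or equal)"
    print(f"{minutes:>6} min: public={public:.2f}, dedicated={DEDICATED_MONTHLY:.2f} -> {cheaper}")
print(f"break-even at roughly {threshold:.0f} minutes per month")
```

This ignores staff time, hardware refresh cycles and peak-capacity headroom, all of which push the real break-even point higher for self-hosted fleets.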

Pros and Cons of Security Vulnerabilities and Attack Vectors

Self-hosted runners give you complete control over your security posture. You can apply vulnerability patches as they become available without waiting for external providers. It's possible to integrate any security monitoring and automated protection software you require, while the exclusive infrastructure removes the risk of other tenants' workloads compromising your resources.

However, keep in mind that you have full responsibility for detecting and mitigating security issues and vulnerabilities. Unless you have a dedicated security team tasked with protecting your infrastructure, this can mean a delayed response to threats compared to continually updated public runners. Establishing a security team is a costly investment requiring in-house expertise that can be hard to acquire.

Pros and Cons of Data Privacy and Access Control

Self-hosting a runner means you have total control over how it handles and stores data. You can take advantage of different storage types, confidently enforce best practice safety measures such as encryption and easily conduct audits to ensure ongoing protection. This provides confidence that only authorized users can access confidential data.

However, enforcing proper data controls for self-hosted runners can be complex and time-consuming. There's a risk that access policies, encryption settings and retention periods could be misconfigured, leading to data loss or a compliance breach. Data teams may struggle to understand how data within CI/CD systems should be monitored and maintained.

Pros and Cons of Compliance with Industry Regulations

The extra control afforded by self-hosting allows you to implement compliance measures that are tailored to your specific regulatory requirements. Full ownership of your infrastructure, runner environments and data stores makes it easier to maintain and prove compliance by aligning the operating structure of your systems with that of your audit and certification frameworks.

The flip side is that when self-hosting runners, you're solely responsible for ensuring compliance. This can be an administrative burden that requires specialist expertise, leading to additional costs for audits and certifications. It requires significant effort to stay informed about changing compliance regulations and the controls they require. Although this may be inevitable for regulated organizations, having to replicate compliance with standards that public runner operators already provide can be unnecessarily inconvenient for smaller teams.

Pros and Cons of Infrastructure Control and Isolation

Self-hosted runners provide complete control over your infrastructure and its configuration. You can achieve total isolation from the public internet and other users via a directly managed security perimeter. You're also free to combine the specific hardware characteristics and software services you require, then make changes at any time. This can facilitate an improved balance between overall performance and security.

However, with control comes complexity. You need detailed technical knowledge to provision dedicated infrastructure, configure the CI/CD agent process and maintain security over time. You're in charge of ensuring your infrastructure is prepared for high availability and includes suitable disaster recovery options. New CI/CD server and agent upgrades may be incompatible with your architecture or require significant changes, potentially introducing delays before teams can utilize new features and improvements.

Conclusion

Choosing between public and self-hosted CI/CD runners can be challenging. Public runners provide convenience, but your pipelines could be affected by noisy neighbor problems and potential security issues. Conversely, self-hosted runners isolate your jobs from other tenants, but you're responsible for administering them. This workload can be onerous for DevOps teams that maintain large runner fleets.

There are no set rules for when you should switch to self-hosted runners, but in general, they're best suited to environments where configurability, control and compliance are key. For example, if you're operating a regulated service that requires hundreds of pipeline runs per day, you're more likely to find self-hosted runners useful than a team that builds internal services with a low pipeline frequency.

You may choose to combine both public and self-hosted runners to optimally balance efficiency and cost. Use public runners for noncritical systems to accelerate project setup times, but maintain a small fleet of self-hosted runners to serve performance- and security-sensitive pipelines. This helps reduce the workload placed on runner administrators.
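A hybrid setup needs some rule for deciding which pool each pipeline uses. The following is a minimal sketch of one such rule, assuming three yes/no attributes per pipeline. The `Pipeline` fields and the `choose_runner_pool` function are hypothetical names for illustration, and a real policy would likely be expressed through your CI/CD service's runner tags or labels.

```python
from dataclasses import dataclass


@dataclass
class Pipeline:
    name: str
    handles_regulated_data: bool = False
    needs_private_network: bool = False
    performance_sensitive: bool = False


def choose_runner_pool(pipeline: Pipeline) -> str:
    """Send sensitive or demanding pipelines to self-hosted runners, the rest to public ones."""
    if (
        pipeline.handles_regulated_data
        or pipeline.needs_private_network
        or pipeline.performance_sensitive
    ):
        return "self-hosted"
    return "public"


pipelines = [
    Pipeline("docs-site"),
    Pipeline("payments-api", handles_regulated_data=True),
    Pipeline("ml-training", performance_sensitive=True),
]
for p in pipelines:
    print(f"{p.name}: {choose_runner_pool(p)}")
```

Defaulting to the public pool keeps the self-hosted fleet small, so only pipelines with a stated reason consume dedicated capacity.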

Self-hosting your runners doesn't have to mean operating your own hardware, either. Using dedicated cloud resources allows you to start new runners when you need them without having to reconfigure any on-premises infrastructure. Dedicated cloud gives you private compute instances that are reserved for your exclusive use, mitigating any runner isolation concerns. You also benefit from secure connectivity to other infrastructure components that your pipelines depend on, such as artifact registries and secrets managers, in addition to integrations with your cloud provider's security infrastructure (including firewalls and observability suites).

Equinix Metal provides private automated infrastructure on demand that’s well suited for building secure and performant CI/CD pipelines right away. Equinix’s dedicated cloud resources can support your pipelines on a global scale, with safe private networking and seamless interconnection with public cloud platforms, network operators and a global ecosystem of service providers. You can tap into that ecosystem to build the right infrastructure for your self-hosted CI pipeline.

Published on 25 June 2024
