SUSE Rancher Kubernetes Engine on Equinix Metal

Adopting Rancher Kubernetes Engine on Equinix Metal using Terraform

Run Kubernetes on Equinix Metal with SUSE’s compliant and supported management layer, Rancher Kubernetes Engine.

Who’s This For?

Are you looking to run Kubernetes on bare metal? There are plenty of challenges involved, and if you’re an existing SUSE customer, adopting Rancher Kubernetes Engine could be the best route for you. You’ll get a fully CNCF-compliant Kubernetes cluster that you can hook into your support contracts across the full stack.

Getting Started

In order to get started, there are a few prerequisites that must be satisfied.

Equinix Metal Account

You’ll need to create a project and a project level API key. To create these, please follow these steps:[1]

  1. Log in to the Equinix Metal Console.
  2. Browse to the Organizations page.
  3. Select your Organization.
  4. Click “Add new” to create a new project.
  5. Click on “Project Settings”.
  6. Copy the “Project ID” and keep it handy for later.
  7. Click on “API Keys”.
  8. Click “Add an API Key”.
  9. Name it anything you like, such as “Terraform”, ensure it has “Read/Write” permissions, and create it.
  10. Copy this token and keep it handy for later.

[1] Because Equinix Metal uses the organization ID in the URLs, we can’t provide nice links for these creations.
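
Optionally, if you have the Equinix Metal CLI installed, you can sanity-check the new key before moving on. This is just a convenience step and assumes the CLI honors the METAL_AUTH_TOKEN environment variable:

# Assumes the Equinix Metal CLI (metal) is installed and reads METAL_AUTH_TOKEN.
export METAL_AUTH_TOKEN="COPIED API KEY"
metal project get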

Terraform Variables

Export Equinix Metal Variables

export TF_VAR_project_id="COPIED PROJECT ID FROM EQUINIX METAL SETUP INSTRUCTIONS"
export METAL_AUTH_TOKEN="COPIED API KEY FROM EQUINIX METAL SETUP INSTRUCTIONS"

Helping Future You

If you’re using this setup for a production cluster, you’ll want these tokens available for repeat runs and perhaps even some automation. As such, you can store these exports inside a .envrc file that you keep somewhere secure, or make these variables available within your CI system, such as GitHub Actions.

Creating the Terraform Project

You’ll be using Terraform to provision the underlying Equinix Metal hardware, as well as handling the Rancher Kubernetes Engine bootstrap.

Create a new directory and add a terraform.tf file to host our provider configuration.

terraform {
  required_providers {
    cloudinit = {
      source = "hashicorp/cloudinit"
    }

    equinix = {
      source = "equinix/equinix"
    }

    rke = {
      source  = "rancher/rke"
      version = "1.4.1"
    }

    tls = {
      source  = "hashicorp/tls"
      version = "4.0.4"
    }
  }
}

provider "equinix" {}
provider "rke" {}

Create a variables.tf file that defines the variables we’ll accept for the Terraform project.

variable "project_id" {
  type = string
}

variable "control_plane_count" {
  type    = number
  default = 1
}

variable "worker_count" {
  type    = number
  default = 1
}

variable "kubernetes_version" {
  type    = string
  default = "v1.24.10-rancher4-1"
}

Feel free to modify the default values based on the setup you’d like to encourage within your organization.
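
Alternatively, you can leave the defaults alone and override them per run on the command line, for example:

terraform apply -var="control_plane_count=3" -var="worker_count=3"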

For our nodes to bootstrap successfully, we need to ensure Docker is installed. Create ./cloudinit/bootstrap.sh within your project directory with the following contents:

#!/usr/bin/env bash
set -xeuo pipefail
curl -fsSL https://get.docker.com | bash

If you want to customize the machines in any other way, modifying this file is probably your best option.
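
For example, a hypothetical extension of bootstrap.sh that also installs a few extra packages on the Ubuntu nodes might look like this:

#!/usr/bin/env bash
set -xeuo pipefail

# Install Docker via the convenience script.
curl -fsSL https://get.docker.com | bash

# Hypothetical extras: any additional tooling you want on every node.
apt-get update
apt-get install --yes jq unzip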

Lastly, we need a main.tf file that contains the resources required for this setup.

resource "tls_private_key" "ssh_key" {
  algorithm = "ED25519"
}

resource "equinix_metal_project_ssh_key" "ssh_key" {
  name       = "rancher"
  public_key = tls_private_key.ssh_key.public_key_openssh
  project_id = var.project_id
}

data "cloudinit_config" "bootstrap" {
  gzip          = false
  base64_encode = false


  part {
    filename     = "0-bootstrap.sh"
    content_type = "text/x-shellscript"


    content = file("${path.module}/cloudinit/bootstrap.sh")
  }
}

resource "equinix_metal_device" "rancher_control_plane" {
  count               = var.control_plane_count
  hostname            = "rancher-control-plane-${count.index}"
  plan                = "c3.small.x86"
  metro               = "da"
  operating_system    = "ubuntu_20_04"
  billing_cycle       = "hourly"
  project_id          = var.project_id
  project_ssh_key_ids = [equinix_metal_project_ssh_key.ssh_key.id]
  user_data           = data.cloudinit_config.bootstrap.rendered
}

resource "equinix_metal_device" "rancher_worker" {
  count               = var.worker_count
  hostname            = "rancher-worker-${count.index}"
  plan                = "c3.small.x86"
  metro               = "da"
  operating_system    = "ubuntu_20_04"
  billing_cycle       = "hourly"
  project_id          = var.project_id
  project_ssh_key_ids = [equinix_metal_project_ssh_key.ssh_key.id]
  user_data           = data.cloudinit_config.bootstrap.rendered
}

resource "rke_cluster" "cluster" {
  ignore_docker_version = true
  kubernetes_version    = var.kubernetes_version
  delay_on_creation     = 60
  enable_cri_dockerd    = true

  cloud_provider {
    name = "external"
  }

  dynamic "nodes" {
    for_each = equinix_metal_device.rancher_control_plane

    content {
      address = nodes.value.access_public_ipv4
      user    = "root"
      role    = ["controlplane", "etcd"]
      ssh_key = tls_private_key.ssh_key.private_key_openssh
    }
  }

  dynamic "nodes" {
    for_each = equinix_metal_device.rancher_worker

    content {
      address = nodes.value.access_public_ipv4
      user    = "root"
      role    = ["worker"]
      ssh_key = tls_private_key.ssh_key.private_key_openssh
    }
  }

  upgrade_strategy {
    drain                  = true
    max_unavailable_worker = "20%"
  }
}

output "control_plane_ips" {
  value = equinix_metal_device.rancher_control_plane.*.access_public_ipv4
}

resource "local_file" "ssh_key" {
  filename        = "${path.root}/private-ssh-key"
  file_permission = "0400"
  content         = tls_private_key.ssh_key.private_key_openssh
}

resource "local_file" "kubeconfig" {
  filename = "${path.root}/kubeconfig"
  content  = rke_cluster.cluster.kube_config_yaml
}

Setup

Using an .envrc or a terraform.tfvars file, configure your project ID and METAL_AUTH_TOKEN.

Here’s an example .envrc.

export METAL_AUTH_TOKEN="abc-secret"
export TF_VAR_project_id="abc-123"

Here’s an example terraform.tfvars.

project_id = "abc-123"
# The METAL_AUTH_TOKEN cannot be set this way

Deploying

In order to deploy, we first need to grab the required providers with terraform init. Then we can run terraform apply.
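
From the project directory, that looks like:

terraform init
terraform apply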

It’s important to note that the first time you run this, you may get an error suggesting that the rke_cluster resource couldn’t connect to the machines. The user data can take a few seconds, potentially minutes, to install Docker, and the RKE bootstrap will fail if it starts before Docker is ready.

We attempt to handle this with the delay_on_creation property on the rke_cluster resource, but there are still various things that could slow the bootstrap down further (such as network congestion). You can either increase this value or simply run terraform apply again after a minute or so.
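
For example, to give cloud-init more headroom you could raise the delay (in seconds) on the rke_cluster resource; the value of 180 here is just an illustration:

resource "rke_cluster" "cluster" {
  # ...existing configuration from main.tf...

  # Seconds to wait after device creation before RKE tries to connect.
  delay_on_creation = 180
}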

Accessing Rancher Kubernetes Engine

As part of the Terraform run, two files are created in your current working directory:

  1. kubeconfig
  2. private-ssh-key

You can use the kubeconfig like so:

kubectl --kubeconfig kubeconfig get nodes
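
The private SSH key is useful for reaching the nodes directly, for example to debug a failed bootstrap. Assuming jq is installed, you can combine it with the control_plane_ips output:

ssh -i private-ssh-key root@$(terraform output -json control_plane_ips | jq -r '.[0]')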

Configuration and Next Steps

Configuring the CNI

By default, RKE deploys Calico as the Container Network Interface (CNI) plugin. If you wish to change this, you can modify the network block on the rke_cluster resource. Here’s an example of deploying Weave Net.

network {
  plugin = "weave"
}

RKE supports the following options:

  • ACI
  • Calico
  • Canal
  • Flannel
  • Weave

If you wanted to use Cilium, you would set the network plugin to “none” and handle the deployment yourself.

network {
  plugin = "none"
}

Then, using Helm, you could run:

helm --kubeconfig kubeconfig upgrade --install cilium cilium/cilium  \
                --version 1.13.1 \
                --namespace kube-system \
                --set image.repository=quay.io/cilium/cilium \
                --set global.ipam.mode=cluster-pool \
                --set global.ipam.operator.clusterPoolIPv4PodCIDR=192.168.0.0/16 \
                --set global.ipam.operator.clusterPoolIPv4MaskSize=23 \
                --set global.nativeRoutingCIDR=192.168.0.0/16 \
                --set global.endpointRoutes.enabled=true \
                --set global.hubble.relay.enabled=true \
                --set global.hubble.enabled=true \
                --set global.hubble.listenAddress=":4244" \
                --set global.hubble.ui.enabled=true
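
Once the release is installed, you can check that the Cilium agents are running; the k8s-app=cilium label selector below assumes the chart’s default labels:

kubectl --kubeconfig kubeconfig -n kube-system get pods -l k8s-app=cilium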

Configuring the Ingress

By default, RKE deploys the NGINX ingress controller. Much like the CNI, you can also specify “none” as the provider and deploy your own.

ingress {
  provider = "none"
}

RKE supports additional manifests via the addons configuration block.

In order to use another ingress controller, you’ll likely want to deploy the Equinix Metal Cloud Controller Manager. You can do so with an addons_include entry, as sketched below.
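
As a rough sketch, an addons_include entry on the rke_cluster resource could look like the following. The manifest URL is a placeholder only; substitute the deployment manifest published for the Cloud Controller Manager release you actually want, and note that the CCM also expects credentials to be configured per its documentation.

resource "rke_cluster" "cluster" {
  # ...existing configuration from main.tf...

  # Placeholder URL: replace vX.Y.Z with a real Equinix Metal Cloud
  # Controller Manager release manifest.
  addons_include = [
    "https://github.com/equinix/cloud-provider-equinix-metal/releases/download/vX.Y.Z/deployment.yaml",
  ]
}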

Instructions for installing kube-vip can be found in their documentation.

Bootstrapping a GitOps Pipeline

Finally, using the addons configuration block, you can apply any other YAML manifests you want at provision time, such as a GitOps toolchain.
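
As a minimal sketch, an addons heredoc on the rke_cluster resource could create a namespace for your GitOps tooling at provision time; the gitops-system name here is just an illustration:

resource "rke_cluster" "cluster" {
  # ...existing configuration from main.tf...

  # Inline manifests applied once the cluster is up.
  addons = <<-EOT
    apiVersion: v1
    kind: Namespace
    metadata:
      name: gitops-system
  EOT
}

Alternatively, you can bootstrap a toolchain such as Rancher’s Fleet after provisioning with Helm: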

helm --kubeconfig kubeconfig -n gitops-system install --create-namespace --wait \
    fleet-crd https://github.com/rancher/fleet/releases/download/v0.6.0-rc.4/fleet-crd-0.6.0-rc.4.tgz
helm --kubeconfig kubeconfig -n gitops-system install --create-namespace --wait \
    fleet https://github.com/rancher/fleet/releases/download/v0.6.0-rc.4/fleet-0.6.0-rc.4.tgz

Last updated: 07 May, 2024
