
Object storage using distributed MinIO with Terraform

How to deploy a distributed MinIO object storage system using Terraform on Equinix Metal

The Distributed MinIO with Terraform project is a Terraform configuration that deploys MinIO on Equinix Metal. MinIO is a high-performance object storage server compatible with Amazon S3. It's a great option for Equinix Metal users who want accessible, S3-compatible object storage, because Equinix Metal offers instance types with storage options including SATA SSDs, NVMe SSDs, and high-capacity SATA HDDs.

In this guide, you'll set up Terraform and the Distributed MinIO project, deploy a MinIO cluster, and use it to upload files to S3 storage.

Install Terraform

Terraform is distributed as a single binary. Visit the Terraform download page, choose your operating system, make the binary executable, and move it into your PATH.
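
For example, on a 64-bit Linux machine (the version number below is illustrative; substitute the latest release from the download page):

wget https://releases.hashicorp.com/terraform/1.0.0/terraform_1.0.0_linux_amd64.zip
unzip terraform_1.0.0_linux_amd64.zip
sudo mv terraform /usr/local/bin/
terraform version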

This repository currently supports Terraform v0.13 or higher.

Download the Distributed MinIO with Terraform project

To download the project, run the following command:

git clone https://github.com/packet-labs/terraform-distributed-minio.git
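
Then change into the project directory before running any Terraform commands:

cd terraform-distributed-minio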

Initialize Terraform

Terraform uses modules to deploy infrastructure. To initialize the modules, run:

terraform init

This will download the modules into a hidden directory named .terraform.

Modify your variables

We've added .tfvars to the .gitignore file so your variable values (including your API key) stay out of version control. Copy the template with:

cp vars.template terraform.tfvars

In the terraform.tfvars file, modify the following variables:

  • auth_token - This is your Equinix Metal API Key.
  • project_id - This is your Equinix Metal Project ID.

Learn about Equinix Metal API Keys and Project IDs.

Optional variables are listed below (a sample terraform.tfvars follows the list):

  • plan - We're using s3.xlarge.x86 servers by default.
  • operating_system - This install is verified on Ubuntu 20.04 (ubuntu_20_04), which performs best, though it also works on other Linux distros such as CentOS and Debian.
  • metro - Where the servers should be deployed; we're using dc.
  • cluster_size - The number of servers in the cluster; we default to 4.
  • hostname - The naming scheme for your MinIO nodes; the default is minio-storage-node.
  • storage_drive_model - You'll need to know the storage drive model in advance of your deployment so that MinIO only uses the intended drives (mixing drive models is not recommended). We're using HGST_HUS728T8TAL here since that's the current 8TB drive in the s3.xlarge.x86.
  • minio_region_name - The name for your cluster; the default is us-east-1.
  • port - The port on which to listen; the default is the MinIO standard of 9000.
  • public - Whether to listen on all IP addresses (including public and localhost), or just the single private IP; the default is true.
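
A sample terraform.tfvars using the defaults described above might look like this (the API key and project ID are placeholders; check vars.template for the exact variable names):

auth_token          = "YOUR_EQUINIX_METAL_API_KEY"
project_id          = "YOUR_PROJECT_ID"
plan                = "s3.xlarge.x86"
operating_system    = "ubuntu_20_04"
metro               = "dc"
cluster_size        = 4
hostname            = "minio-storage-node"
storage_drive_model = "HGST_HUS728T8TAL"
minio_region_name   = "us-east-1"
port                = 9000
public              = true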

The following two settings are relevant when setting up your cluster: they define how performant your cluster is (particularly when using HDDs) and how well protected your data is. Consider the size of the files you are storing: for smaller files (around 1MB and below), a smaller erasure set size yields better performance, though this also depends on the type of disks you are using.

  • minio_erasure_set_drive_count - This defines how many drives are in an erasure set. It should be a multiple of the cluster size. We're going with 8, which with our default settings (4 servers with 12 drives each, 48 drives in total) means we will have 6 sets of 8 drives.
  • minio_storage_class_standard - This defines how many parity drives are used in an erasure set; we're setting this to EC:2. With our default settings, that means that of the 8 drives in an erasure set, 2 are dedicated to parity, so you can lose up to 2 drives without losing any data.

For both settings, you can choose to pass default. The defaults favor resiliency: the erasure set size is calculated to be a multiple of the number of servers in the cluster, and can't be more than 16. The default parity is n/2, or half the number of drives in an erasure set, meaning 50% of the cluster's total storage will be dedicated to parity. Weigh these settings against your business and performance goals and how resilient you want your cluster to be.
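
To make the capacity math concrete, here's the arithmetic for the defaults above:

# 48 drives / 8 drives per erasure set = 6 sets
# With EC:2, each set holds 6 data + 2 parity drives:
echo "scale=2; 6/8" | bc    # .75 -> 75% of raw capacity is usable
# With the default parity of n/2 (4 of 8 drives):
echo "scale=2; 4/8" | bc    # .50 -> 50% of raw capacity is usable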

To learn what storage drive model a given Equinix Metal server instance is using, you can deploy the instance with a Linux distribution such as Ubuntu, Debian, or CentOS and run:

lsblk -d -o name,size,model,rota

Specifying multiple drive models is also an option, for example when the same server type ships with slightly revised drive models. To specify multiple drive models for MinIO to use, pass them separated by \|: DRIVE_MODEL_1\|DRIVE_MODEL_2. For example:

DRIVE_MODEL="HGST_HUS728T8TAL\|Micron_5200_MTFD"

Leaving the string empty (DRIVE_MODEL="") will make the script use any drive model.

If you wish to change the filesystem that is used, or the parent path of the directories where the drives are mounted, you can do so in the user_data.sh bash script in the /templates folder of this repository. The relevant bash variables are DATA_BASE for the parent directory path and FILESYSTEM_TYPE for the filesystem you wish to use.
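
As a sketch, the relevant lines in templates/user_data.sh look something like this (the values shown here are illustrative; check the script for its actual defaults):

DATA_BASE="/data"       # parent directory under which each drive is mounted
FILESYSTEM_TYPE="xfs"   # filesystem created on each drive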

Deploy the MinIO Cluster

Once your MinIO cluster is configured, the next step is deployment:

terraform apply --auto-approve

When this is complete you'll see output similar to this:

Apply complete! Resources: 10 added, 0 changed, 0 destroyed.

Outputs:

minio_access_key = Xe245QheQ7Nwi20dxsuF
minio_access_secret = 9g4LKJlXqpe7Us4MIwTPluNyTUJv4A5T9xVwwcZh
minio_endpoints = [
  "minio-storage-node1 minio endpoint is http://147.75.65.29:9000",
  "minio-storage-node2 minio endpoint is http://147.75.39.227:9000",
  "minio-storage-node3 minio endpoint is http://147.75.66.53:9000",
  "minio-storage-node4 minio endpoint is http://147.75.194.101:9000",
]
minio_region_name = us-east-1

Logging in to the MinIO Cluster

To log in and administer your cluster through the web UI, you can navigate to any of the endpoints provided at the end of the Terraform deploy message in your web browser and enter the provided access key and secret.

You can also use the MinIO Client (mc) CLI if you prefer. To connect the MinIO Client to any of your hosts, log in to one of the MinIO nodes through SSH and run the following command:

mc config host add $ALIAS $MINIO_ENDPOINT $MINIO_ACCESS_KEY $MINIO_SECRET_KEY

Where $ALIAS can be any name (we are using minio as the alias). For $MINIO_ENDPOINT you can either use the public instance IP or the localhost address. $MINIO_ACCESS_KEY and $MINIO_SECRET_KEY are given in the Terraform output results. Here's an example:

mc config host add minio http://127.0.0.1:9000 Xe245QheQ7Nwi20dxsuF 9g4LKJlXqpe7Us4MIwTPluNyTUJv4A5T9xVwwcZh

Use this command to get some info on your cluster:

mc admin info minio --json | jq .info.backend

This will provide info about the erasure coding configuration used in both the standard and reduced-redundancy MinIO storage classes:

root@minio-storage-node1:~# mc admin info minio --json | jq .info.backend
{
  "backendType": "Erasure",
  "onlineDisks": 48,
  "rrSCData": 6,
  "rrSCParity": 2,
  "standardSCData": 6,
  "standardSCParity": 2
}

Sample S3 Upload

To use this MinIO setup to upload objects via Terraform to a public bucket, you first need to create the bucket (public is the name of the bucket in this example). Log in to one of the MinIO servers with SSH, register the host with the MinIO client (the same mc config host add command as above), then create the bucket and set its policy. You can also add these steps to the automation in the Terraform script.

mc config host add minio http://127.0.0.1:9000 Xe245QheQ7Nwi20dxsuF 9g4LKJlXqpe7Us4MIwTPluNyTUJv4A5T9xVwwcZh
mc mb minio/public
mc policy set public minio/public

To upload files through Terraform you can add the following code to the main.tf file:

provider "aws" {
    region = "us-east-1"
    access_key = "Xe245QheQ7Nwi20dxsuF"
    secret_key = "9g4LKJlXqpe7Us4MIwTPluNyTUJv4A5T9xVwwcZh"
    skip_credentials_validation = true
    skip_metadata_api_check = true
    skip_requesting_account_id = true
    s3_force_path_style = true
    endpoints {
        s3 = "http://147.75.65.29:9000"
    }   
}

resource "aws_s3_bucket_object" "object" {
    bucket = "public"
    key = "my_file_name.txt"
    source = "path/to/my_file_name.txt"
    etag = filemd5("path/to/my_file_name.txt")
}

We're using the AWS Terraform provider here since MinIO is an S3-compatible storage solution.
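
After adding that code, run terraform apply again to upload the file. Because the bucket policy is public and the provider uses path-style URLs, you can verify the upload with a plain HTTP request against any node (using the sample endpoint from the output above):

terraform apply --auto-approve
curl -O http://147.75.65.29:9000/public/my_file_name.txt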

Load Balancing Your MinIO Cluster

You should load balance traffic to your MinIO servers through a single endpoint. You can do this with a DNS record that points at all of your MinIO servers, or you can use an Equinix Metal Elastic IP and announce it through BGP on all the MinIO servers to achieve ECMP load balancing.
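
For the DNS approach, a round-robin record set pointing one name at all four nodes might look like this (a zone-file sketch; minio.example.com and the TTL are placeholders, and the IPs come from the sample output above):

minio.example.com. 300 IN A 147.75.65.29
minio.example.com. 300 IN A 147.75.39.227
minio.example.com. 300 IN A 147.75.66.53
minio.example.com. 300 IN A 147.75.194.101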

Conclusion

In this guide you created a MinIO cluster with Terraform and used it as S3-compatible storage. You also saw how you can modify the terraform.tfvars file to specify your Equinix Metal server configuration, and modify your setup based on your performance and resilience needs.
