Use HashiCorp Nomad Cluster Autoscaling in OpenStack

Use the Nomad Autoscaler for Cluster Autoscaling in OpenStack

Overview

HashiCorp's workload orchestrator, Nomad, offers various autoscaling functions, which are implemented by the Nomad Autoscaler. This tutorial shows how to set up “Horizontal Cluster Autoscaling” (dynamically adding nodes to and removing nodes from the Nomad cluster) in an OpenStack environment.

Download the Nomad Autoscaler binary

In order to use autoscaling in a Nomad cluster, you need to configure and start the Nomad Autoscaler, which ships as a separate binary. It makes sense to run the Nomad Autoscaler as a Nomad job. To do so, download the required binary from releases.hashicorp.com and move it to /usr/local/bin on one of your Nomad clients.
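
Because the binary is installed on only one client, the job has to be placed on exactly that client. One way to make sure of this is a constraint in the job file that is created in the next section. The following is only a sketch: the node name "nomad-client-1" is hypothetical and has to be replaced with the name of the client that holds the binary.

```hcl
# Placed at job or group level in the Autoscaler job file.
# Pins the job to the one client that has /usr/local/bin/nomad-autoscaler.
constraint {
  attribute = "${node.unique.name}"
  value     = "nomad-client-1" # hypothetical node name
}
```

Alternatively, the binary can be installed on every client that is allowed to run the job, in which case no constraint is needed.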

Create a Nomad job file for the Nomad Autoscaler

As we want to run the Autoscaler as a Nomad job, we have to create a job file for it. That file contains all the required configuration parameters. We will go through an example file here:

As with all Nomad job files, we first configure the region and the datacenter(s) in which the job may run. As this job is close to the control plane, we choose to run it in its own namespace. This makes it necessary to apply a policy with the required rights for this namespace and job and to create a token from this policy. This is the token that is used in the next steps.
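
As a starting point, such a policy could look like the sketch below. The file name and the exact set of rules are assumptions and not taken from the setup described here; check the Nomad Autoscaler documentation for the rights your version actually needs.

```hcl
# autoscaler-policy.hcl (hypothetical file name)
# The rules below are an assumption and may have to be extended.

# Access to the namespace in which the Autoscaler job runs.
namespace "autoscalerprod4" {
  policy = "write"
}

# Cluster scaling drains and purges client nodes.
node {
  policy = "write"
}
```

A token created from this policy is the one that is later stored in a Nomad Variable and handed to the Autoscaler.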

The HTTP endpoint of the Autoscaler is bound to a port, which we let Nomad assign dynamically.
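
Since the port changes with every allocation, it can help to register the endpoint as a service with a health check. This is an optional sketch that is not part of the job file shown in this tutorial. The service name is hypothetical, and the use of Nomad's native service discovery (provider "nomad") is an assumption about your cluster.

```hcl
# Placed inside the "autoscaler_agent" task.
service {
  name     = "nomad-autoscaler" # hypothetical service name
  provider = "nomad"            # assumption: native Nomad service discovery
  port     = "http"             # the dynamic port from the network block

  check {
    type     = "http"
    path     = "/v1/health" # health endpoint of the Autoscaler agent
    interval = "10s"
    timeout  = "2s"
  }
}
```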

Next follow the configuration parameters of the Autoscaler. We configure the plugin directory as well as the directories for the general configuration and the scaling policies.

To let the Autoscaler access the local Nomad client, it needs the necessary SSL/TLS certificates and a Nomad token that has the rights to create cluster nodes (see above).

The beginning of the job file for deploying an instance of the Autoscaler could look like this:

job "autoscaler-prod4" {

  region      = "de-west"
  datacenters = ["prod4"]
  namespace   = "autoscalerprod4"

  group "autoscaler" {

    network {
      port "http" {}
    }

    task "autoscaler_agent" {
      driver = "exec"
      config {
        command = "/usr/local/bin/nomad-autoscaler"
        args = [
          "agent",
          "-plugin-dir=local/nomad-autoscaler/plugins",
          "-config=local/nomad-autoscaler/etc",
          "-policy-dir=local/nomad-autoscaler/etc/policies",
          "-nomad-address=https://127.0.0.1:4646",
          "-http-bind-address=${NOMAD_IP_http}",
          "-http-bind-port=${NOMAD_PORT_http}",
          "-nomad-ca-cert=local/nomad-autoscaler/etc/certificates/ca.pem",
          "-nomad-client-cert=local/nomad-autoscaler/etc/certificates/cert.pem",
          "-nomad-client-key=local/nomad-autoscaler/etc/certificates/private_key.pem",
          "-nomad-region=de-west"
        ]
      }
    [...]

The Nomad Nova Autoscaler Plugin

The functions of the Autoscaler can be extended via plugins (e.g. in order to work with different cloud providers), as briefly mentioned above. The Nova plugin for the Nomad Autoscaler is linked, among others, from the Nomad Autoscaler documentation. It is convenient to download the plugin from a central distribution point when the Nomad job is executed:

    [...] 
      artifact {
        source = "https://github.com/jorgemarey/nomad-nova-autoscaler/releases/download/v0.6.0/nomad-nova-autoscaler-v0.6.0-linux-amd64.tar.gz"
        destination = "local/nomad-autoscaler/plugins"
        options {
            checksum = "md5:fec29af8625842b154d30be8b8db305f"
        }
      }
    [...]

Use of Nomad Variables

In order to keep sensitive information such as SSL certificates or tokens out of the job file, it is useful to store it in Nomad Variables or in HashiCorp Vault. In this part of the job file we see that the Nomad token, the SSL certificates and the credentials for the OpenStack environment, which the Nova Autoscaler Plugin needs, are stored in Nomad Variables. The Nomad APM Plugin, which can deliver CPU and memory data, is used for application performance management (APM). It is available without further setup, as the Nomad clients already gather CPU and memory data. For more sophisticated metrics and scaling parameters (e.g. connections per second) we would use the Prometheus APM Plugin, which needs a running Prometheus instance and matching exporters.

    [...]
      template {
        destination = "${NOMAD_SECRETS_DIR}/env.txt"
        env         = true
        data        = <<EOT
NOMAD_TOKEN={{ with nomadVar "nomad/jobs/autoscaler-prod4" }}{{ .token }}{{ end }}
EOT
      }
      template {
         data = <<EOH
{{ with nomadVar "nomad/jobs/autoscaler-prod4" }}{{ .cacert }}{{ end }}
         EOH
         destination = "local/nomad-autoscaler/etc/certificates/ca.pem"
      }
      template {
         data = <<EOH
{{ with nomadVar "nomad/jobs/autoscaler-prod4" }}{{ .clientcert }}{{ end }}
         EOH
         destination = "local/nomad-autoscaler/etc/certificates/cert.pem"
      }
      template {
         data = <<EOH
{{ with nomadVar "nomad/jobs/autoscaler-prod4" }}{{ .clientkey }}{{ end }}
         EOH
         destination = "local/nomad-autoscaler/etc/certificates/private_key.pem"
      }

      template {
         data = <<EOH
apm "nomad-apm" {
  driver = "nomad-apm"
}

target "os-nova" {
  driver = "os-nova"
  config = {
    auth_url    = {{- with nomadVar "nomad/jobs/autoscaler-prod4" }} "{{ .osauthurl }}" {{- end }}
    username    = {{- with nomadVar "nomad/jobs/autoscaler-prod4" }} "{{ .osusername }}" {{- end }}
    password    = {{- with nomadVar "nomad/jobs/autoscaler-prod4" }} "{{ .ospassword }}" {{- end }}
    domain_name = {{- with nomadVar "nomad/jobs/autoscaler-prod4" }} "{{ .osdomainname }}" {{- end }}
    project_id  = {{- with nomadVar "nomad/jobs/autoscaler-prod4" }} "{{ .osprojectid }}" {{- end }}
    region_name = {{- with nomadVar "nomad/jobs/autoscaler-prod4" }} "{{ .osregion }}" {{- end }}
  }
}
 
  [...]
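
The templates above expect one Nomad Variable at the path "nomad/jobs/autoscaler-prod4" that holds all the keys they read. A variable specification for it could be sketched as follows. The file name is hypothetical, every value is a placeholder, and the namespace is assumed to be the one the job runs in.

```hcl
# spec.nv.hcl (hypothetical file name); all values are placeholders.
path      = "nomad/jobs/autoscaler-prod4"
namespace = "autoscalerprod4"

items {
  token        = "REPLACE_WITH_NOMAD_TOKEN"
  cacert       = "REPLACE_WITH_CA_CERTIFICATE_PEM"
  clientcert   = "REPLACE_WITH_CLIENT_CERTIFICATE_PEM"
  clientkey    = "REPLACE_WITH_CLIENT_KEY_PEM"
  osauthurl    = "REPLACE_WITH_OPENSTACK_AUTH_URL"
  osusername   = "REPLACE_WITH_OPENSTACK_USER"
  ospassword   = "REPLACE_WITH_OPENSTACK_PASSWORD"
  osdomainname = "REPLACE_WITH_OPENSTACK_DOMAIN"
  osprojectid  = "REPLACE_WITH_OPENSTACK_PROJECT_ID"
  osregion     = "REPLACE_WITH_OPENSTACK_REGION"
}
```

Such a file can be written to the cluster with the "nomad var put" command, for example "nomad var put @spec.nv.hcl".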

Configure the scaling strategy

Finally, we have to configure the scaling strategy. The policy consists of a check, the strategy itself and a definition of the target to which the strategy is applied. The check uses the data of the APM plugin to implement the strategy (“keep the percentage of allocated CPU at 70” in this case) for the target (using the “os-nova” target plugin).

As the strategy we chose “target-value”, in order to keep the share of allocated CPU on the Nomad clients at about 70%. In the policy we additionally configure the length of the “cooldown” period, during which the Autoscaler pauses after a scaling event. Furthermore, we set the “evaluation_interval”, which determines how often the Autoscaler evaluates whether the number of Nomad clients needs to be changed. We also set the min and max values, which define the limits within which the Autoscaler will create or destroy Nomad clients.
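
As a rough worked example of what “target-value” does: the calculation below assumes that the strategy scales the current node count by the ratio of the measured value to the target and rounds up. This is our reading of the strategy's behavior, not something measured in this setup.

```latex
\[
  \text{factor} = \frac{\text{metric}}{\text{target}}, \qquad
  \text{next count} = \left\lceil \text{current count} \times \text{factor} \right\rceil
\]
\[
  \text{metric} = 90,\ \text{target} = 70,\ \text{current count} = 1
  \;\Rightarrow\;
  \text{factor} \approx 1.29, \quad
  \text{next count} = \lceil 1.29 \rceil = 2
\]
```

The result is then limited to the range between min and max, which is 1 to 2 in this example.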

The documentation of the OpenStack Nova Autoscaler Plugin has detailed information on all the parameters. Most of them are self-explanatory. In order to implement the above strategy, the Autoscaler needs a defined node pool and a “node_class” that is applied to all created Nomad clients, so that all nodes the plugin should watch share a common attribute.
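
The created clients have to register with that class themselves, so it has to be part of the Nomad agent configuration in the image. The fragment below is a sketch under the assumption that the image sets both values in its client block and that the Nomad node pool carries the same name as the plugin's pool. The values are taken from the example policy in this tutorial.

```hcl
# Fragment of the Nomad agent configuration baked into the client image.
# Assumption: the image sets class and pool itself.
client {
  enabled    = true
  node_class = "dynamic"  # must match node_class in the scaling policy
  node_pool  = "nom-pool" # assumption: requires a Nomad version with node pools
}
```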

As this example shows, it is also possible to assign server groups and security groups to the newly created Nomad clients.

Finally, we assign CPU and memory resources to the job itself.

  [...]
    strategy "target-value" { 
      driver = "target-value"
    }
             EOH
             destination = "local/nomad-autoscaler/etc/nom>
          }

          template {
             data = <<EOH

    scaling "worker_pool_policy" {
      enabled = true
      min     = 1
      max     = 2

      policy {
        cooldown            = "2m"
        evaluation_interval = "1m"

        check "cpu_allocated_percentage" {
          source = "nomad-apm"
          query  = "percentage-allocated_cpu"
          strategy "target-value" {
            target = 70
          }
        }

        target "os-nova" {
          dry-run = false

          stop_first         = true
          image_id           = "0c453c2c-cdc2-416a-95f7-c1>
          flavor_name        = "SCS-2V-2-20"
          pool_name          = "nom-pool"
          name_prefix        = "nom-"
          network_id         = "275b130d-c650-4f20-a25c-1f>
          security_groups    = "default"
          availability_zones = "az1"
          tags               = "nom-pool,ubuntu-minimal"
          server_group_id    = "373265a7-5856-4e5c-a371-43>
  
          node_class                    = "dynamic"
          node_drain_deadline           = "1h"
          node_drain_ignore_system_jobs = false
          node_purge                    = true
          node_selector_strategy        = "least_busy"
        }
      } 
    }

             EOH
             destination = "local/nomad-autoscaler/etc/pol>
          }

          resources {  
            cpu    = 50
            memory = 128
          }

        } 
      }
    }

Result

Using the OpenStack Nova Autoscaler Plugin, Nomad can dynamically create and remove Nomad clients in a Nomad cluster. Once you have created an image (e.g. using Packer or Terraform) for your dynamically started Nomad clients and have started the Autoscaler with the job file, you should see automatically created nodes, similar to this:

root@nomad1:~# nomad node status -allocs -os  |grep -i dynamic
d838cb47  nom-pool    prod4    nom-1a607061-1f5c      dynamic     debian  false  eligible     ready   1
7f10377b  nom-pool    prod4    nom-d1d8e976-bc4f      dynamic     debian  false  eligible     ready   1

Complete Nomad job file

        job "autoscaler-ha-prod4" {

          region      = "de-west"
          datacenters = ["prod4"]
          namespace   = "autoscalerprod4"

          group "autoscaler" {

            network {
              port "http" {}
            }

            task "autoscaler_agent" {
              driver = "exec"
              config {
                command = "/usr/local/bin/nomad-autoscaler"
                args = [
                  "agent",
                  "-plugin-dir=local/nomad-autoscaler/plugins",
                  "-config=local/nomad-autoscaler/etc",
                  "-policy-dir=local/nomad-autoscaler/etc/policies",
                  "-nomad-address=https://127.0.0.1:4646",
                  "-http-bind-address=${NOMAD_IP_http}",
                  "-http-bind-port=${NOMAD_PORT_http}",
                  "-nomad-ca-cert=local/nomad-autoscaler/etc/certificates/ca.pem",
                  "-nomad-client-cert=local/nomad-autoscaler/etc/certificates/cert.pem",
                  "-nomad-client-key=local/nomad-autoscaler/etc/certificates/private_key.pem",
                  "-nomad-region=de-west"
                ]
              }

              template {
                destination = "${NOMAD_SECRETS_DIR}/env.txt"
                env         = true
                data        = <<EOT
        NOMAD_TOKEN={{ with nomadVar "nomad/jobs/autoscaler-ha-prod4" }}{{ .token }}{{ end }}
        EOT
              }
              artifact {
                source = "https://github.com/jorgemarey/nomad-nova-autoscaler/releases/download/v0.6.0/nomad-nova-autoscaler-v0.6.0-linux-amd64.tar.gz"
                destination = "local/nomad-autoscaler/plugins"
                options {
                    checksum = "md5:fec29af8625842b154d30be8b8db305f"
                }
              }

              template {
                data = <<EOH
        {{ with nomadVar "nomad/jobs/autoscaler-ha-prod4" }}{{ .cacert }}{{ end }}
                EOH
                destination = "local/nomad-autoscaler/etc/certificates/ca.pem"
              }

              template {
                data = <<EOH
        {{ with nomadVar "nomad/jobs/autoscaler-ha-prod4" }}{{ .clientcert }}{{ end }}
                EOH
                destination = "local/nomad-autoscaler/etc/certificates/cert.pem"
              }

              template {
                data = <<EOH
        {{ with nomadVar "nomad/jobs/autoscaler-ha-prod4" }}{{ .clientkey }}{{ end }}
                EOH
                destination = "local/nomad-autoscaler/etc/certificates/private_key.pem"
              }

              template {
                data = <<EOH

        apm "nomad-apm" {
          driver = "nomad-apm"
        }

        target "os-nova" {
          driver = "os-nova"
          config = {
            auth_url    = {{- with nomadVar "nomad/jobs/autoscaler-ha-prod4" }} "{{ .osauthurl }}" {{- end }}
            username    = {{- with nomadVar "nomad/jobs/autoscaler-ha-prod4" }} "{{ .osusername }}" {{- end }}
            password    = {{- with nomadVar "nomad/jobs/autoscaler-ha-prod4" }} "{{ .ospassword }}" {{- end }}
            domain_name = {{- with nomadVar "nomad/jobs/autoscaler-ha-prod4" }} "{{ .osdomainname }}" {{- end }}
            project_id  = {{- with nomadVar "nomad/jobs/autoscaler-ha-prod4" }} "{{ .osprojectid }}" {{- end }}
            region_name = {{- with nomadVar "nomad/jobs/autoscaler-ha-prod4" }} "{{ .osregion }}" {{- end }}
          }
        }

        strategy "target-value" {
          driver = "target-value"
        }
                EOH
                destination = "local/nomad-autoscaler/etc/nomad-autoscaler.hcl"
              }

              template {
                data = <<EOH

        scaling "worker_pool_policy" {
          enabled = true
          min     = 1
          max     = 2

          policy {
            cooldown            = "2m"
            evaluation_interval = "1m"

            check "cpu_allocated_percentage" {
              source = "nomad-apm"
              query  = "percentage-allocated_cpu"
              strategy "target-value" {
                target = 70
              }
            }

            target "os-nova" {
              dry-run = false

              stop_first         = true
              image_id           = "0c453c2c-cdc2-416a-95f7-c1779ed2fc54"
              flavor_name        = "SCS-2V-2-20"
              pool_name          = "nom-pool"
              name_prefix        = "nom-"
              network_id         = "275b130d-c650-4f20-a25c-1f6568f520dc"
              security_groups    = "default"
              availability_zones = "az1"
              tags               = "nom-pool,ubuntu-minimal"
              server_group_id    = "373265a7-5856-4e5c-a371-43b923c4a3d0"
              
              node_class                    = "dynamic"
              node_drain_deadline           = "1h"
              node_drain_ignore_system_jobs = false
              node_purge                    = true
              node_selector_strategy        = "least_busy"
            }
          }
        }

                EOH
                destination = "local/nomad-autoscaler/etc/policies/scaling-policy.hcl"
              }

              resources {
                cpu    = 50
                memory = 128
              }

            }
          }
        }
Last modified 03.06.2026: additions I (93489553)