Puppet Concepts by Example: Part 3

John Tucker
5 min read · Oct 29, 2024


This is part of a series of articles starting with Puppet Concepts by Example: Part 1.

Facts

Previously, we were able to provide parameters to modules for all nodes or for individual nodes. Here we explore using facts. From Facts and built-in variables:

Before requesting a catalog for a managed node, or compiling one with puppet apply, Puppet collects system information, called facts, by using the Facter tool. The facts are assigned as values to variables that you can use anywhere in your manifests.

To get the (extensive) list of facts, we run:

# facter -p
aio_agent_version => 8.9.0
augeas => {
  version => "1.14.1"
}
cloud => {
  provider => "gce"
}
...
os => {
  architecture => "amd64",
  distro => {
    codename => "noble",
    description => "Ubuntu 24.04.1 LTS",
    id => "Ubuntu",
    release => {
      full => "24.04",
      major => "24.04"
    }
  },
  family => "Debian",
  hardware => "x86_64",
  name => "Ubuntu",
  ...

note: Interestingly, the facts include cloud-provider-specific information; here the provider is gce, i.e. Google Compute Engine.

There is an extensive list of facts here; let us focus on a familiar one: os.family.

note: In addition to the facts, there are a number of Puppet-specific built-in variables; e.g., we saw one earlier in the hiera.yaml file: ::trusted.certname.

Here we delete the node-specific hiera data file, data/nodes/gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal.yaml.

We create (including the folders) a file data/os/Debian.yaml with:

---
my_parameterized_module::greet: 'hello Debian'

The data folder now contains common.yaml alongside the new os/Debian.yaml.

We update hiera.yaml, inserting a per-OS level into the hierarchy:

---
version: 5
defaults:
  # The default value for "datadir" is "data" under the same directory as the hiera.yaml
  # file (this file)
  # When specifying a datadir, make sure the directory exists.
  # See https://puppet.com/docs/puppet/latest/environments_about.html for further details on environments.
  # datadir: data
  # data_hash: yaml_data
hierarchy:
  - name: "Per-node data (yaml version)"
    path: "nodes/%{::trusted.certname}.yaml"
  - name: "Per-OS data (yaml version)"
    path: "os/%{facts.os.family}.yaml"
  - name: "Other YAML hierarchy levels"
    paths:
      - "common.yaml"

With this in place, we get:

# puppet apply /etc/puppetlabs/code/environments/production/
Notice: Compiled catalog for gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal in environment production in 0.04 seconds
Notice: hello Debian
Notice: /Stage[main]/My_parameterized_module/Notify[hello Debian]/message: defined 'message' as 'hello Debian'
Notice: Applied catalog in 0.01 seconds

note: If no matching file in the data/os folder exists, it will fall back to the parameter value in data/common.yaml.
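
The fallback described in the note can be sketched in a few lines of shell. This is only an illustration of the search order, not how Hiera is implemented: the function name first_data_file and the throwaway data folder are made up for the example, and real Hiera looks for the key inside each level rather than merely checking that a file exists.

```shell
#!/bin/sh
# Sketch only: mimics the order in which the hierarchy's data files are consulted.
# first_data_file DATADIR CERTNAME OS_FAMILY prints the first candidate that exists.
# (Hypothetical helper; real Hiera checks for the key, not just the file.)
first_data_file() {
  datadir=$1; certname=$2; os_family=$3
  for candidate in "nodes/$certname.yaml" "os/$os_family.yaml" "common.yaml"; do
    if [ -f "$datadir/$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

# Throwaway folder standing in for the environment's data directory.
demo=$(mktemp -d)
mkdir -p "$demo/os"
: > "$demo/common.yaml"
: > "$demo/os/Debian.yaml"

node="gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal"
first_data_file "$demo" "$node" "Debian"   # prints os/Debian.yaml
first_data_file "$demo" "$node" "RedHat"   # prints common.yaml (no os/RedHat.yaml)
rm -rf "$demo"
```

Because the node-specific file was deleted, the per-OS level is the first one that matches on a Debian-family machine; any other OS family drops through to common.yaml.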

Custom / External Facts

While the list of facts is rather extensive, we often need to create facts to reflect how we organize machines. For example, say we named our machines based on a pattern, [pod]-[role]-[pool]-[hash], and we want to expose the facts: pod, role, and pool.

One approach is to use custom facts, as described in Custom facts overview.

You can add custom facts by writing snippets of Ruby code on the primary Puppet server. Puppet then uses plug-ins in modules to distribute the facts to the client.

I have two issues with this approach:

  • I don’t know the Ruby language, and really do not want to learn it
  • It feels a little circular to use modules to deploy facts (which are then often used to control what modules are deployed)

Another approach is external facts, as described in External facts, where the facts are created outside of Puppet in whatever language one prefers.

Executable facts on Unix work by dropping an executable file into the standard external fact path.

So, here we create a simple Bash script, pod-role-pool.sh, in the folder /etc/puppetlabs/facter/facts.d (creating folders as needed) and make it executable.

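#!/usr/bin/env bash
# External fact: each key=value line on stdout becomes a fact.

# Split the hostname on "-" into an array, avoiding glob expansion
# and without overwriting Bash's own HOSTNAME variable.
IFS=- read -r -a parts <<< "$(hostname)"
echo "pod=${parts[0]}"
echo "role=${parts[1]}"
echo "pool=${parts[2]}"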

note: This simplified script assumes the machine is named based on the pattern, [pod]-[role]-[pool]-[hash].
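
If that assumption worries you, the parsing can be made a little more defensive. The following is a minimal sketch, not a replacement for the script above: the function name facts_from_hostname is hypothetical, it takes the hostname as an argument so it can be tried without touching the real machine name, and it refuses names with fewer than four dash-separated fields.

```shell
#!/bin/sh
# Sketch only: print pod/role/pool facts for a given hostname, or fail
# if the name does not look like [pod]-[role]-[pool]-[hash].
# (facts_from_hostname is a hypothetical helper name.)
facts_from_hostname() {
  name=${1%%.*}                # drop any domain suffix
  case "$name" in
    *-*-*-*) ;;                # at least four dash-separated fields
    *) echo "unexpected hostname: $1" >&2; return 1 ;;
  esac
  pod=${name%%-*};  rest=${name#*-}
  role=${rest%%-*}; rest=${rest#*-}
  pool=${rest%%-*}
  printf 'pod=%s\nrole=%s\npool=%s\n' "$pod" "$role" "$pool"
}

# Prints pod=gue1, role=jtucker, pool=b on separate lines.
facts_from_hostname "gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal"
```

Failing loudly on an unexpected name keeps a mis-named machine from silently picking up empty pod, role, or pool facts.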

With this script in place, we run facter and indeed see the external facts.

# facter -p | grep ^pod
pod => gue1

Let us explore how we might use these external facts to provide parameters to modules. In this example, we consider the following hierarchy (from highest to lowest priority):

  • node
  • role + pod
  • role
  • pod
  • common

We create the folders for this hierarchy (data/role/jtucker/pod) and the data/role/jtucker/pod/gue1.yaml file:

---
my_parameterized_module::greet: 'hello jtucker gue1'

We then update hiera.yaml for this setup.

---
version: 5
defaults:
  # The default value for "datadir" is "data" under the same directory as the hiera.yaml
  # file (this file)
  # When specifying a datadir, make sure the directory exists.
  # See https://puppet.com/docs/puppet/latest/environments_about.html for further details on environments.
  # datadir: data
  # data_hash: yaml_data
hierarchy:
  - name: "Per-node data (yaml version)"
    path: "nodes/%{::trusted.certname}.yaml"
  - name: "Per-role-pod data (yaml version)"
    path: "role/%{facts.role}/pod/%{facts.pod}.yaml"
  - name: "Per-role data (yaml version)"
    path: "role/%{facts.role}.yaml"
  - name: "Per-pod data (yaml version)"
    path: "pod/%{facts.pod}.yaml"
  - name: "Other YAML hierarchy levels"
    paths:
      - "common.yaml"

With this in place, we get:

# puppet apply /etc/puppetlabs/code/environments/production/
Notice: Compiled catalog for gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal in environment production in 0.03 seconds
Notice: hello jtucker gue1
Notice: /Stage[main]/My_parameterized_module/Notify[hello jtucker gue1]/message: defined 'message' as 'hello jtucker gue1'
Notice: Applied catalog in 0.02 seconds
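
To make the interpolation concrete, here is a small shell sketch that expands the path templates from this hiera.yaml for one node. It is purely illustrative: hierarchy_paths is a hypothetical helper, and the role and pod values are passed in by hand rather than read from Facter.

```shell
#!/bin/sh
# Sketch only: expand the hierarchy's path templates for one node.
# hierarchy_paths CERTNAME ROLE POD prints candidates, highest priority first.
# (Hypothetical helper; the templates mirror the hiera.yaml above.)
hierarchy_paths() {
  certname=$1; role=$2; pod=$3
  printf '%s\n' \
    "nodes/$certname.yaml" \
    "role/$role/pod/$pod.yaml" \
    "role/$role.yaml" \
    "pod/$pod.yaml" \
    "common.yaml"
}

hierarchy_paths "gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal" "jtucker" "gue1"
```

For this machine the second candidate is role/jtucker/pod/gue1.yaml, the only file besides common.yaml that exists in the data folder, which is why its greeting wins over the one in common.yaml.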

Some Hiera Magic to Assign Classes

In the previous examples, we have been able to harness the power of Hiera to flexibly provide parameters to classes.

If you recall, however, we had to rely on a hacky Puppet function to flexibly assign classes; if only we could also use Hiera to do this. It turns out that you can (I got the idea for this solution from my place of work).

Building off our previous example, we update data/common.yaml to:

---
classes:
- my_parameterized_module
my_parameterized_module::greet: 'hola mundo'

We update data/role/jtucker/pod/gue1.yaml to:

---
classes:
- my_module
my_parameterized_module::greet: 'hello jtucker gue1'

And here comes the magic in our updated manifests/site.pp.

lookup('classes', Array[String], 'unique').include

With this in place, we get:

# puppet apply /etc/puppetlabs/code/environments/production/
Notice: Compiled catalog for gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal in environment production in 0.04 seconds
Notice: hello world
Notice: /Stage[main]/My_module/Notify[hello world]/message: defined 'message' as 'hello world'
Notice: hello jtucker gue1
Notice: /Stage[main]/My_parameterized_module/Notify[hello jtucker gue1]/message: defined 'message' as 'hello jtucker gue1'
Notice: Applied catalog in 0.02 seconds
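
Both classes are included because the 'unique' merge collects the classes arrays from every matching level instead of stopping at the first one. Conceptually (and only conceptually, this is not Hiera's implementation) the merge behaves like the shell sketch below, where unique_merge is a hypothetical helper and the third input line is an invented duplicate added to show the de-duplication.

```shell
#!/bin/sh
# Sketch only: the idea behind a 'unique' merge. Values arrive one per line,
# highest-priority level first; later duplicates are dropped, order is kept.
# (unique_merge is a hypothetical helper name.)
unique_merge() {
  awk '!seen[$0]++'
}

{
  echo "my_module"                 # from data/role/jtucker/pod/gue1.yaml
  echo "my_parameterized_module"   # from data/common.yaml
  echo "my_module"                 # invented duplicate from another level
} | unique_merge
# Prints my_module, then my_parameterized_module.
```

The greet parameter, by contrast, is still resolved with the default first-match behavior, so the value from the role + pod file takes precedence over the one in common.yaml.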

It turns out that this particular pattern is buried in the Looking up data with Hiera documentation, which describes it as "a unique merge lookup of class names, then adding all of those classes to the catalog". That is exactly the one-liner used above in manifests/site.pp.

Wrap Up

So, the tricky thing with the Puppet documentation is that it is hard to know if one has covered all the key concepts.

The known topics that were not covered are those in Managing environment content with a Puppetfile and the related Using content from Puppet Forge. These topics felt complicated enough to warrant their own investigation.
