Puppet Concepts by Example: Part 2

John Tucker
5 min read · Oct 28, 2024

This is part of a series of articles starting with Puppet Concepts by Example: Part 1.

Node Definitions

In the previous example, the manifests/site.pp file included my_module on all nodes. So, from the same codedir, how can we assign specific modules to specific nodes? From Node definitions:

A node definition, also known as a node statement, is a block of Puppet code that is included only in matching nodes’ catalogs. This allows you to assign specific configurations to specific nodes.

Also:

Unlike more general conditional structures, node definitions match nodes only by name. By default, the name of a node is its certname, which defaults to the node’s fully qualified domain name.

note: To obtain your machine’s fully qualified domain name (FQDN), use the hostname command.

$ hostname --fqdn
gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal

We then update the manifests/site.pp file to be:

node 'gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal' {
  include my_module
}

node default {
}

note: We left in the default node definition as a fallback in case a node does not have an explicit match.

As expected, running puppet apply includes the module.

# puppet apply /etc/puppetlabs/code/environments/production/
Notice: Compiled catalog for gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal in environment production in 0.02 seconds
Notice: hello world
Notice: /Stage[main]/My_module/Notify[hello world]/message: defined 'message' as 'hello world'
Notice: Applied catalog in 0.01 seconds

But creating a node definition for every machine does not scale. Fortunately, node definitions are more flexible than exact name matching.

A node definition name must be one of the following:

A quoted string containing only letters, numbers, underscores (_), hyphens (-), and periods (.).

A regular expression.

The bare word default. If no other node definition matches a given node, the default node definition will be used for that node.

Here we could have used a regular expression to match any machine with an FQDN that ends in alf-testing.internal.

node /.*\.alf-testing\.internal$/ {
  include my_module
}

node default {
}

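It is important to know the matching order: Puppet first looks for a node definition with the node’s exact name, then for a matching regular expression, and falls back to default last. If more than one regular expression matches a node, Puppet uses one of them, with no guarantee as to which.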
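As a minimal sketch of how the three kinds of node definition names can coexist in one manifests/site.pp, consider the following. The notify resource in the regular expression block is only a placeholder added for illustration; it is not part of the original example.

```puppet
# Exact name: Puppet tries this form first, so it is the one used for this node.
node 'gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal' {
  include my_module
}

# Regular expression: used for any other node whose name ends in
# alf-testing.internal. Avoid overlapping regular expressions, because
# Puppet does not guarantee which one is used when several match.
node /\.alf-testing\.internal$/ {
  notify { 'matched by regular expression': }  # placeholder resource
}

# Fallback for every node that matched nothing above.
node default {
}
```

Keeping at most one regular expression that can match any given node sidesteps the ambiguity entirely.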

External node classifiers (ENC)

What if we want to include classes based on a number of “dimensions”?

For example, looking at our example FQDN, gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal, say we wanted to target classes based on a combination of the alf-testing segment (let us call this the project) and the us-east1 segment (let us call this the region).

note: If you have not guessed, I am using a Google Compute Engine (GCE) instance.

Because only a single node definition can match a node, we cannot use this approach. There is, however, a solution described in Classifying nodes:

An external node classifier is an executable that Puppet Server or puppet apply can call; it doesn’t have to be written in Ruby. Its only argument is the name of the node to be classified, and it returns a YAML document describing the node.

The problem, however, is that this external node classifier feature is only available in the agent-server architecture.

Hacking an ENC Alternative

It turns out that we can create a Puppet function, functions/puppet_node_classifier.pp, as a hacky alternative to an ENC. It accepts the node’s name and returns an array of classes to be included.

function puppet_node_classifier(String $name) >> Array[String] {
  if $name =~ /.*\.alf-testing\.internal$/ {
    $classes = [ 'my_module' ]
  } else {
    $classes = [ ]
  }
  $classes
}

note: This is a simple example that replicates the earlier regular expression node definition example.

We can then modify the manifests/site.pp file to use this function to include all the returned classes.

$classes = puppet_node_classifier("${trusted['hostname']}.${trusted['domain']}")
$classes.each |Integer $index, String $value| { include $value }

node default {
}

Here we can see that it indeed works as expected.

# puppet apply /etc/puppetlabs/code/environments/production/
Notice: Compiled catalog for gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal in environment production in 0.03 seconds
Notice: hello world
Notice: /Stage[main]/My_module/Notify[hello world]/message: defined 'message' as 'hello world'
Notice: Applied catalog in 0.01 seconds
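The same approach can be extended to the project and region dimensions discussed earlier. The following is only a sketch under stated assumptions: it assumes node names follow the GCE pattern seen above (host, then zone, then c, then project, then internal), the parameter name $node_name is my own choice, and my_region_module is a hypothetical class that would have to exist in the modules directory.

```puppet
# Hypothetical variant of puppet_node_classifier that classifies on two
# dimensions extracted from the node name: region and project.
function puppet_node_classifier(String $node_name) >> Array[String] {
  # Capture group 1 is the region (the zone without its trailing letter),
  # capture group 2 is the project. match() returns undef when nothing matches.
  $parts = $node_name.match(/^[^.]+\.([a-z]+-[a-z]+\d+)-[a-z]\.c\.([^.]+)\.internal$/)

  if $parts == undef {
    $classes = []
  } else {
    $region  = $parts[1]
    $project = $parts[2]

    # Classes selected by project.
    $by_project = $project ? {
      'alf-testing' => ['my_module'],
      default       => [],
    }

    # Classes selected by region (my_region_module is hypothetical).
    $by_region = $region ? {
      'us-east1' => ['my_region_module'],
      default    => [],
    }

    $classes = $by_project + $by_region
  }
  $classes
}
```

Because the include function also accepts an array of class names, the result of this function could be passed to include directly instead of iterating over it with each.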

Parameterized Modules

So far we have been focusing on whether or not to include a module; here we pivot to changing the behavior of a module instead by using a parameterized class.

Here we create a new module by creating the file (and folders) modules/my_parameterized_module/manifests/init.pp:

# @summary A short summary of the purpose of this class
#
# A description of what this class does
#
# @example
#   include my_parameterized_module
class my_parameterized_module (String $greet = 'hello world 2') {
  notify { $greet: }
}

and use it with an updated manifests/site.pp:

node default {
  include my_parameterized_module
}

Here we can see that it indeed works as expected.

# puppet apply /etc/puppetlabs/code/environments/production/
Notice: Compiled catalog for gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal in environment production in 0.04 seconds
Notice: hello world 2
Notice: /Stage[main]/My_parameterized_module/Notify[hello world 2]/message: defined 'message' as 'hello world 2'
Notice: Applied catalog in 0.02 seconds

Hmm... We have a parameterized class, but because we did not supply a parameter when we declared the class, we got the default behavior.
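One way to supply the parameter directly is a resource-like class declaration in manifests/site.pp. This is a minimal sketch; the greeting value is arbitrary and chosen only for illustration.

```puppet
node default {
  # Resource-like declaration: parameters are passed explicitly,
  # overriding the default value defined in the class.
  class { 'my_parameterized_module':
    greet => 'bonjour le monde',
  }
}
```

The drawback is that the value is hard-coded in the manifest, and a class can be declared this way only once per catalog.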

Hiera

Here we introduce another big concept: Hiera (as in hierarchical). From About Hiera:

Puppet’s strength is in reusable code. Code that serves many needs must be configurable: put site-specific information in external configuration data files, rather than in the code itself.

We now explore the meaning of the hiera.yaml file that we saw earlier.

---
version: 5
defaults:
  # The default value for "datadir" is "data" under the same directory as the hiera.yaml
  # file (this file)
  # When specifying a datadir, make sure the directory exists.
  # See https://puppet.com/docs/puppet/latest/environments_about.html for further details on environments.
  # datadir: data
  # data_hash: yaml_data
hierarchy:
  - name: "Per-node data (yaml version)"
    path: "nodes/%{::trusted.certname}.yaml"
  - name: "Other YAML hierarchy levels"
    paths:
      - "common.yaml"

From the document Looking up data with Hiera we see how to provide parameters to our module.

Puppet looks up the values for class parameters in Hiera, using the fully qualified name of the parameter (myclass::parameter_one) as a lookup key.

Here we first provide a parameter to all nodes by creating a new file data/common.yaml.

---
my_parameterized_module::greet: 'hola mundo'

Here we indeed get the expected result.

# puppet apply /etc/puppetlabs/code/environments/production/
Notice: Compiled catalog for gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal in environment production in 0.03 seconds
Notice: hola mundo
Notice: /Stage[main]/My_parameterized_module/Notify[hola mundo]/message: defined 'message' as 'hola mundo'
Notice: Applied catalog in 0.01 seconds
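Automatic class parameter lookup is roughly what we would get by calling the lookup function ourselves with the fully qualified parameter name. As a minimal sketch (the notify title and the fallback value are only for illustration), the same key could be read explicitly in manifests/site.pp:

```puppet
node default {
  # Explicit lookup of the key that automatic class parameter lookup uses.
  # Arguments: key, expected data type, merge behavior, and the default
  # value returned if the key is not found in any hierarchy level.
  $greet = lookup('my_parameterized_module::greet', String, 'first', 'hello world 2')

  notify { "looked up: ${greet}": }
}
```

In practice the automatic lookup is preferable for class parameters; an explicit lookup call is mostly useful for data that is not tied to a class parameter.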

If we look back at the hiera.yaml file, we can see that we can also create node-specific Hiera data. For example, we can create a new file (and folders) data/nodes/gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal.yaml:

---
my_parameterized_module::greet: 'hallo welt'

Here we see that the node-specific parameter took effect.

# puppet apply /etc/puppetlabs/code/environments/production/
Notice: Compiled catalog for gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal in environment production in 0.05 seconds
Notice: hallo welt
Notice: /Stage[main]/My_parameterized_module/Notify[hallo welt]/message: defined 'message' as 'hallo welt'
Notice: Applied catalog in 0.02 seconds

The key to understanding why the node-specific parameter took effect (over the one in the common.yaml) is the following from Looking up data with Hiera.

Earlier data sources have priority over later ones. In the example above, the node-specific data has the highest priority, and can override data from any other level. Business group data is separated into local and global sources, with the local one overriding the global one. Common data used by all nodes always goes last.

Next Steps

We continue this series in the next article, Puppet Concepts by Example: Part 3.
