Puppet Concepts by Example: Part 1

Oct 27, 2024

Unfortunately, the official Welcome to Puppet® 8.10.0 documentation reads more like a reference than a guided tutorial, so I wrote this series of articles to fill that gap.

Unlike another series of articles I wrote some years ago, Puppet Code by Example: Part 1 [2,3], the focus here is on Puppet concepts rather than the Puppet language.

Requirements

To follow along, you will need a machine that meets the requirements of a supported agent platform (ideally some variant of *nix).

This tutorial was written using a Google Cloud GCE instance based on the Canonical, Ubuntu, 24.04 LTS Minimal, amd64 noble minimal image built on 2024-09-26.

Architecture

From the Puppet architecture page:

Puppet is configured in an agent-server architecture, in which a primary server node controls configuration information for a fleet of managed agent nodes.

The equivalent Puppet 5.5 page, however, describes a second architecture (deprecated, but still supported in Puppet 8.10):

Alternatively, Puppet can run in a stand-alone architecture, where each managed node has its own complete copy of your configuration info and compiles its own catalog.

Because it is easier to operate, we will use the stand-alone architecture throughout this series of articles; the concepts, however, equally apply to the agent-server architecture.

Note: There are other arguments for using the stand-alone architecture that are outside the scope of this article.

Installation

Here we follow the Install Puppet documentation, specifically the instructions for installing agents (and thus not Puppet Server).

The one deviation is that we completely disable the puppet (agent) service using the following command as root:

# /opt/puppetlabs/bin/puppet resource service puppet ensure=stopped enable=false

Assuming we followed all the instructions, we can now run the puppet CLI, e.g., to get help.

# puppet apply -h

Apply

While not used in the agent-server architecture, the puppet apply command is the key to the stand-alone architecture. From its help text:

Applies a standalone Puppet manifest to the local system.

As root, we will first execute a trivial bit of Puppet code.

# puppet apply --execute 'notify { "hello world": }'
Notice: Compiled catalog for gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal in environment production in 0.02 seconds
Notice: hello world
Notice: /Stage[main]/Main/Notify[hello world]/message: defined 'message' as 'hello world'
Notice: Applied catalog in 0.01 seconds
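
The same resource can also live in a manifest file, which puppet apply accepts as its argument. The following is a minimal sketch; the helper name write_hello_manifest and the path you pass to it are assumptions for illustration, not part of Puppet.

```shell
# Hypothetical helper: write the hello-world resource to a manifest file.
# $1 is the path of the manifest to create (chosen by you).
write_hello_manifest() {
  cat > "$1" <<'EOF'
notify { 'hello world': }
EOF
}
```

For example, running write_hello_manifest /tmp/hello.pp followed by puppet apply /tmp/hello.pp should apply the same notify resource as the --execute form above.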

codedir

From Code and data directory (codedir):

The codedir is the main directory for Puppet code and data. It is used by the primary Puppet server and Puppet apply, but not by Puppet agent. It contains environments (which contain your manifests and modules), a global modules directory for all environments, and your Hiera data and configuration.

On *nix machines this defaults to /etc/puppetlabs/code with the following initial content.
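
To confirm the location on your own machine, you can ask Puppet directly. This sketch uses the puppet config print subcommand; the helper name show_codedir and the fallback to the documented default when puppet is not on the PATH are assumptions made for illustration.

```shell
# Print the codedir Puppet would use.
# Assumption: if puppet is not on the PATH, fall back to the documented
# *nix default so the helper still prints something useful.
show_codedir() {
  if command -v puppet >/dev/null 2>&1; then
    puppet config print codedir
  else
    echo /etc/puppetlabs/code
  fi
}
```

Run it as root, since non-root users get a per-user codedir instead. Something like find "$(show_codedir)" -maxdepth 3 then lists what the installer put there.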

In the agent-server architecture there is a concept of environments:

An environment is an isolated group of agent nodes that a primary server can serve with its own main manifest and set of modules.

In the stand-alone architecture, each machine has its own codedir, so this concept is not relevant. For the sake of consistency, however, we will build upon the content in the production folder.

The environment.conf file is effectively empty (only comments) but could be used to point Puppet at different manifests and modules folders.
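
As a hedged illustration of what such an override might look like: manifest and modulepath are real environment.conf settings, while the helper name write_environment_conf and the folder names site_manifests and site_modules are invented for this sketch. We keep the defaults in this series, so try this only on a scratch copy of an environment.

```shell
# Hypothetical helper: write an environment.conf into environment directory $1.
# The quoted heredoc keeps $basemodulepath literal so Puppet, not the shell,
# expands it.
write_environment_conf() {
  cat > "$1/environment.conf" <<'EOF'
# Relative paths are resolved against the environment's own directory.
manifest = site_manifests
modulepath = site_modules:$basemodulepath
EOF
}
```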

We will delve into the hiera.yaml file later.

---
version: 5
defaults:
  # The default value for "datadir" is "data" under the same directory as the hiera.yaml
  # file (this file)
  # When specifying a datadir, make sure the directory exists.
  # See https://puppet.com/docs/puppet/latest/environments_about.html for further details on environments.
  # datadir: data
  # data_hash: yaml_data
hierarchy:
  - name: "Per-node data (yaml version)"
    path: "nodes/%{::trusted.certname}.yaml"
  - name: "Other YAML hierarchy levels"
    paths:
      - "common.yaml"

While it is not obvious from the puppet apply command, the proper use of the single parameter is to reference an environment folder.

# puppet apply /etc/puppetlabs/code/environments/production/
Notice: Compiled catalog for gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal in environment production in 0.01 seconds
Notice: Applied catalog in 0.01 seconds

Modules

The first big concept is modules; from Modules overview:

You’ll keep nearly all of your Puppet code in modules. Each module manages a specific task in your infrastructure, such as installing and configuring a piece of software. Modules serve as the basic building blocks of Puppet and are reusable and shareable.

As we create our own modules, each one will live in its own folder under environments/production/modules.

The Puppet documentation would have you install the Puppet Development Kit (PDK) and create a module, named my_module, by running the command pdk new module my_module in the modules folder. The my_module folder and contents are:

The trouble with this approach is that virtually all of these files and folders are optional (and confusing); instead, we will create a minimal module manually.

We first create the my_module and my_module/manifests folders, which form the minimal module folder structure.
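
A minimal sketch of that step follows; the helper name make_module_skeleton is hypothetical, and mkdir -p creates both folders in one go.

```shell
# Hypothetical helper: create the minimal module folder structure.
# $1 is the modules directory, $2 is the module name.
make_module_skeleton() {
  mkdir -p "$1/$2/manifests"
}
```

As root, make_module_skeleton /etc/puppetlabs/code/environments/production/modules my_module creates the two folders used in the rest of this article.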

Classes

Having created our my_module module, we create a class; from Classes:

Classes are named blocks of Puppet code that are stored in modules and applied later when they are invoked by name.

Here we create the main class of our module.

The init.pp manifest is the main class of a module and, unlike other classes or defined types, it is referred to only by the name of the module itself.

If we were using the PDK, we would run pdk new class my_module from the my_module folder to create two files: manifests/init.pp, where we define the class, and spec/classes/my_module_spec.rb, which scaffolds a test.

# pdk new class my_module

---------------Files added--------------
/etc/puppetlabs/code/environments/production/modules/my_module/spec/classes/my_module_spec.rb
/etc/puppetlabs/code/environments/production/modules/my_module/manifests/init.pp

The manifests/init.pp file, which we create manually, gives us the structure for our main class.

# @summary A short summary of the purpose of this class
#
# A description of what this class does
#
# @example
#   include my_module
class my_module {
}

To give the class (and thus module) some functionality, we add the hello world Puppet code we used earlier.

# @summary A short summary of the purpose of this class
#
# A description of what this class does
#
# @example
#   include my_module
class my_module {
  notify { "hello world": }
}

With this in place, we can use (include) this module in our puppet apply command.

# puppet apply --execute 'include my_module' /etc/puppetlabs/code/environments/production/
Notice: Compiled catalog for gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal in environment production in 0.02 seconds
Notice: hello world
Notice: /Stage[main]/My_module/Notify[hello world]/message: defined 'message' as 'hello world'
Notice: Applied catalog in 0.02 seconds

Resources

Moving down the code organizational ladder, from modules to classes, we get to the smallest unit: the resource.

Resources are the fundamental unit for modeling system configurations. Each resource describes the desired state for some aspect of a system, like a specific service or package. When Puppet applies a catalog to the target system, it manages every resource in the catalog, ensuring the actual state matches the desired state.

For example, the notify { "hello world": } in our earlier examples is a simple resource. Let us consider the structure of a resource:

<TYPE> { '<TITLE>':
  <ATTRIBUTE> => <VALUE>,
  <ATTRIBUTE> => <VALUE>,
}

Where:

<TYPE>: One of the numerous resource types, e.g., file
<TITLE>: An arbitrary identifier that must be unique among all resources of the same type applied to a particular system
<ATTRIBUTE>: A resource type-specific parameter
<VALUE>: The desired value for that attribute

In this case notify is the type and hello world is the title.
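
To make the structure concrete, here is a small sketch that prints a resource declaration from its parts. The helper render_resource is hypothetical and exists only to show where the type, title, attributes, and values go; it does no escaping, so a title containing a single quote would produce invalid Puppet code.

```shell
# Hypothetical helper: print a resource declaration.
# Usage: render_resource TYPE TITLE [ATTRIBUTE=VALUE ...]
# Values are emitted verbatim, so quote strings yourself (e.g. "'text'").
render_resource() {
  rr_type="$1"
  rr_title="$2"
  shift 2
  printf "%s { '%s':\n" "$rr_type" "$rr_title"
  for rr_pair in "$@"; do
    # Split each pair at the first "=": attribute before, value after.
    printf '  %s => %s,\n' "${rr_pair%%=*}" "${rr_pair#*=}"
  done
  printf '}\n'
}
```

For example, render_resource file /tmp/hello.txt ensure=file "content='hello world'" prints a file resource with two attributes, and the printed text can be passed to puppet apply --execute.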

Main Manifest

In the previous example, we had to pass include my_module on the command line; here we direct all nodes, i.e., our machine, to do so using a manifest. From Main manifest directory:

Puppet starts compiling a catalog either with a single manifest file or with a directory of manifests that are treated like a single file. This starting point is called the main manifest or site manifest.

We create a file named, by convention, manifests/site.pp.

node default {
  include my_module
}

With this in place, we can drop the --execute option and still have the module included.

# puppet apply /etc/puppetlabs/code/environments/production/
Notice: Compiled catalog for gue1-jtucker-b-r9sh.us-east1-b.c.alf-testing.internal in environment production in 0.02 seconds
Notice: hello world
Notice: /Stage[main]/My_module/Notify[hello world]/message: defined 'message' as 'hello world'
Notice: Applied catalog in 0.01 seconds
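
As a recap, the files created in this article can be laid out in one step. This is a sketch under assumptions: the helper name setup_example is hypothetical, and it overwrites init.pp and site.pp in whatever environment directory you give it, so consider pointing it at a scratch directory first.

```shell
# Hypothetical helper: lay out this article's example under environment
# directory $1 (for instance a scratch copy of the production folder).
setup_example() {
  mkdir -p "$1/manifests" "$1/modules/my_module/manifests"

  # Main class of the module.
  cat > "$1/modules/my_module/manifests/init.pp" <<'EOF'
class my_module {
  notify { 'hello world': }
}
EOF

  # Main manifest: every node includes the module.
  cat > "$1/manifests/site.pp" <<'EOF'
node default {
  include my_module
}
EOF
}
```

Running setup_example /etc/puppetlabs/code/environments/production as root recreates the state used by the puppet apply command above.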

Next Steps

We continue this series in the next article, Puppet Concepts by Example: Part 2.