Every time I dive into the source of Puppet, I seem to forget everything about as fast as I figure it out. I have the attention of a small overstimulated chipmunk, and there’s just a lot of detail to absorb so contents tend to slip out of my brain. In light of this, I’ve decided that I’m going to try to blog on each module/class that I manage to decipher. It’ll force me to get my thoughts in one place. I’m also hoping that this will help other people who go delving into the source.
Note: All of this is done against 2.7.x. While I would love to start tearing into 3.0.0, it introduces some new behavior that I don’t want to talk about yet.
Also, this is just what I’ve been able to derive while reading the source, so I could be wrong. If you find something erroneous, please find me in #puppet on freenode and let me know. (Yes, I need to add comments to my blog. It’s on the TODO list.)
Getting started: Puppet::Configurer
The Configurer is the heart of the normal Puppet agent. When you think about the different stages of a normal agent run, it’s all kicked off by the Configurer. It handles pluginsync, uploading facts, retrieving a catalog, applying the catalog, and then submitting the report.
The Configurer class doesn’t seem to be designed as a general-use class.
From what I’ve gleaned, the expectation is that you’ll instantiate the object,
#run on it, and call it a day. But considering that it’s the class that
drives pretty much everything, it’s definitely good to be familiar with it.
It’s also worth noting that the Configurer might eventually become obsolete. With the advent of Puppet Faces, the work that the Configurer does now could probably be replaced by assembling Faces. In fact, I believe the secret agent face does just this. It makes sense to see things moving from the monolithic, one-shot architecture used by the Configurer to behavior more akin to the secret agent face.
That being said, if you’re running
puppet agent then you’re using this code.
Before we get started, this code makes heavy use of the indirector. If you aren’t familiar with the indirector, you should read Masterzen’s blog post on the indirector.
Alright, let’s go source diving.
```ruby
Puppet::Configurer.new      # => your configurer
Puppet::Configurer.instance # => the same configurer
```
It’s important to note that the Configurer is expected to be a singleton
instance. Once you’ve instantiated a Configurer object,
Puppet::Configurer.instance is how you get it back.
Now, what’s interesting is that the agent itself handles locking. By the look
of it, the Puppet::Agent class was split off from the Configurer.
In both classes, there’s this comment:
```ruby
# Just so we can specify that we are "the" instance.
```
It looks like the Configurer and the Agent were split at some point, and some of the locking/singleton logic was left behind here. This is mostly a historical artifact; it may not be used anymore, but it will be relevant later.
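The pattern behind that comment can be sketched in a few lines of plain Ruby. This is illustrative only, not the actual Puppet source:

```ruby
# A simplified sketch of the "the instance" pattern shared by
# Puppet::Configurer and Puppet::Agent (class name here is illustrative).
class Configurer
  # Hand back the most recently created object: "the" instance.
  def self.instance
    @instance
  end

  def initialize
    # Remember whichever object was constructed last.
    self.class.instance_variable_set(:@instance, self)
  end
end

c = Configurer.new
Configurer.instance.equal?(c) # => true
```

Note that this isn't a strict singleton: nothing stops you from calling `new` twice, it just remembers the latest one.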
(Grossly oversimplified) example:
```ruby
c = Puppet::Configurer.new
c.run # OMG PUPPET RUN!
```
No, really, this is basically all you need to do a run.
This is where the magic happens. There’s a pattern that pops up in Puppet fairly frequently, where there’s a number of normal methods, and one method that basically runs everything else. Nothing too unusual, it just means that there’s one point that ties together all the class logic. This is it.
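A grossly simplified sketch of that pattern, with made-up method names and return values standing in for the real ones:

```ruby
# Illustrative sketch of the "one method drives everything" pattern.
# None of these method names or return values are Puppet's real code.
class MiniConfigurer
  def run
    prepare                                   # storage setup + pluginsync
    facts   = collect_facts                   # gather facts for the master
    catalog = retrieve_catalog(facts)         # compile/download the catalog
    report  = apply_catalog(catalog)          # apply it, collecting results
    send_report(report)                       # ship the report back
  end

  private

  def prepare;              :prepared;                          end
  def collect_facts;        { "osfamily" => "Demo" };           end
  def retrieve_catalog(f);  { facts: f };                       end
  def apply_catalog(cat);   { status: :applied, catalog: cat }; end
  def send_report(report);  report;                             end
end

MiniConfigurer.new.run[:status] # => :applied
```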
This does a lot, so I’ll summarize.
1. Set up reporting
The first thing #run does is create a new Puppet::Transaction::Report object
and add it as a log destination, so every logged action is captured in it.
This way, the report that’ll be submitted to the master is populated the same
way that logging goes to syslog, or to the console if you’re using
puppet agent -t.
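Here's a rough sketch of the idea, with MiniReport and MiniLog standing in for Puppet::Transaction::Report and the logging machinery (names and API are illustrative, not Puppet's):

```ruby
# A report that doubles as a log destination: anything logged is captured.
class MiniReport
  attr_reader :logs
  def initialize; @logs = []; end
  def <<(msg); @logs << msg; end
end

# A logger that fans messages out to every registered destination.
class MiniLog
  @destinations = []
  class << self
    def newdestination(dest); @destinations << dest; end
    def notice(msg); @destinations.each { |d| d << msg }; end
  end
end

report  = MiniReport.new
console = [] # stand-in for the console/syslog destination
MiniLog.newdestination(report)
MiniLog.newdestination(console)

MiniLog.notice("Finished catalog run")
report.logs # => ["Finished catalog run"]
```

The same message lands in both places, which is exactly why the submitted report mirrors what you see on the console.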
2. Prepare storage and sync plugins
Some basic prep is done with the
#prepare method. It sets up caching for the
application. If pluginsync is turned on,
#prepare will download our plugins -
Facter facts, types, providers, etc.
After that, the facts needed for catalog compilation are gathered.
3. Retrieve and apply the catalog
Once we have our facts, we have everything we need to actually perform the run.
The #retrieve_and_apply_catalog method is called with the facts we just gathered.
4. Upload the report
After we’ve applied the catalog, the run is complete. The report generated
at the beginning of the run is then sent off with the #send_report method.
Whew. Yeah, this method does a lot. Starting from the top, let’s work down
through the methods that
#run calls to see what’s done at a lower level.
The #prepare method handles two things: setting up a cache for Puppet, and running pluginsync if necessary.
The first part initializes the
Puppet::Util::Storage singleton object for the
rest of the run. This way, the rest of the system can use that for caching, and
not have to worry about how it gets there.
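A toy version of that kind of process-wide store might look like this (illustrative only; the real Puppet::Util::Storage also persists its state to the statefile on disk):

```ruby
# Sketch of a process-wide state store. Any part of the run can ask for
# the cache bucket belonging to a key, without caring how it's stored.
module MiniStorage
  @state = {}

  class << self
    # Hand back (creating if needed) the cache hash for a given key.
    def cache(key)
      @state[key] ||= {}
    end

    def clear
      @state = {}
    end
  end
end

# One part of the system records when a resource was last checked...
MiniStorage.cache("File[/etc/motd]")[:checked] = Time.at(0)

# ...and any other part can read it back later in the same run.
MiniStorage.cache("File[/etc/motd]")[:checked]
```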
Have you ever CTRL-C’d a puppet run, rerun it, and got an error about a corrupt state file? This is where it whines, and then nukes the old statefile.
(Taken from the aforementioned code)
```ruby
Puppet.err "State got corrupted"
```
Familiar? If some part of Puppet was writing to the statefile when Puppet was terminated, this statefile might get mangled. If this file exists and is corrupted, it’s deleted.
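The recovery logic amounts to something like this sketch (simplified; load_state here is a made-up stand-in for what Puppet::Util::Storage does when loading the statefile):

```ruby
require "yaml"
require "tempfile"

# Try to parse the YAML statefile; on failure, complain and delete it
# so the next run starts from a clean slate.
def load_state(path)
  YAML.load_file(path) || {}
rescue StandardError
  warn "State got corrupted"
  File.unlink(path)
  {}
end

# Simulate a statefile that got mangled mid-write.
file = Tempfile.new("statefile")
file.write("{ this is : not valid yaml ::::")
file.close

state = load_state(file.path)
state                  # => {}
File.exist?(file.path) # => false  (the corrupt file was nuked)
```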
The other part of
#prepare is pluginsync. It’s been entirely delegated to
Puppet::Configurer::PluginHandler, which in turn uses
Puppet::Configurer::Downloader. We’ll discuss this later, just know that the
first thing that’s really done in a puppet run is the pluginsync, and it’s
kicked off by this method.
This is the part where we go out and grab our facts. Fact retrieval has actually been indirected, so we don’t directly go and grab the facts from Facter. Instead, the indirector is called, which defaults to Facter itself on the agent. This behavior does allow for some interesting injection of behavior.
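A toy model of that indirection, with made-up names (the real machinery is Puppet::Indirector and its termini):

```ruby
# The caller asks the indirection for facts; a configurable "terminus"
# decides where the facts actually come from. Names are illustrative.
class FactsIndirection
  TERMINI = {
    facter: ->(node) { { "hostname" => node, "source" => "facter" } },
    rest:   ->(node) { { "hostname" => node, "source" => "rest" } },
  }

  def self.terminus_class=(name)
    @terminus = name
  end

  # Callers never know (or care) which terminus answers.
  def self.find(node)
    TERMINI.fetch(@terminus).call(node)
  end
end

FactsIndirection.terminus_class = :facter # the agent-side default
FactsIndirection.find("web01")["source"]  # => "facter"
```

Swapping the terminus changes where facts come from without touching any calling code, which is the "interesting injection of behavior" the indirection allows.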
So you know the b64_zlib_yaml format that’s mentioned all over the place when
you’re running puppet agent -t --debug? It turns out that this is a custom
format built for handling facts. It’s YAML (a standard Puppet serialization
format) that’s been compressed with zlib and then base64 encoded. This
compressed format was added because of size limits on the fact upload, which
have since been addressed.
So we have these facts, and they might be really hefty. We attempt to use the aforementioned b64_zlib_yaml format on them, otherwise we fall back to uncompressed YAML. After this is done, the format used to encode the facts is returned, along with the CGI-escaped facts themselves. The goal of all of this is to get our facts into a format that’s best suited to send to the master.
The logic for all of this is implemented in the Puppet::Configurer::FactHandler
module, which is mixed into the Configurer.
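Stripped of the indirector plumbing, the encoding itself is just stock Ruby. This is a sketch of the idea, not FactHandler's actual code:

```ruby
require "yaml"
require "zlib"
require "base64"
require "cgi"

facts = { "hostname" => "web01", "osfamily" => "Debian" }

# b64_zlib_yaml in a nutshell: YAML, deflated with zlib, base64 encoded...
encoded = Base64.encode64(Zlib::Deflate.deflate(facts.to_yaml))
# ...and CGI-escaped so it survives being sent as part of a request.
payload = CGI.escape(encoded)

# The round trip back, as the master would do it:
decoded = YAML.load(Zlib::Inflate.inflate(Base64.decode64(CGI.unescape(payload))))
decoded == facts # => true
```

For a realistic fact set (hundreds of keys, lots of repeated strings), the compression step pays off handsomely, which is the whole point of the format.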
So we have all our plugins and we have our facts, so we’re ready to roll. We need to run our prerun command if it exists, apply the catalog, and then run the postrun command.
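That sequence can be sketched like so (illustrative; the real commands come from the :prerun_command and :postrun_command settings, and run_with_hooks is a made-up name):

```ruby
# Run the optional prerun hook, the main work, then the optional postrun hook,
# returning the sequence of events for inspection.
def run_with_hooks(prerun:, postrun:, &apply)
  events = []
  events << prerun.call  if prerun
  events << apply.call
  events << postrun.call if postrun
  events
end

events = run_with_hooks(
  prerun:  -> { :prerun },
  postrun: -> { :postrun }
) { :applied }

events # => [:prerun, :applied, :postrun]
```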
Getting a catalog is more complex than it looks, because Puppet can either fetch
a new catalog, or apply an existing catalog. Once we have it, we do
catalog.apply and we’re off to the races. Once the catalog is applied, we send
the report. And that’s it! That’s a puppet run!
The logic for catalog retrieval is split into a few methods, so I’ll address them individually.
The #retrieve_catalog method tries to get a catalog from somewhere. We’ve got the two cases mentioned above: by default, get a new catalog; otherwise, reuse an existing one. This method delegates a lot of the work to two other methods.
The default behavior, implemented in #retrieve_new_catalog, is to do a standard REST call to the master. This REST call uploads the facts generated earlier, which the master uses to compile a new catalog. The catalog is then downloaded and cached on the client.
If the configuration indicates that a cached catalog should be used, or if
catalog retrieval fails and
:usecacheonfailure is enabled, #retrieve_catalog_from_cache will try to use
the catalog that we cached on the last successful run. This is where catalogs
cached on the client in
$vardir/yaml come into play.
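The fallback logic can be sketched as follows (illustrative names and signature; not the actual implementation):

```ruby
# Prefer a freshly compiled catalog from the master, but fall back to the
# locally cached one when asked to, or when the master is unreachable and
# usecacheonfailure is set.
def retrieve_catalog(master:, cache:, use_cached: false, usecacheonfailure: true)
  return cache.call if use_cached

  begin
    master.call
  rescue StandardError
    raise unless usecacheonfailure
    cache.call
  end
end

fresh  = -> { { source: :master } }
broken = -> { raise "connection refused" }
cached = -> { { source: :cache } }

retrieve_catalog(master: fresh,  cache: cached)[:source] # => :master
retrieve_catalog(master: broken, cache: cached)[:source] # => :cache
```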
After the run has been completed, the resulting report data needs to be handled
in one of a number of ways. If the
:summarize option is turned on in Puppet,
then the last run summary will be displayed to the console. A copy of the run
report will be saved to
/var/lib/puppet/state, and if reporting is turned on
then a copy of that report will be sent to the master.
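A sketch of that decision logic (the option names mirror the :summarize and :report settings; the method name, path, and return shape are made up for illustration):

```ruby
# Decide what to do with a finished run's report: summarize to the console
# if asked, always save a local copy, and send it on if reporting is enabled.
def handle_report(report, summarize: false, report_enabled: true)
  actions = []
  actions << [:console, report[:summary]]                        if summarize
  actions << [:save, "/var/lib/puppet/state/last_run_report.yaml"]
  actions << [:send_to_master, report]                           if report_enabled
  actions
end

actions = handle_report({ summary: "0 failures" }, summarize: true)
actions.map(&:first) # => [:console, :save, :send_to_master]
```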
In summary, when you think of a typical puppet agent run, this is where it’s done. Pluginsync is performed, facts are prepared, they’re sent to the master when the catalog is retrieved, that catalog is applied, and then the report of this all is sent to the master. This is enough of a view from 50,000 feet that you’ll be able to see how other parts fit in later.
Coming up next: pluginsync, in more detail than you EVER WANTED TO KNOW!
Addendum: Puppet 2.7.17 was used as the reference version for this blog.