The Angry Guide to Puppet 3

Update: An additional list of upgrade issues to be aware of can be found on the Puppet wiki.

A little while back I was asked to make the intrepid foray into the Puppet RCs. I was sent into the wild with only a machete and my wits, to explore the new world that would soon be upon us. I have returned, bloodied, torn, but gloriously victorious, and I am here to share my triumph with you all.

How’s that for a dramatic introduction?

Puppet 3 is a significant milestone for Puppet, and feature-wise it is the biggest such milestone since the jump from 0.25 to 2.6. It brings a lot of new features, tons of bugfixes, and mind-blowing speed improvements.

One significant part of this release is that with Telly (the code name for Puppet 3), Puppet is going to adhere to semantic versioning. Since Telly is a major version increment, it is allowed to make backwards incompatible changes, meaning that previously conventional behavior can be SMASHED.

All in all though, upgrading isn’t that bad as long as you know what you’re in for.

Spin up a second master

First off, if you only have a single puppet master, consider bringing up a second one for this. This gives you a lot of flexibility and allows you to do a more incremental migration. Instead of trying to migrate your entire site at once, you can update hosts one by one to use a Puppet 3.0 master.

In addition, having two Puppet masters means that you can bootstrap one master off another. When I was performing my migration, the primary master went down and the changes needed to get it back up were in Puppet - but the master was down. However, with a deft application of puppet agent -t --server backup.puppet.master I was able to move on without having to perform hand configuration.

Having a second Puppet master has a lot of benefits in general, and trust me - in this case, it can be incredibly helpful.

Dynamic scoping is going to wreck you

Dynamic scoping in Puppet has been a major pain point, and it’s been deprecated for a long time. With Puppet 3, dynamic scoping is replaced with lexical scoping, which is what you’ve probably been using everywhere else.

You would also be very surprised at the number of places you’re unintentionally using dynamic scoping. It will bite you, and it will bite you hard.

For instance, look at this class, which declares an nginx::vhost defined type:

class nginx::status($monitorport = '70') {
  nginx::vhost { "status":
    template => "nginx/vhost-status.conf.erb",
  }
}

And this associated template:

server {
  listen <%= monitorport %>;
  server_name localhost;
  location /nginx_status {
    stub_status on;
    access_log   off;
    deny all;
  }
}

I stared at this manifest blankly for a long time until someone looked over and pointed out that monitorport was being used in a template rendered inside the nginx::vhost defined type. For some reason my brain didn’t process that nginx::vhost wouldn’t have direct access to monitorport. The solution was to add an explicit way of passing generic data around; in this case I added a template_options hash parameter. The resulting class looks like this:

class nginx::status($monitorport = '70') {
  nginx::vhost { "status":
    template         => "nginx/vhost-status.conf.erb",
    template_options => {
      'monitorport' => $monitorport,
    },
  }
}

And the template looks like this:

server {
  listen <%= template_options["monitorport"] %>;
  server_name localhost;
  location /nginx_status {
    stub_status on;
    access_log   off;
    deny all;
  }
}
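
To see why the explicit hash works where the bare variable does not, here is a rough stand-alone sketch using Ruby's stdlib ERB outside of Puppet. The render helper and the trimmed-down templates are hypothetical stand-ins; Puppet's real template function does more than this, but the lookup rule being illustrated is the same: the template can only see what was handed to it.

```ruby
require 'erb'

# Trimmed-down copy of the fixed template: it reads from the hash it was given.
NEW_TEMPLATE = <<~TEMPLATE
  server {
    listen <%= template_options["monitorport"] %>;
    server_name localhost;
  }
TEMPLATE

# Trimmed-down copy of the old template: it expects a bare monitorport
# variable to be visible from some enclosing scope.
OLD_TEMPLATE = <<~TEMPLATE
  server {
    listen <%= monitorport %>;
  }
TEMPLATE

# Hypothetical helper (not a Puppet API): renders a template that can only
# see this method's local variables, much like a define that only sees its
# own parameters.
def render(template, template_options = {})
  ERB.new(template).result(binding)
end

rendered = render(NEW_TEMPLATE, 'monitorport' => '70')
puts rendered

begin
  render(OLD_TEMPLATE)
rescue NameError => e
  # The old template fails because nothing named monitorport is in scope.
  puts "old template failed: #{e.class}"
end
```

Passing one generic hash keeps the vhost define ignorant of what each template needs, at the cost of the template having to know the hash keys.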

The puppet master will generate a deprecation warning once and only once for any instance of dynamic scoping. To start tracking these down, run grep deprecated /path/to/your/syslog and start fixing every instance of dynamic scoping. If you don’t see any, try restarting your puppet master in case the logs containing the deprecation warnings have been rotated out.

Dealing with dynamic scoping is going to be the largest issue you’re going to have to contend with. The reason it really helps to have two puppet masters is that you can run a Puppet 3.0 master, and test nodes against that master but be able to revert back to a 2.7 master while you work on the migration process.

As a warning, there’s been a ‘lexical’ configuration option available in puppet.conf. It’s also never been used. To quote the great documentarian Nick F, “WAT”. So if you were hoping to use this to try to test lexical scoping over dynamic scoping in 2.7.x… nope. :(

Hiera API changes

Good news: Hiera 1.0 has been added to core Puppet! Bad news: if you’ve been using Hiera, then your puppet masters will explode!

Hiera 1.0 has also been released and is a hard dependency for Puppet 3. If you were using Hiera 0.3.0 or thereabouts, then trying to use both codebases at once will make things break with the terribly bad error of:

Could not retrieve catalog from remote server: Error 400 on SERVER: undefined method `empty_answer' for Hiera::Backend:Module at /etc/puppet/environments/telly/nodes/metis.pp:9 on node metis.internal

In addition, the hiera-puppet backend has been deprecated and is no longer necessary. In fact, having this around might make the world blow up. The relevant functions have also been moved into core Puppet, so you should not have this installed.

The simple way to handle this is to remove any existing instance of Hiera, and use the Puppet Labs packages. If you’re using RPMs then use the Puppet Labs yum repository, and if you’re using dpkg then use the Puppet Labs apt repository.
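
Before upgrading, it can help to check a host for the two conflicts described above. The following is a minimal sketch, assuming you have already collected installed gem or package names and versions into a hash; the hiera_conflicts helper is hypothetical and not part of Hiera or Puppet.

```ruby
require 'rubygems'

# Hypothetical helper: takes a { name => version string } hash of installed
# gems or packages and returns a list of human-readable problems.
def hiera_conflicts(installed)
  problems = []

  # hiera-puppet is no longer needed once its functions live in core Puppet.
  if installed.key?('hiera-puppet')
    problems << 'hiera-puppet is installed; remove it before running Puppet 3'
  end

  # Anything older than Hiera 1.0 clashes with what Puppet 3 expects.
  hiera = installed['hiera']
  if hiera && Gem::Version.new(hiera) < Gem::Version.new('1.0.0')
    problems << "hiera #{hiera} predates 1.0; replace it with Hiera 1.0 or later"
  end

  problems
end

# Example input, resembling the pre-upgrade state described in this section.
hiera_conflicts('hiera' => '0.3.0', 'hiera-puppet' => '0.3.0').each do |problem|
  puts problem
end
```

How you build the input hash is up to you, for example from the output of gem list or your package manager.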

Rack changes

If you are using a Rack-based server for your Puppet master, then expect that to explode violently. There were a couple of bugfixes that affect your config.ru: #14609 and #15337.

For the first bug, Passenger wasn’t properly initializing everything which could lead to some really weird behavior. The solution for this was to use the Commandline class to initialize Puppet, to ensure that Passenger doesn’t bypass parts of initialization. This is the API breakage that you need to fix.

If you’re using Unicorn, you’ll see this in stderr.log:
E, [2012-09-21T15:59:52.395429 #14205] ERROR -- : reaped #<Process::Status: pid=1345,exited(1)> worker=3
I, [2012-09-21T15:59:52.399292 #1352]  INFO -- : Refreshing Gem list
E, [2012-09-21T15:59:52.410929 #1346] ERROR -- : undefined method `settings' for Puppet:Module (NoMethodError)
/usr/lib/ruby/1.8/puppet/application.rb:273:in `run_mode'
/usr/lib/ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require'
/usr/lib/ruby/1.8/rubygems/custom_require.rb:31:in `require'
/var/lib/gems/1.8/gems/rack-1.4.1/lib/rack/builder.rb:51:in `instance_eval'
/var/lib/gems/1.8/gems/rack-1.4.1/lib/rack/builder.rb:51:in `initialize' `new'

This error is triggered because the right files aren’t loaded in the config.ru, causing the "undefined method `settings' for Puppet:Module (NoMethodError)" failure. Using the new Rack syntax will get you fixed. And be warned - if you’re using Unicorn, workers will die right away and Unicorn will immediately start them again, so your box will be really heavily loaded as processes are restarted as fast as the machine can manage.

The second bug is a bit esoteric. If you had both a ~/.puppet/puppet.conf and /etc/puppet, and launched the puppet master, both configuration files would be sourced and merged. This mainly concerns people hacking on the puppet master outside of production, but it can’t hurt to add this.

So what’s the solution? Use a config.ru that’s specific to the major version of Puppet being used. I’ve been able to publish our puppet module for managing puppet; see our fix to see how we added support for Puppet 2.7 and Puppet 3.0.
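
As a rough sketch of what "specific to the major version" can mean for a Rack config.ru: the two bodies below are modeled on the example config.ru files shipped in Puppet's ext/rack directory for 2.7 and 3.0, reproduced from memory, so check them against your own Puppet source tree. The config_ru_for helper is hypothetical, and the confdir and vardir paths are assumptions to adjust for your site.

```ruby
# Rack bootstrap in the 2.7 style: load the master application directly.
CONFIG_RU_2_7 = <<~'RUBY'
  $0 = "master"
  ARGV << "--rack"
  require 'puppet/application/master'
  run Puppet::Application[:master].run
RUBY

# Rack bootstrap in the 3.0 style: go through the command line class so
# that initialization is not bypassed, and pin confdir/vardir explicitly
# so a stray ~/.puppet is not picked up.
CONFIG_RU_3 = <<~'RUBY'
  $0 = "master"
  ARGV << "--rack"
  ARGV << "--confdir" << "/etc/puppet"
  ARGV << "--vardir"  << "/var/lib/puppet"
  require 'puppet/util/command_line'
  run Puppet::Util::CommandLine.new.execute
RUBY

# Hypothetical helper: pick the config.ru body from a Puppet version string.
def config_ru_for(puppet_version)
  major = puppet_version.split('.').first.to_i
  major >= 3 ? CONFIG_RU_3 : CONFIG_RU_2_7
end

puts config_ru_for('3.0.0rc8')
```

In a Puppet module the same choice would usually be made in the manifest or template that writes config.ru; the helper just shows the decision in isolation.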

routes.yaml finally gets loved

Commit b2831e102f4cc57d6e0101f55e208695188d426a (#2150) introduced the routes.yaml file, which provides a way to configure Puppet termini. It was added over a year ago, but apparently nobody remembered that it existed. However, things like PuppetDB and the new JSON terminus for catalogs have required more tunable terminus configuration, so this is getting a bit more love. Later on when I discuss the performance improvements to Puppet, the JSON catalog option will take advantage of this. You can also use routes.yaml to configure other termini, such as the node and facts termini.
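
For a feel of what routes.yaml contains, here is a small sketch that builds one from a Ruby hash. The layout (run mode, then indirection, then terminus and cache) and the puppetdb/yaml values follow the example in the PuppetDB documentation; treat the specific values as assumptions for your own setup.

```ruby
require 'yaml'

# Top-level key is the run mode, second level is the indirection, and each
# indirection names the terminus to use and, optionally, a cache terminus.
routes = {
  'master' => {
    'facts' => {
      'terminus' => 'puppetdb',
      'cache'    => 'yaml',
    },
  },
}

routes_yaml = routes.to_yaml
puts routes_yaml
```

The printed YAML is what would go into routes.yaml in your Puppet confdir.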

How’s your graph theory?

If you’re using PuppetDB for your storeconfigs backend, you might see some very abstruse errors. Say you have a resource with a dependency on another resource, like this:

file { '/etc/logrotate.d/nginx':
  ensure  => file,
  content => template( 'nginx/debian_logrotate.erb' ),
  mode    => '0644',
  owner   => 'root',
  group   => 'root',
  require => Package['xz'],
}

If for some reason, package { 'xz': } is missing, you’ll get the following error:

Error: Can't synthesize edge: File[/etc/logrotate.d/nginx] -required-by- Package[xz] (param require)
/usr/lib/ruby/1.8/puppet/indirector/catalog/puppetdb.rb:163:in `synthesize_edges'
/usr/lib/ruby/1.8/puppet/indirector/catalog/puppetdb.rb:152:in `each'
/usr/lib/ruby/1.8/puppet/indirector/catalog/puppetdb.rb:152:in `synthesize_edges'
/usr/lib/ruby/1.8/puppet/indirector/catalog/puppetdb.rb:150:in `each'
/usr/lib/ruby/1.8/puppet/indirector/catalog/puppetdb.rb:150:in `synthesize_edges'
/usr/lib/ruby/1.8/puppet/indirector/catalog/puppetdb.rb:147:in `each'
/usr/lib/ruby/1.8/puppet/indirector/catalog/puppetdb.rb:147:in `synthesize_edges'
/usr/lib/ruby/1.8/puppet/indirector/catalog/puppetdb.rb:31:in `munge_catalog'

The “synthesize edge” wording is only really meaningful if you’re eyeball deep in graph theory. Basically you have a dependency that can’t be found. And you might have noticed that the error message transposes the arguments: it says the file is required by the package, when it’s the file that requires the package. Neato, huh?

There’s a chance that you might not see this, as it might have already been fixed - but in case you do hit this, you’ll know what’s going on.
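
If the graph theory framing is unfamiliar: resources are nodes, each require is an edge, and the error means an edge points at a node that isn't in the catalog. A toy sketch of that check follows; the hash-based catalog and the missing_dependencies helper are hypothetical and have nothing to do with PuppetDB's actual implementation.

```ruby
# Toy catalog: each resource reference maps to the references it requires.
CATALOG = {
  'File[/etc/logrotate.d/nginx]' => ['Package[xz]'],
  'Package[nginx]'               => [],
}.freeze

# Hypothetical helper: returns [resource, dependency] pairs for every
# dependency edge whose target is not present in the catalog.
def missing_dependencies(catalog)
  catalog.flat_map do |resource, requires|
    requires.reject { |dep| catalog.key?(dep) }
            .map { |dep| [resource, dep] }
  end
end

missing_dependencies(CATALOG).each do |resource, dep|
  puts "#{resource} requires #{dep}, which is not in the catalog"
end
```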

Calling Custom Functions in Ruby

Puppet custom functions now require an array instead of a nebulous number of arguments. Before Puppet 3.0, if you did <%= scope.function_template('foo/bar.erb') %> to use partial templates, this would work fine. However, Puppet 3 has been tending towards stricter usage to reduce the number of surprises. Now, if you’re calling a custom function from Ruby or from a template like above, you’ll need to wrap all the arguments in an array, like <%= scope.function_template(['foo/bar.erb']) %>. This way there’s no ambiguity as to how the function is being called.
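
To make the calling convention concrete without needing a running master, here is a toy stand-in. FakeScope is hypothetical and only mimics the "one array of arguments" rule; the error text and return value are made up and are not what Puppet produces.

```ruby
# Hypothetical stand-in for Puppet's scope object, illustrating the rule
# that a function takes exactly one argument: an array of its arguments.
class FakeScope
  def function_template(args)
    unless args.is_a?(Array)
      raise ArgumentError, 'expected a single Array of arguments'
    end
    # A real implementation would render each template; this just echoes.
    "rendered #{args.join(', ')}"
  end
end

scope = FakeScope.new

# New style: arguments wrapped in an array.
puts scope.function_template(['foo/bar.erb'])

# Old style: a bare string is rejected.
begin
  scope.function_template('foo/bar.erb')
rescue ArgumentError => e
  puts "old style call failed: #{e.message}"
end
```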

XMLRPC support is dead

XMLRPC was the new hotness when development on Puppet started. Now, XMLRPC is that horrible thing with the XML and the angle brackets and the pain and sad. Because of this, it’s been ripped out from Puppet.

Not that anyone will notice, really.

Mixing versions of Puppet across nodes

A Puppet 3 master has good support for 2.7.x agents. If you upgrade your master, then your agents should be able to continue running as normal, which should help ease rollout - just ensure that your master is upgraded first. If you upgrade an agent before its master, you’ll probably get this error:

Failed to apply catalog: Could not intern from pson: source '"#<Puppet::Node:0x7f' not in PSON!

The solution - don’t use a 2.7 master with a 3.0 agent.

Mixing versions of Puppet on the same node

If you have puppet installed both as a gem and a package, and one version is 3.0 and the other is 2.x, things will explode. Basically, you’ll be mixing incompatible versions of code, and loading both versions makes the world burn. If you see things like this:

Error: Could not prepare for execution: Could not create resources for managing Puppet's files and directories in sections [:main, :agent, :ssl]: Could not autoload puppet/type/user: Could not autoload puppet/provider/user/directoryservice: undefined method `symbolize' for Puppet::Type::User::ProviderDirectoryservice:Class
Could not autoload puppet/type/user: Could not autoload puppet/provider/user/directoryservice: undefined method `symbolize' for Puppet::Type::User::ProviderDirectoryservice:Class

Or this:

err: Could not retrieve catalog from remote server: Could not intern from pson: Could not autoload package: Could not autoload /usr/lib/ruby/vendor_ruby/puppet/provider/package/windows.rb: cannot load such file -- windows/error

Check to see if you have multiple versions of Puppet in your load path.
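
One quick way to do that check is to look for more than one directory containing puppet.rb. The sketch below is a hypothetical helper, not something Puppet ships. Note that a gem copy only appears on the load path once RubyGems has activated it, so it is worth running gem list puppet as well.

```ruby
# Hypothetical helper: returns every directory on the given load path that
# contains a top-level puppet.rb, i.e. a candidate Puppet installation.
def puppet_install_dirs(load_path = $LOAD_PATH)
  load_path.map { |dir| File.expand_path(dir) }
           .uniq
           .select { |dir| File.file?(File.join(dir, 'puppet.rb')) }
end

dirs = puppet_install_dirs
if dirs.size > 1
  warn 'Multiple copies of Puppet on the load path:'
  dirs.each { |dir| warn "  #{dir}" }
end
```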

Config printing is sane

Puppet run modes are weird. There’s user, agent, and master, and retrieving the right setting depends on the right mode, and there is madness and mayhem. This is a complex problem, but #16189 resolved this issue by renaming --mode to --run_mode. The thing is, we have a custom fact to determine what the agent environment is (as opposed to the master environment). This change meant that we needed to update the fact to support both syntaxes for getting the environment. (Be warned, it’s kinda gnarly code.)

Puppet 3 is FAST

This blog post has mainly been focusing on dodging the behavior changes with Puppet 3, but let’s talk about one of the reasons it rocks: it’s mind-blowingly fast. Puppet’s performance has been a longstanding concern, and Daniel Pittman and others did an incredible job of finding and removing bottlenecks.

All sorts of optimizations were done across the board, but I grabbed a few of the more interesting commits for the curious.

Looking at the graphs I’ve generated for the nodes running Puppet 3.0.0rc8, there can be as much as a 60% decrease in runtime, which is jaw dropping and immediately noticeable. Secondly, the variability of runtime has really diminished. The performance work reduces the number of objects allocated, which reduces the need for garbage collection, which reduces the time spent on non-productive work.

So, metrics! Graphs! Proof and stuff! It should be obvious where these sample nodes moved to Puppet 3.0.

[Graph: Puppet 3 Performance]

That thunk you just heard might have been the runtime falling off a cliff. Alternately, it could be my jaw. I’m still blown away by this.

Did I mention that dynamic scoping is going to wreck you?

It is. This change has been desperately needed, but it’s easy to unintentionally use dynamic scoping.

So, this post wasn’t actually that angry. So why is it the angry guide to Puppet 3? Because fuck.