Fighting with Thor

I started writing this blog post and realized “holy expletives, this might be the most epic blog post title I’ve ever written!” But no, I’m not duking it out with a Norse god.

When I first discovered Thor, I was all sorts of thrilled. Before I encountered Thor, all of my command line programs relied very heavily on optparse and I had to add a ton of options and logic to handle subcommands. When I encountered Thor, this was a whole new world. It allowed you to add a lot of discrete tasks without pain. It had excellent option parsing that was both (mostly) simple and powerful.

Better yet, it wasn’t rake. Rake was built as a replacement for Make, but since it allowed people to generate discrete tasks in ruby, it’s been largely coopted into a scripting framework. But fuck’s sake, it’s hacky.

Don’t believe me?

task :eatbabies => :environment do
  puts "OM NOM NOM"
end

Holy crap. They’re using a hash with a single key/value pair for the title. The name of the action is the key, and the dependencies are specified as the value. And if you want multiple depending tasks…

task :eatbabies => [:environment, :toast] do

This was definitely an original idea, so they get points for this. But this is very unintuitive, and if someone is new they’ll look at this and go “wat”. While it’s clever, there are better ways of doing this that are less “wat” inducing. DSLs are one thing, but mangling the syntax of ruby to pull this sort of thing off makes my skin crawl.
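For what it’s worth, the trick is plain Ruby: a braceless hash passed as the only argument. Here’s a minimal sketch of how a method could pick that argument apart. The name parse_task_arg is made up for illustration, and this is not Rake’s actual implementation, just the general idea.

```ruby
# Hypothetical stand-in for the argument handling behind `task`.
# Accepts either a bare name or a single-pair hash of name => dependencies.
def parse_task_arg(arg)
  if arg.is_a?(Hash)
    name, deps = arg.first     # the one and only key/value pair
    [name, Array(deps)]        # a single dependency or a list, always as an array
  else
    [arg, []]                  # bare name, no dependencies
  end
end

p parse_task_arg(:eatbabies)                            # => [:eatbabies, []]
p parse_task_arg(:eatbabies => :environment)            # => [:eatbabies, [:environment]]
p parse_task_arg(:eatbabies => [:environment, :toast])  # => [:eatbabies, [:environment, :toast]]
```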

So, along came Thor, and I started squealing in glee like a 14 year old girl at a pop concert. It seemed to be the intersection of Rake and Sake, taking the best attributes and leaving out the cruft. Right away, you can tell that Thor is definitely modelled closely off of Rake:

% thor list
thor gen:pass SIZE    # Create a password with SIZE characters, defaults to 8
thor gen:phrase SIZE  # Create a passphrase with SIZE words, defaults to 5

thor puppet:agent [HOST]   # Update the server and run puppet on a host
thor puppet:config         # Generate your ~/.autopuppet
thor puppet:deploy         # Run puppet deployment on master
thor puppet:hotrun [HOST]  # Deploy changes and run agent updates

I’ve got a couple of thor apps installed as you see, and thor maintains the concept of namespaces and tasks, delineated with colons. In place of a Rakefile you can have a Thorfile, which is an obvious facsimile of a Rakefile.

So this is all pretty cool, and it looks like Thor took the best from Rake and made it all much more scripty and whatnot. But given the title of this post, I’m not here to sing the praises of Thor.

One of the things I started doing with Thor was converting some of my hacky shell script snippets into actual ruby code. While it’s still kinda hacky, anything written in shell is tremendously hacky, so it was a step up. For simple stuff, Thor is absolutely kittens and fairies.

However, as you layer on the complexity, things get weird. I wrote a small task to kick off deployments of puppet manifests on my company’s puppet master. Initially it was hardcode city, but as I started adding various features I decided I should make it a bit more reusable. So I pulled out the configuration bits that were only applicable to me and put them in a config file, and moved handling config loading and saving to a different task. This should be easy enough, right?

Well, no. Something that Thor omitted was any sort of dependency management or resolution. Honestly, this is something that would be really nice, especially for loading configuration data or allowing the composition of several tasks in a simple manner. Well, no matter - you can invoke another task and have that handle state loading, right?

Well, no. Since the way to use Thor is to create a class and have that inherit from the Thor class, it makes sense that you could use instance variables to store state and pass information between tasks, right? Well, Thor generates a new instance for every single invocation of a task. Instance variables? Yeah, those don’t work. Since the code was still largely hacky I just switched to using class variables. So I have this nice separation of configuration loading and actual operations, and all it took was horribly abusing object oriented programming.
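To make that concrete without dragging Thor itself into it, here’s a rough sketch in plain Ruby. FakeTasks and its run method are hypothetical stand-ins that mimic the behaviour described above (a fresh object per task invocation). The hostname is made up too.

```ruby
# NOT Thor: a hypothetical mini runner that builds a new object for every task.
class FakeTasks
  @@master = nil                      # class variable, shared by every instance

  def self.run(task)
    new.public_send(task)             # fresh instance per invocation
  end

  def config
    @master  = "puppet.example.com"   # instance variable, dies with this object
    @@master = @master                # class variable, survives
    :loaded
  end

  def deploy_with_ivar
    @master                           # never set on this instance
  end

  def deploy_with_cvar
    @@master
  end
end

FakeTasks.run(:config)
p FakeTasks.run(:deploy_with_ivar)    # => nil
p FakeTasks.run(:deploy_with_cvar)    # => "puppet.example.com"
```

The class variable works, but it is global state in a trench coat, which is exactly why it feels like abuse.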

This simple case demonstrates a few flaws in Thor.

Who needs dependencies anyways?

The first issue mentioned is the lack of dependency resolution. If you’re writing code that solves a set of problems in a particular area, you’ll probably want the ability to compose them. Some sort of clean dependency management for tasks would make this composition really simple and it would be an awesome feature.

I have a specific use case in mind. When I’m working on puppet code, I develop on my laptop, push to a branch on a git repository, SSH to the puppet master to run the deployment script, and then SSH to a target box to test. I want to automate all the work from git push to puppet agent -t. So let’s say I have these tasks:

puppet:config - load configuration data
puppet:deploy - deploy the puppet code on the master
puppet:agent - run the puppet agent on a node.

It would be wonderful if it was possible to make a task with only dependencies to run the relevant tasks, but to my knowledge there’s no way of doing this. You can do this:

desc "hotrun [HOST]", "Deploy changes and run agent updates"
method_option :env, :type => :string, :aliases => "-e"
method_option :noop, :type => :boolean, :aliases => "-n"
def hotrun(*hosts)
  invoke "puppet:deploy", []
  invoke "puppet:agent"
end

But this is gross. Also, notice that strange array after ‘puppet:deploy’ - things like that make me itch. And then we go back to the fact that there’s no good way for tasks to share data, and it makes me one sad panda.

On the other hand, if Thor is going to be a scripting framework, then it doesn’t really need to have dependency resolution. But damn, that would be nice.
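For the sake of argument, this is roughly the shape of what I’m wishing for, sketched in plain Ruby. TinyTasks is a made-up class, not part of Thor or Rake, and it skips things a real implementation would need (cycle detection, arguments, options).

```ruby
# Hypothetical mini task runner with dependency resolution.
# No cycle detection: a circular dependency would recurse forever.
class TinyTasks
  attr_reader :done

  def initialize
    @tasks = {}     # name => [dependency names, block]
    @done  = []     # names already run, in order
  end

  def task(name, deps = [], &block)
    @tasks[name] = [deps, block]
  end

  def run(name)
    return if @done.include?(name)     # each task runs at most once
    deps, block = @tasks.fetch(name)
    deps.each { |dep| run(dep) }       # prerequisites first
    block.call if block
    @done << name
  end
end

t = TinyTasks.new
t.task("puppet:config") { puts "load config" }
t.task("puppet:deploy", ["puppet:config"]) { puts "deploy on master" }
t.task("puppet:agent",  ["puppet:config"]) { puts "run agent" }
t.task("puppet:hotrun", ["puppet:deploy", "puppet:agent"])   # dependencies only

t.run("puppet:hotrun")
p t.done  # => ["puppet:config", "puppet:deploy", "puppet:agent", "puppet:hotrun"]
```

Note that puppet:config only runs once even though two tasks depend on it, which is the bit that makes shared setup like config loading painless.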

Did you expect object oriented programming?

While Rake is implemented as a DSL, Thor opted to do everything as pure ruby, in a fashion. Initially, this may seem easier - just write ruby methods and call it a day. But at the end of the day, this clever approach adds immense amounts of complexity.

The implementation styles of Thor and Rake have some pretty dramatic effects on how the code works. Rake is somewhat simple - you have Tasks, Tasks have Actions, there’s a TaskManager that mixes into the Application, and that’s a pretty good chunk of it. It’s actually pretty easy to trace how things get done, because nearly everything just passes blocks around, which is easy. This also makes Rake really easy to extend.

Thor is an entirely different game. Thor functions by tapping into the class with hooks like self.inherited and self.method_added which means that the DSL is VERY tightly coupled with the underlying logic. Since it’s using this DSL that’s so tightly coupled with the implementation details, figuring out how something works is like tracing a spider web. It’s never really clear what is being evaluated in which context.
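If you haven’t seen this style before, here’s a stripped down sketch of how a desc-then-def DSL can be wired up with those two hooks. TinyCli is a hypothetical class written for this post, not Thor’s source. The real thing does far more, which is rather the point.

```ruby
# Hypothetical base class showing the inherited/method_added trick.
class TinyCli
  class << self
    attr_reader :commands

    def inherited(subclass)
      super
      subclass.instance_variable_set(:@commands, {})  # each subclass gets its own table
    end

    def desc(usage, description)
      @pending = [usage, description]   # remembered until the next `def`
    end

    def method_added(name)
      super
      return unless @pending            # methods without a desc are not commands
      @commands[name] = @pending
      @pending = nil
    end
  end
end

class Gen < TinyCli
  desc "pass SIZE", "Create a password with SIZE characters"
  def pass(size = 8); end

  def helper; end     # no desc, so it is never registered
end

p Gen.commands.keys   # => [:pass]
```

Even in this toy, the state lives in class-level instance variables that get mutated as a side effect of defining methods, so you have to keep the class body’s evaluation order in your head to follow it.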

The fact that Thor uses inheritance to work makes things a lot uglier. You can’t share code across commands because you inherit Thor, so if you change one thing in one class you might change it everywhere. And the fact that you’re using inheritance falsely indicates that you can treat the Thor command as a class, and you cannot. This falls into the same trap as Rails - using ruby as a DSL in ways that don’t make natural sense and make it hard to do real object oriented programming.

A byproduct of this is the size of the codebase for Rake and Thor. Rake weighs in at 4.2k lines (2.9k if you yank out the contrib modules), while Thor comes in at 4.9k. Rake also has dependency resolution and a fairly extensive set of helpers, while Thor has more complex argument parsing. However, a major factor in the code base size is the complexity.

Half baked helpers

Thor also goes too far to be helpful. It tries to emulate terminal-table, columnize, colorize, HighLine, Rake helpers, and other single purpose gems. There are things like ask() and say(), but they are not suitable replacements for HighLine. The table printing functionality would actually crash if you had a single column, which is a pretty massive failure. When searching for code that was like Thor, I ran across a forum post that said when trying to implement a command line interface, all existing Ruby libraries were trying to do everything, instead of doing one thing well. Thor is a major culprit because it has so many half baked implementations of other libraries.

Is anyone actually using this?

Something that absolutely floored me was how Thor does argument parsing. When Thor reads in ARGV, it looks for the first instance of a dash. Everything after that is an option, everything before is an argument. So say you did

thor puppet:agent --environment testing HOST

This will do NOTHINK. You’ll have to do

thor puppet:agent HOST --environment testing

This is absolutely horrible and stupid. When I saw this, I had to ask - is anyone actually using Thor? How could this ever happen?

It could be me doing something wrong, but this is just astounding.
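To spell out the rule as I understand it, here’s a tiny sketch of “split at the first dash”. split_at_first_dash is a made-up helper and web01 is a made-up hostname. This is my reading of the behaviour, not Thor’s actual parser.

```ruby
# Hypothetical helper: everything before the first dash-prefixed token is an
# argument, everything from that token onwards is treated as options.
def split_at_first_dash(argv)
  index = argv.index { |token| token.start_with?("-") } || argv.length
  [argv[0...index], argv[index..-1]]
end

args, opts = split_at_first_dash(%w[--environment testing web01])
p args  # => []
p opts  # => ["--environment", "testing", "web01"]

args, opts = split_at_first_dash(%w[web01 --environment testing])
p args  # => ["web01"]
p opts  # => ["--environment", "testing"]
```

In the first call the host lands on the option side of the split, so the task never sees it as an argument.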

My conundrum

With all these issues in mind, doing something more complex sounds bad. The use case I’m fighting with is how to dynamically generate Thor tasks and options. That is, given one class, generate methods/actions that have option switches for all the attr_writers or allow them to be filled out on the command line. That is,

% program subcommand --status clean

is equivalent to

% program subcommand
Status: clean

To do this, I think it’s going to require generating classes and methods on the fly. While Rake is a little hacky, you don’t have to pull out the mighty death hammer of metaprogramming to make it flexible.
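Just to pin down the behaviour I’m after, here’s a rough sketch using plain optparse from the standard library rather than Thor. Ticket, writer_names and fill_from_cli are all hypothetical names invented for this example, and it only handles string values.

```ruby
require "optparse"

# Hypothetical model class; any class with attr_writers would do.
class Ticket
  attr_accessor :status, :owner
end

# Writer names without the trailing "=", e.g. :status and :owner.
def writer_names(klass)
  klass.public_instance_methods(false).grep(/\w=\z/).map { |m| m.to_s.chomp("=").to_sym }
end

# Build one --switch per writer, then prompt for whatever was not supplied.
def fill_from_cli(klass, argv, input: $stdin, output: $stdout)
  values = {}
  parser = OptionParser.new
  writer_names(klass).each do |name|
    parser.on("--#{name} VALUE") { |v| values[name] = v }
  end
  parser.parse(argv)

  object = klass.new
  writer_names(klass).each do |name|
    unless values.key?(name)
      output.print "#{name.to_s.capitalize}: "   # e.g. "Status: "
      values[name] = input.gets.to_s.chomp
    end
    object.public_send("#{name}=", values[name])
  end
  object
end

ticket = fill_from_cli(Ticket, %w[--status clean --owner me])
p ticket.status  # => "clean"
```

Leave a switch off and it asks for that value on the terminal instead. Doing the same thing inside Thor is where the class and method generation comes in.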