Staying Sane While Writing Chef Cookbooks

I recently wrapped up a project to refactor all of my chef cookbooks to be more maintainable and coherent. This was my second chef project. The first attempt was reasonably successful: I could create new cloud servers and automatically provision them with a single command. But I didn’t have anything resembling a true development or staging environment, so any changes had to be made directly to the production servers. This obviously was not a healthy situation. This post will explain many of the mistakes I made the first time and how I corrected them.

Most of my code is available here: Chef Cookbooks. I’ll explain below how to configure these with your project-specific settings.

Cookbooks and Roles and Databags, Oh My!

Chef is known to have a steep learning curve, partly because of the bewildering range of features it offers for defining configuration settings. Although I’m sure these features are useful in more complicated organizations, they are actually detrimental to a simple project. In my first attempt, I thought I could avoid writing new cookbooks entirely by just using published cookbooks and configuring project-specific settings in roles. This was a poor idea.

You cannot avoid writing cookbooks, but it is possible to avoid most of the other advanced features. As they say in the Chef documentation: “A cookbook is the fundamental unit of configuration and policy distribution in Chef.” In particular, don’t store any settings in roles or databags unless you have a compelling reason. Instead, write “wrapper” cookbooks, following the pattern described here: How to write reusable chef cookbooks.

The main advantage of cookbooks over roles and data bags is versioning: cookbooks have version numbers, and you can pin a particular environment to a particular cookbook version. This allows you to use “my_cookbook 1.0.0” on your production nodes while testing “my_cookbook 1.0.1” on a staging node. To do so, create “environments” that list which cookbook versions the nodes in each one should use. Environment files also let you set and overwrite attributes: don’t do that.
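As a rough sketch of that pinning, here are two environment files written in Chef’s Ruby environment DSL. The environment names, descriptions, and file paths are assumptions for illustration; the cookbook name and versions are the ones from the example above.

```ruby
# environments/production.rb (hypothetical path)
# Production nodes stay on the known-good release.
name 'production'
description 'Live servers'
cookbook 'my_cookbook', '= 1.0.0'

# environments/staging.rb (hypothetical path, a separate file)
# Staging nodes pick up the newer version for testing.
name 'staging'
description 'Servers used to test changes before promotion'
cookbook 'my_cookbook', '= 1.0.1'
```

Promoting a change to production then amounts to changing one version constraint, and no attributes need to live in the environment file at all.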

Don’t use roles. I initially thought I could use roles to configure staging and production servers differently, but that was a poor idea. Unlike cookbooks, roles don’t have any kind of versioning. If you want to test your new configuration in staging before promoting to production, you’ll need to copy and paste the changes from the staging role to the production role. That’s far more error prone than increasing a cookbook version number.

Data bags are interesting because there’s an option to encrypt data bag contents. This may be useful if there are certain credentials you don’t want to share with your entire team, but still need to deploy to your servers. Note that those credentials will almost certainly need to be unencrypted on the actual servers, so you’re not hiding them from anyone who will have root access to your servers. I don’t actually use encrypted data bags; I’m just noting that they exist and might be useful in some contexts. I don’t see any reason to use unencrypted data bags.

A tale of three cookbook types

My knife.rb file is configured to look for cookbooks in three directories:

#{current_dir}/../vendor_cookbooks
#{current_dir}/../public_cookbooks
#{current_dir}/../private_cookbooks
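
A minimal sketch of the knife.rb setting behind those three paths, assuming `current_dir` is derived from the location of knife.rb itself (a common convention, not something spelled out above):

```ruby
# knife.rb (fragment)
# Assumption: current_dir is the directory that contains this knife.rb.
current_dir = File.dirname(__FILE__)

# knife searches each of these directories for cookbooks.
cookbook_path [
  "#{current_dir}/../vendor_cookbooks",
  "#{current_dir}/../public_cookbooks",
  "#{current_dir}/../private_cookbooks"
]
```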

vendor_cookbooks

These are cookbooks written by other developers which I’m using in my project. Some are official Opscode community cookbooks, others are just useful ones I’ve found on github. I was previously using Librarian-Chef to manage these, but ran into compatibility issues when installing knife using Bundler. I’m currently just maintaining the vendor_cookbooks directory using git submodules. I’ve also heard good things about Berkshelf, but haven’t used it myself on any projects.

public_cookbooks

These are the cookbooks I’ve written that handle all of the logic related to provisioning a server. They are available on github here: Chef Cookbooks. Even if you don’t expect your cookbooks to be useful to others, I still strongly recommend you publish yours in a public repository. Doing so makes one important rule exceptionally clear: no project-specific configuration or credentials belong in these cookbooks. If it isn’t reusable, stick it in a private wrapper cookbook.

private_cookbooks

Which brings us to the third and final cookbook directory. This is where you’ll add any configuration that isn’t suitable for public consumption. For example, the recipes/default.rb file in the wrapper cookbook for my main application looks like the following (I’ve obviously redacted the private settings):

node.normal['rails_app']['database'] = {
  'adapter' => 'postgresql',
  'database' => 'example_db',
  'host' => 'db01.example.com',
  'port' => '5432',
  'username' => 'example_user',
  'password' => 'example_password',
  'pool' => '50',
}

node.override['rails_app']['git_repo'] = 'git@github.com:example_org/example.git'
node.override['rails_app']['git_branch'] = 'production'

node.override['rails_app']['workers'] = {
  :default => 1,
}
node.override['rails_app']['deploy_dir'] = '/opt/example'
node.override['rails_app']['unicorn_config'] = '/etc/unicorn/example.rb'
node.override['rails_app']['user'] = 'example'
node.override['rails_app']['group'] = 'example'
node.override['rails_app']['notify_email'] = 'admin@example.com'
node.override['rails_app']['server_name'] = 'example.com'

node.override['github']['id_rsa'] = <<-EOS
-----BEGIN RSA PRIVATE KEY-----
-----END RSA PRIVATE KEY-----
EOS

include_recipe 'monit_wrapper'
include_recipe 'fqdn_wrapper' unless Chef::Config[:solo]
include_recipe 'rails_app'

tt = resources('template[/etc/nginx/nginx.conf]')
tt.source 'nginx.conf.erb'
tt.cookbook 'app_wrapper'

Notice how this cookbook contains fields like the database password and an SSH private key for deployment. You may prefer to store these in an encrypted data bag, but I’m comfortable keeping them in a plain-text cookbook and only sharing the repo with trusted team members.

Also notice how you can override templates (such as nginx.conf) that have been defined in a public cookbook.

Just use recipes and templates

Readers with a little chef experience might have noticed that I’m setting attributes in a recipe file and completely ignoring the concept of an attribute file. There’s an undocumented gotcha with attribute names that doesn’t work well with our wrapper convention: attributes set in the attribute files of a cookbook named “some_cookbook” must start with the string “some_cookbook”. You cannot set an attribute like default["some_cookbook"]["my_setting"] in an attribute file in a cookbook named “some_cookbook_wrapper”. That’s subtle and confusing, which is reason enough in my opinion to just avoid using attribute files altogether.

Similarly, I haven’t used any cookbook features such as resources, definitions, or libraries. I put all of my logic in a recipe file (specifically, the “default.rb” recipe) and add templates as necessary. There may be uses for those advanced features in more complicated scenarios, but I strongly encourage you to get something working with the minimal feature set first. This is friendlier to Chef newbies, who won’t have to learn the whole range of options just to understand your code.

Defining roles

Although I don’t use roles, I obviously still need a way to define which cookbooks apply to an application server as opposed to a database server. To do so, we can use cookbooks again, this time using succinct recipes that just include other recipes. For example, the recipe for my production app server contains just three lines:

include_recipe 'base_wrapper'
include_recipe 'redis::server_package'
include_recipe 'rails_app_wrapper'

All of the real configuration happens in the wrapper cookbooks. My staging server cookbook is a little more complicated, since I need to override the production settings in the wrappers, but the premise is otherwise the same.

Stick a spork in it

Knife spork is an exceptionally useful knife plugin. I’d actually consider it essential to the process. The project page describes it as a tool “which helps multiple developers work on the same Chef Server and repository without treading on each other’s toes.” But even working alone, I find it reassuring to know that I’m protected from accidentally pushing out a series of half-baked changes to every production server.

I use three knife spork commands repeatedly while testing in a staging environment.

knife spork bump $cookbook_name
knife spork upload $cookbook_name
knife spork promote $cookbook_name --remote

The “bump” command increments the cookbook version number. The “upload” command not only uploads the cookbook to the chef server, but also locks it. If you try to upload a change without first running “bump”, it will report an error.
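
To make the version arithmetic concrete, here is a small plain-Ruby sketch of what a bump does to a version string. It assumes the patch level is the part being incremented, and `bump_patch` is a hypothetical helper written for this post, not knife-spork’s own code.

```ruby
# Illustration only: a hypothetical helper, not part of knife-spork.
# Takes a "major.minor.patch" string and returns it with the patch
# level incremented by one.
def bump_patch(version)
  major, minor, patch = version.split('.').map { |part| Integer(part, 10) }
  [major, minor, patch + 1].join('.')
end

puts bump_patch('1.0.0') # prints 1.0.1
```

Because every upload has to carry a new version number, a version that an environment is already pinned to never changes underneath it.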

In my spork-config.yml file, I have the “default_environments” configured so that knife spork promote will only update the staging environment. I manually promote to production once everything is thoroughly tested.

Test all the things

Following these patterns makes it straightforward to test extensively. I start by creating a local VM using Vagrant. Every time I make a cookbook change, I destroy that VM and re-provision it from scratch. This ensures there aren’t any artifacts left over from previous chef runs. Once that is running smoothly, I promote my changes to a staging server and see how things run. After a few incident-free days, I’ll promote the production cookbook version numbers.
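
For the local VM step, a Vagrantfile along these lines is one way to wire the three cookbook directories into a chef-solo run. Treat it as a sketch under assumptions: the box name and the `app_server` recipe name are placeholders, and the paths assume the Vagrantfile sits next to the cookbook directories.

```ruby
# Vagrantfile (sketch)
Vagrant.configure('2') do |config|
  # Assumption: any base box you already use will do here.
  config.vm.box = 'ubuntu/trusty64'

  config.vm.provision 'chef_solo' do |chef|
    # Same three directories that knife.rb points at,
    # relative to this Vagrantfile.
    chef.cookbooks_path = [
      'vendor_cookbooks',
      'public_cookbooks',
      'private_cookbooks'
    ]

    # Hypothetical name for the cookbook that stands in for a role.
    chef.add_recipe 'app_server'
  end
end
```

Running `vagrant destroy -f` followed by `vagrant up` then gives the clean, from-scratch provisioning run described above.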
