Open Source: Manage Your Infrastructure

These days, when you look to automate the setup and maintenance of even the simplest infrastructure, you will typically end up choosing between two popular configuration management tools, Puppet and Chef. There are other tools, some of which have been around much longer. However, the recent pace of development on both Puppet and Chef indicates that both are growing rapidly.

This article will begin with a look at some of the benefits of using Puppet to manage your infrastructure.

Breaking Down Your Infrastructure into Manageable Components

Infrastructure can be relatively complex and only grows in complexity the more pieces you add to support your particular application(s). However, all infrastructure configurations can be broken down into components: individual pieces that are installed, configured and monitored. These components can be as simple as a user or a file, and as complex as multiple servers in a cluster supporting backend data.

Using a configuration management tool requires you to consider these individual pieces and their relationships. Relationships can be between servers or the individual components that make up a server.

For example, deploying a Ruby on Rails application typically requires a database, an application server and a web server. These can be configured on a single machine or split across several machines, one for each component. Thinking about these relationships and building infrastructure that supports them enables you to treat each component as a commodity, giving you the power to add, remove or replace each with ease and consistency.

An Overview of Puppet Resources

Puppet uses the term “resources” when describing these components. Resources in Puppet have a type, a name and attributes that define the configuration of that resource. Here is an example of a resource:

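file { "/etc/ntp.conf":
    owner  => "root",
    group  => "root",
    # Quoted so the mode is always read as an octal string
    mode   => "0644",
    # Served from the files/ directory of the "ntpd" module
    source => "puppet:///modules/ntpd/ntp.conf",
}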

Resources begin with their type; here we are using the “file” type. After the type, we enclose the resource’s name and attributes in curly brackets. Here you can see our resource’s name is “/etc/ntp.conf”, followed by a colon. The attributes are everything after the colon, up to the closing curly bracket.

This resource manages a component of our infrastructure, the file located at “/etc/ntp.conf”. We’ve established that we want this file to be owned by “root” and have permissions of “0644”. Any node that we choose to run this resource on will have this file configured with these attributes.
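
The same shape of type, name and attributes applies to other kinds of components. As a minimal sketch, a user could be described like this; the account name “deploy” and its home directory are hypothetical and not part of the original example:

```puppet
# Hypothetical account, used only to illustrate the resource shape
user { "deploy":
    ensure     => present,        # create the account if it is missing
    shell      => "/bin/bash",
    home       => "/home/deploy",
    managehome => true,           # also create the home directory
}
```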

Defining Relationships Between Resources

Where Puppet shines, relative to other tools, is that it lets you specify relationships between these resources and the modules they may be defined in. A module can generally be thought of as a configuration covering each of our three core requirements: installation, configuration and monitoring. Modules in Puppet typically break down into classes, which are singleton collections of resources.

Continuing with our example above we create an ntpd class as follows:

class ntpd {
    # Install the ntp package first
    package { "ntp":
        ensure => installed,
    }

    # Manage the config file once the package is in place
    file { "/etc/ntp.conf":
        owner   => "root",
        group   => "root",
        mode    => "0644",
        source  => "puppet:///modules/ntpd/ntp.conf",
        require => Package["ntp"],
    }

    # Virtual (@) service: declared here, applied only once realized
    @service { "ntpd":
        ensure     => running,
        enable     => true,
        hasrestart => true,
        hasstatus  => true,
        require    => [Package["ntp"], File["/etc/ntp.conf"]],
        subscribe  => File["/etc/ntp.conf"],
    }
}

Here we’ve established a relationship between installing, configuring and monitoring the ntp daemon. You can see we’ve declared three resources: a “package”, our “file” example from above and a “@service” (the “@” is special syntax for a virtual resource; you can ignore that for now).

As I described above, each resource has a type, a name and attributes. Most of these should be self-explanatory; the attributes that build the relationships, however, are the important ones.

You can see here that the file resource requires “Package["ntp"]”. That simply refers to our package resource. The same requirement is defined for the @service resource, though we add a dependency on the file resource.

The “subscribe” attribute tells our service resource to watch for changes to our file resource so that it knows when to restart. If we change the source for “/etc/ntp.conf”, our service will automatically restart on the next Puppet run, once Puppet has updated the file.
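
The same relationship can also be written from the other direction. As a sketch, the “notify” metaparameter on the file resource should behave like “subscribe” on the service. The class name ntpd_notify is hypothetical, the service is declared as a normal (non-virtual) resource to keep the example short, and the source path assumes ntp.conf ships in the files/ directory of a module named ntpd:

```puppet
class ntpd_notify {
    file { "/etc/ntp.conf":
        owner  => "root",
        group  => "root",
        mode   => "0644",
        source => "puppet:///modules/ntpd/ntp.conf",
        # Tell the service to restart whenever Puppet changes this file
        notify => Service["ntpd"],
    }

    service { "ntpd":
        ensure => running,
        enable => true,
    }
}
```

Which side carries the relationship is mostly a matter of taste; declaring it on the resource you are more likely to edit tends to keep related changes in one place.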

By defining these relationships, we let Puppet build a dependency graph in the background. This dependency graph makes possible several features that set Puppet apart from other tools.

No Operation Mode

For example, with this dependency graph we can run our setup in “dry run” mode using the “--noop” flag. This allows us to see exactly how Puppet would configure our systems without changing anything, which is valuable for any production infrastructure, even if you have “test” or “staging” machines to test on. “Noop” lets you develop new infrastructure faster by checking changes before you deploy them. This feature is often overlooked when considering the power of Puppet compared to other tools.
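
From the command line, a dry run of a standalone manifest would be “puppet apply --noop ntpd.pp”. Dry-run behavior can also be pinned to a single resource with the “noop” metaparameter. A minimal sketch, assuming you want Puppet to report drift in this one file but never change it:

```puppet
file { "/etc/ntp.conf":
    owner => "root",
    group => "root",
    mode  => "0644",
    # Report what would change, but never modify this file
    noop  => true,
}
```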

Explicit Resource Dependencies

Another benefit of the dependency graph is that in Puppet, resource dependencies are always explicit. You can move resources around freely without worrying about the order of application. In our example above, if I were to move our package resource to another module, this would have no negative consequences on the process of installing and configuring ntpd, provided that module is still included on the node. Puppet resources can subscribe to, and notify, other resources that they depend on or that might be interested in their changes. When order is important, the relationship between resources must be explicitly specified. Puppet is concerned with the state of your server, and works to bring its configuration into compliance.
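
To make that concrete, here is a sketch in which the package lives in a different class from the file that requires it. The class names ntp_package and ntp_config are hypothetical. The reference Package["ntp"] works the same way wherever the package is declared, as long as both classes end up in the node’s catalog:

```puppet
class ntp_package {
    package { "ntp":
        ensure => installed,
    }
}

class ntp_config {
    # Make sure the class holding the package is part of the catalog
    include ntp_package

    file { "/etc/ntp.conf":
        owner   => "root",
        group   => "root",
        mode    => "0644",
        # Ordering comes from this reference, not from file position
        require => Package["ntp"],
    }
}
```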

These are only a few of the benefits of Puppet’s graph-based design. Other features, such as virtual resources (which our service resource above is declared as), offer more options for dealing with related resources across similar server structures.
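
A virtual resource is only applied once something realizes it. As a sketch, assuming the ntpd class from earlier in this article, a hypothetical wrapper class named ntpd_enabled could opt in to the service like this:

```puppet
class ntpd_enabled {
    include ntpd

    # Turn the virtual @service from the ntpd class into a managed resource
    realize Service["ntpd"]
}
```

A collector such as Service <| title == "ntpd" |> should achieve the same result, and it is the more common choice when realizing many virtual resources at once.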

Getting Started with Puppet

Puppet is built with a focus on client/server configuration within an infrastructure. However, Puppet also provides an extremely easy way to get started: you can run any independent manifest with “puppet apply <manifest>”. Using our example above, if we were to place that ntpd configuration in a file called “ntpd.pp”, along with an “include ntpd” line to declare the class, we could manage that service using only “puppet apply ntpd.pp”. In my experience this is far superior to getting started with the standalone clients of other tools. Growing from this single manifest into a larger, full-blown infrastructure configuration is relatively easy.
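
As a minimal sketch, a standalone ntpd.pp could look like the following. It is trimmed to the package and a non-virtual service so that it stands on its own, and it ends by declaring the class, since a class that is only defined is never applied:

```puppet
# ntpd.pp
class ntpd {
    package { "ntp":
        ensure => installed,
    }

    service { "ntpd":
        ensure  => running,
        enable  => true,
        require => Package["ntp"],
    }
}

# Without this line, "puppet apply ntpd.pp" defines the class but applies nothing
include ntpd
```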

Here is an example directory layout taken from a NICS presentation. Please note that this is an advanced configuration and is only meant to give you an idea of what such an implementation looks like:

/etc/puppet/
    auth.conf
    autosign.conf
    fileserver.conf
    puppet.conf
    tagmail.conf
    files/
        byhost/
            host1/
            host2/
            host3/
    manifests/
        nodes.pp
        site.pp
        classes/
            class1.pp
            class2.pp
    modules/
        module1/
            manifests/
                init.pp
            files/
            templates/

Generally you’ll start by defining a number of modules and including them where necessary. Puppet refers to servers as “nodes”. We would define our nodes in the nodes.pp file above. Here is an example of a node configuration:

node "webserver" {
    include ntpd
}

We have defined a node “webserver” and included our ntpd class. Each time Puppet runs on a server with the hostname “webserver”, our ntpd class will be applied. If there are discrepancies on the current system, Puppet will restore its state to match our configuration.
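
Node definitions can also cover several hosts or provide a fallback. A sketch of what else nodes.pp might contain, assuming the ntpd class is available from a module; the hostnames db1 and db2 are hypothetical:

```puppet
# One definition shared by two hosts
node "db1", "db2" {
    include ntpd
}

# Applied to any host that matches no other node definition
node default {
    include ntpd
}
```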

Puppet, like any software, has its warts, but you can find those discussions on other sites and blogs.

You can explore the documentation to expand on what this post covered. You can also find a lot of assistance in the “#puppet” channel on the Freenode IRC network or on the Puppet mailing list.

Sunday, March 27, 2011
