Creating Machines with Chef
28 Jul 2011
After upgrading Chef to use environments, I needed to update my custom AMIs.
These custom AMIs are based on Ubuntu's Amazon EC2 Published AMIs, have some extra software pre-installed, and have a custom chef client.rb which gets its configuration (chef server info and client roles) from EC2 userdata to bootstrap itself. I then use scripts to instantiate machines from those AMIs and pass them the appropriate userdata.
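A minimal sketch of the userdata hand-off, assuming a plain key=value format. The format, the key names (chef_server, chef_environment, chef_role) and the helper names are hypothetical and only illustrate the idea; the metadata URL and the ec2-run-instances -f option are the standard EC2 ones.

```shell
# Sketch only: the key=value userdata format and key names are assumptions.

# Launch side: print a userdata payload for a server URL, environment and role.
build_userdata() {
  printf 'chef_server=%s\nchef_environment=%s\nchef_role=%s\n' "$1" "$2" "$3"
}

# Instance side: read key=value lines on stdin, print the value for key $1.
userdata_value() {
  sed -n "s/^$1=//p"
}

# Typical use (placeholders throughout):
#   build_userdata https://chef.example.com production web > userdata.txt
#   ec2-run-instances ami-00000000 -k my-keypair -t m1.small -f userdata.txt
# and on the instance:
#   curl -s http://169.254.169.254/latest/user-data | userdata_value chef_role
```

A custom client.rb would do the equivalent lookup in Ruby; the shell version just shows what data travels through userdata.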
This has been working great, but is not "the chef way" -- the recommendation is to use knife ec2 server create, which creates a machine and then ssh'es in to bootstrap it. In Chef 0.9 I ran into various routing and ssh timing bugs that made this approach too unreliable, but in 0.10 those appear to have been resolved.
The main advantage of this approach is that you don't need to make special AMIs; you just use the latest official Ubuntu ones, in any region/arch/store.
The disadvantage is that you then have to wait for chef-client to install all the software, which takes a long time for Java and RVM/Ruby.
So the challenge is to:
- make sure that the "knife ec2 server create" method produces functional machines from stock AMIs for all my roles
- use custom AMIs to preload software, and use them from my existing scripts (which use ec2-run-instances) for selected roles
I also wanted to take the opportunity to upgrade OS and sanitise my Ruby install.
For the OS I wanted to switch from Ubuntu 10.04 Lucid Lynx to 11.04 Natty Narwhal.
Chef 0.10.2 only includes Ubuntu templates for 10.04 (ubuntu10.04-apt.erb and ubuntu10.04-gems.erb). These can be adapted for 11.04 by changing "lucid" to "natty" (or by pulling the release name out of lsb_release), but you then end up with Chef 0.9, so you also want to append "-0.10" to the release name.
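As a small sketch of the release-name idea, the apt source line can be derived rather than hard-coded. The function name is hypothetical; the repository URL and the "-0.10" suffix follow the Opscode apt repository layout for Chef 0.10.

```shell
# Print the apt source line for the Opscode repository.
# With no argument, the codename is taken from lsb_release -cs.
opscode_apt_line() {
  codename="${1:-$(lsb_release -cs)}"
  printf 'deb http://apt.opscode.com/ %s-0.10 main\n' "$codename"
}

# Typical use inside a bootstrap script:
#   opscode_apt_line > /etc/apt/sources.list.d/opscode.list
```

Deriving the codename this way means the same template line should work on lucid and natty without edits.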
Here I ran into an interesting issue: the apt template does an apt-get install -y chef, then writes settings to the Chef client configuration, and then runs chef-client for the initial bootstrap. The problem is that the install also starts the /etc/init.d/chef-client service, so a chef-client run executes before the configuration changes are made and before the bootstrap run. In my template modifications I set the node_name; as a result, the first chef-client run registered the client with the default name (the host name), the subsequent invocation failed, and I ended up with nodes in the wrong environment. I think there is actually a generic template bug here.
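One generic way to avoid this kind of race on Debian and Ubuntu is to forbid service starts while the package installs. The sketch below is offered under assumptions rather than as the fix used in the final template: it assumes the chef package starts its service through invoke-rc.d, which consults /usr/sbin/policy-rc.d and treats exit status 101 as "action forbidden". The function names and the optional root prefix argument are hypothetical.

```shell
# Sketch only: assumes the package's maintainer scripts use invoke-rc.d.
# The optional argument is a root prefix, so this can be tried in a scratch dir.

# Install a policy-rc.d that forbids all service starts (exit status 101).
block_service_starts() {
  root="${1:-}"
  mkdir -p "$root/usr/sbin"
  printf '#!/bin/sh\nexit 101\n' > "$root/usr/sbin/policy-rc.d"
  chmod 755 "$root/usr/sbin/policy-rc.d"
}

# Remove the policy file again so services start normally.
allow_service_starts() {
  root="${1:-}"
  rm -f "$root/usr/sbin/policy-rc.d"
}

# Intended order in a bootstrap script (run as root, no prefix):
#   block_service_starts
#   apt-get install -y chef
#   ... write the client configuration ...
#   allow_service_starts
#   chef-client    # first run now sees the final configuration
```

The point of the ordering is that the first chef-client run, and therefore the client registration, happens only after node_name and the other settings are in place.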
We're using Ruby and RVM for applications on some machine roles, and I've run into various situations where there was confusion between the system ruby, apt, RVM in /usr/local, RVM in user home directories, various gemsets, chef-client, and our applications. To reduce that confusion I wanted to try the apt install of Chef rather than the default gem install, and limit RVM to a per-user install.
[Update: there are some unique issues, such as knife not finding plugins (CHEF-2483)]
The Knife Template
Pulling it all together, I ended up with a knife template, ubuntu11.04-apt.erb, which you can pass to knife ec2 server create.
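A hypothetical invocation might look like the sketch below. The AMI ID, key pair, flavor, role, environment and template path are all placeholders, and the helper function is only there to print the command for review. The options shown (-I, -f, -S, -x, -r, --template-file) are knife ec2 server create options and -E is knife's environment option; whether -E reaches the bootstrap is an assumption that depends on the installed knife-ec2 plugin version.

```shell
# Sketch only: every value below is a placeholder.
# Prints the knife command rather than running it.
knife_create_command() {
  ami="$1"; role="$2"; environment="$3"
  printf 'knife ec2 server create -I %s -f m1.small -S my-keypair -x ubuntu' "$ami"
  printf ' -r "role[%s]" -E %s' "$role" "$environment"
  printf ' --template-file ~/.chef/bootstrap/ubuntu11.04-apt.erb\n'
}

# Typical use:
#   knife_create_command ami-00000000 web production
```

Printing first makes it easy to check the run list and environment before a machine is actually created.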
This works well for bringing up a generic instance with a given role from the command line, after which Chef kicks in and configures the machine.
To create an AMI there are two approaches: snapshot a running instance, or build an AMI using loopback mounts and chroot. The former is somewhat easier, the latter is more secure and precise, and is recommended for public AMIs. For a discussion, see Eric Hammond's posts on Creating Public AMIs Securely for EC2 and Building EBS Boot AMIs Using Canonical's Downloadable EC2 Images.
For my private AMI I decided to use the simpler snapshot approach, at least initially while developing the install sequence, and I've split it into separate scripts for easier testing. See my GitHub create-ami repo.
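For an EBS-backed instance, the snapshot step itself can be a single EC2 API tools call. The sketch below assumes an EBS-backed instance and is not taken from the create-ami scripts; the helper name, instance ID and image name are placeholders.

```shell
# Sketch only: assumes an EBS-backed instance and the EC2 API tools on PATH.
# Prints the ec2-create-image command (-n image name, -d description).
create_image_command() {
  instance_id="$1"; name="$2"
  printf 'ec2-create-image %s -n %s -d "Built from %s"\n' "$instance_id" "$name" "$instance_id"
}

# Typical use:
#   create_image_command i-00000000 natty-chef-base
```

Instance-store AMIs need the bundle, upload and register steps instead, which this sketch does not cover.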