Convincing Vim to accept ? and ! as part of a keyword

Last night I learned how to convince vim that ! and ? are part of a keyword. This is awesome, because both of those characters are valid in Ruby method names. In particular, it now means that when I hit ctrl-] with my cursor in a method name containing a !, it will take me to the correct method definition. Let’s take an example, with a couple of methods on a User model:

class User
  def invite(message)
    @invited = true
    # ...
  end

  def invite!(message)
    # ...
  end
end

and elsewhere, I have some code in a controller:

class UsersController < ApplicationController
  def invite
    @user.invite!(params[:message])
  end
end

With my cursor at the underscore on @user.in_vite!, I can hit ctrl-] and it will take me to the method definition of the keyword under the cursor. Prior to making last night’s discovery, vim would have determined that the keyword was invite, and it would have searched the ctags file for a match. Invariably, it would have decided that the closest match was the one in the same file, so it would take me to the method definition for the invite controller action: just where I already was.

The first thing I wanted to double check was to see that ctags was correctly storing the method names. In vim, you can jump directly to a tag definition with the :ta[g] command. This will accept a string or a regular expression of the tag we want to jump to. I tried jumping with :tag invite! and, sure enough, it takes me to the right definition, so the tags are being generated with the right method names, and vim is happy enough to accept them as identifiers. That’s not the problem then.

So, it would appear that the problem is with how vim determines the keyword we’re searching for based on the current cursor position. In other words, how does vim take the line @user.in_vite!(params[:message]) (with the cursor at the _) and determine that the keyword we’re looking to find the definition for is invite? And how do we change it so that it picks up the correct keyword, invite! instead?

At this point, I figured that what constitutes a keyword is probably language-specific, so I went down a rabbit hole of digging through the Ruby syntax file, seeing how it parsed and syntax highlighted various elements of a Ruby source file. Wow, that’s complex! However, it already correctly understands that ? and ! are valid characters in a method name, so that led me to believe vim wasn’t using syntax information to figure out what constitutes a keyword.

Eventually, I resorted to digging through the help files for the word ‘keyword’ and came across the option iskeyword. It transpires that this contains a list of the ASCII characters that are considered to be part of a keyword. The documentation mentions that keywords are used in many commands, including ctrl-]. It also hints that the option can be changed per file type (e.g. help files consider all non-blanks except for *, " & | to be part of a keyword). The default list of characters is @,48-57,_,192-255 which is:

  • @ is all the characters where isalpha() returns true, so it’s a-z, A-Z and various accented characters.
  • 48-57 are the ASCII digits 0-9.
  • _ is the literal underscore character.
  • 192-255 are the extended ASCII codes (accents, ASCII art symbols, that kind of thing).

So we can see that ! and ? are excluded from the list. I added them in to my current vim session with:

:set iskeyword=@,!,?,48-57,_,192-255

Then I tested it out by placing my cursor over @user.invite! and, sure enough, ctrl-] took me to the method definition! I did a little Snoopy dance at this point, but it was OK, because there was nobody else in the house. 😉

It occurs to me that the definition of a keyword is language-specific, so in order to make the configuration permanent, I’ve added the following to my ~/.vimrc:

autocmd FileType ruby set iskeyword=@,!,?,48-57,_,192-255
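An equivalent form uses += to append to the existing value instead of restating the whole default list, and setlocal to keep the change scoped to the matching buffer rather than leaking into every file loaded afterwards:

```vim
" Append ! and ? to the keyword characters, for Ruby buffers only
autocmd FileType ruby setlocal iskeyword+=!,?
```

The += spelling also has the advantage of surviving any future change to vim’s default iskeyword value.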

Restart vim to double check and yes, it’s now considering invite! and password_required? to be keywords. Winning!

Now that I’ve discovered the trick to tell vim what constitutes a keyword, I’m wondering a couple of things. Firstly, are there any other characters that should be part of a keyword in Ruby that aren’t included in the new list? And secondly, what other languages could do with their iskeyword being set to something different?

There and back again: A packet’s tale

I haven’t done any interviewing for a while[1], but I went through a period of growth in one of the companies I worked for where we were feverishly expanding the development team, so we had to be a little more systematic in our approach to interviewing. Instead of just having an open conversation with candidates to see where it led (which is what I’d previously done in such situations), I wound up preparing a ‘standard’ set of questions. It took a few goes, but eventually I settled on a favourite question for the technical portion of the interview:

When I pull up my favourite Internet browser, type “” into the address bar, and press return, what happens?

I reckon it’s a doozy of a technical question. There’s so much breadth and depth in that answer. We could talk for hours on how the browser decides whether you’ve entered something which can be turned into a valid URL, or whether it’s intended to be a search term. From there, we can look at URL construction, then deconstruction, to figure out exactly what resource we’re looking for. Then it’s on to name resolution, to figure out who we should be talking to.

And then it gets really interesting. We start an HTTP conversation, which is encapsulated in a TCP session which is, in turn, encapsulated in a sequence of IP packets, which are, in turn, encapsulated in packets at the data link layer (some mixture of Ethernet and/or wireless protocols), which — finally! — causes some bits to fly through the air, travel along a copper wire, or become flashes of light through fibre optic.
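That nesting can be sketched as simple string concatenation. This is a toy model in Ruby — the header labels are made up, and real packet construction involves binary fields and checksums — but it shows the layering:

```ruby
# Toy model of protocol encapsulation: each layer wraps the data from
# the layer above with its own header (and, for Ethernet, a trailer).
http_request = "GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"

tcp_segment    = "[TCP header]"      + http_request
ip_packet      = "[IP header]"       + tcp_segment
ethernet_frame = "[Ethernet header]" + ip_packet + "[Ethernet trailer]"

# The receiving end unwraps the layers in reverse order, eventually
# handing the original HTTP request to the web server.
```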

Now our request emerges — fully formed again, if it survived the journey — at the data centre. The HTTP request is serviced (on which entire bookshelves have been written) and the response follows the same perilous journey back to the browser.

The story isn’t over, though. Once the browser has received the response, it still has to interpret it, create an internal representation of the document, apply visual style, and render it in a window. And then there’s the client side JavaScript code to be executed.

It’s an exciting story: one of daring journeys, lost packets, unanswerable questions, and the teamwork of many disjoint routers, distributed across the Internet.


There are so many different areas involved that it’s impossible to answer the question fully in an interview situation. But I hope an interviewee (for a technical role) would have a rough overview of the process involved, be able to make intelligent inferences as to how some bits work, and have depth of knowledge in some area(s). When you’re interviewing, as Valve’s Employee handbook (PDF) succinctly put it, you’re looking for a ‘T’ shape:

people who are both generalists (highly skilled at a broad set of valuable things — the top of the T) and also experts (among the best in their field within a narrow discipline — the vertical leg of the T).

From their answer to the question, I’ll get an inkling about their generalist skills. As well as a general overview of the topics, I’ll get some insight into their ability to structure an answer, communication skills, and problem solving. And I get an idea of where their strengths lie:

  • If the answer is largely around the browser itself, how it interprets HTML and CSS to render the page, and how it interacts with JavaScript code, I’ll have a hint that the candidate is strong on front end development.

  • If they lean more towards the server side, talking in depth about how an HTTP request is serviced, then chances are I’m talking to a seasoned backend developer.

  • And if we get into the nitty gritty of IP protocols, packets and routing, I’ve found myself an Ops engineer.

This interview question has been sitting in my repertoire for the past five years now. More recently, I’ve been wondering: wouldn’t it be awesome if more of us[2] had a deeper knowledge of the full stack? While it’s not necessary to know the wire layout of an ARP[3] packet, for example, it’s occasionally useful to know what ARP is when you’re trying to figure out why two computers won’t talk to each other.

A Sneak Peek at The Internet

The classic text for understanding the network layer is TCP/IP Illustrated Volume 1: The Protocols, by W. Richard Stevens. It has been revised by Kevin Fall, with a second edition in 2011, and it definitely covers all the material, in meticulous depth. But it’s not an easy read with all that detail and, to many, is not an approachable book.

I think there’s an untapped market here. As it happens, I’m in the market for some new goals in life, so here’s my plan. For the next while, I’m going to focus on writing articles that explain various areas of the networking stack in some interesting level of detail. I’ll be covering topics around:

  • the high level protocols themselves (HTTP, SMTP, IMAP, SSH, etc);
  • securing communication at the transport layer (SSL/TLS);

  • name resolution at various layers (DNS, ARP, etc);

  • the predominant transport layers (TCP & UDP), plus a look into relatively new contenders like SCTP & QUIC;

  • the underlying Internet Protocol, IPv4 & IPv6, which will encompass topics like routing, tunneling, firewalls & configuration;

  • the physical network layers, like Ethernet, Wifi & cellular;

  • the standardisation process at the IETF; and

  • some of the tools that we can use to explore all these concepts.

I might even dive a little into the implementation of a network protocol. For some reason — and it couldn’t possibly have been because I was drunk at the time! — the note, “Implement a TCP/IP Stack in Go” appeared one day on my Someday/Maybe todo list. Perhaps creating a complete protocol stack is a bit much, but taking a look into the Linux kernel to see how things are implemented could be insightful.

I’d like to experiment a little with screencasting, too. I’m aiming to put together a series of short talks on various aspects of this overarching theme: lightning talks, plus longer talks submitted to every conference I see issuing a Call For Papers! For example, I’ve got some notes together for a lightning talk on DNS at the Bath Ruby Conference next month. I’d like to turn these talks into short screencasts, too, if you can put up with my accent. 😉

But the bigger goal is to write a book. I feel that this is an important topic, and it’s one that every developer could use some greater insight into. All of us use the Internet, and most of us develop applications which interact over the Internet. Wouldn’t it be awesome if we all had the confidence to understand how the network worked, instead of treating it like a (reliable) bit of cable directly connecting each endpoint?

The book is definitely in the very early stages. I’ve put together a rough outline, and I’ve written a few thousand words that will fit in … somewhere (I seem to have an unhealthy obsession with DNS). I’m expecting it to be a long project! My current working title is, “A Sneak Peek at The Internet”, though I was inspired by a new title while writing this morning: “There and back again: A Packet’s Tale.”

If you’re interested in keeping up to date with progress on my goal, I’ve put together a sign-up form (with LaunchRock which was delightfully easy to use!). You can sign up here: A Sneak Peek at The Internet.

  1. The mixed blessing of doing contract work instead of being a full time employee for the past few years. 
  2. Me included. I wasted my formative years printing out RFCs on a Star LC-10 dot matrix printer, then reading them — for fun! But I haven’t kept up with developments in the past decade, so I’m on a learning journey. And that’s kinda the point: that’s what I’m getting out of this deal. 
  3. Address Resolution Protocol. This is the mechanism used by Ethernet and Wifi networks to figure out which hardware address an IP address belongs to. If a host wants to talk to an IP address on the local network, it broadcasts a “who has this IP address?” query to everyone on that network. Hopefully, the owner of that IP address will answer, “I do, and my hardware address is 00:11:22:33:44:55.” 

Keybase: A Web 2.0 of Trust

Ever since Tim Bray introduced me to Keybase, I’ve been waiting patiently to get in on the beta. (OK, not so patiently.) At last, last week, my invitation came through, and I’ve been messing around with it since.

So, what’s it all about? Secure communication with people you trust. The secure communication part isn’t anything new. It’s using public key cryptography — specifically that provided by GnuPG — to:

  • encrypt a message you send to a third party, so that only they can decode (read) it; and

  • sign a message you send to a third party, so that they can verify it was really you who sent the message, and that the message they’re reading hasn’t been tampered with since it was signed.

This technology has been around for a good couple of decades now, and has become easier to acquire since the US relaxed their attitude to cryptography being a ‘munition’ and, therefore, subject to strict export regulations. Public key cryptography is great. The implementations of it are fast enough for general use, and it’s generally understood to be secure ‘enough’.

There are two components to public key cryptography:

  • A public key, which you can share with the world. Anyone can know your public key. In fact, in order to verify the signature on something you’ve sent, or in order to send you something which is encrypted, the other person needs your public key.

  • Your private key, which you must keep to yourself. This is a sequence of characters which uniquely identifies your authority to sign messages, and your ability to decode encrypted messages sent to you. Most software allows you to attach a password to the private key, giving you two factor authentication (something you have — the private key file — along with something you know — the password).
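The two roles can be demonstrated in a few lines of Ruby using the OpenSSL standard library. This is a sketch of the underlying concept using raw RSA keys, not of how GnuPG itself works:

```ruby
require 'openssl'

# Generate a key pair. The private half stays with us; the public
# half can be shared with the world.
private_key = OpenSSL::PKey::RSA.new(2048)
public_key  = private_key.public_key

message = "Meet me at the Holyrood Tavern at 8"

# Signing uses the private key; anyone holding the public key can
# verify the signature.
signature = private_key.sign(OpenSSL::Digest.new('SHA256'), message)
puts public_key.verify(OpenSSL::Digest.new('SHA256'), signature, message)
# => true

# Tampering with the message invalidates the signature.
puts public_key.verify(OpenSSL::Digest.new('SHA256'), signature, message + "!")
# => false
```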

I’m going to assume you know how to keep stuff private: you have a personal laptop, FileVault protecting the contents of the hard disk, encrypted backups, a password on your private key, and you don’t leave your laptop lying unlocked in public places. That’s cool. You’ve got a safe place to store that private key.

Sharing Public Keys

But it leaves one key problem: how do you share your public key? And how do I know that this public key, here, is really your key, not just somebody pretending to be you? After all, the security of the whole system relies on me having a public key that I trust really belongs to you.

Sharing public keys themselves is easy enough. There are many public key servers which you can publish your key to, most of which (eventually) synchronise with each other. From there, it’s just a case of asking your encryption software to download that key from the servers. In these examples, I’ll demonstrate with the GnuPG command line client, but the same principle applies to most PGP implementations, including more friendly GUI tools. I can tell you that my key id (a short, unique, identifier for every public key) is 002DC29B. You can now retrieve that key from the public key servers:

$ gpg --recv-keys 002DC29B
gpg: requesting key 002DC29B from hkps server
gpg: key 002DC29B: "Graeme Mathieson <>" not changed
gpg: Total number processed: 1
gpg:              unchanged: 1

But how do you know that I’m really who I say I am? This whole article could be an elaborate hoax from hackers in Estonia, who have commandeered my blog and published this article, just to fool you into believing that this is really my public key. If you’re sending sensitive data, encrypted using this public key, and it’s not really me who owns the corresponding private key, the whole effort has been in vain.

Verifying Fingerprints

The traditional way to exchange public keys — and therefore allow trustworthy, secure, communication over the Internet — is to meet in person. We’d meet up, prove who we each were to the other’s satisfaction (often with some Government-issued photographic ID), and exchange bits of paper which had, written or printed, a PGP fingerprint. This is a shorter, easier to verify (and type!) representation of your PGP public key. For example, my public key’s fingerprint is:

CF61 9DD5 6116 D3CD 4380 C1AE 8F7E 58DD 002D C29B

This hexadecimal string is enough to uniquely identify my public key, and to verify that the key has not been tampered with. (Notice that the last two blocks of the string correspond to the key id.) When you get back to your computer, you can download my key from the public key server, as above, and show its fingerprint:
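That correspondence is mechanical: for a modern (v4) PGP key, the short key id is simply the last 8 hex digits (the low 32 bits) of the fingerprint, which we can check for ourselves:

```ruby
fingerprint = "CF61 9DD5 6116 D3CD 4380 C1AE 8F7E 58DD 002D C29B"

# Strip the spaces and take the last eight hex digits.
key_id = fingerprint.delete(" ")[-8..-1]
# => "002DC29B"
```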

$ gpg --fingerprint 002DC29B
pub   4096R/002DC29B 2012-07-20
      Key fingerprint = CF61 9DD5 6116 D3CD 4380  C1AE 8F7E 58DD 002D C29B
uid       [ultimate] Graeme Mathieson <>
uid       [ultimate] [jpeg image of size 9198]
uid       [ultimate] Graeme Mathieson <>
uid       [ultimate] <>
sub   4096R/4BDD1F4C 2012-07-20

If the fingerprint I handed you matches the fingerprint displayed, you can be confident that the public key you’ve downloaded from the key servers really is my key. If I’m confident that a particular key, and the identities on the key (names, email addresses, sometimes a photograph), are correct, then I can assert my confidence by “signing” the key, and publishing that signature back to the public key servers.

Verifying Identities

Once I’ve verified that I have the right public key by checking the fingerprints, the next step is to verify that the identities on the key are accurate. In the case of a photograph, this is easy enough to do, just by viewing the image on the key. Does it match the person I just met? In the case of email addresses, the easiest way to check is to send an email to each of the addresses, encrypted using the public key we’re verifying, including some particular phrase. If the owner of the key responds with knowledge of the phrase, then we have successfully confirmed they have control over both the key and the email address.

Once we’ve verified the authenticity of the key itself, and of the identities on it, all that remains is to sign the key. This indicates to our own encryption software that we’ve successfully been through this process and don’t need to do so again. It also publishes to the world the assertion that I have verified your identity.

Web of Trust

That’s all well and good if we can meet up in person to exchange public key information in the first place. What if we can’t? Is it still possible to establish each other’s public keys, and have a secure conversation? That’s where the web of trust comes in. In short, if I can’t meet directly with you to exchange keys, but we’ve each met with Bob, and exchanged keys with him, and I trust Bob’s protocol for signing keys, then I can be confident that the key Bob asserts is yours, really is yours. This is where publishing the signatures back to the key server comes in useful. If I can see Bob has signed your key, and I trust Bob’s verification process, then I can be confident that you really are who you say you are.

The web can extend further than just one hop, so long as we trust all the intermediate nodes in the graph to follow a rigorous verification process. This, combined with keysigning parties (where a large group of people, often at user group meetings or conferences, get together to all exchange keys) means a large network of people with whom we can securely communicate.
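One way to picture this is as a graph in which each signature is an edge; deciding whether you can trust someone’s key amounts to finding a path of signatures to them. A toy sketch in Ruby (hypothetical names, and ignoring trust levels entirely):

```ruby
# Toy web of trust: "a" => ["b"] means a has signed b's key.
SIGNATURES = {
  "me"    => ["bob"],
  "bob"   => ["carol", "dave"],
  "carol" => ["eve"]
}.freeze

# Breadth-first search for a chain of signatures from one person to
# another. Returns the path of intermediaries, or nil if none exists.
def trust_path(from, to)
  queue = [[from]]
  until queue.empty?
    path = queue.shift
    return path if path.last == to
    SIGNATURES.fetch(path.last, []).each do |signee|
      queue << path + [signee] unless path.include?(signee)
    end
  end
  nil
end

puts trust_path("me", "eve").inspect
# => ["me", "bob", "carol", "eve"]
```

In real PGP, each hop would also need an assigned trust level before the path counted for anything; this sketch only answers “is there a chain of signatures at all?”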

All these bits and pieces have been around for a long time. They’re well-established protocols, they’ve been battle tested, and I have confidence that they’re secure ‘enough’ for me (not only for secure communication, but for software development[1] and the distribution of software packages[2]).

The trouble is that, despite all these systems having been around for a while, they’ve never really reached mainstream acceptance. I suspect, if you’ve read this far, you can see why: it’s complicated. It relies on a number of “human” protocols (as opposed to RFC-defined protocols that can be implemented entirely by software), and careful verification. You kinda have to trust individuals to follow the protocols correctly, and completely, in order to trust what they assert.

Levels of Trust

PGP does allow you to assign a ‘level’ of trust to each individual in your web of trust — that’s where the output above says [ultimate] since I trust myself completely! — so you can reflect the likelihood of an overall path between two people being trustworthy based on the trustworthiness of the intermediate people. But it’s still complex and, well, you’d have to really want secure communication in order to go to all this trouble, and it’s not like most of us have anything to hide, right? (As I write this, I’m feeling slightly unnerved, because I’m sitting in a Costa Coffee in Manchester Airport, where two police officers are sitting across the room — on their break, enjoying espressos — with their Heckler & Koch MP5s dangling at their sides. How sure am I that they know I have nothing to hide?)


The overall message, though, is that it’s about assertions. And it’s about following the paths of those assertions so that you can make a single main inference: the person I want to communicate with is the owner of this public key.

An Example

Let’s follow through the protocol with a concrete example. I would like to send a secure communication to my friend, Mark Brown. We both use GnuPG to communicate securely, and both already have established key pairs. However, I haven’t verified Mark’s key id yet, so I cannot say for sure that the key published on the public key servers really is the one he has control of.

Meeting for a Pint

The first step is to meet with Mark. As is tradition with such things, we’ll retire to the Holyrood Tavern for a pint or two, and to exchange key fingerprints. We’ve each brought our government-issued photographic ID (a driving license in my case, and a passport in Mark’s case). We’ve established that the photograph on the ID is a good likeness, verified the name matches who we expect, and had a laugh because, despite knowing each other for 20 years, we didn’t know each other’s middle names! So now we’ve asserted that the person who is handing us the key fingerprint really is the person we’re trying to communicate with. Mark tells me his fingerprint is:

3F25 68AA C269 98F9 E813 A1C5 C3F4 36CA 30F5 D8EB

Verifying the Fingerprint

The next step, back at home, on a (relatively) trusted Internet connection, is to verify that the copy of the public key we have is the same as the one Mark claims to have. Let’s grab the key from the public key servers:

$ gpg --recv-keys 30F5D8EB
gpg: requesting key 30F5D8EB from hkps server
gpg: key 30F5D8EB: "Mark Brown <>" not changed
gpg: Total number processed: 1
gpg:              unchanged: 1

Then display the fingerprint:

$ gpg --fingerprint 30F5D8EB
pub   4096R/30F5D8EB 2011-10-21
      Key fingerprint = 3F25 68AA C269 98F9 E813  A1C5 C3F4 36CA 30F5 D8EB
uid       [ unknown] Mark Brown <>
uid       [ unknown] Mark Brown <>
uid       [ unknown] Mark Brown <>
[ ... ]

It matches! Now we’ve successfully asserted that the public key we’ve got a copy of matches the fingerprint that Mark gave me. From this, we can confidently draw the inference that the public key I have is the one that belongs to Mark.

Verifying Identities

There’s one further stage, and that’s to assert Mark really has control over all these email addresses (6 of them — I elided some for brevity!). Let’s send him an encrypted message to each of these email addresses. For example:

$ gpg --encrypt --armor -r
[ ... ]
If you really own this key, reply with the password, "bob".

which spits out an encrypted message we can copy and paste into an email to Mark. Strictly speaking, you should send a separate email, with a different password, to each email address so you can indeed verify that Mark owns all of them.

In the interests of brevity/clarity above, I elided some of the output from GPG, but it’s actually quite relevant. When you attempt to encrypt a message to a recipient who is not yet in your web of trust, GPG will warn you, with:

gpg: 4F7C301E: There is no assurance this key belongs to the named user

pub  2048R/4F7C301E 2014-08-31 Mark Brown <>
 Primary key fingerprint: 3F25 68AA C269 98F9 E813  A1C5 C3F4 36CA 30F5 D8EB
      Subkey fingerprint: C119 C009 1F4F B17F DFE8  2A57 8824 E044 4F7C 301E

It is NOT certain that the key belongs to the person named
in the user ID.  If you *really* know what you are doing,
you may answer the next question with yes.

Use this key anyway? (y/N) y

That’s what we’re trying to achieve here, so this is the one and only situation in which it’s OK to ignore this warning!

So, we’ve sent Mark an email, and he’s responded with the super secret password to confirm that the email account really belongs to him. We’ve now asserted that the person who controls the private key associated with this public key also has control over the email address. By inference, we’re pretty confident that Mark Brown, the person we want to communicate securely with, really is the owner of this key, and this email address.

Signing the Key

Now that we’re confident with this chain of assertions, it’s time to commit to it. Let’s sign the key:

$ gpg --sign-key 30F5D8EB
[ ... ]
Really sign all user IDs? (y/N)

GPG will list all the identities on the key, and ask if you really want to sign them all. If Mark has only acknowledged ownership of some of the email addresses, then we’d only want to sign some of them but, in this case, he’s replied to all of them, so we’re happy to sign them all. We can then publish our assertion, so it becomes part of the public web of trust, with:

$ gpg --send-keys 30F5D8EB

Job done. Now we can confidently communicate securely with Mark. Not only that, but anyone who has already signed my key, and who trusts my ability to verify other people’s identities, can also communicate securely with Mark. The chain of assertions makes this possible.

This all sounds like a lot of hard work. It is. And that’s the trouble: there has to be some pay-off for all this hard work, so the way things are, you have to really want secure communication to go to all this bother. There are tools to make it a bit easier, but it’s still a faff.


So, what does Keybase bring to the mix? It’s all about assertions, but from a slightly different perspective. The key assertion it allows you to make is that the owner of a PGP public key is also the owner of:

  • a particular Twitter account;

  • a GitHub account;

  • a domain name; and/or

  • some random Bitcoin-related things I don’t fully understand!

It does this by posting some PGP-signed content publicly in each of these places. For example, I have a tweet with a message signed by my PGP key, and a Gist posted on GitHub with similar information. And I’ve got a file hosted on this blog with the same PGP-signed information: keybase.txt. The latter two both contain obvious PGP-signed blocks of text, which we can save off and verify:

$ curl -s | gpg
gpg: Signature made Sat 20 Dec 12:34:08 2014 GMT using RSA key ID 002DC29B
gpg: Good signature from "Graeme Mathieson <>" [ultimate]
gpg:                 aka "Graeme Mathieson <>" [ultimate]
gpg:                 aka "[jpeg image of size 9198]" [ultimate]
gpg:                 aka " <>" [ultimate]

I’ve elided the decoded, signed, body but it’s another copy of the JSON hash visible earlier in the file.

So that’s what Keybase gives us: the ability to assert that the person who controls a particular account (on Twitter, GitHub, Hacker News, Reddit, or a personal domain name) is also the holder of a particular PGP key. This effectively provides an alternative to the email verification step above in a neat, publicly verifiable, way. Each time you interact with a new user on Keybase (for example, sending an encrypted message), it checks that each of these assertions is still valid. It has on file, for example, that I’ve tweeted the above message. When somebody tries to send a message to me, it will verify that the tweet still exists, ensuring the assertion is still valid. For example, sending a message to Mark (aka broonie):

$ keybase encrypt -m 'Testing Keybase message sending' broonie
info: ...checking identity proofs
βœ” public key fingerprint: 3F25 68AA C269 98F9 E813 A1C5 C3F4 36CA 30F5 D8EB
βœ” "broonie" on twitter:
Is this the broonie you wanted? [y/N] y

It asserts that the public key I have locally matches the fingerprint Keybase have stored. It also checks that the assertion published by broonie on Twitter is still valid — and displays the URL so I can check it myself, too. If I’m happy that this really is the broonie I’d like to communicate with, I accept, and it spits out a PGP-encrypted message that I can then paste into an email to Mark.

Tracking Users

Similar to the key signing stage above, we can short circuit this trust-verification stage by tracking users. Instead of having to verify a user’s identity every time, we can track them once, which effectively signs their keybase identity. Let’s track Mark:

$ keybase track broonie
info: ...checking identity proofs
βœ” public key fingerprint: 3F25 68AA C269 98F9 E813 A1C5 C3F4 36CA 30F5 D8EB
βœ” "broonie" on twitter:
Is this the broonie you wanted? [y/N] y
Permanently track this user, and write proof to server? [Y/n] y

You need a passphrase to unlock the secret key for
user: "Graeme Mathieson <>"
4096-bit RSA key, ID 8F7E58DD002DC29B, created 2012-07-20

info: βœ” Wrote tracking info to remote server
info: Success!

Now we can send him PGP-encrypted messages without verifying his assertions every time. Win. Keybase makes it really easy to track people’s identities with their command line client. And it makes it really easy to send PGP-signed or -encrypted messages to people.

Of course, “all” that Keybase has done is to allow us to assert that the owner of a particular public key is also the owner of a set of Twitter, GitHub, Hacker News and Reddit accounts. And a domain name. Is that set of assertions enough to be confident in drawing the inference that we’re securely communicating with the person we want to?

It depends on the situation. Based on prior interactions on Twitter, I’m pretty confident that Mark really is ‘broonie’ on Twitter. Similarly, based on previous interaction with relativesanity on GitHub, I’m pretty confident he’s my friend, JB. And that might be good enough.

Trade Offs

Security is all about trade-offs. Keybase has traded off some of the strict verification procedures I’ve been accustomed to in PGP-land. But instead it’s introduced a new verification method which is much simpler to do, and might just be ‘good enough’ for my needs. I’m still quite excited about Keybase, and I’m interested to see where it’s going to take secure personal communication. If you’d like to track me on Keybase, I’m mathie (as usual).

  1. Git allows authors to PGP-sign tags in their source control system. This provides a verifiable way of saying that the author is confident the source code contains the code they intended. Since an individual commit is a function of all the preceding commits, signing a single one asserts the history of the entire tree. 
  2. Most Linux distributions (both dpkg- and rpm-based distributions, at least) allow authors to PGP-sign their packages. That, combined with the complete web of trust (every Debian developer has their key signed by at least one other Debian developer, in order to develop this web of trust), means we can be confident that a software package really did come from that developer. 

Starting a new Rails project

Since Ruby on Rails 4.2 has just been released, perhaps now is a good time to review creating a shiny new Rails project. It’s not often I get to create a new project from scratch, but it’s Christmas and I’ve got a bit of downtime — and an itch I’d like to scratch! So, let’s get started.

I’m aiming to build a wee project that keeps track of OmniFocus perspectives. I’ve noticed that people are sharing their perspectives as screenshots and descriptions, in tweets and blog posts. Wouldn’t it be awesome if there was a one-stop-shop for everybody’s perspectives?

A couple of early decisions in terms of the basic starting point:

  • Chances are I’ll deploy the app onto Heroku, so it’s a no-brainer to start out by using PostgreSQL in development. (Even if I wasn’t using Heroku, personally I’d choose it over MySQL if I could!)

  • I have no design skills whatsoever, so I’ll be using Twitter Bootstrap to provide some basic styling.

  • I’ll be using RSpec for testing — both unit tests and integration tests. I’ll give Cucumber a miss, at least for now.

  • Eventually, I’d like to move away from the asset pipeline, managing client side assets separately, but I’ll save that for a future refactoring.

With that in mind, let’s get started. I’m assuming that you’ve got suitable versions of Ruby and Git, and that you’ve got PostgreSQL running locally. If you haven’t, but you’re on a Mac, may I politely point you in the direction of homebrew, rbenv, and ruby-build? If you’re not on Mac OS X, please feel free to help out in the comments. :)

So, let’s install Rails through Rubygems:

$ gem install rails

Now we’ll create the new project, and add the generated skeleton as our initial git commit:

$ rails new perspectives -d postgresql --skip-keeps --skip-test-unit
[ ... ]
$ cd perspectives
$ git init
$ git add .
$ git commit -m 'Skeleton Rails project.'

Now let’s set up the database configuration. Edit config/database.yml to remove the production stanza (which is provided by our production configuration management system!), and remove the boilerplate comments:

default: &default
  adapter: postgresql
  encoding: unicode
  pool: 5

development:
  <<: *default
  database: perspectives_development

test:
  <<: *default
  database: perspectives_test
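The `<<: *default` lines use YAML’s merge key to pull the shared settings into each stanza. As a quick illustration (this snippet is just a sketch of what the YAML parser does, not the app’s actual config loading), here’s a trimmed-down version of the file being parsed in Ruby:

```ruby
require 'yaml'

# Aliases have to be explicitly permitted in recent versions of Psych.
config = YAML.safe_load(<<~YAML, aliases: true)
  default: &default
    adapter: postgresql
    pool: 5

  development:
    <<: *default
    database: perspectives_development
YAML

# The development stanza inherits adapter and pool from the default anchor,
# and adds its own database name.
config['development']
```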

Let’s create the two databases, generate the initial schema, and commit the changes back to git:

$ rake db:create:all
$ rake db:migrate
$ git add .
$ git commit -m "Database configuration."

Now we’ll tidy up the bundler dependencies in Gemfile, since there’s stuff in there we’ll never use. We can always add gems back at the point we need them, after all. Gemfile now contains just:

source 'https://rubygems.org'

gem 'rails', '~> 4.2.0'
gem 'pg'

# Asset pipeline
gem 'sass-rails', '~> 5.0'
gem 'uglifier', '>= 1.3.0'
gem 'coffee-rails', '~> 4.1.0'
gem 'jquery-rails'

group :development, :test do
  gem 'byebug'
  gem 'web-console', '~> 2.0'
  gem 'spring'
end

Since we’ve removed the turbolinks gem, we’ll need to remove the reference to it from app/assets/javascripts/application.js, lest we start getting errors on every page load. Run bundle install again to tidy up the lock file, add the changes to the index and commit:

$ git add .
$ git commit -m "Tidy up gem dependencies."

It’s a good idea to set the Ruby version. This tells rbenv or rvm which version of Ruby you want to use in development, and tells Heroku which one you want in production. I’m currently working with Ruby 2.1.5, so let’s set that:

$ echo '2.1.5' > .ruby-version

We’ll get bundler to pick up that same setting, just so we’re not repeating ourselves. In Gemfile, just under the source declaration, add:

ruby File.read(File.expand_path('../.ruby-version', __FILE__)).chomp

Again, keeping with small, discrete changes, we’ll commit that:

$ git add .
$ git commit -m "Set Ruby version to 2.1.5."

We should figure out a way to run the application server. I like to use foreman to have all the app’s components up and running in development. I have it installed through the Heroku Toolbelt, but you can just install it through Rubygems, with gem install foreman. In terms of the application server itself, unless I have a compelling reason to use something different, I’ll use Unicorn. Create a Procfile in the root of the project with:

web: bundle exec unicorn --port=${PORT} --config-file config/unicorn.rb

add the ‘unicorn’ gem to our bundler dependencies, and create a Unicorn configuration file in config/unicorn.rb:

worker_processes Integer(ENV['WEB_CONCURRENCY'] || 3)
timeout 15
preload_app true

before_fork do |_server, _worker|
  Signal.trap 'TERM' do
    puts 'Unicorn master intercepting TERM signal and sending itself QUIT instead.'
    Process.kill 'QUIT', Process.pid
  end

  defined?(ActiveRecord::Base) &&
    ActiveRecord::Base.connection.disconnect!
end

after_fork do |_server, _worker|
  Signal.trap 'TERM' do
    puts 'TERM ignored. Waiting on master sending a QUIT signal instead.'
  end

  defined?(ActiveRecord::Base) &&
    ActiveRecord::Base.establish_connection
end

This is pretty much cargo-culted from the Heroku documentation for using Unicorn, but it’s doing a couple of key things:

  • Translating between the TERM and QUIT signals, so that Unicorn does the right thing when Heroku’s process supervision system tells it to die.

  • Making sure that every worker process has its own connection to the PostgreSQL database, instead of sharing one inherited from the master (which doesn’t need it at all).

Now we can run:

$ foreman start

to start up an application server, and it should allow us to visit http://localhost:5000/. Commit those changes:

$ git add .
$ git commit -m "Unicorn and foreman for the app server."

Let’s setup some testing next. My testing framework of choice is RSpec. Let’s add it, and spring support, to the Gemfile:

group :development, :test do
  gem 'spring-commands-rspec'
  gem 'rspec-rails'
  gem 'capybara'
end

and run bundle install to add them to our bundle. Now generate the rspec binstub so it uses spring to load faster:

$ spring binstub --all

and generate the skeleton configuration for rspec:

$ rails g rspec:install
      create  .rspec
      create  spec
      create  spec/spec_helper.rb
      create  spec/rails_helper.rb

Note that two spec helpers are now generated — one that just loads and configures rspec, and one which also loads the full Rails environment. This means that, if you have code which doesn’t depend on Rails itself, you can skip loading the whole shebang, which means your tests run faster. Awesome. (Though with the help of Spring, I can’t say I’ve suffered enough from slow specs to need the separation lately.)
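For instance, a plain Ruby value object with no Rails dependencies only needs the lightweight helper. The class below is entirely hypothetical, just to illustrate the kind of code whose specs can `require 'spec_helper'` instead of `rails_helper` and skip booting Rails entirely:

```ruby
# A hypothetical PORO (plain old Ruby object) with no Rails dependencies.
# A spec for it would require only 'spec_helper', so it loads fast.
class PerspectiveSlug
  # Turn a human-readable title into a URL-friendly slug.
  def self.generate(title)
    title.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-+|-+\z/, '')
  end
end

PerspectiveSlug.generate('My Favourite Perspective!')  # → "my-favourite-perspective"
```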

The spec/spec_helper.rb file comes with a sensible default configuration that’s commented out in an =begin/=end clause. The settings are all useful defaults, so just remove the =begin and =end lines, and we’re good to start testing. Sometimes, spring gets in the way a bit, in that it needs to be manually restarted to pick up changes. This is one of those times, so let’s restart (well, stop — it’ll automatically start again when it needs to) it:

$ spring stop

Then run the specs:

$ rspec
Run options: include {:focus=>true}

All examples were filtered out; ignoring {:focus=>true}
No examples found.

Finished in 0.00029 seconds (files took 0.13203 seconds to load)
0 examples, 0 failures

Top 0 slowest examples (0 seconds, 0.0% of total time):

Randomized with seed 35401

That’s working nicely, so let’s commit our changes:

$ git add .
$ git commit -m "Add rspec for testing."

It’s handy to have quick feedback from tests. In particular, I like to have tests run automatically when I make changes to a related file. Guard has my back here, so let’s use it. Add the following to the Gemfile:

group :development, :test do
  gem 'guard-rspec'
  gem 'guard-bundler'
end

and run bundle install to install them. Now we can generate a skeleton Guardfile to configure Guard:

$ bundle exec guard init bundler rspec

The configuration in here has changed substantially since last time I generated a Guardfile! Suffice to say, all that really needs to be changed is to tell guard to use spring’s binstub to launch rspec. Change the rspec declaration to:

guard :rspec, cmd: "bin/rspec" do
  # [ ... ]
end

We can now start guard in a new shell, with:

$ bundle exec guard start

and it’ll keep an eye out for file changes, running the appropriate tests for us when a file is saved. Hit enter to run all the tests now, just to check it’s working. Tests passing? Good, let’s commit our changes:

$ git add .
$ git commit -m "Guard, for automatically running our tests."

Winning. Let’s actually start serving some content — in particular, it’s always helpful to start with a static home page. Let’s start by specifying the routes. Create spec/routing/pages_controller_routing_spec.rb with:

require 'rails_helper'

RSpec.describe PagesController do
  it 'routes the home page to pages#index' do
    expect(get: '/').to route_to('pages#index')
  end

  it 'generates / for root_path' do
    expect(root_path).to eq('/')
  end
end

then replace the contents of config/routes.rb with the following to make the specs pass:

Rails.application.routes.draw do
  root to: 'pages#index'
end

Now we’ll specify the controller itself, in spec/controllers/pages_controller_spec.rb:

require 'rails_helper'

RSpec.describe PagesController do
  describe 'GET index' do
    def do_get
      get :index
    end

    before { do_get }

    it 'returns http success' do
      expect(response).to have_http_status(:success)
    end

    it 'renders the index template' do
      expect(response).to render_template('index')
    end
  end
end

with the corresponding controller implementation in app/controllers/pages_controller.rb containing:

class PagesController < ApplicationController
  def index
  end
end

And, finally, a view, in app/views/pages/index.html.erb:

<h1>Hello World</h1>

Not only should our specs be green, but visiting http://localhost:5000/ should reveal the newly minted view being rendered. Just for completeness, we’ll add a feature spec as a simple integration test to check the entire stack is working. Create spec/features/pages_spec.rb with:

require 'rails_helper'

RSpec.feature 'Static pages' do
  scenario 'Visiting the home page' do
    visit '/'

    expect(page).to have_text('Hello World')
  end
end

Let’s commit our changes and move on to the final stage:

$ git add .
$ git commit -m "Add in a static home page."

The final stage is to make things look a little bit prettier, and for that we’ll use Bootstrap. In particular, since we’re still using the asset pipeline, in order to reduce the number of dependencies, we’ll use the SASS port of Bootstrap. Add the following to the Gemfile:

gem 'bootstrap-sass'
gem 'autoprefixer-rails'

and run bundle install to install it. Rename the application stylesheet so it’s processed as sass:

git mv app/assets/stylesheets/application.css{,.scss}

and add the following to the end:

@import "bootstrap-sprockets";
@import "bootstrap";

Import the Bootstrap Javascript, in app/assets/javascripts/application.js:

//= require bootstrap-sprockets

Finally, create a layout suitable for bootstrap, in app/views/layouts/application.html.erb:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <%= tag :link, { rel: 'shortcut icon', href: image_path('favicon.ico') }, true %>
    <title><%= content_for?(:title) ? "#{yield :title} - " : "" %>OmniFocus Perspectives</title>

    <%= stylesheet_link_tag 'application', media: 'all' %>
    <%= csrf_meta_tags %>
  </head>
  <body>
    <nav class="navbar navbar-default" role="navigation">
      <div class="container-fluid">
        <div class="navbar-header">
          <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#primary-navbar-collapse">
            <span class="sr-only">Toggle navigation</span>
            <span class="icon-bar"></span>
            <span class="icon-bar"></span>
            <span class="icon-bar"></span>
          </button>
          <%= link_to 'OmniFocus Perspectives', root_path, class: 'navbar-brand' %>
        </div>

        <div class="collapse navbar-collapse" id="primary-navbar-collapse">
          <ul class="nav navbar-nav">
            <% if request.path == root_path %>
              <li class="active"><%= link_to 'Home', root_path %></li>
            <% else %>
              <li><%= link_to 'Home', root_path %></li>
            <% end %>
          </ul>
        </div>
      </div>
    </nav>

    <div class="container">
      <% if content_for?(:title) %>
        <hgroup class="page-header">
          <% if content_for?(:toolbar) %>
            <div class="pull-right">
              <%= yield :toolbar %>
            </div>
          <% end %>

          <h1><%= yield :title %></h1>
        </hgroup>
      <% end %>

      <section id="flash">
        <% if notice.present? %>
          <div class="alert alert-success alert-dismissible" role="alert">
            <button type="button" class="close" data-dismiss="alert">
              <span aria-hidden="true">&times;</span>
              <span class="sr-only">Close</span>
            </button>

            <%= notice.html_safe %>
          </div>
        <% end %>

        <% if alert.present? %>
          <div class="alert alert-danger alert-dismissible" role="alert">
            <button type="button" class="close" data-dismiss="alert">
              <span aria-hidden="true">&times;</span>
              <span class="sr-only">Close</span>
            </button>

            <%= alert.html_safe %>
          </div>
        <% end %>
      </section>

      <article id="main" role="main">
        <%= yield %>
      </article>
    </div>

    <%= javascript_include_tag 'application' %>
  </body>
</html>

It’s a bit of a mouthful, but that’s the template I wind up starting with for all my new projects. It gives me a nice title, a menu bar, flash messages, and a prettily styled body. A good start for putting together the rest of the app. Finally, commit that change.

That’s about it for getting started on a new Rails project. All in all — especially when I crib from another project — it takes about half an hour to get up and running with a full test suite, a pretty UI, and a pleasant development environment. Now it’s time to implement the first feature!

Dockerising a Rails App

Last time, we learned how to get Vagrant, Docker & VMWare Fusion running locally on our Mac development environment, and even convinced it to serve a trivial static blog through nginx. Today, we’re going to up our game and get a full Ruby on Rails application running in the docker environment, complete with separate containers for each service. We’re going to take the widgets app we were using to muck around with specifying controllers with RSpec, and see how to deploy it properly.

So, what would we like to achieve?

  • Being able to build and test our docker containers locally, since that’ll be the fastest way to iterate on them and build the right thing.

  • Separate containers for each of the separate concerns in the web app. Right now, I’m aiming for:
    • a front end nginx container, which serves the static assets for the application, and proxies to the Rails application server;

    • the application server itself, which runs the Rails application; and

    • a backend PostgreSQL container to store the database.

  • Pushing all our customised images up to the Docker Registry so that we can easily deploy them elsewhere.

  • Deploying the images onto a container host in the cloud, just to prove it works in production, too.

Sounds like a plan, eh?

PostgreSQL Container

The first order of business, since it’s easiest, is to get a PostgreSQL container up and running. Here, we don’t need anything special, so we can grab a stock container from the Docker Registry. Fortunately, there’s an official image we can use. In fact, we can just download and run it:

docker run -d --name postgres postgres

This image exposes a running PostgreSQL database on port 5432 (its well-known port), and has a superuser called ‘postgres’. It also trusts all connections to itself by anyone claiming to be the postgres user, so it’s not the sort of thing that you’d want to expose to the public internet. Fortunately, since we’re running all our Docker containers on the same host, we don’t need to anyway (though it’s still worth remembering that you’re also trusting all the other containers running on that same host!).

Our App Server

Let’s tackle the trickiest bit, next: building our own custom container that runs the application. We can’t just pick something off the shelf that’ll work here, but we can start from a firm foundation. For that, let’s begin with phusion/base_image. This is an opinionated starting point, as it has a particular style of running docker containers. In particular, instead of being a single-process-based container, it has an init system, and runs a handful of ancillary services inside the container. From this base image, we get:

  • runit for initialising the container, running services and supervising them to ensure they stay working;
  • syslog-ng for centralised logging on the container itself (which you may then want to configure to ship off to a central log server elsewhere);

  • cron, for running of scheduled tasks; and

  • an ssh daemon, which we can use to introspect the running container.

All these things would need further configuration for a production deployment (in particular, ssh uses an insecure public key which should be replaced ASAP!), but we’ll punt on that for now and focus on building a container for
the app.

The Dockerfile

Let’s start building out a Dockerfile for our application server. First we start with some boilerplate, specifying the image we’re building on, and that we’re maintaining this version:

FROM phusion/baseimage
MAINTAINER Graeme Mathieson <>

Next up, we’ll install some system dependencies. For our app, we need Ruby 2.1, some sort of JavaScript runtime, and the appropriate build tools to compile our gem dependencies:

RUN apt-add-repository ppa:brightbox/ruby-ng
RUN apt-get update && apt-get dist-upgrade -qq -y
RUN apt-get install -qq -y ruby-switch ruby2.1 \
  build-essential ruby2.1-dev libpq-dev \
  nodejs
RUN ruby-switch --set ruby2.1

And we can update Rubygems for good measure, then install bundler to manage the application’s dependencies:

RUN gem update --system --no-rdoc --no-ri
RUN gem update --no-rdoc --no-ri
RUN gem install --no-rdoc --no-ri bundler

Now we specify the environment for the application. In a production environment, we’d probably get this information from a configuration service, but since we’re just playing around, we can hard code them here, so at least the app can know what it’s supposed to be doing:

ENV RAILS_ENV production

It turns out that we can use environment variables we’ve set here, later on in the configuration file. We’ll make use of this so we don’t repeat ourselves when specifying the port that the application server will expose:

ENV PORT 5000
EXPOSE ${PORT}
We’ll supply a simple script so that runit can run the application server, and restart it if it fails. The script is:

#!/bin/sh

export DATABASE_URL="postgresql://postgres@${POSTGRES_PORT_5432_TCP_ADDR}:${POSTGRES_PORT_5432_TCP_PORT}/widgets?pool=5"

cd /srv/widgets

/sbin/setuser widgets bundle exec rake db:create
/sbin/setuser widgets bundle exec rake db:migrate

exec /sbin/setuser widgets bundle exec \
  unicorn -p ${PORT} -c config/unicorn.rb \
    >> /var/log/rails.log 2>&1

We’ll come onto the DATABASE_URL later on, but it’s picking up the relevant information from the container environment to figure out the database that Rails should connect to. We’re also cheating a bit, in that the init script makes sure the database exists, and is migrated to the latest schema. I wouldn’t recommend this approach in a real production environment!

We install the script with:

RUN mkdir /etc/service/widgets
ADD config/deploy/ /etc/service/widgets/run

Now, finally, we can install the application itself, along with its gem dependencies:

RUN adduser --system --group widgets
COPY . /srv/widgets
RUN chown -R widgets:widgets /srv/widgets
RUN cd /srv/widgets && \
  /sbin/setuser widgets bundle install \
    --deployment \
    --without development:test

This dumps the container’s build context (which is the directory that contains the Dockerfile — in other words, the root of our project) into /srv/widgets, makes sure it’s owned by the widgets user, then installs the gem dependencies.

Building the custom app container

Now we’ve got a Dockerfile which describes the container we’d like to have, we can build it:

docker build -t 'mathie/widgets:latest' .

This runs through all the commands in the Docker configuration file in turn, and executes them in a newly created container which is running the resulting image from the previous command. This is clever stuff. Every new container is built upon the results of the commands preceding it. One key benefit is that it means Docker can cache these intermediate images, and only rebuild the bits that are needed, resulting in faster image build times.

So, for example, if we just change our runit configuration file above, then rebuild the image, Docker will use cached versions of the images generated from each preceding command, only really running the ADD command to add the new configuration file. Of course, since this emits a new image, all the subsequent commands have to be re-run. It can be useful to bear this caching strategy in mind when structuring your Dockerfile: put the bits that download and install all your dependencies near the top, while leaving configuration files and tweaks towards the end.
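As a toy model of that caching behaviour (purely illustrative, nothing to do with Docker’s real implementation): treat each layer as a pure function of its parent image and the command, and memoise the result. An unchanged prefix of commands then costs nothing to rebuild:

```ruby
require 'digest'

# Toy model of Docker's layer cache: an image id is derived from the
# (parent image, command) pair, and the cache is keyed the same way, so
# any unchanged prefix of commands is served straight from the cache.
cache = {}
build = lambda do |commands|
  commands.reduce('scratch') do |parent, command|
    cache[[parent, command]] ||= Digest::SHA256.hexdigest("#{parent}|#{command}")[0, 12]
  end
end

first  = build.call(['RUN apt-get install -y ruby', 'ADD run /etc/service/widgets/run'])
second = build.call(['RUN apt-get install -y ruby', 'ADD run-v2 /etc/service/widgets/run'])

# Only three layers were ever built: the apt-get layer is shared between
# the two builds, and each differing ADD produced its own layer.
cache.size  # → 3
```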

Running the containers

Now we’ve built all the necessary containers, we can run them. First of all, let’s run the PostgreSQL container, which provides a database service we can link to from our app:

docker run -d --name postgres postgres

Next, we can run the app itself, linking to the running PostgreSQL database:

docker run -d --name widgets \
  --link 'postgres:postgres' \
  mathie/widgets:latest

Linking allows an application on one container to connect to a service provided by another container. In this example, it allows the widgets container to connect to services running on the postgres container. This confused me for a while; I assumed that the links were for specific services on the target container. That’s not the case. The client container has access to all the exposed services on the target container.

Better still, linking containers allows us to auto-configure the connection to the target container, using environment variables. That way, we’re not hard coding any IP configuration between the containers, which is just as well seeing as we don’t have any overt way to control it.

The --link flag has two colon-separated arguments. The first is the target container. The second argument is the label used for the environment variables on the source container. Let’s try running an example container attached to the postgres container to see what happens. And let’s label the link “chickens” just to make things clearer:

$ docker run -i -t --rm --link 'postgres:chicken' mathie/widgets:latest bash
$ env|grep CHICKEN

The environment variables that start CHICKEN_ENV_ correspond to the environment set up in the PostgreSQL instance’s Docker configuration. We get the container’s name and label from the CHICKEN_NAME environment variable. And we get each of the exposed ports (from EXPOSE in the docker configuration file) in a few different ways, so that we can easily compose what we need to connect to the service.

Now you can see how we compose the database URL above.

Nginx Frontend

We’re still missing one piece of the puzzle: something to serve static assets (stylesheets, javascript, images, that kind of thing). We’re using the Rails Asset Pipeline to serve our assets, so we need to generate static copies of them, bundle them up in a container running a web server, and get it to forward real requests on to the backend. Just like all the cool kids, I’ve followed the trends through Rails web serving, from Apache to Lighty, back to Apache with Passenger, then onto Nginx. (I hope we’re all still on Nginx, right?) Let’s stick with that.

There’s an nginx container that’s ready to go, and it does almost all we need to do — serve static assets — but not quite. Still, we can start with it, and customise it a little. Let’s create a second Dockerfile:

FROM nginx
MAINTAINER Graeme Mathieson <>

RUN rm -rf /usr/share/nginx/html
COPY public /usr/share/nginx/html
COPY config/deploy/nginx.conf /etc/nginx/conf.d/default.conf

All we’re doing here is copying all the static assets (i.e. the contents of the public/ folder, some of which we’ll have to generate in advance) across to the container, and supplying an nginx configuration file that passes dynamic requests to the back end. What does that nginx configuration file look like?

server {
    listen       80;
    server_name  localhost;

    location / {
        root      /usr/share/nginx/html;
        index     index.html index.htm;
        try_files $uri/index.html $uri.html $uri @upstream;
    }

    location @upstream {
        proxy_pass http://widgets:5000;
    }
}
So, if a file exists (in a couple of forms), it’ll serve it. If it doesn’t, it’ll pass it on to this mysterious upstream called widgets at port 5000.

Automating the build with Rake

Things are starting to get a little complicated here. Not only do we have to remember to precompile the assets before building our web frontend, we also have to remember to do it before building our app server too. (The app server uses the generated manifest.yml file to figure out the cache-busting names it should use for the assets it links to.) With all these dependent tasks, it sounds like we need a make-clone to come to our rescue. Hello, rake!

There are a couple of things that we want to manage with rake:

  • Since we’re now building two different Docker images, we want the appropriate Dockerfile at the root of the repository, so that docker correctly picks up the entire app for its context. Let’s rename our existing Dockerfile to Dockerfile.app and the web front end to Dockerfile.web. We can then get rake to symlink the appropriate one as it’s doing the image build.
  • We need to precompile all the assets for the asset pipeline.

  • And, of course, we need to build the images themselves.

Let’s create our rake tasks in lib/tasks/docker.rake with the following:

namespace :docker do
  task :build => ['docker:build:web', 'docker:build:app', 'assets:clobber']

  namespace :build do
    task :app => ['assets:precompile', 'assets:clean'] do
      sh 'ln -snf Dockerfile'
      sh 'docker build -t "mathie/widgets:latest" .'
      sh 'rm -f Dockerfile'
    end

    task :web => ['assets:precompile', 'assets:clean'] do
      sh 'ln -snf Dockerfile.web Dockerfile'
      sh 'docker build -t "mathie/widgets-web:latest" .'
      sh 'rm -f Dockerfile'
    end
  end
end
Now we can run rake docker:build to build our pair of docker images, one for the web front end, and one for the app itself. If you take a look at the file in GitHub, you’ll see that I’ve also added some helper tasks to automate running and terminating the containers in the right order with the right links.

Container Host Names

A little earlier, I noted that our nginx configuration was passing dynamic requests up to a mysteriously named back end:

location @upstream {
    proxy_pass http://widgets:5000;
}

So what’s that all about? Well, when we link containers together, in addition to creating all those environment variables to help us configure our container, it also writes the label out to /etc/hosts, giving us a name -> ip address mapping for the linked container. Let’s run a sample container to see this in action:

$ docker run -i -t --rm --link 'widgets:chickens' mathie/widgets-web:latest bash
root@45aa608114f8:/# ping chickens
PING chickens ( 48 data bytes
56 bytes from icmp_seq=0 ttl=64 time=7.931 ms

Winning. This is particularly useful for the likes of nginx, where we can’t use environment variables in that particular context. Our alternative would have been to figure out some sort of template nginx configuration that had variables we could substitute when the container was run, and that would have been too much like hard work!

Running the web front end

So, to recap, now we have two running containers: one for PostgreSQL, and one for the app itself. And we’ve built two custom containers: one for the app itself, and one with static assets for the web front end. The final piece of the puzzle is to start up the web front end:

docker run -d \
  --name widgets_web \
  -p 80:80 \
  --link "widgets:widgets" \
  mathie/widgets-web:latest

We’re launching it with the custom container, binding port 80 on the container to port 80 on the docker host, and linking the container to the app server. If you’ve followed the instructions in part 1 (in particular, around giving the docker host a name, and running avahi for mDNS), then you should now be able to visit http://docker.local/ and see the running app.

In Summary

It feels like it’s been a long trek to get here, but we’ve achieved a lot:

  • We’re building custom Docker containers, starting from existing containers supplied by the community. And we’ve automated the build with Rake, so it would be trivial for our Continuous Integration system to automatically build these images every time we push up some new code.
  • We’ve got an entire docker environment running locally, which will closely resemble our production environment. This will give us closer parity between development and production, reducing the scope of production-only errors, and improving our (still inevitable) production debugging experience.

  • And we’ve learned how Docker actually works for linking containers, with both the environment variables available, and the host names supplied by Docker. Personally, I’m most pleased about that, because I hadn’t properly understood it until now!

I do have one question that you might know the answer to, if you’ve suffered your way through this entire rambling article: in a Docker configuration file, what’s the difference between COPY and ADD? The only difference I can see is that ADD will accept source URLs in addition to files in the image’s build context. I suspect there’s a subtle difference in caching behaviour, but I could be way off base!

Vagrant, Docker, and VMWare Fusion

I’m a bit behind the times when it comes to containerised deployments. I’ve been quite happily using Vagrant and Puppet to model production environments on my laptop before orchestrating their release into a real production environment. It’s delightful to be able to use Vagrant to recreate entire production-like environments, complete with a local puppet master, on my own computer, to test my Puppet changes before they go live.

With Vagrant at my back, and with all the experience I’ve picked up from deploying Puppet I’ve largely ignored the new hotness that is Docker — after all, it’s just Solaris Zones, right? πŸ˜‰

I have to admit, I did have a wee poke around with Docker on my laptop a few months back, but quickly gave up. The trouble is that I’m obstinate: I’ve paid for a VMWare Fusion license, and I’m insisting on using it. The Vagrant support for Docker uses boot2docker, which only supports VirtualBox, so I’m left high and dry. I figured it was time to get around that and figure out a reliable way to run docker containers on my Mac OS X laptop.

The trouble, of course, is that Docker is a Linux technology, so it doesn’t run natively on Mac OS X. So we need a Linux VM to run the containers. Let’s just get this absolutely clear: we’re trying to run containers within a Linux virtual machine, which is running on VMWare Fusion, which is managed through Vagrant which, in turn, is running on our Mac OS X laptop. (Yes, it’s virtualisation technology all the way down!)

So, what do we want from this Linux VM?

  • Ideally, it should be accessible by some well-known name — let’s call it docker.local — so that we can connect to services running on it without worrying about hard coding IP addresses, or forwarding ports back to the host machine.

  • Most importantly, it should be running a recent stable version of Docker.

  • It has (a portion of) the Mac OS X host file system shared with it, so that we can, in turn, share portions of that filesystem with docker containers.

You can find the full configuration up on GitHub: Vagrantfile. Let’s step through it. All the configuration is wrapped in a standard vagrant configuration block:


Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  # Configuration goes here.

First of all, we configure a base box. I’m using the base image of Ubuntu 14.04 provided by the folks at Phusion (purveyors of Passenger and other high quality things for a production environment): phusion/ubuntu-14.04-amd64. Mostly, I’m using this base box because it’s correctly configured for VMWare Fusion, so things like shared folders keep working, even across updated kernel packages. I’m also tweaking the number of CPUs available to the box, and the memory, just to give us plenty of headroom for running containers:

config.vm.box = "phusion/ubuntu-14.04-amd64"

config.vm.provider "vmware_fusion" do |provider|
  provider.vmx['memsize'] = 2048
  provider.vmx['numvcpus'] = 4
end

We’ll give the box a well-known hostname because, along with installing avahi-daemon shortly, that will allow us to refer to the vm by name, instead of having to discover its IP address manually:

config.vm.hostname = 'docker.local'

Finally, configuration-wise, let’s mount the host’s home directory somewhere convenient, so that we can share folders with docker containers later on:

config.vm.synced_folder ENV['HOME'], '/mnt'

This way, our entire home directory on Mac OS X will be available in /mnt on the Linux VM. Handy. (I did initially muck around with mounting it on /home/vagrant, but that turned out to create problems with SSH keys, so I punted and mounted it somewhere neutral.)

Now all we need is a bit of shell magic to provision the machine. Specify a shell script provisioner with:

config.vm.provision :shell, inline: <<-SHELL
  # Provisioning script goes in here.
SHELL

The rest of the code is inside the provisioning script. All we’re doing is installing the latest stable version of Docker, direct from Docker’s own apt repository, configuring it to listen for both TCP and socket connections, updating the rest of the system, and installing avahi for multicast DNS resolution. Simples:

sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 36A1D7869245C8950F966E92D8576A8BA88D21E9
echo 'deb https://get.docker.com/ubuntu docker main' > /etc/apt/sources.list.d/docker.list

apt-get update
apt-get dist-upgrade -u -y -qq
apt-get install -qq -y lxc-docker avahi-daemon
apt-get autoremove --purge -y

adduser vagrant docker

echo 'DOCKER_OPTS="-H tcp:// -H unix:///var/run/docker.sock"' >> /etc/default/docker
restart docker

Let’s provision the box to make sure it’s working, and run our first docker container:

$ vagrant up
# [ ... ]
$ vagrant ssh
$ docker run ubuntu echo hello world
# [ ... ]
hello world

Winning. But, even better still, we can run the Docker client on Mac OS X, connecting to the Docker daemon running inside the VM. I’ve got docker installed on my Mac through homebrew:

$ brew install docker

Now we can tell the Docker client where to connect (this is where setting the hostname on our VM, and installing avahi, comes in handy). The DOCKER_HOST environment variable can be used to say where our Docker host is. I’ve added the following to my shell startup (~/.zshenv in my case):

export DOCKER_HOST="tcp://docker.local:2375"

Restart your shell and let’s see if we can run a docker container directly from Mac OS X:

$ docker run ubuntu echo hello world
hello world

Awesome. Finally, just to check we’ve got all the moving parts working as desired, let’s try to serve up this blog’s generated content (built with Jekyll) using an nginx container. This will check that we can run daemon containers, connect to their advertised ports, and share the local filesystem content with a container. Start the container, on your Mac OS X host’s command line, with:

$ BLOG=Development/Personal/
$ docker run -d \
    -p 80:80 \
    --name wossname-nginx \
    -v /mnt/${BLOG}:/usr/share/nginx/html:ro \
    nginx

It’s a bit of a mouthful. We’re:

  • running a docker container as a daemon, detaching it from the tty with the -d flag;
  • forwarding port 80 on the docker host to port 80 of the container;

  • giving the container a name (wossname-nginx) so that we can refer to it later;

  • mounting a shared volume from this web site’s output directory on the host (though note the path is based on where it’s mounted inside the Linux VM) to the preconfigured HTML root for the nginx container; and

  • finally, specifying the name of the image we want to run, which happens to be the most recent official nginx image.

With all that done, we can visit http://docker.local/ and be served with the most recently generated version of this web site. And it’s coming from a docker container inside a Linux VM running in VMWare Fusion, which is managed by Vagrant, which is running on our Mac OS X host. It’s amazing it looks so fresh after travelling all that distance, eh?

The next step will be to try and serve up something a little more complex — say, a simple Rails app — with multiple containers. But that’s for another day.

Representing Trees in PostgreSQL


Trees are a very useful data structure for modelling real world entities. You can use them to model an organisation’s logical structure (teams, tribes, squads, departments, etc) or you can use them to model a building’s physical structure (campuses, buildings, floors, rooms, etc). You can use them to model conversations (nested comments). The possibilities are endless.

The trouble with hierarchical data like this is that it doesn’t fit terribly well into our traditional relational data model, where we’re operating over a set of data which matches some particular constraint. It’s OK if the depth of the tree is fixed. If, for example, we had a rigid model for our hierarchy which represented locations (a room on a floor within a building, say), then it’s straightforward to write queries for that data. Let’s say we wanted to find out which room a computer was in:

SELECT AS building, AS floor, AS room
FROM computers, locations buildings, locations floors, locations rooms
WHERE     = 'Arabica'
  AND computers.room_id  =
  AND rooms.location_id  =
  AND floors.location_id =

Unfortunately, here the structure of the tree is baked right into the query. If we wanted to change the model — say to break rooms out into individual desk locations — then we’d have to change the queries and other code to accommodate that. Since trees of arbitrary depth and flexible structure are so useful, people have come up with a few ways to shoehorn them into a SQL database model:

  • Adjacency model is where the nearest ancestor of a particular row (the parent) has its id recorded on each of its children. Each child winds up with a parent_id column which refers to its immediate parent. This seems like a clean, natural, solution from a programming perspective but, in practice it performs poorly and is hard to manage.

  • Materialised Path model is where each node in the tree has a column which stores its entire ancestral path, from the root node, down to its parent. These are typically stored as strings, encoded with delimiters so they’re easy to search on. If we have a structure along the lines of A -> B -> C (that is, C is a child of B who, in turn, is a child of A), then C‘s path column might be A/B.

  • Nested Sets are slightly more complex to contemplate. Each node is given a “left” number and a “right” number. These numbers define a range (or interval) which encompasses the ranges of all its children. To continue with the example above, A might have an interval of (1, 5), B could have an interval of (2, 4) and C could have a short interval of (3, 3). This particular representation is great for searching subtrees, but inserting a new node can potentially cause half the nodes in the tree to need to be rebalanced (i.e. have their intervals recalculated).

There’s also a twist on the Nested Sets model, called the nested interval model, where the nodes are given two numbers that represent the numerator and denominator of a fraction, but it doesn’t seem so popular, and was too complex to wrap my head around!
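To make the materialised path idea a little more concrete, here’s a toy, in-memory sketch in plain Ruby (no database involved; the node names and the descendants_of helper are purely illustrative). Each node records the names of its ancestors, root first, so C’s path of ['A', 'B'] says “C is a child of B, which is a child of A”:

```ruby
# A toy in-memory model of materialised paths: node name => ancestral path.
paths = {
  'A' => [],
  'B' => ['A'],
  'C' => ['A', 'B'],
}

# A node's descendants are exactly the nodes whose path mentions it,
# which is what makes subtree queries cheap in this model.
def descendants_of(name, paths)
  paths.select { |_node, path| path.include?(name) }.keys
end

descendants_of('A', paths) # => ["B", "C"]
```

The same shape of lookup is what the database will be doing for us later, just with an index behind it.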

Each of these models has its advantages and disadvantages, both in terms of its data representation, and in terms of the performance of common operations. Some are blazingly fast for INSERTs, for example, while making it unnecessarily hard to perform common queries. Choosing the right one depends on your workload. (The good news, though, is that you can always migrate your data from one pattern to another, if it turns out your real life workload requires it.)

PostgreSQL Arrays

For the current project I’m working on, I decided that the materialised path pattern would probably be the most appropriate for our particular workload. And it gave me a good excuse to try out an idea that’s been on my todo list for a couple of years now: using PostgreSQL’s native array type to store the materialised path.

An Example

Let’s work through a wee example to see it in action. Our application has a hierarchical tree of nodes. There can be multiple root nodes (that is, nodes without any ancestors), and each node simply has a name (which is required). You can find the sample app up on GitHub: tree which includes a bit of UI to play around with it, too. One thing to note is that we’re dependent upon the postgres_ext gem to handle the conversion between PostgreSQL and native array types. Let’s start with the migration:

class CreateNodes < ActiveRecord::Migration
  def change
    create_table :nodes do |t|
      t.string  :name, null: false
      t.integer :path, null: false, array: true, default: []

      t.timestamps null: false
    end

    add_index :nodes, :path, using: 'gin'
  end
end

It’s handy that ActiveRecord 4.x supports creating array types. So here we have:

  • An integer id. This does also work with uuid-style ids, but there are a few more hoops to jump through.

  • An array of integers as the path. This represents our materialised path of the ancestors of each node. It can’t be NULL, but defaults to the empty array, which is how we represent a root of a tree (no ancestors).

  • Timestamps, as usual, because it’s easier to go with the flow. :)

  • A Generalised Inverted Index (GIN) on the path column, which I’ll explain in more detail later. Suffice to say, this means all our common operations on the tree should hit an index, so should be ‘fast enough’.

Let’s now think of some of the queries we might like to ask our tree. In each of these cases, I’m keen for the resulting class to be an ActiveRecord::Relation, since that way a client of the code can further refine the search. (Say for example, I wanted to search a particular subtree for all the nodes that had a green name, I could chain together, node.subtree.where('name LIKE ?', '%green%').)

Finding Nodes

We’d probably like to get a list of the root nodes — after all, we’ve got to start somewhere. What would that look like?

class Node < ActiveRecord::Base
  def self.roots
    where(path: [])
  end
end

Which is just asking for all the nodes which have an empty array for the path.

Now that we’ve got an initial node to work from, what kinds of queries might we want to do on it? Well, we might want to find the root of the tree that contains it:

def root
  if root_id = path.first
    self.class.find(root_id)
  else
    self
  end
end

That is, the first id in the path is the root of the tree. If there is one (i.e. it’s not an empty array), then look up the row with that primary key. If the array is empty, we’re already on the root node, so can just return ourselves. Finding the parent of the current node is similar:

def parent
  if parent_id = path.last
    self.class.find(parent_id)
  end
end

The difference here is that if there’s no parent, we return nil. We can also quickly find all the ancestors:

def ancestors
  if path.present?
    self.class.where(id: path)
  else
    self.class.none
  end
end

Dead simple: load up the rows with the primary keys of all the ancestors. In order to maintain consistency of always returning an ActiveRecord::Relation I had to use the special ‘null relation’ here which, when executed, always returns an empty set. Checking for siblings is straightforward, too; they have the same path as the current node:

def siblings
  self.class.where(path: path)
end

and direct children have the current node’s path concatenated with its own id:

def children
  self.class.where(path: path + [id])
end

The next one is a little more interesting. In order to find all the descendants of a particular node (that is, the children, and the children’s children and so on) we’re looking for any nodes that contain the current node’s id anywhere within its path. Fortunately, PostgreSQL has just such an operator, so we can do:

def descendants
  self.class.where(":id = ANY(path)", id: id)
end

The N+1 Problem

There is a snag with all of this, though. While the individual queries are simple, efficient, and hit indexes, they’re not minimising the number of SQL queries generated in the first place. Logically, in one of my views, I want to display a nested list of the nodes and their children. In the controller I load up the root nodes:

def index
  @nodes = Node.roots
end

Then in my app/views/nodes/index.html.erb view, I’d like to do:

<ul>
  <%= render @nodes %>
</ul>

and in app/views/nodes/_node.html.erb I can simply do:

  <%= link_to, node %>

  <% if node.children.present? %>
      <%= render node.children %>
  <% end %>

Nice, clean, tidy view code, right? The trouble is that it’s got the classic N+1 problem of SQL statements being generated: one to load the root nodes, and one for each descendant of each root node! We could have retrieved all the appropriate data with a single query, but that means reconstructing the tree from a flat set of rows we got back from PostgreSQL. Doing that sort of thing in the view would be heinous.

In Theory

Here’s what I’d like to do:

@nodes = Node.roots.eager_load(:descendants)

which would load the root nodes, and would eager load all the descendant nodes. It would be able to insert them into the children relation of each appropriate node, and mark that relation as having already been loaded, just like eager loading of associations that we’re used to with ActiveRecord.

I have a sneaking suspicion that such a thing is possible. In fact, I suspect that the machinery is already mostly in place, in ActiveRecord::Associations::Preloader. So, a question: can it be done using the existing ActiveRecord code? If so, how?

Specifying Rails Controllers with RSpec

Part 1: Query-style actions

To my mind, Ruby on Rails controllers are a simple beast in the Model-View-Controller (MVC) pattern. They simply:

  • take an HTTP action (GET, POST, PUT or DELETE) for a particular resource. The mapping between the HTTP verb, the resource, and the corresponding action is handled above the controller, at the routing layer. By the time it gets down to the controller, we already know what the desired action is, and we have an identifier for the resource, where applicable.

  • communicate the intent of the action to the underlying model.

  • return an appropriate response, whether that’s rendering a template with particular information, or sending back an HTTP redirect.

There are two possibilities when it comes to communicating the intent of the action to the underlying model:

  • It can query the model for some information, usually to render it as an HTML page, or emit some JSON representation of the model; or
  • it can send a command to the model, and return an indication of the success or failure of the command.

That’s it. The controller is merely an orchestration layer for translating between the language and idioms of HTTP, and the underlying model. I should probably note at this point that the underlying model isn’t necessarily just a bunch of ActiveRecord::Base-inherited models, it can be composed of service layers, aggregate roots, etc. The point is that the controllers are “just” about communicating between two other parts of the system (at least that’s how I always interpreted Jamis Buck’s famous Skinny Controller, Fat Model post).

That’s handy, because it makes them nice and easy to test. I’ve been brushing up on RSpec over the past couple of weeks, learning all about the new syntax and features, and applying them to a couple of demo projects. I thought I’d share what I’d learned about rspec in the form of a worked example.

We’ve got a requirement to keep track of Widgets. Widgets are pretty simple things: they have a name, and they have a size in millimetres (you wouldn’t want to get different sized widgets mixed up!). Our client, who manufactures these widgets, is looking for a simple web app where she can:

  • list all widgets;
  • create a new widget; and

  • delete an existing widget.

We’ve been developing their in-house stock management system for a while, so we’ve got an established Ruby on Rails project in which to hang the new functionality. It’s just going to be a case of extending it with a new model, controller, and a couple of actions. And we’re all set up with rspec, of course. Better still, we’re set up with Guard, too, so our tests run automatically whenever we make a change. You can find the starting point of the project on the initial setup branch.

Listing all the widgets

Let’s start with an integration test. I like the outside-in flow of testing that BDD encourages (in particular, the flow discussed in The RSpec Book), where we start with an integration test that roughly outlines the next feature we want to build, then start fleshing out the functionality with unit tests. So, our first scenario, in spec/features/widgets_spec.rb:

RSpec.feature 'Managing widgets' do
  # All our subsequent scenarios will be inside this feature block.

  scenario 'Finding the list of widgets' do
    visit '/'

    click_on 'List Widgets'

    expect(current_path).to eq('/widgets')
  end
end

I always like to start simple, and from the user’s perspective. If I’m going to manage a set of widgets, I need to be able to find the page from another location. In this scenario, we’re checking that there’s a ‘List Widgets’ link somewhere on the home page. It saves that embarrassing conversation with the client where you’ve implemented a new feature, but she can’t find it.

The first thing to note is that, by default, rspec now no longer creates global methods, so Capybara’s feature DSL method (and RSpec’s describe, which we’ll see in a minute) are both defined on the RSpec namespace. Perhaps the other thing to note is that I’m just using RSpec feature specs — with Capybara for a nice DSL — instead of Cucumber, which I’ve advocated in the past. Why? It turned out that people rarely read my Cucumber stories, and those that did could cope with reading code, too. RSpec features are more succinct, consistent with the unit tests, and have fewer unwieldy overheads. You and your team’s mileage may, of course, vary!

Finally, there’s rspec’s new expect syntax. Instead of adding methods to a global namespace (essentially, defining the should and should_not methods on Object), you have to explicitly wrap your objects. So, where in rspec 2.x and older, we’d have written:

current_path.should eq('/widgets')

(which relies on a method called should being defined on your object) instead we now wrap the object under test with expect:

expect(current_path).to eq('/widgets')

It still reads well (“expect current path to equal /widgets” as opposed to the older version, “current path should equal /widgets”). Combined with the allow decorator method, it also gives us a consistent language for mocking, which we’ll see shortly. To my mind, it also makes it slightly clearer exactly what the object under test really is, since it’s (necessarily) wrapped in parentheses. I like the new syntax.

Let’s just assume we’ve implemented this particular scenario, by inserting the following into app/views/layouts/application.html.erb:

<li><%= link_to 'List Widgets', widgets_path %></li>

It does bring up the issue of routes, though: how do we specify that the list of widgets is located at /widgets? Let’s write a couple of quick specs to verify the routes. I swither between testing routes being overkill or not. If there is client side JavaScript relying on the contract that the routes provide, I err on the side of specifying them. It’s not too hard, anyway. So, in spec/routing/widgets_controller_routing_spec.rb (what a mouthful, but it’s what rubocop recommends):

RSpec.describe 'WidgetsController' do
  it 'routes GET /widgets to widgets#index' do
    expect(get: '/widgets').to route_to('widgets#index')
  end

  it 'generates /widgets from widgets_path' do
    expect(widgets_path).to eq('/widgets')
  end
end

In order to make these specs pass, add the following inside the routes block in config/routes.rb:

resources :widgets, only: [:index]

But one of our specs is still failing, complaining about the missing controller. Let’s create an empty controller to keep it happy. Add the following to app/controllers/widgets_controller.rb:

class WidgetsController < ApplicationController
end

We’ve still got a failing scenario, though — because it’s missing the index action that results from clicking on the link to list widgets. Let’s start to test-drive the development of that controller action. This is what I describe as a “query”-style action, in that the user is querying something about the model and having the results displayed to them. In these query-style actions, there are four things that typically happen:

  • the controller asks the model for some data;
  • it responds with an HTTP status of ‘200 OK’;

  • it specifies a particular template to be rendered; and

  • it passes one or more objects to that template.

Let’s specify the middle two for now. I’ve a feeling that will be enough to make our first scenario pass. In spec/controllers/widgets_controller_spec.rb, describe the desired behaviour:

RSpec.describe WidgetsController do
  describe 'GET index' do
    def do_get
      get :index
    end

    it 'responds with http success' do
      do_get
      expect(response).to have_http_status(:success)
    end

    it 'renders the index template' do
      do_get
      expect(response).to render_template('widgets/index')
    end
  end
end

Defining a method to perform the action is just one of my habits. It doesn’t really add much here — it’s just replacing get :index with do_get — but it helps to remove repetition from testing other actions, and doing it here too gives me consistency amongst tests. Now define an empty index action in WidgetsController:

def index
end

and create an empty template in app/views/widgets/index.html.erb. That’s enough to make the tests pass. Time to commit the code, and go grab a fresh coffee.

Next up, let’s specify a real scenario for listing some widgets:

scenario 'Listing widgets' do
  Widget.create! name: 'Frooble', size: 20
  Widget.create! name: 'Barble',  size: 42

  visit '/widgets'

  expect(page).to have_content('Frooble (20mm)')
  expect(page).to have_content('Barble (42mm)')
end

Another failing scenario. Time to write some code. This time it’s complaining that the widgets we’re trying to create don’t have a model. Let’s sort that out:

rails g model Widget name:string size:integer

We’ll tweak the migration so that neither of those columns is nullable (in our domain model it isn’t valid to have a widget without a name and a size):

class CreateWidgets < ActiveRecord::Migration
  def change
    create_table :widgets do |t|
      t.string :name,  null: false
      t.integer :size, null: false

      t.timestamps null: false
    end
  end
end
Run rake db:migrate and we’re good to go again. Our scenario is now failing where we’d expect it to — the list of widgets is not appearing on the page. Let’s think a bit about what we actually want here. Having discussed it with the client, we’re just looking for a list of all the Widgets in the system — no need to think about complex things like ordering, or pagination. So we’re going to ask the model for a list of all the widgets, and we’re going to pass that to the template to be rendered. Let’s write a controller spec for that behaviour, inside our existing describe block:

let(:widget_class) { class_spy('Widget').as_stubbed_const }
let(:widgets)      { [instance_spy('Widget')] }

before(:each) do
  allow(widget_class).to receive(:all) { widgets }
end

it 'fetches a list of widgets from the model' do
  do_get
  expect(widget_class).to have_received(:all)
end

it 'passes the list of widgets to the template' do
  do_get
  expect(assigns(:widgets)).to eq(widgets)
end

There are a few interesting new aspects to rspec going on here, mostly around the test doubles (the general name for mocks, stubs, and spies). Firstly, you’ll see that stubbed methods, and expectations of methods being called, use the new expect/allow syntax for consistency with the rest of your expectations.

Next up is the ability to replace constants during the test run. You’ll see that we’re defining a test double for the Widget class. Calling as_stubbed_const on that class defines, for the duration of each test, the constant Widget as the test double. In the olden days of rspec 2, we’d have to either use dependency injection to supply an alternative widget implementation (something that’s tricky with implicitly instantiated controllers in Rails), or we’d have to define a partial test double, where we supply stubbed implementations for particular methods. This would be something along the lines of:

before(:each) do
  allow(Widget).to receive(:all) { widgets }
end

While this works, it is only stubbing out the single method we’ve specified; the rest of the class methods on Widget are the real, live, implementations, so it’s more difficult to tell if the code under test is really calling what we expect it to.

But the most interesting thing (to my mind) is that, in addition to the normal mocks and stubs, we also have test spies. Spies are like normal test doubles, except that they record how their users interact with them. This allows us, the test authors, to have a consistent pattern for our tests:

  • Given some prerequisite;
  • When I perform this action;

  • Then I have these expectations.

Prior to test spies, you had to set up the expectations prior to performing the action. So, when checking that the controller asks the model for the right thing, we’d have had to write:

it 'fetches a list of widgets from the model' do
  expect(widget_class).to receive(:all)
  do_get
end

It seems like such a little thing, but it was always jarring to have some specs where the expectations had to precede the action. Spies make it all more consistent, which is a win.
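To illustrate the idea behind spies, here’s a hand-rolled toy version in plain Ruby (RSpec’s doubles are far more capable; this Spy class is entirely made up for illustration). The double simply records every call it receives, so the expectation can be checked after the action, preserving the given/when/then order:

```ruby
# A toy spy: swallows any message and remembers its name for later.
class Spy
  attr_reader :calls

  def initialize
    @calls = []
  end

  # Accept any method call, recording the method name.
  def method_missing(name, *_args)
    @calls << name
    nil
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end

  # Ask, after the fact, whether a given message was received.
  def received?(name)
    calls.include?(name)
  end
end

widget_class = Spy.new
widget_class.all             # the "when": the code under test does its thing
widget_class.received?(:all) # the "then": => true, checked afterwards
```

rspec-mocks does the recording (and verification against the real class) for us, but the mechanism is much the same.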

We’re back to having a failing test, so let’s write the controller implementation. In the index action of WidgetsController, it’s as simple as:

@widgets = Widget.all

and finish off the implementation with the view template, in app/views/widgets/index.html.erb:

<h1>All Widgets</h1>

<ul>
  <%= render @widgets %>
</ul>

and, finally, app/views/widgets/_widget.html.erb:

<%= content_tag_for :li, widget do %>
  <%= %> (<%= widget.size %>mm)
<% end %>

Our controller tests and our scenarios pass, time to commit our code, push to production, pack up for the day and head to the pub for a celebratory beer! In part 2, we’ll drive out the behaviour of sending a command to the model, and figuring out what to do based upon its response.

You can find a copy of the code so far on the part 1 branch in GitHub. Once you’ve had that celebratory beer, head across to part 2 to continue our exploration of RSpec while implementing a command-style action.

tmux: New Windows in the Current Working Directory

For a while, tmux would default to creating new windows (and splits) with the shell in the current working directory (CWD) of my existing pane. As of quite recently, that seems to have stopped working; now all my new shells are popping up with the CWD set to my home directory. That’s almost never where I want to be, and the CWD of the previous window is at least as good a starting point as any.

It turns out, according to this question on Stack Exchange, that the default behaviour used to be, on tmux 1.7:

  • If default-path is set on the session, set the current working directory of a new window (or split) to that directory; or

  • if default-path is unset, use the current working directory of the current window.

In tmux 1.9, the default-path option was apparently removed. It’s definitely not mentioned in the man page on my installed version (1.9a). That’s kind of a shame, because setting the default path for a session would be a useful feature (say, for example, setting it to the root of a project). Still, let’s see about making sure the new-window-related shortcuts within tmux do the right thing. Add the following to ~/.tmux.conf:

# Set the current working directory based on the current pane's current
# working directory (if set; if not, use the pane's starting directory)
# when creating new windows and splits.
bind-key c new-window -c '#{pane_current_path}'
bind-key '"' split-window -c '#{pane_current_path}'
bind-key % split-window -h -c '#{pane_current_path}'

which updates the current key bindings to use the current pane’s working directory. I’ve also updated my new-session shortcut, too:

bind-key S command-prompt "new-session -A -c '#{pane_current_path}' -s '%%'"

Much better! You can find my entire tmux configuration up on GitHub: tmux.conf. Most of the rest of the configuration is around customising the status bar.

Personal Code Review

Git is my distributed version control system of choice. I almost always pair it with GitHub since, despite it centralising a distributed system, sometimes the conversation around code is just as important as the code itself. (Now I think about it, a distributed issue tracking system which stashes its data, meta-data and history in an orphan branch would be pretty cool.) I’ve been a user, and proponent, of Git since my friend Mark did a talk on it at the Scottish Ruby User Group, and convinced us all it was better than Subversion.

Fundamentally, git (or any other version control system) gives us a couple of useful features:

  • The ability to keep track of changes over time. If we’re a little careful about creating and describing those changes at the time (instead of a 3,000 line change entitled, “New version”) then we can make use of this information in the future to reconstruct the context in which a decision was made.

  • The ability to share those changes with others.

That’s about it, really. All the rest is just bells and whistles that allow us to create larger work flows and processes.

I’ve seen many, and contributed to some, conversations on work flow using Git. These tend to focus around the second point: sharing changes with others. It’s all about collaboration amongst team members, how to manage releases and how to build larger features. Depending on the size of the team and the business goals, it can be around parallelisation of effort, too.

The ‘right’ answer in these situations is certainly product specific. There are work flows that suit versioned shrink-wrapped products. And if you have to simultaneously support several shrink-wrapped versions, you’re going to build a more complex model for that, too. If you’ve got a web application that’s continuously delivered to production, and you only ever have one version in the wild, you’ll have a different work flow.

The right answer is often team- and culture-specific, too. And, for good measure, the team’s culture can (and probably should) change over time, too. The work flows that work well for an open source project consisting of well meaning, but externally constrained, volunteers might be very different from the work flow required by the next big startup. And neither of those work flows are necessarily going to work in a large corporation suffering under rigorous compliance policies imposed upon them by HIPAA, PCI, ISO29001 or Sarbanes-Oxley, for example.


These conversations often revolve around sharing changes with others. They talk about branching (to work on larger features away from the main, stable, production code), the length of time that branches should be allowed to survive for, the maintenance overhead in branching and how, and when, to merge code. They go beyond git, describing how to merge code, but selectively deploy it, or selectively activate it (usually referred to as ‘feature switches’, allowing the business to make a decision about enabling or disabling features, instead of imposing it directly at a code level).

And these conversations are very interesting. They’re important, too. But they’re not what I want to talk about today. What I want to describe is my low level, deliberate, git work flow. It’s the way of working that I, personally, think enables these higher level work flows to happen smoothly.


The key result of my git micro work flow is that I want each individual commit to represent a single, atomic, change which makes sense in isolation. A branch, turned into a patch (in the traditional git work flow) or into a GitHub pull request, is a collection of these atomic commits that tell a single, larger, cohesive story. The key thing is that neither commits nor pull requests (the latter being how I usually work) end up as a jumble of unrelated stories, all interwoven.

This is how I attempt to achieve that.

It’s hard. I’m as scatter-brained as they come. When I’m hacking on a bit of code, I’ll often spot something entirely unrelated that needs fixing. Right now. If I’m being exceptionally well disciplined, I resist the urge, make a note on my todo list (an ‘internal’ interruption in Pomodoro parlance) and carry on with what I’m supposed to be doing.

A Rabbit Hole Full of Yaks

More often than not, though, I’m not that good. I’ll get distracted by some yak shaving exercise and spend the next hour working on it instead. Of course, everybody knows that yak shaves are recursive, because they always take you to another section of the code where you discover something you need to fix. A recursive yak shave is collectively known as a ‘rabbit hole’.

Hours later, we emerge. We’ve successfully dealt with all these crazy bald yaks, and figured out how they fitted into a rabbit hole in the first place. We’ve popped our stack all the way back to our original problem and solved it. Win.

The trouble is that our working copy is now a complete shambles of unrelated changes. That’s OK, nobody cares about these things.

git add .
git commit -m "New version."
git push

Time to head to the pub and regale our colleagues with tales of those vanquished yaks!

Well, no, not quite. How is the poor sod who’s been assigned to review my change going to discern the two-line change in logic that implements what I intended to do from the 3,000 line diff that ‘refactors’ all the junk I came across on the way? When I come across a line of code in two years’ time and wonder, “why did I make that change?”, is the comment of ‘New version’ and a bunch of unrelated code changes going to help me?


No. I can’t possibly lump this into a single pull request, never mind a single, atomic, commit. That would be irresponsible. But how do I get out of this mess? It’s perhaps telling that there are three git commands I use so frequently that I’ve aliased them in my shell:

  • git diff, which I’ve aliased to the lovely, short, gd. This tells me the changes I’ve made compared to what I’ve already staged for commit. When I’m just getting started untangling this mess, these are all the changes I’ve made.

  • git diff --cached, which I’ve aliased to gdc. It surprises me that this command is relatively difficult to ‘get to’, because I use it all the time; it really deserves an alias. These are the changes that I’ve staged for commit. This is essentially the work-in-progress of ‘what I’m about to commit next’.

  • git add --patch, which I’ve aliased to gap. This shows me each fragment of change I’ve made (not just each file, but (approximately) each change within a file), asking for each in turn, ‘would you like to stage this fragment for the next commit?’

The combination of these three commands — along with git commit (which I’ve aliased to gc), obviously — and a bit of fiddling allows me to turn my rabbit warren of a working tree into those atomic, cohesive, commits that I desire.
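For reference, the aliases themselves are just one-liners in a shell profile. The names here are the ones mentioned above; they’re personal choices, so pick whatever suits you:

```shell
# Short aliases for the three (plus one) git commands described above.
alias gd='git diff'            # unstaged changes, relative to the index
alias gdc='git diff --cached'  # changes already staged for the next commit
alias gap='git add --patch'    # interactively choose hunks to stage
alias gc='git commit'
```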


Let’s take an (abstract) example. I’ve wound up with a working tree where I’ve implemented two related things. They’re both part of the current story that I’m telling, but they’re obviously discrete components of that story. And then there’s the rabbit hole. I noticed that one of the patterns we were using for the Controller#show action was still living in Rails 2.3, and I’ve updated all of them to the shiny new Rails 4 idioms. While I was doing that, I had to tweak a few models to support the new Rails 4 idioms that the controllers now rely on.

What a mess! We’ve got four sets of changes in the current working tree:

  • The two atomic changes which are related to the current branch — or plot arc, as I like to think of it.

  • The two completely unrelated sets of changes, which are sweeping across the entire code base and, although minor conceptually, are a massive number of line changes.

Let’s start with the two bits that are related to the current plot arc I’m trying to tell. How do I extract them from each other, never mind from the rest of the code? This is where git add --patch comes into its own. It’s going to show you each individual change you’ve made (not just the files, but the individual changes in those files), allowing you to choose whether or not to stage that change for commit.


It’s worth talking about staging at this point. Maybe I should have brought it up earlier, but it’s one of those things I take for granted now. The ‘index’ (the place where changes are staged prior to commit) is a place for you to compose a commit into a cohesive whole before committing it. This offers you the opportunity to gradually add changes to the index, create a cohesive whole, review it, then finally commit to that hand-crafted, artisanal change. This staging area, sitting between what you’re working on and what you’re committing, is the big difference between git and the prior version control systems I’ve used (RCS, CVS, Subversion).

When you’re working through the hunks offered by git add --patch, it comes down to making a semantic choice. What story do you want to tell next (following on naturally from the previous story in this plot arc)? In some cases it’s easy, because there’s a natural ordering: pick the prerequisite first. Otherwise, I tend to just accept the hunk that shows up first, then accept all the other hunks logically associated with it.


When you’ve accepted all the hunks associated with that change, you can review the proposed commit with git diff --cached. This is your chance to cast a final eye over the change, make sure it’s complete, makes sense in isolation, and doesn’t have unrelated changes. This is also your chance for a personal code review:

  • Does the code look correct?

  • Are there unit/integration tests covering the changes you’ve made?

  • Is it accurately, and completely, documented?

  • Have you accidentally left in any debugging code from while you were implementing it?

  • Does the code look pretty? Is there any trailing white space, or duplicate carriage returns, or missing carriage returns for that matter? Are things lined up nicely?

  • Does the commit tell a good story?

This is the time to get inspiration for the commit message, too, which is a fundamental part of the story you’re telling: it’s the headline people will see when they’re reviewing these changes, and it’s what they’ll see when they attempt to find out why this code exists in six months’ time. You’ve already described what you’ve changed by, well, changing the code. We’ll know from the metadata surrounding the commit who changed it and when. The where is, depending upon how you look at it, either irrelevant (I don’t care if you fixed it while you were sitting on the toilet, but I hope you washed your hands afterwards) or supplied by the changes (you changed line 3 of foo.c). The how you changed it is the subject of many editor wars, but is largely irrelevant, too.

Now is the time to explain why you changed it, what problem you were solving at the time (preferably with a reference to the context in which that change was decided).

Now that I’m satisfied with what I’m committing, I commit that chunk. And then I repeat this process for the other, related, changes that are part of this pull request’s plot arc.

Unrelated Changes

But there’s a complication. What about the other, unrelated, changes? They don’t have a place in this plot arc. Sometimes they’re part of a plot arc of their own, but more often than not, they’re like the standalone ‘filler’ episodes that TV writers bung in to fill up a season.

What shall we do with them? They’re separate stories — no matter how small — so they deserve their own separate plot arcs, of course (whether you represent plot arcs in git as patches or pull requests)!

If you’re lucky, then you can create a new branch (new plot arc!) straight away, based on the current master (or whatever your root branch point currently is), keeping your dirty working tree intact:

git checkout -b shiny-refactoring master

then use git add --patch, git diff --cached and git commit to build up and review atomic commits on the new branch for your shiny refactoring.

If your current working tree’s changes will not cleanly apply to master, then the easiest way to deal with it is to temporarily stash them, and pop them from the stash afterwards.

git stash save --include-untracked
git checkout -b shiny-refactoring master
git stash pop

You’ll still need to resolve the conflicts yourself, though.

Splitting Patches

When things get a bit more complicated, git add --patch still has your back. When you’ve got an intertwined set of changes which are close to each other in the source, but are semantically unrelated, it’s a bit more effort to unpick them into individual commits. There are two scenarios here.

If you have two unrelated changes in the same fragment, but they’re only in the same fragment because they share some context, then you can hit s. The fragment will be split into smaller constituent fragments and you’ll be asked if you’d like to stage each one individually.

If the code is seriously intertwined, then you can hit e to ask git to fire up your favourite editor with the patch fragment in question. You can then edit that patch down to the version you want to stage. If the patch is adding a line that you don’t want to add, delete that line from the patch. If the patch is removing a line that you don’t want removed, turn it into a context line (by turning the leading - into a space). This can be a bit finicky, but when you need to do it, it’s awesome.


I’ve had a couple of people pushing back a little on this work flow (usually when I’m in the driving seat while pairing) over the past few years.

The first was from a long-time eXtreme Programmer, who rightly pointed out, “but doesn’t that mean you’re committing untested code?” Even if the entire working tree has a passing test suite, the fragment that I’m staging for commit might not. It’s a fair point and one I occasionally have some angst over. If I’m feeling that angst, then there is a workaround. At the point where I have a set of changes staged and ready to commit, I can tweak the work flow to:

git stash save --keep-index --include-untracked
rake test # Or whatever your testing strategy is
git commit
git stash pop

This will stash away the changes that aren’t staged for commit, then run my full suite of tests, so I can be sure I’ve got a passing test suite for this individual commit.

The other argument I hear is, “why bother?” Well, that’s really the point of this article. I think it’s important to tell stories with my code: each commit should tell an individual story, and each patch/pull-request should tell a single (albeit larger) story, too.

(Not that I always listen to my own advice, of course.)