Shower Power

A few weeks ago, I watched a talk from CUSEC 2012 by Bret Victor called Inventing on Principle where he described the principle by which he lives his life. You should go watch the talk now, it’s awesome. I’ll wait.

Done? Wasn’t it a great talk? That thing with the game, and the future paths of the character … wasn’t that just awesome? I got a bit lost on the overall point of the talk (you should have a guiding principle and you should live your life by it), because I loved Bret’s guiding principle; the whole idea of immediate connections really resonated with me.

Then I realised another thing that’d been on my todo list to read (because it was too beautiful to shove into Instapaper) was Bret’s work too: Up and Down the Ladder of Abstraction. It’s all about gaining a deeper understanding of a concept by exploring it at different levels of abstraction. But again, skipping the point itself, take a look at and interact with the examples. Instead of some dry, boring graphs, or even some sexy (but meaningless) infographics, there are visualisations of the model that you can play with. And playing with these models helps you properly understand how variables interact, helping you to deeply understand the model.

Then I read another of his articles, the Scrubbing Calculator and something went ping in my head. (More on that another day, perhaps.)

I had to try it out on a wee project. It turns out that Bret also has a JavaScript library called Tangle which helps you to create “reactive” documents. These reactive documents allow you to create a narrative flow around a model, letting users pick numbers from the model and change them (by dragging), with each change immediately reflected throughout the rest of the narrative. So the readers of your document get instant feedback on the impact of changing the input values on your model. That’s kinda cool.

As it happens, my son had spent … rather a long time in the shower that morning. We had a wee bit of a “discussion” about the money that was being wasted, but I wasn’t really able to communicate the consequences effectively (“it must be costing a fortune!”, I said to my wife, Annabel). Aha, I have my model. And so Shower Power was born.

It took me a wee while to research the model and understand how to get from “Malcolm spends 30 minutes in the shower” to “which costs £0.61”. Either I wasn’t paying attention in Standard Grade Physics at school, or I haven’t used those brain cells in the intervening 18 years. (Was it really that long ago?!) I got there in the end, and I’ve outlined the algorithm on the about page.

It works, and I rather like it. Have a play around and let me know what you think? I like the idea that there’s some narrative around it, rather than it just being a table of numbers. Given a little bit more thought, the narrative could be better crafted to bubble the important variables closest to the top, and defer the rarely used ones further down. I particularly like that it’s not just numbers you can change, but discrete options too (e.g. how often you shower).

With the people I tested it on, it clearly wasn’t obvious that you could pick an underlined number and drag it to change the effect – maybe a bit of signposting with guiders.js would have helped there?

I was particularly surprised by one of the variables in the model: it doesn’t matter what temperature your boiler heats the water to (well, it has to be higher than the desired shower temperature!). If I hadn’t played around with the model in this way, I’d never have noticed that – in fact, I spent a couple of hours debugging why changing that value didn’t change the results! So now we set the boiler temperature that bit lower, to reduce the chances of scalding from hot taps.
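To see why the boiler temperature drops out, here is a minimal sketch of the physics. It is not the algorithm from the about page. It assumes a mixer shower that blends boiler water with cold mains water, treats a litre of water as a kilogram, and ignores boiler efficiency and heat lost in the pipes. The method name, the flow rate and the temperatures are all made up for illustration.

```ruby
# Specific heat capacity of water, in joules per kilogram per degree Celsius.
SPECIFIC_HEAT = 4186.0
JOULES_PER_KWH = 3_600_000.0

# Energy (in kWh) needed to heat the hot share of a mixed shower.
# All parameters are illustrative assumptions, not the site's real model.
def shower_energy_kwh(minutes:, flow_l_per_min:, cold_c:, shower_c:, boiler_c:)
  raise ArgumentError, "boiler must be hotter than the shower" unless boiler_c > shower_c

  total_kg = minutes * flow_l_per_min # roughly 1 kg per litre
  # Fraction of the flow that must come from the boiler to hit shower_c.
  hot_fraction = (shower_c - cold_c) / (boiler_c - cold_c).to_f
  hot_kg = total_kg * hot_fraction
  # Only the hot share is heated, from mains temperature to boiler temperature.
  joules = hot_kg * SPECIFIC_HEAT * (boiler_c - cold_c)
  joules / JOULES_PER_KWH
end

[50, 60, 70].each do |boiler_c|
  kwh = shower_energy_kwh(minutes: 30, flow_l_per_min: 8,
                          cold_c: 10, shower_c: 40, boiler_c: boiler_c)
  puts "Boiler at #{boiler_c}C: #{kwh.round(3)} kWh"
end
```

A hotter boiler makes each litre of hot water more expensive, but the mixer needs proportionally fewer of them, so the two effects cancel and only the shower temperature, flow rate and duration matter.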

Anyway, it was a fun experiment, and I’m quite pleased with the results.

I’m feeling inspired to build something awesome (I’d tell you, but … bad experiences with that in the past … let me build an MVP first) with this sort of idea in mind. What do you reckon? Could be awesome? Crazy? Just a bit meh?

(Coming soon, a follow up post on how I built what is effectively a single-page static site with Rails (because when your only tool is a hammer, everything is a thumb) and how I used it as an excuse to explore the asset pipeline, efficient client side caching and CDNs.)

How to install a working set of compilers on Mac OS X 10.7 (Lion)

With the advent of Xcode 4.2, Apple have removed GCC from the Xcode installer. Up ’til now, when you installed Xcode (say, 4.1), you’d get:

  • The original GNU Compiler Collection (GCC), version 4.2.1.

  • LLVM-GCC, which is a modified version of GCC designed to emit LLVM’s intermediate representation, so that it can hook into LLVM’s back end optimisers and code generation (as I understand it).

  • Clang, the shiny new compiler freshly built with LLVM goodness all the way through.

However, with Xcode 4.2 onwards, the original GCC variant has disappeared, and /usr/bin/gcc is now being served by the (mostly) compatible llvm-gcc.

This is only an issue if you freshly install Xcode 4.2 on a computer where the developer tools weren’t previously installed. If you upgrade from Xcode 4.1 to 4.2, it will still leave gcc-4.2 lying around (suboptimal package management, but I’m not complaining today!). I suspect this is why more people aren’t getting upset.

It’s entirely understandable for Apple to stop distributing gcc: they’re moving forward, innovating with LLVM, and there’s only so long they should have to maintain a legacy (as far as their commercial platform is concerned, at least) compiler. In fact, llvm-gcc is only there as a temporary crutch; wait ’til they remove that too, leaving us with only Clang…

However, there’s a slight snag: all compilers are not built alike. You’d think that compilers implement a standard correctly, and users of the compiler write code to that standard. It never quite works out that way, though: compilers have bugs and proprietary extensions. Worse still, coders write code until the compiler validates it, not necessarily ’til it’s right.

So, flipping to a new compiler, even gcc -> llvm-gcc, will uncover problems. One of the apps I rely on that has been having problems is Ruby, which is why I wound up messing around with this in the first place.

So, what to do? There are a couple of newly popular mechanisms out there for installing a set of compilers on Mac OS X without installing the whole Xcode behemoth (largely as a way to save disk space):

  • Soren Ionescu’s GCC Without Xcode, which uses the Xcode installer you download from the App Store to generate a slimline installer with just the bits you need. Soren is careful to do it this way so as not to distribute any of Apple’s binary packages on their behalf (something they, naturally, can get a little grumpy about). Unfortunately, this means it’s using the Xcode 4.2 installer, which … doesn’t have GCC.

  • Kenneth Reitz’s OSX GCC Installer, which, in the project’s download section, has a pre-built package, ready for installation. Here’s the key thing: the pre-built package is built against Xcode 4.1, so still includes gcc-4.2.1. Hallelujah, we’re saved. (Kenneth, please for the love of all things that compile, please don’t update that package!)

So, when you’ve got a fresh Lion installation, and you’re looking for something you can build the widest range of apps on (from things that require gcc all the way through to that shiny iOS 5 project you’re working on), do the following:

  • Grab the OSX GCC Installer (as of writing, the 10.7-v2 package is the way forward).

  • Run the package to install it.

  • Grab Xcode from the App Store.

  • (Prior to Xcode 4.3) Run the Xcode installer. (From Xcode 4.3 onwards, it’s a regular app, ready to go in /Applications.)

  • (If you’re on Xcode 4.3+) start Xcode, go to Preferences, then to the Downloads section, then the Components tab. Choose to install the Command Line Tools.

Finally, you should be good to go. /usr/bin/gcc will still point to llvm-gcc-4.2 but at least you’ll have gcc-4.2 installed. If you’re having trouble compiling things, try forcing it to use gcc with CC=gcc-4.2 or equivalent. I think the likes of Homebrew and RVM will try to be clever on your behalf if they can.

Update: How about that? As of Xcode 4.3, /usr/bin/cc now points to clang for fresh installations. The above sequence of steps will still install a full set of compilers for you, though. If you need to force a build to use a particular compiler, use the CC, CXX etc environment variables and 99% of the time you should be good to go.
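If you drive builds from a script, the same idea can be sketched in Ruby: look for a preferred compiler on the PATH and pass it along through CC. The helper names and the preference order here are assumptions for illustration only; Homebrew and RVM have their own logic for this.

```ruby
# Return the full path of the first executable called `name` on the given
# search path, or nil if there isn't one.
def find_executable(name, search_path = ENV.fetch("PATH", ""))
  search_path.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Build an environment hash pointing CC at the first preferred compiler
# that is actually installed. Returns an empty hash if none are found.
def compiler_env(preferred = %w[gcc-4.2 gcc cc], search_path = ENV.fetch("PATH", ""))
  cc = preferred.map { |name| find_executable(name, search_path) }.compact.first
  cc ? { "CC" => cc } : {}
end

# Example use (commented out so loading this file doesn't start a build):
# system(compiler_env, "./configure") && system(compiler_env, "make")
```

Passing the hash as the first argument to `system` sets the variable for that one command only, which is the scripted equivalent of typing `CC=gcc-4.2` in front of it.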

As of Xcode 4.3, you can skip the full Xcode installation and just install the command line tools (Developer Connection login required, but free). However, that just gives you llvm-gcc and Clang. You still need the OSX GCC Installer if you want gcc-4.2.

Understanding the Rails Logger

I post here fairly infrequently and irregularly. I’m sure the fact that I’m posting over at the FreeAgent Engineering Blog too isn’t going to help that at all.

Just yesterday I had a bit of an adventure in the Ruby on Rails logging code. You can find the article on the engineering blog: Understanding the Rails Logger.

If Ruby, Rails, testing, scalability, automating your infrastructure and a wee smattering of accounting is the kind of thing you’re interested in, you could do worse than keeping track of FreeAgent’s engineering blog. I’ve got some very smart colleagues with some really interesting stuff to say…

Ruby Timeout Woes, Part 2

I started digging into how Ruby’s timeout mechanism worked this morning, in order to get to the bottom of a bug we’ve got.

Let me give you a little context. We use Delayed Job to run some of our longer running tasks. Delayed job wraps all its jobs in a timeout, which we’ve set to 20 minutes. That’s a good thing: I don’t really want a job running forever and, consequently, tying up one of our workers forever. So, we’ve got Delayed Job wrapping arbitrary code in Ruby’s built in Timeout. What can possibly go wrong?

Well, it turns out that, for one particular job, the timeout mechanism wasn’t working, and the job was carrying on well past the 20 minute timeout we’d set. Worse still, when a running job exceeds the maximum run time, Delayed Job will assume that the entire worker died, break the lock and hand the job to another worker. So we wound up with every single delayed job worker in our cluster running the same job, to completion, no matter how long it took.

Suboptimal, eh?

I started digging into Delayed Job, our code, and the Timeout implementation to see if I could figure out what was going wrong. Delayed Job is doing fine, nothing unusual there. The Timeout implementation is interesting. It creates a separate thread, which then sleeps for the timeout length. If the main thread completes its block before the timeout, it just kills the timeout thread and carries on happily. However, if the timeout thread wakes up before the main thread has completed execution, then it raises an exception on the main thread. The timeout method catches that exception on the main thread, tidies up and raises a Timeout::Error exception.
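That description can be sketched in a few lines. This is a simplified illustration of the mechanism, not the standard library's source; `sketch_timeout` and `SketchTimeoutError` are made-up names, and the sketch ignores the race where the timer fires just as the block finishes.

```ruby
# Hypothetical error class standing in for Timeout::Error.
class SketchTimeoutError < StandardError; end

def sketch_timeout(seconds)
  caller_thread = Thread.current
  # A private exception class, so only this call recognises its own interrupt.
  interrupt = Class.new(Exception)

  # The timer thread sleeps, then raises the interrupt on the calling thread.
  timer = Thread.new do
    sleep seconds
    caller_thread.raise(interrupt, "execution expired")
  end

  begin
    yield
  rescue interrupt => e
    # Translate the private interrupt into the public timeout error.
    raise SketchTimeoutError, e.message
  ensure
    # If the block finished first, stop the timer before it can fire.
    timer.kill
  end
end
```

Because the interrupt travels up through the block as an ordinary exception, anything inside the block that rescues Exception gets to see it before `sketch_timeout` does.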

There are a few problems with that implementation (every call to Timeout.timeout creates a new thread, and it makes use of Thread.raise and Thread.kill which, as Charles Nutter pointed out a few years back, are a little broken), but we’ll gloss over them for now. That’s not what was causing my woes today. Let’s reduce the problem to a simple example:

require 'timeout'
 
puts "#{Time.now}: Starting"
begin
  Timeout.timeout(5) do
    begin
      sleep 10
    rescue Exception => e
      puts "#{Time.now}: Caught an exception: #{e.inspect}"
    end
    sleep 10
  end
rescue Timeout::Error => e
  puts "#{Time.now}: Timeout: #{e}"
else
  puts "#{Time.now}: Never timed out."
end

Let’s see what happens when we run that wee snippet:

Tue Aug 30 13:38:56 +0100 2011: Starting
Tue Aug 30 13:39:01 +0100 2011: Caught an exception: #<#<Class:0x1001337f0>: execution expired>
Tue Aug 30 13:39:11 +0100 2011: Never timed out.

The inside rescue block is catching some exception after the timeout has expired, but the one expecting the timeout error never gets it. That’s down to the implementation of Timeout. When the timer thread reawakened, it threw an exception on the main thread. The exception it threw on the main thread inherits from Exception, so anything that catches Exception will catch it before it bubbles back up the stack to the timeout method. So, while we’ve timed out the inner block, we’ve neutered the overall effect of the timeout method.

Lessons learned:

  • Catching generic StandardError exceptions is crazy enough, but you probably never want to catch Exception. PS, library authors, your exceptions should inherit from StandardError, not Exception.

  • Ruby’s built in Timeout mechanism is crazy in a whole new and interesting way, too. Be careful how you use it.
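The first lesson can be sketched like this: the inner rescue names StandardError, so ordinary failures are handled but the timeout still reaches the outer rescue. `run_with_timeout` is a made-up wrapper for illustration, and the exception Timeout uses internally has varied between Ruby versions, so treat this as a sketch rather than a guarantee.

```ruby
require 'timeout'

# Hypothetical job wrapper. Returns the block's value, :failed if the
# block raised an ordinary error, or :timed_out if it ran too long.
def run_with_timeout(seconds)
  Timeout.timeout(seconds) do
    begin
      yield
    rescue StandardError
      # Ordinary failures are handled here. Rescuing Exception instead
      # would risk swallowing the timeout's interrupt as well.
      :failed
    end
  end
rescue Timeout::Error
  :timed_out
end
```

The shape is the same as the reduced example above; only the class named in the inner rescue has changed.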

Ruby Timeout Woes, Part 1

I seem to be having a bad day with the built in Timeout class in Ruby. There are two problems; one is pretty innocuous, the other … not so much.

When you’re using Timeout, you’ll typically wrap the block of code you’re wanting to guard like this:

require 'timeout'
 
begin
  Timeout.timeout(10) do
    # Block of code
  end
rescue Timeout::Error => e
  puts "Execution expired"
end

Your block of code will run for up to (approximately) 10 seconds and, if it hasn’t completed in that time, will raise the Timeout::Error exception. Pretty straightforward.

The innocuous issue is just one trying to make me mistrust my memory. In Ruby 1.8.x, Timeout::Error inherits from Interrupt, so its inheritance from Exception goes along the lines of:

Timeout::Error < Interrupt < SignalException < Exception

The key thing to note here is that it doesn’t inherit directly from StandardError and so a blank rescue block won’t catch it:

begin
  Timeout.timeout(10) { sleep 20 }
rescue
  puts "On Ruby 1.8.x I won't catch the timeout exception."
end

However, on Ruby 1.9.2, Timeout::Error inherits from RuntimeError, so in the above code example, the rescue block will get called. That’s annoying, but it’s not like it’s the only incompatible change between Ruby 1.8.x and Ruby 1.9, so I’m OK with that. Plus, non-specific rescue blocks like that are a bad smell anyway.
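If you want to check which behaviour your own Ruby has, a quick sketch like this prints the inheritance chain. `exception_chain` is a made-up helper, and the output depends on the Ruby version you run it on.

```ruby
require 'timeout'

# Walk from an exception class up to Exception, collecting each superclass.
def exception_chain(klass)
  chain = []
  while klass
    chain << klass
    break if klass == Exception
    klass = klass.superclass
  end
  chain
end

puts exception_chain(Timeout::Error).join(" < ")
# A bare rescue only catches StandardError and its descendants.
puts "Caught by a bare rescue? #{Timeout::Error <= StandardError ? 'yes' : 'no'}"
```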

The slightly more insidious problem needs further explanation. Come back again later on and I’ll tell you all about it.

Pimpin’ TextMate (aka Top 5 TextMate Plugins, 2011 Edition)

All the cool kids are using Vim these days. That’s fine, y’all go right ahead, but sorry, it’s not really my thing. I’m perfectly happy with TextMate, thanks.

Don’t get me wrong; I’ve been using various flavours of Vi for 17 years now. I’m reasonably proficient at it, and it’s my tool of choice when I’m, say, editing server configuration files. I can plan and execute a series of changes to a file like a boss, when I already know what I’m doing. The trouble occurs when I’m coding, and what I’m trying to achieve isn’t yet a fully crystallised thought. Then, with Vi, I feel boxed in and a bit blinkered.

It’s just me, I know. Maybe the root of the problem is that I code before thinking. :)

So I still use TextMate, where I feel just a little more free and creative.

Anyway, this isn’t a post about Vim vs TextMate (there’s enough of them already!), I just wanted to get that off my chest.

I seem to be going through a phase of setting up fresh installs of my various desktop and laptop computers again (blame Lion). Instead of blindly installing all the same tools as I’m used to, I figured I’d reassess the landscape and see what’s out there. Here are a few plugins for TextMate I’ve tried, and liked.

  • PeepOpen is a file finder from the lovely folks at PeepCode. It replaces the default “Go to file…” (⌘-T) pane with one that’s a good deal smarter. Instead of matching just on the filename, it’ll match on the entire path. This is invaluable if your project has 300 files called show.html.erb. If I’m looking for, say, an article’s show template in a Rails project, typing avartsh (app/views/articles/show.html.erb) will get me there quickly.

    PeepOpen is $12, but it’s free as part of your PeepCode Unlimited subscription which you’d be crazy not to have anyway, right?

  • EGOTextMateFullScreen is a plugin that gives you native full screen support on Mac OS X Lion. On my desktop computer, a 27″ iMac, I didn’t really see the point of full screen support (I much prefer a few windows tiled with the assistance of SizeUp). But then I got my hands on a MacBook Air and suddenly it made a whole lot more sense.

  • If you’re making use of the full screen support, then there’s one small snag: the drawer isn’t visible. I quite like the drawer; not to find files – because PeepOpen does a much better job – but to remind me of the context I’m in. When you’re editing a file, and need a reminder of the context, hit ctrl-⌘-R and your current file will be highlighted in the project tree.

    So if, like me, you’d miss the drawer in full screen mode, the old standby is the MissingDrawer plugin. This turns the drawer into a sidebar, which is part of the main window and, therefore, still visible in full screen mode.

  • Finally, project-wide search. The default Find in Project is a little … slow. And a bit beachball-y. For a while I’d been using a bundle that replaces the default Find in Project with one powered by Ack. Much faster, and a little more flexible too. When I went searching for it this time around, though, I discovered AckMate, a plugin which provides a more integrated experience. It’s fast. It can show you the context of the search result. It even allows you to constrain the results to particular file types. In short, it’s awesome. Be sure to read the instructions on how to set the shortcut key to the default “Find in Project…” one.
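The whole-path matching described for PeepOpen above can be approximated with a simple subsequence match: every character of the query has to appear in the path, in order. This is only a sketch of the idea, not PeepOpen's actual algorithm (which presumably also ranks its results); the method names and the file list are made up.

```ruby
# Turn "avartsh" into a pattern like /a.*v.*a.*r.*t.*s.*h/i, so the query's
# characters must appear in order anywhere in the path.
def fuzzy_pattern(query)
  Regexp.new(query.chars.map { |char| Regexp.escape(char) }.join(".*"),
             Regexp::IGNORECASE)
end

# Keep only the paths that contain the query as a subsequence.
def fuzzy_filter(query, paths)
  pattern = fuzzy_pattern(query)
  paths.select { |path| path =~ pattern }
end

paths = [
  "app/views/articles/show.html.erb",
  "app/views/users/show.html.erb",
  "app/models/article.rb"
]
puts fuzzy_filter("avartsh", paths)
```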

There’s one final plugin that I’m planning on evaluating, but haven’t quite gotten around to yet. The one, single, feature I liked in my brief affair with RubyMine was the ability to jump to a method definition. I’ve got a similar mechanism set up in Vim using exuberant ctags which works really well. (And back in the days before Mac OS X when Emacs was my hammer – and every C file a thumb – I lived by my TAGS files.) How I’d love project-wide “Go to symbol…”. There’s a plugin out there called TmCodeBrowser that appears to do exactly what I’m after. I’m slightly wary as it hasn’t been updated in quite some time, but perhaps that’s because it still just works. I’ll evaluate it soon and update this post when I do.

There’s one plugin I wish existed, but doesn’t seem to: split windows. I’ve got plenty of screen real estate. It would be crazy awesome to be able to have my model and its tests sitting side by side with a vertical split. If it’s possible to write a TextMate plugin that realises this, I would happily pay $12 for it (same price point as PeepOpen). Just so you know. ;-)

That’s it. I still love TextMate and I spend several hours working in it most days. Thanks for making code editing a lovely experience, Allan!

Using tcpflow

Sometimes, when you’re writing applications that use a library to talk over the wire to a remote service, it’s difficult to see how the high level API the library exposes translates into the on-the-wire protocol. Funnily enough, I was having that very problem yesterday, so I dug tcpflow out of my toolbox to better understand what was happening.

I was writing a client for the [REDACTED] (for now, at least!) API, for a client project. I’d decided to use this as an excuse to learn EventMachine and em-http-request to talk to the remote API. Given the pattern of use I’m expecting, a reactor-patterned daemon feels like a really good fit.

It was a weird experience — it’s the first time I’ve done any reactor pattern development in anger since ~2004 when I was messing around with using Python’s Twisted framework for small TCP server applications. (Ah, them were the days, before HTTP became the hammer to everybody’s wire protocol thumb nail.) But I digress…

I was having trouble understanding why the em-http-request requests I was making weren’t having the intended result. There were two specific problems I saw over the course of the day:

  • The path I was sending in, according to the documentation, using request.get :path => '/api/v2/chickens.json', didn’t appear to be what was being requested (in that everything I requested looked like it was getting the same response as a request to GET /).

  • After the initial request, subsequent requests reusing the same HttpRequest object would silently fail. This was slightly more bizarre.

In solving either of these problems, I could have dived into the em-http-request source and traced what was happening. In fact, that was my first port of call but, being rusty at the reactor pattern, I was having trouble following the flow of execution. (As it turns out, if I’d done this properly, I should have spotted the problem straight away, but we’ll get to that.)

However, in this instance, I decided to treat the library I was using as a black box and instead examine what it was generating and consuming at the other end.

Enter tcpflow. It’s been part of my arsenal of network debugging tools for as long as I can remember. It is similar to tcpdump, but that typically just shows you the IP packets on the wire. tcpflow attempts to reconstruct the actual TCP streams to give you an idea of the conversation going on.

Let’s install it. I’m on a Mac, and I’m using Homebrew, so it’s just a case of:

brew install tcpflow

Your mileage may vary, but it’s a piece of software that’s been around for a long time, so I bet there’s a package for your system. If you’re using Homebrew, it will install the binary as your user and not setuid to root, which means that, in order to access the packet capture device, you’ll have to make sure and run it through sudo. The examples I show all have sudo in ‘em since that’s what I had to do.

tcpflow uses libpcap as the underlying packet capture library, the very same one as is used by tcpdump. This means that the syntax for specifying the information you want to capture will be familiar if you’ve used that tool before.

Last piece of background information before we get into solving the problems. You’ll need to figure out the network interface you’re using as that’s the interface it will capture packets from. Short version (since a longer version is out of scope!): if you’re on a Mac, it’s highly likely to be en0 if you’re on wired Ethernet and en1 if you’re on wireless. If you’re using Linux and you’re on a wired network, chances are it’ll be eth0. My Mac laptop is on a wireless network right now, so the examples show me using en1. Examine the output of ifconfig to determine your active network interface.

So, that’s the theory, let’s see this thing in action. The first problem is that we’re not getting the response we’re expecting from making particular HTTP requests. Let’s see what’s happening on the wire with tcpflow:

sudo tcpflow -c -i en1 src or dst host api.example.com

Breaking that down:

  • -c spits the flows out on stdout. The normal operation is to create a file for each flow, in each direction. If you’re capturing a lot of data, or pipelined conversations, files are much more sane, but for small conversations like this, stdout is really convenient.

  • -i en1 is capturing from my wireless device, as discussed above.

  • src or dst host api.example.com is the expression used to describe the flows that we want to capture. In essence, this is saying that we want to capture flows of data that are sent to, or received from, api.example.com. The language is pretty rich, allowing for a bunch of complex rules to narrow down the data captured, but I tend to find that specifying a remote host and/or port is enough.

Let’s see what that gets us. I’ve created a Ruby script, the essence of which is:

EM.run do
  request = EventMachine::HttpRequest.new('http://api.example.com/')
  deferrable = request.get :path => '/api/v2/chickens.json'
  deferrable.callback { puts "It worked"; EM.stop }
  deferrable.errback  { puts "It failed"; EM.stop }
end

Let’s see what the on-the-wire communication turns into:

172.017.012.012.53284-010.011.012.234.00080: GET / HTTP/1.1
User-Agent: EventMachine HttpClient
Host: api.example.com


010.011.012.234.00080-172.017.012.012.53284: HTTP/1.1 301 Moved Permanently
Date: Sat, 05 Mar 2011 11:59:55 GMT
Content-Type: text/html
Content-Length: 178
Connection: close
Location: http://www.example.com/

[ body content elided ]

That’s annoying, because I’d specifically requested /api/v2/chickens.json, but it’s actually requesting / (the GET / HTTP/1.1 line). What’s with that? Well, it turns out that the published stable gem for em-http-request doesn’t actually implement the :path option; I need to install the beta 1.0 gem to get that. Oops. Moral of the story: if you’re going to read the source to the gem you’re using, read the source to the installed version you’re using, not just the latest version on GitHub!
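One way to find out which version is actually installed, and where its source lives on disk, is to ask RubyGems. A minimal sketch, where `installed_versions` is a made-up helper name:

```ruby
require 'rubygems'

# List every installed version of a gem along with the directory its
# source lives in, so you can read the code you're really running.
def installed_versions(gem_name)
  Gem::Specification.find_all_by_name(gem_name).map do |spec|
    { :version => spec.version.to_s, :path => spec.full_gem_path }
  end
end

installed_versions("em-http-request").each do |info|
  puts "#{info[:version]}  #{info[:path]}"
end
```

If the gem isn't installed at all, the list is simply empty and nothing is printed.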

The second problem is a little more subtle. I got to the stage where the first request was working successfully, but subsequent requests reusing the same HttpRequest would fail miserably, just waiting for a response that never arrived. An example:

EM.run do
  request = EventMachine::HttpRequest.new('http://api.example.com/')
  deferrable = request.get :path => '/api/v2/chickens.json'
  deferrable.callback do
    deferred_eggs = request.get :path => '/api/v2/chickens/1/eggs.json'
    deferred_eggs.callback { puts "It worked"; EM.stop }
    deferred_eggs.errback  { puts "Egg retrieval failed"; EM.stop }
  end
  deferrable.errback  { puts "It failed"; EM.stop }
end

Let’s take a wee look at the over-the-wire conversation as captured by tcpflow:

172.017.012.012.54442-010.011.012.234.00080: GET /api/v2/chickens.json HTTP/1.1
User-Agent: EventMachine HttpClient
Host: api.example.com


010.011.012.234.00080-172.017.012.012.54442: HTTP/1.1 200 OK
Date: Sat, 05 Mar 2011 13:39:47 GMT
Content-Type: application/json;charset=ISO-8859-1
Connection: close
Set-Cookie: JSESSIONID=B4D6AE4456BC6D93E6B3441D4FEC6946; Path=/api/v2
Content-Language: en-US

[ body content elided ]


172.017.012.012.54442-010.011.012.234.00080: GET /api/v2/chickens/1/eggs.json HTTP/1.1
User-Agent: EventMachine HttpClient
Host: api.example.com

Here we see the initial request, the response and the second request, but no response. That’s really weird. Isn’t it? Then I spotted something. Do you see the initial line of each flow, with a string of numbers? That’s the source IP and port, then the destination IP and port. Let’s break the first one apart:

172.017.012.012.54442-010.011.012.234.00080

which translates to:

  • the client IP address is 172.17.12.12. That’s the internal IP of my laptop on my home network.
  • the client TCP port is 54442. Both sides of a TCP connection get a port, and the client side usually has a random, unused, high port number chosen for it by the kernel. Each TCP connection gets a different client port, and they’re typically not reused for a while after you’re done with them.
  • the server IP address is 10.11.12.234 which I cleverly remembered to change just now, lest you discover who [REDACTED] is. ;)
  • the server TCP port is 80, the well known port for HTTP.
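The breakdown above can be sketched as a small parser. It assumes the zero-padded format shown in the captures here (four three-digit octets and a five-digit port on each side of the hyphen); the struct and method names are made up for illustration.

```ruby
Endpoint = Struct.new(:ip, :port)
Flow = Struct.new(:source, :destination)

# Turn "172.017.012.012.54442" into an Endpoint with the padding removed.
def parse_endpoint(text)
  parts = text.split(".")
  raise ArgumentError, "expected four octets and a port: #{text}" unless parts.size == 5

  # to_i(10) reads the digits as decimal, so "017" becomes 17, not octal 15.
  octets = parts.first(4).map { |part| part.to_i(10) }
  Endpoint.new(octets.join("."), parts.last.to_i(10))
end

# Split a flow header into its source and destination endpoints.
def parse_flow(header)
  source, destination = header.chomp(":").split("-", 2)
  raise ArgumentError, "expected source-destination: #{header}" unless destination

  Flow.new(parse_endpoint(source), parse_endpoint(destination))
end

flow = parse_flow("172.017.012.012.54442-010.011.012.234.00080")
puts "#{flow.source.ip}:#{flow.source.port} -> #{flow.destination.ip}:#{flow.destination.port}"
```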

Now, let’s look at the second request’s connection information:

172.017.012.012.54442-010.011.012.234.00080

Isn’t that interesting? They’ve both got the same client TCP port. Now, we’re not doing keepalive, and the server responded with Connection: close (which, sadly, it does, even if I do attempt to switch keepalive on). So, as far as the server’s concerned, the TCP session is done and it has closed the connection. But the client, em-http-request, hasn’t closed its end, and has decided to send another request along the same TCP connection. Since the server has already closed its side of the connection, it never sees the second request and, naturally, never responds.
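Spotting this by eye works for a short capture, but the same check can be sketched in code: scan the output of tcpflow -c and flag any request sent on a connection after the server has answered with Connection: close. This assumes the console format shown in the captures above, with each message starting with a zero-padded flow header; `requests_after_close` is a made-up name.

```ruby
# Matches "src-dst: first line of the message" as printed by tcpflow -c.
FLOW_LINE = /\A(\d{3}(?:\.\d{3}){3}\.\d{5})-(\d{3}(?:\.\d{3}){3}\.\d{5}): (.*)\z/

# Return the request lines that were sent on a connection the server had
# already announced it was closing.
def requests_after_close(capture)
  closed = {}         # connection => true once the server said "close"
  connection = nil    # endpoints of the message currently being read
  in_response = false
  offenders = []

  capture.each_line do |raw|
    line = raw.chomp
    if (match = FLOW_LINE.match(line))
      # Sort the endpoints so both directions map to the same connection.
      connection = [match[1], match[2]].sort
      first_line = match[3]
      in_response = first_line.start_with?("HTTP/")
      offenders << first_line if !in_response && closed[connection]
    elsif in_response && line =~ /\AConnection:\s*close\s*\z/i
      closed[connection] = true
    end
  end

  offenders
end
```

Run against the conversation above, it picks out the second GET, because that request shares its client port with a connection the server has already said it would close.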

I smell a bug with em-http-request that I’ve reported here: #77 Reusing HttpRequest.

I hope this has served as a useful introduction to tcpflow. It’s a great tool for discovering what really happens on the wire when you’re using higher level APIs and libraries. It really comes into its own when the library is, say, closed source and you have to deal with it as a black box. There are plenty of higher level tools with GUIs that can help you with this kind of task too, but I really like that tcpflow does one thing and does it really well. Thanks, Jeremy!