Give me back my # key!

This particular tip may have an audience of approximately one, since:

  • It’s only going to bother Mac users;
  • who are in the UK;
  • who use the # (hash, pound, whatever you call it) key much; and
  • who have enabled “Use option as meta key” in Terminal.app in order to sanely use Emacs.

If you tick all these boxes, and have suddenly discovered that your # key (which is option-3 for UK-keyboard-wielding Mac users) no longer works, you’ve come to the right place!

The problem is that M-3 is bound to digit-argument. This allows you to repeat commands (e.g. if you type M-3 c, then it will output ccc) which I’ve got to admit I don’t use terribly often.

It turns out that the bindings for keys are controlled by the readline library and you can customise them with readline’s configuration file, ~/.inputrc. If you want to override the default behaviour of M-3 and turn it back to emitting the # symbol, put the following in that file:

"\e3": '#'

You’ll need to restart bash (or force it to reload its inputrc file with C-x C-r) in order for it to take effect.

Here’s another related tip, while I’m here. I picked this one up from Ross a few years ago. I almost never use the UK currency symbol (£) in a shell environment. On the other hand, there’s another operation I perform quite often at the command line: commenting out the command I’m currently typing. It usually happens when I’m doing a sequence of steps at the command line, and I realise that I’ve forgotten a prerequisite step. Rather than sticking the current line in the kill ring (C-e C-u or C-a C-k), then remembering to yank it (C-y) again, I tend to jump to the start of the line (C-a), stick a hash in (to comment the line out), then hit enter. That way it’s in my (searchable) bash history for when I next need it.

But that’s a fairly cumbersome sequence too, so let’s shorten it. Stick the following in ~/.inputrc:

"£": '\C-a#\C-m'

and restart your shell. Now when you’re part way through typing a command line and want to switch tracks, hit the £ sign and you’re done. When you want the command back, retrieve it from your shell history (C-r to search!), then hit C-a C-d to remove the comment sign and you’re good to go.

One last wee trick. You really can remap any key to any other key. Imagine the fun and hilarity of the following in your ~/.inputrc:

"l": 'r'
"s": 'm'

:-)

Update: My colleague, Mihai, points out that the only appreciable difference between the UK and US keyboard layout on the Mac is that the # and £ key combinations are swapped around (oh, and it’s easier to find the ™ than the € now, not that I use either). Since I’ve just discovered that my irb isn’t picking up the ~/.inputrc and doing the right thing, I reckon I’ll just switch to the US layout…

Understanding the Rails Logger

I post here fairly infrequently and irregularly. I’m sure the fact that I’m posting over at the FreeAgent Engineering Blog too isn’t going to help that at all.

Just yesterday I had a bit of an adventure in the Ruby on Rails logging code. You can find the article on the engineering blog: Understanding the Rails Logger.

If Ruby, Rails, testing, scalability, automating your infrastructure and a wee smattering of accounting is the kind of thing you’re interested in, you could do worse than keeping track of FreeAgent’s engineering blog. I’ve got some very smart colleagues with some really interesting stuff to say…

Ruby Timeout Woes, Part 2

I started digging into how Ruby’s timeout mechanism worked this morning, in order to get to the bottom of a bug we’ve got.

Let me give you a little context. We use Delayed Job to run some of our longer-running tasks. Delayed Job wraps all its jobs in a timeout, which we’ve set to 20 minutes. That’s a good thing: I don’t really want a job running forever and, consequently, tying up one of our workers forever. So, we’ve got Delayed Job wrapping arbitrary code in Ruby’s built-in Timeout. What can possibly go wrong?

Well, it turns out that, for one particular job, the timeout mechanism wasn’t working, and the job was carrying on well past the 20 minute timeout we’d set. Worse still, when a running job exceeds the maximum run time, Delayed Job will assume that the entire worker died, break the lock and hand the job to another worker. So we wound up with every single delayed job worker in our cluster running the same job, to completion, no matter how long it took.

Suboptimal, eh?

I started digging into Delayed Job, our code, and the Timeout implementation to see if I could figure out what was going wrong. Delayed Job is doing fine, nothing unusual there. The Timeout implementation is interesting. It creates a separate thread, which then sleeps for the timeout length. If the main thread completes its block before the timeout, it just kills the timeout thread and carries on happily. However, if the timeout thread wakes up before the main thread has completed execution, then it raises an exception on the main thread. The timeout method catches that exception on the main thread, tidies up and raises a Timeout::Error exception.
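To make that description a bit more concrete, here’s a minimal sketch of the mechanism. It is not the real timeout.rb source: sketch_timeout, SketchExpired and SketchTimeoutError are names I’ve made up for illustration, and the real implementation deals with edge cases that this one happily ignores.

```ruby
# Hypothetical, simplified stand-in for Timeout.timeout (illustration only).

# Internal exception raised on the main thread by the timer thread.
# Note that it inherits from Exception, not StandardError.
class SketchExpired < Exception; end

# The error that callers of sketch_timeout are expected to rescue.
class SketchTimeoutError < StandardError; end

def sketch_timeout(seconds)
  main = Thread.current

  # The timer thread sleeps for the timeout length, then interrupts
  # the main thread by raising an exception on it.
  timer = Thread.new do
    sleep seconds
    main.raise SketchExpired, "execution expired"
  end

  begin
    yield
  rescue SketchExpired => e
    # Tidy up and translate the internal exception into the public one.
    raise SketchTimeoutError, e.message
  ensure
    # Whether the block finished or not, the timer is no longer needed.
    timer.kill
    timer.join
  end
end
```

If the block finishes first, sketch_timeout returns the block’s value and the timer thread is killed. If the timer wakes up first, the caller sees a SketchTimeoutError, provided nothing inside the block rescues the internal exception on the way up.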

There are a few problems with that implementation (every call to Timeout.timeout creates a new thread, and it makes use of Thread.raise and Thread.kill which, as Charles Nutter pointed out a few years back, are a little broken), but we’ll gloss over them for now. That’s not what was causing my woes today. Let’s reduce the problem to a simple example:

require 'timeout'
 
puts "#{Time.now}: Starting"
begin
  Timeout.timeout(5) do
    begin
      sleep 10
    rescue Exception => e
      puts "#{Time.now}: Caught an exception: #{e.inspect}"
    end
    sleep 10
  end
rescue Timeout::Error => e
  puts "#{Time.now}: Timeout: #{e}"
else
  puts "#{Time.now}: Never timed out."
end

Let’s see what happens when we run that wee snippet:

Tue Aug 30 13:38:56 +0100 2011: Starting
Tue Aug 30 13:39:01 +0100 2011: Caught an exception: #<#<Class:0x1001337f0>: execution expired>
Tue Aug 30 13:39:11 +0100 2011: Never timed out.

The inside rescue block is catching some exception after the timeout has expired, but the one expecting the timeout error never gets it. That’s down to the implementation of Timeout. When the timer thread reawakened, it threw an exception on the main thread. The exception it threw on the main thread inherits from Exception, so anything that catches Exception will catch it before it bubbles back up the stack to the timeout method. So, while we’ve timed out the inner block, we’ve neutered the overall effect of the timeout method.
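Here’s a stripped-down sketch of that effect which doesn’t use the Timeout library at all, just a thread raising an exception on the main thread. Expired and run_with are hypothetical names for illustration; the only thing that varies between the two calls is which exception class the inner rescue is willing to catch.

```ruby
# Stands in for the internal exception the timer thread raises.
# Like the one in the output above, it inherits from Exception.
class Expired < Exception; end

# Runs a slow "job" while a timer thread interrupts it after 0.1 seconds.
# rescue_class is the class named by the job's own (inner) rescue clause.
def run_with(rescue_class)
  main = Thread.current
  timer = Thread.new do
    sleep 0.1
    main.raise Expired, "execution expired"
  end

  begin
    sleep 1
    :finished
  rescue rescue_class
    # The job's own error handling. If this matches Expired, the
    # interruption stops here and never reaches the outer rescue.
    :swallowed
  end
rescue Expired
  # Where the timeout wrapper would normally see the interruption.
  :propagated
ensure
  timer.kill
end

swallowed  = run_with(Exception)      # inner rescue eats the interruption
propagated = run_with(StandardError)  # interruption reaches the wrapper
```

With rescue Exception the inner handler gets the interruption and the wrapper never hears about it; with rescue StandardError it sails past the inner handler and reaches the outer one, which is what you want.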

Lessons learned:

  • Catching generic StandardError exceptions is crazy enough, but you probably never want to catch Exception. PS, library authors, your exceptions should inherit from StandardError, not Exception.

  • Ruby’s built in Timeout mechanism is crazy in a whole new and interesting way, too. Be careful how you use it.

Ruby Timeout Woes, Part 1

I seem to be having a bad day with the built-in Timeout module in Ruby. There are two problems; one is pretty innocuous, the other … not so much.

When you’re using Timeout, you’ll typically wrap the block of code you’re wanting to guard like this:

require 'timeout'
 
begin
  Timeout.timeout(10) do
    # Block of code
  end
rescue Timeout::Error => e
  puts "Execution expired"
end

Your block of code will run for up to (approximately) 10 seconds and, if it hasn’t completed in that time, will raise the Timeout::Error exception. Pretty straightforward.

The innocuous issue is just one trying to make me mistrust my memory. In Ruby 1.8.x, Timeout::Error inherits from Interrupt, so its inheritance from Exception goes along the lines of:

Timeout::Error < Interrupt < SignalException < Exception

The key thing to note here is that it doesn’t inherit from StandardError at all, and so a bare rescue (which only catches StandardError and its subclasses) won’t catch it:

begin
  Timeout.timeout(10) { sleep 20 }
rescue
  puts "On Ruby 1.8.x I won't catch the timeout exception."
end

However, on Ruby 1.9.2, Timeout::Error inherits from RuntimeError, so in the above code example, the rescue block will get called. That’s annoying, but it’s not like it’s the only incompatible change between Ruby 1.8.x and Ruby 1.9, so I’m OK with that. Plus, non-specific rescue blocks like that are a bad smell anyway.
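If you’d rather check than trust your memory, you can ask Ruby directly. This is a small sketch with two helper names of my own invention (exception_chain and caught_by_bare_rescue?); it assumes you’re running Ruby 1.9.2 or later, where Timeout::Error descends from RuntimeError as described above.

```ruby
require 'timeout'

# Returns the superclass chain of an exception class, up to and
# including Exception, ignoring any mixed-in modules.
def exception_chain(klass)
  klass.ancestors.grep(Class).take_while { |k| k <= Exception }
end

# A bare rescue only catches StandardError and its subclasses.
def caught_by_bare_rescue?(klass)
  klass.ancestors.include?(StandardError)
end

puts exception_chain(Timeout::Error).join(' < ')
puts "Bare rescue catches Timeout::Error? #{caught_by_bare_rescue?(Timeout::Error)}"
puts "Bare rescue catches Interrupt?      #{caught_by_bare_rescue?(Interrupt)}"
```

Interrupt is included for comparison because it’s what Timeout::Error inherited from on 1.8.x, and it still sits outside the StandardError branch.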

The slightly more insidious problem needs further explanation. Come back again later on and I’ll tell you all about it.

Pimpin’ TextMate (aka Top 5 TextMate Plugins, 2011 Edition)

All the cool kids are using Vim these days. That’s fine, y’all go right ahead, but sorry, it’s not really my thing. I’m perfectly happy with TextMate, thanks.

Don’t get me wrong; I’ve been using various flavours of Vi for 17 years now. I’m reasonably proficient at it, and it’s my tool of choice when I’m, say, editing server configuration files. I can plan and execute a series of changes to a file like a boss, when I already know what I’m doing. The trouble occurs when I’m coding, and what I’m trying to achieve isn’t yet a fully crystallised thought. Then, with Vi, I feel boxed in and a bit blinkered.

It’s just me, I know. Maybe the root of the problem is that I code before thinking. :)

So I still use TextMate, where I feel just a little more free and creative.

Anyway, this isn’t a post about Vim vs TextMate (there’s enough of them already!), I just wanted to get that off my chest.

I seem to be going through a phase of setting up fresh installs of my various desktop and laptop computers again (blame Lion). Instead of blindly installing all the same tools as I’m used to, I figured I’d reassess the landscape and see what’s out there. Here are a few plugins for TextMate I’ve tried, and liked.

  • PeepOpen is a file finder from the lovely folks at PeepCode. It replaces the default “Go to file…” (⌘-T) pane with one that’s a good deal smarter. Instead of matching just on the filename, it’ll match on the entire path. This is invaluable if your project has 300 files called show.html.erb. If I’m looking for, say, an article’s show template in a Rails project, typing avartsh (app/views/articles/show.html.erb) will get me there quickly.

    PeepOpen is $12, but it’s free as part of your PeepCode Unlimited subscription which you’d be crazy not to have anyway, right?

  • EGOTextMateFullScreen is a plugin that gives you native full screen support on Mac OS X Lion. On my desktop computer, a 27″ iMac, I didn’t really see the point of full screen support (I much prefer a few windows tiled with the assistance of SizeUp). But then I got my hands on a MacBook Air and suddenly it made a whole lot more sense.

  • If you’re making use of the full screen support, then there’s one small snag: the drawer isn’t visible. I quite like the drawer; not to find files – because PeepOpen does a much better job – but to remind me of the context I’m in. When you’re editing a file, and need a reminder of the context, hit ctrl-⌘-R and your current file will be highlighted in the project tree.

    So if, like me, you’d miss the drawer in full screen mode, the old standby is the MissingDrawer plugin. This turns the drawer into a sidebar, which is part of the main window and, therefore, still visible in full screen mode.

  • Finally, project-wide search. The default Find in Project is a little … slow. And a bit beachball-y. For a while I’d been using a bundle that replaces the default Find in Project with one powered by Ack. Much faster, and a little more flexible too. When I went searching for it this time around, though, I discovered AckMate, a plugin which provides a more integrated experience. It’s fast. It can show you the context of the search result. It even allows you to constrain the results to particular file types. In short, it’s awesome. Be sure to read the instructions on how to set the shortcut key to the default “Find in Project…” one.
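As an aside on the PeepOpen bullet above: I don’t know how PeepOpen actually implements its matching, but the behaviour I described (typing avartsh to find app/views/articles/show.html.erb) looks like “the typed characters appear in the path, in order”. Here’s a rough sketch of that idea; path_matches? is a hypothetical helper, and the real plugin presumably does something smarter, including ranking the results.

```ruby
# Returns true if every character of query appears in path, in order,
# with anything at all allowed in between (case-insensitive).
def path_matches?(query, path)
  # "avartsh" becomes the pattern a.*v.*a.*r.*t.*s.*h
  pattern = Regexp.new(
    query.chars.map { |c| Regexp.escape(c) }.join('.*'),
    Regexp::IGNORECASE
  )
  !!(path =~ pattern)
end

paths = [
  'app/views/articles/show.html.erb',
  'app/views/comments/show.html.erb',
  'app/models/article.rb'
]

# Keep only the paths that the abbreviation could refer to.
candidates = paths.select { |path| path_matches?('avartsh', path) }
```

Matching against the whole path rather than just the filename is what lets a handful of keystrokes pick one show.html.erb out of hundreds.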

There’s one final plugin that I’m planning on evaluating, but haven’t quite gotten around to yet. The one, single, feature I liked in my brief affair with RubyMine was the ability to jump to a method definition. I’ve got a similar mechanism set up in Vim using exuberant ctags which works really well. (And back in the days before Mac OS X when Emacs was my hammer – and every C file a thumb – I lived by my TAGS files.) How I’d love project-wide “Go to symbol…”. There’s a plugin out there called TmCodeBrowser that appears to do exactly what I’m after. I’m slightly wary as it hasn’t been updated in quite some time, but perhaps that’s because it still just works. I’ll evaluate it soon and update this post when I do.

There’s one plugin I wish existed, but doesn’t seem to: split windows. I’ve got plenty of screen real estate. It would be crazy awesome to be able to have my model and its tests sitting side by side with a vertical split. If it’s possible to write a TextMate plugin that realises this, I would happily pay $12 for it (same price point as PeepOpen). Just so you know. ;-)

That’s it. I still love TextMate and I spend several hours working in it most days. Thanks for making code editing a lovely experience, Allan!

Using tcpflow

Sometimes, when you’re writing applications that use a library to talk over the wire to a remote service, it’s difficult to see how the high level API the library exposes translates into the on-the-wire protocol. Funnily enough, I was having that very problem yesterday, so I dug tcpflow out my toolbox to better understand what was happening.

I was writing a client for the [REDACTED] (for now, at least!) API, for a client project. I’d decided to use this as an excuse to learn EventMachine and em-http-request to talk to the remote API. Given the pattern of use I’m expecting, a reactor-patterned daemon feels like a really good fit.

It was a weird experience — it’s the first time I’ve done any reactor pattern development in anger since ~2004 when I was messing around with using Python’s Twisted framework for small TCP server applications. (Ah, them were the days, before HTTP became the hammer to everybody’s wire protocol thumb nail.) But I digress…

I was having trouble understanding why the em-http-request requests I was making weren’t having the intended result. There were two specific problems I saw over the course of the day:

  • The path I was sending in, according to the documentation, using request.get :path => '/api/v2/chickens.json', didn’t appear to be what was being requested (in that everything I requested looked like it was getting the same response as a request to GET /).

  • After the initial request, subsequent requests reusing the same HttpRequest object would silently fail. This was slightly more bizarre.

In solving either of these problems, I could have dived into the em-http-request source and traced what was happening. In fact, that was my first port of call but, being rusty at the reactor pattern, I was having trouble following the flow of execution. (As it turns out, if I’d done this properly, I would have spotted the problem straight away, but we’ll get to that.)

However, in this instance, I decided to treat the library I was using as a black box and instead examine what it was generating and consuming at the other end.

Enter tcpflow. It’s been part of my arsenal of network debugging tools for as long as I can remember. It is similar to tcpdump, but that typically just shows you the IP packets on the wire. tcpflow attempts to reconstruct the actual TCP streams to give you an idea of the conversation going on.

Let’s install it. I’m on a Mac, and I’m using Homebrew, so it’s just a case of:

brew install tcpflow

Your mileage may vary, but it’s a piece of software that’s been around for a long time, so I bet there’s a package for your system. If you’re using Homebrew, it will install the binary as your user and not setuid to root, which means that, in order to access the packet capture device, you’ll have to make sure and run it through sudo. The examples I show all have sudo in ‘em since that’s what I had to do.

tcpflow uses libpcap as the underlying packet capture library, the very same one as is used by tcpdump. This means that the syntax for specifying the information you want to capture will be familiar if you’ve used that tool before.

Last piece of background information before we get into solving the problems. You’ll need to figure out the network interface you’re using as that’s the interface it will capture packets from. Short version (since a longer version is out of scope!): if you’re on a Mac, it’s highly likely to be en0 if you’re on wired Ethernet and en1 if you’re on wireless. If you’re using Linux and you’re on a wired network, chances are it’ll be eth0. My Mac laptop is on a wireless network right now, so the examples show me using en1. Examine the output of ifconfig to determine your active network interface.

So, that’s the theory, let’s see this thing in action. The first problem is that we’re not getting the response we’re expecting from making particular HTTP requests. Let’s see what’s happening on the wire with tcpflow:

sudo tcpflow -c -n en1 src or dst host api.example.com

Breaking that down:

  • -c spits the flows out on stdout. The normal operation is to create a file for each flow, in each direction. If you’re capturing a lot of data, or pipelined conversations, files are much more sane, but for small conversations like this, stdout is really convenient.

  • -n en1 is capturing from my wireless device, as discussed above.

  • src or dst host api.example.com is the expression used to describe the flows that we want to capture. In essence, this is saying that we want to capture flows of data that are sent to, or received from, api.example.com. The language is pretty rich, allowing for a bunch of complex rules to narrow down the data captured, but I tend to find that specifying a remote host and/or port is enough.

Let’s see what that gets us. I’ve created a Ruby script, the essence of which is:

EM.run do
  request = EventMachine::HttpRequest.new('http://api.example.com/')
  deferrable = request.get :path => '/api/v2/chickens.json'
  deferrable.callback { puts "It worked"; EM.stop }
  deferrable.errback  { puts "It failed"; EM.stop }
end

Let’s see what the on-the-wire communication turns into:

172.017.012.012.53284-010.011.012.234.00080: GET / HTTP/1.1
User-Agent: EventMachine HttpClient
Host: api.example.com


010.011.012.234.00080-172.017.012.012.53284: HTTP/1.1 301 Moved Permanently
Date: Sat, 05 Mar 2011 11:59:55 GMT
Content-Type: text/html
Content-Length: 178
Connection: close
Location: http://www.example.com/

[ body content elided ]

That’s annoying, because I’d specifically requested /api/v2/chickens.json, but it’s actually requesting / (the GET / HTTP/1.1 line). What’s with that? Well, it turns out that the published stable gem for em-http-request doesn’t actually implement the :path option; I need to install the beta 1.0 gem to get that. Oops. Moral of the story: if you’re going to read the source to the gem you’re using, read the source to the installed version you’re using, not just the latest version on GitHub!
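One way to avoid repeating my mistake is to ask RubyGems which version is installed and where its source lives, then read that copy. A small sketch, where installed_gem_source is a hypothetical helper of mine wrapped around the RubyGems Gem::Specification API:

```ruby
require 'rubygems'

# Returns the version and on-disk location of an installed gem,
# or nil if RubyGems can't find a gem of that name.
def installed_gem_source(name)
  spec = Gem::Specification.find_by_name(name)
  { :version => spec.version.to_s, :path => spec.gem_dir }
rescue Gem::LoadError
  nil
end

info = installed_gem_source('em-http-request')
if info
  puts "em-http-request #{info[:version]} lives in #{info[:path]}"
else
  puts "em-http-request isn't installed here"
end
```

Point your editor at the path it prints and you’re reading the code that’s actually running, not whatever happens to be on the master branch.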

The second problem is a little more subtle. I got to the stage where the first request was working successfully, but subsequent requests reusing the same HttpRequest would fail miserably, just waiting for a response that never arrived. An example:

EM.run do
  request = EventMachine::HttpRequest.new('http://api.example.com/')
  deferrable = request.get :path => '/api/v2/chickens.json'
  deferrable.callback do
    deferred_eggs = request.get :path => '/api/v2/chickens/1/eggs.json'
    deferred_eggs.callback { puts "It worked"; EM.stop }
    deferred_eggs.errback  { puts "Egg retrieval failed"; EM.stop }
  end
  deferrable.errback  { puts "It failed"; EM.stop }
end

Let’s take a wee look at the over-the-wire conversation as captured by tcpflow:

172.017.012.012.54442-010.011.012.234.00080: GET /api/v2/chickens.json HTTP/1.1
User-Agent: EventMachine HttpClient
Host: api.example.com


010.011.012.234.00080-172.017.012.012.54442: HTTP/1.1 200 OK
Date: Sat, 05 Mar 2011 13:39:47 GMT
Content-Type: application/json;charset=ISO-8859-1
Connection: close
Set-Cookie: JSESSIONID=B4D6AE4456BC6D93E6B3441D4FEC6946; Path=/api/v2
Content-Language: en-US

[ body content elided ]


172.017.012.012.54442-010.011.012.234.00080: GET /api/v2/chickens/1/eggs.json HTTP/1.1
User-Agent: EventMachine HttpClient
Host: api.example.com

Here we see the initial request, the response and the second request, but no response. That’s really weird. Isn’t it? Then I spotted something. Do you see the initial line of each flow, with a string of numbers? That’s the source IP and port, then the destination IP and port. Let’s break the first one apart:

172.017.012.012.54442-010.011.012.234.00080

which translates to:

  • the client IP address is 172.17.12.12. That’s the internal IP of my laptop on my home network.
  • the client TCP port is 54442. Both sides of a TCP connection get a port, and the client side usually has a random, unused, high port number chosen for it by the kernel. Each TCP connection gets a different client port, and they’re typically not reused for a while after you’re done with them.
  • the server IP address is 10.11.12.234 which I cleverly remembered to change just now, lest you discover who [REDACTED] is. ;)
  • the server TCP port is 80, the well known port for HTTP.
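If you find yourself squinting at a lot of these flow labels, the breakdown above is easy enough to automate. This is a sketch that assumes the label format shown in the captures in this post (zero-padded IPv4 octets and port, with the two endpoints separated by a hyphen); FlowEndpoint, parse_endpoint and parse_flow_label are names I’ve made up.

```ruby
FlowEndpoint = Struct.new(:ip, :port)

# "172.017.012.012.54442" => ip "172.17.12.12", port 54442
def parse_endpoint(text)
  parts = text.split('.')
  port  = parts.pop
  # String#to_i parses base 10, so the zero padding is simply dropped.
  ip = parts.map { |octet| octet.to_i }.join('.')
  FlowEndpoint.new(ip, port.to_i)
end

# Splits a tcpflow label into [source, destination] endpoints.
def parse_flow_label(label)
  label.split('-', 2).map { |half| parse_endpoint(half) }
end

source, destination = parse_flow_label('172.017.012.012.54442-010.011.012.234.00080')
puts "#{source.ip}:#{source.port} -> #{destination.ip}:#{destination.port}"
```

Comparing the source port across successive requests is then a one-liner, which is exactly the comparison that matters in what follows.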

Now, let’s look at the second request’s connection information:

172.017.012.012.54442-010.011.012.234.00080

Isn’t that interesting? They’ve both got the same client TCP port. Now, we’re not doing keepalive, and the server responded with Connection: close (which, sadly, it does, even if I do attempt to switch keepalive on). So, as far as the server’s concerned, the TCP session is done and it has closed the connection. But the client, em-http-request, hasn’t closed its end, and has decided to send another request along the same TCP connection. Since the server has already closed its side of the connection, it never sees the second request and, naturally, never responds.
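You can reproduce the shape of this with nothing but plain sockets. The sketch below is not em-http-request and not the real API: it’s a toy server on localhost that answers one request with Connection: close and hangs up, plus a client that stubbornly sends a second request down the same socket. All the names (send_request and so on) are mine. One difference worth flagging: a blocking client like this one notices the closed connection straight away, whereas my EventMachine client just sat there waiting.

```ruby
require 'socket'

# Toy server: bind to any free port on localhost.
server = TCPServer.new('127.0.0.1', 0)
port   = server.addr[1]

server_thread = Thread.new do
  conn = server.accept
  # Read the request headers up to the blank line that ends them.
  loop do
    line = conn.gets
    break if line.nil? || line == "\r\n"
  end
  conn.write "HTTP/1.1 200 OK\r\nConnection: close\r\nContent-Length: 2\r\n\r\nok"
  conn.close # as far as the server is concerned, the conversation is over
end

# Sends a GET on an existing socket and reads until the peer closes.
# Returns nil if the operating system reports the connection as dead.
def send_request(sock, path)
  sock.write "GET #{path} HTTP/1.1\r\nHost: localhost\r\n\r\n"
  sock.read
rescue SystemCallError, IOError
  nil
end

sock   = TCPSocket.new('127.0.0.1', port)
first  = send_request(sock, '/api/v2/chickens.json')
second = send_request(sock, '/api/v2/chickens/1/eggs.json') # same socket, same client port
sock.close
server_thread.join
server.close

puts "First response:  #{first.lines.first.strip}"
puts "Second response: #{second.nil? || second.empty? ? '(nothing)' : second.lines.first.strip}"
```

The first request gets its 200; the second gets nothing back, because nobody is listening at the other end of that connection any more. Opening a fresh connection for the second request would give it a new client port and a server that’s actually there.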

I smell a bug with em-http-request that I’ve reported here: #77 Reusing HttpRequest.

I hope this has served as a useful introduction to tcpflow. It’s a great tool for discovering what really happens on the wire when you’re using higher level APIs and libraries. It really comes into its own when the library is, say, closed source and you have to deal with it as a black box. There are plenty of higher level tools with GUIs that can help you with this kind of task too, but I really like that tcpflow does one thing and does it really well. Thanks, Jeremy!

Vagrant box for Ubuntu 10.10 Maverick (64 bit)

I didn’t really intend this to turn into a series, but hey ho. I needed to build a 64-bit Ubuntu 10.10 Vagrant image for a client project I’m working on. (I still need to get around to the RHEL-a-like image, I’ll get there one day!)

So, without further ado, here we are: Vagrant Base Box for Ubuntu 10.10 “Maverick” (64-bit). Getting started with it is pretty simple. First of all, make sure you’ve got VirtualBox 4.0.4 or greater installed, along with Vagrant 0.7.0 or greater. Then all you need to do is:

vagrant box add maverick64 http://mathie-vagrant-boxes.s3.amazonaws.com/maverick64.box

This will take a while to download the box and unpack it in the way that Vagrant likes to do. Finally, let’s just test it out:

mkdir maverick_demo
cd maverick_demo
vagrant init maverick64
vagrant up
vagrant ssh

which should wind up with you ssh’d into a pristine minimal Ubuntu 10.10 environment.

Changelog

As I update the box, I’ll update the change log here, newest changes at the top.

24th February, 2011