Ruby Timeout Woes, Part 1

I seem to be having a bad day with the built-in Timeout class in Ruby. There are two problems; one is pretty innocuous, the other … not so much.

When you’re using Timeout, you’ll typically wrap the block of code you’re wanting to guard like this:

require 'timeout'
 
begin
  Timeout.timeout(10) do
    # Block of code
  end
rescue Timeout::Error => e
  puts "Execution expired"
end

Your block of code will run for up to (approximately) 10 seconds and, if it hasn’t completed in that time, will raise the Timeout::Error exception. Pretty straightforward.

The innocuous issue is just one trying to make me mistrust my memory. In Ruby 1.8.x, Timeout::Error inherits from Interrupt, so its inheritance from Exception goes along the lines of:

Timeout::Error < Interrupt < SignalException < Exception

The key thing to note here is that it doesn’t inherit directly from StandardError and so a blank rescue block won’t catch it:

begin
  Timeout.timeout(10) { sleep 20 }
rescue
  puts "On Ruby 1.8.x I won't catch the timeout exception."
end

However, on Ruby 1.9.2, Timeout::Error inherits from RuntimeError, so in the above code example, the rescue block will get called. That’s annoying, but it’s not like it’s the only incompatible change between Ruby 1.8.x and Ruby 1.9, so I’m OK with that. Plus, non-specific rescue blocks like that are a bad smell anyway.
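One way to sidestep the difference is to name the exception explicitly, which behaves the same whichever class Timeout::Error happens to inherit from. Here's a minimal sketch; run_with_timeout is a hypothetical helper name for illustration, not part of the Timeout library:

```ruby
require 'timeout'

# Hypothetical helper: runs the block and reports whether it finished
# within the limit. Rescuing Timeout::Error by name does not depend on
# where the class sits in the exception hierarchy.
def run_with_timeout(seconds)
  Timeout.timeout(seconds) { yield }
  :finished
rescue Timeout::Error
  :timed_out
end

puts run_with_timeout(0.1) { sleep 1 }  # block overruns the limit
puts run_with_timeout(1) { 1 + 1 }      # block finishes in time

# Shows where the exception sits in the hierarchy on the running Ruby.
puts Timeout::Error.ancestors.take_while { |klass| klass != Object }.inspect
```

The last line is a quick way to check the ancestry for yourself on whichever Ruby you happen to be running.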

The slightly more insidious problem needs further explanation. Come back again later on and I’ll tell you all about it.

Pimpin’ TextMate (aka Top 5 TextMate Plugins, 2011 Edition)

All the cool kids are using Vim these days. That’s fine, y’all go right ahead, but sorry, it’s not really my thing. I’m perfectly happy with TextMate, thanks.

Don’t get me wrong; I’ve been using various flavours of Vi for 17 years now. I’m reasonably proficient at it, and it’s my tool of choice when I’m, say, editing server configuration files. I can plan and execute a series of changes to a file like a boss, when I already know what I’m doing. The trouble occurs when I’m coding, and what I’m trying to achieve isn’t yet a fully crystallised thought. Then, with Vi, I feel boxed in and a bit blinkered.

It’s just me, I know. Maybe the root of the problem is that I code before thinking. :)

So I still use TextMate, where I feel just a little more free and creative.

Anyway, this isn’t a post about Vim vs TextMate (there’s enough of them already!), I just wanted to get that off my chest.

I seem to be going through a phase of setting up fresh installs of my various desktop and laptop computers again (blame Lion). Instead of blindly installing all the same tools as I’m used to, I figured I’d reassess the landscape and see what’s out there. Here are a few plugins for TextMate I’ve tried, and liked.

  • PeepOpen is a file finder from the lovely folks at PeepCode. It replaces the default “Go to file…” (⌘-T) pane with one that’s a good deal smarter. Instead of matching just on the filename, it’ll match on the entire path. This is invaluable if your project has 300 files called show.html.erb. If I’m looking for, say, an article’s show template in a Rails project, typing avartsh (app/views/articles/show.html.erb) will get me there quickly.

    PeepOpen is $12, but it’s free as part of your PeepCode Unlimited subscription which you’d be crazy not to have anyway, right?

  • EGOTextMateFullScreen is a plugin that gives you native full screen support on Mac OS X Lion. On my desktop computer, a 27″ iMac, I didn’t really see the point of full screen support (I much prefer a few windows tiled with the assistance of SizeUp). But then I got my hands on a MacBook Air and suddenly it made a whole lot more sense.

  • If you’re making use of the full screen support, then there’s one small snag: the drawer isn’t visible. I quite like the drawer; not to find files – because PeepOpen does a much better job – but to remind me of the context I’m in. When you’re editing a file, and need a reminder of the context, hit ctrl-⌘-R and your current file will be highlighted in the project tree.

    So if, like me, you’d miss the drawer in full screen mode, the old standby is the MissingDrawer plugin. This turns the drawer into a sidebar, which is part of the main window and, therefore, still visible in full screen mode.

  • Finally, project-wide search. The default Find in Project is a little … slow. And a bit beachball-y. For a while I’d been using a bundle that replaces the default Find in Project with one powered by Ack. Much faster, and a little more flexible too. When I went searching for it this time around, though, I discovered AckMate, a plugin which provides a more integrated experience. It’s fast. It can show you the context of the search result. It even allows you to constrain the results to particular file types. In short, it’s awesome. Be sure to read the instructions on how to set the shortcut key to the default “Find in Project…” one.
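To make the path-matching idea from the PeepOpen entry a bit more concrete, here's a rough sketch of matching a query like avartsh against whole paths. It's a plain "characters appear in order" match, which is my assumption about the general technique and certainly not PeepOpen's actual algorithm; fuzzy_path_match? is a made-up name:

```ruby
# Hypothetical helper, not PeepOpen's real algorithm: returns true when
# every character of `query` appears somewhere in `path`, in the same order.
def fuzzy_path_match?(query, path)
  pattern = query.chars.map { |char| Regexp.escape(char) }.join('.*')
  !!(path =~ Regexp.new(pattern, Regexp::IGNORECASE))
end

paths = [
  'app/views/articles/show.html.erb',
  'app/views/comments/show.html.erb',
  'app/models/article.rb'
]

# Prints only the paths whose characters cover the query in order.
puts paths.select { |path| fuzzy_path_match?('avartsh', path) }
```

A real file finder would also rank the matches (shorter gaps first, say), which this sketch doesn't attempt.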

There’s one final plugin that I’m planning on evaluating, but haven’t quite gotten around to yet. The one, single, feature I liked in my brief affair with RubyMine was the ability to jump to a method definition. I’ve got a similar mechanism set up in Vim using exuberant ctags which works really well. (And back in the days before Mac OS X when Emacs was my hammer – and every C file a thumb – I lived by my TAGS files.) How I’d love project-wide “Go to symbol…”. There’s a plugin out there called TmCodeBrowser that appears to do exactly what I’m after. I’m slightly wary as it hasn’t been updated in quite some time, but perhaps that’s because it still just works. I’ll evaluate it soon and update this post when I do.

There’s one plugin I wish existed, but doesn’t seem to: split windows. I’ve got plenty of screen real estate. It would be crazy awesome to be able to have my model and its tests sitting side by side with a vertical split. If it’s possible to write a TextMate plugin that realises this, I would happily pay $12 for it (same price point as PeepOpen). Just so you know. ;-)

That’s it. I still love TextMate and I spend several hours working in it most days. Thanks for making code editing a lovely experience, Allan!

Using tcpflow

Sometimes, when you’re writing applications that use a library to talk over the wire to a remote service, it’s difficult to see how the high level API the library exposes translates into the on-the-wire protocol. Funnily enough, I was having that very problem yesterday, so I dug tcpflow out of my toolbox to better understand what was happening.

I was writing a client for the [REDACTED] (for now, at least!) API, for a client project. I’d decided to use this as an excuse to learn EventMachine and em-http-request to talk to the remote API. Given the pattern of use I’m expecting, a reactor-patterned daemon feels like a really good fit.

It was a weird experience — it’s the first time I’ve done any reactor pattern development in anger since ~2004 when I was messing around with using Python’s Twisted framework for small TCP server applications. (Ah, them were the days, before HTTP became the hammer to everybody’s wire protocol thumb nail.) But I digress…

I was having trouble understanding why the em-http-request requests I was making weren’t having the intended result. There were two specific problems I saw over the course of the day:

  • The path I was sending in, according to the documentation, using request.get :path => '/api/v2/chickens.json', didn’t appear to be what was being requested (in that everything I requested looked like it was getting the same response as a request to GET /).

  • After the initial request, subsequent requests reusing the same HttpRequest object would silently fail. This was slightly more bizarre.

In solving either of these problems, I could have dived into the em-http-request source and traced what was happening. In fact, that was my first port of call but, being rusty at the reactor pattern, I was having trouble following the flow of execution. (As it turns out, if I’d done this properly, I should have spotted the problem straight away, but we’ll get to that.)

However, in this instance, I decided to treat the library I was using as a black box and instead examine what it was generating and consuming at the other end.

Enter tcpflow. It’s been part of my arsenal of network debugging tools for as long as I can remember. It is similar to tcpdump, but that typically just shows you the IP packets on the wire. tcpflow attempts to reconstruct the actual TCP streams to give you an idea of the conversation going on.

Let’s install it. I’m on a Mac, and I’m using Homebrew, so it’s just a case of:

brew install tcpflow

Your mileage may vary, but it’s a piece of software that’s been around for a long time, so I bet there’s a package for your system. If you’re using Homebrew, it will install the binary as your user and not setuid to root, which means that, in order to access the packet capture device, you’ll have to make sure and run it through sudo. The examples I show all have sudo in ‘em since that’s what I had to do.

tcpflow uses libpcap as the underlying packet capture library, the very same one as is used by tcpdump. This means that the syntax for specifying the information you want to capture will be familiar if you’ve used that tool before.

Last piece of background information before we get into solving the problems. You’ll need to figure out the network interface you’re using as that’s the interface it will capture packets from. Short version (since a longer version is out of scope!): if you’re on a Mac, it’s highly likely to be en0 if you’re on wired Ethernet and en1 if you’re on wireless. If you’re using Linux and you’re on a wired network, chances are it’ll be eth0. My Mac laptop is on a wireless network right now, so the examples show me using en1. Examine the output of ifconfig to determine your active network interface.
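If you'd rather not squint at ifconfig output, a few lines of Ruby can narrow down the candidates. This is a sketch that assumes a reasonably modern Ruby (Socket.getifaddrs arrived in Ruby 2.1, well after the Rubies of this post's era); candidate_interfaces is a hypothetical helper name:

```ruby
require 'socket'

# Hypothetical helper: lists the names of interfaces that carry a
# non-loopback IPv4 address, which are the likely capture candidates.
def candidate_interfaces
  Socket.getifaddrs
        .select { |ifaddr| ifaddr.addr && ifaddr.addr.ipv4? && !ifaddr.addr.ipv4_loopback? }
        .map(&:name)
        .uniq
end

puts candidate_interfaces
```

If more than one name comes back (wired and wireless both up, say), you'll still have to pick the one your traffic is actually using.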

So, that’s the theory, let’s see this thing in action. The first problem is that we’re not getting the response we’re expecting from making particular HTTP requests. Let’s see what’s happening on the wire with tcpflow:

sudo tcpflow -c -i en1 src or dst host api.example.com

Breaking that down:

  • -c spits the flows out on stdout. The normal operation is to create a file for each flow, in each direction. If you’re capturing a lot of data, or pipelined conversations, files are much more sane, but for small conversations like this, stdout is really convenient.

  • -i en1 captures from my wireless interface, as discussed above.

  • src or dst host api.example.com is the expression used to describe the flows that we want to capture. In essence, this is saying that we want to capture flows of data that are sent to, or received from, api.example.com. The language is pretty rich, allowing for a bunch of complex rules to narrow down the data captured, but I tend to find that specifying a remote host and/or port is enough.

Let’s see what that gets us. I’ve created a Ruby script, the essence of which is:

EM.run do
  request = EventMachine::HttpRequest.new('http://api.example.com/')
  deferrable = request.get :path => '/api/v2/chickens.json'
  deferrable.callback { puts "It worked"; EM.stop }
  deferrable.errback  { puts "It failed"; EM.stop }
end

Let’s see what the on-the-wire communication turns into:

172.017.012.012.53284-010.011.012.234.00080: GET / HTTP/1.1
User-Agent: EventMachine HttpClient
Host: api.example.com


010.011.012.234.00080-172.017.012.012.53284: HTTP/1.1 301 Moved Permanently
Date: Sat, 05 Mar 2011 11:59:55 GMT
Content-Type: text/html
Content-Length: 178
Connection: close
Location: http://www.example.com/

[ body content elided ]

That’s annoying, because I’d specifically requested /api/v2/chickens.json, but it’s actually requesting / (the GET / HTTP/1.1 line). What’s with that? Well, it turns out that the published stable gem for em-http-request doesn’t actually implement the :path option; I need to install the beta 1.0 gem to get that. Oops. Moral of the story: if you’re going to read the source to the gem you’re using, read the source to the installed version you’re using, not just the latest version on GitHub!
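To find out which copy of a gem you'd actually be reading, you can ask RubyGems where it lives. Here's a small sketch using Gem::Specification; installed_gem_info is a hypothetical helper name, and it reports whichever installed copy RubyGems resolves for that name:

```ruby
require 'rubygems'

# Hypothetical helper: returns the version and on-disk source directory of
# the copy of a gem that RubyGems resolves, or nil if it is not installed.
def installed_gem_info(name)
  spec = Gem::Specification.find_by_name(name)
  { version: spec.version.to_s, path: spec.gem_dir }
rescue Gem::LoadError
  nil
end

info = installed_gem_info('em-http-request')
if info
  puts "em-http-request #{info[:version]} lives in #{info[:path]}"
else
  puts 'em-http-request is not installed for this Ruby'
end
```

The directory it prints is the source worth reading, since that's the code your script is really running.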

The second problem is a little more subtle. I got to the stage where the first request was working successfully, but subsequent requests reusing the same HttpRequest would fail miserably, just waiting for a response that never arrived. An example:

EM.run do
  request = EventMachine::HttpRequest.new('http://api.example.com/')
  deferrable = request.get :path => '/api/v2/chickens.json'
  deferrable.callback do
    deferred_eggs = request.get :path => '/api/v2/chickens/1/eggs.json'
    deferred_eggs.callback { puts "It worked"; EM.stop }
    deferred_eggs.errback  { puts "Egg retrieval failed"; EM.stop }
  end
  deferrable.errback  { puts "It failed"; EM.stop }
end

Let’s take a wee look at the over-the-wire conversation as captured by tcpflow:

172.017.012.012.54442-010.011.012.234.00080: GET /api/v2/chickens.json HTTP/1.1
User-Agent: EventMachine HttpClient
Host: api.example.com


010.011.012.234.00080-172.017.012.012.54442: HTTP/1.1 200 OK
Date: Sat, 05 Mar 2011 13:39:47 GMT
Content-Type: application/json;charset=ISO-8859-1
Connection: close
Set-Cookie: JSESSIONID=B4D6AE4456BC6D93E6B3441D4FEC6946; Path=/api/v2
Content-Language: en-US

[ body content elided ]


172.017.012.012.54442-010.011.012.234.00080: GET /api/v2/chickens/1/eggs.json HTTP/1.1
User-Agent: EventMachine HttpClient
Host: api.example.com

Here we see the initial request, the response and the second request, but no response. That’s really weird. Isn’t it? Then I spotted something. Do you see the initial line of each flow, with a string of numbers? That’s the source IP and port, then the destination IP and port. Let’s break the first one apart:

172.017.012.012.54442-010.011.012.234.00080

which translates to:

  • the client IP address is 172.17.12.12. That’s the internal IP of my laptop on my home network.
  • the client TCP port is 54442. Both sides of a TCP connection get a port, and the client side usually has a random, unused, high port number chosen for it by the kernel. Each TCP connection gets a different client port, and they’re typically not reused for a while after you’re done with them.
  • the server IP address is 10.11.12.234 which I cleverly remembered to change just now, lest you discover who [REDACTED] is. ;)
  • the server TCP port is 80, the well known port for HTTP.
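If you're comparing a lot of these, the breakdown above is easy enough to automate. Here's a minimal sketch that assumes the zero-padded label format shown in the capture; FlowEndpoint, parse_endpoint and parse_flow are names I've made up for illustration:

```ruby
# Illustrative value object for one side of a TCP flow.
FlowEndpoint = Struct.new(:ip, :port)

# Turns "172.017.012.012.54442" into an endpoint, dropping the zero padding.
def parse_endpoint(text)
  parts = text.split('.')
  port = parts.pop.to_i
  ip = parts.map(&:to_i).join('.')
  FlowEndpoint.new(ip, port)
end

# Splits a tcpflow label (with or without the trailing colon) into its
# source and destination endpoints.
def parse_flow(label)
  source, destination = label.chomp(':').split('-')
  [parse_endpoint(source), parse_endpoint(destination)]
end

client, server = parse_flow('172.017.012.012.54442-010.011.012.234.00080')
puts "client #{client.ip}:#{client.port} -> server #{server.ip}:#{server.port}"
```

With the endpoints parsed, checking whether two flows share a client port is a simple comparison of the port fields.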

Now, let’s look at the second request’s connection information:

172.017.012.012.54442-010.011.012.234.00080

Isn’t that interesting? They’ve both got the same client TCP port. Now, we’re not doing keepalive, and the server responded with Connection: close (which, sadly, it does, even if I do attempt to switch keepalive on). So, as far as the server’s concerned, the TCP session is done and it has closed the connection. But the client, em-http-request, hasn’t closed its end, and has decided to send another request along the same TCP connection. Since the server has already closed its side of the connection, it never sees the second request and, naturally, never responds.
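The rule the client should have applied can be sketched in a few lines: an HTTP/1.1 connection is persistent unless one side sends Connection: close. The helper below is a hypothetical illustration of that check against raw response headers, not how em-http-request works internally:

```ruby
# Hypothetical helper: decides from raw HTTP/1.1 response headers whether
# the connection may carry another request. HTTP/1.1 connections persist
# by default; a "Connection: close" header ends them.
def connection_reusable?(raw_headers)
  raw_headers.each_line do |line|
    name, value = line.split(':', 2)
    next unless value && name.strip.casecmp('connection').zero?
    tokens = value.split(',').map { |token| token.strip.downcase }
    return false if tokens.include?('close')
  end
  true
end

response_headers = <<~HEADERS
  HTTP/1.1 200 OK
  Date: Sat, 05 Mar 2011 13:39:47 GMT
  Content-Type: application/json;charset=ISO-8859-1
  Connection: close
HEADERS

puts connection_reusable?(response_headers)
```

When this comes back false, the safe move is to open a fresh connection for the next request rather than writing to the old one.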

I smell a bug with em-http-request that I’ve reported here: #77 Reusing HttpRequest.

I hope this has served as a useful introduction to tcpflow. It’s a great tool for discovering what really happens on the wire when you’re using higher level APIs and libraries. It really comes into its own when the library is, say, closed source and you have to deal with it as a black box. There are plenty of higher level tools with GUIs that can help you with this kind of task too, but I really like that tcpflow does one thing and does it really well. Thanks, Jeremy!

Vagrant box for Ubuntu 10.10 Maverick (64 bit)

I didn’t really intend this to turn into a series, but hey ho. I needed to build a 64-bit Ubuntu 10.10 Vagrant image for a client project I’m working on. (I still need to get around to the RHEL-a-like image, I’ll get there one day!)

So, without further ado, here we are: Vagrant Base Box for Ubuntu 10.10 “Maverick” (64-bit). Getting started with it is pretty simple. First of all, make sure you’ve got VirtualBox 4.0.4 or greater installed, along with Vagrant 0.7.0 or greater. Then all you need to do is:

vagrant box add maverick64 http://mathie-vagrant-boxes.s3.amazonaws.com/maverick64.box

This will take a while to download the box and unpack it in the way that Vagrant likes to do. Finally, let’s just test it out:

mkdir maverick_demo
cd maverick_demo
vagrant init maverick64
vagrant up
vagrant ssh

which should wind up with you ssh’d into a pristine minimal Ubuntu 10.10 environment.

Changelog

As I update the box, I’ll update the change log here, newest changes at the top.

24th February, 2011

Vagrant base box for Debian Squeeze

To celebrate the new release of Debian Squeeze at the weekend, I decided to have a quick hack around with setting up a Vagrant base box for it. What I really want to do is build and share a base box for something not entirely unlike RedHat Enterprise Linux 5, which is what our production platform at work uses, but I wanted to get a feel for packaging up base boxes, so I started with something I’m more familiar with.

Anyway, turns out it’s really not that hard. The instructions for creating base boxes are pretty comprehensive. So, without further ado, here we are: Vagrant Base Box for Debian 6.0 “Squeeze” (32-bit). Getting started with it is pretty simple. First of all, make sure you’ve got VirtualBox 4.0 installed, along with Vagrant 0.7.0 or greater. Then all you need to do is:

vagrant box add debian_squeeze_32 \
  http://mathie-vagrant-boxes.s3.amazonaws.com/debian_squeeze_32.box

This will take a while to download the box and unpack it in the way that Vagrant likes to do. Finally, let’s just test it out:

mkdir squeeze_demo
cd squeeze_demo
vagrant init debian_squeeze_32
vagrant up
vagrant ssh

which should wind up with you ssh’d into a pristine minimal Debian Squeeze environment, ready to test out its stable goodness.

Changelog

As I update the box, I’ll update the change log here, newest changes at the top.

26th February, 2011

  • Updated the permissions on S3 so you can actually download the new version. Sorry, folks!

24th February, 2011

  • apt-get update && apt-get upgrade to pull in the latest package updates.
  • Upgraded the VirtualBox Guest Additions from 4.0.2 to 4.0.4.
  • Remove the USB controller, since it’s unnecessary.
  • Dropped the Grub timeout from 5 seconds to 1, since Vagrant ain’t allowing you to choose an alternative anyway.

7th February, 2011

  • Initial release.

Installing on my Mac at home

I’m in the office and I want to start installing some software on my laptop at home so it’s ready to try when I get back to my laptop. (The test suite is running and I know the install will take a while, so it can get going while I commute!) In particular, I was looking to try Vagrant, which requires VirtualBox.

First step: Back to my Mac. Turns out, if you’re sensible enough to have SSH running on your Mac (enable it under Sharing in System Preferences), it’s available from other Macs where you have your MobileMe account set up. Simply run:

ssh <hostname>.<MobileMe name>.members.mac.com

where <hostname> is the short name of your laptop and <MobileMe name> is your MobileMe login. So, for example:

ssh jura.mathie.members.mac.com

gets me to my laptop. (Interestingly, this doesn’t resolve on hosts where you’re not logged in to your MobileMe account, so there’s some DNS resolution magic going on locally. At least I hope so, and I’ve not just exposed myself on the Internet!)

Next up, let’s grab VirtualBox. Head across to the web site to figure out the latest download URL, and download it with curl:

curl -LO http://[...]/VirtualBox-4.0.2-69518-OSX.dmg

Once it’s downloaded, we have to mount the disk image. The simplest way with Mac OS X is to use hdiutil:

hdiutil attach VirtualBox-4.0.2-69518-OSX.dmg

This will mount the disk image on /Volumes/VirtualBox. Now we have to run the installer, which we can do headless:

sudo installer -pkg /Volumes/VirtualBox/VirtualBox.mpkg -target / -verbose

This needs to run as root, hence sudo. Since we asked it to be verbose, it’ll spit out a bunch of progress information. But that’s it. Thanks to having a standard installer, we can install Mac OS X packages without a GUI. Win.

We should tidy up after ourselves by unmounting the disk image:

hdiutil detach /Volumes/VirtualBox

And we’re done. VirtualBox is installed, and I can get onto the long running download of grabbing the demo Ubuntu 10.04 LTS image for Vagrant (which clocks in at nearly 500MB). Getting started with Vagrant is another story entirely, which I’m sure I’ll talk about sometime soon.
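If you do this often, the mount, install and unmount steps could be collected into a small script. Here's a sketch that assumes the same disk image name and mount point as above; virtualbox_install_commands and run_commands are hypothetical helper names, and the runner is injectable so you can dry-run it without touching the system:

```ruby
# Hypothetical helper: builds the three commands used above, in order.
def virtualbox_install_commands(dmg_path, volume = '/Volumes/VirtualBox')
  [
    ['hdiutil', 'attach', dmg_path],
    ['sudo', 'installer', '-pkg', File.join(volume, 'VirtualBox.mpkg'),
     '-target', '/', '-verbose'],
    ['hdiutil', 'detach', volume]
  ]
end

# Runs each command in turn and stops at the first one that fails.
# The default runner shells out; pass your own to dry-run.
def run_commands(commands, runner = ->(command) { system(*command) })
  commands.all? { |command| runner.call(command) }
end

# Dry run: print what would be executed instead of executing it.
commands = virtualbox_install_commands('VirtualBox-4.0.2-69518-OSX.dmg')
run_commands(commands, ->(command) { puts command.join(' '); true })
```

Because it stops at the first failure, a failed install leaves the disk image mounted, so you'd still need to detach it by hand in that case.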

Well, we’re nearly done. Let’s finish up by freaking out whoever’s in the house:

say "We're watching you!"

:-)

Using SagePay in your Ruby Projects

Once upon a time, in a galaxy far, far away … Ok, I’ll stop now. A few years back, I was working on a client project and they needed to integrate with a billing platform. They’d already picked Protx (now SagePay) as their platform of choice, and in particular, the Server variant. Wait, I’ll backtrack. SagePay has three variants:

  • Form which is basically just sticking a random form/button on your web page that redirects to SagePay’s servers and redirects back afterwards. Useful if your budget is low, you don’t have a web developer on hand and your requirements are utterly trivial.

  • Server which is more full featured, and allows much finer control of the billing process, including delayed billing, recurring subscriptions, that kind of thing. It’s a little more involved, but you still don’t capture the credit card details themselves (that is still a redirect step to SagePay’s servers) so you get to sidestep the painful parts of PCI-DSS compliance. Which you want to do, unless you have a bucketload of money and want to be labelled clinically insane.

  • Direct which involves you doing the store/forward of the credit card details. It’s the ultimate in flexibility and doesn’t tie you to SagePay in the long term (say, because they have all the CC numbers for your recurring payments!). Choosing this option, however, is one of the criteria for free entry to the PCI DSS loony bin.

I’d previously built an integration for a client with Protx Form, and had modified ActiveMerchant to support it (forked here, not that the patch ever got merged, but n’mind). So, naturally, the second time, I reached for ActiveMerchant again and started hacking away. Unfortunately, SagePay Server’s model doesn’t really work with ActiveMerchant because it’s half way between an integration and a gateway. We struggled on and got that working (see this fork here). (That patch was never merged back upstream either, but I’m less surprised by that.)

Fast forward a couple of years, and another client wants an integration with SagePay. Their requirements are more complex in that it’s a SaaS offering which requires flexible recurring billing, so the Server version definitely fits well. But this time I’ve decided for sure that AM is not a good fit.

So we start from scratch and model SagePay Server’s API natively in Ruby. The sage_pay gem is the result of our efforts. There’s also an example Rails application using most of its features just so you can see what’s happening. It’s all available under a liberal license, so you’re free to do what you like with it. I’d love to hear from you if you use it. I’d also be utterly delighted to accept patches and improvements.

Documentation is a little sparse so far, so your best bet is to check out the sample app and work from there. I’d be particularly happy to accept pull requests that contain documentation. ;-)

(Oh, and while I’m on the subject of ActiveMerchant forks that never made it back upstream, here’s one for Barclays ePDQ too.)