Symlink corruption on Mac OS X

Mac OS X on my desktop computer (a newish 27″ iMac, using a Promise Thunderbolt disk array for the root filesystem) seems to be having filesystem troubles. I notice it through symlinks going awry, though I’m sure they’re not the only victims. I tidied all the errant symlinks up two weeks ago, hoping it was a temporary glitch, but they’re back again today. Here’s an example:

> find -L /System -type l -print0 |xargs -0 ls -l
lrwxr-xr-x  1 root  wheel  24 15 Jan 09:42 /System/Library/Frameworks/ApplicationServices.framework/Frameworks/CoreGraphics.framework/Headers -> >File</string>????<key>L
lrwxr-xr-x  1 root  wheel  24 15 Jan 09:42 /System/Library/Frameworks/ApplicationServices.framework/Frameworks/HIServices.framework/Headers -> ?6?s?A??]h?_?:d9?r?
lrwxr-xr-x  1 root  wheel  24 15 Jan 09:42 /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/CoreGraphics.framework/Headers -> >File</string>????<key>L
lrwxr-xr-x  1 root  wheel  24 15 Jan 09:42 /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/HIServices.framework/Headers -> ?6?s?A??]h?_?:d9?r?
lrwxr-xr-x  1 root  wheel  24 15 Jan 09:42 /System/Library/Frameworks/ApplicationServices.framework/Versions/Current/Frameworks/CoreGraphics.framework/Headers -> >File</string>????<key>L
lrwxr-xr-x  1 root  wheel  24 15 Jan 09:42 /System/Library/Frameworks/ApplicationServices.framework/Versions/Current/Frameworks/HIServices.framework/Headers -> ?6?s?A??]h?_?:d9?r?

Each of those symlinks is pointing to garbage. (Interestingly, the garbage quite often looks like the partial contents of a plist file.)

Here’s another example, and this is one I remember fixing last time:

lrwxr-xr-x  1 root  wheel  27 12 Nov 19:06 /System/Library/Frameworks/JavaVM.framework/Frameworks -> Versions/Current/Frameworks
lrwxr-xr-x  1 root  wheel  24 15 Jan 09:43 /System/Library/Frameworks/JavaVM.framework/Headers -> Versions/Current/Headers
lrwxr-xr-x  1 root  wheel  23 12 Nov 19:06 /System/Library/Frameworks/JavaVM.framework/JavaVM -> Versions/Current/JavaVM
lrwxr-xr-x  1 root  wheel  26 15 Jan 09:45 /System/Library/Frameworks/JavaVM.framework/Resources -> Versions/Current/Resources
lrwxr-xr-x  1 root  wheel   1  8 Jan 14:57 /System/Library/Frameworks/JavaVM.framework/Versions/Current -> c

The problem here isn’t the first four symlinks – they’re all pointing to the right places – but the last one (which they’re all pointing through) which is pointing to ‘c’, not ‘A’ like it should.

The symlink targets all seem to be the right length, just the wrong characters.
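For anyone who wants to repeat the check, here is a minimal Ruby sketch of roughly what the find -L … -type l invocation above reports: symlinks whose targets don’t resolve. The helper name broken_symlinks is made up for this example, and it only catches links that dangle; a corrupted link that happens to point at something real would slip through.

```ruby
require 'find'

# Hypothetical helper (not part of any tool mentioned in this post): walk a
# directory tree and collect every symlink whose target does not resolve,
# together with the raw target string stored in the link.
def broken_symlinks(root)
  broken = []
  Find.find(root) do |path|
    # File.symlink? looks at the link itself; File.exist? follows it, so a
    # symlink that "does not exist" is one whose target is missing.
    next unless File.symlink?(path)
    broken << [path, File.readlink(path)] unless File.exist?(path)
  end
  broken
end

# Example use (needs read access to the tree being scanned):
#   broken_symlinks('/System').each do |path, target|
#     puts "#{path} -> #{target.inspect}"
#   end
```

Printing the target with inspect keeps any non-printable garbage in the link from messing up the terminal.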

How do I go about communicating with Apple about the problem so I can get it resolved? It doesn’t really seem the sort of thing I can take to a Genius Bar…

How to install a working set of compilers on Mac OS X 10.7 (Lion)

With the advent of Xcode 4.2, Apple have removed GCC from the Xcode installer. Up ’til now, when you installed Xcode (say, 4.1), you’d get:

  • The original GNU Compiler Collection (GCC), version 4.2.1.

  • LLVM-GCC, which is a modified version of GCC designed to emit LLVM’s intermediate representation, so that it can hook into LLVM’s back end optimisers and code generation (as I understand it).

  • Clang, the shiny new compiler freshly built with LLVM goodness all the way through.

However, with Xcode 4.2 onwards, the original GCC variant has disappeared, and /usr/bin/gcc is now being served by the (mostly) compatible llvm-gcc.

This is only an issue if you freshly install Xcode 4.2 on a computer where the developer tools weren’t previously installed. If you upgrade from Xcode 4.1 to 4.2, it will still leave gcc-4.2 lying around (suboptimal package management, but I’m not complaining today!). I suspect this is why more people aren’t getting upset.

It’s entirely understandable for Apple to stop distributing gcc: they’re moving forward, innovating with LLVM, and there’s only so long they should have to maintain a legacy (as far as their commercial platform is concerned, at least) compiler. In fact, llvm-gcc is only there as a temporary crutch; wait ’til they remove that too, leaving us with only Clang…

However, there’s a slight snag: all compilers are not built alike. You’d think that compilers implement a standard correctly, and users of the compiler write code to that standard. It never quite works out that way, though: compilers have bugs and proprietary extensions. Worse still, coders write code until the compiler validates it, not necessarily ’til it’s right.

So, flipping to a new compiler, even gcc -> llvm-gcc, will uncover problems. One of the apps I rely on that has been having problems is Ruby, which is why I wound up messing around with this in the first place.

So, what to do? There are a couple of newly popular mechanisms out there for installing a set of compilers on Mac OS X without installing the whole Xcode behemoth (largely as a way to save disk space):

  • Soren Ionescu’s GCC Without Xcode, which uses the Xcode installer you download from the App Store to generate a slimline installer with just the bits you need. Soren is careful to do it this way so as not to distribute any of Apple’s binary packages on their behalf (something they, naturally, can get a little grumpy about). Unfortunately, this means it’s using the Xcode 4.2 installer, which … doesn’t have GCC.

  • Kenneth Reitz’s OSX GCC Installer, which, in the project’s download section, has a pre-built package, ready for installation. Here’s the key thing: the pre-built package is built against Xcode 4.1, so still includes gcc-4.2.1. Hallelujah, we’re saved. (Kenneth, please for the love of all things that compile, please don’t update that package!)

So, when you’ve got a fresh Lion installation, and you’re looking for something you can build the widest range of apps on (from things that require gcc all the way through to that shiny iOS 5 project you’re working on), do the following:

  • Grab the OSX GCC Installer (as of writing, the 10.7-v2 package is the way forward).

  • Run the package to install it.

  • Grab Xcode from the App Store.

  • Run the Xcode installer.

Finally, you should be good to go. /usr/bin/gcc will still point to llvm-gcc-4.2 but at least you’ll have gcc-4.2 installed. If you’re having trouble compiling things, try forcing it to use gcc with CC=gcc-4.2 or equivalent. I think the likes of Homebrew and RVM will try to be clever on your behalf if they can.
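If you’d rather pick the compiler from a script than remember to type CC=gcc-4.2 every time, something along these lines might do. It’s only a sketch: the helper names which and preferred_cc are invented for this example, and the fallback order (gcc-4.2, then gcc, then cc) is my assumption, not anything Homebrew or RVM actually does.

```ruby
# Hypothetical helpers: neither name comes from Xcode, Homebrew or RVM.

# Return the full path of an executable called `name` on the given search
# path, or nil if there isn't one.
def which(name, search_path = ENV.fetch('PATH', ''))
  search_path.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Prefer the original GCC when it is installed. The fallback order is an
# assumption made for this sketch.
def preferred_cc(search_path = ENV.fetch('PATH', ''))
  %w[gcc-4.2 gcc cc].find { |name| which(name, search_path) }
end

# Example: export the choice for a build started from this process.
#   ENV['CC'] = preferred_cc || 'cc'
#   system('./configure') && system('make')
```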

Running tmux in Mac OS X Terminal

I’ve been a fan of screen for … a while now. But since I like being one of the cool kids, I’ve been using tmux for the past year or so. Last week, I noticed that every time I launch a new terminal, I wind up typing tmux attach-session. Let’s streamline, a little bit.

In Mac OS X’s Terminal.app, you can change the shell that it runs. Here’s how I did it:

  • Open Preferences, and choose the Settings tab.

  • Duplicate your existing settings (since sometimes you might not want tmux after all). Pick your default session (mine’s “Pro”) and select “Duplicate settings” from the tool menu at the bottom. Name the new settings “Tmux” or something along those lines.

  • In the shell tab for your settings, select “Run command” and enter /usr/local/bin/tmux attach-session. Deselect “Run inside shell” since you don’t really need to. Since you’re not running inside a shell, /usr/local/bin probably isn’t in your $PATH so you’ll need to specify the full path name. Of course, if your tmux binary lives somewhere other than /usr/local/bin you’ll need to change the path.

  • If you’ve selected “Only if there are processes other than” for “Prompt before closing”, then you’ll probably want to add tmux to that list.

  • In the “Window” tab, I set “Scrollback” to limit the number of rows to ‘0’, since tmux provides scroll back, and the Terminal one isn’t terribly useful when tmux is running inside it.

  • Make sure your Tmux session is set as the default one by clicking the “Default” button at the bottom of the settings lists while it’s selected.

That’s it. Close your existing terminal sessions and launch a new one. You should be launched into (one of) your existing tmux sessions. If tmux wasn’t already running, then this assumes that your ~/.tmux.conf sets up at least one session (which I think is required anyway). If you’ve got more than one tmux session running, I’ve no idea which one, offhand, it’ll choose, but you can always switch to the one you’re looking for with C-a s. (You have rebound the prefix to C-a, right?)

There’s (at least) one time where you don’t want tmux as your shell. That’s when you’re attempting to interact with launchd. I suspect it’s to do with launchd checking that you’re a child process as part of its permissions when you’re asking it to do stuff, where tmux works by detaching itself from its parent process so it’s not killed when the parent is. (Total guess, BTW.) Still, when you want to use launchctl, you’ll need to do it somewhere other than a tmux session. In Terminal, choose the Shell menu, choose “New Window” (or “New Tab”) and select one of the other settings profiles.

Give me back my # key!

This particular tip may have an audience of approximately one, since:

  • It’s only going to bother Mac users;
  • who are in the UK;
  • who use the # (hash, pound, whatever you call it) key much; and
  • who have enabled “Use option as meta key” in Terminal.app in order to sanely use Emacs.

If you tick all these boxes, and have suddenly discovered that your # key (which is option-3 for UK-keyboard-wielding Mac users) no longer works, you’ve come to the right place!

The problem is that M-3 is bound to digit-argument. This allows you to repeat commands (e.g. if you type M-3 c, then it will output ccc) which I’ve got to admit I don’t use terribly often.

It turns out that the bindings for keys are controlled by the readline library and you can customise them with readline’s configuration file, ~/.inputrc. If you want to override the default behaviour of M-3 and turn it back to emitting the # symbol, put the following in that file:

"\e3": '#'

You’ll need to restart bash (or force it to reload its inputrc file with C-x C-r) in order for it to take effect.

Here’s another related tip, while I’m here. I picked this one up from Ross a few years ago. I almost never use the UK currency symbol (£) in a shell environment. On the other hand, there’s another operation I perform quite often at the command line: commenting out the command I’m currently typing. It usually happens when I’m doing a sequence of steps at the command line, and I realise that I’ve forgotten a prerequisite step. Rather than sticking the current line in the kill ring (C-e C-u or C-a C-k), then remembering to yank it (C-y) again, I tend to jump to the start of the line (C-a), stick a hash in (to comment the line out), then hit enter. That way it’s in my (searchable) bash history for when I next need it.

But that’s a fairly cumbersome sequence too, so let’s shorten it. Stick the following in ~/.inputrc:

"£": '\C-a#\C-m'

and restart your shell. Now when you’re part way through typing a command line and want to switch tracks, hit the £ sign and you’re done. When you want the command back, retrieve it from your shell history (C-r to search!), then hit C-a C-d to remove the comment sign and you’re good to go.

One last wee trick. You really can remap any key to any other key. Imagine the fun and hilarity of the following in your ~/.inputrc:

"l": 'r'
"s": 'm'

:-)

Update: My colleague, Mihai, points out that the only appreciable difference between the UK and US keyboard layout on the Mac is that the # and £ key combinations are swapped around (oh, and it’s easier to find the ™ than the € now, not that I use either). Since I’ve just discovered that my irb isn’t picking up the ~/.inputrc and doing the right thing, I reckon I’ll just switch to the US layout…

Pimpin’ TextMate (aka Top 5 TextMate Plugins, 2011 Edition)

All the cool kids are using Vim these days. That’s fine, y’all go right ahead, but sorry, it’s not really my thing. I’m perfectly happy with TextMate, thanks.

Don’t get me wrong; I’ve been using various flavours of Vi for 17 years now. I’m reasonably proficient at it, and it’s my tool of choice when I’m, say, editing server configuration files. I can plan and execute a series of changes to a file like a boss, when I already know what I’m doing. The trouble occurs when I’m coding, and what I’m trying to achieve isn’t yet a fully crystallised thought. Then, with Vi, I feel boxed in and a bit blinkered.

It’s just me, I know. Maybe the root of the problem is that I code before thinking. :)

So I still use TextMate, where I feel just a little more free and creative.

Anyway, this isn’t a post about Vim vs TextMate (there’s enough of them already!), I just wanted to get that off my chest.

I seem to be going through a phase of setting up fresh installs of my various desktop and laptop computers again (blame Lion). Instead of blindly installing all the same tools as I’m used to, I figured I’d reassess the landscape and see what’s out there. Here are a few plugins for TextMate I’ve tried, and liked.

  • PeepOpen is a file finder from the lovely folks at PeepCode. It replaces the default “Go to file…” (⌘-T) pane with one that’s a good deal smarter. Instead of matching just on the filename, it’ll match on the entire path. This is invaluable if your project has 300 files called show.html.erb. If I’m looking for, say, an article’s show template in a Rails project, typing avartsh (app/views/articles/show.html.erb) will get me there quickly.

    PeepOpen is $12, but it’s free as part of your PeepCode Unlimited subscription which you’d be crazy not to have anyway, right?

  • EGOTextMateFullScreen is a plugin that gives you native full screen support on Mac OS X Lion. On my desktop computer, a 27″ iMac, I didn’t really see the point of full screen support (I much prefer a few windows tiled with the assistance of SizeUp). But then I got my hands on a MacBook Air and suddenly it made a whole lot more sense.

  • If you’re making use of the full screen support, then there’s one small snag: the drawer isn’t visible. I quite like the drawer; not to find files – because PeepOpen does a much better job – but to remind me of the context I’m in. When you’re editing a file, and need a reminder of the context, hit ctrl-⌘-R and your current file will be highlighted in the project tree.

    So if, like me, you’d miss the drawer in full screen mode, the old standby is the MissingDrawer plugin. This turns the drawer into a sidebar, which is part of the main window and, therefore, still visible in full screen mode.

  • Finally, project-wide search. The default Find in Project is a little … slow. And a bit beachball-y. For a while I’d been using a bundle that replaces the default Find in Project with one powered by Ack. Much faster, and a little more flexible too. When I went searching for it this time around, though, I discovered AckMate, a plugin which provides a more integrated experience. It’s fast. It can show you the context of the search result. It even allows you to constrain the results to particular file types. In short, it’s awesome. Be sure to read the instructions on how to set the shortcut key to the default “Find in Project…” one.

There’s one final plugin that I’m planning on evaluating, but haven’t quite gotten around to yet. The one, single, feature I liked in my brief affair with RubyMine was the ability to jump to a method definition. I’ve got a similar mechanism set up in Vim using exuberant ctags which works really well. (And back in the days before Mac OS X when Emacs was my hammer – and every C file a thumb – I lived by my TAGS files.) How I’d love project-wide “Go to symbol…”. There’s a plugin out there called TmCodeBrowser that appears to do exactly what I’m after. I’m slightly wary as it hasn’t been updated in quite some time, but perhaps that’s because it still just works. I’ll evaluate it soon and update this post when I do.

There’s one plugin I wish existed, but doesn’t seem to: split windows. I’ve got plenty of screen real estate. It would be crazy awesome to be able to have my model and its tests sitting side by side with a vertical split. If it’s possible to write a TextMate plugin that realises this, I would happily pay $12 for it (same price point as PeepOpen). Just so you know. ;-)

That’s it. I still love TextMate and I spend several hours working in it most days. Thanks for making code editing a lovely experience, Allan!

Using tcpflow

Sometimes, when you’re writing applications that use a library to talk over the wire to a remote service, it’s difficult to see how the high level API the library exposes translates into the on-the-wire protocol. Funnily enough, I was having that very problem yesterday, so I dug tcpflow out of my toolbox to better understand what was happening.

I was writing a client for the [REDACTED] (for now, at least!) API, for a client project. I’d decided to use this as an excuse to learn EventMachine and em-http-request to talk to the remote API. Given the pattern of use I’m expecting, a reactor-patterned daemon feels like a really good fit.

It was a weird experience — it’s the first time I’ve done any reactor pattern development in anger since ~2004 when I was messing around with using Python’s Twisted framework for small TCP server applications. (Ah, them were the days, before HTTP became the hammer to everybody’s wire protocol thumb nail.) But I digress…

I was having trouble understanding why the em-http-request requests I was making weren’t having the intended result. There were two specific problems I saw over the course of the day:

  • The path I was sending in, according to the documentation, using request.get :path => '/api/v2/chickens.json', didn’t appear to be what was being requested (in that everything I requested looked like it was getting the same response as a request to GET /).

  • After the initial request, subsequent requests reusing the same HttpRequest object would silently fail. This was slightly more bizarre.

In solving either of these problems, I could have dived into the em-http-request source and traced what was happening. In fact, that was my first port of call but, being rusty at the reactor pattern, I was having trouble following the flow of execution. (As it turns out, if I’d done this properly, I would have spotted the problem straight away, but we’ll get to that.)

However, in this instance, I decided to treat the library I was using as a black box and instead examine what it was generating and consuming at the other end.

Enter tcpflow. It’s been part of my arsenal of network debugging tools for as long as I can remember. It is similar to tcpdump, but that typically just shows you the IP packets on the wire. tcpflow attempts to reconstruct the actual TCP streams to give you an idea of the conversation going on.

Let’s install it. I’m on a Mac, and I’m using Homebrew, so it’s just a case of:

brew install tcpflow

Your mileage may vary, but it’s a piece of software that’s been around for a long time, so I bet there’s a package for your system. If you’re using Homebrew, it will install the binary as your user and not setuid to root, which means that, in order to access the packet capture device, you’ll have to make sure and run it through sudo. The examples I show all have sudo in ‘em since that’s what I had to do.

tcpflow uses libpcap as the underlying packet capture library, the very same one as is used by tcpdump. This means that the syntax for specifying the information you want to capture will be familiar if you’ve used that tool before.

Last piece of background information before we get into solving the problems. You’ll need to figure out the network interface you’re using as that’s the interface it will capture packets from. Short version (since a longer version is out of scope!): if you’re on a Mac, it’s highly likely to be en0 if you’re on wired Ethernet and en1 if you’re on wireless. If you’re using Linux and you’re on a wired network, chances are it’ll be eth0. My Mac laptop is on a wireless network right now, so the examples show me using en1. Examine the output of ifconfig to determine your active network interface.

So, that’s the theory, let’s see this thing in action. The first problem is that we’re not getting the response we’re expecting from making particular HTTP requests. Let’s see what’s happening on the wire with tcpflow:

sudo tcpflow -c -i en1 src or dst host api.example.com

Breaking that down:

  • -c spits the flows out on stdout. The normal operation is to create a file for each flow, in each direction. If you’re capturing a lot of data, or pipelined conversations, files are much more sane, but for small conversations like this, stdout is really convenient.

  • -i en1 is capturing from my wireless device, as discussed above.

  • src or dst host api.example.com is the expression used to describe the flows that we want to capture. In essence, this is saying that we want to capture flows of data that are sent to, or received from, api.example.com. The language is pretty rich, allowing for a bunch of complex rules to narrow down the data captured, but I tend to find that specifying a remote host and/or port is enough.

Let’s see what that gets us. I’ve created a Ruby script, the essence of which is:

EM.run do
  request = EventMachine::HttpRequest.new('http://api.example.com/')
  deferrable = request.get :path => '/api/v2/chickens.json'
  deferrable.callback { puts "It worked"; EM.stop }
  deferrable.errback  { puts "It failed"; EM.stop }
end

Let’s see what the on-the-wire communication turns into:

172.017.012.012.53284-010.011.012.234.00080: GET / HTTP/1.1
User-Agent: EventMachine HttpClient
Host: api.example.com


010.011.012.234.00080-172.017.012.012.53284: HTTP/1.1 301 Moved Permanently
Date: Sat, 05 Mar 2011 11:59:55 GMT
Content-Type: text/html
Content-Length: 178
Connection: close
Location: http://www.example.com/

[ body content elided ]

That’s annoying, because I’d specifically requested /api/v2/chickens.json, but it’s actually requesting / (the GET / HTTP/1.1 line). What’s with that? Well, it turns out that the published stable gem for em-http-request doesn’t actually implement the :path option; I need to install the beta 1.0 gem to get that. Oops. Moral of the story: if you’re going to read the source to the gem you’re using, read the source to the installed version you’re using, not just the latest version on GitHub!
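To make that moral a bit more concrete, RubyGems can tell you where the source of the installed version actually lives. A minimal sketch, with installed_gem_sources being a name invented for this example:

```ruby
require 'rubygems'

# Hypothetical helper: list each installed version of a gem together with the
# directory its source lives in, so you read the code you are really running.
def installed_gem_sources(gem_name)
  Gem::Specification.find_all_by_name(gem_name).map do |spec|
    { version: spec.version.to_s, path: spec.gem_dir }
  end
end

# Example:
#   installed_gem_sources('em-http-request').each do |gem|
#     puts "#{gem[:version]}  #{gem[:path]}"
#   end
```

If more than one version is installed you’ll get one entry per version; which of them your script loads still depends on your Gemfile or require order.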

The second problem is a little more subtle. I got to the stage where the first request was working successfully, but subsequent requests reusing the same HttpRequest would fail miserably, just waiting for a response that never arrived. An example:

EM.run do
  request = EventMachine::HttpRequest.new('http://api.example.com/')
  deferrable = request.get :path => '/api/v2/chickens.json'
  deferrable.callback do
    deferred_eggs = request.get :path => '/api/v2/chickens/1/eggs.json'
    deferred_eggs.callback { puts "It worked"; EM.stop }
    deferred_eggs.errback  { puts "Egg retrieval failed"; EM.stop }
  end
  deferrable.errback  { puts "It failed"; EM.stop }
end

Let’s take a wee look at the over-the-wire conversation as captured by tcpflow:

172.017.012.012.54442-010.011.012.234.00080: GET /api/v2/chickens.json HTTP/1.1
User-Agent: EventMachine HttpClient
Host: api.example.com


010.011.012.234.00080-172.017.012.012.54442: HTTP/1.1 200 OK
Date: Sat, 05 Mar 2011 13:39:47 GMT
Content-Type: application/json;charset=ISO-8859-1
Connection: close
Set-Cookie: JSESSIONID=B4D6AE4456BC6D93E6B3441D4FEC6946; Path=/api/v2
Content-Language: en-US

[ body content elided ]


172.017.012.012.54442-010.011.012.234.00080: GET /api/v2/chickens/1/eggs.json HTTP/1.1
User-Agent: EventMachine HttpClient
Host: api.example.com

Here we see the initial request, the response and the second request, but no response. That’s really weird. Isn’t it? Then I spotted something. Do you see the initial line of each flow, with a string of numbers? That’s the source IP and port, then the destination IP and port. Let’s break the first one apart:

172.017.012.012.54442-010.011.012.234.00080

which translates to:

  • the client IP address is 172.17.12.12. That’s the internal IP of my laptop on my home network.
  • the client TCP port is 54442. Both sides of a TCP connection get a port, and the client side usually has a random, unused, high port number chosen for it by the kernel. Each TCP connection gets a different client port, and they’re typically not reused for a while after you’re done with them.
  • the server IP address is 10.11.12.234 which I cleverly remembered to change just now, lest you discover who [REDACTED] is. ;)
  • the server TCP port is 80, the well known port for HTTP.
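The breakdown above can be done mechanically too. Here’s a small Ruby sketch that splits one of those labels into its parts; parse_flow_label and endpoint are names made up for this example, and it assumes the zero-padded IPv4 format shown in the captures here.

```ruby
# Matches the connection label shown in the captures above: zero-padded
# source IP and port, a dash, then zero-padded destination IP and port.
FLOW_LABEL = /\A(\d{3}(?:\.\d{3}){3})\.(\d{5})-(\d{3}(?:\.\d{3}){3})\.(\d{5})\z/

# Strip the zero padding from one side of the label. String#to_i always
# parses base 10, so "017" becomes 17 rather than being read as octal.
def endpoint(padded_ip, padded_port)
  ip = padded_ip.split('.').map(&:to_i).join('.')
  { ip: ip, port: padded_port.to_i }
end

# Hypothetical helper: returns nil when the text is not a flow label.
def parse_flow_label(label)
  match = FLOW_LABEL.match(label.strip.chomp(':'))
  return nil unless match

  src_ip, src_port, dst_ip, dst_port = match.captures
  { src: endpoint(src_ip, src_port), dst: endpoint(dst_ip, dst_port) }
end

flow = parse_flow_label('172.017.012.012.54442-010.011.012.234.00080')
puts "#{flow[:src][:ip]}:#{flow[:src][:port]} -> " \
     "#{flow[:dst][:ip]}:#{flow[:dst][:port]}"
# prints "172.17.12.12:54442 -> 10.11.12.234:80"
```

Two flows belong to the same TCP connection when the pair of endpoints matches (in either direction), which is exactly the comparison that matters for this problem.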

Now, let’s look at the second request’s connection information:

172.017.012.012.54442-010.011.012.234.00080

Isn’t that interesting? They’ve both got the same client TCP port. Now, we’re not doing keepalive, and the server responded with Connection: close (which, sadly, it does, even if I do attempt to switch keepalive on). So, as far as the server’s concerned, the TCP session is done and it has closed the connection. But the client, em-http-request, hasn’t closed its end, and has decided to send another request along the same TCP connection. Since the server has already closed its side of the connection, it never sees the second request and, naturally, never responds.
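In other words, before reusing a connection the client needs to look at the Connection header in the previous response. Here’s a rough sketch of that check in plain Ruby. To be clear, this is not em-http-request’s code: connection_reusable? is a name I’ve made up, and it assumes HTTP/1.1 semantics, where a connection stays open unless one side says “close”.

```ruby
# Hypothetical helper: given the raw header block of an HTTP/1.1 response,
# decide whether the server has said it will close the connection.
def connection_reusable?(raw_headers)
  raw_headers.each_line do |line|
    name, value = line.chomp.split(':', 2)
    # Skip the status line, blank lines and every header but Connection.
    next unless value && name.strip.casecmp('connection').zero?
    # The header holds a comma-separated list of tokens; "close" is the one
    # that rules out reuse.
    return false if value.split(',').any? { |token| token.strip.casecmp('close').zero? }
  end
  true
end

response_headers = <<~HEADERS
  HTTP/1.1 200 OK
  Date: Sat, 05 Mar 2011 13:39:47 GMT
  Content-Type: application/json;charset=ISO-8859-1
  Connection: close
HEADERS

puts connection_reusable?(response_headers)
# prints "false"
```

For the response captured above it answers false, so the second request would need a fresh TCP connection.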

I smell a bug with em-http-request that I’ve reported here: #77 Reusing HttpRequest.

I hope this has served as a useful introduction to tcpflow. It’s a great tool for discovering what really happens on the wire when you’re using higher level APIs and libraries. It really comes into its own when the library is, say, closed source and you have to deal with it as a black box. There are plenty of higher level tools with GUIs that can help you with this kind of task too, but I really like that tcpflow does one thing and does it really well. Thanks, Jeremy!

Vagrant box for Ubuntu 10.10 Maverick (64 bit)

I didn’t really intend this to turn into a series, but hey ho. I needed to build a 64-bit Ubuntu 10.10 vagrant image for a client project I’m working on. (I still need to get around to the RHEL-a-like image, I’ll get there one day!)

So, without further ado, here we are: Vagrant Base Box for Ubuntu 10.10 “Maverick” (64-bit). Getting started with it is pretty simple. First of all, make sure you’ve got VirtualBox 4.0.4 or greater installed, along with Vagrant 0.7.0 or greater. Then all you need to do is:

vagrant box add maverick64 http://mathie-vagrant-boxes.s3.amazonaws.com/maverick64.box

This will take a while to download the box and unpack it in the way that Vagrant likes to do. Finally, let’s just test it out:

mkdir maverick_demo
cd maverick_demo
vagrant init maverick64
vagrant up
vagrant ssh

which should wind up with you ssh’d into a pristine minimal Ubuntu 10.10 environment.
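For reference, the Vagrantfile that vagrant init maverick64 leaves in maverick_demo boils down to something like the sketch below. I’m assuming the Vagrant 0.7-era configuration syntax here; the generated file also carries a pile of commented-out options that I’ve left out, and later Vagrant releases use a different config block.

```ruby
# Minimal Vagrantfile sketch, assuming Vagrant 0.7-era syntax.
Vagrant::Config.run do |config|
  # The name given to `vagrant box add` above.
  config.vm.box = "maverick64"
end
```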

Changelog

As I update the box, I’ll update the change log here, newest changes at the top.

24th February, 2011