MySQL and time zones

It turns out that MySQL has support for time zones, but doesn’t have the necessary information to support named time zones out of the box. The instructions on getting time zone support are described in MySQL Server Time Zone Support section of the manual, but the short version is to run this script and then pass the results into MySQL to create the appropriate tables in the mysql db:

mysql_tzinfo_to_sql /usr/share/zoneinfo |mysql -u root mysql

This resets the contents of the time_zone, time_zone_name, time_zone_transition and time_zone_transition_type tables in the system mysql database. It constructs the tables from your operating system’s time zone information database, so if that ever changes, it would be prudent to re-run this script. (Note to self: I need a puppet manifest for updating it when the tzdata package is updated…!)

Having done this, I now get access to the symbolic names for time zones. Which is useful for using CONVERT_TZ:

SELECT CONVERT_TZ('2012-06-01 12:00:00', 'America/Los_Angeles', 'Europe/London');

which will give us back 8pm London time. Better still, the time zones are then DST aware. So, I can now do:

SELECT CONVERT_TZ('2012-06-01 12:00:00', 'Etc/UTC', 'Europe/London');

and get back 1pm local time.

There are a couple of situations where this can be useful. From here on, I’m assuming that you use DATETIME columns and that you normalise all these DATETIME columns to UTC because I use Rails and that’s what Rails ActiveRecord does. See the last paragraph where I discover TIMESTAMP and wonder if this is a sensible pattern in the first place.

The first situation is where you’re taking inputs in local time (local time in this example is Europe/London, which only gets interesting when DST is in effect in the summer). If you do:

SELECT * FROM readings
  WHERE started_at = '2012-06-01 12:00:00';

then you are implicitly assuming that the argument your passing is already in UTC. What if it’s in the user’s local time? Well:

SELECT * FROM readings
  WHERE started_at = CONVERT_TZ('2012-06-01 12:00:00', 'Europe/London', 'Etc/UTC');

which will convert the time to UTC before performing the query. I think that since we’re converting the input value and leaving the (indexed) column alone in the WHERE clause, this isn’t going to have any pejorative impact on index use. Of course, if ActiveRecord is your hammer, you’re already used to the process of converting to UTC time for SQL queries.

The second situation is in converting the results back to local time. If you do:

SELECT started_at FROM readings;

then you’re going to get back the readings in UTC. However, if you do:

SELECT CONVERT_TZ(started_at, 'Etc/UTC', 'Europe/London') AS local_started_at
  FROM readings;

then you get the time back as a local time. If ActiveRecord is your thing, this is all handled inside it, so you don’t need to worry about it at the DB level.

The third situation is a little more interesting. Say you’re doing date based aggregation in SQL (grouping by YEARWEEK or WEEKDAY for example). If you do something along the lines of:

SELECT YEARWEEK(started_at) AS year_week_started_at, SUM(value) AS value
  FROM readings
  GROUP BY YEARWEEK(started_at);

then your results will be subtly wrong because you’re grouping into weeks based on the UTC time. In the case of BST, the hour at the start and end of the week is being allocated to the wrong week… So, instead we convert:

SELECT YEARWEEK(CONVERT_TZ(started_at, 'Etc/UTC', 'Europe/London')) AS year_week_started_at, SUM(value) AS value
  FROM readings
  GROUP BY YEARWEEK(CONVERT_TZ(started_at, 'Etc/UTC', 'Europe/London'));

Now we’re correctly grouping by local time.

From an application perspective, I would imagine setting up these queries to pass in the “local” time zone based on the user’s time zone setting — perhaps grabbed from the user’s model, or the session.

Of course, MySQL has a built in type called TIMESTAMP which does most of this work for us. It stores timestamps internally as UTC, but converts them on the fly into the session time_zone setting. I’m not sure if that safely covers the grouping behaviour in the third example. Sadly, being an ActiveRecord weenie, I think I’m stuck with the convention of using DATETIME columns and the UTC/local time conversion happening inside ActiveRecord. Which means that when I need to dig down for performant SQL queries with times, I need to be aware of time zone issues.

Things I still need to figure out (help?):

  • Whether somebody has already tweaked ActiveRecord to make use of time zone aware TIMESTAMP columns natively instead of DATETIME columns with internal conversion. Or, if not, how hard it would be to implement. I’d imagine having something that wraps the request and sets the time zone on the AR session(s) to the user’s preferred time zone, then carries on its queries as usual, but gets back a timestamp which includes the time zone information and creates a TimeWithZone instance from that…?

  • If there is any pejorative impact on index use with any of the changes to these queries. Or any other horrible performance implications.

  • If using TIMESTAMP columns solves the grouping issue covered in the third situation.

Chances are I’ll be ‘refactoring’ an application based on this research in the next couple of days, so there may be an update or two to this post as that happens. :)

Pulling an Espresso

Has Bean, my goto place for beans and guides to brewing coffee, doesn’t have a guide for pulling espresso (which is fair enough, I suppose, seeing as they’re brew guides!). So here’s what I do. I wrote this note in EverNote a few weeks back, and was showing it to Annabel this morning (in the hopes of getting a coffee in bed!). She suggested I publish it here.

I’m building espressos with a Rancilio Silvia, and I’ve got a Rancilio Rocky grinder without a doser. The grinder settings vary wildly between models, so test and adjust as necessary!

Numbers

  • Grinder 8
  • Dose 18g coffee beans. Our scales are a bit variable, so 18g can range all the way from ‘not enough’ to ‘too much’. Err on the side of too much; you can discard the excess levelling it off.

Method

  1. Run some water through the espresso machine into the cup. This warms up the portafilter a bit, and warms up the cup.

  2. Set the grinder to the right setting while it is still empty. Weigh the beans and dump into the grinder. Make sure none get stuck on the guard.

  3. Remove the portafilter and dry it off with a cloth. This stops any of the ground coffee from clumping.

  4. Grind the beans directly into the portafilter, giving it a wee shoogle now and then so the grinds stay relatively level.

  5. Give the portafilter a wee tap on a hard surface and level the grinds off with your finger — the grinds should be about level with the top of the portafilter at this stage.

  6. First tamp. I tamp on a set of bathroom scales for two reasons: I get some feedback on how much pressure I’m applying; and I don’t damage the kitchen worksurfaces! I aim to apply about 3 stone (40ish lb) pressure when tamping.

  7. Tap the side of the portafilter a little with the tamper to loosen up the edges.

  8. Second tamp. Much like the first. Sometimes I slightly rock the tamper pointing to North, East, South & West because I read somewhere that’s a good technique. Can’t tell if it makes any difference though. :)

  9. Dust any grinds off the top of the portafilter. Keep it nice and clean so that grinds don’t unnecessarily clog up the machine, or reduce pressure.

  10. Firmly reattach the portafilter to the espresso machine.

  11. Discard the water that was keeping the cup warm, and place under the portafilter.

  12. Pull an espresso. Ideally, you’d expect the coffee to start coming through after about 5 seconds, and for it to continue to pull the espresso for about 20 seconds. But my way of knowing the pull is “done” is to watch for the stream of coffee to change colour. It’ll change from a rich looking crema colour to a weak beige. That’s a good time to stop.

  13. Take a slurp of the crema from the cup right when it’s freshest. (Unless you’re making the coffee for somebody else. Oh, go on, even if it’s for somebody else, if they’re not watching!)

  14. Remove the portafilter and dump the grinds. Rinse the portafilter under the group head, then wipe them both clean. Replace the portafilter and you’re ready to pull the next espresso!

Things to note:

  • Once you get into the swing of it, you can time things right so that the Silvia is just at the right temperature when you’re pulling the espresso. I tend to pull enough water in step 1 that the boiler clicks on. By the time I get to step 12, the boiler has just clicked off. This seems to get a better pull of espresso.

  • Cleaning is a key part of the ritual! Make sure you’ve got a clean cloth kicking around, and keep the espresso machine clean, and free of grinds.

It seems like a very long winded process, but in reality it takes just a couple of minutes. In fact, it’s about the same length of time as boiling the kettle and making a cup of tea, something I often do in parallel.

What would you do differently?

All late projects are the same

This is the first in a series of posts where I actually digest and write about articles I’ve read elsewhere. What better way to understand something you’ve read than try to write about it? Of course, with this blog’s track record, it may well be a series of one. :)


Tom’s been working with software for the last 50 years. For the past couple of decades he’s been involved in supporting litigation for software project delivery failures. In this article, he eschews his original, pithy, reason for software projects being late:

I thought all late projects were the same in that they were really estimation failures, not performance failures.

He goes on to explain why that’s not really true — mostly because people never really do estimations in the first place:

This was cute but not very accurate at a deep level because so many projects don’t really do any estimating at all. Rather, they propose a goal and then get someone to espouse it as an estimate. Delivery by January of next year? Sure, why not?

Sounds about right in my experience. In fact, it rings true for a current project, in that the delivery is set for July 2013 and we’re all to work to that target. I’ve said, “well, let’s see what we can build in ~70 days of effort.” I mean that we need to vary the scope (never the quality!) if the schedule is fixed. I’m being up front about that (I hope).

Of course, I never like to bring up estimation as an excuse for late delivery because I hate estimating and I’m terrible at it. ;)

But Tom goes on state a simpler maxim for why software projects are late:

All projects that finish late have this one thing in common: they started late.

Personally, I’d question whether this rule is peculiar to software projects. It seems generally true. Which then gets you to wondering why software projects still have a peculiar tendency to be late.

Who says software projects tend to be later than other projects, anyway? Anecdotally, if you watch Grand Designs every single building project turns up massively late and over budget. There have been one or two high profile non-software delivery failures just in Edinburgh recently, too.

Anyway, Tom then posits three reasons for projects being kicked off late:

  1. Nobody had the guts to kick off the project until the competition proved it doable and desirable; by then, the project was in catch-up mode and had to be finished lickety-split.

  2. If the project were started long enough before its due date to finish on time, all involved would have had to face up to the fact from the beginning that it was going to cost a lot more than anyone was willing to pay.

  3. No one knew that the project needed to be done until the window of opportunity was already closing.

He called BS on the third one — citing Google as being 15 years “too late” to the search engine market — and suggested that it really disguises one of the first two.

The first reason is a clear business failure, and attaching responsibility for the failure to the software delivery team does seem somewhat like blame shifting. Of course, the company that kicks off risky projects before they’ve identified a market need (before it’s been proved “doable and desirable”) is often the company that doesn’t survive long enough to get into litigation with their suppliers, so there’s an element of self selection in Tom’s samples, I suspect.

The second one is the most compelling, to my mind. If a project has a 100x return on investment, people aren’t going to get too upset if it takes twice as long as originally estimated; you’re still getting a 50x ROI, which isn’t to be sniffed at. But if the project only has a 10% return on investment, lateness will be disastrous!

Maybe the rule is that the rigour of estimation should be inversely proportional to the expected return on investment?

He finishes off with a nicely quotable line, so I will too:

What’s really wrong with us software folks is that we’re continually beating ourselves up for something that’s somebody else’s fault.

No, not that one. While I get the sentiment, and I totally get that we shouldn’t be beating ourselves up over other people’s failures, it still sounds an awful lot like:

A big boy done it and ran away.

Shower Power

A few weeks ago, I watched a talk from CUSEC 2012 by Bret Victor called Inventing on Principle where he described the principle by which he lives his life. You should go watch the talk now, it’s awesome. I’ll wait.

Done? Wasn’t it a great talk? That thing with the game, and the future paths of the character … wasn’t that just awesome? I got a bit lost on the overall point of the talk (you should have a guiding principle and you should live your life by it), because I loved Bret’s guiding principle; the whole idea of immediate connections really resonated with me.

Then I realised another thing that’d been on my todo list to read (because it was too beautiful to shove into Instapaper) was Bret’s work too: Up and Down the Ladder of Abstraction. It’s all about gaining a deeper understanding of a concept by exploring it at different levels of abstraction. But again, skipping the point itself, take a look at and interact with the examples. Instead of some dry, boring graphs, or even some sexy (but meaningless) infographics, there are visualisations of the model that you can play with. And playing with these models helps you properly understand how variables interact, helping you to deeply understand the model.

Then I read another of his articles, the Scrubbing Calculator and something went ping in my head. (More on that another day, perhaps.)

I had to try it out on a wee project. It turns out that Bret also has a Javascript library called Tangle which helps you to create “reactive” documents. These reactive documents allow you to create a narrative flow around a model, letting users pick numbers from the model and change them (by dragging), which is then immediately reflected throughout the rest of the narrative. So the readers of your document get instant feedback on the impact of changing the input values on your model. That’s kinda cool.

As it happens, my son had spent … rather a long time in the shower that morning. We had a wee bit of a “discussion” about the money that was being wasted, but I wasn’t really able to communicate the consequences effectively (“it must be costing a fortune!”, I said to my wife, Annabel). Aha, I have my model. And so Shower Power was born.

It took me a wee while to research the model and understand how to get from “Malcolm spends 30 minutes in the shower” to “which costs £0.61″. Either I wasn’t paying attention in Standard Grade Physics at school, or I haven’t used those brain cells in the intervening 18 years. (Was it really that long ago?!) I got there in the end, and I’ve outlined the algorithm on the about page.

It works, and I rather like it. Have a play around and let me know what you think? I like the idea that there’s some narrative around it, rather than it just being a table of numbers. Given a little bit more thought, the narrative could be better crafted to bubble the important variables closest to the top, and defer the rarely used ones further down. I particularly like that it’s not just numbers you can change, but discrete options too (e.g. how often you shower).

With the people I tested it on, it clearly wasn’t obvious that you could pick an underlined number and drag it to change the effect – maybe a bit of signposting with guiders.js would have helped there?

I was particularly surprised by one of the variables in the model: it doesn’t matter what temperature your boiler heats the water to (well, it has to be higher than the desired shower temperature!). If I hadn’t played around with the model in this way, I’d never have noticed that – in fact, I spent a couple of hours debugging why changing that value didn’t change the results! So now we set the boiler temperature that bit lower, to reduce the chances of scalding from hot taps.

Anyway, it was a fun experiment, and I’m quite pleased with the results.

I’m feeling inspired to build something awesome (I’d tell you, but … bad experiences with that in the past … let me build an MVP first) with this sort of idea in mind. What do you reckon? Could be awesome? Crazy? Just a bit meh?

(Coming soon, a follow up post on how I built what is effectively a single-page static site with Rails (because when your only tool is a hammer, everything is a thumb) and how I used it as an excuse to explore the asset pipeline, efficient client side caching and CDNs.)

Retrieving BigDecimals from a database with Anorm & Scala

I spent a wee while yesterday kicking the tyres on the Play 2.0 framework with Scala. Aside: I think that concurrency/asynchronous processing is really important in the age of multi-core CPUs, and that Akka’s actor-based system is really interesting, and Play uses Akka 2 under the hood, so I’m becoming really interested in becoming proficient in Scala & Play 2. That said, so far I’m struggling to come to terms with Scala’s type system, and it reminds of the struggles I had with ML at University, so this could be an uphill battle!

Anyway, that’s not what this is about. The toy application I’m trying to build (which I’ll share if and when I’m done!) involves storing locations (a latitude and longitude) as decimals. I’ve got my table (stored as an evolution) along the lines of:

CREATE TABLE sites (
  id SERIAL PRIMARY KEY,
  name VARCHAR(255) UNIQUE NOT NULL,
  latitude NUMERIC NOT NULL,
  longitude NUMERIC NOT NULL,
  UNIQUE(latitude, longitude)
);

(I’m using PostgreSQL.) And here’s the first implementation of the model, app/models/Site.scala, which just has a case class to encapsulate the table data, and a singleton object with a finder to return a sequence of site (so that I can list them in a table):

package models
import play.api.db._
import play.api.Play.current
import anorm._
import anorm.SqlParser._

case class Site(
  id: Pk[Long],
  name: String,
  latitude: BigDecimal,
  longitude: BigDecimal
)

object Site {
  val allFieldsParser = {
    get[Pk[Long]]("sites.id") ~
    get[String]("sites.name") ~
    get[BigDecimal]("sites.latitude") ~
    get[BigDecimal]("sites.longitude") map {
      case id ~ name ~ latitude ~ longitude =>
        Site(id, name, latitude, longitude)
    }
  }

  def findAll(): Seq[Site] = {
    DB.withConnection { implicit connection =>
      SQL("SELECT * FROM sites").as(Site.allFieldsParser *)
    }
  }
}

(Totally cargo-culted, BTW.) This means that in the controller I should be able to call Site.findAll() and get back a sequence of sites I can then use to map to HTML fragments. Win.

But I’m getting a compilation error! In the browser, it’s saying could not find implicit value for parameter extractor: anorm.Column[BigDecimal] and pointing to the error being on get(BigDecimal)("sites.latitude") in the parser.

I couldn’t find any documentation on retrieving decimals from a database with anorm (which is mostly why I’m writing this) and looking through the source code didn’t exactly elucidate. However, after a bit of trial and error, I discovered that it works if I add:

import java.math.BigDecimal

at the start of the model file.

So, if you’re having trouble making BigDecimal columns work in your Anorm-based models, remember to explicitly import java.math.BigDecimal. Hopefully this will find its way into Google, making the next Scala newbie’s introduction a little less painful. :)

I’d love if somebody could explain why. Perhaps it would improve my understanding of Scala!

Update Aha, I have a theory. I think that, by default, scala.math.BigDecimal is available in my model’s scope. And scala.math.BigDecimal isn’t a subtype of java.math.BigDecimal, which is what Anorm deals in. So importing the Java version overrides the Scala version and makes the type system happy.

Now I’m wondering: why the two unrelated (in type-land anyway) versions of BigDecimal? Which should my apps be using? Which should Anorm be exposing?

Today I Learned: Vim command line fu

OK, so this was “Thursday I learned” but I figured I should write it down before I forget again. One of my frustrations with Vim is better learning to use the command line. In this particular instance, I had been searching for a phrase inside a single file and I wanted to instead search for it with git grep (using :Ggrep from the awesome fugitive.vim). Of course, I didn’t want to retype the thing I’d already been searching for. Turns out there are a couple of viable options:

  • ctrl-r / will insert the last search pattern. So, having searched for something within a file and now wanting to search for it throughout the repository, I could do :Ggrep ctrl-r /<cr>. I think I want to turn that into a shortcut of some variety…

  • The other possibility is to insert the word under the cursor. ctrl-r ctrl-w will insert the “word” under the cursor and ctrl-r ctrl-a will insert the “WORD” under the cursor (with ‘word’ and ‘WORD’ meaning what they usually do).

Here’s a random bunch of other useful expansions:

  • ctrl-r " will insert the contents of the unnamed register (i.e. the last thing you yanked or deleted without specifying a register).
  • ctrl-r + will insert the clipboard contents.
  • ctrl-r % will insert the current filename.
  • ctrl-r ctrl-p will insert the filename under the cursor, expanded in the same way as gf does. This could be particularly useful with filetype plugins that extend the behaviour of gf (like Rails.vim).

It was useful to read through the rest of the command line reference to reinforce the rest of the command line movement keys, too. So far I’d mostly just been mashing keys, assuming it behaves a bit like bash command line editing. Mostly I was right:

  • ctrl-b (not ctrl-a) to get to the start of the command line buffer.
  • ctrl-e to get to the end of the command line buffer.
  • ctrl-w to delete the word before the cursor.
  • ctrl-u to delete the characters from before the cursor to the start of the line (in other words, ctrl-e ctrl-u will delete the entire contents of the command line).
  • ctrl-c to get safely out of the command line.

Update Drew, from Vimcasts fame, has pointed out that ctrl-r works in insert mode too. That’s mind-blowingly useful. For example, you can use ctrl-r " in insert mode to paste something without exiting insert mode. I wish I’d already known that!

Symlink corruption on Mac OS X

Mac OS X on my desktop computer (a newish 27″ iMac, using a Promise Thunderbolt disk array for the root filesystem) seems to be having filesystem troubles. I notice it through symlinks going awry, though I’m sure they’re not the only victim. I tidied all the errant symlinks up two weeks ago, hoping it was a temporary glitch, but they’re back again today. Here’s an example:

> find -L /System -type l -print0 |xargs -0 ls -l
lrwxr-xr-x  1 root  wheel  24 15 Jan 09:42 /System/Library/Frameworks/ApplicationServices.framework/Frameworks/CoreGraphics.framework/Headers -> >File</string>????<key>L
lrwxr-xr-x  1 root  wheel  24 15 Jan 09:42 /System/Library/Frameworks/ApplicationServices.framework/Frameworks/HIServices.framework/Headers -> ?6?s?A??]h?_?:d9?r?
lrwxr-xr-x  1 root  wheel  24 15 Jan 09:42 /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/CoreGraphics.framework/Headers -> >File</string>????<key>L
lrwxr-xr-x  1 root  wheel  24 15 Jan 09:42 /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/HIServices.framework/Headers -> ?6?s?A??]h?_?:d9?r?
lrwxr-xr-x  1 root  wheel  24 15 Jan 09:42 /System/Library/Frameworks/ApplicationServices.framework/Versions/Current/Frameworks/CoreGraphics.framework/Headers -> >File</string>????<key>L
lrwxr-xr-x  1 root  wheel  24 15 Jan 09:42 /System/Library/Frameworks/ApplicationServices.framework/Versions/Current/Frameworks/HIServices.framework/Headers -> ?6?s?A??]h?_?:d9?r?

Each of those symlinks are pointing to some garbage. (Interestingly, the garbage quite often looks like the partial contents of a plist file.)

Here’s another example, and this is one I remember fixing last time:

lrwxr-xr-x  1 root  wheel  27 12 Nov 19:06 /System/Library/Frameworks/JavaVM.framework/Frameworks -> Versions/Current/Frameworks
lrwxr-xr-x  1 root  wheel  24 15 Jan 09:43 /System/Library/Frameworks/JavaVM.framework/Headers -> Versions/Current/Headers
lrwxr-xr-x  1 root  wheel  23 12 Nov 19:06 /System/Library/Frameworks/JavaVM.framework/JavaVM -> Versions/Current/JavaVM
lrwxr-xr-x  1 root  wheel  26 15 Jan 09:45 /System/Library/Frameworks/JavaVM.framework/Resources -> Versions/Current/Resources
lrwxr-xr-x  1 root  wheel   1  8 Jan 14:57 /System/Library/Frameworks/JavaVM.framework/Versions/Current -> c

The problem here isn’t the first four symlinks – they’re all pointing to the right places – but the last one (which they’re all pointing through) which is pointing to ‘c’, not ‘A’ like it should.

The symlink targets all seem to be the right length, just the wrong characters.

How do I go about communicating with Apple about the problem so I can get it resolved? It doesn’t really seem the sort of thing I can take to a Genius Bar…