Hypermedia APIs, Part Two

Posted in Articles, Development, Ruby, Web

Last time, I treated you guys to some Solomonesque baby‐splitting between Steve Klabnik and DHH, and then spent two dozen paragraphs talking about how Gowalla’s API was pretty groundbreaking and how Scott Raymond was like the Lou Reed of hypermedia APIs.

To balance out all this ridiculous self‐praise, I’ll talk about one of our screwups: how a well‐intentioned decision to add a feature led us into a practical dilemma within our API. This wasn’t something I worked on myself, but this is my rough recollection of how it happened.

The idea

As you’ll recall from part one, our spot URLs were quite simple: http://gowalla.com/spots/16384. One day, one naïve day, someone said, “Hey, wouldn’t it be great if our spot URLs had the name of the spot in them? If the URL for the Wahoo’s down the street was /spots/wahoos-fish-tacos-austin instead of /spots/1019?”

This is easily justifiable as an SEO decision — when someone searches for a venue, you’d want the Gowalla page for that venue to be on the first page of results, and that’s more likely to happen when the venue name is in the URL. It’s also justifiable on usability grounds; if I’ve been to the Wahoo’s page on Gowalla, and I’m trying to get back to it, I know I can type gowalla.com/spots into my browser and let the URL auto‐complete tell me where to go from there. (Though today’s Firefox‐style address bars make this a less compelling argument.)

We decided to generate a unique slug for each spot by downcasing, removing punctuation, replacing spaces with hyphens, adding the city name to the end, and (if necessary) adding further components to the end until the slug was unique: state, address (if we knew it), postal code, and finally the spot ID, should all else fail to disambiguate it.
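A minimal sketch of that recipe in Ruby, purely for illustration. The method names (`slugify`, `unique_slug`), the hash-shaped spot, and the `taken` set of existing slugs are all assumptions, not Gowalla's actual code.

```ruby
require "set"

# Downcase, drop punctuation, and turn runs of whitespace into hyphens.
def slugify(text)
  text.to_s.downcase.gsub(/[^a-z0-9\s-]/, "").strip.gsub(/[\s-]+/, "-")
end

# Start with name + city, then append further components (state, address,
# postal code, and finally the ID) until the slug is not already taken.
# `spot` is a hypothetical Hash; `taken` is a hypothetical Set of used slugs.
def unique_slug(spot, taken)
  slug = [slugify(spot[:name]), slugify(spot[:city])].reject(&:empty?).join("-")
  extras = [spot[:state], spot[:address], spot[:postal_code], spot[:id]]
  extras.each do |extra|
    break unless taken.include?(slug)
    part = slugify(extra)
    slug = "#{slug}-#{part}" unless part.empty?
  end
  slug
end

wahoos = { id: 1019, name: "Wahoo's Fish Tacos", city: "Austin", state: "TX" }
unique_slug(wahoos, Set.new)
unique_slug(wahoos, Set["wahoos-fish-tacos-austin"])
```

The first call yields the plain name-plus-city slug; the second falls through to the state because the shorter slug is already in use. Unknown components (a missing address, say) are simply skipped.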

We had some logic in our routes so that when the app saw /spots/1019, it would know to look the spot up by its ID, but when it saw /spots/wahoos-fish-tacos-austin, it would know to look the spot up by its slug. So that we had a hard‐and‐fast rule to separate these cases, we made sure that a slug wouldn’t start with a number. As I recall, this was all Mattt Thompson’s handiwork, and he brought his usual brilliance to the task.
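The rule is simple enough to sketch. This is not the original routing code; `Spot`, `SPOTS`, and `find_spot` are made-up stand-ins, and the in-memory array takes the place of a real database lookup.

```ruby
# Hypothetical stand-ins for the spot model and its table.
Spot = Struct.new(:id, :slug)
SPOTS = [Spot.new(1019, "wahoos-fish-tacos-austin")]

# Slugs never start with a digit, so a leading digit means "this is an ID".
def find_spot(identifier)
  if identifier.match?(/\A\d/)
    SPOTS.find { |spot| spot.id == identifier.to_i }
  else
    SPOTS.find { |spot| spot.slug == identifier }
  end
end

find_spot("1019")                      # looked up by ID
find_spot("wahoos-fish-tacos-austin")  # looked up by slug
```

Both calls resolve to the same spot, which is why old ID-based URLs kept working after slugs existed.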

We’d generate the slug immediately after spot creation. I’m guessing we wrote a Resque job to add slugs to spots that already existed. And when all spots had slugs, we triumphantly turned it on. And then, a few hours later, we turned it off, somewhat less triumphantly.

Our blunder

If you read part one, you might already suspect what happened. In our API, URLs are the primary identifiers of resources. If you asked the API for a particular checkin at Wahoo’s, the response would have a spot object whose URL would be /spots/1019. When we changed a spot URL, we changed that unique identifier.

Now, because we were big on HATEOAS, this wasn’t a huge deal. And spots, in particular, were naturally linked to from other resources. You wouldn’t ever start an API-related task on an individual spot page; you’d probably start at /users/me (the magical URL for getting info about the logged‐in user) or at /spots (the URL for doing a spot search). And even if you had an old spot URL lying around, it would work fine, because we still responded to the ID‐based URLs. So as long as we made the switch all at once, this was not a big deal.

Unfortunately, we didn’t make the switch all at once. Though we had updated the logic on the spot model so that asking for the URL would return the slug‐based URL, we forgot about the one place where we took a shortcut.

Visited spots

When I gave you the excerpt from a sample call to /users/savetheclocktower in part one, I left in a weird property: something called visited_spots_urls_url. This was a URL that, when requested, would return a list of URLs for all the spots that a user had checked into.

// GET /users/savetheclocktower/visited_spots_urls
{
  "urls": [
    "/spots/1019",
    "/spots/15555",
    "/spots/91142",
    // ...
  ]
}

Why on earth did we have this? Because there are plenty of use cases for a deduplicated list of all the places a user has been, without any time/frequency context. For one, we used it in the mobile clients so that, in a list of spots, we could mark the spots you’d checked into before, hopefully helping you pick the correct spot out of a list so that you could check in just a bit faster.

Maybe we deserved it for having a magical API resource that returned a list of indeterminate size, but this was a solution that had worked for us so far.

The first time someone asked for this resource, we made a database query (on the stamps table, not the gigantic checkins table, thank god) to obtain the result. Then we stuffed it into memcached so that we wouldn’t have to make that awful database query again until the user checked into a new spot.

Except, well, not quite. We didn’t put spot URLs into memcached; we put spot IDs into memcached, because we wanted that collection around for server‐side purposes, too, and on the server side you’d rather be working with IDs than URLs. To go from spot IDs to spot URLs, we did this:

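# Builds each URL straight from the cached ID, bypassing the spot model,
# so it never sees whatever URL logic lives on the model.
def uniq_visited_spot_urls
  uniq_visited_spot_ids.map { |id| "/spots/#{id}" }
end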

I took the scenic route, but here’s our problem: to make a switch to slug‐based URLs, we needed the whole API to use the same logic for generating spot URLs. Except that logic lived on the spot model itself, and if you had just a spot ID (as in this example) there was no quick way to turn it into a spot slug.

So even after we flipped the switch, /users/me/visited_spots_urls returned old‐style spot URLs because it was taking a shortcut for performance reasons. A couple of people complained on the API mailing list; someone in the office said, “hey, the app isn’t showing check marks next to spots I’ve checked into”; and then we figured out the problem and switched back to the old spot URLs until we could think this whole thing through.
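To see why the check marks vanished, here is a sketch of the kind of comparison a client might make. It assumes the client matched spot URLs by exact string equality, which is a guess about the mobile apps rather than something documented here; `visited?` is a hypothetical helper.

```ruby
require "set"

# The list the API kept returning, built from cached IDs.
visited_urls = Set["/spots/1019", "/spots/15555", "/spots/91142"]

# Hypothetical client-side check: has the user been to this spot?
def visited?(spot_url, visited_urls)
  visited_urls.include?(spot_url)
end

# Before the switch, spot resources identified themselves by ID-based URL.
visited?("/spots/1019", visited_urls)

# After the switch, the same spot identified itself by slug-based URL,
# which no longer matched anything in the visited list.
visited?("/spots/wahoos-fish-tacos-austin", visited_urls)
```

The first check succeeds and the second fails, even though both URLs point at the same Wahoo's. When the URL is the identifier, two URLs for one spot look like two different spots.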

Stuck

The problem I described above wasn’t a deal‐breaker. We could’ve figured out a better way to cache spots, or we could’ve just maintained separate cache keys for visited spot IDs versus visited spot URLs.
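The second idea, separate cache keys, might look roughly like this. Everything here is assumed for illustration: `CACHE` is a plain Hash standing in for memcached, `SLUGS` stands in for the spots table (the second slug is invented), and the key names are made up.

```ruby
CACHE = {} # stand-in for memcached

# Hypothetical spot ID => slug mapping, standing in for the spots table.
SLUGS = {
  1019  => "wahoos-fish-tacos-austin",
  15555 => "some-other-spot-austin"
}

# Server-side code keeps working with IDs.
def visited_spot_ids(user_id, stamp_spot_ids)
  CACHE["user:#{user_id}:visited_spot_ids"] ||= stamp_spot_ids.uniq
end

# The API response gets its own cached list, so the slug lookup is paid
# once per cache fill instead of once per request.
def visited_spot_urls(user_id, stamp_spot_ids)
  CACHE["user:#{user_id}:visited_spot_urls"] ||=
    visited_spot_ids(user_id, stamp_spot_ids).map { |id| "/spots/#{SLUGS.fetch(id)}" }
end

# Both keys have to be dropped when the user checks into a new spot.
def expire_visited(user_id)
  CACHE.delete("user:#{user_id}:visited_spot_ids")
  CACHE.delete("user:#{user_id}:visited_spot_urls")
end

visited_spot_urls(1, [1019, 1019, 15555])
```

The trade-off is more memory and one more thing to expire. A cached URL list would also go stale if a spot's slug ever changed.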

We never did go back to the slug URLs, though. As far as I know, we just got busy with other stuff. But I know that the more I thought about this problem, the more I felt that there wasn’t a good way around it. Three concerns fenced us in:

  1. In the API, a resource’s URL was both its unique identifier and a hyperlink. It was impossible to change one without changing the other.

  2. For the purposes of browsing gowalla.com in a web browser, it was in our interest for spots to have long, informative URLs.

  3. For the purposes of using Gowalla as an app, it was in our interest for spots to have short, bare‐bones identifiers.

This was the main tension. There are some clever ways around this, but none that are quite clever enough:

  1. Try to satisfy all three concerns at once. Stick with the short URLs for the API. But if someone requests /spots/1019 in a browser, just do a 302 redirect to /spots/wahoos-fish-tacos-austin. Assuming this would have the same impact on SEO, it certainly seems like a prudent solution. But to do a proper redirect, we’d first have to look up the spot to find its slug. So we’re talking about a round trip to either a database or a caching layer before we know which URL to redirect to. Even in the best‐case scenario, it would add a perceptible lag to page load, and that’s no fun.

  2. Forget about concern #1; separate the hyperlink from the unique identifier. Stick with the short URLs for the API. But put a full_url property in the JSON response for spots. OK, but now there’s just one resource for which we have to maintain this dichotomy, and I doubt API consumers would bother with the full URL if the short URL still worked.

  3. Forget about concern #3; just move to long URLs everywhere. Except a slug‐based URL would be twice as long as an ID‐based URL, maybe more. For most API calls, that’s a small (but still significant) increase. But our /users/me/visited_spots_urls response would be twice as large. This is probably the least distasteful of these three options, but mobile apps always have to be sensitive about the amount of data they’re sending over the wire, and it would make no one happy to bite that bullet.
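To put a rough number on the third option, here is a back-of-the-envelope comparison using the Wahoo's URLs from earlier. The figure of 500 visited spots and the assumption that every URL is about as long as the sample are invented for the arithmetic; real lists would vary.

```ruby
require "json"

# Size in bytes of a visited-spots response for a given list of URLs.
def payload_bytes(urls)
  JSON.generate("urls" => urls).bytesize
end

id_url   = "/spots/1019"
slug_url = "/spots/wahoos-fish-tacos-austin"

visits = 500 # hypothetical number of distinct spots for one user

id_bytes   = payload_bytes([id_url] * visits)
slug_bytes = payload_bytes([slug_url] * visits)

puts "ID-based:   #{id_bytes} bytes"
puts "Slug-based: #{slug_bytes} bytes"
puts format("Growth:     %.1fx", slug_bytes.to_f / id_bytes)
```

With these sample URLs the slug-based response comes out at more than double the ID-based one, uncompressed. Compression would presumably narrow the gap, but it is not measured here.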

Let’s also remember that any client‐side caching involving spot URLs (and, yes, the Gowalla app did a lot of that) is invalid now that we’ve made this switch.

And one more groan‐inducing thing: what happens when a spot’s name changes? Gowalla spots could be created by anyone, and it wasn’t uncommon for someone to make a typo when creating a spot name. We had a network of volunteer super‐users who spent their time renaming Starbuck’s to Starbucks, Barnes & Nobles to Barnes & Noble, Chervon to Chevron. And the name is part of a spot slug. So we either change the slug when the name changes, even though cool URIs don’t change; or we say that the first slug is the only slug, and that gas station will forever bear the URL /spots/chervon-giddings-tx through no fault of its own.

Or we forget about putting the spot name in the URL — and then our problem goes away.

The answer

So, yeah, maybe the answer was not to do the thing in the first place. Or maybe the answer was to do it, but not until we released the next version of the API in order to break stuff as little as possible. Or maybe the answer was to plow ahead, consequences be damned.

None of this exposes a critical flaw with hypermedia APIs. These were mundane practical problems caused by previous decisions that we ourselves made. This is merely a story about how you can do all the right stuff and still get bitten by those Rumsfeldian unknown unknowns.
