Create your own bit.ly using Base58

Following the example of bit.ly, tinyurl, and ow.ly, even Coca-Cola has joined the short URL bandwagon. Maybe it's because of Twitter, maybe because the world is focusing on mobile devices, or maybe just to save bandwidth. In any case, it's not that hard to develop your own.

The implementation is quite straightforward: store the underlying URL in a database and return its primary key. Displayed as a decimal number (Base10), this key is longer than strictly necessary: a URL character can be a letter as well as a digit, so constraining yourself to the decimal range [0-9] wastes space.
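
As a rough illustration of the idea, here is a minimal sketch in which an in-memory map stands in for the database and a counter plays the role of the auto-increment primary key. The names ShortUrlSketch, shorten, resolve and charsNeeded are hypothetical and only serve this example; a real service would use a persistent store.

```java
import java.util.HashMap;
import java.util.Map;

class ShortUrlSketch {

    // Stand-in for the database table: primary key -> original URL.
    private final Map<Long, String> urlsByKey = new HashMap<>();
    private long nextKey = 1; // mimics an auto-increment primary key

    /** Stores the URL and returns the key that identifies it. */
    long shorten(String url) {
        long key = nextKey++;
        urlsByKey.put(key, url);
        return key;
    }

    /** Returns the original URL, or null if the key is unknown. */
    String resolve(long key) {
        return urlsByKey.get(key);
    }

    /** Characters needed to write a non-negative value in the given base. */
    static int charsNeeded(long value, int base) {
        int count = 0;
        do {
            count++;
            value /= base;
        } while (value > 0);
        return count;
    }

    public static void main(String[] args) {
        ShortUrlSketch store = new ShortUrlSketch();
        long key = store.shorten("http://example.com/a/very/long/path");
        System.out.println(key + " -> " + store.resolve(key));

        long bigKey = 1_000_000_000_000L; // an arbitrary large key
        System.out.println("base 10: " + charsNeeded(bigKey, 10) + " characters");
        System.out.println("base 58: " + charsNeeded(bigKey, 58) + " characters");
    }
}
```

For a key of one trillion, the decimal form takes 13 characters while a 58-symbol alphabet needs only 7.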

Enter Base58

Base58 is what you get when you take Base62 [a-zA-Z0-9] and remove the characters that are easily confused when typed by hand: 0 (zero), O (uppercase 'o'), I (uppercase 'i'), and l (lowercase 'L'). This concept was introduced to the general public by Flickr, which uses the following alphabet:

123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ

To me, it makes more sense to use the natural order in the ASCII chart:

123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz
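
The ASCII-ordered alphabet above can also be derived instead of typed by hand. The following is a small sketch, assuming a hypothetical helper class named Base58Alphabet: it walks the digit, uppercase and lowercase ranges in ASCII order and skips the four ambiguous characters.

```java
class Base58Alphabet {

    // Characters that are easy to confuse when read or typed by hand.
    private static final String AMBIGUOUS = "0OIl";

    /** Builds the alphabet by walking 0-9, A-Z and a-z in ASCII order. */
    static String build() {
        StringBuilder alphabet = new StringBuilder();
        appendRange(alphabet, '0', '9');
        appendRange(alphabet, 'A', 'Z');
        appendRange(alphabet, 'a', 'z');
        return alphabet.toString();
    }

    // Appends every character in [first, last] that is not ambiguous.
    private static void appendRange(StringBuilder out, char first, char last) {
        for (char c = first; c <= last; c++) {
            if (AMBIGUOUS.indexOf(c) < 0) {
                out.append(c);
            }
        }
    }

    public static void main(String[] args) {
        String alphabet = build();
        System.out.println(alphabet);          // the ASCII-ordered string shown above
        System.out.println(alphabet.length()); // 58
    }
}
```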

The example below uses the latter. For a Flickr implementation you can go here.

public class StringUtils {

 private static final char[] BASE58_CHARS = 
     "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz".toCharArray();

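 /**
  * Encodes a non-negative number as Base58.
  * The least significant digit is written first, which is the order
  * alphaToNumber expects.
  */
 public static String numberToAlpha(long number) {
  if (number < 0) {
   // A negative remainder would index outside BASE58_CHARS.
   throw new IllegalArgumentException("Number must be non-negative: " + number);
  }
  // Long.MAX_VALUE needs 11 Base58 digits, so 20 is more than enough.
  char[] buffer = new char[20];
  int index = 0;
  do {
   buffer[index++] = BASE58_CHARS[(int) (number % BASE58_CHARS.length)];
   number = number / BASE58_CHARS.length;
  } while (number > 0);
  return new String(buffer, 0, index);
 }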

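 /**
  * Decodes a Base58 string produced by numberToAlpha.
  * The first character is the least significant digit.
  */
 public static long alphaToNumber(String text) {
  char[] chars = text.toCharArray();
  long result = 0;
  long multiplier = 1;
  for (int index = 0; index < chars.length; index++) {
   char c = chars[index];
   int digit;
   // Each branch maps one contiguous run of the alphabet;
   // the gaps between runs skip 0, I, O and l.
   if (c >= '1' && c <= '9') {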
    digit = c - '1'; 
   } else if (c >= 'A' && c < 'I') {
    digit = (c - 'A') + 9;
   } else if (c > 'I' && c < 'O') {
    digit = (c - 'J') + 17;
   } else if (c > 'O' && c <= 'Z') {
    digit = (c - 'P') + 22;
   } else if (c >= 'a' && c < 'l') {
    digit = (c - 'a') + 33;
   } else if (c > 'l' && c <= 'z') {
    digit = (c - 'l') + 43;
   } else {
    throw new IllegalArgumentException("Illegal character found: '" + c + "'");
   }

   result += digit * multiplier;
   multiplier = multiplier * BASE58_CHARS.length;
  }
  return result;
 }
}

An example of the expected output size:

44 = m
1431117682956369 = abc123ABC
// Long.MAX_VALUE
9223372036854775807 = CFq8pKn6mQN
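
The if/else chain in alphaToNumber is tied to one specific alphabet. As an alternative sketch, a lookup table built from the alphabet string handles any ordering of 58 ASCII characters, including the Flickr one. The class Base58Codec and its methods are hypothetical names for this example; the digit order (least significant first) follows numberToAlpha above and is not necessarily what Flickr itself uses.

```java
import java.util.Arrays;

class Base58Codec {

    private final char[] alphabet;
    // ASCII character -> digit value, or -1 when the character is not allowed.
    private final int[] digitOf = new int[128];

    /** Assumes an alphabet made of distinct ASCII characters. */
    Base58Codec(String alphabet) {
        this.alphabet = alphabet.toCharArray();
        Arrays.fill(digitOf, -1);
        for (int i = 0; i < this.alphabet.length; i++) {
            digitOf[this.alphabet[i]] = i;
        }
    }

    /** Encodes a non-negative number, least significant digit first. */
    String encode(long number) {
        if (number < 0) {
            throw new IllegalArgumentException("Number must be non-negative: " + number);
        }
        StringBuilder out = new StringBuilder();
        do {
            out.append(alphabet[(int) (number % alphabet.length)]);
            number /= alphabet.length;
        } while (number > 0);
        return out.toString();
    }

    /** Decodes text written by encode; rejects characters outside the alphabet. */
    long decode(String text) {
        long result = 0;
        long multiplier = 1;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            int digit = c < digitOf.length ? digitOf[c] : -1;
            if (digit < 0) {
                throw new IllegalArgumentException("Illegal character found: '" + c + "'");
            }
            result += digit * multiplier;
            multiplier *= alphabet.length;
        }
        return result;
    }

    public static void main(String[] args) {
        Base58Codec codec = new Base58Codec(
            "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz");
        System.out.println(codec.encode(44));           // m
        System.out.println(codec.decode("abc123ABC"));  // 1431117682956369
    }
}
```

Like the original class, this sketch does not detect overflow when the input string encodes a value larger than Long.MAX_VALUE.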

This is hardly rocket science. You can find libraries to do this for Ruby, JavaScript, PHP, or even Perl.

It's Base58 everywhere

Once you start doing this, it's nothing short of addictive. You can configure your web framework to use a Base58 converter for the key attributes in your URLs and start using /shows/cr5W instead of /shows/5674581.
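
How the converter is wired in depends on the framework, so the following is only a framework-neutral sketch of the two directions involved: building a path from a numeric id, and recovering the id from an incoming path. The class ShowRoutes, the methods pathFor and idFromPath, and the /shows/ prefix handling are all hypothetical; the encoding matches the StringUtils class above.

```java
class ShowRoutes {

    private static final String ALPHABET =
        "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";
    private static final String PREFIX = "/shows/"; // hypothetical route prefix

    /** Builds the public path for a numeric id, e.g. when rendering links. */
    static String pathFor(long id) {
        if (id < 0) {
            throw new IllegalArgumentException("Id must be non-negative: " + id);
        }
        StringBuilder key = new StringBuilder();
        do {
            key.append(ALPHABET.charAt((int) (id % ALPHABET.length())));
            id /= ALPHABET.length();
        } while (id > 0);
        return PREFIX + key;
    }

    /** Recovers the numeric id from an incoming request path. */
    static long idFromPath(String path) {
        if (!path.startsWith(PREFIX) || path.length() == PREFIX.length()) {
            throw new IllegalArgumentException("Not a show path: " + path);
        }
        String key = path.substring(PREFIX.length());
        long id = 0;
        long multiplier = 1;
        for (int i = 0; i < key.length(); i++) {
            int digit = ALPHABET.indexOf(key.charAt(i));
            if (digit < 0) {
                throw new IllegalArgumentException("Illegal character in key: " + key);
            }
            id += digit * multiplier;
            multiplier *= ALPHABET.length();
        }
        return id;
    }

    public static void main(String[] args) {
        String path = pathFor(1431117682956369L);
        System.out.println(path);             // /shows/abc123ABC
        System.out.println(idFromPath(path)); // 1431117682956369
    }
}
```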

Everybody is doing this right now: Flickr, Facebook, and YouTube (which seems to be using the 64-character set [0-9a-zA-Z_-]).

News - News - News

Last week our SimpleDS pet project was mentioned on the Google AppEngine Blog, and next April we are bringing the Javaspecialist Master Course to Madrid!

It's going to be awesome. This is the most advanced Java course I know of, and after some conversations with Dr. Kabutz he agreed to bring it to Madrid. It consists of four days of intense performance tuning, concurrency debugging, introspection, memory profiling and some neat, challenging Java puzzles. If you are interested, you should consider joining us in April!