Making a hash of things, part 2: Multi-part keys

Last year, I wrote about hash tables for JavaScript. This was before I knew about the plans to implement a native hash table for JavaScript called WeakMap in Mozilla Firefox.

WeakMaps are awesome.  They completely obsolete the need for that hashStringKey hack I came up with last year.  When you combine them with JavaScript proxies, you can get something even more awesome called a membrane.  (I didn’t make up the name.)  Through the membrane you can ensure a proxy either returns a primitive value (number, string, boolean, etc.), or another proxy.  In fact, if two underlying “native” objects A and B would refer to each other, through the membrane you can ensure the proxies for them, pA and pB, refer not to A or B, but to each other:  pA.b == pB, and pB.a = pA.

A couple things you can’t do with WeakMap

First, keys cannot be primitive values.  You can’t say:

var map = new WeakMap;
map.set("attributes", []); // throws exception because "attributes" isn't an object.

It just doesn’t work.  I asked Jason Orendorff about it, and the reason has to do with garbage collection.  Simply put, weak maps hold references to their keys loosely:  when no one else knows the key, the JavaScript engine can safely erase the key and the value.  With objects, that’s easy:  they’re unique.  When you copy them, you get a distinct object that isn’t equal to the original.  With primitives like a simple string, that’s not so easy:  you can lose all reference to the original string in memory, but hard-code that string elsewhere.  The weak map would have to remember it.  WeakMap currently deals with the problem by forbidding primitive keys.

Second, it’s one key to one value.  That’s what hash tables are, and what they generally should be.  But there’s really no concept of a two-part key, as there is in public key encryption.  Nor a three-part key, nor an n-part key.  So there’s no way to say that any two keys are related to each other.  Think of a two-dimensional grid: each cell has a row and a column.  The row and the column combine to form a key where you can look up a value for the cell.

My very experimental solutions

For the first problem, I implemented a brute-force primitive-to-singleton-object class, PrimitiveKeySet. It creates an object for every primitive it sees (thankfully, you have to pass it a primitive first), and returns that object. I also implemented a WeakMapWithPrimitives function, which leverages PrimitiveKeySet and wraps around the WeakMap API.  It doesn’t solve the memory leakage problem – nothing can, really – but it at least lets me use primitive keys.  I also tried to be a little smart about it:  when you tell it to delete a primitive key, it really does.

For the second problem, I did a little bootstrapping, using a tree analogy.  I started with a WeakMap (the “root map”).  I used the first part of the key to assign another WeakMap (call this a “branch map”) as a value stored in the root map.  I would use the second part of the key to assign another WeakMap to the branch map.  I repeated this over and over again until I reached the last part of the key and the last WeakMap I needed (the “leaf map”, for lack of a better term). At that point I assigned the user’s value to the last key part, on the leaf map.

I could easily say, then:

var map = new CompositeKeyMap(["row", "column"]);
map.set({
  row: 3,
  column: 4
}, "Row 3, Column 4");

I took it one step further, and added a shortcut, the PartialKey.  With a PartialKey, I wouldn’t have to specify every field, every time:

var map = new CompositeKeyMap(["row", "column", "floor"]);
var floorKey = map.getPartialKey({floor: 6});
floorKey.set({
  row: 3,
  column: 4
}, "Row 3, Column 4");
map.get({row: 3, column: 4, floor: 6}) // returns "Row 3, Column 4"

All of these you can see on my main web site under libraries/CompositeKeyMap, and with a Jasmine test suite.

Should you use these?

If you don’t intend to require Mozilla Firefox 6+, probably not.  This code won’t work without that, and there are no fallbacks.

If you want to use partial keys, I would think the CompositeKeyMap is very useful indeed.  I’d recommend one key you specify be for objects only, at least:  otherwise you might as well just use an ordinary JavaScript object as your map: {}.  Whether it should be the first or the last for memory efficiency, I can’t tell you.

I don’t see much use for WeakMapWithPrimitives, to be honest.  I did that as a proof-of-concept only, a stepping stone to the CompositeKeyMap.

Thanks for reading – and feel free to play with it or review the code as Mozilla Firefox 6 launches next week.  Comments welcome!