A sparse uintptr_t-sized array implemented as efficient trie

This is a WIP to implement a structure allowing to have a practically unbound array: mapping of uintptr_t to uintptr_t - something conceptually like:

uintptr_t array[UINTPTR_MAX];

typically used to associate a pointer to some structure with some int-sized index, being simple in use (not having to deal with optimal hash functions, changing number of buckets on growing, etc.).

Array is implemented as a digital tree (or trie), that is, 16-ary tree - this means that, for 32-bit words, the tree in worst case will have 8 levels depth (8 cache-line misses) - and that is upper bound (measures are taken to lower it in common cases). In contrast, red-black tree will achieve such depth already at very small number of entries (about 256). Implementation is inspired by Judy arrays and intended to be used like them, but in FreeBSD kernel.

Why just not port Judy itself to kernel? There are several reasons:


The data structure must:

Design overview

Since typical usage is to store pointers, value of NULL/zero is used to mark absence of value - this is limitation of interface. NULL values are not actually stored.

All used data structures have the size of 16 machine words (thus the name "hextree") - so 64 bytes on 32-bit architectures and 128 on 64-bit ones. That sizes are also a typical cache-line size on those architectures, thus all structures are allocated (from UMA zone) cache-line aligned (in cases when cache-line sizes differ, these still are usually multiples of each other - so still good).

The fact structures are always size-aligned gives us following observation: the low 6 bits of pointer to structure are always zero (7 bits on 64-bit architectures, but for portability we have only 6). So we can utilize a trick from Judy: pack important information into these bits, like:

  1. Type of structure we point to (leaf or intermediate node).
  2. In some cases, next hex-digit of path, saving us from storing intermediate node with just one key.

What is path? Well, in tries / digital trees the keys them are usually not stored: the key for a given value is inferred by it's position in the tree, that is, the path from root node. In a straightforward implementation of 16-ary tree, you do all nodes as single type - array of 16 words, getting always 8 levels to reach all values. Very memory consuming, of course (so digital trees are rarely used in practice). But observe that array is sparse, and there will be typical situation when some intermediae node holds just one pointer, other are zero - the whole purpose of it is just to provide you next hex-digit of path, and then you go to next level. Such nodes can be eliminated, decreasing tree depth and cache-line fills, if you store this digit directly in parent node's pointer.

What other compressing technique can be used, while staying simple in implementation? If, on some level, you have very small number of children, it doesn't matter, one or more of next levels - you could store them directly in this node, which will be able to store up to 8 key-value pairs (remember, our leaves are just machine words, they're small, we could store them right here). This is not limited to final leaf values, but could also be used for intermediate nodes, provided we store number of significant digits in key (this limits us to max 16 levels, though - but hope at the times of 128-bit machines better structures will be invented already). Time to linear scan 8 pairs is still much faster than dereferencing pointer with possible cache miss.

So we end up with following structure types:

We know type of each node via least-significant bits of pointer to it, because all space in nodes is used for data, type itself is not stored there. This leads us to requirement for user code to pass pointer to pointer to root node (user stores just one pointer to Hextree), and roughly the following algorithm of work:

The HEXTREE_TYPE_NODE_8 stores the keys of key-value pairs with least-significant digit of key indicating how many of other key hex-digits make a path, or, in other words, how deep (which level) is node pointed to. Because HEXTREE_TYPE_NODE_8 already specifies full path to next node, both pointers to it or from it don't need to use compressing technique of placing one path's hex-digit to low bits of pointer. The same applies to pointers to HEXTREE_TYPE_LEAF_8. This leads us to the following possible values of least-significant bits of pointer in Hextree:

#define HEXTREE_TYPE_LEAF_8         0
#define HEXTREE_TYPE_LEAF_16        1
#define HEXTREE_TYPE_NODE_16        2
#define HEXTREE_TYPE_NODE_8         3
#define HEXTREE_TYPE_LEAF_16_DIGIT (HEXTREE_TYPE_LEAF_16 << 4)  // with digit: 16 .. 31 = 01xxxx
#define HEXTREE_TYPE_NODE_16_DIGIT (HEXTREE_TYPE_NODE_16 << 4)  // with digit: 32 .. 47 = 10xxxx


Hextree (last edited 2013-01-27 04:21:21 by VadimGoncharov)