Photo: John-Morgan

ThRaSH'2011 Keynote

The Power of Tabulation Hashing

**Abstract**
Randomized algorithms are often enjoyed for their simplicity, but the
hash functions used to yield the desired theoretical guarantees are
often neither simple nor practical. Here we show that the simplest
possible tabulation hashing provides unexpectedly strong guarantees.
The scheme itself dates back to Carter and Wegman (STOC'77). Keys are
viewed as consisting of \(c\) characters. We initialize \(c\) tables
\(T_1, \dots, T_c\) mapping characters to random hash codes. A key
\(x=(x_1, \dots, x_c)\) is hashed to
\(T_1[x_1] \oplus \cdots \oplus T_c[x_c]\), where \(\oplus\) denotes xor.
While this scheme is not even 4-independent, we show that it provides
many of the guarantees that are normally obtained via higher
independence, e.g., Chernoff-type concentration, min-wise hashing for
estimating set intersection, and cuckoo hashing.
We shall also discuss a twist to simple tabulation that leads to
extremely robust performance for linear probing with small buffers.

**Biography**
Mikkel Thorup has a D.Phil. from Oxford University from 1993. From
1993 to 1998 he was at the University of Copenhagen. Since then he
has been at AT&T Labs-Research. He is a Fellow of the ACM and of AT&T,
Member of the Royal Danish Academy of Sciences and Letters,
and co-winner of the 2011 MAA Robbins Award. His main work is in
algorithms and data structures and he is the editor of this area for
the Journal of the ACM. One of his best-known results is a linear-time
algorithm for the single-source shortest paths problem in undirected
graphs. Mikkel prefers to seek his mathematical inspiration in nature,
combining the quest with his hobbies of bird watching and mushroom
picking.

Last updated: Thu, 30 Jun 2011 23:54