Universal Hashing

Even a clever hash function is vulnerable to bad performance, because its expected-case analysis depends on assumptions about the keys.

Any fixed hash function has some set of keys that will all pile up in the same slot.

We can use randomness to get around this, mostly.

The resulting algorithm is the only hashing scheme that is provably good without relying on the uniform hashing assumption (which is a crock), but I don't think anyone uses it in practice.

Just as we pick random pivots in quicksort, the idea is to pick a random hash function from a family of hash functions.

If the family has a particular "universal" property (not difficult to achieve), we can prove an expected running time bound that holds even for worst-case inputs.

The critical property is that any pair of distinct keys collides under at most a 1/m fraction of the hash functions in the family, where m is the number of slots.
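Spelled out in the usual notation (a standard derivation, not worked through in these notes; it assumes chaining, with n keys and m slots): a family H of hash functions is universal if for every pair of distinct keys x and y,

\[
  |\{\, h \in H : h(x) = h(y) \,\}| \;\le\; \frac{|H|}{m},
\]

so a uniformly random h from H collides any fixed pair with probability at most 1/m. By linearity of expectation, a given key then collides with at most n/m other keys in expectation, and a search takes expected O(1 + n/m) time no matter which keys an adversary chooses.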

It is not always useful in practice, because we often build a given hash table only once; the randomization then happens only once per use, so a single unlucky draw of the hash function cannot be averaged out over repeated uses.

Still, it gives us a warm, fuzzy feeling.
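For concreteness, here is a minimal sketch in C of one standard universal family, the Carter-Wegman scheme h_{a,b}(k) = ((a*k + b) mod p) mod m with p prime. The constant P, the names random_hash_fn and universal_hash, and the use of rand() are illustrative choices, not anything prescribed by these notes.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <time.h>

#define P 2147483647ULL            /* the prime 2^31 - 1; keys are assumed to be below P */

typedef struct {
    uint64_t a;                    /* drawn uniformly from [1, P-1] */
    uint64_t b;                    /* drawn uniformly from [0, P-1] */
    uint64_t m;                    /* number of slots in the table */
} hash_fn;

/* Pick one member of the family at random; rebuilding the table
   picks a different function.  rand() is only a stand-in RNG. */
static hash_fn random_hash_fn(uint64_t m)
{
    hash_fn h;
    h.a = 1 + (uint64_t) rand() % (P - 1);
    h.b = (uint64_t) rand() % P;
    h.m = m;
    return h;
}

/* h_{a,b}(k) = ((a*k + b) mod P) mod m.  Since a, b, k < 2^31,
   a*k + b stays below 2^63 and does not overflow uint64_t. */
static uint64_t universal_hash(hash_fn h, uint64_t key)
{
    return ((h.a * key + h.b) % P) % h.m;
}

int main(void)
{
    srand((unsigned) time(NULL));
    hash_fn h = random_hash_fn(1000);         /* a table with m = 1000 slots */
    printf("key 42 hashes to slot %llu\n",
           (unsigned long long) universal_hash(h, 42));
    return 0;
}

Drawing a fresh pair (a, b) each time a table is built is the hashing analogue of choosing a random pivot in quicksort.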

