Prove Markov’s Inequality: If X is any random variable and a > 0, then Pr(|X| ≥ a) ≤ E(|X|)/a. Show how this inequality can be applied to Theorems 1 and 2.
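Before attempting the proof, it can help to see the inequality hold on concrete data. The sketch below is a numerical sanity check, not a proof. The function name `markov_check` and the choice of standard normal samples are illustrative assumptions, not part of the exercise.

```python
import random

def markov_check(samples, a):
    """Return (empirical Pr(|X| >= a), empirical E(|X|) / a) for a > 0."""
    if a <= 0:
        raise ValueError("a must be positive")
    n = len(samples)
    # Fraction of samples whose absolute value reaches the threshold a.
    tail = sum(1 for x in samples if abs(x) >= a) / n
    # Markov bound computed from the sample mean of |X|.
    bound = sum(abs(x) for x in samples) / n / a
    return tail, bound

# Assumed test distribution: standard normal draws from a seeded generator.
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
for a in (0.5, 1.0, 2.0, 3.0):
    tail, bound = markov_check(samples, a)
    print(f"a={a}: tail={tail:.4f}  bound={bound:.4f}")
```

For small a the bound can exceed 1 and is then uninformative. It becomes useful only when a is large relative to E(|X|).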
Theorem 1
If $N$ balls are placed into $M = N^2$ bins, the probability that some bin has more than one ball is less than $1/2$.
Proof.
If a pair $(i, j)$ of balls is placed in the same bin, we call that a collision. Let $C_{i,j}$ be the expected number of collisions produced by the two balls $(i, j)$. Clearly the probability that any two specified balls collide is $1/M$, and thus $C_{i,j}$ is $1/M$, since the number of collisions that involve the pair $(i, j)$ is either 0 or 1. Thus the expected number of collisions in the entire table is $\sum_{(i,j),\, i<j} C_{i,j}$. Since there are $N(N-1)/2$ pairs, this sum is $N(N-1)/(2M) = N(N-1)/(2N^2) < 1/2$. Since the expected number of collisions is below $1/2$, the probability that there is even one collision must also be below $1/2$.
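The counting argument in this proof can be checked by simulation. The sketch below throws N balls into N^2 bins many times and estimates both the expected number of colliding pairs and the probability of at least one collision. It assumes each ball picks its bin uniformly and independently. The names `count_collisions` and `estimate` are hypothetical helpers introduced for illustration.

```python
import random

def count_collisions(n_balls, n_bins, rng):
    """Throw n_balls uniformly into n_bins; return the number of colliding pairs."""
    counts = {}
    for _ in range(n_balls):
        b = rng.randrange(n_bins)
        counts[b] = counts.get(b, 0) + 1
    # A bin holding c balls contributes c*(c-1)/2 colliding pairs.
    return sum(c * (c - 1) // 2 for c in counts.values())

def estimate(n_balls, trials=20_000, seed=1):
    """Estimate (mean colliding pairs, Pr(at least one collision)) for M = N^2."""
    rng = random.Random(seed)
    n_bins = n_balls ** 2
    total_pairs = 0
    trials_with_collision = 0
    for _ in range(trials):
        c = count_collisions(n_balls, n_bins, rng)
        total_pairs += c
        if c > 0:
            trials_with_collision += 1
    return total_pairs / trials, trials_with_collision / trials

n = 10
mean_pairs, p_collision = estimate(n)
print("predicted mean pairs:", n * (n - 1) / (2 * n * n))
print("simulated mean pairs:", mean_pairs)
print("simulated Pr(collision):", p_collision)
```

In every run the estimated collision probability is at most the estimated mean number of colliding pairs, because a trial with a collision contributes at least one pair to the total.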
Theorem 2
If N items are placed into a primary hash table containing N bins, then the total size of the secondary hash tables has expected value at most 2N.
Proof.
Using the same logic as in the proof of Theorem 1, the expected number of pairwise collisions is at most $N(N-1)/(2N)$, or $(N-1)/2$. Let $b_i$ be the number of items that hash to position $i$ in the primary hash table; observe that $b_i^2$ space is used for this cell in the secondary hash table, and that this accounts for $b_i(b_i-1)/2$ pairwise collisions, which we will call $c_i$. Thus the amount of space used for the $i$th secondary hash table is $2c_i + b_i$. The total space is then $2\sum c_i + \sum b_i$. The expected total number of collisions is at most $(N-1)/2$ (from the first sentence of this proof); the total number of items is of course $N$, so we obtain an expected total secondary space requirement of at most $2(N-1)/2 + N < 2N$.
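The space bound can also be checked empirically. The sketch below hashes N items into N primary bins and sums the squared bin counts, which is the total size of the secondary tables. It models the hash function as a uniform, independent random choice of bin, which is an idealizing assumption. The names `secondary_space` and `average_secondary_space` are illustrative only.

```python
import random

def secondary_space(n_items, rng):
    """Hash n_items uniformly into n_items primary bins; return the sum of b_i**2."""
    b = [0] * n_items
    for _ in range(n_items):
        b[rng.randrange(n_items)] += 1
    # Each primary cell with b_i items gets a secondary table of size b_i**2.
    return sum(x * x for x in b)

def average_secondary_space(n_items, trials=2_000, seed=2):
    """Average total secondary-table size over repeated random hashings."""
    rng = random.Random(seed)
    return sum(secondary_space(n_items, rng) for _ in range(trials)) / trials

n = 100
print("simulated average space:", average_secondary_space(n))
print("bound 2N:", 2 * n)
```

Under the uniform model the exact expectation is 2N - 1, so the simulated average should land just under the 2N bound.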