Came across Bloom filter during a presentation, and it seems that I am the only one who doesn’t know it. (Bummer.)

The above wiki page is pretty straightforward, and I could follow most of it except the derivation of optimal `k = m/n * ln(2)`, the number of hash functions.

Let’s formulate it in pure mathematics:

For `f(k) = (1 - e^(-kn/m))^k`, find `k` to minimize `f(k)`.

For the sake of easy derivation, let’s substitute `e^(-kn/m)` with `p`, then `k = -m/n*ln(p)`.

`ln f = e ^ [k * ln(1-p)] = e ^ [-m/n * ln(p) * ln(1-p)]`

Therefore, minimizing `f` is equivalent to maximizing `g = ln(p) * ln(1-p)`, due the negative sign before `m`. According to the domain of `ln`, we know that `0 < p < 1`. In other words, we would like to find `p` between `(0, 1)`, which maximizes `g = ln(p) * ln(1-p)`.

`g` is a composed function, and we could see how it looks like using wolframalpha*ln(1-x)). After seeing its plot, we just need to solve `g'(p) == 0`.

`g'(p) = 1/p * ln(1-p) + ln(p) * (-1/1-p) = 0` => `p*ln(p) = (1-p)*ln(1-p)`. I don’t know how to find all roots for this equation, but one simple root is `p = 1-p`, i.e. `p = 1/2`. Then, we know `k = m/n *ln(2)` using our original substitution. Q.E.D.

## Reference

http://www.eecs.harvard.edu/~michaelm/NEWWORK/postscripts/cbf2.pdf