Concentration Inequalities
The goal of concentration inequalities is to provide bounds on the probability of a random variable taking values in its tail (regions farthest away from the mean).
This is especially useful when we don't know the distribution of a random variable, or when the quantity of interest is built from many random variables (a sample mean, the runtime of quicksort, rewards in a multi-armed bandit…).
Markov’s Inequality
If $X$ is a non-negative random variable with expectation $\mu$, then for any $t > 0$,
$$P(X \geq t) \leq \frac{\mu}{t}$$
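As a quick sanity check, here's a minimal Monte Carlo sketch; the exponential distribution is an arbitrary choice of non-negative variable, and numpy is assumed to be available.

```python
# A minimal check of Markov's inequality: P(X >= t) <= mu / t.
# The exponential distribution is an arbitrary non-negative example.
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=100_000)  # non-negative samples, mu ~ 2
mu = x.mean()

for t in [4, 8, 16]:
    empirical = (x >= t).mean()  # empirical tail probability
    print(f"t={t}: P(X >= t) ~ {empirical:.4f} <= mu/t = {mu / t:.4f}")
```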
Chebyshev’s Inequality
If we know the mean $\mu$ and the variance $\sigma^2$ of a random variable $X$, then the probability of $X$ landing at least $t$ standard deviations from the mean is bounded by
$$P(|X - \mu| \geq t\sigma) \leq \frac{1}{t^2}$$
Alternatively, for any $c > 0$,
$$P(|X - \mu| \geq c) \leq \frac{\sigma^2}{c^2}$$
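The same kind of empirical check works here; the gamma distribution below is an arbitrary skewed example.

```python
# Comparing Chebyshev's bound 1/t^2 against the empirical two-sided tail.
# The gamma distribution is an arbitrary (skewed) example.
import numpy as np

rng = np.random.default_rng(1)
x = rng.gamma(shape=2.0, scale=1.0, size=100_000)
mu, sigma = x.mean(), x.std()

for t in [2, 3, 4]:
    empirical = (np.abs(x - mu) >= t * sigma).mean()
    print(f"t={t}: P(|X - mu| >= {t} sigma) ~ {empirical:.4f} <= {1 / t**2:.4f}")
```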
Chernoff Bound
For any $\lambda > 0$, applying Markov's inequality to the non-negative variable $e^{\lambda X}$ gives
$$P(X \geq t) = P(e^{\lambda X} \geq e^{\lambda t}) \leq \frac{E[e^{\lambda X}]}{e^{\lambda t}}$$
Optimized form:
$$P(X \geq c) \leq \min_{t > 0} M_X(t) e^{-tc}$$
where $M_X(t) = E[e^{tX}]$ is the moment generating function.
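For a concrete case where the MGF is known in closed form, a standard normal has $M_X(t) = e^{t^2/2}$, and the optimized bound works out to $e^{-c^2/2}$, attained at $t = c$. The sketch below recovers this numerically, with a grid search standing in for the exact minimization.

```python
# Optimized Chernoff bound for a standard normal: M_X(t) = exp(t^2 / 2),
# so min_t M_X(t) e^{-tc} = exp(-c^2 / 2), attained at t = c.
# A grid search over t stands in for the exact optimization.
import numpy as np

ts = np.linspace(1e-3, 10, 10_000)
for c in [1, 2, 3]:
    bound = np.min(np.exp(ts**2 / 2 - ts * c))  # grid minimum of M_X(t) e^{-tc}
    print(f"c={c}: grid bound ~ {bound:.5f}, analytic exp(-c^2/2) = {np.exp(-c**2 / 2):.5f}")
```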
Hoeffding’s Inequality
Hoeffding's inequality is a special case of the Chernoff bound where the random variable is known to be bounded.
Hoeffding's Lemma: if $X$ is a bounded random variable taking values in $[a, b]$ with mean $\mu$, then
$$M_X(\lambda) \leq \exp\left(\frac{(b - a)^2\lambda^2}{8} + \mu\lambda\right)$$
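A Bernoulli($p$) variable is bounded in $[0, 1]$ with a closed-form MGF, so it makes a convenient check of the lemma; the sketch below compares $M_X(\lambda) = 1 - p + pe^\lambda$ against the bound.

```python
# Checking Hoeffding's lemma on a Bernoulli(p) variable: a = 0, b = 1, mu = p,
# and the MGF is available in closed form as 1 - p + p * e^lam.
import numpy as np

p = 0.3  # arbitrary choice of success probability
for lam in [0.5, 1.0, 2.0]:
    mgf = 1 - p + p * np.exp(lam)
    bound = np.exp(lam**2 * (1 - 0)**2 / 8 + p * lam)  # exp((b-a)^2 lam^2 / 8 + mu lam)
    print(f"lam={lam}: M_X(lam) = {mgf:.4f} <= {bound:.4f}")
```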
Hoeffding's Inequality: if we have $n$ independent (not necessarily identically distributed) random variables, each bounded in $[a, b]$, then for any $t > 0$,
$$P\left(\frac{1}{n}\sum_{i=1}^{n}(X_i - E[X_i]) \geq t\right) \leq \exp\left(-\frac{2nt^2}{(b - a)^2}\right)$$
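Here's a sketch comparing the bound with the empirical tail of a sample mean of fair coin flips, so $(a, b) = (0, 1)$ and the bound reduces to $e^{-2nt^2}$.

```python
# Hoeffding's inequality for the mean of n Bernoulli(0.5) draws:
# with (a, b) = (0, 1) the bound is exp(-2 n t^2).
import numpy as np

rng = np.random.default_rng(2)
n, trials = 100, 50_000
x = rng.binomial(1, 0.5, size=(trials, n))
deviations = x.mean(axis=1) - 0.5  # (1/n) sum_i (X_i - E[X_i]) per trial

for t in [0.05, 0.10, 0.15]:
    empirical = (deviations >= t).mean()
    print(f"t={t}: P(dev >= t) ~ {empirical:.4f} <= {np.exp(-2 * n * t**2):.4f}")
```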
Proof
Let $Y = \frac{1}{n}\sum_{i=1}^{n}(X_i - E[X_i])$, and write $Z_i = X_i - E[X_i]$ for the centered variables. Then the MGF is $E[e^{\lambda Y}] = E\left[\exp\left(\frac{\lambda}{n}\sum_{i=1}^{n} Z_i\right)\right]$.
Using the properties of exponentials (the exponential of a sum is the product of exponentials),
$$E[e^{\lambda Y}] = E\left[\prod_{i=1}^{n}\exp\left(\frac{\lambda}{n} Z_i\right)\right]$$
Using independence, the expectation of the product is the product of expectations:
$$E\left[\prod_{i=1}^{n}\exp\left(\frac{\lambda}{n} Z_i\right)\right] = \prod_{i=1}^{n}E\left[\exp\left(\frac{\lambda}{n} Z_i\right)\right]$$
Each $Z_i$ has mean $0$ and lies in an interval of width $b - a$, so by Hoeffding's Lemma,
$$\prod_{i=1}^{n}E\left[\exp\left(\frac{\lambda}{n} Z_i\right)\right] \leq \prod_{i=1}^{n}\exp\left(\frac{(\lambda/n)^2(b - a)^2}{8}\right) = \exp\left(\frac{\lambda^2(b - a)^2}{8n}\right)$$
Finally, applying the Chernoff bound to $Y$,
$$P(Y \geq t) \leq e^{-\lambda t}E[e^{\lambda Y}] \leq \exp\left(\frac{\lambda^2(b - a)^2}{8n} - \lambda t\right)$$
which is minimized at $\lambda = \frac{4nt}{(b - a)^2}$, giving the stated bound $\exp\left(-\frac{2nt^2}{(b - a)^2}\right)$.
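The final minimization can be checked symbolically; the sketch below uses sympy (assumed available) to recover the optimal $\lambda$ and the resulting exponent.

```python
# Symbolic check of the last step: minimize lam^2 (b - a)^2 / (8n) - lam t
# over lam and confirm the optimum gives -2 n t^2 / (b - a)^2.
import sympy as sp

lam, t, n, a, b = sp.symbols("lam t n a b", positive=True)
exponent = lam**2 * (b - a)**2 / (8 * n) - lam * t

lam_star = sp.solve(sp.diff(exponent, lam), lam)[0]
print(lam_star)                                    # 4*n*t/(b - a)**2
print(sp.simplify(exponent.subs(lam, lam_star)))   # -2*n*t**2/(b - a)**2
```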