The sample mean of a set of IID random variables $X_1, X_2, \ldots$ is an RV that averages the first $n$ instances:

$$M_n = \frac{1}{n} \sum_{i=1}^{n} X_i$$
Statistics of the sample mean (for any $n$), where $\mu = E[X_i]$ and $\sigma^2 = \mathrm{Var}(X_i)$:

$$E[M_n] = \mu, \qquad \mathrm{Var}(M_n) = \frac{\sigma^2}{n}$$
The sample mean is typically applied to repeated trials of an experiment. The trials are independent, and the probability distribution of outcomes should be identical from trial to trial.
Notice that the variance of the sample mean limits to 0 as $n \to \infty$. As more trials are repeated and the average of all results is taken, the fluctuations of this average shrink toward zero.
As $n \to \infty$, the distribution of $M_n$ will converge to a PMF with all the probability concentrated at $\mu$ and none elsewhere.
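The shrinking variance can be seen in a short simulation. This is an illustrative sketch, with die rolls chosen as an assumed example of IID trials ($\mu = 3.5$, $\sigma^2 = 35/12$):

```python
import numpy as np

# Estimate Var(M_n) for several n by repeating the n-trial experiment many
# times; it should shrink like sigma^2 / n. Die rolls are an assumed example.
rng = np.random.default_rng(0)
sigma2 = 35 / 12                                  # variance of one die roll

emp_vars = {}
for n in (10, 100, 1000):
    rolls = rng.integers(1, 7, size=(5000, n))    # 5000 repetitions of n trials
    emp_vars[n] = rolls.mean(axis=1).var()        # empirical Var(M_n)
    print(f"n={n:4d}  Var(M_n) ~ {emp_vars[n]:.5f}  (theory {sigma2 / n:.5f})")
```

The empirical variances track $\sigma^2/n$ closely and fall toward zero as $n$ grows.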
Theory 2 - Tail estimation
Every distribution must trail off to zero for large enough $|x|$. The regions where $f_X(x)$ trails off to zero (large magnitude of $x$) are informally called 'tails'.
Tail probabilities
A tail probability is a probability with one of these forms:

$$P(X \geq a) \qquad \text{or} \qquad P(X \leq a)$$
Markov’s inequality
Assume that $X \geq 0$. Take any $a > 0$.
Then Markov's inequality states:

$$P(X \geq a) \leq \frac{E[X]}{a}$$
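A quick empirical check of the bound, using $X \sim \mathrm{Exponential}(1)$ as an assumed example of a nonnegative RV (so $E[X] = 1$):

```python
import numpy as np

# Compare the empirical tail probability P(X >= a) against the Markov
# bound E[X]/a for a few thresholds a. Illustrative sketch only.
rng = np.random.default_rng(1)
x = rng.exponential(scale=1.0, size=100_000)
for a in (1.0, 2.0, 4.0):
    print(f"a={a}:  P(X>=a) ~ {(x >= a).mean():.4f}  <=  E[X]/a ~ {x.mean() / a:.4f}")
```

For this distribution the true tails ($e^{-a}$) are well below the Markov bound, showing how loose the estimate can be.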
Chebyshev’s inequality
Take any RV $X$ with mean $\mu$ and variance $\sigma^2$, and any $a > 0$.
Then Chebyshev's inequality states:

$$P(|X - \mu| \geq a) \leq \frac{\sigma^2}{a^2}$$
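A similar empirical check for Chebyshev's bound, with $X \sim \mathrm{Uniform}(0, 1)$ as an assumed example ($\mu = 0.5$, $\sigma^2 = 1/12$):

```python
import numpy as np

# Compare the empirical two-sided tail P(|X - mu| >= a) against the
# Chebyshev bound sigma^2 / a^2. Illustrative sketch only.
rng = np.random.default_rng(2)
x = rng.uniform(size=100_000)
mu, sigma2 = x.mean(), x.var()
for a in (0.2, 0.3, 0.4):
    tail = (np.abs(x - mu) >= a).mean()
    print(f"a={a}:  P(|X-mu|>=a) ~ {tail:.4f}  <=  sigma^2/a^2 ~ {sigma2 / a**2:.4f}")
```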
Markov vs. Chebyshev
Chebyshev's inequality works for any RV $X$ (not just nonnegative ones), and it usually gives a better estimate than Markov's inequality.
The main value of Markov's inequality is that it only requires knowledge of $E[X]$.
Think of Chebyshev's inequality as a tightening of Markov's inequality using the additional data of $\mathrm{Var}(X)$.
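The two bounds can be compared side by side. Here $X \sim \mathrm{Exponential}(1)$ is an assumed example (so $E[X] = 1$, $\mathrm{Var}(X) = 1$, and the exact tail is $e^{-a}$); Chebyshev is applied via $P(X \geq a) \leq P(|X - 1| \geq a - 1)$ for $a > 1$:

```python
import math

# Compare the exact tail P(X >= a) of Exponential(1) against both bounds
# at a single threshold. Assumed example distribution, for illustration.
a = 4.0
markov = 1.0 / a                    # E[X] / a
chebyshev = 1.0 / (a - 1.0) ** 2    # sigma^2 / (a - 1)^2, via P(|X-1| >= a-1)
exact = math.exp(-a)
print(f"exact {exact:.4f}  <=  Chebyshev {chebyshev:.4f}  <=  Markov {markov:.4f}")
```

For this distribution Chebyshev's estimate ($\approx 0.11$) improves on Markov's ($0.25$), though both remain far above the true tail ($\approx 0.018$).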
Derivation of Markov’s inequality - Continuous RV
Under the hypotheses that $X \geq 0$ and $a > 0$, we have:

$$E[X] = \int_0^\infty x\, f_X(x)\, dx \geq \int_a^\infty x\, f_X(x)\, dx$$

On the range $x \geq a$ we may convert $x$ to $a$, which can only make the integrand smaller:

$$\int_a^\infty x\, f_X(x)\, dx \geq \int_a^\infty a\, f_X(x)\, dx$$

Simplify:

$$\int_a^\infty a\, f_X(x)\, dx = a \int_a^\infty f_X(x)\, dx = a\, P(X \geq a)$$

Chaining these inequalities gives $E[X] \geq a\, P(X \geq a)$. Therefore:

$$P(X \geq a) \leq \frac{E[X]}{a}$$
Extra - Derivation of Chebyshev’s inequality
Notice that the variable $(X - \mu)^2$ is always nonnegative. Chebyshev's inequality is a simple application of Markov's inequality to this variable.
Specifically, using $a^2$ as the Markov constant, Markov's inequality yields:

$$P\big((X - \mu)^2 \geq a^2\big) \leq \frac{E\big[(X - \mu)^2\big]}{a^2}$$
Then, by monotonicity of square roots, the two events coincide:

$$(X - \mu)^2 \geq a^2 \iff |X - \mu| \geq a$$
And of course $E\big[(X - \mu)^2\big] = \mathrm{Var}(X) = \sigma^2$. Chebyshev's inequality follows.
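The derivation can be mirrored numerically: applying Markov's bound to $Y = (X - \mu)^2$ with constant $a^2$ reproduces exactly the Chebyshev bound. Here $X \sim \mathrm{Normal}(0, 1)$ is an assumed example distribution:

```python
import numpy as np

# Markov's bound on the nonnegative variable Y = (X - mu)^2, with constant
# a^2, equals sigma^2 / a^2 -- the Chebyshev bound. Illustrative sketch.
rng = np.random.default_rng(3)
x = rng.normal(size=100_000)
mu, a = x.mean(), 2.0
y = (x - mu) ** 2                      # nonnegative, with E[Y] = Var(X)
markov_on_y = y.mean() / a**2          # E[Y] / a^2 = sigma^2 / a^2
tail = (np.abs(x - mu) >= a).mean()    # same event as Y >= a^2
print(f"P(|X-mu|>=a) ~ {tail:.4f}  <=  sigma^2/a^2 ~ {markov_on_y:.4f}")
```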
Theory 3 - Law of Large Numbers
Let $X_1, X_2, \ldots$ be a collection of IID random variables with $E[X_i] = \mu$ for any $i$, and $\mathrm{Var}(X_i) = \sigma^2$ for any $i$.