Deviations
01 Theory - Tail estimation
Theory 2 - Tail estimation
Every distribution must trail off to zero for large enough $|x|$. The regions where $f_X$ trails off to zero (large magnitude of $x$) are informally called ‘tails’.
Tail probabilities
A tail probability is a probability with one of these forms, for any $a$:

$$P(X \ge a) \qquad \text{or} \qquad P(X \le a)$$
Markov’s inequality
Assume that $X \ge 0$. Take any $a > 0$.
Then Markov’s inequality states:

$$P(X \ge a) \le \frac{E[X]}{a}$$
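As a quick sanity check (not part of the notes), the bound can be compared against an empirical tail probability; the choice of an Exponential(1) variable here is purely illustrative:

```python
import random

def markov_bound(mean, a):
    """Markov's inequality: P(X >= a) <= E[X] / a, for X >= 0 and a > 0."""
    return mean / a

# Empirical check with a nonnegative RV: Exponential(rate=1), so E[X] = 1.
random.seed(0)
samples = [random.expovariate(1.0) for _ in range(100_000)]
a = 3.0
empirical = sum(1 for x in samples if x >= a) / len(samples)
bound = markov_bound(1.0, a)  # = 1/3

print(f"empirical P(X >= {a}) = {empirical:.4f}, Markov bound = {bound:.4f}")
```

The true tail probability here is $e^{-3} \approx 0.05$, well under the bound of $1/3$; Markov's inequality is valid but often loose.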
Chebyshev’s inequality
Take any RV $X$ with mean $\mu$ and variance $\sigma^2$, and any $a > 0$.
Then Chebyshev’s inequality states:

$$P(|X - \mu| \ge a) \le \frac{\sigma^2}{a^2}$$
Markov vs. Chebyshev
Chebyshev’s inequality works for any RV $X$ (not just nonnegative ones), and it usually gives a better estimate than Markov’s inequality.
The main value of Markov’s inequality is that it only requires knowledge of $E[X]$.
Think of Chebyshev’s inequality as a tightening of Markov’s inequality using the additional data of $\mathrm{Var}(X)$.
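A small numeric comparison of the two bounds, for a hypothetical RV with $E[X] = 10$ and $\mathrm{Var}(X) = 4$ (the numbers are illustrative):

```python
def markov(mean, a):
    # P(X >= a) <= E[X] / a   (requires X >= 0 and a > 0)
    return mean / a

def chebyshev(var, t):
    # P(|X - mu| >= t) <= Var[X] / t^2   (requires t > 0)
    return var / t**2

mean, var = 10.0, 4.0
a = 16.0
m = markov(mean, a)           # 10/16 = 0.625
# {X >= 16} is contained in {|X - 10| >= 6}, so take t = 6:
c = chebyshev(var, a - mean)  # 4/36 ~ 0.111
print(m, c)
```

Here knowing the variance cuts the tail estimate from 0.625 down to about 0.111.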
Derivation of Markov’s inequality - Continuous RV
Under the hypotheses that $X \ge 0$ and $a > 0$, we have:

$$E[X] = \int_0^\infty x\, f_X(x)\, dx \ge \int_a^\infty x\, f_X(x)\, dx$$

On the range $[a, \infty)$ we may convert $x$ to $a$, making the integrand smaller:

$$\int_a^\infty x\, f_X(x)\, dx \ge \int_a^\infty a\, f_X(x)\, dx$$

Simplify:

$$\int_a^\infty a\, f_X(x)\, dx = a \int_a^\infty f_X(x)\, dx$$

Also:

$$\int_a^\infty f_X(x)\, dx = P(X \ge a)$$

Therefore:

$$E[X] \ge a\, P(X \ge a), \quad \text{i.e.} \quad P(X \ge a) \le \frac{E[X]}{a}$$
Extra - Derivation of Chebyshev’s inequality
Notice that the variable $(X - \mu)^2$ is always nonnegative. Chebyshev’s inequality is a simple application of Markov’s inequality to this variable.
Specifically, using $a^2$ as the Markov constant, Markov’s inequality yields:

$$P\big((X - \mu)^2 \ge a^2\big) \le \frac{E[(X - \mu)^2]}{a^2}$$

Then, by monotonicity of square roots:

$$P\big((X - \mu)^2 \ge a^2\big) = P(|X - \mu| \ge a)$$

And of course $E[(X - \mu)^2] = \mathrm{Var}(X) = \sigma^2$. Chebyshev’s inequality follows.
02 Illustration
Example - Markov and Chebyshev
Markov and Chebyshev
A tire shop has 500 customers per day on average.
(a) Estimate the odds that more than 700 customers arrive today.
(b) Assume the variance in daily customers is 100. Repeat (a) with this information.
Solution
Write for the number of daily customers.
(a) Using Markov’s inequality with $a = 700$, we have:

$$P(X \ge 700) \le \frac{E[X]}{700} = \frac{500}{700} \approx 0.714$$

(b) Using Chebyshev’s inequality with $a = 200$, we have:

$$P(X \ge 700) \le P(|X - 500| \ge 200) \le \frac{100}{200^2} = 0.0025$$
The Chebyshev estimate is much smaller!
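The arithmetic of the two estimates, computed explicitly:

```python
mean, var = 500, 100

# (a) Markov with a = 700:
markov_est = mean / 700       # 5/7 ~ 0.714

# (b) Chebyshev: {X >= 700} is contained in {|X - 500| >= 200}, so t = 200:
chebyshev_est = var / 200**2  # 100/40000 = 0.0025

print(markov_est, chebyshev_est)
```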
03 Theory - Sample mean
Theory 1 - Sample mean
Sample mean and its variance
The sample mean of a set of $n$ IID random variables $X_1, \ldots, X_n$ is an RV giving the average value:

$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i$$

Statistics of the sample mean:

$$E[\bar{X}_n] = \mu, \qquad \mathrm{Var}(\bar{X}_n) = \frac{\sigma^2}{n}$$
The sample mean is typically applied to repeated trials of an experiment. The trials are independent, and the probability distribution of outcomes should be identical from trial to trial.
Notice that the variance of the sample mean limits to 0 as $n \to \infty$. As more trials are repeated, and the average of all results is taken, the fluctuations of this average will shrink toward zero.
As $n \to \infty$, the distribution of $\bar{X}_n$ will converge to a PMF with all the probability concentrated at $\mu$ and none elsewhere.
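A simulation sketch of this shrinkage, using Uniform(0,1) trials (so $\mu = 1/2$ and $\sigma^2 = 1/12$; the distribution is an arbitrary choice):

```python
import random
import statistics

random.seed(1)

def sample_mean(n):
    # Average of n IID Uniform(0,1) draws.
    return sum(random.random() for _ in range(n)) / n

# Estimate Var(sample mean) for growing n and compare with sigma^2 / n.
results = {}
for n in (1, 10, 100):
    means = [sample_mean(n) for _ in range(20_000)]
    results[n] = statistics.variance(means)
    print(n, results[n], (1 / 12) / n)
```

The empirical variances track $\sigma^2/n = \tfrac{1}{12n}$, shrinking by a factor of 10 each time $n$ grows by 10.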
Large Numbers
04 Theory - Law of Large Numbers
Theory 3 - Law of Large Numbers
Let $X_1, X_2, X_3, \ldots$ be a collection of IID random variables with $E[X_i] = \mu$ for any $i$ and $\mathrm{Var}(X_i) = \sigma^2$ for any $i$.
Recall the sample mean:

$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i$$
Law of Large Numbers (weak form)
For any $\varepsilon > 0$, by Chebyshev’s inequality we have:

$$P(|\bar{X}_n - \mu| \ge \varepsilon) \le \frac{\sigma^2}{n \varepsilon^2}$$

Therefore:

$$\lim_{n \to \infty} P(|\bar{X}_n - \mu| \ge \varepsilon) = 0$$
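The finite-$n$ Chebyshev bound can be tabulated directly; the values $\sigma^2 = 1$ and $\varepsilon = 0.1$ below are illustrative:

```python
def lln_bound(sigma2, n, eps):
    # Chebyshev bound: P(|sample mean - mu| >= eps) <= sigma^2 / (n * eps^2).
    return sigma2 / (n * eps**2)

sigma2, eps = 1.0, 0.1
bounds = [lln_bound(sigma2, n, eps) for n in (100, 10_000, 1_000_000)]
print(bounds)  # approximately [1.0, 0.01, 0.0001]
```

With $n = 100$ samples the bound is vacuous, but it tends to 0 as $n \to \infty$, which is exactly the weak LLN.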
05 Illustration
Example - LLN: Average winnings
LLN: Average winnings
A roulette player bets as follows: he wins $100 with probability 0.48 and loses $100 with probability 0.52. The expected winnings after a single round is therefore $0.48 \cdot 100 + 0.52 \cdot (-100) = -4$, a loss of $4.
By the LLN, if the player plays repeatedly for a long time, he expects to lose $4 per round on average.
The ‘expects’ in the last sentence means: the PMF of the cumulative average winnings approaches this PMF:

$$p(-4) = 1, \qquad p(x) = 0 \text{ for } x \ne -4$$
This is by contrast to the ‘expects’ of expected value: the probability of achieving the expected value (or something near) may be low or zero! For example, a single round of this game cannot result in a $4 loss.
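A simulation of many rounds shows the running average settling near $-4$, even though no single round ever yields $-4$ (the round count here is arbitrary):

```python
import random

random.seed(42)

def play_round():
    # Win $100 with probability 0.48, lose $100 with probability 0.52.
    return 100 if random.random() < 0.48 else -100

n = 200_000
avg = sum(play_round() for _ in range(n)) / n
print(f"average winnings per round after {n} rounds: {avg:.2f}")
```

The per-round standard deviation is nearly $100, so the average of 200,000 rounds fluctuates around $-4$ by only about $\pm 0.22$.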
Exercise - Enough samples
Enough samples
Suppose are IID samples of .
(a) Compute and and .
(b) Use the finite LLN to find $n$ such that:
(c) How many samples are needed to guarantee that: