Law of Large Numbers
Markov and Chebyshev
A tire shop has 500 customers per day on average.
(a) Estimate the probability that more than 700 customers arrive today.
(b) Assume the variance in daily customers is 10. Repeat (a) with this information.
Solution
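Write $X$ for the number of customers arriving in a day, so $X \geq 0$ and $E[X] = 500$.
(a) Using Markov’s inequality with $c = 700$:
$$P[X \geq 700] \leq \frac{E[X]}{700} = \frac{500}{700} \approx 0.71$$
(b) Using Chebyshev’s inequality with $c = 200$:
$$P[|X - 500| \geq 200] \leq \frac{\text{Var}[X]}{200^2} = \frac{10}{40000} = 0.00025$$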
The Chebyshev estimate is much smaller!
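The two bounds can be checked numerically. The following is a minimal sketch; the function names `markov_bound` and `chebyshev_bound` are illustrative choices, not part of the original notes.

```python
def markov_bound(mean: float, threshold: float) -> float:
    """Markov: P[X >= threshold] <= mean / threshold, for X >= 0 and threshold > 0."""
    return min(1.0, mean / threshold)


def chebyshev_bound(variance: float, distance: float) -> float:
    """Chebyshev: P[|X - mean| >= distance] <= variance / distance**2."""
    return min(1.0, variance / distance**2)


mean_customers = 500  # average daily customers, from the problem statement
variance = 10         # variance given in part (b)
threshold = 700       # "more than 700 customers"

markov = markov_bound(mean_customers, threshold)
# Exceeding 700 means deviating from the mean by at least 700 - 500 = 200.
chebyshev = chebyshev_bound(variance, threshold - mean_customers)

print(f"Markov bound:    {markov:.4f}")
print(f"Chebyshev bound: {chebyshev:.6f}")
```

Both numbers are upper bounds on the probability, not the probability itself; Chebyshev is tighter here because it also uses the variance.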
LLN: Average winnings
A roulette player bets as follows: he wins $100 with probability 0.48 and loses $100 with probability 0.52. The expected winnings after a single round is therefore
$$E[\text{winnings}] = 100(0.48) + (-100)(0.52) = -4$$
By the LLN, if the player plays repeatedly for a long time, he expects to lose about $4 per round on average.
The ‘expects’ in the last sentence means: the PMF of the cumulative average winnings approaches this PMF:
$$P(x) = \begin{cases} 1 & x = -4 \\ 0 & x \neq -4 \end{cases}$$
This is in contrast to the ‘expects’ of expected value: the probability of achieving the expected value (or something near it) may be low or zero! For example, in a single round of this game the winnings are either +$100 or −$100, never the expected value of −$4.
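A short simulation illustrates how the cumulative average settles near the expected value as the number of rounds grows. This is only a sketch: the seed, the round counts, and the helper name `average_winnings` are arbitrary choices made for illustration.

```python
import random


def average_winnings(rounds: int, rng: random.Random) -> float:
    """Average winnings per round over `rounds` independent bets."""
    total = 0
    for _ in range(rounds):
        # Win $100 with probability 0.48, otherwise lose $100.
        total += 100 if rng.random() < 0.48 else -100
    return total / rounds


rng = random.Random(0)  # fixed seed (arbitrary) so the run is reproducible
averages = {n: average_winnings(n, rng) for n in (10, 1_000, 200_000)}

for n, avg in averages.items():
    print(f"{n:>7} rounds: average winnings {avg:+.2f}")
```

Short runs can land far from the expected value, while the longest run should sit close to it, which is the behavior the LLN describes.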
Enough samples
Suppose
(a) Compute
(b) Use the finite LLN to find
(c) How many samples
Statistical testing
One-tail test: Weighted die
Your friend gives you a single regular die, and says she is worried that it has been weighted to prefer the outcome of 2. She wants you to test it.
Design a significance test for the data of 20 rolls of the die to determine whether the die is weighted. Use significance level
Solution
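Let $X$ be the number of 2s that appear in the 20 rolls.
The Claim: “the die is weighted to prefer 2”
The null hypothesis $H_0$: the die is fair, so each roll shows 2 with probability $1/6$.
Assuming $H_0$, $X$ is binomial with $n = 20$ and $p = 1/6$.
⚠️ Notice that “prefer 2” implies the claim is for more 2s than normal.
Therefore: Choose a one-tail rejection set $\{X \geq c\}$.
Need the smallest $c$ with $P[X \geq c \mid H_0] \leq \alpha$, where $\alpha$ is the significance level.
- Equivalently: $P[X \leq c - 1 \mid H_0] \geq 1 - \alpha$
Solve for $c$ by computing values of the binomial CDF under $H_0$: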
| Number of 2s, $k$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| Probability of at most $k$ 2s in 20 rolls of a fair die | 0.026 | 0.130 | 0.329 | 0.567 | 0.769 | 0.898 | 0.963 | 0.989 |
Therefore, choose
The final answer is:
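As a cross-check, the CDF table and the cutoff can be recomputed directly from the binomial distribution. This is a sketch under one loud assumption: the significance level is not visible in the text above, so `ALPHA = 0.05` below is a hypothetical value chosen only for illustration, and the resulting cutoff applies to that assumed level only. The function names are likewise illustrative.

```python
from math import comb


def binomial_cdf(k: int, n: int, p: float) -> float:
    """P[X <= k] for X ~ Binomial(n, p); returns 0.0 when k < 0."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


def critical_value(n: int, p: float, alpha: float) -> int:
    """Smallest c with P[X >= c] <= alpha under the null hypothesis."""
    for c in range(n + 1):
        upper_tail = 1.0 - binomial_cdf(c - 1, n, p)
        if upper_tail <= alpha:
            return c
    return n + 1  # no achievable count is extreme enough; never reject


N_ROLLS = 20
P_FAIR = 1 / 6
ALPHA = 0.05  # ASSUMPTION: hypothetical level, not taken from the notes

cdf_table = [round(binomial_cdf(k, N_ROLLS, P_FAIR), 3) for k in range(8)]
cutoff = critical_value(N_ROLLS, P_FAIR, ALPHA)

print("CDF for k = 0..7:", cdf_table)
print(f"Reject the fair-die hypothesis at level {ALPHA} if the number of 2s is >= {cutoff}")
```

Changing `ALPHA` shows how the cutoff moves: a stricter level requires more 2s before the fair-die hypothesis is rejected.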
Two-tail test: Circuit voltage
A boosted AC circuit is supposed to maintain an average voltage of
Design a two-tail test incorporating the data of 40 independent measurements to determine if the expected value of the voltage is truly
Solution
Use
The Claim to test:
Rejection region:
where
Assuming
Recall Chebyshev’s inequality:
Now solve:
Therefore the rejection region should be:
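The Chebyshev step can be sketched in code. For the mean $M_n$ of $n$ independent measurements with standard deviation $\sigma$, Chebyshev gives $P[|M_n - \mu| \geq c] \leq \sigma^2 / (n c^2)$, and requiring the right side to be at most $\alpha$ yields $c \geq \sigma / \sqrt{n\alpha}$. The values of `SIGMA` and `ALPHA` below are hypothetical placeholders, since the problem's actual numbers are not visible in the text above; only the sample size of 40 comes from the problem statement.

```python
from math import sqrt


def chebyshev_two_tail_threshold(sigma: float, n: int, alpha: float) -> float:
    """Smallest c for which the Chebyshev bound sigma**2 / (n * c**2) is <= alpha."""
    return sigma / sqrt(n * alpha)


N_MEASUREMENTS = 40  # from the problem statement
SIGMA = 2.0          # ASSUMPTION: hypothetical standard deviation of one measurement
ALPHA = 0.05         # ASSUMPTION: hypothetical significance level

c = chebyshev_two_tail_threshold(SIGMA, N_MEASUREMENTS, ALPHA)
bound = SIGMA**2 / (N_MEASUREMENTS * c**2)

print(f"Reject if the sample mean differs from the target voltage by at least {c:.3f}")
print(f"Chebyshev bound on the false-rejection probability: {bound:.3f}")
```

Because Chebyshev makes no assumption about the shape of the distribution, this threshold is conservative: the true false-rejection probability may be well below `ALPHA`.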
One-tail test with a Gaussian: Weight loss drug
Assume that in the background population in a specific demographic, the distribution of a person’s weight
Design a test at the
Solution
Since the drug is tested on 64 individuals, we use the sample mean
The Claim: “the drug is effective in reducing weight”
The null hypothesis
Assuming
⚠️ One-tail test because the drug is expected to reduce weight (unidirectional).
Rejection region:
Compute that
Since
Furthermore:
Then:
Solve:
Therefore, the rejection region:
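The Gaussian one-tail threshold can be sketched with Python's standard library. Under the null hypothesis the mean of $n$ independent Gaussian weights is itself Gaussian with the same mean and standard deviation $\sigma / \sqrt{n}$, so the cutoff is the $\alpha$-quantile of that distribution. The values of `MU0`, `SIGMA`, and `ALPHA` below are hypothetical placeholders, since the problem's actual numbers are not visible in the text above; only the sample size of 64 comes from the solution.

```python
from math import sqrt
from statistics import NormalDist


def one_tail_lower_threshold(mu0: float, sigma: float, n: int, alpha: float) -> float:
    """Threshold c with P[M_n <= c] = alpha when M_n ~ Normal(mu0, sigma**2 / n)."""
    standard_error = sigma / sqrt(n)
    return NormalDist(mu0, standard_error).inv_cdf(alpha)


N_SUBJECTS = 64  # from the solution text
MU0 = 200.0      # ASSUMPTION: hypothetical background mean weight
SIGMA = 16.0     # ASSUMPTION: hypothetical background standard deviation
ALPHA = 0.05     # ASSUMPTION: hypothetical significance level

c = one_tail_lower_threshold(MU0, SIGMA, N_SUBJECTS, ALPHA)
print(f"Reject the null hypothesis if the sample mean weight is <= {c:.2f}")
```

The rejection region lies only in the lower tail because the claim is that the drug reduces weight.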