01 Theory - Bernoulli, binomial, geometric, Pascal, uniform
In a Bernoulli process, an experiment with binary outcomes is repeated; for example flipping a coin repeatedly. Several discrete random variables may be defined in the context of some Bernoulli process.
Notice that the sample space of a Bernoulli process is infinite: an outcome is any sequence of trial outcomes, e.g. $SSFSFF\ldots$ (writing $S$ for success and $F$ for failure).
Bernoulli variable
A random variable $X_i$ is a Bernoulli indicator, written $X_i \sim \mathrm{Bernoulli}(p)$, when $X_i$ indicates whether a success event, having probability $p$, took place in trial number $i$ of a Bernoulli process.
Bernoulli indicator PMF: $p_{X_i}(1) = p, \quad p_{X_i}(0) = 1 - p$
An RV that always gives either $0$ or $1$ for every outcome is called an indicator variable.
Binomial variable
A random variable $X$ is binomial, written $X \sim \mathrm{Bin}(n, p)$, when $X$ counts the number of successes in a Bernoulli process, each trial having success probability $p$, over a specified number $n$ of trials.
Binomial PMF: $p_X(k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0, 1, \ldots, n$
For example, if $X \sim \mathrm{Bin}(10, p)$, then $p_X(5)$ gives the probability that success happens exactly 5 times over 10 trials, with probability $p$ of success for each trial.
In terms of the Bernoulli indicators, we have: $X = X_1 + X_2 + \cdots + X_n$
If $S$ is the success event, then $p = P(S)$, and $q = 1 - p$ is the probability of failure.
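As a quick numerical sketch of the binomial PMF (the helper name `binom_pmf` and the values $n = 10$, $p = 1/2$ are my own choices, not from the text):

```python
from math import comb

def binom_pmf(k, n, p):
    # P(X = k) for X ~ Bin(n, p): choose which k of the n trials succeed,
    # times the probability of any one such sequence.
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.5
print(binom_pmf(5, n, p))                             # chance of exactly 5 successes
print(sum(binom_pmf(k, n, p) for k in range(n + 1)))  # total mass is 1
```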
Geometric variable
A random variable $X$ is geometric, written $X \sim \mathrm{Geom}(p)$, when $X$ counts the discrete wait time in a Bernoulli process until the first success takes place, given that success has probability $p$ in each trial.
Geometric PMF: $p_X(k) = (1-p)^{k-1} p, \quad k = 1, 2, 3, \ldots$
For example, if $X \sim \mathrm{Geom}(p)$, then $p_X(k)$ gives the probability of getting: failure on the first $k-1$ trials AND success on the $k$th trial.
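A minimal sketch of the geometric PMF (the choice $p = 1/6$, as in waiting for a particular face of a fair die, is my own illustration):

```python
def geom_pmf(k, p):
    # P(X = k) for X ~ Geom(p): k-1 failures followed by one success.
    return (1 - p)**(k - 1) * p

p = 1/6  # e.g. rolling a die until a chosen face appears
print(geom_pmf(3, p))                              # fail, fail, then succeed on trial 3
print(sum(geom_pmf(k, p) for k in range(1, 500)))  # total mass approaches 1
```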
Pascal variable
A random variable $X$ is Pascal, written $X \sim \mathrm{Pascal}(r, p)$, when $X$ counts the discrete wait time in a Bernoulli process until success happens $r$ times, given that success has probability $p$ in each trial.
Pascal PMF: $p_X(k) = \binom{k-1}{r-1} p^r (1-p)^{k-r}, \quad k = r, r+1, r+2, \ldots$
For example, if $X \sim \mathrm{Pascal}(r, p)$, then $p_X(k)$ gives the probability of getting: the $r$th success on (precisely) the $k$th trial.
Interpret the formula: $\binom{k-1}{r-1}$ ways to arrange the first $r-1$ successes among the $k-1$ 'prior' trials, times the probability $p^r (1-p)^{k-r}$ of exactly $r$ successes and $k-r$ failures in one particular sequence.
The Pascal distribution is also called the negative binomial distribution, e.g. $X \sim \mathrm{NB}(r, p)$.
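A sketch of the Pascal PMF (function name and example values are my own); as a sanity check, $r = 1$ recovers the geometric PMF:

```python
from math import comb

def pascal_pmf(k, r, p):
    # P(X = k) for X ~ Pascal(r, p): the r-th success lands exactly on trial k.
    return comb(k - 1, r - 1) * p**r * (1 - p)**(k - r)

print(pascal_pmf(7, 3, 0.5))  # 3rd success exactly on the 7th flip of a fair coin
# With r = 1 the Pascal PMF reduces to the geometric PMF (1-p)^(k-1) * p:
print(pascal_pmf(4, 1, 0.5), (1 - 0.5)**3 * 0.5)
```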
Uniform variable
A discrete random variable $X$ is uniform on a finite set $A$, written $X \sim \mathrm{Unif}(A)$, when the probability $p_X(x)$ is a fixed constant for outcomes $x$ in $A$ and zero for outcomes outside $A$.
Discrete uniform PMF: $p_X(x) = \frac{1}{|A|}$ for $x \in A$, and $p_X(x) = 0$ otherwise.
Continuous uniform PDF: $f_X(x) = \frac{1}{b-a}$ for $x \in [a, b]$, and $f_X(x) = 0$ otherwise, written $X \sim \mathrm{Unif}(a, b)$.
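A small sketch contrasting the constant PMF with the constant density (the set $A = \{1, \ldots, 6\}$ and the interval $[0, 2]$ are my own choices):

```python
# Discrete uniform on A = {1, ..., 6}: constant mass 1/|A| on A, zero elsewhere.
A = list(range(1, 7))
pmf = {x: 1 / len(A) for x in A}
print(pmf[3], sum(pmf.values()))  # 1/6 each; masses sum to 1

# Continuous uniform on [a, b]: constant density 1/(b - a) on the interval.
a, b = 0.0, 2.0
f = lambda x: 1 / (b - a) if a <= x <= b else 0.0
print(f(1.0), f(5.0))  # 0.5 inside the interval, 0 outside
```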
02 Illustration
Example - Roll die until
Example - Winning the World Series
Expectation and variance
03 Theory - Expectation and variance
Expected value
The expected value $E[X]$ of a random variable $X$ is the weighted average of the values of $X$, weighted by the probability of those values.
Discrete formula using PMF: $E[X] = \sum_x x \, p_X(x)$
Continuous formula using PDF: $E[X] = \int_{-\infty}^{\infty} x \, f_X(x) \, dx$
Notes:
Expected value is sometimes called expectation, or even just mean, although the latter is best reserved for statistics.
The Greek letter $\mu$ is also used in contexts where 'mean' is used.
Let $X$ be a random variable, and write $\mu = E[X]$.
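The discrete expectation formula can be checked on a small example (a fair six-sided die, my own choice):

```python
# E[X] = sum over x of x * p_X(x), here for a fair six-sided die.
pmf = {x: 1/6 for x in range(1, 7)}
EX = sum(x * p for x, p in pmf.items())
print(EX)  # (1 + 2 + ... + 6) / 6 = 3.5
```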
Variance
The variance $\mathrm{Var}(X)$ measures the average squared deviation of $X$ from $\mu$. It estimates how concentrated $X$ is around $\mu$.
Defining formula: $\mathrm{Var}(X) = E[(X - \mu)^2]$
Shorter formula: $\mathrm{Var}(X) = E[X^2] - \mu^2$
Calculating variance
Discrete formula using PMF: $\mathrm{Var}(X) = \sum_x (x - \mu)^2 \, p_X(x)$
Continuous formula using PDF: $\mathrm{Var}(X) = \int_{-\infty}^{\infty} (x - \mu)^2 \, f_X(x) \, dx$
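The defining and shorter formulas can be checked against each other on a fair die (my own example):

```python
# Var(X) two ways for a fair die: E[(X - mu)^2] versus E[X^2] - mu^2.
pmf = {x: 1/6 for x in range(1, 7)}
mu = sum(x * p for x, p in pmf.items())                    # E[X] = 3.5
var_def = sum((x - mu)**2 * p for x, p in pmf.items())     # defining formula
var_short = sum(x**2 * p for x, p in pmf.items()) - mu**2  # shorter formula
print(var_def, var_short)  # both equal 35/12, about 2.9167
```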
Standard deviation
The quantity $\sigma = \sqrt{\mathrm{Var}(X)}$ is called the standard deviation of $X$.
04 Illustration
Exercise - Tokens in bins
Example - Expected value: rolling dice
Example - Expected value by finding new PMF
Exercise - Variance using simplified formula
Poisson process
05 Theory - Poisson variable
Poisson variable
A random variable $X$ is Poisson, written $X \sim \mathrm{Pois}(\lambda)$, when $X$ counts the number of "arrivals" in a fixed "interval." It is applicable when:
The arrivals come at a constant average rate $\lambda$ per interval.
The arrivals are independent of each other.
Poisson PMF: $p_X(k) = e^{-\lambda} \frac{\lambda^k}{k!}, \quad k = 0, 1, 2, \ldots$
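A quick sketch of the Poisson PMF (the helper name `pois_pmf` and the rate $\lambda = 2$ are my own choices):

```python
from math import exp, factorial

def pois_pmf(k, lam):
    # P(X = k) for X ~ Pois(lam).
    return exp(-lam) * lam**k / factorial(k)

lam = 2.0
print(pois_pmf(0, lam))                          # no arrivals in the interval: e^(-2)
print(sum(pois_pmf(k, lam) for k in range(60)))  # total mass approaches 1
```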
A Poisson variable is comparable with a binomial variable. Both count the occurrences of some “arrivals” over some “space of opportunity.”
The binomial opportunity is a set of $n$ repetitions of a trial.
The Poisson opportunity is a continuous interval of time.
In the binomial case, success occurs at some rate $p$ per trial, since $p = P(S)$ where $S$ is the success event.
In the Poisson case, arrivals occur at some rate $\lambda$ per interval.
The Poisson distribution is actually the limit of binomial distributions by taking $n \to \infty$ while $\lambda = np$ remains fixed, so $p \to 0$ in perfect balance with $n \to \infty$.
Let $X_n \sim \mathrm{Bin}(n, p_n)$ and let $X \sim \mathrm{Pois}(\lambda)$. Fix $\lambda$ and let $p_n = \lambda/n$. Then for any $k$: $\lim_{n \to \infty} P(X_n = k) = P(X = k)$
For example, let $p_n = \lambda/n$, so $n p_n = \lambda$ is fixed, and look at $P(X_n = k)$ as $n \to \infty$:
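This convergence can be observed numerically (the choices $\lambda = 2$ and $k = 3$ below are arbitrary illustrations of mine):

```python
from math import comb, exp, factorial

lam, k = 2.0, 3
pois = exp(-lam) * lam**k / factorial(k)       # the Poisson limit value
for n in (10, 100, 1000, 10000):
    p = lam / n                                # keep np = lam fixed
    binom = comb(n, k) * p**k * (1 - p)**(n - k)
    print(n, round(binom, 6), round(abs(binom - pois), 6))  # gap shrinks as n grows
```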
Interpretation - Binomial model of rare events
Let us interpret the assumptions of this limit. For large $n$ but small $p$ such that $np$ remains moderate, the binomial distribution describes a large number of trials, a low probability of success per trial, but a moderate total count of successes.
This setup describes a physical system with a large number of parts that may activate, but each part is unlikely to activate; and yet the number of parts is so large that the total number of arrivals is still moderate.
06 Illustration
Example - Radioactive decay is Poisson
Consider a macroscopic sample of Uranium.
Each atom decays independently of the others, and the likelihood of a single atom popping off is very low; but the product of this likelihood with the total number of atoms is a moderate number.
So there is some constant average rate of atoms in the sample popping off, and the number of pops per minute follows a Poisson distribution.
Example - Calls to a call center are Poisson
Consider a call center that receives help requests from users of a popular phone manufacturer.
The total number of users is very large, and the likelihood of any given user calling in a given minute is very small, but the product of these rates is moderate.
So there is some constant average rate of calls to the center, and the number of calls per minute follows a Poisson distribution.
Exercise - Typos per page
A draft of a textbook has an average of 6 typos per page.
What is the probability that a randomly chosen page has typos?
Answer: 0.849
Hint: study the complementary event.
Example - Arrivals at a post office
07 Theory - Poisson limit of binomial
Extra - Derivation of binomial limit to Poisson
Consider a random variable $X \sim \mathrm{Bin}(n, p)$, and suppose $n$ is very large.
Suppose also that $p$ is very small, such that $np$ is not very large, but the extremes of $n$ and $p$ counteract each other. (Notice that then $npq$ will not be large, so the normal approximation does not apply.) In this case, the binomial PMF can be approximated using a factor of $e^{-np}$. Consider the following rearrangement of the binomial PMF:
$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k} = \left[ \frac{n(n-1)\cdots(n-k+1)}{n^k} \right] \cdot \frac{(np)^k}{k!} \cdot (1-p)^n \cdot (1-p)^{-k}$$
Since $n$ is very large, the factor in brackets is approximately $1$, and since $p$ is very small, the last factor of $(1-p)^{-k}$ is also approximately $1$ (provided we consider $k$ small compared to $n$). So we have:
$$P(X = k) \approx \frac{(np)^k}{k!} \, (1-p)^n$$
Write $\lambda = np$, a moderate number, to find:
$$P(X = k) \approx \frac{\lambda^k}{k!} \left(1 - \frac{\lambda}{n}\right)^n$$
Here at last we find the factor $e^{-\lambda}$, since $\left(1 - \frac{\lambda}{n}\right)^n \to e^{-\lambda}$ as $n \to \infty$. So as $n \to \infty$:
$$P(X = k) \to e^{-\lambda} \frac{\lambda^k}{k!}$$
Extra - Binomial limit to Poisson and divisibility
Consider a sequence of $n$ increasing with $p$ decreasing such that $\lambda = np$ is held fixed. For example, let $n \to 2n$ while $p \to p/2$.
Think of this process as increasing the number of causal agents represented: group the agents together into bunches, and consider the odds that such a bunch activates. (For the call center, a bunch is a group of users; for radioactive decay, a bunch is a unit of mass of Uranium atoms.)
As $n$ doubles, we imagine the number of agents per bunch to drop by half. (For the call center, we divide a group in half, so twice as many groups but half the odds of one group making a call; for the Uranium, we divide a chunk of mass in half, getting twice as many portions with half the odds of a decay occurring in each portion.)
This process is formally called division of a distribution, and the fact that the Poisson distribution arises as the limit of such division means that it is infinitely divisible.
Extra - Theorem: Poisson approximation of the binomial
Suppose $X_n \sim \mathrm{Bin}(n, \lambda/n)$ and $X \sim \mathrm{Pois}(\lambda)$. Then:
$$\lim_{n \to \infty} P(X_n = k) = P(X = k) = e^{-\lambda} \frac{\lambda^k}{k!}$$
for any $k \geq 0$.
As a consequence of this theorem, a Poisson distribution may be used to approximate the probabilities of a binomial distribution for large $n$, when it is impracticable (even for a computer) to calculate large binomial coefficients.
The theorem shows that the Poisson approximation is appropriate when $\lambda = np$ is a moderate number while $p$ is a small number.
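As a sketch of the approximation in use (the values of $n$ and $p$ below are invented for illustration):

```python
from math import comb, exp, factorial

n, p = 10_000, 0.0003  # many trials, tiny success probability
lam = n * p            # lam = 3, a moderate number
k = 2
exact = comb(n, k) * p**k * (1 - p)**(n - k)  # binomial probability
approx = exp(-lam) * lam**k / factorial(k)    # Poisson approximation
print(exact, approx)   # the two agree to several decimal places
```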