Bernoulli process

01 Theory - Bernoulli, binomial, geometric, Pascal, uniform

Theory 1 - Bernoulli, binomial, geometric, Pascal, uniform

In a Bernoulli process, an experiment with binary outcomes is repeated; for example, a coin is flipped repeatedly. Several discrete random variables may be defined in the context of a Bernoulli process.

Notice that the sample space of a Bernoulli process is infinite: an outcome is any sequence of trial outcomes, e.g. $\omega = (H, T, T, H, T, \dots)$.

Bernoulli variable

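A random variable $X_i$ is a Bernoulli indicator, written $X_i \sim \text{Ber}(p)$, when $X_i$ indicates whether a success event, having probability $p$, took place in trial number $i$ of a Bernoulli process.

Bernoulli indicator PMF:

$$P_{X_i}(k) = \begin{cases} p & k = 1 \\ 1 - p & k = 0 \\ 0 & \text{otherwise} \end{cases}$$

An RV that always gives either $0$ or $1$ for every outcome is called an indicator variable.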

Binomial variable

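A random variable $S$ is binomial, written $S \sim \text{Bin}(n, p)$, when $S$ counts the number of successes in a Bernoulli process, each having probability $p$, over a specified number $n$ of trials.

Binomial PMF:

$$P_S(k) = \binom{n}{k} p^k q^{n-k}, \qquad k = 0, 1, \dots, n, \quad q = 1 - p$$

  • For example, if $S \sim \text{Bin}(10, p)$, then $P_S(5)$ gives the probability that success happens exactly 5 times over 10 trials, with probability $p$ of success for each trial.
  • In terms of the Bernoulli indicators, we have: $S = X_1 + X_2 + \cdots + X_n$.
  • If $A$ is the success event, then $p = P[A]$, and $q = 1 - p$ is the probability of failure.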
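The link between the binomial PMF and the sum of Bernoulli indicators can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names and the values `n = 6`, `p = 0.3` are assumptions chosen only for illustration. It compares the closed-form PMF with a brute-force sum over every 0/1 indicator sequence.

```python
from itertools import product
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P[S = k] for S ~ Bin(n, p), using the closed-form PMF."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_pmf_by_enumeration(k: int, n: int, p: float) -> float:
    """Add up the probability of every indicator sequence (X_1..X_n) with sum k."""
    total = 0.0
    for indicators in product((0, 1), repeat=n):
        if sum(indicators) == k:
            # One particular sequence with k successes and n - k failures.
            total += p**k * (1 - p) ** (n - k)
    return total


n, p = 6, 0.3  # hypothetical values, for illustration only
for k in range(n + 1):
    closed = binomial_pmf(k, n, p)
    brute = binomial_pmf_by_enumeration(k, n, p)
    assert abs(closed - brute) < 1e-12
```

The brute-force version is exponential in `n`, so it is only useful as a sanity check for small cases.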

Geometric variable

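A random variable $N$ is geometric, written $N \sim \text{Geom}(p)$, when $N$ counts the discrete wait time in a Bernoulli process until the first success takes place, given that success has probability $p$ in each trial.

Geometric PMF:

$$P_N(k) = q^{k-1} p, \qquad k = 1, 2, 3, \dots, \quad q = 1 - p$$

  • For example, if $N \sim \text{Geom}(p)$, then $P_N(k)$ gives the probability of getting: failure on the first $k - 1$ trials AND success on the $k^{\text{th}}$ trial.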

Pascal variable

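A random variable $L$ is Pascal, written $L \sim \text{Pasc}(\ell, p)$, when $L$ counts the discrete wait time in a Bernoulli process until success happens $\ell$ times, given that success has probability $p$ in each trial.

Pascal PMF:

$$P_L(k) = \binom{k-1}{\ell-1} p^{\ell} q^{k-\ell}, \qquad k = \ell, \ell + 1, \dots, \quad q = 1 - p$$

  • For example, if $L \sim \text{Pasc}(\ell, p)$, then $P_L(k)$ gives the probability of getting: the $\ell^{\text{th}}$ success on (precisely) the $k^{\text{th}}$ trial.
  • Interpret the formula: $\binom{k-1}{\ell-1}$ ways to arrange $\ell - 1$ successes among $k - 1$ ‘prior’ trials, times the probability of exactly $\ell$ successes and $k - \ell$ failures in one particular sequence.
  • The Pascal distribution is also called the negative binomial distribution, e.g. $L \sim \text{NB}(\ell, p)$.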
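The geometric and Pascal PMFs can be sketched in a few lines. This is an illustrative sketch only: the function names and the parameter name `ell` (standing for the required number of successes) are assumptions, not notation from the notes. It also shows that a Pascal variable waiting for a single success reduces to a geometric variable.

```python
from math import comb


def geometric_pmf(k: int, p: float) -> float:
    """P[N = k]: failure on trials 1..k-1, then success on trial k."""
    if k < 1:
        return 0.0
    return (1 - p) ** (k - 1) * p


def pascal_pmf(k: int, ell: int, p: float) -> float:
    """P[L = k]: the ell-th success lands exactly on trial k."""
    if k < ell:
        return 0.0
    # Choose positions for the first ell - 1 successes among the k - 1 prior trials.
    return comb(k - 1, ell - 1) * p**ell * (1 - p) ** (k - ell)


# Waiting for one success is the geometric case.
for k in range(1, 20):
    assert abs(pascal_pmf(k, 1, 0.25) - geometric_pmf(k, 0.25)) < 1e-15
```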

Uniform variable

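A discrete random variable $X$ is uniform on a finite set $A$, written $X \sim \text{Unif}(A)$, when the probability is a fixed constant for outcomes in $A$ and zero for outcomes outside $A$.

Discrete uniform PMF:

$$P_X(k) = \begin{cases} \dfrac{1}{|A|} & k \in A \\ 0 & k \notin A \end{cases}$$

Continuous uniform PDF (on an interval $[a, b]$):

$$f_X(x) = \begin{cases} \dfrac{1}{b - a} & x \in [a, b] \\ 0 & \text{otherwise} \end{cases}$$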


02 Illustration

Example - Roll die until


Roll a fair die repeatedly. Find the probabilities that:

(a) At most 2 threes occur in the first 5 rolls.

(b) There is no three in the first 4 rolls, using a geometric variable.

Solution

(a)

(1) Labels.

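Use $S \sim \text{Bin}(5, \tfrac{1}{6})$ to count the number of threes among the first five rolls.

Seek $P[S \le 2]$ as the answer.


(2) Calculations.

Divide into exclusive events:

$$P[S \le 2] = P_S(0) + P_S(1) + P_S(2) = \left(\tfrac{5}{6}\right)^5 + 5 \cdot \tfrac{1}{6} \left(\tfrac{5}{6}\right)^4 + 10 \left(\tfrac{1}{6}\right)^2 \left(\tfrac{5}{6}\right)^3 = \tfrac{625}{648} \approx 0.965$$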
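Part (a) can be verified with exact arithmetic. The sketch below is an illustration added here, with variable names chosen as assumptions; it sums the binomial PMF of a $\text{Bin}(5, 1/6)$ variable over the exclusive events of 0, 1, and 2 threes.

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 6)   # probability of rolling a three
q = 1 - p            # probability of any other face
n = 5                # number of rolls considered

# Exclusive events: exactly 0, exactly 1, or exactly 2 threes.
at_most_two = sum(comb(n, k) * p**k * q ** (n - k) for k in range(3))

print(at_most_two)          # exact value as a fraction: 625/648
print(float(at_most_two))   # decimal value, roughly 0.9645
```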

(b)

(1) Labels.

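Use $N \sim \text{Geom}(\tfrac{1}{6})$ to give the roll number of the first time a three is rolled.

Seek $P[N > 4]$ as the answer.

Sum the PMF formula for $k \ge 5$.

(2) Compute:

$$P[N > 4] = \sum_{k=5}^{\infty} \left(\tfrac{5}{6}\right)^{k-1} \tfrac{1}{6}$$

(3) Geometric series formula.

For any geometric series with $|r| < 1$:

$$a + ar + ar^2 + \cdots = \frac{a}{1 - r}$$

Apply formula with $a = \left(\tfrac{5}{6}\right)^4 \tfrac{1}{6}$ and $r = \tfrac{5}{6}$:

$$P[N > 4] = \frac{\left(\tfrac{5}{6}\right)^4 \tfrac{1}{6}}{1 - \tfrac{5}{6}} = \left(\tfrac{5}{6}\right)^4$$

Final answer is $\left(\tfrac{5}{6}\right)^4 = \tfrac{625}{1296} \approx 0.482$.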
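The geometric series step in part (b) can be checked exactly. In this sketch (variable names and the cutoff `M = 199` are assumptions for illustration), a partial sum of the geometric PMF from roll 5 to roll `M` falls short of the closed form $(5/6)^4$ by exactly the leftover tail $(5/6)^M$.

```python
from fractions import Fraction

p = Fraction(1, 6)   # probability of a three on one roll
q = 1 - p

closed_form = q**4   # probability of no three in the first 4 rolls

# Partial sum of the geometric PMF q^(k-1) * p for k = 5, ..., M.
M = 199
partial_sum = sum(q ** (k - 1) * p for k in range(5, M + 1))

# The missing mass is the probability that the first three comes after roll M.
assert closed_form - partial_sum == q**M
print(closed_form)   # 625/1296
```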


Example - Cubs winning the World Series


Suppose the Cubs are playing the Yankees for the World Series. The first team to 4 wins in 7 games wins the series. What is the probability that the Cubs win the series?

Assume that for any given game the probability of the Cubs winning is $p$ and losing is $q = 1 - p$.

Solution

(a) Using a binomial distribution

(1) Label.

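Let $X \sim \text{Bin}(7, p)$.

Thus $P_X(4)$ is the probability that the Cubs win exactly 4 games over 7 played.

Seek $P_X(4) + P_X(5) + P_X(6) + P_X(7)$ as the answer. (Imagine that all 7 games are played even after the series is decided; the Cubs win the series exactly when they win at least 4 of the 7.)


(2) Calculate.

Use binomial PMF:

$$P_X(k) = \binom{7}{k} p^k q^{7-k}$$

(3) Insert data:

$$P[X \ge 4] = 35 p^4 q^3 + 21 p^5 q^2 + 7 p^6 q + p^7$$

(4) Compute:

Convert $q \to 1 - p$:

$$P[X \ge 4] = 35 p^4 - 84 p^5 + 70 p^6 - 20 p^7$$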


(b) Using a Pascal distribution

(1) Label.

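Let $Y \sim \text{Pasc}(4, p)$.

Thus $P_Y(k)$ is the probability that the Cubs win their $4^{\text{th}}$ game on game number $k$.

Seek $P_Y(4) + P_Y(5) + P_Y(6) + P_Y(7)$ as the answer.


(2) Calculate.

Use Pascal PMF:

$$P_Y(k) = \binom{k-1}{3} p^4 q^{k-4}$$

(3) Insert data:

$$P[Y \le 7] = p^4 + 4 p^4 q + 10 p^4 q^2 + 20 p^4 q^3$$

(4) Compute:

Convert $q \to 1 - p$:

$$P[Y \le 7] = p^4 \left( 35 - 84 p + 70 p^2 - 20 p^3 \right) = 35 p^4 - 84 p^5 + 70 p^6 - 20 p^7$$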

Notice: The algebra seems very different, right up to the end!
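The agreement between the two approaches can be confirmed numerically. This is a sketch under the assumption that the per-game win probability is a general $p$ with $q = 1 - p$; the function names are hypothetical. It evaluates the binomial sum, the Pascal sum, and the expanded polynomial $35p^4 - 84p^5 + 70p^6 - 20p^7$ (obtained by substituting $q = 1 - p$) at a few test values.

```python
from math import comb


def series_win_binomial(p: float) -> float:
    """At least 4 wins out of 7 games, imagining all 7 are played."""
    q = 1 - p
    return sum(comb(7, k) * p**k * q ** (7 - k) for k in range(4, 8))


def series_win_pascal(p: float) -> float:
    """Fourth win arrives on game 4, 5, 6, or 7."""
    q = 1 - p
    return sum(comb(k - 1, 3) * p**4 * q ** (k - 4) for k in range(4, 8))


def series_win_polynomial(p: float) -> float:
    """Either sum after substituting q = 1 - p and expanding."""
    return 35 * p**4 - 84 * p**5 + 70 * p**6 - 20 * p**7


for p in (0.1, 0.5, 0.55, 0.9):  # illustrative test values
    b = series_win_binomial(p)
    assert abs(b - series_win_pascal(p)) < 1e-12
    assert abs(b - series_win_polynomial(p)) < 1e-12
```

With evenly matched teams (`p = 0.5`) all three expressions give one half, as symmetry requires.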


Expectation and variance

03 Theory - Expectation and variance

Theory 1

Expected value

The expected value $E[X]$ of a random variable $X$ is the weighted average of the values of $X$, weighted by the probability of those values.

Discrete formula using PMF:

$$E[X] = \sum_k k \, P_X(k)$$

Continuous formula using PDF:

$$E[X] = \int_{-\infty}^{\infty} x \, f_X(x) \, dx$$

Notes:

  • Expected value is sometimes called expectation, or even just mean, although the latter is best reserved for statistics.
  • The Greek letter $\mu$ is also used in contexts where ‘mean’ is used.

Let $X$ be a random variable, and write $\mu = E[X]$.

Variance

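The variance $\text{Var}[X]$ measures the average squared deviation of $X$ from $\mu = E[X]$. It estimates how concentrated $X$ is around $\mu$.

  • Defining formula: $\text{Var}[X] = E\big[ (X - \mu)^2 \big]$
  • Shorter formula: $\text{Var}[X] = E[X^2] - E[X]^2$

Calculating variance

  • Discrete formula using PMF: $\text{Var}[X] = \sum_k (k - \mu)^2 \, P_X(k)$
  • Continuous formula using PDF: $\text{Var}[X] = \int_{-\infty}^{\infty} (x - \mu)^2 \, f_X(x) \, dx$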

Standard deviation

The quantity $\sigma_X = \sqrt{\text{Var}[X]}$ is called the standard deviation of $X$.
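The discrete formulas for expected value, variance, and standard deviation translate directly into code. The sketch below is illustrative only: representing a PMF as a dictionary from value to probability is an assumption of this example, and the fair-die PMF is chosen as a test case. It also checks that the defining formula and the shorter formula for variance agree.

```python
from fractions import Fraction
from math import sqrt


def expectation(pmf):
    """E[X]: sum of k * P_X(k) over the support."""
    return sum(k * prob for k, prob in pmf.items())


def variance(pmf):
    """Defining formula: E[(X - mu)^2]."""
    mu = expectation(pmf)
    return sum((k - mu) ** 2 * prob for k, prob in pmf.items())


def variance_short(pmf):
    """Shorter formula: E[X^2] - E[X]^2."""
    second_moment = sum(k**2 * prob for k, prob in pmf.items())
    return second_moment - expectation(pmf) ** 2


# Illustrative PMF (an assumption for this sketch): one fair six-sided die.
die_pmf = {k: Fraction(1, 6) for k in range(1, 7)}

mu = expectation(die_pmf)      # 7/2
var = variance(die_pmf)        # 35/12
assert var == variance_short(die_pmf)
sigma = sqrt(var)              # standard deviation, as a float
```

Exact fractions are used so that the two variance formulas can be compared with `==` rather than a floating-point tolerance.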


04 Illustration

Exercise - Tokens in bins

Gambling game - tokens in bins

Consider a game like this: a coin is flipped; if heads then draw a token from Bin 1, if tails then from Bin 2.

  • Bin 1 contents: 1 token $1,000, and 9 tokens $1
  • Bin 2 contents: 5 tokens $50, and 5 tokens $1

It costs $50 to enter the game. Should you play it? (A lot of times?) How much would you pay to play?

Solution


Example - Expected value: rolling dice


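Let $X$ be a random variable counting the number of dots given by rolling a single die.

Then:

$$E[X] = \tfrac{1}{6} (1 + 2 + 3 + 4 + 5 + 6) = \tfrac{7}{2}$$

Let $S$ be an RV that counts the dots on a roll of two dice.

The PMF of $S$:

$$\begin{array}{c|ccccccccccc} k & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 \\ \hline P_S(k) & \tfrac{1}{36} & \tfrac{2}{36} & \tfrac{3}{36} & \tfrac{4}{36} & \tfrac{5}{36} & \tfrac{6}{36} & \tfrac{5}{36} & \tfrac{4}{36} & \tfrac{3}{36} & \tfrac{2}{36} & \tfrac{1}{36} \end{array}$$

Then:

$$E[S] = \sum_{k=2}^{12} k \, P_S(k) = \tfrac{252}{36} = 7$$

Notice that $7 = \tfrac{7}{2} + \tfrac{7}{2}$.

In general, $E[X + Y] = E[X] + E[Y]$.

Let $X$ be a green die and $Y$ a red die.

From the earlier calculation, $E[X] = \tfrac{7}{2}$ and $E[Y] = \tfrac{7}{2}$.

Since $S = X + Y$, we derive $E[S] = 7$ by simple addition!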
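The two-dice calculation can be reproduced by enumerating all 36 equally likely (green, red) pairs. This is a small sketch with variable names chosen for illustration; it builds the PMF of the sum and compares its expected value with twice the single-die expectation.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely (green, red) pairs give each sum.
counts = Counter(green + red for green, red in product(range(1, 7), repeat=2))
pmf_sum = {k: Fraction(c, 36) for k, c in counts.items()}

expected_sum = sum(k * prob for k, prob in pmf_sum.items())
expected_single = sum(Fraction(k, 6) for k in range(1, 7))

# Linearity of expectation: E[X + Y] = E[X] + E[Y].
assert expected_sum == expected_single + expected_single
print(expected_single, expected_sum)   # 7/2 7
```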


Example - Expected value by finding new PMF


Let have distribution given by this PMF:

Find .

Solution

(1) Compute the PMF of .

PMF arranged by possible value:


(2) Calculate the expectation.

Using formula for discrete PMF:


Exercise - Variance using simplified formula

Variance for composite using PMF and simpler formula

Suppose has this PMF:

123

Find using the formula with .

(Hint: you should find and along the way.)


Poisson process

05 Theory - Poisson variable

Theory 1 - Poisson variable

Poisson variable

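A random variable $X$ is Poisson, written $X \sim \text{Pois}(\lambda)$, when $X$ counts the number of “arrivals” in a fixed “interval.” It is applicable when:

  • The arrivals come at a constant average rate $\lambda$.
  • The arrivals are independent of each other.

Poisson PMF:

$$P_X(k) = e^{-\lambda} \frac{\lambda^k}{k!}, \qquad k = 0, 1, 2, \dots$$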

A Poisson variable is comparable with a binomial variable. Both count the occurrences of some “arrivals” over some “space of opportunity.”

  • The binomial opportunity is a set of repetitions of a trial.
  • The Poisson opportunity is a continuous interval of time.

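In the binomial case, success occurs at some rate $p$, since $p = P[A]$ where $A$ is the success event.

In the Poisson case, arrivals occur at some rate $\lambda$.


The Poisson distribution is actually the limit of binomial distributions by taking $n \to \infty$ while $np$ remains fixed, so $p \to 0$ in perfect balance with $n \to \infty$.

Let $X \sim \text{Bin}(n, p)$ and let $Y \sim \text{Pois}(\lambda)$. Fix $\lambda$ and let $p = \lambda / n$. Then for any $k$:

$$P_X(k) \to P_Y(k) \quad \text{as } n \to \infty$$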

For example, fix a value of $\lambda$, so that $p = \lambda / n$, and watch $P_X(k)$ approach $P_Y(k)$ as $n \to \infty$.
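The convergence of binomial probabilities to the Poisson value can be watched numerically. The values $\lambda = 3$ and $k = 2$ below are hypothetical, chosen only for this sketch (they are not the values of the original example), and the function names are assumptions.

```python
from math import comb, exp, factorial


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P[X = k] for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """P[Y = k] for Y ~ Pois(lam)."""
    return exp(-lam) * lam**k / factorial(k)


lam, k = 3.0, 2   # hypothetical values, for illustration only
target = poisson_pmf(k, lam)

errors = []
for n in (10, 100, 1000, 10000):
    p = lam / n   # keeps n * p fixed at lam
    value = binomial_pmf(k, n, p)
    errors.append(abs(value - target))
    print(n, value)

print("Poisson limit:", target)
```

The printed binomial values settle toward the Poisson value as `n` grows while `n * p` stays fixed.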

Interpretation - Binomial model of rare events

Let us interpret the assumptions of this limit. For large $n$ but small $p$ such that $np$ remains moderate, the binomial distribution describes a large number of trials, a low probability of success per trial, but a moderate total count of successes.

This setup describes a physical system with a large number of parts that may activate, but each part is unlikely to activate; and yet the number of parts is so large that the total number of arrivals is still moderate.


06 Illustration

Example - Radioactive decay is Poisson


Consider a macroscopic sample of Uranium.

Each atom decays independently of the others, and the likelihood of a single atom popping off is very low; but the product of this likelihood by the total number of atoms is a moderate number.

So there is some constant average rate of atoms in the sample popping off, and the number of pops per minute follows a Poisson distribution.


Example - Arrivals at a post office


Client arrivals at a post office are modelled well using a Poisson variable.

Each potential client has a very low and independent chance of coming to the post office, but there are many thousands of potential clients, so the arrivals at the office actually come in moderate number.

Suppose the average rate is 5 clients per hour.

(a) Find the probability that nobody comes in the first 10 minutes of opening. (The cashier is considering being late by 10 minutes to run an errand on the way to work.)

(b) Find the probability that exactly 5 clients come in the first hour. (I.e. the average is achieved.)

(c) Find the probability that exactly 9 clients come in the first two hours.

Solution

(a)

(1) Convert rate for desired window.

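Expect $\tfrac{5}{6}$ clients every 10 minutes.

Let $X \sim \text{Pois}(\tfrac{5}{6})$.

Seek $P_X(0)$ as the answer.


(2) Compute.

Formula:

$$P_X(k) = e^{-\lambda} \frac{\lambda^k}{k!}$$

Insert data and compute:

$$P_X(0) = e^{-5/6} \frac{(5/6)^0}{0!} = e^{-5/6} \approx 0.435$$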

(b)

Rate is already correct.

Let $X \sim \text{Pois}(5)$.

Compute the answer:

$$P_X(5) = e^{-5} \frac{5^5}{5!} \approx 0.175$$

(c)

Convert rate for desired window.

Expect 10 clients every 2 hours.

Let $X \sim \text{Pois}(10)$.

Compute the answer:

$$P_X(9) = e^{-10} \frac{10^9}{9!} \approx 0.125$$

Notice that 0.125 is smaller than 0.175.
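The three post office answers can be recomputed by rescaling the hourly rate to each window. This is a minimal sketch; the function and variable names are assumptions made for illustration.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P[X = k] for X ~ Pois(lam), where lam is the expected count in the window."""
    return exp(-lam) * lam**k / factorial(k)


rate_per_hour = 5

# (a) window of 10 minutes: expected count 5 * (10 / 60) = 5/6
p_a = poisson_pmf(0, rate_per_hour * 10 / 60)

# (b) window of 1 hour: expected count 5
p_b = poisson_pmf(5, rate_per_hour * 1)

# (c) window of 2 hours: expected count 10
p_c = poisson_pmf(9, rate_per_hour * 2)

print(round(p_a, 3), round(p_b, 3), round(p_c, 3))   # 0.435 0.175 0.125
```

The key step in each part is converting the rate to the window of interest before applying the PMF.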


07 Theory - Poisson limit of binomial

Theory 2 - Poisson limit of binomial

Extra - Derivation of binomial limit to Poisson

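Consider a random variable $X \sim \text{Bin}(n, p)$, and suppose $n$ is very large.

Suppose also that $p$ is very small, such that $np$ is not very large, but the extremes of $n$ and $p$ counteract each other. (Notice that then $np(1-p)$ will not be large so the normal approximation does not apply.) In this case, the binomial PMF can be approximated using a factor of $e^{-np}$. Consider the following rearrangement of the binomial PMF:

$$P_X(k) = \binom{n}{k} p^k (1-p)^{n-k} = \frac{(np)^k}{k!} \left[ \frac{n (n-1) \cdots (n-k+1)}{n^k} \right] (1-p)^n (1-p)^{-k}$$

Since $n$ is very large, the factor in brackets is approximately $1$, and since $p$ is very small, the last factor of $(1-p)^{-k}$ is also approximately 1 (provided we consider $k$ small compared to $n$). So we have:

$$P_X(k) \approx \frac{(np)^k}{k!} (1-p)^n$$

Write $\lambda = np$, a moderate number, to find:

$$P_X(k) \approx \frac{\lambda^k}{k!} \left( 1 - \frac{\lambda}{n} \right)^n$$

Here at last we find $e^{-\lambda}$, since $\left( 1 - \frac{\lambda}{n} \right)^n \to e^{-\lambda}$ as $n \to \infty$. So as $n \to \infty$:

$$P_X(k) \to e^{-\lambda} \frac{\lambda^k}{k!}$$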


Extra - Binomial limit to Poisson and divisibility

Consider a sequence of increasing $n$ with decreasing $p$ such that $np$ is held fixed. For example, let $n \to 2n$ while $p \to p/2$.

Think of this process as increasing the number of causal agents represented: group the agents together into bunches, and consider the odds that such a bunch activates. (For the call center, a bunch is a group of users; for radioactive decay, a bunch is a unit of mass of Uranium atoms.)

As $n$ doubles, we imagine the number of agents per bunch to drop by half. (For the call center, we divide a group in half, so twice as many groups but half the odds of one group making a call; for the Uranium, we divide a chunk of mass in half, getting twice as many portions with half the odds of a decay occurring in each portion.)

This process is formally called division of a distribution, and the fact that the Poisson distribution arises as the limit of such division means that it is infinitely divisible.


Extra - Theorem: Poisson approximation of the binomial

Suppose $X \sim \text{Bin}(n, p)$ and $Y \sim \text{Pois}(np)$. Then:

$$\big| P[X \in A] - P[Y \in A] \big| \le n p^2$$

for any $A \subseteq \{0, 1, 2, \dots\}$.

In consequence of this theorem, a Poisson distribution may be used to approximate the probabilities of a binomial distribution for large $n$ when it is impracticable (even for a computer) to calculate large binomial coefficients.

The theorem shows that the Poisson approximation is appropriate when $np$ is a moderate number while $np^2$ is a small number.
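The quality of the approximation can be probed numerically. This sketch assumes the theorem's bound is the standard one, $|P[X \in A] - P[Y \in A]| \le np^2$ for every set $A$ (Le Cam's inequality); that reading, the function names, and the test pairs $(n, p)$ are all assumptions of this example. The largest gap over all sets $A$ equals half the sum of the pointwise differences, so that quantity is computed and compared with $np^2$.

```python
from math import comb, exp


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P[X = k] for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmfs(lam: float, k_max: int) -> list:
    """[P_Y(0), ..., P_Y(k_max)] via the recurrence P_Y(k+1) = P_Y(k) * lam / (k+1)."""
    probs = [exp(-lam)]
    for k in range(k_max):
        probs.append(probs[-1] * lam / (k + 1))
    return probs


def max_gap(n: int, p: float) -> float:
    """Largest |P[X in A] - P[Y in A]| over all sets A (half the L1 distance)."""
    pois = poisson_pmfs(n * p, n)
    diff = sum(abs(binomial_pmf(k, n, p) - pois[k]) for k in range(n + 1))
    # Poisson mass on k > n, where the binomial puts no probability.
    tail = max(0.0, 1.0 - sum(pois))
    return 0.5 * (diff + tail)


for n, p in ((20, 0.1), (100, 0.02), (200, 0.01)):   # illustrative cases
    gap, bound = max_gap(n, p), n * p * p
    assert gap <= bound
    print(n, p, gap, bound)
```

All three cases keep $np = 2$ fixed, so the bound $np^2$ shrinks in proportion to $p$ as $n$ grows.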
