Bernoulli process

01 Theory - Bernoulli, binomial, geometric, Pascal, uniform

In a Bernoulli process, an experiment with binary outcomes is repeated; for example flipping a coin repeatedly. Several discrete random variables may be defined in the context of some Bernoulli process.

Notice that the sample space of a Bernoulli process is infinite: an outcome is any infinite sequence of trial outcomes, e.g. HTHHTTHHHTTTHHHHTTTT…

Bernoulli variable

A random variable $X_i$ is a Bernoulli indicator, written $X_i \sim \text{Ber}(p)$, when $X_i$ indicates whether a success event, having probability $p$, took place in trial number $i$ of a Bernoulli process.

Bernoulli PMF:

$$P_{X_i}(k) = \begin{cases} p & k = 1 \\ q & k = 0 \\ 0 & \text{else} \end{cases}$$

Here $q = 1 - p$.

An RV that always gives either 0 or 1 for every outcome is called an indicator variable.

Binomial variable

A random variable $S$ is binomial, written $S \sim \text{Bin}(n, p)$, when $S$ counts the number of successes in a Bernoulli process, each having probability $p$, over a specified number $n$ of trials.

Binomial PMF:

$$P_S(k) = \binom{n}{k} p^k (1-p)^{n-k} \quad \text{for } k = 0, 1, 2, \dots, n$$
  • For example, if $S \sim \text{Bin}(10, 0.2)$, then $P_S(5)$ gives the probability that success happens exactly 5 times over 10 trials, with probability 0.2 of success for each trial.
  • In terms of the Bernoulli indicators, we have: $S = X_1 + X_2 + \cdots + X_n$
  • If $A$ is the success event, then $p = P[A]$ is the success probability, and $q = 1 - p$ is the failure probability.
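The binomial PMF above can be evaluated directly. The following is a minimal sketch in Python; the function name `binomial_pmf` is chosen here for illustration and is not part of the notes.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P_S(k) for S ~ Bin(n, p): probability of exactly k successes in n trials."""
    if k < 0 or k > n:
        return 0.0  # the PMF is zero outside k = 0, 1, ..., n
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Example from the text: S ~ Bin(10, 0.2), exactly 5 successes.
print(binomial_pmf(5, 10, 0.2))

# Sanity check: the PMF values over k = 0..n should add up to 1.
print(sum(binomial_pmf(k, 10, 0.2) for k in range(11)))
```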

Geometric variable

A random variable $N$ is geometric, written $N \sim \text{Geom}(p)$, when $N$ counts the discrete wait time in a Bernoulli process until the first success takes place, given that success has probability $p$ in each trial.

Geometric PMF:

$$P_N(k) = q^{k-1} p \quad \text{for } k = 1, 2, 3, \dots$$

Here $q = 1 - p$.

  • For example, if $N \sim \text{Geom}(30\%)$, then $P_N(7)$ gives the probability of getting: failure on the first 6 trials AND success on the 7th trial.
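As a quick numerical check of the geometric PMF, here is a minimal Python sketch; the function name `geometric_pmf` is a label chosen for this illustration.

```python
def geometric_pmf(k: int, p: float) -> float:
    """P_N(k) for N ~ Geom(p): first success occurs on trial number k."""
    if k < 1:
        return 0.0  # the wait time is at least 1 trial
    return (1 - p) ** (k - 1) * p

# Example from the text: N ~ Geom(0.3), six failures and then a success.
print(geometric_pmf(7, 0.3))
```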

Pascal variable

A random variable $L$ is Pascal, written $L \sim \text{Pasc}(\ell, p)$, when $L$ counts the discrete wait time in a Bernoulli process until success happens $\ell$ times, given that success has probability $p$ in each trial.

Pascal PMF:

$$P_L(k) = \binom{k-1}{\ell-1} (1-p)^{k-\ell} p^{\ell} \quad \text{for } k = \ell, \ell+1, \ell+2, \dots$$
  • For example, if $L \sim \text{Pasc}(3, 0.25)$, then $P_L(8)$ gives the probability of getting: the 3rd success on (precisely) the 8th trial.
  • Interpret the formula: $\binom{7}{2}$ counts the ways to arrange 2 successes among the 7 ‘prior’ trials, and $(0.75)^5 (0.25)^3$ is the probability of one specific sequence with exactly 3 successes and 5 failures.
  • The Pascal distribution is also called the negative binomial distribution, e.g. $L \sim \text{NegBin}(\ell, p)$.
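The Pascal PMF can be sketched the same way. In this illustrative Python snippet the function name `pascal_pmf` and the argument name `ell` (standing for the required number of successes) are choices made here, not notation from the notes.

```python
from math import comb

def pascal_pmf(k: int, ell: int, p: float) -> float:
    """P_L(k) for L ~ Pasc(ell, p): the ell-th success occurs on trial number k."""
    if k < ell:
        return 0.0  # at least ell trials are needed for ell successes
    # Choose positions of the first ell-1 successes among the k-1 prior trials.
    return comb(k - 1, ell - 1) * (1 - p) ** (k - ell) * p**ell

# Example from the text: L ~ Pasc(3, 0.25), 3rd success on exactly the 8th trial.
print(pascal_pmf(8, 3, 0.25))

# With ell = 1 the Pascal PMF reduces to the geometric PMF q^(k-1) p.
print(pascal_pmf(7, 1, 0.3), 0.7**6 * 0.3)
```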

Uniform variable

A discrete random variable $X$ is uniform on a finite set $A \subset S$, written $X \sim \text{Unif}(A)$, when the probability is a fixed constant for outcomes in $A$ and zero for outcomes outside $A$.

Discrete uniform PMF:

$$P_X(k) = \begin{cases} \dfrac{1}{|A|} & \text{when } k \in A \\ 0 & \text{when } k \notin A \end{cases}$$

Continuous uniform PDF:

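$$f_X(x) = \begin{cases} \dfrac{1}{|A|} & \text{when } x \in A \\ 0 & \text{when } x \notin A \end{cases}$$

In the continuous case, $|A|$ denotes the size of the region $A$ (length, area, or volume), so that the density integrates to 1 over $A$.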

02 Illustration

Example - Roll die until

Roll a fair die repeatedly. Find the probabilities that:

(a) At most 2 threes occur in the first 5 rolls.

(b) There is no three in the first 4 rolls, using a geometric variable.

Solution

(a)

(1) Label variables and events:

Use a variable $S \sim \text{Bin}(5, 1/6)$ to count the number of threes among the first five rolls.

Seek $P[S \le 2]$ as the answer.


(2) Calculations:

Divide into exclusive events:

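$$\begin{aligned} P[S \le 2] &= P_S(0) + P_S(1) + P_S(2) \\ &= \binom{5}{0}\left(\frac16\right)^0\left(\frac56\right)^5 + \binom{5}{1}\left(\frac16\right)^1\left(\frac56\right)^4 + \binom{5}{2}\left(\frac16\right)^2\left(\frac56\right)^3 \\ &= \frac{625}{648} \approx 0.965 \end{aligned}$$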
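The sum in part (a) can be reproduced with exact rational arithmetic. This is a small Python sketch; the variable names are chosen here for illustration.

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 6)  # probability of rolling a three
n = 5               # number of rolls considered

# P[S <= 2] = P_S(0) + P_S(1) + P_S(2) for S ~ Bin(5, 1/6)
prob_at_most_two = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(3))

print(prob_at_most_two, float(prob_at_most_two))
```

Using `Fraction` keeps the result exact, so it can be compared with the closed-form fraction rather than a rounded decimal.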

(b)

(1) Label variables and events:

Use a variable $N \sim \text{Geom}(1/6)$ to give the roll number of the first time a three is rolled.

Seek $P[N > 4]$ as the answer.


(2) Compute:

Sum the PMF formula for Geom(1/6):

$$P[N > 4] = \sum_{k=5}^{\infty} \left(\frac56\right)^{k-1}\left(\frac16\right)$$

(3) Recall geometric series formula:

For any geometric series with $|r| < 1$:

$$a + ar + ar^2 + ar^3 + \cdots = \frac{a}{1 - r}$$

Therefore:

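$$P[N > 4] = \sum_{k=5}^{\infty} \left(\frac56\right)^{k-1}\left(\frac16\right) = \frac{(5/6)^4 (1/6)}{1 - 5/6} = \left(\frac56\right)^4 \approx 0.482$$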
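The closed form for part (b) can be compared against a truncated version of the infinite sum. This is a minimal Python sketch; the cutoff `K = 60` is an arbitrary choice made here, and the leftover tail beyond the cutoff is exactly $q^K$.

```python
from fractions import Fraction

p = Fraction(1, 6)  # probability of rolling a three
q = 1 - p           # probability of not rolling a three

# Closed form: no three in the first 4 rolls.
closed_form = q**4

# Truncated sum of the geometric PMF from k = 5 up to k = K.
K = 60
partial_sum = sum(q ** (k - 1) * p for k in range(5, K + 1))

print(float(closed_form))
print(float(partial_sum))
print(float(closed_form - partial_sum))  # the neglected tail, equal to q**K
```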

Example - Cubs winning the World Series

Suppose the Cubs are playing the Yankees for the World Series. The first team to 4 wins in 7 games wins the series. What is the probability that the Cubs win the series?

Assume that for any given game the probability of the Cubs winning is p=45% and losing is q=55%.

Solution

Method (a): We solve the problem using a binomial distribution.

(1) Label variables and events:

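Use a variable $X \sim \text{Bin}(7, p)$. This $X$ counts the number of wins over 7 games. Thus, for example, $P_X(4)$ is the probability that the Cubs win exactly 4 games over 7 played. (Imagine that all 7 games are played even after the series is decided; the Cubs win the series exactly when they win at least 4 of the 7.)

Seek $P_X(4) + P_X(5) + P_X(6) + P_X(7)$ as the answer.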


(2) Calculate using binomial PMF:

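$$P_X(k) = \binom{7}{k} p^k q^{7-k}$$

Insert data:

$$P_X(4) + \cdots + P_X(7) = \binom{7}{4} p^4 q^3 + \binom{7}{5} p^5 q^2 + \binom{7}{6} p^6 q^1 + \binom{7}{7} p^7 q^0$$

Compute:

$$\frac{7 \cdot 6 \cdot 5}{3 \cdot 2} p^4 q^3 + \frac{7 \cdot 6}{2} p^5 q^2 + \frac{7}{1} p^6 q^1 + 1 \cdot p^7 q^0 = p^4\left(35 q^3 + 21 p q^2 + 7 p^2 q + p^3\right)$$

Convert $q \to (1 - p)$:

$$p^4\left(35(1-p)^3 + 21p(1-p)^2 + 7p^2(1-p) + p^3\right) = 35p^4 - 84p^5 + 70p^6 - 20p^7 \approx 0.39$$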

Method (b): We solve the problem using a Pascal distribution instead.

(1) Label variables and events:

Use a variable $Y \sim \text{Pasc}(4, p)$. This $Y$ measures the discrete wait time until the 4th win. Thus, for example, $P_Y(k)$ is the probability that the Cubs win their 4th game on game number $k$.

Seek $P_Y(4) + P_Y(5) + P_Y(6) + P_Y(7)$ as the answer.


(2) Calculate using Pascal PMF:

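$$P_Y(k) = \binom{k-1}{3} p^4 q^{k-4}$$

Insert data:

$$P_Y(4) + \cdots + P_Y(7) = \binom{3}{3} p^4 q^0 + \binom{4}{3} p^4 q^1 + \binom{5}{3} p^4 q^2 + \binom{6}{3} p^4 q^3$$

Compute:

$$1 \cdot p^4 + \frac{4}{1} p^4 q^1 + \frac{5 \cdot 4}{2} p^4 q^2 + \frac{6 \cdot 5 \cdot 4}{3 \cdot 2} p^4 q^3 = p^4\left(1 + 4q + 10q^2 + 20q^3\right)$$

Convert $q \to (1 - p)$:

$$p^4\left(1 + 4(1-p) + 10(1-p)^2 + 20(1-p)^3\right) = 35p^4 - 84p^5 + 70p^6 - 20p^7 \approx 0.39$$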

Notice: The calculation looks very different from method (a) right up to the end, where both methods arrive at the same polynomial in $p$!
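Both methods can be checked numerically. The sketch below, in Python, evaluates the binomial sum, the Pascal sum, and the final polynomial for $p = 0.45$; the variable names are illustrative.

```python
from math import comb

p = 0.45   # probability the Cubs win any given game
q = 1 - p  # probability the Cubs lose any given game

# Method (a): X ~ Bin(7, p), at least 4 wins out of 7 games.
binomial_answer = sum(comb(7, k) * p**k * q ** (7 - k) for k in range(4, 8))

# Method (b): Y ~ Pasc(4, p), the 4th win arrives on game 4, 5, 6, or 7.
pascal_answer = sum(comb(k - 1, 3) * p**4 * q ** (k - 4) for k in range(4, 8))

# The polynomial that both methods reduce to.
polynomial = 35 * p**4 - 84 * p**5 + 70 * p**6 - 20 * p**7

print(binomial_answer, pascal_answer, polynomial)
```

All three printed values should agree up to floating-point rounding.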


Expectation and variance

03 Theory - Expectation and variance

Theory 1

Expected value

The expected value E[X] of random variable X is the weighted average of the values of X, weighted by the probability of those values.

Discrete formula using PMF:

$$E[X] = \sum_{k} k \, P_X(k)$$

Continuous formula using PDF:

$$E[X] = \int_{-\infty}^{+\infty} x \, f_X(x) \, dx$$

Notes:

  • Expected value is sometimes called expectation, or even just mean, although the latter is best reserved for statistics.
  • The Greek letter μ is also used in contexts where ‘mean’ is used.

Let X be a random variable, and write E[X]=μ.

Variance

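The variance $\text{Var}[X]$ measures the average squared deviation of $X$ from $\mu$. It quantifies how spread out $X$ is around $\mu$: the smaller the variance, the more concentrated $X$ is near $\mu$.

  • Defining formula:
$$\text{Var}[X] = E\left[(X - \mu)^2\right]$$
  • Shorter formula:
$$\text{Var}[X] = E[X^2] - E[X]^2$$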

Calculating variance

  • Discrete formula using PMF:
$$\text{Var}[X] = \sum_{k} (k - \mu)^2 \, P_X(k)$$
  • Continuous formula using PDF:
$$\text{Var}[X] = \int_{-\infty}^{+\infty} (x - \mu)^2 \, f_X(x) \, dx$$

Standard deviation

The quantity $\sigma_X = \sqrt{\text{Var}[X]}$ is called the standard deviation of $X$.
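The discrete formulas for expected value, variance, and standard deviation can be sketched in Python as follows. The PMF is represented as a dictionary from values to probabilities, which is a representation chosen here for illustration, and a fair six-sided die serves as the example.

```python
from fractions import Fraction
from math import sqrt

def expectation(pmf):
    """E[X] = sum over k of k * P_X(k)."""
    return sum(k * pk for k, pk in pmf.items())

def variance_defining(pmf):
    """Var[X] = E[(X - mu)^2], the defining formula."""
    mu = expectation(pmf)
    return sum((k - mu) ** 2 * pk for k, pk in pmf.items())

def variance_short(pmf):
    """Var[X] = E[X^2] - E[X]^2, the shorter formula."""
    second_moment = sum(k**2 * pk for k, pk in pmf.items())
    return second_moment - expectation(pmf) ** 2

# Example: a fair six-sided die, each face with probability 1/6.
die = {k: Fraction(1, 6) for k in range(1, 7)}

print(expectation(die))
print(variance_defining(die), variance_short(die))
print(sqrt(variance_short(die)))  # standard deviation as a float
```

The two variance functions should return the same value, which illustrates that the shorter formula is an algebraic rearrangement of the defining one.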


04 Illustration

Example - Tokens in bins

Gambling game - tokens in bins

Consider a game like this: a coin is flipped; if H then draw a token from Bin 1, if T then from Bin 2.

  • Bin 1 contents: 1 token worth $1,000 and 9 tokens worth $1 each
  • Bin 2 contents: 5 tokens worth $50 each and 5 tokens worth $1 each

It costs $50 to enter the game. Should you play it? (A lot of times?) How much would you pay to play?

Solution

(1) Setup:

Let X be a random variable measuring your winnings in the game.

The possible values of X are 1, 50, and 1000.


(2) Find the PMF of X:

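For $k = 1$ have $P_X(1) = \frac12 \cdot \frac{9}{10} + \frac12 \cdot \frac{5}{10} = \frac{7}{10}$

For $k = 50$ have $P_X(50) = \frac12 \cdot \frac{5}{10} = \frac14$

For $k = 1000$ have $P_X(1000) = \frac12 \cdot \frac{1}{10} = \frac{1}{20}$

These add to 1, and $P_X(k) = 0$ for all other $k$.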


(3) Find E[X] using the discrete formula:

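$$E[X] = \sum_{k} k \, P_X(k) = 1 \cdot P_X(1) + 50 \cdot P_X(50) + 1000 \cdot P_X(1000) = 1 \cdot \frac{7}{10} + 50 \cdot \frac14 + 1000 \cdot \frac{1}{20} = 63.2$$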

Since 63.2>50, if you play it a lot at $50 you will generally make money.
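The PMF and expected value for this game can be rebuilt from the bin contents. This is a minimal Python sketch; the dictionary layout and variable names are choices made here for illustration.

```python
from fractions import Fraction

coin = Fraction(1, 2)  # each bin is selected with probability 1/2

# Token value -> probability of drawing it, given the bin.
bins = {
    "bin1": {1000: Fraction(1, 10), 1: Fraction(9, 10)},
    "bin2": {50: Fraction(5, 10), 1: Fraction(5, 10)},
}

# Build the PMF of the winnings X by adding up both branches of the coin flip.
pmf = {}
for contents in bins.values():
    for value, prob in contents.items():
        pmf[value] = pmf.get(value, 0) + coin * prob

expected = sum(value * prob for value, prob in pmf.items())
net_gain = expected - 50  # average gain per game after the entry cost

print(pmf)
print(float(expected), float(net_gain))
```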


Challenge Q: If you start with $200 and keep playing to infinity, how likely is it that you go broke?


Example - Expected value: rolling dice

Let X be a random variable counting the number of dots given by rolling a single die.

Then:

$$E[X] = 1 \cdot \frac16 + 2 \cdot \frac16 + \cdots + 6 \cdot \frac16 = \frac72$$

Let S be an RV that counts the dots on a roll of two dice.

The PMF of S:

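| $k$ | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| $P_S(k)$ | 1/36 | 2/36 | 3/36 | 4/36 | 5/36 | 6/36 | 5/36 | 4/36 | 3/36 | 2/36 | 1/36 |

Then: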

$$E[S] = 2 \cdot \frac{1}{36} + 3 \cdot \frac{2}{36} + 4 \cdot \frac{3}{36} + \cdots + 12 \cdot \frac{1}{36} = 7$$

Notice that $\frac72 + \frac72 = 7$.

In general, E[X+Y]=E[X]+E[Y].

Let X be a green die and Y a red die.

From the earlier calculation, $E[X] = \frac72$ and $E[Y] = \frac72$.

Since S=X+Y, we derive E[S]=7 by simple addition!
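The additivity of expected value for two dice can be confirmed by brute-force enumeration of all 36 equally likely outcomes. This is a small Python sketch with illustrative variable names.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Count how many of the 36 ordered pairs (x, y) give each sum s = x + y.
counts = Counter(x + y for x, y in product(faces, repeat=2))
pmf_S = {s: Fraction(c, 36) for s, c in counts.items()}

E_X = sum(Fraction(k, 6) for k in faces)         # one die
E_S = sum(s * prob for s, prob in pmf_S.items())  # sum of two dice, via its PMF

print(E_X, E_S, E_X + E_X)
```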


Example - Expected value by finding new PMF

Let X have distribution given by this PMF:

| $k$ | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| $P_X(k)$ | 1/7 | 1/14 | 3/14 | 2/7 | 2/7 |

Find $E[|X - 2|]$.

Solution

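(1) Compute the PMF of $|X - 2|$:

PMF arranged by possible value:

$$\begin{aligned} P[|X-2| = 0] &= P[X = 2] = \frac{1}{14} \\ P[|X-2| = 1] &= P[X = 1] + P[X = 3] = \frac17 + \frac{3}{14} = \frac{5}{14} \\ P[|X-2| = 2] &= P[X = 4] = \frac27 \\ P[|X-2| = 3] &= P[X = 5] = \frac27 \\ P[|X-2| = k] &= 0 \quad \text{for } k \ne 0, 1, 2, 3. \end{aligned}$$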

(2) Calculate the expectation:

Using formula for discrete PMF:

$$E[|X - 2|] = 0 \cdot \frac{1}{14} + 1 \cdot \frac{5}{14} + 2 \cdot \frac27 + 3 \cdot \frac27 = \frac{25}{14}$$
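The two steps of this solution (build the PMF of the new variable, then take its expectation) can be sketched in Python. The PMF values for $X$ below are the ones read off from step (1) of the solution; the variable names are illustrative.

```python
from fractions import Fraction

# PMF of X, using the probabilities that appear in step (1) above.
pmf_X = {
    1: Fraction(1, 7),
    2: Fraction(1, 14),
    3: Fraction(3, 14),
    4: Fraction(2, 7),
    5: Fraction(2, 7),
}

# Step (1): PMF of W = |X - 2|, merging values of X that map to the same W.
pmf_W = {}
for k, prob in pmf_X.items():
    w = abs(k - 2)
    pmf_W[w] = pmf_W.get(w, 0) + prob

# Step (2): expectation of W from its own PMF.
E_W = sum(w * prob for w, prob in pmf_W.items())

print(pmf_W)
print(E_W)
```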

Exercise - Variance using simplified formula

Variance for composite using PMF and simpler formula

Suppose X has this PMF:

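| $k$ | 1 | 2 | 3 |
|---|---|---|---|
| $P_X(k)$ | 1/7 | 2/7 | 4/7 |

Find $\text{Var}\left[\frac{1}{1+X}\right]$ using the formula $\text{Var}[Y] = E[Y^2] - E[Y]^2$ with $Y = \frac{1}{1+X}$.

(Hint: you should find $E[Y] = \frac{13}{42}$ and $E[Y^2] = \frac{13}{126}$ along the way.)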
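To check your work on this exercise, the hint values can be verified with a short Python sketch. It reads $Y$ as $1/(1+X)$, which is the reading consistent with the hint values; the helper name `g` and the variable names are chosen here for illustration.

```python
from fractions import Fraction

# PMF of X from the exercise.
pmf_X = {1: Fraction(1, 7), 2: Fraction(2, 7), 3: Fraction(4, 7)}

def g(x):
    """The composite Y = g(X) = 1 / (1 + X), kept as an exact fraction."""
    return Fraction(1, 1 + x)

# Moments of Y computed directly from the PMF of X.
E_Y = sum(g(k) * prob for k, prob in pmf_X.items())
E_Y2 = sum(g(k) ** 2 * prob for k, prob in pmf_X.items())

# Confirm the hint values before applying the shorter variance formula.
assert E_Y == Fraction(13, 42)
assert E_Y2 == Fraction(13, 126)

variance_Y = E_Y2 - E_Y**2
print(variance_Y)
```

Run it only after attempting the exercise by hand, since the last line prints the answer.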
