Repeated trials

01 Theory

Theory 1

Repeated trials

When a single experiment type is repeated many times, and we assume each instance is independent of the others, we say it is a sequence of repeated trials or independent trials.

The probability of any sequence of outcomes is derived using independence together with the probabilities of outcomes of each trial.


A simple type of trial, called a Bernoulli trial, has two possible outcomes, 1 and 0, or success and failure, or T and F. A sequence of repeated Bernoulli trials is called a Bernoulli process.

  • Write sequences like TFFTTF for the outcomes of repeated trials of this type.
  • Independence implies
$$P[TFFTTF] = P[T]\,P[F]\,P[F]\,P[T]\,P[T]\,P[F]$$
  • Write $p=P[T]$ and $q=P[F]$, and because these are all outcomes (exclusive and exhaustive), we have $q = 1 - p$. Then:
$$P[TFFTTF] = pqqppq = p^3 q^3$$
  • This gives a formula for the probability of any sequence of these trials.

A more complex trial may have three outcomes, A, B, and C.

  • Write sequences like ABBACABCA for the outcomes.
  • Label p=P[A] and q=P[B] and r=P[C]. We must have p+q+r=1.
  • Independence implies
$$P[ABBACABCA] = pqqprpqrp = p^4 q^3 r^2$$
  • This gives a formula for the probability of any sequence of these trials.
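
The product rule for sequences can be sketched in a few lines of Python. The numeric probabilities below are hypothetical values chosen only for illustration (the notes leave $p$, $q$, $r$ symbolic), and `sequence_probability` is a helper name introduced here, not something defined in the notes.

```python
from math import prod

def sequence_probability(sequence, outcome_probs):
    """Probability of one specific sequence of independent trials."""
    # The per-trial outcomes are exclusive and exhaustive, so they sum to 1.
    assert abs(sum(outcome_probs.values()) - 1.0) < 1e-12
    # Independence: multiply the probability of each observed outcome.
    return prod(outcome_probs[outcome] for outcome in sequence)

# Bernoulli trials with an assumed p = P[T] = 0.3, so q = 1 - p = 0.7.
bernoulli = {"T": 0.3, "F": 0.7}
p_tfttf = sequence_probability("TFFTTF", bernoulli)    # equals p^3 q^3

# Three-outcome trials with assumed (p, q, r) = (0.5, 0.3, 0.2).
three_way = {"A": 0.5, "B": 0.3, "C": 0.2}
p_abc = sequence_probability("ABBACABCA", three_way)   # equals p^4 q^3 r^2

print(p_tfttf, p_abc)
```

Only the count of each outcome matters for the result, which is why every reordering of the same letters has the same probability.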

Let $S$ stand for the number of successes in some Bernoulli process. So, for example, “$S=3$” stands for the event that the number of successes is exactly 3. The probabilities of such events follow a binomial distribution.

Suppose a coin is biased with P[H]=20%, and H is ‘success’. Flip the coin 20 times. Then:

$$P[S=3] = \binom{20}{3}(0.2)^3(0.8)^{17}$$

Each outcome with exactly 3 heads and 17 tails has probability $(0.2)^3(0.8)^{17}$. The number of such outcomes is the number of ways to choose 3 of the flips to be heads out of the 20 total flips.

The probability of at least 18 heads would then be:

$$\begin{aligned}
P[S \ge 18] &= P[S=18] + P[S=19] + P[S=20] \\
&= \binom{20}{18}(0.2)^{18}(0.8)^2 + \binom{20}{19}(0.2)^{19}(0.8)^1 + \binom{20}{20}(0.2)^{20}(0.8)^0
\end{aligned}$$
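
As a numerical check, the two binomial probabilities for the biased coin can be evaluated directly. This is a minimal sketch using only the standard library; `binomial_pmf` is a name introduced here for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P[S = k] for n independent Bernoulli trials with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 20, 0.2   # 20 flips of the biased coin, P[H] = 0.2

p_exactly_3 = binomial_pmf(3, n, p)
# "At least 18" is the union of the exclusive events S = 18, 19, 20.
p_at_least_18 = sum(binomial_pmf(k, n, p) for k in range(18, n + 1))

print(f"P[S = 3]   = {p_exactly_3:.4f}")
print(f"P[S >= 18] = {p_at_least_18:.3e}")
```

Summing `binomial_pmf(k, n, p)` over all `k` from 0 to `n` should give 1, which is a quick sanity check on the formula.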

With three possible outcomes, A, B, and C, we can write sum variables like $S_A$, which counts the number of A outcomes, and $S_B$ and $S_C$ similarly. The probabilities of events like $(S_A, S_B, S_C) = (2,3,5)$ follow a multinomial distribution.


02 Illustration

Example - Multinomial: Soft drinks preferred

Multinomial: Soft drinks preferred

Folks coming to a party prefer Coke (55%), Pepsi (25%), or Dew (20%). If 20 people order drinks in sequence, what is the probability that exactly 12 have Coke and 5 have Pepsi and 3 have Dew?

Solution

The multinomial coefficient $\binom{20}{12,5,3}$ gives the number of ways to assign 20 people into bins according to preferences matching the given numbers, $C=12$ and $P=5$ and $D=3$.

Each such assignment is one sequence of outcomes. All such sequences have probability $(0.55)^{12}(0.25)^5(0.2)^3$.

The answer is therefore:

$$\binom{20}{12,5,3}(0.55)^{12}(0.25)^5(0.2)^3 = \frac{20!}{12!\,5!\,3!}(0.55)^{12}(0.25)^5(0.2)^3$$
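
The final expression can be evaluated numerically. The sketch below uses only the standard library; the helper names `multinomial_coefficient` and `multinomial_pmf` are introduced here for illustration.

```python
from math import factorial, prod

def multinomial_coefficient(counts):
    """Number of distinct sequences with the given count of each outcome."""
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)   # each division is exact
    return result

def multinomial_pmf(counts, probs):
    """Probability of seeing exactly these counts in sum(counts) trials."""
    per_sequence = prod(p**c for p, c in zip(probs, counts))
    return multinomial_coefficient(counts) * per_sequence

counts = [12, 5, 3]           # Coke, Pepsi, Dew
probs = [0.55, 0.25, 0.20]    # preferences from the problem statement

coefficient = multinomial_coefficient(counts)
answer = multinomial_pmf(counts, probs)
print(coefficient, round(answer, 4))
```

The coefficient counts the sequences and the product gives the probability of any one of them, mirroring the two steps of the solution.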

Reliability

03 Theory

Theory 1

Consider some process schematically with components in series and components in parallel:

[Figure: schematic of a process with components in series and components in parallel]

  • Each component has a probability of success or failure.
  • Event $W_i$ indicates ‘success’ of component $i$ (the event and the component share the same name).
  • Then $P[W_i]$ is the probability that component $i$ succeeds.

Success for a series of components requires success of each member.

  • Series components rely on each other.
  • Success of the whole is success of part 1 AND success of part 2 AND part 3, etc.

Failure for parallel components requires failure of each member.

  • Parallel components represent redundancy.
  • Success of the whole is success of part 1 OR success of part 2 OR part 3, etc.

For series components, stack successes:

$$P[W] = P[W_1 \cap W_2 \cap W_3] = P[W_1]\,P[W_2]\,P[W_3]$$

For parallel components, stack failures:

$$P[W^c] = P[W_1^c \cap W_2^c \cap W_3^c] = (1 - P[W_1])(1 - P[W_2])(1 - P[W_3])$$

E.g. if $P[W_i]=p$ for all components $i$, then:

  • Series components: $P[W] = p^3$
  • Parallel components: $P[W] = 1 - (1-p)^3$

To analyze a complex diagram of series and parallel components, bundle each:

  • pure series set as a single compound component with its own success probability (the product)
  • pure parallel set as a single compound component with its own success probability (using the failure formula)

This is like the analysis of resistors and inductors.
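
One way to build confidence in the series and parallel formulas is to compare them with a simulation. The sketch below assumes three independent components with a common success probability $p = 0.9$ (a hypothetical value), and the function name `simulate` is introduced here for illustration.

```python
import random

def simulate(p, n_components=3, trials=100_000, seed=0):
    """Estimate series and parallel success rates by simulation."""
    rng = random.Random(seed)
    series_ok = parallel_ok = 0
    for _ in range(trials):
        # Each component succeeds independently with probability p.
        works = [rng.random() < p for _ in range(n_components)]
        series_ok += all(works)     # series: every component must work
        parallel_ok += any(works)   # parallel: at least one must work
    return series_ok / trials, parallel_ok / trials

p = 0.9   # assumed common success probability
series_est, parallel_est = simulate(p)
series_exact = p**3
parallel_exact = 1 - (1 - p)**3

print(series_est, series_exact)
print(parallel_est, parallel_exact)
```

The estimates should land close to the exact values, with the gap shrinking as `trials` grows.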


04 Illustration

Example - Series, parallel, series

Reliability: Series, parallel, series

Suppose a process has internal components arranged like this:

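[Figure: component 1 in series with a parallel block of three branches: components 2 and 3 in series, component 4 alone, and component 5 alone]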

Write $W_i$ for the event that component $i$ succeeds, and $W_i^c$ for the event that it fails.

The success probabilities for each component are given in the chart:

| Component | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Success probability | 92% | 89% | 95% | 86% | 91% |

Find the probability that the entire system succeeds.

Solution

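For intersections, use $P[A \cap B] = P[A]\,P[B]$ (independence), and for unions, use $P[A \cup B] = 1 - P[A^c \cap B^c]$.

So $P[(W_2 \cap W_3)^c] = 1 - (.89)(.95)$ and:

$$\begin{aligned}
\text{success} &= W_1 \cap \big((W_2 \cap W_3) \cup W_4 \cup W_5\big) \\
P[\text{success}] &= (.92)\big(1 - (1 - (.89)(.95))(.14)(.09)\big) \approx 0.918
\end{aligned}$$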
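
The bundling procedure in this solution can be sketched with two small helpers, one per bundle type. The function names `series` and `parallel` are introduced here for illustration, and independence of the components is assumed, as in the solution.

```python
from math import prod

def series(*probs):
    """A series block succeeds only if every member succeeds."""
    return prod(probs)

def parallel(*probs):
    """A parallel block fails only if every member fails."""
    return 1 - prod(1 - p for p in probs)

# Success probabilities of components 1 to 5 from the chart.
w1, w2, w3, w4, w5 = 0.92, 0.89, 0.95, 0.86, 0.91

branch_23 = series(w2, w3)                # components 2 and 3 in series
redundant = parallel(branch_23, w4, w5)   # three parallel branches
system = series(w1, redundant)            # component 1 in series with the block

print(round(system, 3))
```

Working from the innermost bundle outward keeps each step a plain product, which is the same order used in the hand calculation.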

Random variables

05 Theory

Theory 1

Random variable

A random variable (RV) $X$ on a probability space $(S, \mathcal{F}, P)$ is a function $X : S \to \mathbb{R}$.

So X assigns to each outcome a number.

Note: The word ‘variable’ indicates that an RV outputs numbers.

Random variables can be formed from other random variables using mathematical operations on the output numbers.

Given random variables X and Y, we can form these new ones:

$$\tfrac{1}{2}(X+Y), \quad XY, \quad \cos X, \quad X^2, \quad \text{etc.}$$

Suppose $s \in S$ is some particular outcome. Then, for example, $(X+Y)(s)$ is by definition $X(s)+Y(s)$.


Random variables determine events.

  • Given $a \in \mathbb{R}$, the event “$X=a$” is equal to the set $X^{-1}(a)$
  • That is: the set of outcomes mapped to $a$ by $X$
  • That is: the event “$X$ took the value $a$”

Such events have probabilities. We write them like this:

$$P[X=a] = P[X^{-1}(a)]$$

This generalizes to events where $X$ lies in some range or set, for example:

$$P[a \le X < b], \qquad P[X \in \{2,4,5,6,9\}]$$

The axioms of probability translate into rules for these events.

For example, additivity leads to:

$$P[X<0] + P[X=0] + P[0<X\le 3] + P[3<X] = 1$$
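
To make the "function on outcomes" viewpoint concrete, here is a minimal sketch on a small, hypothetical sample space: two flips of a fair coin. The random variables `X`, `Y`, `Z` and the helper `P` are illustrative choices, not taken from the notes.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: two flips of a fair coin, all outcomes equally likely.
sample_space = list(product("HT", repeat=2))
prob = {s: Fraction(1, len(sample_space)) for s in sample_space}

def X(s):
    """Number of heads in outcome s."""
    return s.count("H")

def Y(s):
    """1 if the first flip is heads, else 0."""
    return 1 if s[0] == "H" else 0

def Z(s):
    """New RV built pointwise: (X + Y)(s) = X(s) + Y(s)."""
    return X(s) + Y(s)

def P(event):
    """Probability of the event {s : event(s) is true}."""
    return sum((prob[s] for s in sample_space if event(s)), Fraction(0))

p_x_eq_1 = P(lambda s: X(s) == 1)   # preimage of 1 under X is {HT, TH}
# Additivity over a partition of the real line by the value of X.
total = P(lambda s: X(s) < 1) + P(lambda s: X(s) == 1) + P(lambda s: X(s) > 1)

print(p_x_eq_1, total)
```

Each event is specified by a condition on the value of the random variable, and its probability is the total probability of the outcomes in the preimage.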

A discrete random variable has probability concentrated at a discrete set of real numbers.

  • A ‘discrete set’ means finite or countably infinite.
  • The distribution of probability is recorded using a probability mass function (PMF) that assigns probabilities to each of the discrete real numbers.
  • Numbers with nonzero probability are called possible values.

PMF

The PMF $P_X(k)$, for $X$ a discrete RV, is defined by:

$$P_X(k) = P[X=k]$$

A continuous random variable has probability spread out over the space of real numbers.

  • The distribution of probability is recorded using a probability density function (PDF) which is integrated over intervals to determine probabilities.

PDF

The PDF for $Y$ (a continuous RV) is written $f_Y(x)$ for $x \in \mathbb{R}$, and probabilities are calculated like this:

$$P[a \le Y \le b] = \int_a^b f_Y(x)\,dx$$
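
The integral can be approximated numerically. The sketch below assumes a hypothetical density $f_Y(x) = e^{-x}$ for $x \ge 0$ (not one used in the notes) and applies a midpoint Riemann sum; `prob_between` is a name introduced here.

```python
from math import exp

def f_Y(x):
    """Assumed example density: f_Y(x) = e^{-x} for x >= 0, else 0."""
    return exp(-x) if x >= 0 else 0.0

def prob_between(pdf, a, b, steps=10_000):
    """Approximate P[a <= Y <= b] with a midpoint Riemann sum."""
    width = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width

approx = prob_between(f_Y, 1.0, 2.0)
exact = exp(-1) - exp(-2)   # closed-form integral of e^{-x} over [1, 2]

print(approx, exact)
```

For this density the integral has a closed form, so the approximation can be checked against it directly.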



For any RV, whether discrete or continuous, the distribution of probability is encoded by a function:

CDF

The cumulative distribution function (CDF) for a random variable $X$ is defined for all $x \in \mathbb{R}$ by: $F_X(x) = P[X \le x]$

Notes:

  • Sometimes the relation to X is omitted and one sees just “F(x).”
  • Sometimes the CDF is called, simply, “the distribution function” because:

The CDF works for a discrete, continuous, or mixed RV

  • PMF is for discrete only
  • PDF is for continuous only
  • CDF covers both and covers mixed RVs

The CDF of a discrete RV is always a stepwise increasing function. At each step up, the jump size matches the PMF value there.

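From this graph of $F_X(x)$:

[Figure: stepwise graph of $F_X(x)$ with jumps at $x = 0, 1, 2, 3$]

we can infer the PMF values based on the jump sizes:

| $P_X(-1)$ | $P_X(0)$ | $P_X(1)$ | $P_X(2)$ | $P_X(3)$ | $P_X(4)$ |
|---|---|---|---|---|---|
| 0 | 1/8 | 3/8 | 3/8 | 1/8 | 0 |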

For a discrete RV, the CDF and the PMF can be calculated from each other using formulas.

CDF from PMF

Given a PMF $P_X(x)$, the CDF is determined by:

$$F_X(x) = \sum_{k_i \le x} P_X(k_i)$$

where $\{k_1, k_2, \dots\}$ is the set of possible values of $X$.
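
Both directions of the relationship can be sketched in code, using the PMF values 1/8, 3/8, 3/8, 1/8 at $k = 0, 1, 2, 3$ read off the jump sizes earlier. The helper names `cdf` and `jump`, and the small offset `eps`, are illustrative choices.

```python
from fractions import Fraction

# PMF values at the possible values k = 0, 1, 2, 3.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def cdf(x):
    """F_X(x): add the PMF over all possible values k_i <= x."""
    return sum((p for k, p in pmf.items() if k <= x), Fraction(0))

def jump(k, eps=Fraction(1, 1000)):
    """Size of the CDF step at k, which recovers P_X(k)."""
    return cdf(k) - cdf(k - eps)

cdf_values = [cdf(x) for x in (-1, 0, 1.5, 2, 3, 4)]
jumps = [jump(k) for k in range(-1, 5)]

print(cdf_values)
print(jumps)
```

The offset `eps` only needs to be smaller than the spacing between possible values, so that `cdf(k - eps)` sits on the step just below `k`.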


06 Illustration

Example - PMF and CDF: Roll 2 dice

PMF and CDF: Roll 2 dice

Roll two dice colored red and green. Let $X_R$ record the number of dots showing on the red die, $X_G$ the number on the green die, and let $S$ be a random variable giving the total number of dots showing after the roll, namely $S = X_R + X_G$.

  • Find the PMFs of $X_R$ and of $X_G$ and of $S$.
  • Find the CDF of S.
  • Find P[S=8].

Solution

(1) Construct sample space:

Denote outcomes with ordered pairs of numbers (i,j), where i is the number showing on the red die and j is the number on the green one.

Therefore $i, j$ are integers satisfying $1 \le i, j \le 6$.


(2) Create chart of outcomes:

[Chart: 6 × 6 grid of the 36 outcomes $(i,j)$]


(3) Define random variables:

We have $X_R(i,j) = i$ and $X_G(i,j) = j$.

Therefore $S(i,j) = i + j$.


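(4) Find PMF of $X_R$:

Use variable $k$ for each possible value of $X_R$, so $k = 1, 2, \dots, 6$. Find $P_{X_R}(k)$:

$$P_{X_R}(k) = P[X_R = k] = \frac{|\text{outcomes with } k \text{ on red}|}{|\text{all outcomes}|} = \frac{6}{36} = \frac{1}{6}$$

Therefore $P_{X_R}(k) = 1/6$ for every $k$.


(5) Find PMF of $X_G$ similarly:

$$P_{X_G}(k) = \frac{1}{6} \quad \text{for all } k$$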

(6) Find PMF of S:

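$$P_S(k) = P[S=k] = \frac{|\text{outcomes with sum } k|}{|\text{all outcomes}|}$$

Count outcomes along diagonal lines in the chart.

| $k$ | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| $P_S(k)$ | 1/36 | 2/36 | 3/36 | 4/36 | 5/36 | 6/36 | 5/36 | 4/36 | 3/36 | 2/36 | 1/36 |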


Evaluate: P[S=8]=5/36.


(7) Find CDF of S:

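CDF definition: $F_S(x) = P[S \le x]$

Apply definition: add new PMF value at each increment:

$$F_S(x) = \begin{cases}
0 & x < 2 \\
1/36 & 2 \le x < 3 \\
3/36 & 3 \le x < 4 \\
6/36 & 4 \le x < 5 \\
10/36 & 5 \le x < 6 \\
15/36 & 6 \le x < 7 \\
21/36 & 7 \le x < 8 \\
26/36 & 8 \le x < 9 \\
30/36 & 9 \le x < 10 \\
33/36 & 10 \le x < 11 \\
35/36 & 11 \le x < 12 \\
36/36 & 12 \le x
\end{cases}$$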
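
The counting in steps (1) through (7) can be reproduced by brute-force enumeration of the 36 outcomes. This is a minimal sketch; the variable and function names (`outcomes`, `pmf_S`, `cdf_S`) are introduced here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (i, j): red die i, green die j.
outcomes = list(product(range(1, 7), repeat=2))

# PMF of S = X_R + X_G: count outcomes for each sum, divide by 36.
counts = Counter(i + j for i, j in outcomes)
pmf_S = {k: Fraction(c, len(outcomes)) for k, c in sorted(counts.items())}

def cdf_S(x):
    """F_S(x) = P[S <= x], accumulated from the PMF."""
    return sum((p for k, p in pmf_S.items() if k <= x), Fraction(0))

p_s_eq_8 = pmf_S[8]     # P[S = 8]
f_at_7_5 = cdf_S(7.5)   # value of the CDF on the step 7 <= x < 8

print(p_s_eq_8, f_at_7_5)
```

`Fraction` reduces automatically, so a value such as 21/36 is displayed as 7/12.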

Example - Total heads count; binomial expansion of 1

PMF for total heads count; binomial expansion of 1

A fair coin is flipped n times.

Let X be the random variable that counts the total number of heads in each sequence.

The PMF of X is given by:

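$$P_X(k) = \binom{n}{k}\left(\frac{1}{2}\right)^n$$

Since the total probability must add to 1, we know this formula must hold:

$$1 = \sum_{\text{possible } k} P_X(k) \quad\Longrightarrow\quad 1 = \sum_{k=0}^{n} \binom{n}{k}\left(\frac{1}{2}\right)^n$$

Is this equation really true?

There is another way to view this equation: it is the binomial expansion of $(x+y)^n$ where $x=\frac{1}{2}$ and $y=\frac{1}{2}$:

$$\left(\frac{1}{2}+\frac{1}{2}\right)^n = \sum_{k=0}^{n} \binom{n}{k}\left(\frac{1}{2}\right)^k\left(\frac{1}{2}\right)^{n-k} = \sum_{k=0}^{n} \binom{n}{k}\left(\frac{1}{2}\right)^n$$

The left side is $1^n = 1$, so the equation does hold.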
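
The identity can also be spot-checked with exact rational arithmetic for a few values of $n$. The particular values of $n$ below are arbitrary choices, and `total_probability` is a name introduced here.

```python
from fractions import Fraction
from math import comb

def total_probability(n):
    """Sum of P_X(k) = C(n, k) (1/2)^n over k = 0, ..., n."""
    return sum(Fraction(comb(n, k), 2**n) for k in range(n + 1))

totals = {n: total_probability(n) for n in (1, 2, 5, 10, 20)}
print(totals)
```

A finite check like this is not a proof, but it agrees with the binomial expansion argument for every $n$ tried.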

Example - Life insurance payouts

Life insurance payouts

A life insurance company has two clients, A and B, each with a policy that pays \$100,000 upon death. Consider events $D_1$ that the older client dies next year, and $D_2$ that the younger dies next year. Suppose $P[D_1]=0.10$ and $P[D_2]=0.05$.

Define a random variable $X$ measuring the total money paid out next year in units of \$1,000. The possible values for $X$ are 0, 100, 200. Assuming the two deaths are independent, calculate:

$$\begin{aligned}
P[X=0] &= P[D_1^c]\,P[D_2^c] = 0.95 \cdot 0.90 \approx 0.86 \\
P[X=100] &= 0.05 \cdot 0.90 + 0.95 \cdot 0.10 = 0.14 \\
P[X=200] &= 0.05 \cdot 0.10 = 0.005
\end{aligned}$$
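
The same PMF can be built by enumerating the four joint outcomes. The sketch below assumes the two deaths are independent (the products in the calculation rely on this), and the variable names are illustrative.

```python
from itertools import product

p_d1, p_d2 = 0.10, 0.05   # P[D1], P[D2] from the problem statement
payout = 100              # each policy pays 100, in units of $1,000

pmf_X = {0: 0.0, 100: 0.0, 200: 0.0}
# Enumerate the four joint outcomes (older dies?, younger dies?).
for dies1, dies2 in product([True, False], repeat=2):
    # Independence assumption: the joint probability is the product.
    prob = (p_d1 if dies1 else 1 - p_d1) * (p_d2 if dies2 else 1 - p_d2)
    # Booleans add as 0/1, so this is the number of payouts times 100.
    pmf_X[payout * (dies1 + dies2)] += prob

print({x: round(p, 3) for x, p in pmf_X.items()})
```

The value $X = 100$ collects two exclusive outcomes (only the older client dies, or only the younger), which is why its probability is a sum of two products.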

Example - Probabilities via CDF

Probabilities via CDF

Suppose the CDF of $X$ is given by $F_X(x) = \dfrac{1}{1+e^{-x}}$. Compute:

(a) $P[X \le 1]$ (b) $P[X<1]$ (c) $P[-0.5 \le X \le 0.2]$ (d) $P[2 \le X]$
