Events and outcomes

01 Theory

Theory 1

Events and outcomes – informally

  • An event is a description of something that can happen.
  • An outcome is a complete description of something that can happen.

All outcomes are events. An event is usually a partial description. Outcomes are events given with a complete description.

Here ‘complete’ and ‘partial’ are within the context of the probability model.

It can be misleading to say that an ‘outcome’ is an ‘observation’.

  • ‘Observations’ occur in the real world, while ‘outcomes’ occur in the model.
  • To the extent the model is a good one, and the observation conveys complete information, we can say ‘outcome’ for the observation.

Notice: Because outcomes are complete, no two distinct outcomes could actually happen in a run of the experiment being modeled.

When an event happens, the fact that it has happened constitutes information.

Events and outcomes – mathematically

  • The sample space is the set of possible outcomes, so it is the set of the complete descriptions of everything that can happen.
  • An event is a subset of the sample space, so it is a collection of outcomes.

For mathematicians: some “wild” subsets are not valid events, owing to problems with infinity and the continuum (non-measurable sets).

Notation

  • Write S for the set of possible outcomes, and s∈S for a single outcome in S.

  • Write A,B,C⊆S or A₁,A₂,A₃⊆S for some events, subsets of S.

  • Write ℱ for the collection of all events. This is frequently a huge set!

  • Write |A| for the cardinality or size of a set A, i.e. the number of elements it contains.

Using this notation, we can consider an outcome itself as an event by considering the “singleton” subset {ω}⊆S which contains that outcome alone.


02 Illustration

Example - Coin flipping

Coin flipping

Flip a fair coin two times and record both results.

  • Outcomes: sequences, like HH or TH.

  • Sample space: all possible sequences, i.e. the set S={HH,HT,TH,TT}.

  • Events: for example:

    • A={HH,HT}=“first was heads”
    • B={HT,TH}=“exactly one heads”
    • C={HT,TH,HH}=“at least one heads”

With this setup, we may combine events in various ways to generate other events:

Complex events: for example: A∩B={HT}, or in words:

“first was heads” AND “exactly one heads” = “heads-then-tails”

Notice that the last one is a complete description, namely the outcome HT. Similarly A∪B={HH,HT,TH}, or in words:

“first was heads” OR “exactly one heads” = “starts with heads, else it’s tails-then-heads”
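These set operations can be checked directly with Python’s built-in `set` type (a minimal sketch; the names S, A, B follow the example above):

```python
from itertools import product

# Sample space for two coin flips: all sequences of H/T of length 2.
S = {"".join(seq) for seq in product("HT", repeat=2)}

A = {"HH", "HT"}   # "first was heads"
B = {"HT", "TH"}   # "exactly one heads"

print(sorted(S))   # ['HH', 'HT', 'TH', 'TT']
print(A & B)       # intersection (AND): {'HT'} -- the single outcome heads-then-tails
print(A | B)       # union (OR): {'HH', 'HT', 'TH'}
```

Note that the intersection A∩B contains exactly one outcome, matching the observation that “heads-then-tails” is a complete description.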

Practice exercise

Coin flipping: counting subsets

Flip a fair coin five times and record the results.

How many elements are in the sample space? (How big is S?) How many events are there? (How big is ℱ?)


03 Theory

Theory 2

New events from old

Given two events A and B, we can form new events using set operations:

A∪B = “event A OR event B”
A∩B = “event A AND event B”
Aᶜ = “NOT event A”

We also use these terms for events A and B:

  • They are mutually exclusive when A∩B=∅, that is, they have no elements in common.

  • They are collectively exhaustive when A∪B=S, that is, when they jointly cover all possible outcomes.

In probability texts, A∩B is sometimes written “A,B” or even (frequently!) just “AB”.

Rules for sets

Algebraic rules

  • Associativity: (A∪B)∪C=A∪(B∪C). Analogous to (A+B)+C=A+(B+C).

  • Distributivity: A∩(B∪C)=(A∩B)∪(A∩C). Analogous to A(B+C)=AB+AC.

De Morgan’s Laws

  • (A∪B)ᶜ=Aᶜ∩Bᶜ

  • (A∩B)ᶜ=Aᶜ∪Bᶜ

In other words: you can distribute “ᶜ” but must simultaneously do a switch ∪↔∩.
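De Morgan’s Laws can be verified exhaustively on a small sample space (a sketch reusing the two-flip events from the earlier example):

```python
from itertools import product

S = {"".join(seq) for seq in product("HT", repeat=2)}
A = {"HH", "HT"}   # "first was heads"
B = {"HT", "TH"}   # "exactly one heads"

def complement(E, space=S):
    # The complement E^c consists of all outcomes not in E.
    return space - E

# Distribute the "c", switching union <-> intersection:
assert complement(A | B) == complement(A) & complement(B)
assert complement(A & B) == complement(A) | complement(B)
print("De Morgan's Laws hold on S")
```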


Probability models

04 Theory

Theory 1

Axioms of probability

A probability measure is a function P:ℱ→ℝ satisfying:

Kolmogorov Axioms:

  • Axiom 1: P[A]≥0 for every event A (probabilities are not negative!)

  • Axiom 2: P[S]=1 (probability of “anything” happening is 1)

  • Axiom 3: additivity for any countable collection of mutually exclusive events:

P[A₁∪A₂∪A₃∪⋯]=P[A₁]+P[A₂]+P[A₃]+⋯ when Aᵢ∩Aⱼ=∅ for all i≠j

Notation: we write P[A] instead of P(A), even though P is a function, to emphasize the fact that A is a set.

Probability model

A probability model or probability space consists of a triple (S,ℱ,P):

  • S the sample space

  • ℱ the set of valid events, where every A∈ℱ satisfies A⊆S

  • P:ℱ→ℝ a probability measure satisfying the Kolmogorov Axioms

Finitely many exclusive events

It is a consequence of the Kolmogorov Axioms that additivity also works for finite collections of mutually exclusive events:

P[A∪B]=P[A]+P[B]
P[A₁∪⋯∪Aₙ]=P[A₁]+⋯+P[Aₙ]

Inferences from Kolmogorov

A probability measure satisfies these rules. They can be deduced from the Kolmogorov Axioms.

  • Negation: Can you find P[Aᶜ] but not P[A]? Use negation:
P[A]=1−P[Aᶜ]
  • Monotonicity: Probabilities grow when outcomes are added:
A⊆B ⟹ P[A]≤P[B]
  • Inclusion-Exclusion: A trick for resolving unions:
P[A∪B]=P[A]+P[B]−P[A∩B]

(even when A and B are not exclusive!)
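All three rules can be checked on a uniform measure over a small finite sample space (a sketch; the events are the two-flip events used earlier):

```python
from fractions import Fraction
from itertools import product

S = {"".join(seq) for seq in product("HT", repeat=2)}

def P(E):
    # Uniform probability measure on the finite sample space S.
    return Fraction(len(E), len(S))

A = {"HH", "HT"}   # "first was heads"
B = {"HT", "TH"}   # "exactly one heads"

# Negation: P[A] = 1 - P[A^c]
assert P(A) == 1 - P(S - A)
# Monotonicity: A ∩ B ⊆ B implies P[A ∩ B] <= P[B]
assert P(A & B) <= P(B)
# Inclusion-Exclusion (A and B are not exclusive here!):
assert P(A | B) == P(A) + P(B) - P(A & B)
print(P(A | B))  # 3/4
```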


05 Illustration

Example - iPhones and iPads

iPhones and iPads

At Mr. Jefferson’s University, 25% of students have an iPhone, 30% have an iPad, and 60% have neither.

What is the probability that a randomly chosen student has some iProduct? (Q1)

What about both? (Q2)

Solution

(1) Set up the probability model:

A student is chosen at random:

Outcomes are chosen students.

The sample space S is the set of all students. Events are subsets of S.

Write O=“has iPhone” and A=“has iPad” (regarding the chosen student).

All students are equally likely to be chosen. Therefore P[E]=|E|/|S| for any event E. Therefore P[O]=0.25 and P[A]=0.30.

Furthermore, P[Oᶜ∩Aᶜ]=0.60. This states that 60% have “not iPhone AND not iPad”.


(2) Define the desired event:

Q1: desired event = O∪A

Q2: desired event = O∩A


(3) Compute the probabilities:

We do not know that O and A are exclusive.

We could try inclusion-exclusion:

P[O∪A]=P[O]+P[A]−P[O∩A]

We know P[O]=0.25 and P[A]=0.30. So this formula, with given data, RELATES Q1 and Q2. It does not solve either one by itself.

We have not yet used the information that P[OcAc]=0.60.

To use this, simplify it with De Morgan’s Laws:

P[Oᶜ∩Aᶜ]=P[(O∪A)ᶜ]=1−P[O∪A]

Therefore:

P[Oᶜ∩Aᶜ]=0.60 ⟹ P[O∪A]=0.40

We have answered Q1. Recall that inclusion-exclusion relates Q1 and Q2 and solve to answer Q2:

P[O∪A]=P[O]+P[A]−P[O∩A]
0.40=0.25+0.30−P[O∩A]
P[O∩A]=0.15
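The arithmetic of this solution can be retraced with exact fractions (a sketch following the same steps: De Morgan plus negation for Q1, then inclusion-exclusion for Q2):

```python
from fractions import Fraction

P_O = Fraction(25, 100)        # P[O], has iPhone
P_A = Fraction(30, 100)        # P[A], has iPad
P_neither = Fraction(60, 100)  # P[O^c ∩ A^c], has neither

# Q1: De Morgan + negation give P[O ∪ A] = 1 - P[O^c ∩ A^c]
P_union = 1 - P_neither
# Q2: inclusion-exclusion, solved for the intersection
P_both = P_O + P_A - P_union

print(P_union)  # 2/5  (= 0.40)
print(P_both)   # 3/20 (= 0.15)
```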

Example - Lucia is Host or Player

Lucia is Host or Player

The professor chooses three students at random for a game in a class of 40, one to be Host, one to be Player, one to be Judge. What is the probability that Lucia is either Host or Player?

Solution

(1) Set up the probability model:

Label the students 1 to 40. Write L for Lucia’s number.

Outcomes: assignments such as (H,P,J)=(2,5,8). These are ordered triples with distinct entries in 1,2,…,40.

Sample space: S is the collection of all such distinct triples

Events: any subset of S

Probability measure: assume all outcomes are equally likely, so P[(i,j,k)]=P[(r,l,p)] for all i,j,k,r,l,p

In total there are 40·39·38 triples of distinct numbers.

Therefore P[(i,j,k)]=1/(40·39·38) for any specific outcome (i,j,k).

Therefore P[A]=|A|/(40·39·38) for any event A. (Recall |A| is the number of outcomes in A.)


(2) Define the desired event:

We want to find P[“Lucia is Host or Player”]. Define A=“Lucia is Host” and B=“Lucia is Player”. Thus:

A={(L,j,k)|any j,k},B={(i,L,k)|any i,k}

So, in this notation, we seek P[AB].


(3) Compute the desired probability:

Importantly, A∩B=∅ (mutually exclusive). There are no outcomes in S in which Lucia is both Host and Player.

By additivity, we infer P[A∪B]=P[A]+P[B].

Now compute P[A]. There are 39·38 ways to choose j and k from the students besides Lucia. Therefore |A|=39·38. Therefore:

P[A]=|A|/(40·39·38)=(39·38)/(40·39·38)=1/40

Now compute P[B]. It is similar: P[B]=1/40.

Finally compute that P[A]+P[B]=1/20, so the answer is:

P[A∪B]=P[A]+P[B]=1/20
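The answer can be double-checked by exhaustive enumeration of the ordered triples (a sketch; Lucia’s label is arbitrary, so 7 stands in for L):

```python
from fractions import Fraction
from itertools import permutations

students = range(1, 41)
L = 7  # Lucia's number -- any label works by symmetry

# All ordered (Host, Player, Judge) triples with distinct entries.
S = list(permutations(students, 3))

A = [t for t in S if t[0] == L]  # Lucia is Host
B = [t for t in S if t[1] == L]  # Lucia is Player

P = lambda E: Fraction(len(E), len(S))
assert len(S) == 40 * 39 * 38
assert P(A) == Fraction(1, 40) and P(B) == Fraction(1, 40)
print(P(A) + P(B))  # 1/20
```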

Conditional probability

06 Theory

Theory 1

Conditional probability

The conditional probability of “B given A” is defined by:

P[B|A]=P[B∩A]/P[A]

This conditional probability P[B|A] represents the probability of event B taking place given the assumption that A took place. (All within the given probability model.)

By letting the actuality of event A be taken as a fixed hypothesis, we can define a conditional probability measure by plugging events into the slot of B:

P[·|A]=P[·∩A]/P[A]

It is possible to verify each of the Kolmogorov axioms for this function, and therefore P[·|A] itself defines a bona fide probability measure.

Conditioning

What does it really mean?

Conceptually, P[B|A] corresponds to creating a new experiment in which we run the old experiment and record data only in those runs where A happened. Or, it corresponds to finding ourselves with knowledge or data that A happened, and seeking our best estimates of the likelihoods of other events, based on our existing model and the actuality of A.

Mathematically, P[B|A] corresponds to restricting the probability function to outcomes in A, and renormalizing the values (dividing by P[A]) so that the total probability of all the outcomes (in A) is now 1.
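This restrict-and-renormalize description can be made concrete on the two-flip space (a sketch; the events reuse the earlier example):

```python
from fractions import Fraction
from itertools import product

S = {"".join(seq) for seq in product("HT", repeat=2)}

def P(E):
    # Uniform measure on S.
    return Fraction(len(E), len(S))

def P_given(B, A):
    # Restrict to outcomes in A, then renormalize by P[A].
    return P(B & A) / P(A)

A = {"HH", "HT"}          # first was heads
B = {"HT", "TH"}          # exactly one heads
print(P_given(B, A))      # 1/2
# The conditional measure assigns total probability 1 to A itself:
assert P_given(A, A) == 1
```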

The definition of conditional probability can also be turned around and reinterpreted:

Multiplication rule

P[A∩B]=P[A]·P[B|A]

“The probability of A AND B equals the probability of A times the probability of B-given-A.”

This principle generalizes to any events in sequence:

Generalized multiplication rule

P[A₁∩A₂∩A₃]=P[A₁]·P[A₂|A₁]·P[A₃|A₁∩A₂]
P[A₁∩⋯∩Aₙ]=P[A₁]·P[A₂|A₁]·P[A₃|A₁∩A₂]⋯P[Aₙ|A₁∩⋯∩Aₙ₋₁]

The generalized rule can be verified like this. First substitute A₂ for B and A₁ for A in the original rule. Then substitute A₃ for B and A₁∩A₂ for A in the original rule, and combine with the first step to obtain the rule for triples. Repeat with A₄ and A₁∩A₂∩A₃, combine with the triples, and you get quadruples.
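As a numerical illustration (this particular example is not from the notes): the chance of drawing three aces in a row from a standard 52-card deck, computed by the generalized rule:

```python
from fractions import Fraction

# P[ace1] * P[ace2 | ace1] * P[ace3 | ace1 ∩ ace2]
p = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50)
print(p)  # 1/5525
```

Each factor conditions on all the draws before it: after one ace is gone, 3 aces remain among 51 cards, and so on.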


07 Illustration

Practice exercise

Simplifying conditionals inclusion

Let A⊆B. Simplify the following values:

P[A|B], P[A|Bᶜ], P[B|A], P[B|Aᶜ]

Example - Multiplication: draw two cards

Multiplication: draw two cards

Two cards are drawn from a standard deck (without replacement).

What is the probability that the first is a 3, and the second is a 4?

Solution

This “two-stage” experiment lends itself to a solution using the multiplication rule for conditional probability.

(1) Label events:

  • Write T for the event that the first card is a 3.
  • Write F for the event that the second card is a 4.

We seek P[T∩F]. We will use the multiplication rule:

P[T∩F]=P[T]·P[F|T]

(2) Compute probabilities:

We know P[T]=4/52. (Does not depend on the second draw.)

For the conditional probability, note that if the first is a 3, then there are four remaining 4s and 51 remaining cards. Therefore:

P[F|T]=4/51

(3) Apply multiplication rule:

P[T∩F]=P[T]·P[F|T]=(4/52)·(4/51)=4/663
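The same probability falls out of a direct enumeration of all ordered two-card draws (a sketch; ranks are encoded as single characters, T for ten):

```python
from fractions import Fraction
from itertools import permutations

ranks = "A23456789TJQK"
deck = [r + s for r in ranks for s in "SHDC"]  # 52 cards

# Ordered pairs of distinct cards = all two-card draws without replacement.
S = list(permutations(deck, 2))
event = [(c1, c2) for c1, c2 in S if c1[0] == "3" and c2[0] == "4"]

p = Fraction(len(event), len(S))
print(p)  # 4/663
```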

Example - Flip a coin, then roll dice

Multiplication: flip a coin, then roll dice

Flip a coin. If the outcome is heads, roll two dice and add the numbers. If the outcome is tails, roll a single die and take that number. What is the probability of getting a tails AND a number at least 3?

Solution

(1) This “two-stage” experiment lends itself to a solution using the multiplication rule for conditional probability.

Label the events of interest.

Let H and T be the events that the coin showed heads and tails, respectively.

Let A₁,…,A₁₂ be the events that the final number is 1,…,12, respectively.

The value we seek is P[T∩B], where B=A₃∪A₄∪⋯∪A₁₂ is the event that the final number is at least 3.


(2) Observe known (conditional) probabilities.

We know that P[H]=1/2 and P[T]=1/2.

We know that P[A₅|T]=1/6, for example, or that P[A₂|H]=1/36.


(3) Apply multiplication rule:

P[T∩B]=P[T]·P[B|T]

We know P[T]=1/2 and can see by counting that P[B|T]=2/3.

Therefore P[T∩B]=(1/2)·(2/3)=1/3.
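Enumerating an equally likely fine-grained space confirms the answer 1/3. (A sketch under one modeling assumption: both dice are always rolled, and on tails the second die is simply ignored; this keeps all fine-grained outcomes equally likely.)

```python
from fractions import Fraction
from itertools import product

# Fine-grained outcomes: coin x die1 x die2, all equally likely.
outcomes = list(product("HT", range(1, 7), range(1, 7)))

# Tails AND final number (= die1 on tails) at least 3.
event = [(c, d1, d2) for c, d1, d2 in outcomes if c == "T" and d1 >= 3]

p = Fraction(len(event), len(outcomes))
print(p)  # 1/3
```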


Example - Coin flipping: at least 2 heads

Coin flipping: at least 2 heads

Flip a fair coin 4 times and record the outcomes as sequences, like HHTH.

Let A≥2 be the event that there are at least two heads, and A≥1 the event that there is at least one heads.

First let’s calculate P[A≥2].

Define E₂, the event that there were exactly 2 heads, E₃, the event of exactly 3, and E₄, the event of exactly 4. These events are exclusive, so:

P[A≥2]=P[E₂∪E₃∪E₄]=P[E₂]+P[E₃]+P[E₄]

Each term on the right can be calculated by counting:

P[E₂]=|E₂|/2⁴=C(4,2)/16=6/16
P[E₃]=|E₃|/2⁴=C(4,3)/16=4/16
P[E₄]=|E₄|/2⁴=C(4,4)/16=1/16

Therefore, P[A≥2]=11/16.

Now suppose we find out that “at least one heads definitely came up”. (Meaning that we know A≥1.) For example, our friend is running the experiment and tells us this fact about the outcome.

Now what is our estimate of the likelihood of A≥2?

The formula for conditioning gives:

P[A≥2|A≥1]=P[A≥2∩A≥1]/P[A≥1]

Now A≥2∩A≥1=A≥2. (Any outcome with at least two heads automatically has at least one heads.) We already found that P[A≥2]=11/16. To compute P[A≥1] we add P[E₁], the probability of exactly one heads, which is 4/16, giving P[A≥1]=15/16.

Therefore:

P[A≥2|A≥1]=(11/16)/(15/16)=11/15
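The counts and the conditional probability can be confirmed by brute-force enumeration of all 16 sequences (a sketch in Python):

```python
from fractions import Fraction
from itertools import product

S = list(product("HT", repeat=4))            # 16 equally likely sequences
heads = lambda seq: seq.count("H")

at_least_2 = [s for s in S if heads(s) >= 2]
at_least_1 = [s for s in S if heads(s) >= 1]

P = lambda E: Fraction(len(E), len(S))
assert P(at_least_2) == Fraction(11, 16)

# Conditioning: P[at least 2 | at least 1] = P[intersection] / P[at least 1].
both = [s for s in at_least_2 if s in at_least_1]
cond = P(both) / P(at_least_1)
print(cond)  # 11/15
```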

08 Theory

Theory 2

Law of Total Probability: 2 cases

For any events A and B:

P[B]=P[A]·P[B|A]+P[Aᶜ]·P[B|Aᶜ]

This rule can also be called “Division into Cases.”

Interpretation: event B may be divided along the lines of A, with some of P[B] coming from the part in A and the rest from the part in Ac.

This law can be generalized to any partition of the sample space S. A partition is a collection of events Aᵢ which are mutually exclusive and jointly exhaustive:

Aᵢ∩Aⱼ=∅ for i≠j,  ⋃ᵢAᵢ=S

The generalized formulation of Total Probability for a partition is:

Law of Total Probability: n cases

For a partition A₁,…,Aₙ of the sample space S:

P[B]=P[A₁]·P[B|A₁]+⋯+P[Aₙ]·P[B|Aₙ]

By setting n=2 with A₁=A and A₂=Aᶜ, we recover the 2-case Law from the n-case version.
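A sketch of the 2-case law in code, with made-up numbers (all three input probabilities below are hypothetical, chosen only to show the weighted average):

```python
from fractions import Fraction

# Hypothetical inputs: P[A], P[B|A], P[B|A^c].
P_A = Fraction(1, 3)
P_B_given_A = Fraction(1, 2)
P_B_given_Ac = Fraction(1, 4)

# Law of Total Probability, 2 cases:
# P[B] = P[A]*P[B|A] + P[A^c]*P[B|A^c]
P_B = P_A * P_B_given_A + (1 - P_A) * P_B_given_Ac
print(P_B)  # 1/3
```

Note that P[B] lands between the two conditional values, weighted by how likely each case is.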


09 Illustration

Practice exercise

Marble transferred, marble drawn

Setup:

  • Bin 1 holds five red and four green marbles.
  • Bin 2 holds four red and five green marbles.

Experiment:

  • You take a random marble from Bin 1 and put it in Bin 2 and shake Bin 2.
  • Then you draw a random marble from Bin 2 and look at it.

What is the probability that the marble you look at is red?
