Bayes’ Theorem

01 Theory

Bayes’ Theorem

For any events $A$ and $B$ with $P(B) > 0$:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

  • !! Bayes’ Theorem is sometimes called Bayes’ Rule.

Bayes’ Theorem - Derivation

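Start with the observation that $A \cap B = B \cap A$, or event “$A$ AND $B$” equals event “$B$ AND $A$”.

Apply the multiplication rule to each order:

$$P(A \cap B) = P(A)\,P(B \mid A), \qquad P(B \cap A) = P(B)\,P(A \mid B)$$

Equate them and rearrange:

$$P(B)\,P(A \mid B) = P(A)\,P(B \mid A) \quad\Longrightarrow\quad P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$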

The main application of Bayes’ Theorem is to calculate $P(A \mid B)$ when it is easy to calculate $P(B \mid A)$ from the problem setup. Often this occurs in multi-stage experiments where event $A$ describes outcomes of an intermediate stage.

Note: these notes use alphabetical order $A$, $B$ as a mnemonic for temporal or logical order, i.e. that $A$ comes first in time, or otherwise that $A$ is the prior condition from which it is easier to calculate $P(B \mid A)$.
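As a minimal sketch of the formula in use, the function below computes $P(A \mid B)$ from $P(A)$, $P(B \mid A)$ and $P(B \mid A^c)$, obtaining the denominator $P(B)$ from the law of total probability. The function name `bayes_posterior` and the bin-and-marble numbers are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def bayes_posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) given P(A), P(B|A) and P(B|A^c)."""
    # Law of total probability: P(B) = P(B|A) P(A) + P(B|A^c) P(A^c)
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1 - prior_a)
    if p_b == 0:
        raise ValueError("P(B) is zero, so P(A|B) is undefined")
    # Bayes' Theorem: P(A|B) = P(B|A) P(A) / P(B)
    return p_b_given_a * prior_a / p_b


# Hypothetical two-stage experiment (assumed numbers):
# stage 1 picks bin 1 with probability 1/3 (event A);
# stage 2 draws a red marble (event B), with
# P(B|A) = 3/4 and P(B|A^c) = 1/4.
posterior = bayes_posterior(Fraction(1, 3), Fraction(3, 4), Fraction(1, 4))
print(posterior)  # 3/5
```

Exact fractions are used so the result can be compared directly with a hand calculation.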

02 Illustration

Example - Bayes’ Theorem - COVID tests

10 - Bayes’ Theorem: COVID tests

Intuition - COVID testing

Some people find the low number surprising. To repair your intuition, think about it like this: roughly 2.5% of tests are positive, with roughly 2% coming from false positives and roughly 0.5% from true positives. The true ones make up only $\frac{0.5}{2.5} = \frac{1}{5}$ of the positive results!

(This rough approximation is by assuming .)

If two tests both come back positive, the probability of COVID is now 98%.

If only people with symptoms are tested, so that, say, 20% of those tested have COVID, that is, $P(\text{COVID}) = 0.2$, then one positive test implies a COVID probability of 92%.
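The effect of the prior can be sketched with natural frequencies, that is, by counting people in an imagined population. The test characteristics used here (0.5% prevalence, 96% true-positive rate, 2% false-positive rate) are assumptions chosen to be consistent with the rough figures quoted above; they are not stated in this section, and the helper name is hypothetical.

```python
from fractions import Fraction


def natural_frequencies(population, prevalence, sensitivity, false_positive_rate):
    """Count true and false positives in a population; return them with P(COVID | positive)."""
    infected = population * prevalence
    healthy = population - infected
    true_pos = infected * sensitivity
    false_pos = healthy * false_positive_rate
    posterior = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, posterior


# Assumed test characteristics (hypothetical, not taken from the notes).
SENSITIVITY = Fraction(96, 100)
FALSE_POSITIVE_RATE = Fraction(2, 100)

# Everyone is tested: assume 0.5% of people have COVID.
tp_low, fp_low, post_low = natural_frequencies(
    100_000, Fraction(5, 1000), SENSITIVITY, FALSE_POSITIVE_RATE
)

# Only people with symptoms are tested: 20% of them have COVID.
tp_high, fp_high, post_high = natural_frequencies(
    100_000, Fraction(20, 100), SENSITIVITY, FALSE_POSITIVE_RATE
)

print(tp_low, fp_low, float(post_low))
print(tp_high, fp_high, float(post_high))
```

Under these assumptions, 480 true positives compete with 1990 false positives in the first case, while 19200 true positives compete with only 1600 false positives in the second, which is why the same positive result is so much more convincing.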

Exercise - Bayes’ Theorem and Multiplication: Inferring bin from marble

11 - Inferring bin from marble

Independence

03 Theory

Two events are independent when information about one of them does not change our probability estimate for the other. Mathematically, there are three ways to express this fact:

Independence

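Events $A$ and $B$ are independent when these (logically equivalent) equations hold:

$$P(B \mid A) = P(B), \qquad P(A \mid B) = P(A), \qquad P(A \cap B) = P(A)\,P(B)$$

  • ! The last equation is symmetric in $A$ and $B$.
    • Check: $A \cap B = B \cap A$ and $P(A)\,P(B) = P(B)\,P(A)$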
    • This symmetric version is the preferred definition of the concept.

Multiple-independence

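A collection of events $A_1, \dots, A_n$ is mutually independent when every subcollection $A_{i_1}, \dots, A_{i_k}$ satisfies:

$$P(A_{i_1} \cap \cdots \cap A_{i_k}) = P(A_{i_1}) \cdots P(A_{i_k})$$

A potentially weaker condition for a collection is called pairwise independence, which holds when all 2-member subcollections are independent:

$$P(A_i \cap A_j) = P(A_i)\,P(A_j) \quad \text{for all } i \neq j$$

One could also define $3$-member independence, or $k$-member independence. Plain ‘independence’ means any-member independence.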
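That pairwise independence can be strictly weaker than mutual independence can be checked by brute force on a small sample space. The sketch below uses a standard textbook example (two fair coin flips, which is not taken from these notes) and hypothetical event names `A`, `B`, `C`.

```python
from fractions import Fraction
from itertools import combinations, product

# Sample space: two fair coin flips, all four outcomes equally likely.
outcomes = list(product("HT", repeat=2))


def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))


events = {
    "A": lambda w: w[0] == "H",   # first flip is heads
    "B": lambda w: w[1] == "H",   # second flip is heads
    "C": lambda w: w[0] == w[1],  # both flips agree
}


def factorizes(names):
    """Check P(intersection) == product of the individual probabilities."""
    joint = prob(lambda w: all(events[n](w) for n in names))
    product_of_marginals = Fraction(1)
    for n in names:
        product_of_marginals *= prob(events[n])
    return joint == product_of_marginals


pairwise = all(factorizes(pair) for pair in combinations(events, 2))
mutual = all(
    factorizes(sub)
    for k in range(2, len(events) + 1)
    for sub in combinations(events, k)
)
print(pairwise, mutual)  # True False
```

Every pair factorizes, but the triple does not: the three events occur together with probability 1/4, whereas the product of their probabilities is 1/8.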

04 Illustration

Exercise - Independence and complements

12 - Independence and complements

Example - Checking independence by hand

13 - Independence by hand: red and green marbles

Tree diagrams

05 Theory

A tree diagram depicts the components of a multi-stage experiment. Nodes, or branch points, represent sources of randomness.

An outcome of the experiment is represented by a pathway taken from the root (left-most node) to a leaf (right-most node). The branch chosen at a given node junction represents the outcome of the “sub-experiment” constituting that branch point. So a pathway encodes the outcomes of all sub-experiments.

Each branch from a node is labeled with a probability number. This is the probability that the sub-experiment of that node has the outcome of that branch.

  • The probability label on some branch is the conditional probability of that branch, assuming the pathway from the root to the prior node.
    • In the example: the label on a branch from $A$ to $B$ is $P(B \mid A)$.
    • Therefore, the branch labels from a given node sum to 1. (Law of Total Probability)
  • The probability of a given (overall) outcome is the product of the probabilities on each branch of the pathway to that outcome.
    • Makes sense, because (e.g.): $P(A \cap B) = P(A)\,P(B \mid A)$
    • More generally: remember that (e.g.): $P(A \cap B \cap C) = P(A)\,P(B \mid A)\,P(C \mid A \cap B)$
    • This overall outcome probability may be written at the leaf.

One can also use a tree diagram to remember quickly how to calculate certain probabilities.

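For example, what is $P(B)$ in the diagram? Answer: add up the pathway probabilities (leaf numbers) terminating in $B$. For a tree whose first stage splits into $A$ and $A^c$, that makes:

$$P(B) = P(A)\,P(B \mid A) + P(A^c)\,P(B \mid A^c)$$

For example, what is $P(A \mid B)$? Answer: divide the leaf probability of $A \cap B$ by the total probability of $B$. That makes:

$$P(A \mid B) = \frac{P(A)\,P(B \mid A)}{P(A)\,P(B \mid A) + P(A^c)\,P(B \mid A^c)}$$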
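The two recipes above can be sketched on a small tree stored as a nested dictionary. The branch probabilities below are hypothetical and do not come from the diagram in these notes; the labels `A`, `A^c`, `B`, `B^c` are likewise assumed.

```python
from fractions import Fraction

# Hypothetical two-stage tree. Each first-stage branch maps to
# (its probability, {second-stage branch: conditional probability}).
tree = {
    "A": (Fraction(2, 5), {"B": Fraction(1, 2), "B^c": Fraction(1, 2)}),
    "A^c": (Fraction(3, 5), {"B": Fraction(1, 4), "B^c": Fraction(3, 4)}),
}

# Leaf probability = product of the branch labels along the pathway.
leaves = {
    (first, second): p_first * p_second
    for first, (p_first, branches) in tree.items()
    for second, p_second in branches.items()
}

# The leaf probabilities cover every outcome, so they sum to 1.
total = sum(leaves.values())

# P(B): add up the leaves whose pathway terminates in B.
p_b = sum(p for (first, second), p in leaves.items() if second == "B")

# P(A|B): leaf probability of (A, B) divided by the total probability of B.
p_a_given_b = leaves[("A", "B")] / p_b

print(total, p_b, p_a_given_b)  # 1 7/20 4/7
```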

06 Illustration

Example - Tree diagrams: Marble transferred, marble drawn

14 - Marble transferred, marble drawn

Counting

07 Theory

In many “games of chance”, it is assumed by symmetry principles that all outcomes are equally likely. From this assumption we infer the rule for $P(A)$:

$$P(A) = \frac{|A|}{|S|}$$

In words: the probability of event $A$ is the number of outcomes in $A$ divided by the number of possible outcomes (the size of the sample space $S$).

When this formula applies, it is important to be able to count total outcomes, as well as outcomes satisfying various conditions.
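For a small sample space the counting can be done by direct enumeration. The following sketch assumes a hypothetical experiment (rolling two fair six-sided dice) that is not part of these notes.

```python
from fractions import Fraction
from itertools import product

# Hypothetical experiment: roll two fair six-sided dice.
# All 36 ordered pairs are assumed equally likely.
sample_space = list(product(range(1, 7), repeat=2))


def probability(event):
    """Count outcomes satisfying the event and divide by the total count."""
    favorable = [w for w in sample_space if event(w)]
    return Fraction(len(favorable), len(sample_space))


p_sum_seven = probability(lambda w: w[0] + w[1] == 7)
p_doubles = probability(lambda w: w[0] == w[1])
print(p_sum_seven, p_doubles)  # 1/6 1/6
```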

Permutations

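Permutations count the number of ordered lists one can form from some items. For a list of $k$ items taken from a total collection of $n$, the number of permutations is:

$$\frac{n!}{(n-k)!}$$

To see where this comes from: There are $n$ choices for the first item, then $n-1$ for the second, then … then $n-k+1$ for the $k^{\text{th}}$ item. So the number is $n(n-1)\cdots(n-k+1)$. Observe:

$$\frac{n!}{(n-k)!} = \frac{n(n-1)\cdots(n-k+1)\cdot(n-k)(n-k-1)\cdots 1}{(n-k)(n-k-1)\cdots 1} = n(n-1)\cdots(n-k+1)$$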
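The agreement between the product of choices, the factorial formula, and a direct enumeration can be checked numerically. The values $n = 5$, $k = 3$ and the helper name `count_permutations` are chosen here only for illustration.

```python
import math
from itertools import permutations


def count_permutations(n, k):
    """Multiply the number of choices at each position: n (n-1) ... (n-k+1)."""
    count = 1
    for i in range(k):
        count *= n - i
    return count


n, k = 5, 3
by_product = count_permutations(n, k)
by_formula = math.factorial(n) // math.factorial(n - k)
by_enumeration = sum(1 for _ in permutations(range(n), k))
print(by_product, by_formula, by_enumeration)  # 60 60 60
```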

Combinations, binomial coefficient

Combinations count the number of sets (ignoring order) one can form from some items. We define a notation for it like this:

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$

This counts the number of sets of $k$ distinct elements taken from a total collection of $n$ items.

Another name for combinations is the binomial coefficient.

This formula can be derived from the formula for permutations. The possible permutations can be partitioned into combinations: each combination gives a set, and by specifying an ordering of elements in the set, we get a permutation. For a set of $k$ elements taken from $n$ items, there are $k!$ ways to put them into a specific order. So the number of permutations must be a factor of $k!$ greater than the number of combinations:

$$\frac{n!}{(n-k)!} = k!\,\binom{n}{k}$$

This notation, $\binom{n}{k}$, is also called the binomial coefficient because it provides the coefficients of a binomial expansion:

$$(x+y)^n = \sum_{k=0}^{n} \binom{n}{k}\, x^k y^{n-k}$$

For example:

$$(x+y)^3 = x^3 + 3x^2y + 3xy^2 + y^3$$
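Both claims above (the factor of $k!$ between permutations and combinations, and the link to binomial expansions) can be verified for small values. The sketch below uses $n = 5$, $k = 3$ and a hypothetical helper `expand_binomial` that multiplies out $(x+y)^n$ one factor at a time.

```python
import math
from itertools import combinations

n, k = 5, 3

# Count k-element subsets directly and via the factorial formula.
by_enumeration = sum(1 for _ in combinations(range(n), k))
by_formula = math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

# Each subset can be ordered in k! ways, giving all the permutations.
permutation_count = math.factorial(n) // math.factorial(n - k)
ratio = permutation_count // by_formula


def expand_binomial(power):
    """Coefficients of (x + y)^power, built by repeated multiplication by (x + y)."""
    coeffs = [1]
    for _ in range(power):
        # Multiplying by (x + y) adds each coefficient to its shifted neighbor.
        coeffs = [a + b for a, b in zip([0] + coeffs, coeffs + [0])]
    return coeffs


expansion = expand_binomial(4)
print(by_enumeration, by_formula, ratio, expansion)
```

The list returned by `expand_binomial(4)` can be compared entry by entry with `math.comb(4, j)` for `j` from 0 to 4.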

There are also ‘higher’ combinations:

Multinomial coefficient

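The general multinomial coefficient is defined by the formula:

$$\binom{n}{k_1, k_2, \dots, k_r} = \frac{n!}{k_1!\,k_2!\cdots k_r!}$$

where each $k_i$ is a nonnegative integer and $k_1 + k_2 + \cdots + k_r = n$.

The multinomial coefficient measures the number of ways to partition $n$ items into sets with sizes $k_1, k_2, \dots, k_r$, respectively.

Notice that $\binom{n}{k,\,n-k} = \binom{n}{k}$, so we already defined these values with binomial coefficients. But with $r \geq 3$, we have new values. They correspond to the coefficients in multinomial expansions. For example, $r = 3$ gives the coefficients for $(x+y+z)^n$:

$$(x+y+z)^n = \sum_{k_1+k_2+k_3=n} \binom{n}{k_1, k_2, k_3}\, x^{k_1} y^{k_2} z^{k_3}$$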
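A short sketch can confirm the formula against a brute-force count. The function name `multinomial` and the sizes $(2, 1, 1)$ are illustrative assumptions; the enumeration assigns each of 4 items a label 0, 1 or 2 and keeps the assignments with the required set sizes.

```python
import math
from itertools import product


def multinomial(*sizes):
    """n! / (k_1! k_2! ... k_r!) where n is the sum of the sizes."""
    result = math.factorial(sum(sizes))
    for k in sizes:
        # Each intermediate quotient is an integer, so floor division is exact.
        result //= math.factorial(k)
    return result


sizes = (2, 1, 1)
n = sum(sizes)

# Brute force: label each item with the set it goes to, then keep
# only the labelings whose set sizes match.
by_enumeration = sum(
    1
    for labels in product(range(len(sizes)), repeat=n)
    if all(labels.count(j) == size for j, size in enumerate(sizes))
)

by_formula = multinomial(*sizes)
print(by_formula, by_enumeration)  # 12 12

# With two sets the multinomial coefficient reduces to the binomial coefficient.
two_set_case = multinomial(3, 2)
print(two_set_case, math.comb(5, 3))  # 10 10
```

The value 12 is also the coefficient of $x^2 y z$ in the expansion of $(x+y+z)^4$.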

08 Illustration

Exercise - Combinations: Counting teams with Cooper

15 - Counting teams with Cooper

Example - Combinations: Groups with Haley and Hugo

16 - Haley and Hugo from 2 groups of 3

Example - Counting VA license plates

17 - Counting VA license plates

Counting out 4 teams

18 - Counting out 4 teams