Bayes’ Theorem
01 Theory
Bayes’ Theorem
For any events $A$ and $B$ with $P(B) > 0$:
$$P(A \mid B) = P(B \mid A)\,\frac{P(A)}{P(B)}$$
- !! Bayes’ Theorem is sometimes called Bayes’ Rule.
Bayes’ Theorem - Derivation
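Start with the observation that $A \cap B = B \cap A$, or event “$A$ AND $B$” equals event “$B$ AND $A$”. Apply the multiplication rule to each order:
$$P(A \cap B) = P(A \mid B)\,P(B), \qquad P(B \cap A) = P(B \mid A)\,P(A)$$
Equate them and rearrange:
$$P(A \mid B)\,P(B) = P(B \mid A)\,P(A) \quad\Longrightarrow\quad P(A \mid B) = P(B \mid A)\,\frac{P(A)}{P(B)}$$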
The main application of Bayes’ Theorem is to calculate $P(A \mid B)$ when $P(B \mid A)$ is the quantity that is known or easy to compute.
Note: these notes use alphabetical order
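Bayes’ Theorem can be checked numerically on a small sample space. The sketch below is illustrative only: the two-dice experiment, the events `A` and `B`, and the helper names `prob` and `cond_prob` are assumptions made for this example, not part of the notes.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered rolls of two fair dice, all 36 outcomes equally likely.
OUTCOMES = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    favorable = sum(1 for o in OUTCOMES if event(o))
    return Fraction(favorable, len(OUTCOMES))

def cond_prob(event, given):
    """P(event | given), computed as P(event AND given) / P(given)."""
    return prob(lambda o: event(o) and given(o)) / prob(given)

# Hypothetical events chosen for illustration.
A = lambda o: o[0] + o[1] == 8   # the sum is 8
B = lambda o: o[0] % 2 == 0      # the first die is even

lhs = cond_prob(A, B)                       # P(A | B), computed directly
rhs = cond_prob(B, A) * prob(A) / prob(B)   # P(B | A) * P(A) / P(B)
print(lhs, rhs)  # 1/6 1/6
```

Exact fractions are used so that the two sides can be compared with `==` rather than with a floating-point tolerance.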
02 Illustration
Example - Bayes’ Theorem - COVID tests
Intuition - COVID testing
Some people find the low number surprising. To repair your intuition, think about it like this: roughly 2.5% of tests are positive, with roughly 2% coming from false positives and roughly 0.5% from true positives. The true ones make up only $0.5 / 2.5 = 20\%$
of the positive results! (This rough approximation is by assuming
.) If two tests both come back positive, the probability of COVID is now 98%.
If only people with symptoms are tested, so that, say, 20% of those tested have COVID, that is, $P(\text{COVID}) = 0.2$, then one positive test implies a COVID probability of 92%.
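The effect of the prior on the answer can be seen in a short calculation. This is a sketch under stated assumptions: the 0.5% prevalence, the 95% sensitivity, and the 2% false-positive rate are illustrative values picked to match the rough percentages in the paragraph above, and the function name `posterior` is hypothetical.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(COVID | positive test) via Bayes' Theorem."""
    true_pos = prior * sensitivity                  # P(positive AND COVID)
    false_pos = (1 - prior) * false_positive_rate   # P(positive AND no COVID)
    return true_pos / (true_pos + false_pos)        # denominator is P(positive)

# Assumed, illustrative test characteristics (not taken from the notes).
SENSITIVITY = 0.95
FALSE_POSITIVE_RATE = 0.02

low_prior = posterior(0.005, SENSITIVITY, FALSE_POSITIVE_RATE)   # whole population
high_prior = posterior(0.20, SENSITIVITY, FALSE_POSITIVE_RATE)   # symptomatic only
print(f"prior 0.5%: {low_prior:.1%}")
print(f"prior 20%:  {high_prior:.1%}")
```

With these assumed rates the first posterior comes out near 19% and the second near 92%: the same test gives a very different answer once the prior changes.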
Exercise - Bayes’ Theorem and Multiplication: Inferring bin from marble
Independence
03 Theory
Two events are independent when information about one of them does not change our probability estimate for the other. Mathematically, there are three ways to express this fact:
Independence
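Events $A$ and $B$ are independent when these (logically equivalent) equations hold:
$$P(B \mid A) = P(B)$$
$$P(A \mid B) = P(A)$$
$$P(A \cap B) = P(A)\,P(B)$$
- ! The last equation is symmetric in $A$ and $B$.
- Check: $A \cap B = B \cap A$ and $P(A)\,P(B) = P(B)\,P(A)$
- This symmetric version is the preferred definition of the concept.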
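- Check: if $P(A \cap B) = P(A)\,P(B)$ and $P(A) > 0$, then $P(B \mid A) = \frac{P(A \cap B)}{P(A)} = P(B)$; the argument for $P(A \mid B) = P(A)$ is the same with the roles swapped.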
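The symmetric product condition is easy to test by enumeration. The following is a minimal sketch, assuming a two-dice experiment and three hypothetical events; the helper names `prob` and `independent` are invented for this example.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered rolls of two fair dice, equally likely.
OUTCOMES = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for o in OUTCOMES if event(o)), len(OUTCOMES))

def independent(e1, e2):
    """True when P(e1 AND e2) == P(e1) * P(e2)."""
    return prob(lambda o: e1(o) and e2(o)) == prob(e1) * prob(e2)

# Hypothetical events chosen for illustration.
first_even = lambda o: o[0] % 2 == 0
sum_is_7 = lambda o: o[0] + o[1] == 7
sum_is_8 = lambda o: o[0] + o[1] == 8

print(independent(first_even, sum_is_7))  # True
print(independent(first_even, sum_is_8))  # False
```

Knowing the first die is even says nothing about whether the sum is 7, but it does raise the chance that the sum is 8, which is why only the first pair passes the check.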
Multiple-independence
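A collection of events $A_1, \dots, A_n$ is mutually independent when every subcollection $A_{i_1}, \dots, A_{i_k}$ satisfies:
$$P(A_{i_1} \cap \cdots \cap A_{i_k}) = P(A_{i_1}) \cdots P(A_{i_k})$$
A potentially weaker condition for a collection $A_1, \dots, A_n$ is called pairwise independence, which holds when all 2-member subcollections are independent:
$$P(A_i \cap A_j) = P(A_i)\,P(A_j) \quad \text{for all } i \neq j$$
One could also define $3$-member independence, or $k$-member independence. Plain ‘independence’ means any-member independence.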
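Pairwise independence really is weaker than mutual independence. A standard counterexample uses two fair coin flips; the sketch below checks it by enumeration. The events `A`, `B`, `C` and all helper names are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

# Sample space: two fair coin flips, four equally likely outcomes.
OUTCOMES = list(product("HT", repeat=2))

def prob(event):
    return Fraction(sum(1 for o in OUTCOMES if event(o)), len(OUTCOMES))

def product_rule_holds(events):
    """True when P(intersection of events) equals the product of their probabilities."""
    joint = prob(lambda o: all(e(o) for e in events))
    return joint == prod(prob(e) for e in events)

def k_member_independent(events, k):
    """The product rule holds for every k-member subcollection."""
    return all(product_rule_holds(sub) for sub in combinations(events, k))

def mutually_independent(events):
    """The product rule holds for every subcollection of size 2 or more."""
    return all(k_member_independent(events, k) for k in range(2, len(events) + 1))

A = lambda o: o[0] == "H"    # first flip is heads
B = lambda o: o[1] == "H"    # second flip is heads
C = lambda o: o[0] == o[1]   # the two flips agree

events = [A, B, C]
print(k_member_independent(events, 2))  # True
print(mutually_independent(events))     # False
```

Each pair multiplies correctly to $1/4$, but all three events occur together only on the outcome HH, with probability $1/4$ rather than $1/8$.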
04 Illustration
Exercise - Independence and complements
Example - Checking independence by hand
Tree diagrams
05 Theory
A tree diagram depicts the components of a multi-stage experiment. Nodes, or branch points, represent sources of randomness.
An outcome of the experiment is represented by a pathway taken from the root (left-most node) to a leaf (right-most node). The branch chosen at a given node junction represents the outcome of the “sub-experiment” constituting that branch point. So a pathway encodes the outcomes of all sub-experiments.
Each branch from a node is labeled with a probability number. This is the probability that the sub-experiment of that node has the outcome of that branch.
- The probability label on some branch is the conditional probability of that branch, assuming the pathway from root to prior node.
- In the example:
- Therefore, the branch labels from a given node sum to 1. (Law of Total Probability)
- In the example:
- The probability of a given (overall) outcome is the product of the probabilities on each branch of the pathway to that outcome.
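- Makes sense, because (e.g.): $P(A \cap B) = P(A)\,P(B \mid A)$ for a pathway through branch $A$ and then branch $B$.
- More generally: remember that (e.g.): $P(A \cap B \cap C) = P(A)\,P(B \mid A)\,P(C \mid A \cap B)$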
- This overall outcome probability may be written at the leaf.
- Makes sense, because (e.g.):
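The rules in the list above can be sketched in code. The tree here is hypothetical (two bins with made-up marble contents) and is not the example from these notes; the nested-dictionary layout and the function name `leaf_probabilities` are also assumptions.

```python
from fractions import Fraction

# Hypothetical two-stage experiment: pick a bin, then draw a marble from it.
# Each branch maps a label to (conditional probability, subtree).
TREE = {
    "bin 1": (Fraction(1, 2), {"red": (Fraction(3, 4), {}),
                               "blue": (Fraction(1, 4), {})}),
    "bin 2": (Fraction(1, 2), {"red": (Fraction(1, 3), {}),
                               "blue": (Fraction(2, 3), {})}),
}

def leaf_probabilities(tree, path=(), prob_so_far=Fraction(1)):
    """Map each root-to-leaf pathway to the product of its branch labels."""
    leaves = {}
    for label, (p, subtree) in tree.items():
        new_path = path + (label,)
        new_prob = prob_so_far * p
        if subtree:
            leaves.update(leaf_probabilities(subtree, new_path, new_prob))
        else:
            leaves[new_path] = new_prob
    return leaves

leaves = leaf_probabilities(TREE)
for path, p in leaves.items():
    print(" -> ".join(path), p)

# Reading probabilities off the tree: sum the leaves that end in "red",
# then divide one leaf by that total to condition on "red".
p_red = sum(p for path, p in leaves.items() if path[-1] == "red")
p_bin1_given_red = leaves[("bin 1", "red")] / p_red
print(p_red, p_bin1_given_red)  # 13/24 9/13
```

The last two lines show the shortcut use of a tree: a total probability is a sum of leaves, and a conditional probability is one leaf divided by such a sum.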
One can also use a tree diagram to remember quickly how to calculate certain probabilities.
For example, what is
For example, what is
06 Illustration
Example - Tree diagrams: Marble transferred, marble drawn
Counting
07 Theory
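In many “games of chance”, it is assumed by symmetry principles that all outcomes are equally likely. From this assumption we infer the rule for the probability of an event $A$ in a sample space $S$:
$$P(A) = \frac{|A|}{|S|} = \frac{\text{number of outcomes in } A}{\text{total number of outcomes}}$$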
When this formula applies, it is important to be able to count total outcomes, as well as outcomes satisfying various conditions.
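For a small experiment, both counts can be obtained by listing outcomes directly. A minimal sketch, assuming three fair coin flips and the event “exactly two heads” (both chosen only for illustration):

```python
from fractions import Fraction
from itertools import product

# All outcomes of three coin flips, assumed equally likely.
flips = list(product("HT", repeat=3))

# Outcomes satisfying the condition "exactly two heads".
exactly_two_heads = [f for f in flips if f.count("H") == 2]

p = Fraction(len(exactly_two_heads), len(flips))
print(len(exactly_two_heads), len(flips), p)  # 3 8 3/8
```

Listing stops being practical once the sample space is large, which is where the counting formulas in this section take over.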
Permutations
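Permutations count the number of ordered lists (without repetition) one can form from some items. For a list of $k$ items taken from a total collection of $n$, the number of permutations is:
$$\frac{n!}{(n-k)!} = n(n-1)\cdots(n-k+1)$$
To see where this comes from:
There are $n$ choices for the first item, then $n-1$ for the second, and so on, down to $n-k+1$ choices for the $k$-th item. Multiplying these counts gives $n(n-1)\cdots(n-k+1) = \frac{n!}{(n-k)!}$.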
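The permutation count $\frac{n!}{(n-k)!}$ can be compared against a brute-force listing. In this sketch the values `n = 5` and `k = 3` are arbitrary; `math.perm` requires Python 3.8 or later.

```python
from itertools import permutations
from math import factorial, perm

n, k = 5, 3  # arbitrary illustrative values

by_formula = factorial(n) // factorial(n - k)
by_listing = len(list(permutations(range(n), k)))  # ordered lists, no repeats
by_builtin = perm(n, k)

print(by_formula, by_listing, by_builtin)  # 60 60 60
```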
Combinations, binomial coefficient
Combinations count the number of sets (ignoring order) one can form from some items. We define a notation for it like this:
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$
This counts the number of sets of $k$ distinct elements taken from a total collection of $n$ items. Another name for combinations is the binomial coefficient.
This formula can be derived from the formula for permutations. The possible permutations can be partitioned into combinations: each combination gives a set, and by specifying an ordering of elements in the set, we get a permutation. For a set of $k$ elements there are $k!$ orderings, so the number of permutations is $k!$ times the number of combinations:
$$\frac{n!}{(n-k)!} = k!\,\binom{n}{k}$$
This notation, $\binom{n}{k}$, is read “$n$ choose $k$”.
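The relation between permutations and combinations can be verified numerically. A small sketch with arbitrary values `n = 5` and `k = 3`; `math.comb` and `math.perm` require Python 3.8 or later.

```python
from itertools import combinations
from math import comb, factorial, perm

n, k = 5, 3  # arbitrary illustrative values

by_formula = factorial(n) // (factorial(k) * factorial(n - k))
by_listing = len(list(combinations(range(n), k)))  # unordered sets of size k
by_builtin = comb(n, k)
print(by_formula, by_listing, by_builtin)  # 10 10 10

# Each set of k elements can be ordered in k! ways, so
# (number of permutations) = k! * (number of combinations).
print(perm(n, k) == factorial(k) * comb(n, k))  # True
```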
There are also ‘higher’ combinations:
Multinomial coefficient
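The general multinomial coefficient is defined by the formula:
$$\binom{n}{k_1, k_2, \dots, k_m} = \frac{n!}{k_1!\,k_2! \cdots k_m!}$$
where $k_1, \dots, k_m \geq 0$ are integers and $k_1 + k_2 + \cdots + k_m = n$. The multinomial coefficient measures the number of ways to partition $n$ items into $m$ sets with sizes $k_1, k_2, \dots, k_m$, respectively.
Notice that $\binom{n}{k} = \binom{n}{k,\, n-k}$, so the binomial coefficient is the case $m = 2$.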
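The multinomial coefficient has no dedicated function in Python's standard library, so the sketch below defines a hypothetical helper `multinomial` and cross-checks it against repeated binomial choices. The group sizes 2, 3, 4 are arbitrary.

```python
from math import comb, factorial

def multinomial(*sizes):
    """n! / (k_1! k_2! ... k_m!) where n is the sum of the sizes."""
    result = factorial(sum(sizes))
    for size in sizes:
        result //= factorial(size)  # each intermediate quotient is an integer
    return result

# Partition 9 items into labeled groups of sizes 2, 3, and 4.
direct = multinomial(2, 3, 4)

# Same count by choosing the groups one after another:
# 2 of 9, then 3 of the remaining 7, then 4 of the remaining 4.
sequential = comb(9, 2) * comb(7, 3) * comb(4, 4)
print(direct, sequential)  # 1260 1260

# With two groups the multinomial coefficient is the binomial coefficient.
print(multinomial(3, 2) == comb(5, 3))  # True
```

The sequential version reflects one way to see the formula: fill the groups one at a time, and the factorials telescope into $\frac{n!}{k_1! \cdots k_m!}$.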
08 Illustration
Exercise - Combinations: Counting teams with Cooper
Example - Combinations: Groups with Haley and Hugo
Example - Counting VA license plates
Counting out 4 teams