Significance testing

06 Theory - Significance testing

Theory 1 - Significance testing

Significance test

Ingredients of a significance test (unary hypothesis test):

  • Null hypothesis event $H_0$
    • Identify a Claim
    • Then: $H_0$ is the background assumption (supposing the Claim isn’t known)
    • Goal is to invalidate $H_0$ in favor of the Claim
  • Rejection region (decision rule): an event $R$
    • $R$ is unlikely assuming $H_0$
    • Directionality: $R$ is more likely if the Claim holds
    • Write $R$ in terms of a decision statistic $X$ and a significance level $\alpha$
  • Ability to compute $P[R \mid H_0]$
    • Usually: inferred from $P_{X|H_0}(x)$ or $f_{X|H_0}(x)$
    • Adjust $R$ to achieve $P[R \mid H_0] = \alpha$

Significance level

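Suppose we are given a null hypothesis $H_0$ and a rejection region $R$.

The significance level of $R$ is:

$$\alpha = P[\text{reject } H_0 \mid H_0 \text{ is true}] = P[R \mid H_0]$$

Sometimes the condition is dropped and we write $\alpha = P[R]$, e.g. when a background model without assuming $H_0$ is not known.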

Null hypothesis implies a distribution

Frequently $H_0$ will not take the form of an event in a sample space, $H_0 \subset S$.

Usually $S$ is unspecified, yet $H_0$ determines a known distribution.

At a minimum, the assumption of $H_0$ must determine the numbers $P[R \mid H_0]$.

More generally, we do not need these details:

  • Background sample space $S$
  • Non-conditional distribution (full model): $P_X$ or $f_X$
  • Complement conditionals: $P_{X|H_0^c}$ or $f_{X|H_0^c}$

In basic statistical inference theory, there are two kinds of error.

  • Type I error concludes with rejecting $H_0$ when $H_0$ is true.
  • Type II error concludes with maintaining $H_0$ when $H_0$ is false.

Type I error is usually the bigger problem: we treat $H_0$ as “innocent until proven guilty.”

                            $H_0$ is true                     $H_0$ is false
Maintain null hypothesis    Made right call                   Wrong acceptance (Type II error)
Reject null hypothesis      Wrong rejection (Type I error)    Made right call

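To design a significance test at level $\alpha$, we must identify $H_0$ and specify $R$ with the property that $P[R \mid H_0] = \alpha$ (or $\le \alpha$ when equality cannot be attained, as with a discrete statistic).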

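When $R$ is written using a variable $X$, we must choose between:

  • One-tail rejection region: $X \le r$ with the threshold $r$ in the lower tail, or $X \ge r$ with $r$ in the upper tail
  • Two-tail rejection region: $|X - \mu| \ge c$ with $c > 0$, where $\mu = E[X \mid H_0]$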

07 Illustration

Example - One-tail test: Weighted die

Your friend gives you a single regular die and says she is worried that it has been weighted to prefer the outcome of 2. She wants you to test it.

Design a significance test for the data of 20 rolls of the die to determine whether the die is weighted. Use significance level $\alpha = 0.05$.

Solution

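(1) Let $X$ count the number of 2s that come up.

The Claim: “the die is weighted to prefer 2”

The null hypothesis $H_0$: “the die is normal”

Assuming $H_0$ is true, then $X \sim \text{Bin}(20, 1/6)$, and therefore:

$$P_{X|H_0}(k) = \binom{20}{k} \left(\frac{1}{6}\right)^{k} \left(\frac{5}{6}\right)^{20-k}$$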

(2) ⚠️ Notice that “prefer 2” implies the Claim is for more 2s than normal.

Therefore: Choose a one-tail rejection set $R = \{X \ge r\}$.

Need $r$ such that $P[X \ge r \mid H_0] \le 0.05$.

  • Equivalently: $P[X \le r - 1 \mid H_0] \ge 0.95$

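(3) Solve for $r$ by computing conditional CDF values $P[X \le k \mid H_0]$:

$k$                      0      1      2      3      4      5      6      7
$P[X \le k \mid H_0]$    0.026  0.130  0.329  0.567  0.769  0.898  0.963  0.989

Therefore, choose $r = 7$. Then $P[X \ge 7 \mid H_0] = 1 - 0.963 = 0.037 \le 0.05$, and no smaller (integer) $r$ will have Type I error below 0.05: for $r = 6$ we get $1 - 0.898 = 0.102$.

The final answer is:

$$R = \{X \ge 7\}$$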
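
The search in step (3) can be checked numerically. The following is a minimal Python sketch, assuming the model $X \sim \text{Bin}(20, 1/6)$ under $H_0$ and level 0.05; the function names `binom_cdf` and `upper_tail_threshold` are illustrative choices, not part of the notes.

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P[X <= k] for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


def upper_tail_threshold(n, p, alpha):
    """Return the smallest integer r with P[X >= r | H0] <= alpha, and that tail."""
    for r in range(n + 2):
        tail = 1.0 - binom_cdf(r - 1, n, p)  # P[X >= r] = 1 - P[X <= r - 1]
        if tail <= alpha:
            return r, tail


# Null hypothesis: fair die, so each roll shows a 2 with probability 1/6.
r, tail = upper_tail_threshold(n=20, p=1 / 6, alpha=0.05)
print(r, round(tail, 3))
```

The loop starts at $r = 0$ (where the tail is 1) and stops at the first threshold whose Type I error is at most the level, so it returns the largest rejection region of the form $\{X \ge r\}$ that still meets the level.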

Two-tail test: Circuit voltage

A boosted AC circuit is supposed to maintain an average voltage of with a standard deviation of . Nothing else is known about the voltage distribution.

Design a two-tail test incorporating the data of 40 independent measurements to determine if the expected value of the voltage is truly . Use .

Solution

(1) Use as the decision statistic, i.e. the sample mean of 40 measurements of .

The Claim to test: The null hypothesis :

Rejection region:

where is chosen so that


(2) Assuming , we expect that:

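Recall Chebyshev’s inequality, valid for any random variable $X$ with mean $\mu_X$ and variance $\sigma_X^2$, and any $c > 0$:

$$P\big[\,|X - \mu_X| \ge c\,\big] \le \frac{\sigma_X^2}{c^2}$$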

(3) Now solve:

Therefore the rejection region should be:
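
In symbols, write $\mu_0$ for the nominal mean, $\sigma$ for the standard deviation of one measurement, and $n$ for the number of independent measurements (names chosen here for illustration). The sample mean $M_n$ has variance $\sigma^2 / n$, so Chebyshev gives $P[|M_n - \mu_0| \ge c \mid H_0] \le \sigma^2 / (n c^2)$. Setting this bound equal to $\alpha$ yields $c = \sigma / \sqrt{n \alpha}$. A minimal Python sketch follows; the numbers in it are hypothetical and are not the values of this example.

```python
from math import sqrt


def chebyshev_two_tail_margin(sigma, n, alpha):
    """Return c such that sigma^2 / (n * c^2) == alpha.

    By Chebyshev, the region |M_n - mu0| >= c then has
    Type I error at most alpha, whatever the distribution.
    """
    return sigma / sqrt(n * alpha)


# Hypothetical inputs, chosen only to exercise the formula.
sigma, n, alpha = 2.0, 40, 0.05
c = chebyshev_two_tail_margin(sigma, n, alpha)
bound = sigma**2 / (n * c**2)  # Chebyshev bound on the Type I error
print(c, bound)
```

Because Chebyshev is only an upper bound, the true Type I error of this region may be well below $\alpha$; the test is conservative, which is the price of knowing nothing about the distribution beyond its mean and variance.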

One-tail test with a Gaussian: Weight loss drug

Assume that in the background population in a specific demographic, the distribution of a person’s weight satisfies . Suppose that a pharmaceutical company has developed a weight-loss drug and plans to test it on a group of 64 individuals.

Design a test at the significance level to determine whether the drug is effective.

Solution

(1) Since the drug is tested on 64 individuals, we use the sample mean as the decision statistic.

The Claim: “the drug is effective in reducing weight” The null hypothesis : “no effect: weights on the drug still follow

Assuming is true, then .

⚠️ One-tail test because the drug is expected to reduce weight (unidirectional).

Rejection region:


(2) Compute that .

Since , we know that .


(3) Furthermore:

Then:


(4) Solve:

Therefore, the rejection region:
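
The same steps can be sketched in Python. Assume, as a hypothetical model, that under $H_0$ individual weights are $N(\mu_0, \sigma^2)$, so the sample mean of $n$ people is $N(\mu_0, \sigma^2 / n)$, and the lower-tail threshold $r$ solves $P[M_n \le r \mid H_0] = \alpha$. The numbers below are illustrative and are not the values of this example.

```python
from math import sqrt
from statistics import NormalDist


def lower_tail_threshold(mu0, sigma, n, alpha):
    """Return r with P[M_n <= r | H0] = alpha, where M_n ~ N(mu0, sigma^2 / n)."""
    sample_mean_dist = NormalDist(mu0, sigma / sqrt(n))
    return sample_mean_dist.inv_cdf(alpha)


# Hypothetical inputs: population mean 200, standard deviation 20, 64 subjects.
mu0, sigma, n, alpha = 200.0, 20.0, 64, 0.05
r = lower_tail_threshold(mu0, sigma, n, alpha)
print(round(r, 2))  # reject H0 when the observed sample mean is <= r
```

Equivalently, $r = \mu_0 + z_\alpha \, \sigma / \sqrt{n}$ with $z_\alpha = \Phi^{-1}(\alpha) < 0$, which is the standardization step done by hand in the solution.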

Binary hypothesis testing

01 Theory - Binary testing, MAP and ML

Theory 2 - Binary testing, MAP and ML

Binary hypothesis test

Ingredients of a binary hypothesis test:

  • Complementary hypotheses $H_0$ and $H_1$
    • Maybe also know the prior probabilities $P[H_0]$ and $P[H_1]$
    • Goal: determine which case we are in, $H_0$ or $H_1$
  • Decision rule made of complementary events $A_0$ and $A_1$
    • $A_0$ is likely given $H_0$, while $A_1$ is likely given $H_1$
    • Decision rule: outcome $A_0$, accept $H_0$; outcome $A_1$, accept $H_1$
    • Usually: written in terms of a decision statistic $X$ using a design
    • We cover three designs:
      • MAP and ML (minimize ‘error probability’)
      • MC (minimizes ‘error cost’)
    • Designs use $P_{X|H_0}$ and $P_{X|H_1}$ (or $f_{X|H_0}$, $f_{X|H_1}$) to construct $A_0$ and $A_1$

MAP design

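Suppose we know:

  • Both prior probabilities $P[H_0]$ and $P[H_1]$
  • Both conditional distributions $P_{X|H_0}$ and $P_{X|H_1}$ (or $f_{X|H_0}$ and $f_{X|H_1}$)

The maximum a posteriori probability (MAP) design for a decision statistic $X$:

Discrete case:

$$x \in A_0 \iff \frac{P_{X|H_0}(x)}{P_{X|H_1}(x)} \ge \frac{P[H_1]}{P[H_0]}$$

Continuous case:

$$x \in A_0 \iff \frac{f_{X|H_0}(x)}{f_{X|H_1}(x)} \ge \frac{P[H_1]}{P[H_0]}$$

Then $A_1 = A_0^c$.

The MAP design minimizes the total probability of error.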
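
A minimal Python sketch of the discrete MAP design, under the assumption that the two conditional PMFs are given as dictionaries on a finite support. The function name `map_regions` and the example PMFs are hypothetical. The comparison is done in cross-multiplied form, $P_{X|H_0}(x)\,P[H_0] \ge P_{X|H_1}(x)\,P[H_1]$, which avoids dividing by zero when $P_{X|H_1}(x) = 0$.

```python
def map_regions(pmf_h0, pmf_h1, prior_h0):
    """Split the support into (A0, A1) by the discrete MAP rule."""
    prior_h1 = 1.0 - prior_h0
    a0, a1 = set(), set()
    for x in sorted(set(pmf_h0) | set(pmf_h1)):
        weight_h0 = pmf_h0.get(x, 0.0) * prior_h0  # P(x | H0) * P[H0]
        weight_h1 = pmf_h1.get(x, 0.0) * prior_h1  # P(x | H1) * P[H1]
        if weight_h0 >= weight_h1:
            a0.add(x)  # H0 is at least as probable as H1 given X = x
        else:
            a1.add(x)
    return a0, a1


# Hypothetical conditional PMFs on the support {0, 1, 2}.
pmf_h0 = {0: 0.6, 1: 0.3, 2: 0.1}
pmf_h1 = {0: 0.1, 1: 0.3, 2: 0.6}

print(map_regions(pmf_h0, pmf_h1, prior_h0=0.8))  # H0 common: A0 is large
print(map_regions(pmf_h0, pmf_h1, prior_h0=0.2))  # H0 rare: A0 shrinks
```

Changing only the prior moves the boundary between $A_0$ and $A_1$, which is exactly the information that the ML design ignores.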

ML design

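Suppose we know only:

  • Both conditional distributions $P_{X|H_0}$ and $P_{X|H_1}$ (or $f_{X|H_0}$ and $f_{X|H_1}$)

The maximum likelihood (ML) design for $X$:

$$x \in A_0 \iff P_{X|H_0}(x) \ge P_{X|H_1}(x) \qquad \text{(or } f_{X|H_0}(x) \ge f_{X|H_1}(x) \text{ in the continuous case)}$$

ML is a simplified version of MAP. (Set $P[H_0]$ and $P[H_1]$ to $1/2$.)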


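The probability of a false alarm, a Type I error, is called $P_{FA} = P[A_1 \mid H_0]$.

The probability of a miss, a Type II error, is called $P_{MISS} = P[A_0 \mid H_1]$.

Total probability of error:

$$P_{ERR} = P[A_1 \mid H_0]\,P[H_0] + P[A_0 \mid H_1]\,P[H_1]$$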

False alarm $\ne$ false alarm

Suppose $A_1$ sets off a smoke alarm, and $H_0$ is ‘no fire’ and $H_1$ is ‘yes fire’.

Then $P_{FA} = P[A_1 \mid H_0]$ is the probability that we get an alarm assuming there is no fire.

This is not the probability of experiencing a false alarm (no context). That would be $P[A_1 \cap H_0] = P[A_1 \mid H_0]\,P[H_0]$.

This is not the probability that a given alarm is a false one. That would be $P[H_0 \mid A_1]$.
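
The three quantities can differ by orders of magnitude. A minimal Python sketch, with hypothetical values for the false-alarm probability, the miss probability, and the prior probability of fire:

```python
def alarm_probabilities(p_fa, p_miss, prior_h1):
    """Return (P[A1 and H0], P[H0 | A1]) from P_FA, P_MISS and P[H1]."""
    prior_h0 = 1.0 - prior_h1
    joint_false_alarm = p_fa * prior_h0  # P[A1 and H0]
    p_alarm = joint_false_alarm + (1.0 - p_miss) * prior_h1  # P[A1], total probability
    posterior_false = joint_false_alarm / p_alarm  # P[H0 | A1], Bayes' rule
    return joint_false_alarm, posterior_false


# Hypothetical numbers: 5% false-alarm rate, 10% miss rate, fire 1% of the time.
joint, posterior = alarm_probabilities(p_fa=0.05, p_miss=0.10, prior_h1=0.01)
print(joint, posterior)
```

With these assumed inputs the conditional false-alarm probability is 0.05, yet most alarms that actually sound are false, because fire is rare.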

02 Illustration

Example - ML test: Smoke detector

Suppose that a smoke detector sensor is configured to produce when there is smoke, and otherwise. But there is background noise with distribution .

Design an ML test for the detector electronics to decide whether to activate the alarm.

What are the three error probabilities? (Type I, Type II, Total.)

Solution

(1) First, establish the conditional distributions:

Density functions:


(2) The ML condition becomes:


(3) Therefore, is , while is .

The decision rule is: activate alarm when .


(4) Type I error:

Type II error:

Total error:
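
A minimal Python sketch of an ML test of this kind, under an assumed model: the sensor reads $X \sim N(0, \sigma^2)$ without smoke and $X \sim N(v, \sigma^2)$ with smoke, for some $v > 0$. The symbols $v$ and $\sigma$, the numbers below, and the default of equal priors for the total error are hypothetical. For equal-variance Gaussians, $f_{X|H_0}(x) \ge f_{X|H_1}(x)$ reduces to $x \le v/2$.

```python
from statistics import NormalDist


def ml_gaussian_test(v, sigma, prior_h0=0.5):
    """ML threshold and error probabilities for N(0, sigma^2) vs N(v, sigma^2), v > 0."""
    threshold = v / 2  # the two densities cross midway between the means
    p_fa = 1.0 - NormalDist(0.0, sigma).cdf(threshold)  # P[X > threshold | H0]
    p_miss = NormalDist(v, sigma).cdf(threshold)  # P[X <= threshold | H1]
    p_err = p_fa * prior_h0 + p_miss * (1.0 - prior_h0)
    return threshold, p_fa, p_miss, p_err


# Hypothetical signal level and noise standard deviation.
threshold, p_fa, p_miss, p_err = ml_gaussian_test(v=4.0, sigma=1.0)
print(threshold, p_fa, p_miss, p_err)
```

The ML threshold itself does not use the priors; they enter only when the two conditional errors are combined into the total error.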

Example - MAP test: Smoke detector

Suppose that a smoke detector sensor is configured to produce when there is smoke, and otherwise. But there is background noise with distribution .

Suppose that the background chance of smoke is . Design a MAP test for the alarm.

What are the three error probabilities? (Type I, Type II, Total.)

Solution

(1) First, establish priors:

The MAP condition becomes:


(2) Therefore, is , while is .

The decision rule is: activate alarm when .


(3) Type I error:

Type II error:

Total error:
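
For a Gaussian model the MAP threshold has a closed form. The derivation below is a sketch under assumed symbols, not the specific values of this example: the sensor reads $N(0, \sigma^2)$ without smoke ($H_0$) and $N(v, \sigma^2)$ with smoke ($H_1$), with $v > 0$.

```latex
% Assumed model (illustrative symbols):
%   X | H_0 ~ N(0, \sigma^2),   X | H_1 ~ N(v, \sigma^2),   v > 0
\begin{align*}
x \in A_0
  &\iff f_{X|H_0}(x)\,P[H_0] \ge f_{X|H_1}(x)\,P[H_1] \\
  &\iff e^{-x^2/(2\sigma^2)}\,P[H_0] \ge e^{-(x-v)^2/(2\sigma^2)}\,P[H_1] \\
  &\iff (x-v)^2 - x^2 \ge 2\sigma^2 \ln\frac{P[H_1]}{P[H_0]} \\
  &\iff x \le \frac{v}{2} + \frac{\sigma^2}{v}\,\ln\frac{P[H_0]}{P[H_1]}
\end{align*}
```

With equal priors the logarithm vanishes and the threshold is the ML value $v/2$. When smoke is rare, $P[H_0] > P[H_1]$ and the threshold moves up, so the alarm requires stronger evidence.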

03 Theory - MAP criterion proof

Theory 3 - MAP criterion proof

Explanation of MAP criterion - discrete case

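First, we show that the MAP design selects for $A_0$ all those $x$ which render $H_0$ more likely than $H_1$.

Observe this Calculation (Bayes’ rule), for $i = 0, 1$:

$$P[H_i \mid X = x] = \frac{P_{X|H_i}(x)\,P[H_i]}{P_X(x)}$$

Now, take the condition for $A_0$, and cross-multiply:

$$\frac{P_{X|H_0}(x)}{P_{X|H_1}(x)} \ge \frac{P[H_1]}{P[H_0]} \quad\Longrightarrow\quad P_{X|H_0}(x)\,P[H_0] \ge P_{X|H_1}(x)\,P[H_1]$$

Divide both sides by $P_X(x)$ and apply the above Calculation in reverse:

$$P[H_0 \mid X = x] \ge P[H_1 \mid X = x]$$

This is what we sought to prove.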


Next, we verify that the MAP design minimizes the total probability of error.

The total probability of error is:

$$P_{ERR} = P[A_1 \mid H_0]\,P[H_0] + P[A_0 \mid H_1]\,P[H_1]$$

Expand this with summation notation (assuming the discrete case):

$$P_{ERR} = \sum_{x \in A_1} P_{X|H_0}(x)\,P[H_0] + \sum_{x \in A_0} P_{X|H_1}(x)\,P[H_1]$$

Now, how do we choose the set $A_0$ (and thus $A_1 = A_0^c$) in such a way that this sum is minimized?

Since all terms are nonnegative, and any $x$ may be placed in $A_0$ or in $A_1$ freely and independently of all other choices, the total sum is minimized when we minimize the impact of placing each $x$.

So, for each $x$, we place it in $A_0$ if:

$$P_{X|H_1}(x)\,P[H_1] \le P_{X|H_0}(x)\,P[H_0]$$

That is equivalent to the MAP condition.
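
The minimization claim can be checked by brute force on a small example. The Python sketch below assumes hypothetical conditional PMFs on a three-point support and a hypothetical prior; it evaluates the total error of every possible choice of $A_0$ and compares the minimum with the error of the MAP choice.

```python
from itertools import product


def total_error(a0, pmf_h0, pmf_h1, prior_h0):
    """P_ERR for the rule 'accept H0 iff x is in a0' (same support for both PMFs)."""
    prior_h1 = 1.0 - prior_h0
    err = 0.0
    for x in pmf_h0:
        if x in a0:
            err += pmf_h1[x] * prior_h1  # miss: H0 accepted although H1 holds
        else:
            err += pmf_h0[x] * prior_h0  # false alarm: H1 accepted although H0 holds
    return err


# Hypothetical model.
pmf_h0 = {0: 0.6, 1: 0.3, 2: 0.1}
pmf_h1 = {0: 0.1, 1: 0.3, 2: 0.6}
prior_h0 = 0.2
support = sorted(pmf_h0)

# MAP choice of A0, in cross-multiplied form.
map_a0 = {x for x in support if pmf_h0[x] * prior_h0 >= pmf_h1[x] * (1.0 - prior_h0)}
map_error = total_error(map_a0, pmf_h0, pmf_h1, prior_h0)

# Every subset of the support is a candidate A0.
all_errors = []
for mask in product([False, True], repeat=len(support)):
    a0 = {x for x, keep in zip(support, mask) if keep}
    all_errors.append(total_error(a0, pmf_h0, pmf_h1, prior_h0))
best_error = min(all_errors)

print(map_a0, map_error, best_error)
```

The enumeration is exponential in the size of the support, so it serves only as a check of the argument; the MAP rule reaches the same minimum with one comparison per point.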

04 Theory - MC design

Theory 4 - MC design

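  • Write $C_{10}$ for the cost of a false alarm, i.e. the cost when $H_0$ is true but we decided $H_1$.
    • Probability of incurring cost $C_{10}$ is $P[A_1 \mid H_0]\,P[H_0]$.
  • Write $C_{01}$ for the cost of a miss, i.e. the cost when $H_1$ is true but we decided $H_0$.
    • Probability of incurring cost $C_{01}$ is $P[A_0 \mid H_1]\,P[H_1]$.

Expected value of cost incurred:

$$E[C] = C_{10}\,P[A_1 \mid H_0]\,P[H_0] + C_{01}\,P[A_0 \mid H_1]\,P[H_1]$$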

MC design

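Suppose we know:

  • Both prior probabilities $P[H_0]$ and $P[H_1]$
  • Both conditional distributions $P_{X|H_0}$ and $P_{X|H_1}$ (or $f_{X|H_0}$ and $f_{X|H_1}$)

The minimum cost (MC) design for a decision statistic $X$:

Discrete case:

$$x \in A_0 \iff \frac{P_{X|H_0}(x)}{P_{X|H_1}(x)} \ge \frac{C_{01}\,P[H_1]}{C_{10}\,P[H_0]}$$

Continuous case:

$$x \in A_0 \iff \frac{f_{X|H_0}(x)}{f_{X|H_1}(x)} \ge \frac{C_{01}\,P[H_1]}{C_{10}\,P[H_0]}$$

Then $A_1 = A_0^c$.

The MC design minimizes the expected value of the cost of error.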

MC minimizes expected cost

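Inside the argument that MAP minimizes total probability of error, we have this summation:

$$P_{ERR} = \sum_{x \in A_1} P_{X|H_0}(x)\,P[H_0] + \sum_{x \in A_0} P_{X|H_1}(x)\,P[H_1]$$

The expected value of the cost has a similar summation:

$$E[C] = \sum_{x \in A_1} C_{10}\,P_{X|H_0}(x)\,P[H_0] + \sum_{x \in A_0} C_{01}\,P_{X|H_1}(x)\,P[H_1]$$

Following the same reasoning, we see that the cost is minimized if each $x$ is placed into $A_0$ precisely when the MC design condition is satisfied, and otherwise it is placed into $A_1$.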

05 Illustration

Example - MC Test: Smoke detector

Suppose that a smoke detector sensor is configured to produce when there is smoke, and otherwise. But there is background noise with distribution .

Suppose that the background chance of smoke is . Suppose the cost of a miss is the cost of a false alarm. Design an MC test for the alarm.

Compute the expected cost.

Solution

(1) We have priors:

And we have costs:

(The ratio of these numbers is all that matters in the inequalities of the condition.)

The MC condition becomes:


(2) Therefore, is , while is .

The decision rule is: activate alarm when .


(3) Type I error:

Type II error:

Total error:


(4) PMF of total cost:

Therefore .
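
A minimal Python sketch of an MC test for a Gaussian model, under assumptions that are not taken from this example: the sensor reads $N(0, \sigma^2)$ without smoke and $N(v, \sigma^2)$ with smoke ($v > 0$), and `cost_fa`, `cost_miss` are the costs of a false alarm and of a miss. Weighting each side of the likelihood comparison by its cost and prior gives the threshold $x \le \frac{v}{2} + \frac{\sigma^2}{v} \ln \frac{\text{cost\_fa} \cdot P[H_0]}{\text{cost\_miss} \cdot P[H_1]}$ for accepting $H_0$. All numbers are hypothetical.

```python
from math import log
from statistics import NormalDist


def mc_gaussian_test(v, sigma, prior_h1, cost_fa, cost_miss):
    """MC threshold, error probabilities and expected cost for N(0, s^2) vs N(v, s^2)."""
    prior_h0 = 1.0 - prior_h1
    # Accept H0 iff x <= threshold; activate the alarm otherwise.
    threshold = v / 2 + (sigma**2 / v) * log((cost_fa * prior_h0) / (cost_miss * prior_h1))
    p_fa = 1.0 - NormalDist(0.0, sigma).cdf(threshold)  # P[A1 | H0]
    p_miss = NormalDist(v, sigma).cdf(threshold)  # P[A0 | H1]
    expected_cost = cost_fa * p_fa * prior_h0 + cost_miss * p_miss * prior_h1
    return threshold, p_fa, p_miss, expected_cost


# Hypothetical inputs: a miss is assumed 50 times as costly as a false alarm.
v, sigma, prior_h1, cost_fa, cost_miss = 4.0, 1.0, 0.1, 1.0, 50.0
threshold, p_fa, p_miss, expected_cost = mc_gaussian_test(v, sigma, prior_h1, cost_fa, cost_miss)
print(threshold, p_fa, p_miss, expected_cost)
```

Only the ratio of the two costs enters the threshold. A costly miss pulls the threshold down, trading more false alarms for fewer misses.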
