Theory 1 - Significance testing

Significance test

Ingredients of a significance test (unary hypothesis test):

  • H0 — Null hypothesis event
    • Identify a Claim
    • Then H0 is the background assumption (what we suppose while Claim is not yet established)
    • Goal is to invalidate H0 in favor of Claim
  • R — Rejection Region event (decision rule)
    • R is written in terms of decision statistic X and significance level α
    • R is unlikely assuming H0, and more likely if Claim holds
  • P[R|H0] — Able to compute this
    • Usually inferred from f_{X|H0} or P_{X|H0}
    • Adjust R to achieve P[R|H0]=α

Significance level

Suppose we are given a null hypothesis H0 and a rejection region R.

The significance level of R is:

α=P[R|H0]=P[reject H0|H0 is true]

Sometimes the condition is dropped and we write α=P[R], for example when no background model is known apart from the one that assumes H0.
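As a minimal sketch of this definition, suppose (hypothetically; none of these choices come from the notes) that X counts heads in n = 10 coin flips, that H0 says the coin is fair, and that R is the event X ≥ 9. The significance level is then the total H0-probability of the outcomes in R:

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P[X = k] when X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def significance_level(in_region, n: int, p0: float) -> float:
    """alpha = P[R | H0]: sum the H0 probabilities of every outcome in R."""
    return sum(binomial_pmf(k, n, p0) for k in range(n + 1) if in_region(k))

# Assumed example: H0 is "fair coin" (p0 = 0.5), R is "at least 9 heads in 10 flips".
alpha = significance_level(lambda k: k >= 9, n=10, p0=0.5)
print(alpha)  # 0.0107421875 (= 11/1024)
```

Passing the region as a predicate keeps the definition α=P[R|H0] visible: any other decision rule can be swapped in without changing the computation.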

Null hypothesis implies a distribution

Usually the sample space S is unspecified, yet H0 still determines a known distribution for X.

In this case H0 will not take the form of an event in a sample space, H0 ⊂ S.

At a minimum, H0 must determine P[R|H0].

We do NOT need these details:

  • Background sample space S
  • Non-conditional distribution (full model): f_X or P_X
  • Complement conditionals: f_{X|H0^c} or P_{X|H0^c}
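To illustrate that the conditional distribution under H0 is enough, here is a small sketch assuming that H0 makes X standard normal and that R is the event X ≥ 2. Both choices are hypothetical. Note that no sample space, full model, or complement conditional appears anywhere in the computation:

```python
from statistics import NormalDist

# Assumed for illustration: under H0 the statistic X is Normal(0, 1).
dist_given_h0 = NormalDist(mu=0.0, sigma=1.0)

# Assumed rejection region: R = {X >= 2}.
r = 2.0

# P[R | H0] comes from the H0 distribution alone.
level = 1 - dist_given_h0.cdf(r)
print(round(level, 4))  # 0.0228
```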

In basic statistical inference theory, there are two kinds of error.

  • A Type I error is rejecting H0 when H0 is true.
  • A Type II error is maintaining H0 when H0 is false.

Type I error is usually a bigger problem. We want to consider H0 as “innocent until proven guilty.”

                           H0 is true                       H0 is false
Maintain null hypothesis   Made right call                  Wrong acceptance (Type II error)
Reject null hypothesis     Wrong rejection (Type I error)   Made right call
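The two error rates can be computed side by side in a small sketch. The Type I rate needs only H0, but the Type II rate needs a specific model for "H0 is false". Here that model is a hypothetical alternative (heads probability 0.8) chosen purely for illustration, as are the coin-flip setup and the threshold:

```python
from math import comb

def upper_tail(r: int, n: int, p: float) -> float:
    """P[X >= r] when X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(r, n + 1))

n, r = 10, 9   # assumed rule: reject H0 when at least 9 of 10 flips are heads
p_null = 0.5   # H0: fair coin
p_alt = 0.8    # hypothetical alternative, needed only for the Type II rate

type_1_rate = upper_tail(r, n, p_null)     # P[reject H0 | H0 is true]
type_2_rate = 1 - upper_tail(r, n, p_alt)  # P[maintain H0 | p = 0.8]

print(round(type_1_rate, 4), round(type_2_rate, 4))  # 0.0107 0.6242
```

With these assumed numbers the rule rarely rejects a true H0, at the cost of often maintaining H0 when the coin is in fact biased.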

To design a significance test at α, we must identify H0, and specify R with the property that P[R|H0]=α.
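A sketch of this design step for a discrete statistic, assuming (hypothetically) that X counts heads in n = 20 flips and H0 is a fair coin. Because X is discrete, P[R|H0] = α usually cannot be hit exactly, so this sketch adopts the convention of taking the smallest threshold r whose level does not exceed α. That convention is an assumption here, not something stated in the notes:

```python
from math import comb

def upper_tail(r: int, n: int, p0: float) -> float:
    """P[X >= r | H0] when X ~ Binomial(n, p0) under H0."""
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(r, n + 1))

def design_right_tail(n: int, p0: float, alpha: float):
    """Smallest r with P[X >= r | H0] <= alpha, together with the achieved level."""
    for r in range(n + 2):  # r = n + 1 gives the empty region, whose level is 0
        level = upper_tail(r, n, p0)
        if level <= alpha:
            return r, level

# Assumed inputs: 20 flips, fair coin under H0, target alpha = 0.05.
r, achieved = design_right_tail(n=20, p0=0.5, alpha=0.05)
print(r, round(achieved, 4))  # 15 0.0207
```

Lowering the threshold to r = 14 would give a level of about 0.0577, which overshoots the target.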

When R is written using a variable X, we must choose between:

  • One-tail rejection region: x with R(x) ≤ r or x with R(x) ≥ r
  • Two-tail rejection region: x with |R(x) − μ| ≥ c
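The three region shapes can be compared in a short sketch, assuming that X is normal with mean μ and standard deviation σ under H0 and that the rejection condition is applied directly to the observed value x. The numbers μ = 0, σ = 1, α = 0.05 are hypothetical. Each threshold is chosen so that the region has probability exactly α under H0:

```python
from statistics import NormalDist

mu, sigma, alpha = 0.0, 1.0, 0.05  # assumed values for illustration
h0 = NormalDist(mu, sigma)         # distribution of X under H0

# One-tail regions: all of alpha sits in a single tail.
r_left = h0.inv_cdf(alpha)         # reject when x <= r_left
r_right = h0.inv_cdf(1 - alpha)    # reject when x >= r_right

# Two-tail region: alpha is split evenly between the two tails.
c = h0.inv_cdf(1 - alpha / 2) - mu  # reject when |x - mu| >= c

# Check that each region has level alpha under H0.
level_left = h0.cdf(r_left)
level_right = 1 - h0.cdf(r_right)
level_two = h0.cdf(mu - c) + (1 - h0.cdf(mu + c))

print(round(r_left, 3), round(r_right, 3), round(c, 3))  # -1.645 1.645 1.96
```

At the same α, the two-tail cutoff lies farther from μ than either one-tail cutoff, because each tail receives only α/2.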