Statistical testing cont’d
01
Significance test: Testing a coin by flipping until heads
Design a significance test to test the hypothesis that a given coin is fair. You think it may be biased towards tails.
Your test runs the following experiment: flip the coin repeatedly until the first heads comes up. Let be the flip number of the first heads. This is your decision statistic. Your test should have significance level . Which of these coins would pass your test?
- Two-headed coin
- Two-tailed coin
- Both
- Neither
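As a rough illustration of how a rejection region for this experiment could be computed: under the fair-coin hypothesis the flip number of the first heads is geometric, so the chance that it is at least c equals (1/2)^(c-1), because the first c-1 flips must all be tails. The sketch below assumes a one-sided test that rejects when the first heads arrives late, which fits a suspected tails bias. The function names and the value `ALPHA = 0.05` are illustrative assumptions, not values taken from the problem.

```python
def tail_probability(c, p_heads=0.5):
    """P(first heads occurs on flip c or later) when P(heads) = p_heads."""
    # The first c-1 flips must all be tails.
    return (1.0 - p_heads) ** (c - 1)


def rejection_threshold(alpha, p_heads=0.5):
    """Smallest c such that P(first heads on flip >= c | H0) <= alpha."""
    c = 1
    while tail_probability(c, p_heads) > alpha:
        c += 1
    return c


ALPHA = 0.05  # hypothetical significance level, chosen only for illustration
c = rejection_threshold(ALPHA)
print("reject the fair-coin hypothesis if the first heads is on flip", c, "or later")
print("actual significance level of that region:", tail_probability(c))
```

A coin passes the test when its observed flip number stays below the threshold, so it is worth asking what each of the listed coins would produce in this experiment.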
02
Significance test: Valves at various temperatures
The lifetime of a certain fuel injection valve is known to follow an exponential distribution, , where in failures per year and is the ambient temperature in degrees Celsius. Sometimes the valves fail a good deal more frequently than usual, possibly due to cracked gaskets used in construction. To detect failures from cracked gaskets, each day valves are monitored in use at for the full day and the number that fail is recorded.
(a) Suppose a significance test is designed such that it rejects the hypothesis “normal valves, no cracked gaskets” when one or more valves fail. What is the significance level of this test, as a function of ?
(b) How many valves would have to be tested every day at in order to achieve a significance level of ? (Find .)
(c) Is (to achieve ) increasing, decreasing, or constant with increasing test temperature?
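One possible way to set up the calculation for a test that rejects when at least one valve fails: if lifetimes are independent and exponential with a given yearly rate, each valve survives one day with probability exp(-rate/365), so the chance that at least one of n valves fails is 1 - exp(-n·rate/365). The sketch below takes the failure rate as a plain input rather than deriving it from temperature. The 365-day year, the function names, and the sample numbers are assumptions made for illustration.

```python
import math

DAYS_PER_YEAR = 365.0  # assumption: the yearly rate is converted with a 365-day year


def significance_level(n_valves, rate_per_year, days=1.0):
    """P(at least one of n independent normal valves fails within `days`)."""
    t_years = days / DAYS_PER_YEAR
    return 1.0 - math.exp(-n_valves * rate_per_year * t_years)


def valves_for_significance(alpha, rate_per_year, days=1.0):
    """Real-valued n solving 1 - exp(-n * rate * t) = alpha (round as needed)."""
    t_years = days / DAYS_PER_YEAR
    return -math.log(1.0 - alpha) / (rate_per_year * t_years)


# Hypothetical inputs, not taken from the problem statement.
rate = 2.0    # failures per year for a normal valve
alpha = 0.05  # target significance level
n = valves_for_significance(alpha, rate)
print("valves needed (before rounding):", n)
print("check:", significance_level(n, rate))
```

Because the required n is inversely proportional to the rate, evaluating `valves_for_significance` at the rates for several temperatures shows how the answer to the last part behaves.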
03
Significance test: Blue eyes
A redditor claims that 10% of people have blue eyes, but you think it is not that many. You work at the DMV for the summer, so you write down the eye color recorded on drivers’ licenses of various people in the database.
(a) Suppose you record the eye color of 1000 people and let be the number with blue eyes. If the rejection region is , what is the significance level of the test?
(b) Take again the experiment in (a). If you want a significance level of , what should the rejection region be in your test?
(c) Suppose that in fact 7% of people have blue eyes. How likely is it that your test in (b) rejects $H_0$?
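A minimal sketch of the binomial calculations behind this kind of test, assuming a one-sided rejection region of the form "count at most c" (you suspect fewer than 10% have blue eyes). The sample size 1000 and the proportions 10% and 7% come from the problem. The significance level `ALPHA = 0.05` and all function names are assumptions for illustration only.

```python
import math


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), computed in log space to avoid overflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)


def binom_cdf(c, n, p):
    """P(X <= c); this is the significance level of the region {X <= c} when p = p0."""
    return sum(binom_pmf(k, n, p) for k in range(0, c + 1))


def largest_threshold(alpha, n, p):
    """Largest c with P(X <= c) <= alpha, or -1 if even {X = 0} is too likely."""
    total = 0.0
    for k in range(n + 1):
        total += binom_pmf(k, n, p)
        if total > alpha:
            return k - 1
    return n


N, P0, P_TRUE = 1000, 0.10, 0.07  # values from the problem
ALPHA = 0.05                      # hypothetical significance level

c = largest_threshold(ALPHA, N, P0)
print("rejection region: count <=", c)
print("actual significance level:", binom_cdf(c, N, P0))
print("probability of rejecting when p = 0.07:", binom_cdf(c, N, P_TRUE))
```

The last line is the power of the test at 7%; a normal approximation would give similar figures, but the exact sum is cheap at this sample size.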
04
Binary hypothesis test: Identifying Uranium
You are testing gram samples of pure Uranium to see if they are enriched. You have a Geiger counter that counts a number of gamma rays that come from nearby fission events in 1 second intervals after you press the count button.
If the sample is enriched, you expect a Poisson distribution of gamma rays in the counter with an average of 20. If the sample is not enriched (the null hypothesis), the average count will be 10.
(a) Design a maximum likelihood (ML) test to decide whether it is ordinary ($H_0$) or enriched ($H_1$). What is ? What are the probabilities of Type I, Type II, and Total error?
(b) After running the test many times, you have noticed that 70% of the samples are ordinary, while 30% are enriched. Now design a maximum a posteriori (MAP) test. What is ? What are the probabilities of Type I, Type II, and Total error?
(c) Missing a bit of enriched Uranium is obviously a major problem. The damage to your reputation and pocketbook of missing enriched Uranium is the damage caused by incorrectly labeling ordinary Uranium as enriched. Now design a minimum cost (MC) test. What is ? What are the probabilities of Type I, Type II, and Total error?
(d) What is the expected cost of each application of the MC test, assuming the cost of a false alarm is $10,000? What is this number for the MAP test?
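The three designs differ only in the likelihood-ratio threshold, which can be sketched as follows. With Poisson means 10 (ordinary) and 20 (enriched), the ratio P(k | enriched) / P(k | ordinary) grows with the count k, so each test reduces to "decide enriched when the count is at least some k*". The means, the 70/30 split, and the $10,000 false-alarm cost come from the problem. The cost ratio `MISS_COST_RATIO`, the equal priors used to report a total error for the ML test, and all function names are assumptions for illustration.

```python
import math

LAM0, LAM1 = 10.0, 20.0  # mean counts: ordinary (null) and enriched


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), computed in log space."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def poisson_cdf(k, lam):
    """P(X <= k); zero when k is negative."""
    return sum(poisson_pmf(j, lam) for j in range(k + 1)) if k >= 0 else 0.0


def count_threshold(eta):
    """Smallest count k whose likelihood ratio P(k|enriched)/P(k|ordinary) is >= eta."""
    k = 0
    while poisson_pmf(k, LAM1) / poisson_pmf(k, LAM0) < eta:
        k += 1
    return k


def error_probabilities(k_star, prior0):
    """Type I, Type II, and prior-weighted total error for 'enriched if count >= k_star'."""
    type1 = 1.0 - poisson_cdf(k_star - 1, LAM0)  # false alarm
    type2 = poisson_cdf(k_star - 1, LAM1)        # miss
    total = prior0 * type1 + (1.0 - prior0) * type2
    return type1, type2, total


def expected_cost(k_star, prior0, cost_false_alarm, cost_miss):
    """Average cost per application of the test."""
    type1, type2, _ = error_probabilities(k_star, prior0)
    return prior0 * type1 * cost_false_alarm + (1.0 - prior0) * type2 * cost_miss


PRIOR0 = 0.7                 # fraction of ordinary samples, from the problem
COST_FALSE_ALARM = 10_000.0  # from the problem
MISS_COST_RATIO = 5.0        # hypothetical: cost of a miss relative to a false alarm
COST_MISS = MISS_COST_RATIO * COST_FALSE_ALARM

k_ml = count_threshold(1.0)
k_map = count_threshold(PRIOR0 / (1.0 - PRIOR0))
k_mc = count_threshold(PRIOR0 * COST_FALSE_ALARM / ((1.0 - PRIOR0) * COST_MISS))

print("ML :", k_ml, error_probabilities(k_ml, 0.5))  # equal priors assumed here
print("MAP:", k_map, error_probabilities(k_map, PRIOR0))
print("MC :", k_mc, error_probabilities(k_mc, PRIOR0))
print("expected cost, MC :", expected_cost(k_mc, PRIOR0, COST_FALSE_ALARM, COST_MISS))
print("expected cost, MAP:", expected_cost(k_map, PRIOR0, COST_FALSE_ALARM, COST_MISS))
```

Raising the cost of a miss lowers the threshold, trading more false alarms for fewer misses, which is why the MC test can have a larger total error yet a smaller expected cost than the MAP test.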
05
Binary hypothesis test: Light bulbs
Light bulbs from box (the null hypothesis) typically last , and bulbs from box last . You have some bulbs but don’t know which box they came from. Bulb lifetimes are exponential. It costs $50 in processing if you mistakenly assign a bulb to box , and $20 if you assign an bulb to box . After working at this for a while, you observed that 60% of the bulbs you see come from box , and the rest from box . Design a binary hypothesis test using MC design to make a decision rule to assign bulbs to boxes.
(a) What is ?
(b) What are the probabilities of Type I, Type II, and Total error?
(c) What is the expected cost for each application of the test?
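A hedged sketch of how a minimum-cost threshold for exponential lifetimes could be worked out. For exponential densities with means m0 (null) and m1, the likelihood ratio is (m0/m1)·exp(x(1/m0 - 1/m1)), which increases in the observed lifetime x when m1 > m0, so the rule becomes "assign to the alternative box when x is at least x*". The mean lifetimes below are hypothetical, and the pairing of the $50 and $20 costs and the 60% prior with particular boxes is an assumption, as are all function names.

```python
import math


def likelihood_ratio(x, mean0, mean1):
    """f(x | alternative) / f(x | null) for exponential lifetimes."""
    return (mean0 / mean1) * math.exp(x * (1.0 / mean0 - 1.0 / mean1))


def mc_threshold(mean0, mean1, prior0, cost_false_alarm, cost_miss):
    """Lifetime x* at which the likelihood ratio reaches the minimum-cost threshold."""
    if mean1 <= mean0:
        raise ValueError("this sketch assumes mean1 > mean0")
    eta = (prior0 * cost_false_alarm) / ((1.0 - prior0) * cost_miss)
    x_star = math.log(eta * mean1 / mean0) / (1.0 / mean0 - 1.0 / mean1)
    return max(0.0, x_star), eta


def errors_and_cost(x_star, mean0, mean1, prior0, cost_false_alarm, cost_miss):
    """Type I, Type II, total error, and expected cost for 'alternative if x >= x*'."""
    type1 = math.exp(-x_star / mean0)        # null bulb lasts past x*
    type2 = 1.0 - math.exp(-x_star / mean1)  # alternative bulb dies before x*
    total = prior0 * type1 + (1.0 - prior0) * type2
    cost = prior0 * type1 * cost_false_alarm + (1.0 - prior0) * type2 * cost_miss
    return type1, type2, total, cost


# Hypothetical means (hours); cost and prior assignments are assumed, see above.
MEAN0, MEAN1 = 1000.0, 2000.0
PRIOR0, COST_FA, COST_MISS = 0.6, 50.0, 20.0

x_star, eta = mc_threshold(MEAN0, MEAN1, PRIOR0, COST_FA, COST_MISS)
print("threshold lifetime:", x_star)
print(errors_and_cost(x_star, MEAN0, MEAN1, PRIOR0, COST_FA, COST_MISS))
```

If the alternative box holds the shorter-lived bulbs instead, the inequality flips and the same derivation applies with the roles of the two means exchanged.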