4.5.3 Power and Significance Level

In our example, we sought to limit to .05 the probability, conditional on H0 being valid, of rejecting H0. Rejecting a valid null hypothesis is known as a type I error. If we design a test so that the conditional probability of a type I error does not exceed ε, we say the test has significance level ε.
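The type I error probability of an interval test can be checked numerically. The following is a minimal sketch, not the text's own method. It assumes the coin tossing example involved 100 independent tosses and a null hypothesis of a fair coin, neither of which is restated in this section. The names `binom_pmf`, `rejection_probability` and `N_TOSSES` are hypothetical.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k heads in n independent tosses with P(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def rejection_probability(lo: int, hi: int, n: int, p: float) -> float:
    """Probability that the number of heads falls outside the closed interval [lo, hi]."""
    inside = sum(binom_pmf(k, n, p) for k in range(lo, hi + 1))
    return 1.0 - inside


# Assumption: 100 tosses, and the null hypothesis is a fair coin (p = 0.5).
N_TOSSES = 100

# Evaluated at the null value of p, the rejection probability is the
# probability of a type I error for the non-rejection region [41, 60].
type_i_error = rejection_probability(41, 60, N_TOSSES, 0.5)
print(f"P(type I error) = {type_i_error:.4f}")
print("within the .05 significance level:", type_i_error <= 0.05)
```

Under these assumptions, the test has significance level .05 exactly when the printed probability does not exceed .05.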

A second form of error, called type II error, occurs when a test fails to reject an invalid null hypothesis. In our coin tossing example, our test design focused on avoiding type I error, but it could have focused on avoiding type II error.

The power of a test is the probability that it rejects H0. That probability depends, of course, on the unknown parameter θ: significance level is a constant, whereas power is a function of θ. Despite this difference, significance level and power are competing priorities. At θ = θ0, we want low power, because we don’t want to reject a valid null hypothesis. Because the significance level is a cap on power at θ = θ0, we desire a low significance level. At θ ≠ θ0, however, we want high power, and a test designed to have low power at θ = θ0 tends to have low power at nearby values θ ≠ θ0 as well. Hence, the desirable property of a low significance level tends to be incompatible with the desirable property of high power when θ ≠ θ0.
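To make the trade-off concrete, the sketch below evaluates power as a function of θ for two interval tests. It assumes 100 tosses with θ the probability of heads and θ0 = 0.5. The region [41, 60] is the one from the coin tossing example. The wider region [35, 65] is hypothetical, chosen only for illustration, and the function name `power` is likewise an assumption.

```python
from math import comb


def power(theta: float, lo: int, hi: int, n: int) -> float:
    """Probability of rejecting H0 (head count outside [lo, hi]) when P(heads) = theta."""
    not_rejected = sum(
        comb(n, k) * theta**k * (1 - theta) ** (n - k) for k in range(lo, hi + 1)
    )
    return 1.0 - not_rejected


N_TOSSES = 100      # assumption: number of tosses in the example
NARROW = (41, 60)   # non-rejection region from the coin tossing example
WIDE = (35, 65)     # hypothetical wider region, for illustration only

# At theta = 0.50 the power is the type I error probability; elsewhere,
# one minus the power is the type II error probability.
for theta in (0.50, 0.45, 0.35):
    p_narrow = power(theta, NARROW[0], NARROW[1], N_TOSSES)
    p_wide = power(theta, WIDE[0], WIDE[1], N_TOSSES)
    print(f"theta = {theta:.2f}   narrow: {p_narrow:.4f}   wide: {p_wide:.4f}")
```

Because the wider region contains the narrower one, it rejects less often at every θ. Widening the region lowers the type I error probability at θ0, but it also lowers power at every other θ.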

Exercises
4.6

Compare the two questions

  • “How probable is it that H0 is true given the data we obtained?” and
  • “How probable would it have been for us to obtain the data we did, assuming H0 is true?”

Are the two questions equivalent, or are they different questions with potentially different answers? Explain your reasoning.

4.7

In our coin tossing example, we constructed the closed interval [41, 60] as a .05 significance level non-rejection region. Construct a closed interval that is a .01 significance level non-rejection region. Is your solution unique?

4.8

In our coin tossing example and in the previous exercise, we specified tests at the .05 and .01 significance levels. What is the power of these tests assuming a fair coin? What is their power assuming the probability p of heads is 0.48? What is their power assuming p is 0.40?