4.5.1 Test Statistics
Let X be a random variable whose distribution depends on an unknown parameter θ. We wish to test the null hypothesis θ = θ0. We will empirically gather data {x[1], x[2], … , x[m]}, which we shall treat as a realization of a sample {X[1], X[2], … , X[m]}. A test statistic is a function of the sample, S(X[1], X[2], … , X[m]). Depending on its realized value s(x[1], x[2], … , x[m]), we will reject or not reject the null hypothesis. A hypothesis test is a test statistic S together with a rule indicating for which realizations s the null hypothesis is rejected. The set of realizations for which the null hypothesis is not rejected is called the non-rejection region.
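The structure just described, a statistic paired with a rejection rule, can be sketched in a few lines of Python. This is only an illustration: the names run_test, statistic, and in_non_rejection_region are hypothetical, and the region used in the usage lines is an arbitrary placeholder rather than one derived from any distribution.

```python
from typing import Callable, Sequence


def run_test(
    data: Sequence[float],
    statistic: Callable[[Sequence[float]], float],
    in_non_rejection_region: Callable[[float], bool],
) -> bool:
    """Return True if the null hypothesis is rejected for this data."""
    s = statistic(data)  # realized value of the test statistic
    return not in_non_rejection_region(s)


# Hypothetical usage: the statistic is the sum of the observations and the
# (arbitrary, illustrative) non-rejection region is the interval [3, 7].
rejected = run_test([1, 0, 1, 1, 0, 1, 0, 0, 1, 0], sum, lambda s: 3 <= s <= 7)
```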
4.5.2 Example: Coin Tossing
We wish to determine if a particular coin is fair. Let X be the Bernoulli random variable for the result of a single toss of the coin, taking the value 1 if the coin comes up heads and 0 if it comes up tails. Accordingly, X ~ B(1, p) and our null hypothesis is that p = 0.5.
We will toss the coin a hundred times and observe the number of heads, so S = X[1] + X[2] + … + X[100]. Assuming the null hypothesis is true (and the tosses are independent), S ~ B(100, 0.5), which we can use to calculate probabilities. There are many ways we might specify a non-rejection region for our test. One approach is to construct it so that, assuming the null hypothesis is true, there is no more than a 0.05 probability of inadvertently rejecting it. Based on the B(100, 0.5) CDF, we determine that the closed interval [41, 60] is such a region. Other intervals also satisfy this criterion. We select this one because it is one of the shortest such intervals and is nearly centered about 50, which is the expected number of heads in a hundred tosses, assuming the null hypothesis. Our test is then to toss the coin a hundred times, observe the number s of heads, and reject the hypothesis that the coin is fair if s < 41 or if s > 60.
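As a check on the interval [41, 60], the following minimal sketch computes the relevant B(100, 0.5) probabilities directly from the binomial probability mass function, using only the Python standard library. The function names binom_pmf, prob_in_interval, and reject_fair_coin are illustrative choices, not part of the text above.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_in_interval(lo: int, hi: int, n: int = 100, p: float = 0.5) -> float:
    """Probability that a B(n, p) random variable lies in the closed interval [lo, hi]."""
    return sum(binom_pmf(k, n, p) for k in range(lo, hi + 1))


def reject_fair_coin(s: int, lo: int = 41, hi: int = 60) -> bool:
    """Apply the test: reject fairness if the head count s falls outside [lo, hi]."""
    return s < lo or s > hi


# Probability of landing in the non-rejection region when the coin is fair,
# and the complementary probability of rejecting a fair coin.
p_non_reject = prob_in_interval(41, 60)
p_false_reject = 1 - p_non_reject
print(f"P(41 <= S <= 60) = {p_non_reject:.4f}")
print(f"P(reject | fair) = {p_false_reject:.4f}")
```

Under these assumptions the probability of rejecting a fair coin comes out at roughly 0.046, which is within the 0.05 limit. Calling prob_in_interval with other endpoints shows which alternative intervals also meet the criterion.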