The centrality of experimental design, part 4

Part 1 is here, part 2 is here, and part 3 is here. It is now time to look at null hypothesis testing.

We want to test hypotheses about the set of true scores, \{\tau_1, \tau_2, \ldots, \tau_{|\mathcal{W}|}\}, or equivalently \{\beta_1, \ldots, \beta_{|\mathcal{W}|} \}. In particular, we wish to test whether there are no differences between the scores of the wines. More formally, we wish to test the null hypothesis

H_0: \beta_1 = \beta_2 = \ldots = \beta_{|\mathcal{W}|} = 0

Figure 3 shows the beginnings of such an inference for the data in Figure 2. That these distributions overlap so little suggests that, with only four scores per wine, we are extremely likely to correctly reject H_0 whenever at least one pair of \{\beta_1, \ldots, \beta_{|\mathcal{W}|}\} differs by this much. We would like to do this more formally, however.

Returning to the sum of the squared deviations of our data, we can decompose it into the sum of our squared regression errors and the squared deviations of our model predictions from the data mean:

\sum_{w\in\mathcal{W}} \sum_{n=1}^N (Y_{wn} - \hat \beta_0)^2 = \sum_{w\in\mathcal{W}} \sum_{n=1}^N (Y_{wn} - \hat Y_{wn})^2 + \sum_{w\in\mathcal{W}} \sum_{n=1}^N (\hat Y_{wn} - \hat\beta_0)^2

= \sum_{w\in\mathcal{W}} \sum_{n=1}^N (Y_{wn} - \hat \beta_0 - \hat \beta_w)^2 + \sum_{w\in\mathcal{W}} N\hat\beta_w^2.

Notice that all of these terms are proportional to variances. The term on the left is proportional to the variance of our data about the “grand mean.” The first term on the right is proportional to the variance of our data about the sample mean of each wine (the “within-group” variance), and the last term is proportional to the variance of our model predictions about the grand mean (the “between-group” variance). The expectations of the terms on the right are:
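As a sanity check, the decomposition can be verified numerically. The following sketch uses a simulated 4×4 array of scores (wines by replicates), purely for illustration and not the data of Figure 2:

```python
import numpy as np

# Simulated scores for illustration only: rows are wines w, columns are replicates n.
rng = np.random.default_rng(0)
Y = rng.normal(loc=3.0, scale=0.5, size=(4, 4))

grand_mean = Y.mean()                        # estimate of beta_0
wine_means = Y.mean(axis=1, keepdims=True)   # estimates of beta_0 + beta_w

ss_total = ((Y - grand_mean) ** 2).sum()                            # left-hand side
ss_within = ((Y - wine_means) ** 2).sum()                           # within-group sum of squares
ss_between = (Y.shape[1] * (wine_means - grand_mean) ** 2).sum()    # N * sum of beta_w_hat^2

print(ss_total, ss_within + ss_between)      # the two agree up to rounding error
```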

E\left[ \sum_{w\in\mathcal{W}}\sum_{n=1}^N (Y_{wn} - \hat \beta_0 - \hat \beta_w)^2\right ] = \frac{(N-1)}{N}\sum_{w\in\mathcal{W}} \sum_{n=1}^N ( \textrm{Var}(Z_{wn}) + E[Z_{wn}]^2 )

- \frac{1}{N}\sum_{w\in\mathcal{W}} \sum_{n=1}^N\mathop{\sum_{m=1}^N}_{m\ne n} ( \textrm{Cov}(Z_{wn}, Z_{wm}) + E[Z_{wn}]E[Z_{wm}])

E\left[ \sum_{w\in\mathcal{W}} N\hat\beta_w^2\right ] = \sum_{w\in\mathcal{W}} N\beta_w^2 + 2 \sum_{w\in\mathcal{W}}\sum_{n=1}^N \beta_w E[Z_{wn}]

+ \frac{(|\mathcal{W}|-1)}{N|\mathcal{W}|} \sum_{w\in\mathcal{W}} \sum_{n,m=1}^N (\textrm{Cov}(Z_{wm}, Z_{wn}) + E[Z_{wn}]E[Z_{wm}] )

-\frac{1}{N|\mathcal{W}|} \sum_{w\in\mathcal{W}} \sum_{v\in\mathcal{W}\backslash \{w\}} \sum_{n,m=1}^N (\textrm{Cov}(Z_{wn}, Z_{vm}) + E[Z_{wn}]E[Z_{vm}]).

If, for all wines and scores, the Z_{wn} are iid with zero mean and variance \sigma^2, then the above become

E\left[ \frac{1}{|\mathcal{W}|-1}\sum_{w\in\mathcal{W}}N\hat\beta_w^2\right ] = \sigma^2 + \sum_{w\in\mathcal{W}}N\beta_w^2/(|\mathcal{W}|-1)

E\left[ \frac{1}{(N-1)|\mathcal{W}|}\sum_{w\in\mathcal{W}}\sum_{n=1}^N (Y_{wn} - \hat \beta_0 - \hat \beta_w)^2\right ] = \sigma^2.
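These two expectations can be spot-checked with a small Monte Carlo simulation. The sketch below uses illustrative values for the true effects \beta_w, the baseline \beta_0, and the noise variance; none of these numbers come from the post's data:

```python
import numpy as np

rng = np.random.default_rng(1)
n_wines, N, sigma2 = 4, 4, 0.1
beta = np.array([-0.3, -0.1, 0.1, 0.3])   # illustrative true wine effects, summing to zero

msb_vals, msw_vals = [], []
for _ in range(20000):
    Z = rng.normal(0.0, np.sqrt(sigma2), size=(n_wines, N))
    Y = 3.0 + beta[:, None] + Z            # beta_0 = 3.0 is an arbitrary baseline
    wine_means = Y.mean(axis=1)
    grand_mean = Y.mean()
    msw_vals.append(((Y - wine_means[:, None]) ** 2).sum() / ((N - 1) * n_wines))
    msb_vals.append((N * (wine_means - grand_mean) ** 2).sum() / (n_wines - 1))

print(np.mean(msw_vals), sigma2)                                          # ~ sigma^2
print(np.mean(msb_vals), sigma2 + N * (beta ** 2).sum() / (n_wines - 1))  # ~ sigma^2 + N*sum(beta_w^2)/(|W|-1)
```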

If, in addition, H_0 is in effect, then \beta_w = 0 for every w, and we expect these two quantities to be equal. Hence, we compute our estimates of these quantities and check whether

\frac{1}{|\mathcal{W}|-1}\sum_{w\in\mathcal{W}}N\hat\beta_w^2 \approx \frac{1}{(N-1)|\mathcal{W}|}\sum_{w\in\mathcal{W}}\sum_{n=1}^N (Y_{wn} - \hat \beta_0 - \hat \beta_w)^2.

More formally, if H_0 is in effect and \{Z_{wn}\} is iid zero-mean Gaussian with variance \sigma^2, then

\sum_{w\in\mathcal{W}}\sum_{n=1}^N (Y_{wn} - \hat \beta_0 - \hat \beta_w)^2/\sigma^2 \sim \chi^2_{(N-1)|\mathcal{W}|}

\sum_{w\in\mathcal{W}}N\hat\beta_w^2/\sigma^2 \sim \chi^2_{|\mathcal{W}|-1}

and so

f := \frac{\sum_{w\in\mathcal{W}}N\hat\beta_w^2/(|\mathcal{W}|-1)}{\sum_{w\in\mathcal{W}}\sum_{n=1}^N (Y_{wn} - \hat \beta_0 - \hat \beta_w)^2/((N-1)|\mathcal{W}|)} \sim F_{|\mathcal{W}|-1,(N-1)|\mathcal{W}|}.

Hence, we compute the statistic f and find the probability of seeing a value at least as extreme under an F-distribution with |\mathcal{W}|-1 and (N-1)|\mathcal{W}| degrees of freedom; if this probability falls below our significance level \alpha, we reject H_0.
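A minimal sketch of this computation, assuming the scores are held in a |\mathcal{W}| \times N NumPy array (the array below is simulated purely for illustration):

```python
import numpy as np
from scipy import stats

# Simulated scores for illustration only: one row per wine, one column per score.
rng = np.random.default_rng(2)
Y = rng.normal(loc=3.0, scale=0.3, size=(4, 4))
n_wines, N = Y.shape

wine_means = Y.mean(axis=1)
grand_mean = Y.mean()

msb = (N * (wine_means - grand_mean) ** 2).sum() / (n_wines - 1)      # between-group mean square
msw = ((Y - wine_means[:, None]) ** 2).sum() / ((N - 1) * n_wines)    # within-group mean square
f = msb / msw
p = stats.f.sf(f, n_wines - 1, (N - 1) * n_wines)                     # P(F >= f) under H_0
print(f, p)

# The same numbers fall out of SciPy's one-way ANOVA helper (one group per wine):
print(stats.f_oneway(*Y))
```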

For the results shown in Fig. 2, the F-statistic is 20.36 with 3 degrees of freedom in the numerator and 12 in the denominator. The probability of seeing a statistic at least that large, given H_0 and \{Z_{wn}\} iid zero-mean Gaussian with variance \sigma^2, is p < 10^{-4}. We are thus compelled to reject H_0. For the results in Table 1, the F-statistic is 9.06 and p < 0.0021. At a significance level of 0.05, we are thus also compelled to reject H_0, within the limitations imposed by our measurement model and our assumptions on \{Z_{wn}\}.
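The quoted tail probabilities can be checked directly from the F-distribution; for example, with SciPy:

```python
from scipy import stats

print(stats.f.sf(20.36, 3, 12))   # should come out below 1e-4, as quoted
print(stats.f.sf(9.06, 3, 12))    # should come out near 0.002, as quoted
```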

[Figure 4: p-values of the F-statistic over many simulations under H_0, with and without restricting measurements to integers in [1,5].]

It is interesting to see whether there are major discrepancies in the result of hypothesis testing when using the measurement model that does not take into consideration that the responses are integers. The figure above compares the p-values observed for the resulting F-statistic over many simulations of 4 scores of 4 wines with H_0 in effect, and with Z_{wn} iid zero-mean Gaussian with \sigma^2 = 0.1. We compare the results when we restrict measurements to be integers in [1,5] to those without such a restriction, for several true wine parameters. When the true parameter is an integer, the two appear to be in close agreement. In other words, our \alpha does reflect the probability of making a type 1 error, i.e., rejecting H_0 when it is actually true. When the true parameter is not an integer, we see that p-values below any given level occur more frequently than we expect them to, and so our \alpha underestimates the probability of making a type 1 error.
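A sketch of that simulation follows. The true (non-integer) score of 3.5 for every wine and the number of trials are illustrative choices; \sigma^2 = 0.1 and the 4-wines-by-4-scores design follow the text:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_wines, N, sigma, tau = 4, 4, np.sqrt(0.1), 3.5   # tau = 3.5: a non-integer true score
n_trials = 10_000

p_cont, p_int = [], []
for _ in range(n_trials):
    Y = tau + rng.normal(0.0, sigma, size=(n_wines, N))   # H_0: all wines identical
    Y_rounded = np.clip(np.rint(Y), 1, 5)                 # integer scores restricted to [1, 5]
    p_cont.append(stats.f_oneway(*Y).pvalue)
    p_int.append(stats.f_oneway(*Y_rounded).pvalue)       # rare all-constant trials give nan,
                                                          # which counts as a non-rejection below

# Under H_0 with Gaussian noise the continuous p-values are uniform, so about 5%
# should fall below 0.05; the rounded case can be compared against that.
print(np.mean(np.array(p_cont) < 0.05), np.mean(np.array(p_int) < 0.05))
```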

In the next, and final, part, I will reveal the fatal flaw.


