# The centrality of experimental design, part 3

Part 1 is here, and part 2 is here. Having defined our measurement model, it is time we estimate its parameters.

Given $N$ measurements of each wine in $\mathcal{W}$, e.g., the collection of scores in Table 1, we wish to estimate $\tau_w$ for each wine and $\bar\tau$ for all wines. Equivalently, we wish to estimate $\{\beta_i : i \in \{0, 1, \ldots, |\mathcal{W}|\}\}$. By the method of least squares, we find $\hat\beta_0$ by minimising the sum of squared deviations:

$\frac{\partial}{\partial \hat\beta_0}\sum_{w\in\mathcal{W}} \sum_{n=1}^N (Y_{wn} -\hat\beta_0)^2 = 0 \Longrightarrow \hat\beta_0 = \frac{1}{|\mathcal{W}|}\frac{1}{N}\sum_{w\in\mathcal{W}} \sum_{n=1}^N Y_{wn}.$

Differentiating the sum of squared residuals with respect to $\hat \beta_w$, we find the remaining parameters:

$\frac{\partial}{\partial \hat\beta_w}\sum_{v\in\mathcal{W}} \sum_{n=1}^N \left(Y_{vn} - \hat\beta_0 - \sum_{w'\in\mathcal{W}} \hat\beta_{w'}\delta_{v-w'}\right)^2 = 0 \Longrightarrow \hat\beta_{w} = \frac{1}{N} \sum_{n=1}^N (Y_{wn} - \hat \beta_0).$
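
To spell out the intermediate step: the Kronecker delta leaves only the $N$ scores of wine $w$ dependent on $\hat\beta_w$, so the derivative reduces to

$-2\sum_{n=1}^N \left(Y_{wn} - \hat\beta_0 - \hat\beta_w\right) = 0 \Longrightarrow N\hat\beta_w = \sum_{n=1}^N (Y_{wn} - \hat\beta_0),$

which gives the estimator above once both sides are divided by $N$.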

Then our best approximation to $Y_{wn}$ is $\hat Y_{wn} = \hat\beta_0 + \hat\beta_w$, and the residual is $\hat Z_{wn} = Y_{wn} - \hat Y_{wn}$. The expectations of these estimators are:
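
As a minimal sketch of these estimators, the following Python computes $\hat\beta_0$, each $\hat\beta_w$, and the residuals from a table of scores. The function name `estimate` and the scores are made up for illustration; they are not the values in Table 1. It assumes every wine has the same number of scores.

```python
from statistics import fmean

def estimate(scores):
    """Least-squares estimates for a balanced design.

    scores[w][n] is the n-th score of wine w; every wine has N scores.
    Returns (beta0_hat, list of beta_w_hat, residuals with the same shape as scores).
    """
    wine_means = [fmean(row) for row in scores]
    # With equal N per wine, the mean of the wine means is the grand mean.
    beta0_hat = fmean(wine_means)
    beta_hats = [m - beta0_hat for m in wine_means]
    # The fitted value for wine w is beta0_hat + beta_w_hat, i.e. its own mean.
    residuals = [[y - m for y in row] for row, m in zip(scores, wine_means)]
    return beta0_hat, beta_hats, residuals

# Hypothetical scores for four wines, four scores each (not Table 1).
scores = [[3, 4, 3, 4], [2, 2, 3, 1], [5, 4, 5, 4], [1, 2, 2, 1]]
beta0_hat, beta_hats, residuals = estimate(scores)
print(beta0_hat)
print(beta_hats)
```

By construction the $\hat\beta_w$ sum to zero, and so do the residuals of each wine.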

$\textrm{E}[\hat \beta_0] = \beta_0 + \frac{1}{|\mathcal{W}|}\frac{1}{N}\sum_{w\in\mathcal{W}} \sum_{n=1}^N \textrm{E}[Z_{wn}]$
$\textrm{E}[\hat \beta_w] = \beta_w + \frac{1}{|\mathcal{W}|}\frac{1}{N} \left( (|\mathcal{W}| -1) \sum_{n'=1}^N \textrm{E}[Z_{wn'}] - \sum_{w'\in\mathcal{W}\backslash w} \sum_{n=1}^N \textrm{E}[Z_{w'n}] \right)$

and their variances are easily seen to be:

$\textrm{Var}[\hat \beta_0] = \frac{1}{|\mathcal{W}|^2} \frac{1}{N^2}\textrm{Var}\left ( \sum_{w\in\mathcal{W}} \sum_{n=1}^N Z_{wn} \right )$
$\textrm{Var}[\hat \beta_w] = \frac{1}{|\mathcal{W}|^2} \frac{1}{N^2} \textrm{Var}\left ( (|\mathcal{W}| -1) \sum_{n'=1}^N Z_{wn'} - \sum_{w'\in\mathcal{W}\backslash w} \sum_{n=1}^N Z_{w'n} \right ).$

We see that increasing $N$ decreases the variance of each of these estimators. If $Z_{wn}$ has zero mean for all wines and scores, then these estimators are unbiased. If the $Z_{wn}$ are iid with variance $\sigma^2$ across all wines and scores, then the variances above become:

$\textrm{Var}[\hat \beta_0] = \sigma^2/(|\mathcal{W}|N)$
$\textrm{Var}[\hat \beta_w] = (|\mathcal{W}|-1)\sigma^2/(|\mathcal{W}|N) = (|\mathcal{W}|-1)\textrm{Var}[\hat \beta_0].$
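
These two variances can be checked numerically. The sketch below is a small Monte Carlo experiment under the iid Gaussian assumption, with real-valued (unrounded) scores. The true wine means in `taus`, the seed, and the number of trials are all assumptions chosen for illustration; only $\sigma^2 = 0.1$ and the four-by-four design follow the text.

```python
import random
from statistics import fmean, pvariance

def simulate(taus, n_scores, sigma2, trials, seed=0):
    """Repeat the experiment and collect beta0_hat and beta_hat of the first wine."""
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5
    beta0_hats, beta1_hats = [], []
    for _ in range(trials):
        # Mean of n_scores real-valued Gaussian scores for each wine.
        wine_means = [
            fmean([rng.gauss(tau, sigma) for _ in range(n_scores)])
            for tau in taus
        ]
        beta0_hat = fmean(wine_means)
        beta0_hats.append(beta0_hat)
        beta1_hats.append(wine_means[0] - beta0_hat)
    return beta0_hats, beta1_hats

taus = [2.0, 3.0, 3.5, 4.5]  # hypothetical true wine means
n_wines, n_scores, sigma2 = len(taus), 4, 0.1
beta0_hats, beta1_hats = simulate(taus, n_scores, sigma2, trials=20_000)

predicted_var_beta0 = sigma2 / (n_wines * n_scores)
predicted_var_betaw = (n_wines - 1) * sigma2 / (n_wines * n_scores)
print("Var[beta0_hat]: observed", pvariance(beta0_hats), "predicted", predicted_var_beta0)
print("Var[beta_w_hat]: observed", pvariance(beta1_hats), "predicted", predicted_var_betaw)
```

With real-valued scores the observed variances should sit close to the predictions, and the ratio between them close to $|\mathcal{W}|-1 = 3$.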

This clearly shows how the uncertainty in our parameter estimates depends on both the number of wines and the number of scores for each wine.

The figure above shows a simulation. We randomly draw four independent scores from each of four Gaussian distributions with means $\{\beta_0 + \beta_w : w \in \mathcal{W}\}$, and variance $\sigma^2 = 0.1$. We then estimate the parameters. We simulate the above experiment 1,000,000 times and construct distributions to investigate the behaviour of the parameter estimates. The figure below shows these for each parameter.

While the variances of the deviation parameters are indeed larger than that of the mean parameter, two observations contradict our predictions above. First, the estimates appear biased even though the noise is zero mean; the bias in each parameter is toward the integer nearest its true value. Second, the observed variances are larger than predicted. Both differences arise because our model of the measurements is not quite accurate. When we allow the simulated scores to be any real number rather than restricting them to integers, these differences are greatly diminished. A more accurate model of our measurements instead accounts for all scores being in fact integers from 1 to 5:

$Y_{wn} = \min(5,\max(1,\lfloor \tau_w + Z_{wn} \rceil)).$

However, this model does not lend itself easily to the analysis above. Nonetheless, our simulations show that our estimates are reasonably well-behaved, though we may need to account for this discrepancy when we draw inferences.
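
The effect of integer scores can be illustrated for a single wine. The sketch below draws Gaussian scores around one true value and compares their mean and variance with and without rounding to the nearest integer and clipping to the range 1 to 5. The true value `tau = 3.3`, the seed, and the number of draws are assumptions for illustration, and the helper `observe` is hypothetical; only $\sigma^2 = 0.1$ comes from the text.

```python
import random
from statistics import fmean, pvariance

def observe(tau, sigma, rng):
    """One integer score: Gaussian around tau, rounded to nearest, clipped to [1, 5]."""
    return max(1, min(5, round(rng.gauss(tau, sigma))))

rng = random.Random(0)
tau, sigma2 = 3.3, 0.1  # hypothetical true value; variance as in the text
sigma = sigma2 ** 0.5
n_draws = 200_000

raw_scores = [rng.gauss(tau, sigma) for _ in range(n_draws)]
integer_scores = [observe(tau, sigma, rng) for _ in range(n_draws)]

print("real-valued: bias", fmean(raw_scores) - tau, "variance", pvariance(raw_scores))
print("integer:     bias", fmean(integer_scores) - tau, "variance", pvariance(integer_scores))
```

Under these assumed values, the mean of the integer scores is pulled from 3.3 toward 3, and their variance exceeds $\sigma^2$, which matches the two discrepancies described above.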

In part 4 we will look at the use of null hypothesis significance testing to determine if there is a significant difference between the wine parameters in Table 1. Then, part 5 will reveal a fatal flaw.