D. J. Hand, “Deconstructing statistical questions,” J. Royal Statistical Society A (Statistics in Society), vol. 157, no. 3, pp. 317-356, 1994.
This is a remarkable paper, addressing “errors of the third kind”: applying a statistical tool to correctly answer the wrong question. This type of error can occur when a research question is not defined in sufficient detail, or worse, when a tool is used simply because it is convenient, and/or gives the result desired. Hand gives many illustrative examples of how things can go very wrong from the beginning, and argues that before proceeding to apply the numerous statistical tools available in software packages today, we all must “deconstruct” with care the scientific and relevant statistical questions that we actually seek to answer.
At the end of the article, there are 24 (mostly) laudatory responses to it — including one by John Tukey. These read like well-thought-out comments on Reddit, and provide revealing looks at the actual practice of statistics in science, and the practice of science with statistics. One in particular strikes me, because it is about the practice of science with statistics in academia.
Donald Preece begins:
Professor Hand speaks of the questions that the researcher wishes to consider.
These are often three in number:
- How do I obtain a statistically significant result?
- How do I get my paper published?
- When will I get promoted?
So Professor Hand’s suggestions must be supplemented by a recognition of the corruptibility and corruption of the scientific research process.
Nor can we overlook the constraints imposed by the inevitable limitation of resources. Needing further financial support, many researchers ask merely ‘How do I get results?’, meaning by ‘results’, not answers to questions, but things that are publishable in glossy reports.
This, in particular, hit home, especially after I happened upon E. R. Dougherty and L. A. Dalton, “Scientific knowledge is possible with small-sample classification,” EURASIP J. Bioinformatics and Systems Biology, vol. 10, 2013.
In their recent article, Dougherty and Dalton pull no punches:
Since scientific validity depends on the predictive capacity of a model, while an appropriate classification rule is certainly beneficial to classifier design, epistemologically, the error rate is paramount. …
[A]ny paper that applies an error estimation rule without providing a performance characterization relevant to the data at hand is scientifically vacuous.
Given the near universality of vacuous small-sample classification papers in the literature [where error is not estimated], one could easily reach the conclusion that scientific knowledge is impossible in small-sample settings. Of course, this would beg the question of why people are writing vacuous papers and why journals are publishing them.
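Dougherty and Dalton's point can be illustrated numerically: in a small-sample setting, the estimated error rate is itself a random quantity with substantial variance, so reporting a single estimate without characterizing its distribution says little. Below is a minimal sketch (not from their paper; the nearest-centroid classifier, the Gaussian data model, and all parameters are illustrative assumptions I chose). It draws many small datasets, computes the leave-one-out error estimate for each, and compares the spread of those estimates to the classifier's true error measured on a large holdout sample:

```python
import numpy as np

rng = np.random.default_rng(0)

def nearest_centroid_error(train_X, train_y, test_X, test_y):
    """Error rate of a nearest-centroid classifier fit on the training set."""
    c0 = train_X[train_y == 0].mean(axis=0)
    c1 = train_X[train_y == 1].mean(axis=0)
    pred = (np.linalg.norm(test_X - c1, axis=1) <
            np.linalg.norm(test_X - c0, axis=1)).astype(int)
    return float((pred != test_y).mean())

def loo_error(X, y):
    """Leave-one-out cross-validation estimate of the error rate."""
    n = len(y)
    errs = []
    for i in range(n):
        mask = np.ones(n, dtype=bool)
        mask[i] = False
        errs.append(nearest_centroid_error(X[mask], y[mask], X[i:i + 1], y[i:i + 1]))
    return float(np.mean(errs))

def sample(n, d=5, delta=0.5):
    """Balanced two-class Gaussian data with class means at ±delta per coordinate."""
    y = rng.permutation(np.repeat([0, 1], n // 2))
    X = rng.normal(0.0, 1.0, (n, d)) + delta * (2 * y[:, None] - 1)
    return X, y

estimates, true_errs = [], []
for _ in range(200):
    X, y = sample(20)                       # a "small-sample" study: n = 20
    estimates.append(loo_error(X, y))       # what a paper would report
    Xt, yt = sample(5000)                   # large holdout: the true error
    true_errs.append(nearest_centroid_error(X, y, Xt, yt))

print(f"mean LOO estimate:    {np.mean(estimates):.3f}")
print(f"mean true error:      {np.mean(true_errs):.3f}")
print(f"std of LOO estimates: {np.std(estimates):.3f}")
```

The mean of the leave-one-out estimates tracks the true error reasonably well, but the standard deviation across repeated datasets is large relative to the error itself: two researchers running the identical study could report very different error rates. That spread is exactly the "performance characterization relevant to the data at hand" that Dougherty and Dalton say a scientifically meaningful paper must provide.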