Chapter 1
The Basics
In this chapter, we place the design and analysis of experiments in the health sciences in its scientific context, discuss its basic principles, and enumerate additional considerations such as the assignment of experimental conditions to experimental units and the choice of sample size.
1.1 Four Basic Questions
In his book Science and the Modern World, Whitehead (1925) aptly described the scientific mentality as “a vehement and passionate interest in the relation of general principles to irreducible and stubborn facts.” There is a constant interplay between the formulation of the general principles and the stubborn facts. The following quotation from Science under a picture of a mouse embryo illustrates this interplay:
A mouse embryo at 9 days of gestation. . . . Understanding the basis for organ development can provide insights into disease and stem cell programming.
(Science, 2008)
The general principles in this case refer to insights into disease and stem cell programming. The stubborn facts deal with specific and measurable observations of the mouse embryo. Statistics—as a component of the sciences—can be characterized as a vehement and passionate interest in the relation of general principles of variation and causation to observed associations. This definition includes causation as a principal interest of statistics, not just variation. Particularly in experimental design and analysis, the key question of interest almost always is one of causation. In fact, the principle of randomization as introduced by R.A. Fisher in the last century is the centerpiece of the scientific enterprise of showing cause and effect in the face of substantial and irreducible variation. Statisticians are particularly good at dealing with variation: they have learned how to describe it, how to manage it, how to induce it, and, perhaps surprisingly, how to take advantage of it. This text will illustrate these points over and over again.
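As a concrete illustration of randomization, the following minimal Python sketch assigns experimental units to treatments purely by chance. The unit labels, the treatment names, the seed, and the helper name randomize are hypothetical and chosen only for illustration; the sketch shows one simple scheme (a completely randomized design with equal group sizes), not a procedure prescribed in this text.

```python
import random


def randomize(units, treatments, seed=None):
    """Completely randomized design: shuffle the units, then split them
    into equal-sized groups, one group per treatment."""
    rng = random.Random(seed)          # seeded generator so the plan is reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)              # chance alone decides the order
    k = len(treatments)
    if len(shuffled) % k != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    size = len(shuffled) // k
    return {
        treatment: sorted(shuffled[i * size:(i + 1) * size])
        for i, treatment in enumerate(treatments)
    }


# Hypothetical example: 12 mice, two experimental conditions.
units = [f"mouse_{n:02d}" for n in range(1, 13)]
assignment = randomize(units, ["control", "treated"], seed=2024)

for treatment, group in assignment.items():
    print(treatment, group)
```

Because the assignment depends only on the random shuffle, no characteristic of a unit, known or unknown, can systematically favor one treatment.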
In many sciences, particularly the biological sciences, four basic questions are addressed:
1. What is the question?
2. Is it measurable?
3. Where will you get the data?
4. What do you think the data are telling you?
1. What Is the Question?
“Why is the water in the kettle boiling?” One possible answer is, “The flame is making the molecules of water move faster and faster so that they can break the surface tension of the water and begin to escape.” Another possible answer (given perhaps by R.A. Fisher) is, “To make tea for a lady.” The first answer deals with efficient cause; the second deals with final cause. Science, and statistics with it, deals primarily with efficient causes, not final causes.
The context of the question is as important as the question itself. A Monty Python observation is relevant, “If you get them to ask the wrong question, you don't have to worry about the answer.”
Often the context of the question is assumed and unstated, as in the boiling water question above. A great deal of humor rests on the contrast between an assumed context and a different context revealed in the punch line of a joke. This may be funny on a late night show but can be fatal to a research question. For any scientific question, the context must be explicit. For example, in assessing mathematical skills it is necessary to specify the population to be assessed: fifth graders or community college students?
Even more daunting than the context is the form of a question. Social scientists are very much aware of this. But the form is every bit as crucial in the laboratory sciences. The question is frequently formulated in terms of what is measurable; this may or may not address the issue at hand.
2. Is It Measurable?
Efficient causes have the potential of being measurable. In the example of the water boiling in the kettle, we can measure the heat supplied by the flame, the average velocity of the molecules, and, perhaps more important, the variation in the molecular velocities.
Asking a measurable question can be very challenging for two reasons. First, the question needs to be specific enough so that measurements can be made. Second, the formulation of the question implicitly defines the research area to be considered. The question puts a “fence around the mystery.” It says, the mystery is here, not there. For example, the question “are current lead levels safe?” deals with a potentially toxic exposure. To make the question measurable requires a host of considerations such as population(s) of interest, specification of nonsafety, assessment of levels in the environment, and specification of lead level in the body. The study of this type of question is part of the field of toxicology, which may try to assess some aspects of toxicity in animals and other aspects in humans. This example also illustrates the societal importance of the question; the U.S. Environmental Protection Agency uses the scientific evidence to set environmental policy.
An example of a nonmeasurable question, and one very pertinent to this book, is “Is it ethical to do experiments on animals?” Most toxicologists would argue that it is. In this book, we are using data from animal experiments and therefore, implicitly, agree that it is ethical. A challenging question might be, “Is it ethical to use animal data in this book while holding that it is unethical to do animal experiments?” Once the ethical question is answered in the affirmative, many measurable aspects of animal experiments come up under the rubric of Good Laboratory Practice. This might include measuring the temperature at which animals are housed. Einstein said, “Not everything that counts is countable, and not everything that is countable counts.” It could even be argued that the things that really count are not countable!
The social sciences provide another example of issues in measurability. There has been a 100-year debate about the existence of “intelligence.” Common language use suggests that there is (e.g., “I thought you were more intelligent than that. . .”). Spearman in 1904 argued for such a (latent) trait on the basis of the structure of a correlation matrix.
As another example, Canadian health data do not have reference to race or national origin. The primary reason is that there is no standard acceptable definition. In other words, it is considered very difficult to measure this concept. The question has been raised whether the concept of race is a biological concept or a social concept.
3. Where Will You Get the Data?
“Getting the data” involves two steps. First, selecting the objects to be measured; second, specifying the measurements that are to be made. This, inevitably, involves a tremendous reduction of the universe of discourse. With respect to both the objects selected and the measurements made, there is the dilemma of “this, not that.” We cannot measure everything.
Implementing and accounting for the selection process is a precondition for valid experimental inference. For example, in ergonomic studies of proper lifting procedures, subjects must be selected and measurements made at specific times. Ideally, the subjects are representative of the working population or the population of interest. Most of the time this is not the case: many subjects are college-age students eager to make a few extra dollars. The experiment may be carried out impeccably, but the question of generalizability to the population of interest still needs to be addressed.
The process of selecting experimental units is often not addressed. One reason is that “control” treatments are included so that the assessment of the treatment effect is comparative. The underlying assumption of this argument is that the treatment effect does not interact with any bias in the selection of experimental units. In the above example, a proper lifting procedure may be compared with an improper one in terms of muscle fatigue or muscle strain. If college-age students are used for this experiment, then the assumption is that the comparative results apply to middle-aged postal workers as well. This is an implicit assumption, usually acknowledged only in the discussion section of the paper reporting the results.
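The role of the no-interaction assumption can be made concrete with a small simulation. The sketch below is purely illustrative: the fatigue scores, baselines, effect sizes, sample size, and the helper name estimated_effect are all hypothetical and are not taken from any study. It compares the estimated difference between improper and proper lifting in a simulated student population with the difference in two simulated worker populations, one that shares the students' treatment effect and one that does not.

```python
import random
import statistics


def estimated_effect(baseline, effect, n=2000, sd=1.0, seed=0):
    """Estimate mean(improper) - mean(proper) fatigue score in one
    simulated population (all numbers are hypothetical)."""
    rng = random.Random(seed)
    proper = [baseline + rng.gauss(0, sd) for _ in range(n)]
    improper = [baseline + effect + rng.gauss(0, sd) for _ in range(n)]
    return statistics.mean(improper) - statistics.mean(proper)


# Students: lower baseline fatigue, true treatment effect of 2.0.
students = estimated_effect(baseline=3.0, effect=2.0, seed=1)

# Workers, no interaction: different baseline, same true effect.
workers_no_interaction = estimated_effect(baseline=5.0, effect=2.0, seed=2)

# Workers, interaction: the true effect itself differs in this population.
workers_interaction = estimated_effect(baseline=5.0, effect=3.5, seed=3)

print(f"students:                 {students:.2f}")
print(f"workers (no interaction): {workers_no_interaction:.2f}")
print(f"workers (interaction):    {workers_interaction:.2f}")
```

In this sketch, a difference in baselines alone leaves the comparison intact, so the student result carries over to the workers. When the treatment effect itself depends on the population, the student estimate no longer applies, which is exactly the situation the implicit assumption rules out.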
4. What Do You Think the Data Are Telling You?
The statistical analysis addresses the fourth question. Statistical analysis involves a further reduction of the data, usually according to some statistical model. Most of the data in this book will be modeled, or approximated, by some kind of linear model. A simple linear model consists of
$$Y_{ij} = \mu_i + \varepsilon_{ij}$$
The outcome of the experiment is considered to consist of a fixed part, the population means associated with treatments indexed by the subscript i, and a random part, the re...