Agriculturalists and other scientists in biological fields who are involved in research constantly face problems associated with planning, designing, and conducting experiments. A basic familiarity with, and understanding of, the statistical methods that deal with these problems is helpful in many ways. Practitioners (researchers) who collect data and only then look for a statistical technique that will provide valid results may find that the problem has no solution, and that it could have been avoided in the first place by conducting a properly designed experiment. First, it is important to keep in mind that we cannot draw valid conclusions from poorly planned experiments. Second, the time and cost involved in many experiments are considerable, and a poorly designed experiment increases such costs. For example, agronomists who carry out a fertilizer experiment know the time limitations on the experiment. They know that, in the temperate zone, seeds must be planted in the spring and harvested in the fall. The experimental plot must include all the components of a complete design. Otherwise, what is omitted from the experiment will have to be carried out in subsequent trials in the next cropping season or the next year. This additional time and expenditure could be minimized by a properly planned experiment that would produce valid results as efficiently as possible.
Good experimental designs are the product of the technical knowledge of one’s field, an understanding of statistical techniques, and skill in designing experiments. Any research endeavor, be it correlational or experimental, may entail the following phases: conception, design, data collection, analysis, and dissemination. Statistical methodologies can be used to conduct better scientific experiments if they are incorporated into the entire scientific process, i.e., from recognition of the problem to experiment design, data analysis, and interpretation. The intent of this book is to provide practitioners with the necessary guidelines and techniques for designing experiments, as well as the statistical methodology to analyze the results of experiments in an efficient way.
Agricultural experiments usually entail comparisons of crop or animal varieties. When planning agricultural experiments, we must keep in mind that large uncontrolled variations are common occurrences. For example, a crop scientist who plants the same variety of a crop in a field may find variations in yield that are due to periodic variations across a field or to some other factor that the experimenter has no control over. Throughout this book we are concerned with the methodologies used in designing experiments that will separate, with confidence and accuracy, varietal differences of crops and animals from uncontrolled variations.
It is essential that you become familiar with the terminology used in this book as we discuss the concept of experimentation. Agricultural experiments are conducted in response to questions raised by researchers who are interested either in comparing the effects of several conditions on some phenomenon or in discovering the unknown effects of a particular process. An experiment facilitates the study of such phenomena under controlled conditions. Therefore, the creation of controlled conditions is the most essential characteristic of experimentation. It has been said that wisdom consists not so much in knowing the right answers as in knowing the right questions to ask (Gill, 1980). Hence, how we formulate our questions and hypotheses is critical to the experimental procedure that will follow.
Once we have established a hypothesis, we look for a method to test the validity of that hypothesis objectively. Our final results and conclusions depend to a large extent on the manner in which the experiment was statistically designed and how the data were collected. Agricultural researchers know that, because of uncontrolled variations, it is difficult to avoid differences in the yield of the same crop variety planted in two adjacent fields, or in the weight gain of two animals fed the same ration. Such differences among experimental units treated alike are called experimental error. The intent of an optimal design is to provide a mechanism for the estimation and control of experimental error in the field experiments conducted. Practitioners should keep in mind that it is very difficult to account for all the sources of natural variation even when an entire population of factors of interest is under study. The problem is further aggravated when we depend on a sample that is, at best, an approximation of the population or the characteristic of interest. Given this dilemma, what practitioners can hope for is a predictive model that minimizes experimental error. Such models are based on theoretical knowledge, empirical validity, and an understanding of experimental material. Before we discuss the various designs, and any methodology for the estimation and control of experimental error, a quick review of other commonly encountered terms is in order.
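To make the notion of experimental error concrete, the following minimal sketch estimates it as the pooled variance among units that received the same treatment. It assumes a simple layout in which every unit within a treatment is treated alike; the function name `pooled_error_variance` and the yield figures are hypothetical and serve only as an illustration.

```python
from statistics import variance


def pooled_error_variance(groups):
    """Pool the within-treatment sample variances.

    Each treatment's variance is weighted by its degrees of freedom
    (number of units minus one). Returns (error variance, error df).
    """
    sum_of_squares = 0.0
    degrees_of_freedom = 0
    for values in groups.values():
        if len(values) < 2:
            continue  # a single unit carries no information about error
        df = len(values) - 1
        sum_of_squares += variance(values) * df
        degrees_of_freedom += df
    if degrees_of_freedom == 0:
        raise ValueError("need at least one treatment with two or more units")
    return sum_of_squares / degrees_of_freedom, degrees_of_freedom


# Hypothetical yields (t/ha) of plots treated alike within each variety
yields = {
    "variety_A": [4.0, 5.0, 6.0],
    "variety_B": [7.0, 9.0, 8.0],
}
error_var, error_df = pooled_error_variance(yields)
print(error_var, error_df)
```

The differences between the two varieties do not enter this estimate; only the spread among plots of the same variety does.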
When discussing experimentation, you will encounter the term experimental unit. An experimental unit is an entity that receives a treatment. For example, for a horticulturist it may be a plot of land or a batch of seed, for an animal scientist it may be a group of pigs or sheep, and for an agricultural engineer it may be a manufactured item. Thus, an experimental unit may be looked upon as a small subdivision of the experimental material, which receives the treatment.
In choosing an experimental unit, the researcher must pay special attention to the size of the unit, the representative nature of the unit, and how independent the units are of one another. In terms of the size of the unit, technical and cost factors play a major role. The essential point to determine is how many units are needed to attain a specified precision in the most economical way.
As far as the representative nature of the unit is concerned, it is important that the conditions of the experiment be as close as possible to those of the actual study subject. For example, if an animal scientist is interested in a feeding experiment in which the results are to apply to Holstein dairy cows, then ideally the sample of cows in the experiment should be selected from a population of Holstein dairy cows.
The independence of experimental units from one another is also an important element in a properly conducted experiment. Independence of units implies that the researcher must ensure that the treatment applied to one unit has no effect on the observation obtained in another. Furthermore, the occurrence of unusually high or low observations in one unit should have absolutely no effect on what may be observed in another unit.
Replication refers to the repetition of the basic experiment or treatment. Thus, an animal scientist who treats ten animals with a particular antibiotic has ten replicates. Researchers replicate their experiments to control for experimental error. Replication is also used to estimate the error variance and to increase the precision of estimates. It is important to keep in mind that increased replication of experimental units is desirable for increasing the accuracy of estimates of means and other functions of the measured variable (Gill, 1980; Das and Giri, 1986). However, the cost of increased replication plays an important role in the number of replicates a researcher is able to afford.
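The gain in precision from replication can be illustrated with the standard error of a treatment mean, which equals the square root of the error variance divided by the number of replicates. The sketch below assumes independent units with a common error variance; the function name and the numbers are hypothetical.

```python
import math


def standard_error_of_mean(error_variance, replicates):
    """Standard error of a treatment mean based on `replicates` independent units."""
    if replicates < 1:
        raise ValueError("at least one replicate is required")
    if error_variance < 0:
        raise ValueError("error variance cannot be negative")
    return math.sqrt(error_variance / replicates)


# Illustrative error variance of 1.0 (arbitrary units)
for r in (2, 4, 8, 16):
    print(r, round(standard_error_of_mean(1.0, r), 3))
```

Because the standard error shrinks only with the square root of the number of replicates, halving it requires four times as many units, which is why cost weighs so heavily on the choice.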
When the experimenter decides to use a particular design, the number of replicates must also be chosen. It has been suggested that present-day design practices utilize existing information to the fullest, and that replication is sometimes not needed at all (Anderson and McLean, 1974). In some cases, such as fractional factorial replication (discussed in Chapter 6, Section 6.4), only a part of all the treatment combinations is used in an experiment.
No matter how many replicates are used, the important point to keep in mind is whether the experiment will provide valid results in terms of estimates of effects, and whether there are enough degrees of freedom for error to adequately test for various effects.
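As a rough check on whether enough degrees of freedom remain for error, the sketch below computes them for the simplest case, assumed here to be a completely randomized layout with equal replication, where the error degrees of freedom are t(r - 1) for t treatments and r replicates. Other designs partition the degrees of freedom differently; the function name is hypothetical.

```python
def error_degrees_of_freedom(treatments, replicates):
    """Error df for a completely randomized layout with equal replication.

    Total df = t*r - 1, treatment df = t - 1, so error df = t*(r - 1).
    """
    if treatments < 1 or replicates < 1:
        raise ValueError("treatments and replicates must be positive")
    return treatments * (replicates - 1)


print(error_degrees_of_freedom(5, 4))  # 5 treatments, 4 replicates each
print(error_degrees_of_freedom(5, 1))  # no replication leaves nothing for error
```

With a single replicate the error degrees of freedom fall to zero in this layout, so the error variance cannot be estimated from units treated alike.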
Randomization is simply an objective method of random allocation of the experimental material or allocation of treatments to experimental units. We also use randomization to make certain that the order in which individual trials of experiments are performed is determined randomly. The advantages of randomization are: (1) it allows for protection against systematic error caused by subjective assignment of treatments; that is, each treatment will have an equal chance of being assigned to an experimental unit; (2) it helps in “averaging out” the effects of uncontrolled conditions or extraneous factors that persist over long or repeated experiments; and (3) it validates the statistical assumption that observations (or errors) are independently distributed random variables. In sum, we could say that randomization is the cornerstone of statistical theory in the design of experiments. Fisher (1956) stated that, to ensure the error estimate will be a good and valid one, we must randomize experimental treatments among experimental units. For further reading on this topic, Fisher (1956), Ogawa (1974), Gill (1980), Gomez and Gomez (1984), Mead (1988), Mead, Curnow, and Hasted (1993), Montgomery (2000), and Samuels and Witmer (2003) may be consulted.
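A random allocation of treatments to experimental units can be sketched as follows. This is a minimal illustration, assuming unrestricted randomization with equal replication and units numbered from 1; the function name, treatment labels, and seed are hypothetical.

```python
import random


def randomize_treatments(treatments, replicates, seed=None):
    """Randomly allocate each treatment to `replicates` experimental units.

    Returns a dict mapping unit number (starting at 1) to its treatment.
    """
    rng = random.Random(seed)  # a fixed seed makes the layout reproducible
    allocation = [t for t in treatments for _ in range(replicates)]
    rng.shuffle(allocation)  # every ordering of the labels is equally likely
    return {unit: treatment for unit, treatment in enumerate(allocation, start=1)}


layout = randomize_treatments(["control", "low_N", "high_N"], replicates=4, seed=2024)
for unit, treatment in layout.items():
    print(unit, treatment)
```

Recording the seed alongside the field plan lets the same layout be regenerated later, while the shuffle removes any subjective choice about which unit receives which treatment.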
We shall refer to the experimental variables as factors; that is, a factor is a controllable condition in an experiment. For example, a fertilizer, a new feed ration, and a fungicide are all considered factors. Factors may be quantitative or qualitative, and may take a finite number of values and types. Quantitative factors are those described by numerical values on some scale. The strength of a drug dosage (such as milligrams of a sulfa drug), the rate of application of a fertilizer, and temperature are examples of quantitative factors. Qualitative factors, such as type of protein in a diet, sex of an animal, age, or the genetic makeup of a plant or animal, are factors that can be distinguished from each other, but not on a numerical scale.
Different factors are included in experimental designs for different reasons. The decision to include or exclude a factor, the levels of each factor in the experiment, and the criteria for selecting such factors are the responsibility of the experimenter, who may be influenced by theoretical considerations. What is essential to remember is that factors should be related to one another in simple ways. The relation of one factor to another refers to how the levels of that factor are combined with the levels of the other factor. When choosing factors for any experiment, the researcher should ask the following questions:
What treatments in the experiment should be related directly to the objectives of the study?
Does the experimental technique adopted require the use of additional factors?
Can the experimental unit be divided naturally into groups such that the main tr...