ANOVA and ANCOVA

ISBN: 9781118491690

A GLM Approach

Andrew Rutherford

About this book

Provides an in-depth treatment of ANOVA and ANCOVA techniques from a linear model perspective

ANOVA and ANCOVA: A GLM Approach provides a contemporary look at the general linear model (GLM) approach to the analysis of variance (ANOVA) of one- and two-factor psychological experiments. With its organized and comprehensive presentation, the book successfully guides readers through conventional statistical concepts and how to interpret them in GLM terms, treating the main single- and multi-factor designs as they relate to ANOVA and ANCOVA.

The book begins with a brief history of the separate development of ANOVA and regression analyses, and then goes on to demonstrate how both analyses are incorporated into the understanding of GLMs. This new edition now explains specific and multiple comparisons of experimental conditions before and after the Omnibus ANOVA, and describes the estimation of effect sizes and power analyses leading to the determination of appropriate sample sizes for experiments to be conducted. Topics that have been expanded upon and added include:

Discussion of optimal experimental designs
Different approaches to carrying out the simple effect analyses and pairwise comparisons with a focus on related and repeated measure analyses
The issue of inflated Type 1 error due to multiple hypotheses testing
Worked examples of Shaffer's R test, which accommodates logical relations amongst hypotheses

ANOVA and ANCOVA: A GLM Approach, Second Edition is an excellent book for courses on linear modeling at the graduate level. It is also a suitable reference for researchers and practitioners in the fields of psychology and the biomedical and social sciences.

Publisher: Wiley
Year: 2012
Print ISBN: 9780470385555
eBook ISBN: 9781118491690
Topic: Mathematics
Subtopic: Probability & Statistics
Index: Mathematics

CHAPTER 1 An Introduction to General Linear Models: Regression, Analysis of Variance, and Analysis of Covariance

1.1 REGRESSION, ANALYSIS OF VARIANCE, AND ANALYSIS OF COVARIANCE

Regression and analysis of variance (ANOVA) are probably the most frequently applied of all statistical analyses. They are used extensively in many areas of research, such as psychology, biology, medicine, education, sociology, anthropology, economics, and political science, as well as in industry and commerce.

There are several reasons why regression and analysis of variance are applied so frequently. One of the main reasons is that they provide answers to the questions researchers ask of their data. Regression allows researchers to determine if and how variables are related. ANOVA allows researchers to determine if the mean scores of different groups or conditions differ. Analysis of covariance (ANCOVA), a combination of regression and ANOVA, allows researchers to determine if the group or condition mean scores differ after the influence of another variable (or variables) on these scores has been equated across groups. This text focuses on the analysis of data generated by psychology experiments, but a second reason for the frequent use of regression and ANOVA is that they are applicable to experimental, quasi-experimental, and non-experimental data, and can be applied to most of the designs employed in these studies. A third reason, which should not be underestimated, is that appropriate regression and ANOVA statistical software is available to analyze most study designs.

1.2 A POCKET HISTORY OF REGRESSION, ANOVA, AND ANCOVA

Historically, regression and ANOVA developed in different research areas to address different research questions. Regression emerged in biology and psychology toward the end of the nineteenth century, as scientists studied the relations between people’s attributes and characteristics. Galton (1886, 1888) studied the height of parents and their adult children, and noticed that while short parents’ children usually were shorter than average, nevertheless, they tended to be taller than their parents. Galton described this phenomenon as “regression to the mean.” As well as identifying a basis for predicting the values on one variable from values recorded on another, Galton appreciated that the degree of relationship between some variables would be greater than others. However, it was three other scientists, Edgeworth (1886), Pearson (1896), and Yule (1907), applying work carried out about a century earlier by Gauss (or Legendre, see Plackett, 1972), who provided the account of regression in precise mathematical terms. (See Stigler, 1986, for a detailed account.)

The t-test was devised by W.S. Gosset, a mathematician and chemist working in the Dublin brewery of Arthur Guinness Son & Company, as a way to compare the means of two small samples for quality control in the brewing of stout. (Gosset published the test in Biometrika in 1908 under the pseudonym “Student,” as his employer regarded their use of statistics to be a trade secret.) However, as soon as more than two groups or conditions have to be compared more than one t-test is needed. Unfortunately, as soon as more than one statistical test is applied, the Type 1 error rate inflates (i.e., the likelihood of rejecting a true null hypothesis increases—this topic is returned to in Sections 2.1 and 3.6.1). In contrast, ANOVA, conceived and described by Ronald A. Fisher (1924, 1932, 1935b) to assist in the analysis of data obtained from agricultural experiments, was designed to compare the means of any number of experimental groups or conditions without increasing the Type 1 error rate. Fisher (1932) also described ANCOVA with an approximate adjusted treatment sum of squares, before describing the exact adjusted treatment sum of squares a few years later (Fisher, 1935b, and see Cox and McCullagh, 1982, for a brief history). In early recognition of his work, the F-distribution was named after him by G.W. Snedecor (1934).
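The inflation of the Type 1 error rate described above can be illustrated numerically. The following is a minimal sketch, assuming every test uses the same significance level and that the tests are independent. Pairwise t-tests that share groups are not strictly independent, so the figures are indicative only. The function names are hypothetical.

```python
from math import comb


def n_pairwise_comparisons(n_groups: int) -> int:
    """Number of distinct pairs of groups, i.e. t-tests needed to compare all means."""
    return comb(n_groups, 2)


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one Type 1 error across n_tests tests.

    Assumption (not from the text): the tests are independent and
    each is carried out at the same significance level alpha.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


# Tabulate the rate for 2 to 5 groups, each pair compared at alpha = .05.
for k in (2, 3, 4, 5):
    m = n_pairwise_comparisons(k)
    rate = familywise_error_rate(0.05, m)
    print(f"{k} groups: {m} t-tests, familywise Type 1 error rate = {rate:.3f}")
```

Under these assumptions, comparing three groups requires three t-tests and raises the chance of at least one false rejection from .05 to about .143.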

ANOVA procedures culminate in an assessment of the ratio of two variances based on a pertinent F-distribution and this quickly became known as an F-test. As all the procedures leading to the F-test also may be considered as part of the F-test, the terms “ANOVA” and “F-test” have come to be used interchangeably. However, while ANOVA uses variances to compare means, F-tests per se simply allow two (independent) variances to be compared without concern for the variance estimate sources.
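The ratio of two variances mentioned above can be sketched for an independent measures single factor design. This is an illustration under stated assumptions, not the book's own worked example: the scores are hypothetical, and the two variance estimates are the conventional between-groups and within-groups mean squares.

```python
from statistics import mean


def one_way_f_ratio(groups):
    """Return (ms_between, ms_within, F) for independent groups of scores."""
    scores = [y for g in groups for y in g]
    grand_mean = mean(scores)
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the scores around their own group mean.
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Hypothetical scores for three experimental conditions.
groups = [[2, 3, 4], [4, 5, 6], [6, 7, 8]]
ms_b, ms_w, f = one_way_f_ratio(groups)
print(f"MS between = {ms_b}, MS within = {ms_w}, F = {f}")
```

For these scores the between-groups variance estimate is 12 and the within-groups estimate is 1, giving F = 12 on 2 and 6 degrees of freedom. Judging significance would require the corresponding F-distribution, which this sketch does not compute.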

In subsequent years, regression and ANOVA techniques were developed and applied in parallel by different groups of researchers investigating different research topics, using different research methodologies. Regression was applied most often to data obtained from correlational or non-experimental research and came to be regarded only as a technique for describing, predicting, and assessing the relations between predictor(s) and dependent variable scores. In contrast, ANOVA was applied to experimental data beyond that obtained from agricultural experiments (Lovie, 1991a), but still it was considered only as a technique for determining whether the mean scores of groups differed significantly. For many areas of psychology, particularly experimental psychology, where the interest was to assess the average effect of different experimental manipulations on groups of subjects in terms of a particular dependent variable, ANOVA was the ideal statistical technique. Consequently, separate analysis traditions evolved and have encouraged the mistaken belief that regression and ANOVA are fundamentally different types of statistical analysis. ANCOVA illustrates the compatibility of regression and ANOVA by combining these two apparently discrete techniques. However, given their histories it is unsurprising that ANCOVA is not only a much less popular analysis technique, but also one that frequently is misunderstood (Huitema, 1980).

1.3 AN OUTLINE OF GENERAL LINEAR MODELS (GLMs)

The availability of computers for statistical analysis increased hugely from the 1970s. Initially statistical software ran on mainframe computers in batch processing mode. Later, the statistical software was developed to run in a more interactive fashion on PCs and servers. Currently, most statistical software is run in this manner, but, increasingly, statistical software can be accessed and run over the Web.

Using statistical software to analyze data has had considerable consequences not only for analysis implementations, but also for the way in which these analyses are conceived. Around the 1980s, these changes began to filter through to affect data analysis in the behavioral sciences, as reflected in the increasing number of psychology statistics texts that added the general linear model (GLM) approach to the traditional accounts (e.g., Cardinal and Aitken, 2006; Hays, 1994; Kirk, 1982, 1995; Myers, Well, and Lorch, 2010; Tabachnick and Fidell, 2007; Winer, Brown, and Michels, 1991) and an increasing number of psychology statistics texts that presented regression, ANOVA, and ANCOVA exclusively as instances of the GLM (e.g., Cohen and Cohen, 1975, 1983; Cohen et al., 2003; Hays, 1994; Judd and McClelland, 1989; Judd, McClelland, and Ryan, 2008; Keppel and Zedeck, 1989; Maxwell and Delaney, 1990, 2004; Pedhazur, 1997).

A major advantage afforded by computer-based analyses is the easy use of matrix algebra. Matrix algebra offers an elegant and succinct statistical notation. Unfortunately, however, human matrix algebra calculations, particularly those involving larger matrices, are not only very hard work but also tend to be error prone. In contrast, computer implementations of matrix algebra are not only very efficient in computational terms, but also error free. Therefore, most computer-based statistical analyses employ matrix algebra calculations, but the program output usually is designed to concord with the expectations set by traditional (scalar algebra) calculations.

When regression, ANOVA, and ANCOVA are expressed in matrix algebra terms, a commonality is evident. Indeed, the same matrix algebra equation is able to summarize all three of these analyses. As regression, ANOVA, and ANCOVA can be described in an identical manner, clearly they share a common pattern. This common pattern is the GLM. Unfortunately, the ability of the same matrix algebra equation to describe regression, ANOVA, and ANCOVA has resulted in the inaccurate identification of the matrix algebra equation as the GLM. However, just as a particular language provides a means of expressing an idea, so matrix algebra provides only one notation for expressing the GLM.
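For reference, the matrix algebra equation in question is conventionally written as follows. The symbols are the standard ones from the linear models literature and are an assumption here, since this excerpt does not show the equation itself.

```latex
\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}
```

In this notation, Y is the N x 1 vector of dependent variable scores, X is the N x p matrix of predictor values, beta is the p x 1 vector of parameters, and epsilon is the N x 1 vector of errors. Regression, ANOVA, and ANCOVA then differ only in what the columns of X contain: quantitative predictors, coded categorical variables, or both.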

Tukey (1977) employed the GLM conception when he described data as

$$\text{data} = \text{fit} + \text{residual} \tag{1.1}$$

The same GLM conception is employed here, but the fit and residual component labels are replaced with the more frequently applied labels, model (i.e., the fit) and error (i.e., the residual). Therefore, the usual expression of the GLM conception is that data may be accommodated in terms of a model plus error

$$\text{data} = \text{model} + \text{error} \tag{1.2}$$

In equation (1.2), the model is a representation of our understanding or hypotheses about the data, while the error explicitly acknowledges that there are other influences on the data. When a full model is specified, the error is assumed to reflect all influences on the dependent variable scores not controlled in the experiment. These influences are presumed to be unique for each subject in each experimental condition. However, when less than a full model is represented, the score component attributable to the omitted part(s) of the full model also is accommodated by the error term. Although the omitted model component increments the error, as it is neither uncontrolled nor unique for each subject, the residual label would appear to be a more appropriate descriptor. Nevertheless, many GLMs use the error label to refer to the error parameters, while the residual label is used most frequently in regression analysis to refer to the error parameter estimates. The relative sizes of the full or reduced model components and the error components also can be used to judge how well the particular model accommodates the data. Nevertheless, the tradition in data analysis is to use regression, ANOVA, and ANCOVA GLMs to express different types of ideas about how data arises.
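The point that an omitted model component is absorbed by the error term can be illustrated with a small calculation. This is a minimal sketch with hypothetical scores. It assumes a single factor design in which the full model predicts each score from its condition mean and the reduced model predicts every score from the grand mean alone.

```python
from statistics import mean

# Hypothetical scores from three experimental conditions.
conditions = {"A": [2, 3, 4], "B": [4, 5, 6], "C": [6, 7, 8]}


def error_sum_of_squares(predict):
    """Sum of squared errors, where error = data - model prediction."""
    return sum(
        (score - predict(label)) ** 2
        for label, scores in conditions.items()
        for score in scores
    )


grand_mean = mean([s for scores in conditions.values() for s in scores])
condition_means = {label: mean(scores) for label, scores in conditions.items()}

# Reduced model: every score is predicted by the grand mean alone.
ss_error_reduced = error_sum_of_squares(lambda label: grand_mean)
# Full model: each score is predicted by its own condition mean.
ss_error_full = error_sum_of_squares(lambda label: condition_means[label])

# The omitted condition effects appear as extra error under the reduced model.
ss_omitted_component = ss_error_reduced - ss_error_full
print(ss_error_reduced, ss_error_full, ss_omitted_component)
```

Here the reduced model leaves an error sum of squares of 30 and the full model leaves 6. The difference of 24 is the part of the scores attributable to the experimental conditions, which the reduced model's error term has to accommodate.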

1.3.1 Regression

Simple linear regression examines the degree of the linear relationship (see Section 1.5) between a single predictor or independent variable and a response or dependent variable, and enables values on the dependent variable to be predicted from the values recorded on the independent variable. Multiple linear regression does the same, but accommodates an unlimited number of predictor variables.

In GLM terms, regression attempts to explain data (the dependent variable scores) in terms of a set of independent variables or predictors (the model) and a residual component (error). Typically, the researcher applying regression is interested in predicting a quantitative dependent variable from one or more quantitative independent variables and in determining the relative contribution of each independent variable to the prediction. There is also interest in what proportion of the variation in the dependent variable can be attributed to variation in the independent variable(s).
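The prediction and the proportion of variation described above can be sketched for the single predictor case. The data and function name are hypothetical, and ordinary least squares is assumed as the estimation method.

```python
from statistics import mean


def simple_linear_regression(x, y):
    """Least squares intercept, slope, and R squared for one predictor."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # Model component: the predicted score for each observation.
    predicted = [intercept + slope * xi for xi in x]
    # Error component: what the model leaves unexplained.
    ss_error = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    r_squared = 1.0 - ss_error / ss_total
    return intercept, slope, r_squared


# Hypothetical predictor and dependent variable scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
intercept, slope, r_squared = simple_linear_regression(x, y)
print(f"predicted y = {intercept:.1f} + {slope:.1f} * x, R squared = {r_squared:.2f}")
```

For these scores the fitted line has intercept 2.2 and slope 0.6, and 60% of the variation in the dependent variable is attributed to variation in the predictor.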

Regression also may employ categorical (also known as nominal or qualitative) predictors: the use of independent variables such as gender, marital status, and type of teaching method is common. As regression is an elementary form of GLM, it is possible to construct regression GLMs equivalent to any ANOVA and ANCOVA GLMs by selecting and organizing quantitative variables to act as categorical variables (see Section 2.7.4). Nevertheless, throughout this chapter, the convention of referring to these particular quantitative variables as categorical variables will be maintained.

1.3.2 Analysis of Variance

Single factor or one-way ANOVA compares the means of the dependent variable scores obtained from any number of groups (see Chapter 2). Factorial ANOVA compares the mean dependent variable scores across groups with more complex structures (see Chapter 5).

In GLM terms, ANOVA attempts to explain data (the dependent variable scores) in terms of the experimental conditions (the model) and an error component. Typically, the researcher applying ANOVA is interested in determining which experimental condition dependent variable score means differ. There is also interest in what proportion of variation in the dependent variable can be attributed to differences between specific experimental groups or conditions, as defined by the independent variable(s).

The dependent variable in ANOVA is most likely to be measured on a quantitative scale. However, the ANOVA comparison is drawn between the groups of subjects receiving different experimental conditions and is categorical in nature, even when the experimental conditions differ along a quantitative scale. As regression also can employ categorical predictors, ANOVA can be regarded as a particular type of regression analysis that employs only categorical predictors.
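The claim that ANOVA is a regression with only categorical predictors can be checked on a small example. The sketch below assumes dummy (0/1) coding with one reference condition, which is only one of several possible coding schemes, and solves the least squares normal equations directly. The data and all function names are hypothetical.

```python
def dummy_code(labels, reference):
    """Design matrix rows: a constant plus one 0/1 column per non-reference level."""
    levels = [lv for lv in sorted(set(labels)) if lv != reference]
    return [[1.0] + [1.0 if lab == lv else 0.0 for lv in levels] for lab in labels]


def least_squares(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(p)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(p)
    ]
    for col in range(p):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(p):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][p] for i in range(p)]


# Hypothetical scores from three experimental conditions.
labels = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
scores = [2, 3, 4, 4, 5, 6, 6, 7, 8]

X = dummy_code(labels, reference="A")
b0, b_B, b_C = least_squares(X, scores)
condition_means = {"A": b0, "B": b0 + b_B, "C": b0 + b_C}
print(condition_means)
```

With this coding the intercept estimates the mean of the reference condition (3), and each remaining coefficient estimates the difference between a condition mean and the reference mean (2 and 4). The regression therefore reproduces the condition means 3, 5, and 7 that an ANOVA would compare.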

1.3.3 Analysis of Covariance

The ANCOVA label has been applied to a number of different statistical operations (Cox and McCullagh, 1982), but it is used most frequently to refer to the statistical technique that combines regression and ANOVA. As ANCOVA is the combination of these two techniques, its calculations are more involved and time consuming than either technique alone. Therefore, it is unsurprising that an increase in ANCOVA applications is linked to the availability of computers and statistical software.
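How the regression and ANOVA components combine can be sketched with the conventional adjusted means calculation. This is an illustration under stated assumptions, not necessarily the procedure the book develops: it assumes a single covariate, the same regression slope in every group, and hypothetical data and names.

```python
from statistics import mean


def ancova_adjusted_means(groups):
    """groups maps a label to (covariate_values, scores).

    Returns the pooled within-group slope and the covariate-adjusted
    group means, assuming a common slope across groups.
    """
    s_xy = s_xx = 0.0
    for x, y in groups.values():
        x_bar, y_bar = mean(x), mean(y)
        # Regression part: accumulate within-group cross products and squares.
        s_xy += sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
        s_xx += sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    grand_x = mean([xi for x, _ in groups.values() for xi in x])
    # ANOVA part: group means, shifted to a common covariate value.
    adjusted = {
        label: mean(y) - slope * (mean(x) - grand_x)
        for label, (x, y) in groups.items()
    }
    return slope, adjusted


# Hypothetical covariate values and scores for two groups.
groups = {
    "control": ([1, 2, 3], [2, 3, 4]),
    "treatment": ([3, 4, 5], [6, 7, 8]),
}
slope, adjusted = ancova_adjusted_means(groups)
print(slope, adjusted)
```

In this example the unadjusted group means are 3 and 7, a difference of 4. Once the covariate is equated across groups the adjusted means are 4 and 6, so half of the raw difference is attributed to the groups differing on the covariate.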

Fisher (1932, 1935b) originally developed ANCOVA to increase the precision of experimental analysis, but it is applied most frequently in quasi-experimental research. Unlike experimental research, the topics investigated with quasi-experimental methods are most likely to involve variables that, for practical or ethical reasons, cannot be controlled directly. In these situat...

Table of contents

Cover
Half Title page
Title page
Copyright page
Acknowledgments
Chapter 1: An Introduction to General Linear Models: Regression, Analysis of Variance, and Analysis of Covariance
Chapter 2: Traditional and GLM Approaches to Independent Measures Single Factor ANOVA Designs
Chapter 3: Comparing Experimental Condition Means, Multiple Hypothesis Testing, Type 1 Error, and a Basic Data Analysis Strategy
Chapter 4: Measures of Effect Size and Strength of Association, Power, and Sample Size
Chapter 5: GLM Approaches to Independent Measures Factorial Designs
Chapter 6: GLM Approaches to Related Measures Designs
Chapter 7: The GLM Approach to Factorial Repeated Measures Designs
Chapter 8: GLM Approaches to Factorial Mixed Measures Designs
Chapter 9: The GLM Approach to ANCOVA
Chapter 10: Assumptions Underlying ANOVA, Traditional ANCOVA, and GLMs
Chapter 11: Some Alternatives to Traditional ANCOVA
Chapter 12: Multilevel Analysis for the Single Factor Repeated Measures Design
Appendix A
Appendix B
Appendix C
References
Index
