Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences
eBook - ePub

  1. 736 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android

About this book

This classic text on multiple regression is noted for its nonmathematical, applied, and data-analytic approach. Readers profit from its verbal-conceptual exposition and frequent use of examples.

The applied emphasis provides clear illustrations of the principles and worked examples of the types of applications that are possible. Researchers learn how to specify regression models that directly address their research questions. An overview of the fundamental ideas of multiple regression and a review of bivariate correlation and regression and other elementary statistical concepts provide a strong foundation for understanding the rest of the text. The third edition features an increased emphasis on graphics and the use of confidence intervals and effect size measures, and an accompanying website with data for most of the numerical examples along with the computer code for SPSS, SAS, and SYSTAT, at www.psypress.com/9780805822236.

Applied Multiple Regression serves both as a textbook for graduate students and as a reference tool for researchers in psychology, education, health sciences, communications, business, sociology, political science, anthropology, and economics. An introductory knowledge of statistics is required. Self-standing chapters minimize the need for researchers to refer to previous chapters.

By Jacob Cohen, Patricia Cohen, Stephen G. West, and Leona S. Aiken. Subject area: Psychology; History & Theory in Psychology.

1 Introduction
1.1 MULTIPLE REGRESSION/CORRELATION AS A GENERAL DATA-ANALYTIC SYSTEM
1.1.1 Overview
Multiple regression/correlation analysis (MRC) is a highly general and therefore very flexible data-analytic system. Basic MRC may be used whenever a quantitative variable, the dependent variable (Y), is to be studied as a function of, or in relationship to, any factors of interest, the independent variables (IVs). The broad sweep of this statement is quite intentional.
1. The form of the relationship is not constrained: it may be simple or complex, for example, straight line or curvilinear, general or conditional, or combinations of these possibilities.
2. The nature of the research factors expressed as independent variables is also not constrained. They may be quantitative or qualitative, main effects or interactions in the analysis of variance (ANOVA) sense, or covariates in the analysis of covariance (ANCOVA) sense. They may be correlated with each other or uncorrelated as in balanced factorial designs in ANOVA commonly found in laboratory experiments. They may be naturally occurring (“organismic” variables) like sex, diagnosis, IQ, extroversion, or years of education, or they may be planned experimental manipulations (treatment conditions). In short, virtually any information whose bearing on the dependent variable is of interest may be expressed as research factors.
3. The nature of the dependent variable is also not constrained. Although MRC was originally developed for scaled dependent variables, extensions of the basic model now permit appropriate analysis of the full range of dependent variables including those that are of the form of categories (e.g., ill vs. not ill) or ordered categories.
4. Like all statistical analyses, the basic MRC model makes assumptions about the nature of the data that are being analyzed and is most confidently conducted with “well-behaved” data that meet the underlying assumptions of the basic model. Statistical and graphical methods now part of many statistical packages make it easy for the researcher to determine whether estimates generated by the basic MRC model are likely to be misleading and to take appropriate actions. Extensions of the basic MRC model include appropriate techniques for handling “badly behaved” or missing data and other data problems encountered by researchers.
The MRC system presented in this book has other properties that make it a powerful analytic tool. It yields measures of the magnitude of the total effect of a factor on the dependent variable as well as of its partial (unique, net) relationship, that is, its relationship over and above that of other research factors. It also comes fully equipped with the necessary apparatus for statistical hypothesis testing, estimation, construction of confidence intervals, and power analysis. Graphical techniques allow clear depictions of the data and of the analytic results. Last, but certainly not least, MRC is a major tool in the methods of causal (path, structural equation) analysis. Thus, MRC is a versatile, all-purpose system of analyzing the data over a wide range of sciences and technologies.
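The distinction between a factor's total relationship to Y and its partial (unique, net) relationship can be illustrated with a small simulation. The sketch below is not taken from the book: the data are simulated, the variable names (x1, x2, y) and the helper r_squared are hypothetical, and the "unique" contribution is computed as the increase in R² when x1 is added to a model that already contains x2.

```python
import numpy as np


def r_squared(X, y):
    """Proportion of variance in y accounted for by an OLS fit on X (intercept added)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


# Simulated data (an assumption for illustration): two correlated IVs.
rng = np.random.default_rng(0)
n = 500
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)            # x1 overlaps with x2
y = 0.5 * x1 + 0.5 * x2 + rng.normal(size=n)

r2_x1_alone = r_squared(x1, y)                 # total relationship of x1 to y
r2_x2_alone = r_squared(x2, y)
r2_both = r_squared(np.column_stack([x1, x2]), y)

# Unique contribution of x1: variance accounted for over and above x2.
unique_x1 = r2_both - r2_x2_alone

print(f"R^2, x1 alone:       {r2_x1_alone:.3f}")
print(f"R^2, x1 and x2:      {r2_both:.3f}")
print(f"unique share of x1:  {unique_x1:.3f}")
```

Because x1 and x2 are positively correlated in this simulation and both relate positively to y, part of what x1 accounts for on its own is shared with x2, so its unique share comes out smaller than its total share.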
1.1.2 Testing Hypotheses Using Multiple Regression/Correlation: Some Examples
Multiple regression analysis is broadly applicable to hypotheses generated by researchers in the behavioral sciences, health sciences, education, and business. These hypotheses may come from formal theory, previous research, or simply scientific hunches. Consider the following hypotheses chosen from a variety of research areas:
1. In health sciences, Rahe, Mahan, and Arthur (1970) hypothesized that the amount of major life stress experienced by an individual is positively related to the number of days of illness that person will experience during the following 6 months.
2. In sociology, England, Farkas, Kilbourne, and Dou (1988) predicted that the size of the positive relationship between the number of years of job experience and workers’ salaries would depend on the percentage of female workers in the occupation. Occupations with a higher percentage of female workers were expected to have smaller increases in workers’ salaries than occupations with a smaller percentage of female workers.
3. In educational policy, there is strong interest in comparing the achievement of students who attend public vs. private schools (Coleman, Hoffer, & Kilgore, 1982; Lee & Bryk, 1989). In comparing these two “treatments” it is important to control statistically for a number of background characteristics of the students such as prior academic achievement, IQ, race, and family income.
4. In experimental psychology, Yerkes and Dodson (1908) proposed a classic “law” that performance has an inverted U-shaped relationship to physiological arousal. The point at which maximum performance occurs is determined by the difficulty of the task.
5. In health sciences, Aiken, West, Woodward, and Reno (1994) developed a predictive model of women’s compliance versus noncompliance (a binary outcome) with recommendations for screening mammography. They were interested in the ability of a set of health beliefs (perceived severity of breast cancer, perceived susceptibility to breast cancer, perceived benefits of mammography, perceived barriers to mammography) to predict compliance over and above several other sets of variables: demographics, family medical history, medical input, and prior knowledge.
Each of these hypotheses proposes some form of relationship between one or more factors of interest (independent variables) and an outcome (dependent) variable. There are usually other variables whose effects also need to be considered, for reasons we will be discussing in this text. This book strongly emphasizes the critical role of theory in planning MRC analyses. The researcher’s task is to develop a statistical model that will accurately estimate the relationships among the variables. Then the power of MRC analysis can be brought to bear to test the hypotheses and provide estimations of the size of the effects. However, this task cannot be carried out well if the actual data are not evaluated with regard to the assumptions of the statistical model.
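As one illustration of how such a hypothesis becomes a regression model, the inverted U-shaped relationship in hypothesis 4 can be represented by adding a squared term for arousal. The following is a minimal sketch under stated assumptions: the data are simulated rather than drawn from any study, the variable names are hypothetical, and the data-generating rule (a peak at arousal = 5) is chosen only for illustration.

```python
import numpy as np

# Simulated data (an assumption): performance peaks at arousal = 5.
rng = np.random.default_rng(3)
n = 400
arousal = rng.uniform(0.0, 10.0, size=n)
performance = 10.0 - 0.4 * (arousal - 5.0) ** 2 + rng.normal(scale=1.0, size=n)

# Regression model: performance = b0 + b1 * arousal + b2 * arousal**2
design = np.column_stack([np.ones(n), arousal, arousal ** 2])
coef, *_ = np.linalg.lstsq(design, performance, rcond=None)
b0, b1, b2 = coef

# A negative b2 indicates an inverted U; the fitted curve peaks at -b1 / (2 * b2).
peak_arousal = -b1 / (2.0 * b2)

print(f"b1 = {b1:.3f}, b2 = {b2:.3f}")
print(f"estimated arousal at maximum performance: {peak_arousal:.2f}")
```

The conditional relationship in hypothesis 2 could be sketched the same way by adding a product term (experience multiplied by percentage female) to the design matrix in place of the squared term.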
1.1.3 Multiple Regression/Correlation in Prediction Models
Other applications of MRC exist as well. MRC can be used in practical prediction problems where the goal is to forecast an outcome based on data that were collected earlier. For example, a college admissions committee might be interested in predicting college GPA based on high school grades, college entrance examination (SAT or ACT) scores, and ratings of students by high school teachers. In the absence of prior research or theory, MRC can be used in a purely exploratory fashion to identify a collection of variables that strongly predict an outcome variable. For example, coding of the court records for a large city could identify a number of characteristics of felony court cases (e.g., crime characteristics, defendant demographics, drug involvement, crime location, nature of legal representation) that might predict the length of sentence. MRC can be used to identify a minimum set of variables that yield the best prediction of the criterion for the data that have been collected (A. J. Miller, 1990). Of course, because this method will inevitably capitalize on chance relationships in the original data set, replication in a new sample will be critical. Although we will address purely predictive applications of MRC in this book, our focus will be on the MRC techniques that are most useful in the testing of scientific hypotheses.
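The warning about capitalizing on chance can be made concrete with a small simulation. The sketch below is an illustration under stated assumptions, not a procedure from the book: every candidate predictor is pure noise, the "best" five are picked by their correlation with the outcome in a small derivation sample, and the fitted equation is then applied unchanged to a large replication sample. The sample sizes, the selection rule, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_derivation, n_replication = 60, 2000
n_candidates, n_keep = 30, 5


def draw_sample(n):
    """Candidate predictors and an outcome that are unrelated in the population."""
    X = rng.normal(size=(n, n_candidates))
    y = rng.normal(size=n)
    return X, y


X_dev, y_dev = draw_sample(n_derivation)
X_rep, y_rep = draw_sample(n_replication)

# Exploratory selection: keep the predictors with the largest |r| in the derivation sample.
r = np.array([np.corrcoef(X_dev[:, j], y_dev)[0, 1] for j in range(n_candidates)])
keep = np.argsort(-np.abs(r))[:n_keep]


def design(X):
    """Intercept column plus the selected predictors."""
    return np.column_stack([np.ones(len(X)), X[:, keep]])


# Regression weights are estimated once, in the derivation sample only.
coef, *_ = np.linalg.lstsq(design(X_dev), y_dev, rcond=None)


def r2_with_fixed_weights(X, y):
    """1 - SSE/SST using the derivation-sample weights; can be negative in new data."""
    resid = y - design(X) @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


r2_derivation = r2_with_fixed_weights(X_dev, y_dev)
r2_replication = r2_with_fixed_weights(X_rep, y_rep)

print(f"R^2 in derivation sample:  {r2_derivation:.3f}")
print(f"R^2 in replication sample: {r2_replication:.3f}")
```

Since no predictor is related to the outcome in the population, any apparent predictive power in the derivation sample reflects chance relationships, and it does not carry over to the replication sample.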
In this chapter, we initially consider several issues that are associated with the application of MRC in the behavioral sciences. Some disciplines within the behavioral sciences (e.g., experimental psychology) have had a misperception that MRC is only suitable for nonexperimental research. We consider how this misperception arose historically and note that MRC yields identical statistical tests to those provided by ANOVA yet additionally provides several useful measures of the size of the effect. We also note some of the persisting differences in data-analytic philosophy that are associated with researchers using MRC rather than ANOVA. We then consider how the MRC model nicely matches the complexity and variety of relationships commonly observed in the behavioral sciences. Several independent variables may be expected to influence the dependent variable, the independent variables themselves may be related, the independent variables may take different forms (e.g., rating scales or categorical judgments), and the form of the relationship between the independent and dependent variables may also be complex. Each of these complexities is nicely addressed by the MRC model. Finally, we consider the meaning of causality in the behavioral sciences and the meanings of control. Included in this section is a discussion of how MRC and related techniques can help rule out at least some explanations of the observed relationships. We encourage readers to consider these issues at the beginning of their study of the MRC approach and then to reconsider them at the end.
We then describe the orientation and contents of the book. It is oriented toward practical data analysis problems and so is generally nonmathematical and applied. We strongly encourage readers to work through the solved problems, to take full advantage of the programs for three major computer packages and data sets included with the book, and, most important, to learn MRC by applying these techniques to their own data. Finally, we provide a brief overview of the content of the book, outlining the central questions that are the focus of each chapter.
1.2 A COMPARISON OF MULTIPLE REGRESSION/CORRELATION AND ANALYSIS OF VARIANCE APPROACHES
MRC, ANOVA, and ANCOVA are each special cases of the general linear model in mathematical statistics. The description of MRC in this book includes extensions of conventional MRC analysis to the point where it is essentially equivalent to the general linear model. It thus follows that any data analyzable by ANOVA/ANCOVA may be analyzed by MRC, whereas the reverse is not the case. For example, research designs that study how a scaled characteristic of participants (e.g., IQ) and an experimental manipulation (e.g., structured vs. unstructured tasks) jointly influence the subjects’ responses (e.g., task performance) cannot readily be fit into the ANOVA framework. Even experiments with factorial designs with unequal cell sample sizes present complexities for ANOVA approaches because of the nonindependence of the factors, and standard computer programs now use a regression approach to estimate effects in such cases. The later chapters of the book will extend the basic MRC model still further to include alternative statistical methods of estimating relationships.
1.2.1 Historical Background
Historically, MRC arose in the biological and behavioral sciences around 1900 in the study of the natural covariation of observed characteristics of samples of subjects, including Galton’s studies of the relationship between the heights of fathers and sons and Pearson’s and Yule’s work on educational issues (Yule, 1911). Somewhat later, ANOVA/ANCOVA grew out of the analysis of agricultural data produced by the controlled variation of treatment conditions in manipulative experiments. It is noteworthy that Fisher’s initial statistical work in this area emphasized the multiple regression framework because of its generality (see Tatsuoka, 1993). However, multiple regression was often computationally intractable in the precomputer era: computations that take milliseconds by computer required weeks or even months to do by hand. This led Fisher to develop the computationally simpler, equal (or proportional) sample size ANOVA/ANCOVA model, which is particularly applicable to planned experiments. Thus multiple regression and ANOVA/ANCOVA approaches developed in parallel and, from the perspective of the substantive researchers who used them, largely independently. Indeed, in certain disciplines such as psychology and education, the association of MRC with nonexperimental, observational, and survey research led some scientists to perceive MRC to be less scientifically respectable than ANOVA/ANCOVA, which was associated with experiments.
Close examination suggests that this guilt (or virtue) by association is unwarranted—the result of the confusion of data-analytic method with the logical considerations that govern the inference of causality. Experiments in which different treatments are applied to randomly assigned groups of subjects and there is no loss (attrition) of subjects permit unambiguous inference of causality; the observation of associations among variables in a group of randomly selected subjects does not. Thus, interpretation of a finding of superior early school achievement of children who participate in Head Start programs compared to nonparticipating children depends on the design of the investigation (Shadish, Cook, & Campbell, 2002; West, Biesanz, & Pitts, 2000). For the investigator who randomly assigns children to Head Start versus Control programs, attribution of the effect to program content is straightforward. For the investigator who simply observes whether children whose parents select Head Start programs have higher school achievement than those who do not, causal inference becomes less certain. Many other possible differences (e.g., child IQ; parent education) may exist between the two groups of children that could potentially account for any findings. But each of the investigative teams may analyze their data using either ANOVA (or equivalently a t test of the mean difference in school achievement) or MRC (a simple one-predictor regression analysis of school achievement as a function of Head Start attendance with its identical t test). The logical status of causal inference is a function of how the data were produced, not how they were analyzed (see further discussion in several chapters, especially in Chapter 12).
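The claim that a two-group comparison yields the same t test whether it is run as a test of the mean difference or as a one-predictor regression can be checked numerically. The sketch below uses simulated scores, not Head Start data; the group sizes, means, and variable names are assumptions made for illustration, and group membership is coded 0 (control) and 1 (program).

```python
import numpy as np

# Simulated achievement scores (an assumption for illustration).
rng = np.random.default_rng(2)
control = rng.normal(loc=50.0, scale=10.0, size=40)
program = rng.normal(loc=55.0, scale=10.0, size=35)
n0, n1 = len(control), len(program)

# Route 1: pooled-variance t test of the mean difference (two-group ANOVA case).
pooled_var = ((n0 - 1) * control.var(ddof=1) + (n1 - 1) * program.var(ddof=1)) / (n0 + n1 - 2)
t_mean_difference = (program.mean() - control.mean()) / np.sqrt(pooled_var * (1 / n0 + 1 / n1))

# Route 2: regression of achievement on a 0/1 group code.
y = np.concatenate([control, program])
group = np.concatenate([np.zeros(n0), np.ones(n1)])
X = np.column_stack([np.ones(n0 + n1), group])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef
mse = (resid @ resid) / (n0 + n1 - 2)            # residual variance, df = N - 2
se_slope = np.sqrt(mse * np.linalg.inv(X.T @ X)[1, 1])
t_regression = coef[1] / se_slope

print(f"mean difference:  {program.mean() - control.mean():.4f}")
print(f"regression slope: {coef[1]:.4f}")
print(f"t (mean difference) = {t_mean_difference:.4f}")
print(f"t (regression)      = {t_regression:.4f}")
```

The slope equals the difference between the group means and the two t values agree, which is the sense in which the choice of analysis leaves the statistical test unchanged. Whether the difference can be read causally still depends on how the groups were formed.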
1.2.2 Hypothesis Testing and Effect Sizes
Any relationship we observe, whether between independent variables (treatments) and an outcome in an experiment or between independent variables and a “dependent” variable in an observational study, can be characterized in terms of the strength of the relationship or its effect size (ES). We can ask how much of the total variation in the dependent variable is produced by or associated with the independent variables we are studying. One of the most attractive features of MRC is its automatic provision of regression coefficients, proportion of variance, and correlational measures of various kinds, all of which are kinds of ES measures. We venture the assertion that, despite the preoccupation of the behavioral sciences, the health sciences, education, and business with quantitative methods, the level of consciousness in many areas about strength of observed relationships is at a surprisingly low level. This is because concern about the statistical significance of effects has tended to pre-empt attention to their magnitude (Harlow, Mulaik, & Steiger, 1997). Statistical significance only provides information about whether the relationship exists at all...

Table of contents

  1. Cover Page
  2. Half Title Page
  3. Title Page
  4. Copyright Page
  5. Dedication
  6. Table of Contents
  7. Preface
  8. Chapter 1: Introduction
  9. Chapter 2: Bivariate Correlation and Regression
  10. Chapter 3: Multiple Regression/Correlation With Two or More Independent Variables
  11. Chapter 4: Data Visualization, Exploration, and Assumption Checking: Diagnosing and Solving Regression Problems I
  12. Chapter 5: Data-Analytic Strategies Using Multiple Regression/Correlation
  13. Chapter 6: Quantitative Scales, Curvilinear Relationships, and Transformations
  14. Chapter 7: Interactions Among Continuous Variables
  15. Chapter 8: Categorical or Nominal Independent Variables
  16. Chapter 9: Interactions With Categorical Variables
  17. Chapter 10: Outliers and Multicollinearity: Diagnosing and Solving Regression Problems II
  18. Chapter 11: Missing Data
  19. Chapter 12: Multiple Regression/Correlation and Causal Models
  20. Chapter 13: Alternative Regression Models: Logistic, Poisson Regression, and the Generalized Linear Model
  21. Chapter 14: Random Coefficient Regression and Multilevel Models
  22. Chapter 15: Longitudinal Regression Methods
  23. Chapter 16: Multiple Dependent Variables: Set Correlation
  24. APPENDICES
  25. References
  26. Glossary
  27. Statistical Symbols and Abbreviations
  28. Author Index
  29. Subject Index