Beyond Multiple Linear Regression
eBook - ePub

Beyond Multiple Linear Regression

Applied Generalized Linear Models And Multilevel Models in R

Paul Roback, Julie Legler

Share book
  1. 418 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android
eBook - ePub

Beyond Multiple Linear Regression

Applied Generalized Linear Models And Multilevel Models in R

Paul Roback, Julie Legler

Book details
Book preview
Table of contents

About This Book

Beyond Multiple Linear Regression: Applied Generalized Linear Models and Multilevel Models in R is designed for undergraduate students who have successfully completed a multiple linear regression course, helping them develop an expanded modeling toolkit that includes non-normal responses and correlated structure. Even though there is no mathematical prerequisite, the authors still introduce fairly sophisticated topics such as likelihood theory, zero-inflated Poisson, and parametric bootstrapping in an intuitive and applied manner. The case studies and exercises feature real data and real research questions; thus, most of the data in the textbook comes from collaborative research conducted by the authors and their students, or from student projects. Every chapter features a variety of conceptual exercises, guided exercises, and open-ended exercises using real data. After working through this material, students will develop an expanded toolkit and a greater appreciation for the wider world of data and statistical modeling.

A solutions manual for all exercises is available to qualified instructors at the book's website at, and data sets and Rmd files for all case studies and exercises are available at the authors' GitHub repo (

Frequently asked questions

How do I cancel my subscription?
Simply head over to the account section in settings and click on “Cancel Subscription” - it’s as simple as that. After you cancel, your membership will stay active for the remainder of the time you’ve paid for. Learn more here.
Can/how do I download books?
At the moment all of our mobile-responsive ePub books are available to download via the app. Most of our PDFs are also available to download and we're working on making the final remaining ones downloadable now. Learn more here.
What is the difference between the pricing plans?
Both plans give you full access to the library and all of Perlego’s features. The only differences are the price and subscription period: With the annual plan you’ll save around 30% compared to 12 months on the monthly plan.
What is Perlego?
We are an online textbook subscription service, where you can get access to an entire online library for less than the price of a single book per month. With over 1 million books across 1000+ topics, we’ve got you covered! Learn more here.
Do you support text-to-speech?
Look out for the read-aloud symbol on your next book to see if you can listen to it. The read-aloud tool reads text aloud for you, highlighting the text as it is being read. You can pause it, speed it up and slow it down. Learn more here.
Is Beyond Multiple Linear Regression an online PDF/ePUB?
Yes, you can access Beyond Multiple Linear Regression by Paul Roback, Julie Legler in PDF and/or ePUB format, as well as other popular books in Mathématiques & Programmation linéaire et non linéaire. We have over one million books available in our catalogue for you to explore.



Review of Multiple Linear Regression

1.1 Learning Objectives

After finishing this chapter, you should be able to:
  • Identify cases where linear least squares regression (LLSR) assumptions are violated.
  • Generate exploratory data analysis (EDA) plots and summary statistics.
  • Use residual diagnostics to examine LLSR assumptions.
  • Interpret parameters and associated tests and intervals from multiple regression models.
  • Understand the basic ideas behind bootstrapped confidence intervals.

1.2 Introduction to Beyond Multiple Linear Regression

Ecologists count species, criminologists count arrests, and cancer specialists count cases. Political scientists seek to explain who is a Democrat, pre-med students are curious about who gets into medical school, and sociologists study which people get tattoos. In the first case, ecologists, criminologists and cancer specialists are concerned about outcomes which are counts. The political scientists’, pre-med students’ and sociologists’ interest centers on binary responses: Democrat or not, accepted or not, and tattooed or not. We can model these non-Gaussian (non-normal) responses in a more natural way by fitting generalized linear models (GLMs) as opposed to using linear least squares regression (LLSR) models.
When models are fit to data using linear least squares regression (LLSR), inferences are possible using traditional statistical theory under certain conditions: if we can assume that there is a linear relationship between the response (Y) and an explanatory variable (X), the observations are independent of one another, the responses are approximately normal for each level of the X, and the variation in the responses is the same for each level of X. If we intend to make inferences using GLMs, necessary assumptions are different. First, we will not be constrained by the normality assumption. When conditions are met, GLMs can accommodate non-normal responses such as the counts and binary data in our preceding examples. While the observations must still be independent of one another, the variance in Y at each level of X need not be equal nor does the assumption of linearity between Y and X need to be plausible.
However, GLMs cannot be used for models in the following circumstances: medical researchers collect data on patients in clinical trials weekly for 6 months; rat dams are injected with teratogenic substances and their offspring are monitored for defects; and, musicians’ performance anxiety is recorded for several performances. Each of these examples involves correlated data: the same patient’s outcomes are more likely to be similar from week-to-week than outcomes from different patients; litter mates are more likely to suffer defects at similar rates in contrast to unrelated rat pups; and, a musician’s anxiety is more similar from performance to performance than it is with other musicians. Each of these examples violate the independence assumption of simpler linear models for LLSR or GLM inference.
The Generalized Linear Models in the book’s title extends least squares methods you may have seen in linear regression to handle responses that are non-normal. The Multilevel Models in the book’s title will allow us to create models for situations where the observations are not independent of one another. Overall, these approaches will permit us to get much more out of data and may be more faithful to the actual data structure than models based on ordinary least squares. These models will allow you to expand beyond multiple linear regression.
In order to understand the motivation for handling violations of assumptions, it is helpful to be able to recognize the model assumptions for inference with LLSR in the context of different studies. While linearity is sufficient for fitting an LLSR model, in order to make inferences and predictions the observations must also be independent, the responses should be approximately normal at each level of the predictors, and the standard deviation of the responses at each level of the predictors should be approximately equal. After examining circumstances where inference with LLSR is appropriate, we will look for violations of these assumptions in other sets of circumstances. These are settings where we may be able to use the methods of this text. We’ve kept the examples in the exposition simple to fix ideas. There are exercises which describe more realistic and complex studies.

1.3 Assumptions for Linear Least Squares Regression

Recall that making inferences or predictions with models fit using linear least squares regression requires that the following assumptions be tenable. The acronym LINE can be used to recall the assumptions requ...

Table of contents