Generalized Linear Models
  1. 532 pages
  2. English

About this book

The success of the first edition of Generalized Linear Models led to the updated Second Edition, which continues to provide a definitive, unified treatment of methods for the analysis of diverse types of data. Today, it remains popular for its clarity, richness of content and direct relevance to agricultural, biological, health, engineering, and ot

Generalized Linear Models, by P. McCullagh, John A. Nelder, D.R. Cox, N. Reid, Valerie Isham, R.J. Tibshirani, Thomas A. Louis, Howell Tong, and Niels Keiding, is available in PDF and/or ePUB format and is catalogued under Mathematics, Probability & Statistics.

Information

CHAPTER 1
Introduction
1.1 Background
In this book we consider a class of statistical models that is a natural generalization of classical linear models. Generalized linear models include, as special cases, linear regression and analysis-of-variance models, logit and probit models for quantal responses, log-linear models and multinomial response models for counts, and some commonly used models for survival data. It is shown that the above models share a number of properties, such as linearity, that can be exploited to good effect, and that there is a common method for computing parameter estimates. These common properties enable us to study generalized linear models as a single class, rather than as an unrelated collection of special topics.
Classical linear models and least squares began with the work of Gauss and Legendre (Stigler, 1981, 1986), who applied the method to astronomical data. Their data were usually measurements of continuous quantities such as the positions and magnitudes of the heavenly bodies and, at least in the astronomical investigations, the variability in the observations was largely the effect of measurement error. The Normal, or Gaussian, distribution was viewed as a mathematical construct developed to describe the properties of such errors; later in the nineteenth century the same distribution was used to describe the variation between individuals in a biological population in respect of a character such as height, an application quite different in kind from its use for describing measurement error, and leading to the numerous biological applications of linear models.
Gauss introduced the Normal distribution of errors as a device for describing variability, but he showed that many of the important properties of least-squares estimates depend not on Normality but on the assumptions of constant variance and independence. A closely related property applies to all generalized linear models. In other words, although we make reference at various points to standard distributions such as the Normal, binomial, Poisson, exponential or gamma, the second-order properties of the parameter estimates are insensitive to the assumed distributional form: the second-order properties depend mainly on the assumed variance-to-mean relationship and on uncorrelatedness or independence. This is fortunate because, in applications, one can rarely be confident that all aspects of the assumed distributional form are correct.
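To make the phrase "variance-to-mean relationship" concrete, the standard relationships for the distributions just named can be written side by side. This is a summary sketch rather than a quotation from the text: the symbols are assumptions introduced here, with $\mu = E(Y)$ the mean, $\sigma^2$ a constant variance, $m$ the binomial index (number of trials, with $Y$ the count), and $\nu$ the gamma shape parameter (the exponential distribution is the case $\nu = 1$).

```latex
\begin{align*}
\text{Normal:}   &\quad \operatorname{var}(Y) = \sigma^2 && \text{(does not depend on } \mu\text{)} \\
\text{Poisson:}  &\quad \operatorname{var}(Y) = \mu \\
\text{binomial:} &\quad \operatorname{var}(Y) = \mu\,(1 - \mu/m) \\
\text{gamma:}    &\quad \operatorname{var}(Y) = \mu^2/\nu
\end{align*}
```

In each case the variance is determined by the mean, up to a constant, which is the sense in which the second-order properties can be discussed without committing to every detail of the distributional form.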
Another strand in the history of statistics is the development of methods for dealing with discrete events rather than with continuously varying quantities. The enumeration of probabilities for various configurations in games of cards and dice was a matter of keen interest for gamblers in the eighteenth century. From their pioneering work grew methods for dealing with data in the form of counts of events. In the context of rare events, the basic distribution is that named after Poisson. This distribution has been applied to diverse kinds of events: a famous example concerns unfortunate soldiers kicked to death by Prussian horses (Bortkewitsch, 1898). The annual number of such incidents during the period 1875–1894 was observed to be consistent with the Poisson distribution having mean about 0.7 per corps per year. There is, however, some variation in this figure between corps and between years. Routine laboratory applications of the Poisson model include the monitoring of radioactive tracers by emission counts, counts of infective organisms as measured by the number of events observed on a slide under a microscope, and so on.
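As a small worked illustration, the probabilities implied by a Poisson distribution with mean 0.7 can be computed directly from the formula $P(Y = k) = e^{-\mu}\mu^k / k!$. The sketch below uses only the mean quoted above; it does not reproduce the observed counts, and the function name `poisson_pmf` is chosen here for illustration.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Poisson probability of observing exactly k events when the mean is `mean`."""
    return math.exp(-mean) * mean**k / math.factorial(k)


mean = 0.7  # approximate mean per corps per year quoted in the text

# Probabilities of 0, 1, 2, 3 and 4 events in a corps-year.
probs = [poisson_pmf(k, mean) for k in range(5)]
for k, p in enumerate(probs):
    print(f"P(Y = {k}) = {p:.4f}")

# Whatever remains is the probability of five or more events.
print(f"P(Y >= 5) = {1 - sum(probs):.4f}")
```

Under this model roughly half of all corps-years would have no such incident, and the variance of the annual count equals its mean.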
Closely related to the Poisson model are models for the analysis of counted data in the form of proportions or ratios of counts. The Bernoulli distribution is often suitable for modelling the presence or absence of disease in a patient, and the derived binomial distribution may be suitable as a model for the number of diseased patients in a fixed pool of patients under study. In medical and pharmaceutical trials it is usually required to study not primarily the incidence of a particular disease, but how the incidence is affected by factors such as age, social class, housing conditions, exposure to pollutants, and any treatment procedures under study. Generalized linear models permit us to study patterns of systematic variation in much the same way as ordinary linear models are used to study the joint effects of treatments and covariates.
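The following sketch shows how a binomial count might be tied to a covariate. It assumes the logit form mentioned in Section 1.1 as one possible choice; the pool size, the intercept and the age coefficient are hypothetical numbers chosen purely for illustration, not estimates from any data.

```python
import math


def binomial_pmf(y: int, m: int, p: float) -> float:
    """Probability of exactly y diseased patients in a pool of m,
    when each patient is independently diseased with probability p."""
    return math.comb(m, y) * p**y * (1 - p) ** (m - y)


def inverse_logit(eta: float) -> float:
    """Map a linear predictor eta on the whole real line to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


m = 20                          # hypothetical pool size
intercept, slope = -3.0, 0.04   # hypothetical coefficients (slope is per year of age)

for age in (30, 50, 70):
    p = inverse_logit(intercept + slope * age)   # incidence at this age
    none_diseased = binomial_pmf(0, m, p)        # chance the pool shows no cases
    print(f"age {age}: incidence {p:.3f}, expected cases {m * p:.2f}, "
          f"P(no cases) {none_diseased:.3f}")
```

The systematic part of the model is the linear expression in age; the binomial distribution describes the variation of the observed count about the expected number of cases.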
Some continuous measurements encountered in practice have non-Normal error distributions, and the class of generalized linear models includes distributions useful for the analysis of such data. The simplest examples are perhaps the exponential and gamma distributions, which are often useful for modelling positive data that have positively skewed distributions, such as occur in studies of survival times.
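A quick simulation can show what "positive data with a positively skewed distribution" looks like in practice. The sketch below draws from a gamma distribution using the standard library; the shape, scale, sample size and seed are arbitrary assumptions, and the values are simulated rather than real survival times.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is reproducible

shape, scale = 2.0, 1.5  # hypothetical gamma parameters; mean = shape * scale = 3.0
sample = [rng.gammavariate(shape, scale) for _ in range(10_000)]

sample_mean = statistics.mean(sample)
sample_median = statistics.median(sample)

print(f"smallest value: {min(sample):.4f}")   # every value is positive
print(f"median:         {sample_median:.3f}")
print(f"mean:           {sample_mean:.3f}")   # pulled above the median by the long right tail
```

A mean that sits above the median is a simple symptom of positive skewness, and it is one reason a Normal error model is often unsuitable for such data.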
Before looking in more detail at the history of individual instances of generalized linear models, we make some general comments about statistical models and the part they play in the analysis of data, whether experimental or observational.
1.1.1 The problem of looking at data
Suppose we have a number of measurements or counts, together with some associated structural or contextual information, such as the order in which the data were collected, which measuring instruments were used, and other differences in the conditions under which the individual measurements were made. To interpret such data, we search for a pattern, for example that one measuring instrument has produced consistently higher readings than another. Such systematic effects are likely to be blurred by other variation of a more haphazard nature. The latter variation is usually described in statistical terms, no attempt being made to model or to predict the actual haphazard contribution to each observation.
Statistical models contain both elements, which we will call systematic effects and random effects. The value of a model is that often it suggests a simple summary of the data in terms of the major systematic effects together with a summary of the nature and magnitude of the unexplained or random variation. Such a reduction is certainly helpful, for the human mind, while it may be able to encompass say 10 numbers easily enough, finds 100 much more difficult, and will be quite defeated by 1000 unless some reducing process takes place.
Thus the problem of looking intelligently at data demands the formulation of patterns that are thought capable of describing succinctly not only the systematic variation in the data under study, but also the patterns in similar data that might be collected by another investigator at another time and in another place.
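The reduction described in this section can be sketched with simulated readings from two measuring instruments. Everything numerical here is an assumption made for illustration: the true value, the offset of instrument B, the noise level, the sample size and the seed. The point is only that 100 readings reduce to two instrument means (the systematic effects) and one residual standard deviation (the random variation).

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is reproducible

true_value = 10.0
offset = {"A": 0.0, "B": 0.5}   # hypothetical systematic effect: B reads higher
noise_sd = 0.3                  # hypothetical size of the haphazard variation

# 50 readings per instrument, 100 numbers in all.
readings = [
    (inst, true_value + offset[inst] + rng.gauss(0.0, noise_sd))
    for inst in ("A", "B")
    for _ in range(50)
]

# Systematic part: one mean per instrument.
means = {
    inst: statistics.mean(value for i, value in readings if i == inst)
    for inst in offset
}

# Random part: what is left after removing each instrument's mean,
# summarized by a pooled standard deviation.
residuals = [value - means[inst] for inst, value in readings]
degrees_of_freedom = len(readings) - len(means)
residual_sd = math.sqrt(sum(r * r for r in residuals) / degrees_of_freedom)

print(f"mean of A: {means['A']:.3f}")
print(f"mean of B: {means['B']:.3f}")
print(f"residual standard deviation: {residual_sd:.3f}")
```

No attempt is made to predict any individual residual; only its typical magnitude is reported, in keeping with the description of random variation above.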
1.1.2 Theor...

Table of contents

  1. Cover
  2. Title Page
  3. Copyright Page
  4. Dedication
  5. Table of Contents
  6. Preface to the first edition
  7. Preface
  8. 1 Introduction
  9. 2 An outline of generalized linear models
  10. 3 Models for continuous data with constant variance
  11. 4 Binary data
  12. 5 Models for polytomous data
  13. 6 Log-linear models
  14. 7 Conditional likelihoods*
  15. 8 Models with constant coefficient of variation
  16. 9 Quasi-likelihood functions
  17. 10 Joint modelling of mean and dispersion
  18. 11 Models with additional non-linear parameters
  19. 12 Model checking
  20. 13 Models for survival data
  21. 14 Components of dispersion
  22. 15 Further topics
  23. Appendices
  24. References
  25. Index of data sets
  26. Author index
  27. Subject index