1 Introduction
1.1 Multilevel analysis
Multilevel analysis is a methodology for the analysis of data with complex patterns of variability, with a focus on nested sources of such variability – pupils in classes, employees in firms, suspects tried by judges in courts, animals in litters, longitudinal measurements of subjects, etc. In the analysis of such data, it is usually illuminating to take account of the fact that each level of nesting is associated with variability that has a distinct interpretation. There is variability, for example, between pupils but also between classes, and one may draw incorrect conclusions if no distinction is made between these different sources of variability. Multilevel analysis is an approach to the analysis of such data, including the statistical techniques as well as the methodology for their use. The term ‘multilevel analysis’ is used mainly in the social sciences (in the wide sense: sociology, education, psychology, economics, criminology, etc.), but also in other fields such as the biomedical sciences. Our focus will be on the social sciences. Other terms, referring to the technical aspects, are hierarchical linear models, mixed models, and random coefficient models.
In its present form, multilevel analysis is a stream which has two tributaries: contextual analysis and mixed effects models. Contextual analysis is a development in the social sciences which has focused on the effects of the social context on individual behavior. Some landmarks before 1980 are the paper by Robinson (1950) which discussed the ecological fallacy (which refers to confusion between aggregate and individual effects), the paper by Davis et al. (1961) on the distinction between within-group and between-group regression, the volume edited by Dogan and Rokkan (1969), and the paper by Burstein et al. (1978) on treating regression intercepts and slopes on one level as outcomes on the higher level.
Mixed effects models are statistical models in the analysis of variance (ANOVA) and in regression analysis where it is assumed that some of the coefficients are fixed and others are random. This subject is too vast for us to single out landmarks here. A standard reference book on random effects models and mixed effects models is Searle et al. (1992), Chapter 2 of which gives an extensive historical overview. The name ‘mixed model’ seems to have been used first by Eisenhart (1947).
Contextual modeling until about 1980 focused on the definition of appropriate variables to be used in ordinary least squares regression analysis. Until the 1980s the main focus in the development of statistical procedures for mixed models was on random effects (i.e., random differences between categories in some classification system) more than on random coefficients (i.e., random effects of numerical variables). Multilevel analysis as we now know it was formed by these two streams coming together. It was realized that, in contextual modeling, the individual and the context are distinct sources of variability, which should both be modeled as random influences. At the same time, statistical methods and algorithms were developed that allowed the practical use of regression-type models with nested random coefficients. A cascade of statistical papers followed: Aitkin et al. (1981), Laird and Ware (1982), Mason et al. (1983), Goldstein (1986), Aitkin and Longford (1986), Raudenbush and Bryk (1986), de Leeuw and Kreft (1986), and Longford (1987) proposed and developed techniques for estimating mixed models with nested random coefficients. These techniques, together with the programs implementing them which were developed by several of these researchers or under their supervision, made it possible to work in practice with models of which, until then, only special cases had been accessible. By 1986 the basis of multilevel analysis was established. The first textbook appeared (by Goldstein, now in its fourth edition) and was followed by a few others in the 1990s and many more in the 2000s. The methodology has been further elaborated since then, and has proved quite fruitful in applications in many fields. On the organizational side, there are stimulating centers such as the ‘Multilevel Models Project’ in Bristol with its Newsletter and its website http://www.mlwin.com/, and there is an active internet discussion group at http://www.jiscmail.ac.uk/lists/multilevel.html.
In the biomedical sciences mixed models were proposed especially for longitudinal data; in economics they were proposed mainly for panel data (Swamy, 1971), the most common type of longitudinal data in that field. One of the issues treated in the economics literature was the pooling of cross-sectional and time series data (e.g., Maddala, 1971; Hausman and Taylor, 1981), which is closely related to the difference between within-group and between-group regressions. Overviews are given by Chow (1984) and Baltagi (2008).
A more elaborate history of multilevel analysis is presented in the bibliographical sections of Longford (1993) and in Kreft and de Leeuw (1998). For an extensive bibliography of the older literature, see Hüttner and van den Eeden (1995). A more recent overview of much statistical work in this area can be found in the handbook by de Leeuw and Meijer (2008a).
1.1.1 Probability models
The main statistical model of multilevel analysis is the hierarchical linear model, an extension of multiple linear regression to a model that includes nested random coefficients. This model is explained in Chapter 5 and forms the basis of most of this book.
There are several ways to argue why it makes sense to use a probability model for data analysis. In sampling theory a distinction is made between design-based inference and model-based inference; this distinction is discussed further in Chapter 14. Design-based inference means that the researcher draws a probability sample from some finite population and wishes to make inferences from the sample to this finite population; the probability model then follows from how the researcher draws the sample. Model-based inference means that the researcher postulates a probability model, usually aiming at inference to some large and sometimes hypothetical population such as all English primary school pupils in the 2000s or all human adults living in a present-day industrialized culture. If the probability model is adequate then so are the inferences based on it, but checking this adequacy is possible only to a limited extent.
It is possible to apply model-based inference to data collected by investigating some entire research population, such as all 12-year-old pupils in Amsterdam at a given moment. Sometimes the question arises as to why one should use a probability model if no sample is drawn but an entire population is observed. Using a probability model that assumes statistical variability, even though an entire research population was investigated, can be justified by realizing that conclusions are sought which apply not only to the investigated research population but also to a wider population: pupils who are older or younger, in other towns, perhaps in other countries. The investigated research population is assumed to be representative of this wider population, which is called a superpopulation in Chapter 14, where the relation between model-based and design-based inference is further discussed. Applicability of the statistical inference to such a wider population is not automatic, but has to be carefully argued by considering whether the research population may indeed be considered representative of the larger (often vaguely outlined) population. This is the ‘second span of the bridge of statistical inference’ discussed by Cornfield and Tukey (1956). The inference then is not primarily about a given delimited set of individuals but about social, behavioral, biological, etc., mechanisms and processes. The random effects, or residuals, playing a role in such probability models can be regarded as resulting from factors that are not included in the explanatory variables used; they reflect the approximate nature of the model. The model-based inference will be adequate to the extent that the assumptions of the probability model adequately reflect the effects that are not explicitly included by means of observed variables.
As we shall see in Chapters 3–5, the basic idea of multilevel analysis is that data sets with a nesting structure that includes unexplained variability at each level of nesting, such as pupils in classes or employees in firms, are usually not adequately represented by the probability model of multiple linear regression analysis, but are often adequately represented by the hierarchical linear model. Thus, the use of the hierarchical linear model in multilevel analysis is in the tradition of model-based inference.
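To fix ideas, here is this contrast in its simplest form, in generic notation (the book's own notation is developed in Chapters 4 and 5). For pupil $i$ in class $j$ with one explanatory variable $x_{ij}$, multiple linear regression assumes
\[
Y_{ij} = \beta_0 + \beta_1 x_{ij} + R_{ij},
\]
with all residuals $R_{ij}$ independent. The random intercept model, the simplest hierarchical linear model, adds a class-level random effect $U_j$:
\[
Y_{ij} = \beta_0 + \beta_1 x_{ij} + U_j + R_{ij}, \qquad U_j \sim N(0, \tau^2), \quad R_{ij} \sim N(0, \sigma^2).
\]
Pupils in the same class share the deviation $U_j$, so their outcomes are correlated, and the variance components $\tau^2$ and $\sigma^2$ represent the between-class and within-class variability, respectively.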
1.2 This book
This book is intended as an introductory textbook and as a reference book for practical users of multilevel analysis. We have tried to include all the main points that come up when applying multilevel analysis. Most of the data sets used in the examples, and corresponding commands to run the examples in various computer programs (see Chapter 18), are available on the website http://www.stats.ox.ac.uk/~snijders/mlbook.htm.
After this introductory chapter, the book proceeds with a conceptual chapter about multilevel questions and a chapter on ways to treat multilevel data that are not based on the hierarchical linear model. Chapters 4–6 treat the basic conceptual ideas of the hierarchical linear model, and how to work with it in practice. Chapter 4 introduces the random intercept model as the primary example of the hierarchical linear model. This is extended in Chapter 5 to random slope models. Chapters 4 and 5 focus on understanding the hierarchical linear model and its parameters, paying only very limited attention to procedures and algorithms for parameter estimation (estimation being work that most researchers delegate to the computer). Chapter 6 is concerned with testing parameters and specifying a multilevel model.
An introductory course on multilevel analysis could cover Chapters 1–6 and Section 7.1, with selected material from other chapters. A minimal course would focus on Chapters 4–6. The later chapters cover topics that are more specialized or more advanced, but important in the practice of multilevel analysis.
The text of this book is not based on a particular computer program for multilevel analysis. The final chapter, Chapter 18, gives a brief review of computer programs that can be used for multilevel analysis.
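As a small illustration of what such programs do (a minimal sketch, not one of the book's examples; the file name and variable names are hypothetical), the random intercept model of Section 1.1.1 can be fitted in Python with the statsmodels package:

    # Minimal sketch: fitting a random intercept model in Python.
    # 'pupils.csv' and its columns 'score', 'ses', and 'class_id' are
    # hypothetical stand-ins for a data set of pupils nested in classes.
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("pupils.csv")

    # Fixed effect of the pupil-level variable 'ses'; the 'groups'
    # argument adds a random intercept for each class.
    model = smf.mixedlm("score ~ ses", data=df, groups=df["class_id"])
    result = model.fit()

    # The summary reports the fixed coefficients together with the
    # between-class ('Group Var') and within-class variance estimates.
    print(result.summary())

Commands of this kind for the programs reviewed in Chapter 18, applied to the book's data sets, are available on the website mentioned above.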
Chapters 7 (on the explanatory power of the model) and 10 (on model assumptions) are important for the interpretation of the results of statistical analyses using the hierarchical linear model. Researchers whose data sets have many missing values, or who plan data collection that runs this risk, will profit from reading Chapter 9. Chapter 11 helps the researcher in setting up a multilevel study.