Multilevel Modeling of Categorical Outcomes Using IBM SPSS

Name: Multilevel Modeling of Categorical Outcomes Using IBM SPSS
Author: Ronald H Heck, Scott Thomas, Lynn Tabata

456 pages
English

About This Book

This is the first workbook that introduces the multilevel approach to modeling with categorical outcomes using IBM SPSS Version 20. Readers learn how to develop, estimate, and interpret multilevel models with categorical outcomes. The authors walk readers through data management, diagnostic tools, model conceptualization, and model specification issues related to single-level and multilevel models with categorical outcomes. Screen shots clearly demonstrate techniques and navigation of the program. Modeling syntax is provided in the appendix. Examples of various types of categorical outcomes demonstrate how to set up each model and interpret the output. Extended examples illustrate the logic of model development, interpretation of output, the context of the research questions, and the steps around which the analyses are structured. Readers can replicate examples in each chapter by using the corresponding data and syntax files available at www.psypress.com/9781848729568.

The book opens with a review of multilevel models with categorical outcomes, followed by a chapter on IBM SPSS data management techniques to facilitate working with multilevel and longitudinal data sets. Chapters 3 and 4 detail the basics of the single-level and multilevel generalized linear model for various types of categorical outcomes. These chapters review underlying concepts to assist with troubleshooting common programming and modeling problems. Next, population-average and unit-specific longitudinal models for investigating individual or organizational developmental processes are developed. Chapter 6 focuses on single- and multilevel models using multinomial and ordinal data, followed by a chapter on models for count data. The book concludes with additional troubleshooting techniques and tips for expanding on the modeling techniques introduced.

Ideal as a supplement for graduate-level courses and/or professional workshops on multilevel, longitudinal, latent variable modeling, multivariate statistics, and/or advanced quantitative techniques taught in psychology, business, education, health, and sociology, this practical workbook also appeals to researchers in these fields. An excellent follow-up to the authors' highly successful Multilevel and Longitudinal Modeling with IBM SPSS and Introduction to Multilevel Modeling Techniques, 2nd Edition, this book can also be used with any multilevel and/or longitudinal book or as a stand-alone text introducing multilevel modeling with categorical outcomes.

Information

Publisher

Routledge

Year

2013

ISBN

9781136672347

Edition

Topic

Psychology

Subtopic

Research and Methodology in Psychology

CHAPTER 1

Introduction to Multilevel Models with Categorical Outcomes

Introduction

Social science research presents an opportunity to study phenomena that are multilevel, or hierarchical, in nature. Examples include college students nested in institutions within states or elementary-aged students nested in classrooms within schools. Attempting to understand individuals’ behavior or attitudes in the absence of group contexts known to influence those behaviors or attitudes can severely handicap researchers’ ability to explicate the underlying structures or processes of interest. People within particular organizations may share certain properties, including socialization patterns, traditions, attitudes, and goals.

Multilevel modeling (MLM) is an attractive approach for studying the relationships between individuals and their various social groups because it allows the incorporation of substantive theory about individual and group processes into the sampling schemes of many research studies (e.g., multistage stratified samples, repeated measures designs) or into hierarchical data structures found in many existing data sets encountered in social science, management, and health-related research (Heck, Thomas, & Tabata, 2010). MLM is fast becoming the standard analytic approach for examining data and publishing results in many fields due to its adaptability to a broad range of designs (e.g., experiments, quasi-experiments, surveys), data structures (e.g., nested, cross-classified, cross-sectional, and longitudinal data), and outcomes (continuous, categorical). Despite this applicability to many research problems, however, MLM procedures have not yet been fully integrated into research and statistics texts used in typical graduate courses.

Two major obstacles are responsible for this reality. First, no standard language has emerged from this multilevel empirical work in terms of theories, model specification, and procedures of investigation. MLM is referred to by a variety of different names, including random-coefficient, mixed-effect, hierarchical linear, and multilevel regression models. The diversity of names reflects methodological development in several different fields, which has led to differences in the manner in which the methods and analytic software are used in various fields. In general, multilevel models deal with nested data—that is, where observations are clustered within successive levels of a data hierarchy.

Second, until recently, the specification of multilevel models with continuous and categorical outcomes required special software programs such as HLM (Raudenbush, Bryk, Cheong, & Congdon, 2004), LISREL (du Toit & du Toit, 2001), MLwiN (Rasbash, Steele, Browne, & Goldstein, 2009), and Mplus (Muthén & Muthén, 1998–2006). Although the mainstream emergence and acceptance of multilevel methods over the past two decades has been largely due to the development of specialized software by a relatively small group of scholars, other more widely used statistical packages, including IBM SPSS, SAS, and Stata, have in recent years implemented routines that enable the development and specification of a wide variety of multilevel and longitudinal models (see Albright & Marinova, 2010, for an overview of each package).

In IBM SPSS, the multilevel analytic routine is referred to as MIXED, which indicates a class of models that incorporates both fixed and random effects. As such, mixed models imply the existence of data in which individual observations on an outcome are distributed (or vary) across identifiable groups. Repeated observations may also be distributed across individuals and groups. The variance parameter of the random effect indicates its distribution in the population and therefore describes the degree of heterogeneity (Hedeker, 2005). The MIXED routine is a component of the advanced statistics add-on module for the PC and the Mac, which can be used to estimate a wide variety of multilevel models with diverse research designs (e.g., experimental, quasi-experimental, nonexperimental) and data structures. It is differentiated from more familiar linear models (e.g., analysis of variance, multiple regression) through its capability of examining correlated data and unequal variances within groups. Such data are commonly encountered when individuals are nested in social groups or when there are repeated measures (e.g., several test scores) nested within individuals. Because these data structures are hierarchical, people within successive groupings may share similarities that must be considered in the analysis in order to provide correct estimation of the model parameters (e.g., coefficients, standard errors).
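As a rough illustration of what "fixed and random effects" means here, the simplest mixed model is a two-level random-intercept model with no predictors. The notation below is an assumption adopted for this sketch (it follows a common multilevel convention and is not taken from this passage): individual i is nested in group j, the intercept is the fixed effect, and the group deviation is the random effect whose variance describes the heterogeneity across groups.

```latex
\begin{aligned}
Y_{ij} &= \gamma_{00} + u_{0j} + \varepsilon_{ij}, \\
u_{0j} &\sim N(0, \tau_{00}), \qquad \varepsilon_{ij} \sim N(0, \sigma^{2}), \\
\rho &= \frac{\tau_{00}}{\tau_{00} + \sigma^{2}}
\end{aligned}
```

Under these assumptions, the intraclass correlation \rho is the share of outcome variance that lies between groups. When \rho is well above zero, observations within a group are correlated, which is the situation the paragraph above describes.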

If the analysis is conducted only on the number of individuals in the study, the effects of group-level variables (e.g., organizational size, productivity, type of organization) may be overvalued in terms of their contribution to explaining the outcome. This is because there will typically be many more individuals than groups in a study, so the effects of group variables on the outcome may appear much stronger than they really are. If, instead, we aggregate the data from individuals and conduct the analysis between the groups, we will lose all of the variability among individuals within their groups. The better solution to these unit-of-analysis problems is to take both the number of groups and the number of individuals into account in the analysis. When the research design is multilevel, whether balanced or unbalanced (i.e., with different numbers of individuals within groups), the estimation procedures in MIXED will provide asymptotically efficient estimates of the model’s structural parameters and variance components. In short, the MIXED routine provides an effective way to specify models at two or more levels in a data hierarchy.
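One way to get a feel for the unit-of-analysis problem is the design effect used in survey sampling, which approximates how much clustering inflates sampling variance relative to a simple random sample. The sketch below is not part of the authors' method; it assumes equal group sizes and a known intraclass correlation, and the function names and example numbers are hypothetical.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Approximate variance inflation from clustering: 1 + (n - 1) * ICC.

    Assumes every group has the same size (cluster_size) and that the
    intraclass correlation (icc) is known.
    """
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n_individuals: float, cluster_size: float, icc: float) -> float:
    """Number of independent observations the clustered sample is roughly worth."""
    return n_individuals / design_effect(cluster_size, icc)


if __name__ == "__main__":
    # Hypothetical study: 50 groups of 20 individuals, ICC of 0.10.
    n_groups, per_group, icc = 50, 20, 0.10
    total = n_groups * per_group
    print("design effect:", round(design_effect(per_group, icc), 2))
    print("effective sample size:", round(effective_sample_size(total, per_group, icc), 1))
```

With these made-up values, the 1,000 individuals carry far less information than 1,000 independent cases would, which is why treating them as independent makes effects look more precise than they are.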

In our previous IBM SPSS workbook (Heck et al., 2010), our intent was to help readers set up, conduct, and interpret a variety of different types of introductory multilevel and longitudinal models using this modeling procedure. At the time we finished the workbook in April 2010, we noted that the major limitation of the MIXED model routine was that the outcomes had to be continuous. This precluded many situations where researchers might be interested in applying multilevel analytic procedures to various types of categorical (e.g., dichotomous, ordinal, count) outcomes. Although models with categorical repeated measures nested within individuals can be estimated in IBM SPSS using the generalized estimating equation (GEE) approach (Liang & Zeger, 1986), we did not include this approach in our first workbook because it does not support the inclusion of group processes at a level above individuals; that is, the analyst must assume that individuals are randomly sampled and, therefore, not clustered in groups.

Today as we evaluate the array of analytic routines available for continuous and categorical outcomes in IBM SPSS, we note that many of these procedures were incorporated over the past 15 years as part of the REGRESSION modeling routine. A few years ago, however, in IBM SPSS various procedures for examining different categorical outcomes were consolidated under the generalized linear model (GLM) (Nelder & Wedderburn, 1972), which is referred to as GENLIN. We note that procedures for handling clustered data with categorical outcomes have been slower to develop than for continuous outcomes, due to added challenges of solving a system of nonlinear mathematical equations in estimating the model parameters for categorical outcomes. This is because categorical outcomes result from types of probability distributions other than the normal distribution. Relatively speaking, mathematical equations for linear models with continuous outcomes are much less challenging to solve.
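To make the source of the nonlinearity concrete, the generalized linear model can be written in terms of a link function. The notation below is assumed for illustration rather than quoted from the text: the expected value of the outcome is tied to a linear predictor through a link g, shown here with the logit link for a dichotomous outcome and the log link for a count outcome.

```latex
\begin{aligned}
\eta_i &= \mathbf{x}_i^{\top}\boldsymbol{\beta}, \qquad g(\mu_i) = \eta_i, \qquad \mu_i = E(Y_i), \\
\text{logit link:}\quad g(\mu_i) &= \ln\!\left(\frac{\mu_i}{1 - \mu_i}\right), \qquad \mu_i = \frac{1}{1 + e^{-\eta_i}}, \\
\text{log link:}\quad g(\mu_i) &= \ln(\mu_i), \qquad \mu_i = e^{\eta_i}
\end{aligned}
```

Because \mu_i is a nonlinear function of the coefficients under these links, the estimating equations generally have no closed-form solution and must be solved iteratively, unlike the ordinary linear model with the identity link.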

Over the past couple of years, the MIXED modeling routine has been expanded to include several different types of categorical outcomes. The various multilevel categorical models are referred to as generalized linear mixed models (or GENLIN MIXED in IBM SPSS terminology). This capability begins with Version 19, which was introduced in fall 2010, and is refined in Version 20 (introduced in fall 2011). The inclusion of this new categorical multilevel modeling capability prompted us to develop this second workbook. We wanted to provide a thorough applied treatment of models for categorical outcomes in order to finish our original intent of introducing multilevel and longitudinal analysis using IBM SPSS. Our target audience has been and remains graduate students and applied researchers in a variety of different social science fields. We hope this presentation will be a useful addition to our readers’ repertoire of quantitative tools for examining a broad range of research problems.

Our Intent

In this second workbook, our intent is to introduce readers to a range of single-level and multilevel models for cross-sectional and longitudinal data with categorical outcomes. One of our motivations for this book was our observation that introductory and intermediate statistics courses typically devote an inordinate amount of time to models for continuous outcomes and, as a result, graduate students in the social sciences have relatively little experience with various types of quantitative modeling techniques for categorical outcomes. There are many good reasons for an emphasis on models for continuous outcomes, but we believe this has left students and, ultimately, their fields ill prepared to deal with the wide range of important questions that do not accommodate continuously measured outcomes.

There are a number of important conceptual and mathematical differences between models for continuous and categorical outcomes. Categorical responses result from probability distributions other than the normal distribution and therefore require different types of underlying mathematical models and estimation methods. Because of these differences, they are often more challenging to investigate. First, results can be harder to report because they are expressed in different metrics (e.g., log odds, probit coefficients, event rates) from the unstandardized and standardized multiple regression coefficients with which most readers may be familiar. In other fields, such as health sciences, however, beginning researchers are apt to encounter categorical outcomes more routinely; one example is investigating the presence or absence of a disease.
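As a small illustration of the reporting issue, a coefficient on the log-odds scale is usually converted to an odds ratio, and a predicted log odds to a probability, before it is interpreted. The sketch below uses only the standard library; the function names and the example coefficient are hypothetical and do not come from the book's data.

```python
import math


def odds_ratio(log_odds_coefficient: float) -> float:
    """Exponentiate a logistic regression coefficient to get an odds ratio."""
    return math.exp(log_odds_coefficient)


def probability_from_log_odds(log_odds: float) -> float:
    """Inverse logit: map a predicted log odds to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    # Hypothetical coefficient, chosen only for illustration.
    b = 0.5
    print("odds ratio:", round(odds_ratio(b), 3))
    print("probability at log odds 0:", probability_from_log_odds(0.0))
```

A log odds of zero corresponds to even odds, so the last line prints 0.5. Positive coefficients give odds ratios above 1 and negative coefficients give odds ratios below 1.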

Second, with respect to multilevel modeling, models with categorical outcomes require somewhat different estimation procedures, which can take longer to converge on a solution and, as a result, may require making more compromises during investigation than typical continuous-outcome models. Despite these added challenges, researchers in the social sciences often encounter variables that are not continuous because outcomes are often perceptual (e.g., defined on an ordinal scale) or dichotomous (e.g., deciding whether or not to vote, dropping out or persisting) or refer to membership in different groups (e.g., religious affiliation, race/ethnicity). Of course, depending on the goals of the study, such variables may be either independent (predictor) or dependent (outcome) variables. Therefore, building skills in defining and analyzing single-level and multilevel models should provide opportunities for researchers to investigate different types of categorical dependent variables.

In developing this workbook, we, of course, had to make choices about what content to include and when we could refer readers to other authors for more extended treatments of issues we raise. There are many different types of quantitative models available in IBM SPSS for working with categorical variables, beginning with basic contingency tables and related measures of association, loglinear models, discriminant analysis, logistic and ordinal regression, probit regression, and survival models, as well as multilevel formulations of many of these basic single-level models. We simply cannot cover all of these various types of analytic approaches for categorical outcomes in detail. Instead, we chose to highlight some types of categorical outcomes researchers are likely to encounter regularly in investigating multilevel models with various types of cross-sectional and repeated measures designs. We encourage readers also to consult other discussions of the various analytic procedures available for categorical outcomes to widen their understanding of the assumptions and uses of these types of models.

As in our first workbook, we spend considerable time introducing and developing a general strategy for setting up, running, and interpreting multilevel models with categorical outcomes. We also devote considerable space to the various types of categorical outcomes that are frequently encountered, some of the differences involved in estimating single-level and multilevel models with categorical versus continuous outcomes, and what the meaning of the output is for various categorical outcomes. We made this decision because we believe that students are generally less familiar with models having categori...