CHAPTER 1
Structural Equation Models: The Basics
Structural equation modeling (SEM) is a statistical methodology that takes a confirmatory (i.e., hypothesis-testing) approach to the multivariate analysis of a structural theory bearing on some phenomenon. Typically, this theory represents “causal” processes that generate observations on multiple variables (Bentler, 1988). The term structural equation modeling conveys two important aspects of the procedure: (a) that the causal processes under study are represented by a series of structural (i.e., regression) equations, and (b) that these structural relations can be modeled pictorially to enable a clearer conceptualization of the theory under study. The hypothesized model can then be tested statistically in a simultaneous analysis of the entire system of variables to determine the extent to which it is consistent with the data. If goodness-of-fit is adequate, the model argues for the plausibility of postulated relations among variables; if it is inadequate, the tenability of such relations is rejected.
Several aspects of SEM set it apart from the older generation of multivariate procedures (see Fornell, 1982). First, as noted earlier, it takes a confirmatory, rather than an exploratory, approach to the data analysis (although aspects of the latter can be addressed). Furthermore, by demanding that the pattern of intervariable relations be specified a priori, SEM lends itself well to the analysis of data for inferential purposes. By contrast, most other multivariate procedures are essentially descriptive by nature (e.g., exploratory factor analysis), so that hypothesis testing is difficult, if not impossible. Second, whereas traditional multivariate procedures are incapable of either assessing or correcting for measurement error, SEM provides explicit estimates of these parameters. Finally, whereas data analyses using the former methods are based on observed measurements only, those using SEM procedures can incorporate both unobserved (i.e., latent) and observed variables.
Given these highly desirable characteristics, SEM has become a popular methodology for nonexperimental research, where methods for testing theories are not well developed and ethical considerations make experimental design unfeasible (Bentler, 1980). Structural equation modeling can be utilized very effectively to address numerous research problems involving nonexperimental research; in this book, I illustrate the most common applications (e.g., chapters 3, 4, 7, 9, and 11), as well as some that are less frequently found in the substantive literatures (e.g., chapters 5, 6, 8, 10, and 12). Before showing you how to use the LISREL program (Jöreskog & Sörbom, 1993b), along with its companion package PRELIS (Jöreskog & Sörbom, 1993c) and second language option SIMPLIS, however, it is essential that I first review key concepts associated with the methodology. We turn now to their brief explanation.
BASIC CONCEPTS
Latent Versus Observed Variables
In the behavioral sciences, researchers are often interested in studying theoretical constructs that cannot be observed directly. These abstract phenomena are termed latent variables, or factors. Examples of latent variables in psychology are self-concept and motivation; in sociology, powerlessness and anomie; in education, verbal ability and teacher expectancy; in economics, capitalism and social class.
Because latent variables are not observed directly, it follows that they cannot be measured directly. Thus, the researcher must operationally define the latent variable of interest in terms of behavior believed to represent it. As such, the unobserved variable is linked to one that is observable, thereby making its measurement possible. Assessment of the behavior, therefore, constitutes the direct measurement of an observed variable, albeit the indirect measurement of an unobserved variable (i.e., the underlying construct).
It is important to note that the term behavior is used here in the very broadest sense to include scores on a particular measuring instrument. Thus, observation may include, for example, self-report responses to an attitudinal scale, scores on an achievement test, in vivo observation scores representing some physical task or activity, coded responses to interview questions, and the like. These measured scores (i.e., measurements) are termed observed or manifest variables; within the context of SEM methodology, they serve as indicators of the underlying construct that they are presumed to represent. Given this necessary bridging process between observed variables and unobserved latent variables, it should now be clear why methodologists urge researchers to be circumspect in their selection of assessment measures. Although the choice of psychometrically sound instruments bears importantly on the credibility of all study findings, such selection becomes even more critical when the observed measure is presumed to represent an underlying construct.1
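The bridging between an unobserved construct and its observed indicators can be made concrete with a small simulation. The following is only an illustrative sketch, not part of the LISREL workflow described in this book: the function names (`simulate_indicators`, `pearson`), the loading values, the sample size, and the seed are all assumptions chosen for the example. Each observed score is generated as a loading times the latent score plus random measurement error, with the error variance set so that each indicator has unit variance.

```python
import random


def simulate_indicators(n, loadings, seed=0):
    """Simulate n cases of one standardized latent variable and its indicators.

    Each indicator = loading * latent score + measurement error, where the
    error standard deviation is sqrt(1 - loading^2) so that every indicator
    has unit variance. All values are hypothetical.
    """
    rng = random.Random(seed)
    latent, observed = [], []
    for _ in range(n):
        xi = rng.gauss(0.0, 1.0)  # unobserved latent score
        row = []
        for lam in loadings:
            err_sd = (1.0 - lam ** 2) ** 0.5
            row.append(lam * xi + rng.gauss(0.0, err_sd))
        latent.append(xi)
        observed.append(row)
    return latent, observed


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical loadings for three indicators of a single construct.
latent, observed = simulate_indicators(5000, [0.8, 0.7, 0.6], seed=1)
item1 = [row[0] for row in observed]
item2 = [row[1] for row in observed]

r_item_latent = pearson(item1, latent)  # indicator with its construct
r_item_item = pearson(item1, item2)     # two indicators with each other
print(round(r_item_latent, 2), round(r_item_item, 2))
```

Under these assumptions, the correlation between an indicator and the latent variable should lie close to that indicator's loading, and the correlation between two indicators close to the product of their loadings (here, roughly .8 × .7 = .56). In real research the latent scores are never available; only the covariation among the indicators is observed, which is why the psychometric quality of those indicators matters so much.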
The Factor-Analytic Model
The oldest and best-known statistical procedure for investigating relations between sets of observed and latent variables is that of factor analysis. In using this approach to data analyses, the researcher examines the covariation among a set of observed variables in order to gather information on their underlying latent constructs (i.e., factors). There are two basic types of factor analyses: exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). We turn now to a brief description of each.
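For readers who find a formal statement helpful, the idea that covariation among observed variables carries information about underlying factors can be written compactly. The notation below is the conventional common factor notation and is an assumption of this sketch (none of these symbols has been introduced in the text so far): x is the vector of observed variables, ξ the vector of latent factors, Λ the matrix of factor loadings, δ the vector of measurement errors, Φ the factor covariance matrix, and Θ the error covariance matrix. The second line holds only under the usual assumption that the errors are uncorrelated with the factors.

```latex
\begin{align}
  \mathbf{x} &= \boldsymbol{\Lambda}\,\boldsymbol{\xi} + \boldsymbol{\delta}, \\
  \boldsymbol{\Sigma} &= \operatorname{Cov}(\mathbf{x})
      = \boldsymbol{\Lambda}\,\boldsymbol{\Phi}\,\boldsymbol{\Lambda}^{\top} + \boldsymbol{\Theta}.
\end{align}
```

Both types of factor analysis work from the second expression: the covariance matrix of the observed variables is used to learn about the loadings and the factor covariances.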
EFA is designed for the situation where links between the observed and latent variables are unknown or uncertain. The analysis thus proceeds in an exploratory mode to determine how, and to what extent, the observed variables are linked to their underlying factors. Typically, the researcher wishes to identify the minimal number of factors that underlie (or account for) covariation among the observed variables. For example, suppose a researcher develops a new instrument designed to measure five facets of physical self-concept (e.g., Health, Sport Competence, Physical Appearance, Coordination, Body Strength). Following the formulation of questionnaire items designed to measure these five latent constructs, he or she would then conduct an EFA to determine the extent to which the item measurements (the observed variables) were related to the five latent constructs. In factor analysis, these relations are represented by factor loadings.2 The researcher would hope that items designed to measure health, for example, exhibited high loadings on that factor, albeit low or negligible loadings on the other four factors. This factor analytic approach is considered to be exploratory in the sense that the researcher has no prior knowledge that the items do, indeed, measure the intended factors. (For extensive discussions of EFA, see Comrey, 1992; Gorsuch, 1983; McDonald, 1985; Mulaik, 1972.)
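The way a researcher reads an EFA loading matrix for the physical self-concept example can be sketched in a few lines of code. This is a hypothetical illustration only: the item names and loading values are invented for the example (they are not results from any study), the helper `summarize_loadings` is not part of LISREL or PRELIS, and the .30 cutoff for a "salient" loading is a common rule of thumb assumed here rather than a fixed standard.

```python
FACTORS = [
    "Health",
    "Sport Competence",
    "Physical Appearance",
    "Coordination",
    "Body Strength",
]

# Invented loadings (rows = items, columns = the five factors above).
loadings = {
    "health_item_1":   [0.72, 0.08, 0.05, 0.11, 0.03],
    "health_item_2":   [0.65, 0.12, 0.09, 0.02, 0.10],
    "sport_item_1":    [0.10, 0.78, 0.04, 0.15, 0.12],
    "strength_item_1": [0.06, 0.35, 0.02, 0.09, 0.58],
}


def summarize_loadings(loadings, factors, salient=0.30):
    """For each item, report its primary factor and any salient cross-loadings.

    The primary factor is the one with the largest absolute loading; a
    cross-loading is any other factor whose absolute loading reaches the
    (assumed) salience cutoff.
    """
    summary = {}
    for item, row in loadings.items():
        primary = max(range(len(row)), key=lambda j: abs(row[j]))
        cross = [
            factors[j]
            for j, value in enumerate(row)
            if j != primary and abs(value) >= salient
        ]
        summary[item] = (factors[primary], cross)
    return summary


summary = summarize_loadings(loadings, FACTORS)
for item, (primary, cross) in summary.items():
    print(item, "->", primary, "| cross-loadings:", cross or "none")
```

In this invented matrix, the two health items behave as the researcher would hope, loading highly on Health and negligibly elsewhere, whereas the body strength item also shows a salient loading on Sport Competence. A pattern of the latter kind would prompt a closer look at the wording of the item.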
In contrast to EFA, CFA is appropriately used when the researcher has some knowledge of the underlying latent variable structure. Based on knowledge of the theory, empirical research, or both, he or she postulates relations between the observed measures and the underlying factors a priori, and then tests this hypothesized structure statistically. For example, based on the example cited earlier, the researcher would argue for the loading of items designed to measure sport competence self-concept on that specific factor, and not on the health, physical appearance, coordination, or body strength self-concept dimensions. Accordingly, a priori specification of the CFA model would allow all sport competence self-concept items to be free to load on that factor, but restricted to have zero loadings on the remaining factors. The model would then be evaluated by statistical means to determine the adequacy of its goodness of fit to the sample data. (For more detailed discussio...