
Test Validity
eBook - ePub
- 272 pages
- English
- ePUB (mobile friendly)
- Available on iOS & Android
About this book
Technological and theoretical changes over the past decade have altered the way we think about test validity. This book addresses the present and future concerns raised by these developments. Topics discussed include:
* the validity of computerized testing,
* the validity of testing for specialized populations (e.g., minorities, the handicapped), and
* new analytic tools to study and measure validity.
Section IV
STATISTICAL INNOVATIONS IN VALIDITY ASSESSMENT
Meta-analysis is concerned with quantitative methods for combining evidence from different studies. In this section, methods of meta-analysis for assessing the validity of tests through the study of data collected in a number of different settings are explored. The opening chapter by Frank Schmidt deals with that particular area of meta-analysis termed validity generalization. Focusing primarily on employment testing, he reviews the findings obtained principally by himself and his colleagues on the criterion-related validity of cognitive tests. The key result is that a very substantial portion of the variation in test-criterion correlations across a broad variety of employment settings appears to be due to "artifactual" differences in the settings: differences in sample sizes, restriction of range, etc. Schmidt goes further by asserting that all such variation is essentially artifactual and concludes by making predictions about the role validity generalization, and meta-analysis methods generally, will play both in employment selection and in research on the cognitive demands of different jobs.
In the next chapter, Larry Hedges describes how empirical Bayes methods can be used in a meta-analysis to develop a more comprehensive approach to the problem of criterion validity. Specifically, using the empirical Bayes approach a researcher can not only estimate the true variance among correlations across studies but also obtain improved estimates of the validity in each constituent study. Hedges shows how corrections for unreliability in test score or criterion and for restriction of range can be made, even in the presence of missing data. This chapter, as well as the studies cited in the references, indicates that the empirical Bayes paradigm will play an increasingly important role in this area of meta-analysis.
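The mechanics can be sketched in a few lines. The following is a minimal illustration of the shrinkage idea only, not Hedges's actual estimator: the study correlations and sample sizes are invented, the between-study variance is estimated by a simple method of moments, and each study's validity estimate is pulled toward the grand mean in proportion to its sampling variance.

```python
import numpy as np

# Hypothetical observed validities and sample sizes for five studies.
r = np.array([0.22, 0.31, 0.18, 0.40, 0.27])
n = np.array([68, 120, 45, 210, 90])

# Large-sample sampling variance of a correlation, evaluated at the
# weighted mean: (1 - r_bar^2)^2 / (n - 1).
w = n / n.sum()
r_bar = (w * r).sum()                 # sample-size-weighted mean validity
v_i = (1 - r_bar**2) ** 2 / (n - 1)   # within-study sampling variances

# Method-of-moments estimate of the between-study ("true") variance:
# observed variance of the r's minus the average sampling variance.
var_obs = (w * (r - r_bar) ** 2).sum()
tau2 = max(var_obs - (w * v_i).sum(), 0.0)

# Empirical Bayes estimate for each study: a precision-weighted
# compromise between the study's own r and the grand mean.
shrink = tau2 / (tau2 + v_i)          # weight given to the study's own r
r_eb = shrink * r + (1 - shrink) * r_bar

print(f"mean validity = {r_bar:.3f}, between-study variance = {tau2:.5f}")
for ri, ni, ei in zip(r, n, r_eb):
    print(f"r = {ri:.2f} (n = {int(ni):3d}) -> EB estimate {ei:.3f}")
```

Small studies, which carry the largest sampling variance, are shrunk most strongly toward the mean; this is the sense in which the per-study estimates are "improved."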
The last chapter, by Bengt Muthén, introduces an important extension of both classical Item Response Theory (IRT) models and LISREL-type analyses. He proposes a multilevel model in which observed responses to test questions are tied to latent traits and these, in turn, are tied to observed external variables such as class type, socioeconomic status, etc. Interestingly, the mathematical formalism of Muthén's model is reminiscent of that of empirical Bayes. Using multigroup versions of the model, Muthén shows how a number of questions relating to the validity of tests can be addressed, including differential item functioning and the impact of instruction on achievement. It is particularly noteworthy that these methods facilitate the examination of measurement models in a social context. While the approach can be applied to many problems, its impact on the quantitative assessment of test validity is potentially very great.
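To make the structure of such a model concrete, here is a small simulation sketch. The model form, a two-parameter logistic measurement part with the latent trait regressed on an observed group variable, is a generic stand-in rather than Muthén's exact specification, and all parameter values are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
n_persons, n_items = 2000, 6

# Observed external variable (e.g., a class-type indicator).
x = rng.integers(0, 2, size=n_persons)

# Structural part: the latent trait depends on the external variable.
gamma = 0.5                                # illustrative structural coefficient
theta = gamma * x + rng.normal(0.0, 1.0, n_persons)

# Measurement part: two-parameter logistic IRT items.
a = rng.uniform(0.8, 1.6, n_items)         # discriminations
b = rng.uniform(-1.0, 1.0, n_items)        # difficulties
logit = a * (theta[:, None] - b)           # person-by-item logits
y = (rng.random((n_persons, n_items)) < 1 / (1 + np.exp(-logit))).astype(int)

# Inject differential item functioning into item 0: at the same trait
# level, group x = 1 has a lower success probability.
p_dif = 1 / (1 + np.exp(-(logit[:, 0] - 0.7 * x)))
y[:, 0] = (rng.random(n_persons) < p_dif).astype(int)

# A crude DIF check: compare proportions correct on item 0 between
# groups after matching on the total score over the remaining items.
rest = y[:, 1:].sum(axis=1)
for score in range(n_items):
    mask = rest == score
    if mask.sum() > 50:
        g0 = y[mask & (x == 0), 0].mean()
        g1 = y[mask & (x == 1), 0].mean()
        print(f"rest score {score}: p(correct) x=0 {g0:.2f} vs x=1 {g1:.2f}")
```

Matching on the rest score separates a genuine group difference in the trait (which the structural coefficient produces) from a difference in how the item itself behaves at a fixed trait level, which is the essence of a DIF analysis.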
Rubin, in his discussion (pp. 241–256), urges extreme caution in the use of those complex statistical models and procedures that can distance investigators from their data. However, he does strongly support the notion of meta-analysis, particularly the empirical Bayes approach espoused by Hedges.
Chapter 11
Validity Generalization and the Future of Criterion-Related Validity
Frank L. Schmidt
University of Iowa
In terms of the number of people affected, the two major areas of test use in the U.S. have traditionally been education and employment. This chapter is concerned primarily with employment testing, although the methods described can be and have been applied in educational testing (Linn, Harnisch, & Dunbar, 1981). In the employment area, research on selection utility has shown that use of valid employment tests of cognitive abilities in place of less valid selection methods can yield large economic gains in the form of increased output, reduced personnel costs, or both (Cascio & Ramos, 1986; Hunter & Schmidt, 1982a; Schmidt, Hunter, McKenzie, & Muldrow, 1979; Schmidt, Hunter, Outerbridge, & Trattner, 1986). The research findings in validity generalization that are the focus of this presentation bear strongly on two important questions in this area: (1) how valid cognitive ability tests and other selection procedures generally are, and (2) what is required to demonstrate their validity for particular applications in particular organizations. Briefly, research over the last 10 years has demonstrated that (1) the mean level of validity of cognitive tests for the prediction of job performance is higher than previously believed, and (2) these validities are much less variable, and much more generalizable, across settings, organizations, geographical areas, jobs, and time periods than previously believed. These findings set the stage for an increase in the validity of employment selection procedures, with consequent economic gains in worker productivity.
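The size of those gains is usually projected with utility models in the Brogden (1949) tradition, where the expected gain is, to a first approximation, the product of the test's validity, the dollar value of one standard deviation of job performance, and the average standardized predictor score of those selected. The following sketch uses invented figures; the function and its inputs are illustrative, not taken from the studies cited above.

```python
from math import erfc, exp, pi, sqrt

def brogden_utility(validity, sd_y, selection_ratio, n_selected, years=1.0):
    """Expected gain under a Brogden-style utility model: per-selectee gain
    is validity * SD_y * mean standardized predictor score of those selected
    (the normal ordinate at the cutoff divided by the selection ratio)."""
    # Find the z cutoff with P(Z > z) = selection_ratio by bisection.
    lo, hi = -6.0, 6.0
    for _ in range(80):
        mid = (lo + hi) / 2
        if erfc(mid / sqrt(2)) / 2 > selection_ratio:  # upper-tail probability
            lo = mid
        else:
            hi = mid
    z_cut = (lo + hi) / 2
    ordinate = exp(-z_cut**2 / 2) / sqrt(2 * pi)
    z_bar = ordinate / selection_ratio     # mean z-score of selected applicants
    return n_selected * years * validity * sd_y * z_bar

# Invented example: hire 100 people from the top 20% of applicants, with
# one SD of performance worth $10,000 per year.
for r in (0.30, 0.50):
    gain = brogden_utility(r, 10_000, 0.20, 100)
    print(f"validity {r:.2f}: expected gain ${gain:,.0f} per year")
```

The multiplicative role of validity is the point: raising validity from .30 to .50 raises the expected gain by the same two-thirds, other things equal.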
Validity generalization (VG) research is based on the application of a particular set of meta-analytic methods (Hunter, Schmidt, & Jackson, 1982) to criterion-related validities of tests. We initially developed our meta-analysis methods not as general research integration methods, but as a way of attacking a critically important problem in personnel psychology: the problem of "situational specificity" of employment test validities. For more than 50 years, most personnel psychologists had believed that employment test validities were specific to situations and settings, and that, therefore, every test had to be revalidated anew in every setting in which it was considered for use. This belief was based on the empirical fact that considerable variability was present from study to study in observed validity coefficients even when the jobs and tests studied appeared to be similar or identical. The explanation developed for this variability was that the factor structure of job performance was different from job to job and that the human observer or job analyst was too poor an information receiver and processor to detect these subtle but important differences. The conclusion was that validity studies must be conducted, typically at considerable expense, in every setting. That is, the conclusion was that validity evidence could not be generalized across settings. Lawshe (1948) stated:
A given test may be excellent in connection with one job and virtually useless in connection with another job. Furthermore, job classifications that seem similar from plant to plant sometimes differ significantly; so it becomes essential to test the test in practically every new situation. (p. 13)
And in the words of Albright, Glennon, and Smith (1963):
If years of personnel research have proven anything, it is that jobs that seem the same from one place to another often differ in subtle but important ways. Not surprisingly, it follows that what constitutes job success is also likely to vary from place to place. (p. 18)
The fact that our point of departure was the problem of situational specificity explains why our methods of meta-analysis are focused strongly on estimation of the true (i.e., nonartifactual) variance of study correlations and effect sizes. We hypothesized that most or all of the variance in test validity coefficients across studies and settings was due to artifactual sources, such as sampling error, and not to real differences between jobs. This focus on the variance of effect sizes and correlations is the primary difference between our methods and those of Glass and his associates (Glass, McGaw, & Smith, 1981) or those of Rosenthal (1984; Rosenthal & Rubin, 1978). In validity generalization, merely showing that the mean is substantial is not sufficient to demonstrate generalizability. One must be able to show that the standard deviation of true validities is small enough to permit generalization of the conclusion that the test has positive validity in the great majority of situations. Figs. 11.1 and 11.2 illustrate this point. Fig. 11.3 is a counterexample.
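In bare-bones form, the variance decomposition at the heart of the procedure can be sketched as follows. The data and function here are invented for illustration; the published procedures also correct for artifacts beyond sampling error.

```python
import numpy as np

def bare_bones_vg(r, n):
    """Decompose the observed variance of study correlations into the part
    expected from sampling error alone and a residual ("true") part, then
    form the 90% credibility value."""
    r, n = np.asarray(r, float), np.asarray(n, float)
    w = n / n.sum()
    r_bar = (w * r).sum()                                # weighted mean validity
    var_obs = (w * (r - r_bar) ** 2).sum()               # observed variance
    var_err = (w * (1 - r_bar**2) ** 2 / (n - 1)).sum()  # sampling-error variance
    var_true = max(var_obs - var_err, 0.0)               # residual variance
    cv90 = r_bar - 1.28 * np.sqrt(var_true)              # 90% credibility value
    return r_bar, var_obs, var_err, var_true, cv90

# Hypothetical validities for the same test-job combination in 8 settings.
r = [0.15, 0.33, 0.24, 0.41, 0.19, 0.28, 0.36, 0.22]
n = [50, 140, 75, 230, 60, 110, 180, 65]
r_bar, v_obs, v_err, v_true, cv90 = bare_bones_vg(r, n)
print(f"mean r = {r_bar:.3f}")
print(f"observed var = {v_obs:.5f}, sampling-error var = {v_err:.5f}")
print(f"residual var = {v_true:.5f}, 90% credibility value = {cv90:.3f}")
```

If nearly all of the observed variance is accounted for by sampling error, the residual standard deviation is small and the credibility value stays well above zero, which is exactly the pattern that licenses generalization.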
None of this means that we were unconcerned with accurate estimation of the mean. Accurate estimation of mean true validities is critical because the mean affects both generalizability (by affecting the lower credibility value) and expected practical utility. Practical utility is a direct multiplicative function of the expected operational validity, other things equal (Brogden, 1949; Schmidt, Hunter, McKenzie, & Muldrow, 1979). Therefore, we introduced methods for correcting the mean observed validity for attenuation due to mean levels of range restriction and mean levels of measurement error in t...
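Those corrections can be sketched as follows. The formulas shown, disattenuation for criterion unreliability and Thorndike's Case II formula for direct range restriction, are standard textbook versions applied in a deliberately simplified order; the input values are invented.

```python
from math import sqrt

def correct_validity(r_obs, r_yy, u):
    """Correct a mean observed validity for criterion unreliability and
    for direct range restriction (Thorndike Case II).

    r_obs : mean observed correlation in the restricted incumbent sample
    r_yy  : reliability of the criterion measure
    u     : ratio of applicant-pool to incumbent predictor SD (> 1 under restriction)
    """
    r1 = r_obs / sqrt(r_yy)                       # remove criterion unreliability
    r2 = (u * r1) / sqrt(1 + (u**2 - 1) * r1**2)  # undo direct range restriction
    return r2

# Invented values: observed r = .25, criterion reliability .60, and an
# applicant-pool SD 1.5 times the incumbent SD.
print(f"estimated operational validity = {correct_validity(0.25, 0.60, 1.5):.3f}")
```

Both corrections raise the estimate, which is why mean operational validities in this literature exceed the mean of the raw observed coefficients.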
Table of contents
- Cover Page
- Title Page
- Copyright Page
- Dedication
- Contents
- List of Contributors
- Preface
- Acknowledgments
- Introduction
- Section I Historical and Epistemological Bases of Validity
- Section II The Changing Faces of Validity
- Section III Testing Validity in Specific Subpopulations
- Section IV Statistical Innovations in Validity Assessment
- Author Index
- Subject Index