
eBook - ePub
The New Rules of Measurement
What Every Psychologist and Educator Should Know
- 263 pages
- English
- ePUB (mobile friendly)
- Available on iOS & Android
About this book
In this volume prominent scholars from both psychology and education describe how these new rules of measurement work and how they differ from the old rules. Several contributors have been involved in the recent construction or revision of a major test, while others are well-known for their theoretical contributions to measurement. The goal is to provide an integrated yet comprehensive reference source concerned with contemporary issues and approaches in testing and measurement.
You can access The New Rules of Measurement by Susan E. Embretson and Scott L. Hershberger in PDF and/or ePUB format, as well as other popular books in Psychology & History & Theory in Psychology.
CHAPTER ONE
Issues in the Measurement of Cognitive Abilities
Susan E. Embretson
The valid construction and interpretation of tests are two critically important activities for psychologists and educators, yet an apparent gap exists between the modern measurement methods available and their use by applied workers. Many important tests have been constructed or revised using measurement principles that differ qualitatively from classical measurement concepts, yet these principles are not well understood by test users or even by many measurement specialists. Moreover, available textbooks on measurement and testing provide only rudimentary coverage of some new principles and fail to cover other important principles entirely.
In cognitive ability testing, computerization is the most salient change in the new generation of tests. Computerized item presentation, immediate scoring, and report generation are attractive features of many revised tests. Computerized testing has also made adaptive testing feasible. In adaptive testing, tests no longer have fixed item content: items are selected online for each examinee, depending on his or her responses to preceding items. Examinees are thus no longer exposed to items that are far above or below their performance level; test forms are optimally selected for each person from the test item bank.
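The selection logic just described can be sketched in a few lines. The following Python toy is purely illustrative (the item bank, the simple step-size ability update, and all function names are assumptions, not part of any operational test): each next item is the unused one whose difficulty best matches the current ability estimate.

```python
import math
import random

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def next_item(bank, theta, used):
    """Pick the unused item whose difficulty is closest to the current
    ability estimate -- the most informative item under the Rasch model."""
    candidates = [i for i in range(len(bank)) if i not in used]
    return min(candidates, key=lambda i: abs(bank[i] - theta))

def adaptive_test(bank, true_theta, n_items=10, step=0.7):
    """Administer n_items adaptively, nudging the ability estimate up
    after a correct answer and down after an incorrect one."""
    theta, used = 0.0, set()
    for _ in range(n_items):
        i = next_item(bank, theta, used)
        used.add(i)
        correct = random.random() < rasch_prob(true_theta, bank[i])
        theta += step if correct else -step
        step = max(step * 0.8, 0.1)   # shrink the step as the test proceeds
    return theta

random.seed(1)
bank = [b / 4.0 for b in range(-12, 13)]   # difficulties from -3 to +3
print(round(adaptive_test(bank, true_theta=1.0), 2))
```

Operational adaptive tests use maximum-likelihood or Bayesian ability estimation rather than this fixed-step update, but the core loop, selecting each item to match the provisional ability estimate, is the same.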
Another salient change in cognitive ability testing is increased flexibility for administering and interpreting individualized tests, such as the Differential Ability Scales (Elliot, 1990), the Woodcock-Johnson Psycho-Educational Battery (Woodcock & Johnson, 1977), and several others. Special procedures for missing data in testing (e.g., persons measured out of level or omitted items) are available so that ability may be estimated without bias. Furthermore, some individual cognitive tests also provide ability estimates that do not depend on a norm-referenced standard for meaning. The ability estimates have optimal scale properties that permit comparisons directly to abilities obtained earlier or to abilities at another developmental level. The abilities may be used to measure developmental change or distance from some developmental standard.
Item response theory (IRT) is the set of measurement principles that has made adaptive testing, and increased flexibility for individualized tests, practically feasible. Yet IRT has not been given sufficient coverage in graduate education. Worse, psychologists and educators are often unaware of its application to specific tests. Test manuals do not elaborate how IRT is implemented; often IRT is, at best, discussed only in an appendix. Why? Test publishers are sensitive to the preparation of test users. Because test users typically are not adequately trained in IRT, they would not understand it as the basis for the test. Of course, because the test actually is based on IRT principles, test users are deprived of knowledge that is crucial to test expertise. Moreover, test users are not motivated to understand IRT because its application is not made sufficiently explicit.
IRT is also useful in the construct development phase of testing. IRT now includes a vast array of models that postulate qualitatively different types of underlying constructs. Comparative fit indices for different IRT models can support interpretations about the constructs that are measured. For example, inconsistent findings about the number and nature of constructs involved in specific tests result, in part, from applying methods that are inappropriate for item-level data; applying multidimensional IRT models to item-level data yields more valid findings. Furthermore, it is often suspected that some test items are population-specific; that is, performance may differ qualitatively over different groups of persons. Sometimes the populations are intrinsic to the measure, such as groups employing different strategies to solve the items. Other times the populations differ in background, as defined by gender, racial-ethnic background, native language, or clinical status (e.g., handicaps). IRT models are available not only to assess these differences but also to provide solutions.
IRT also has many practical advantages for test development. Unlike classical test theory indices, IRT item parameters are not biased by the population ability distribution. In contrast, the classical test theory indices for item difficulty and discrimination (i.e., p values and biserial correlations) are directly influenced by ability distributions. Furthermore, greater flexibility in test calibration, using item subsets with varying groups, is possible because IRT readily handles missing data problems.
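The dependence of the classical indices on the ability distribution is easy to demonstrate by simulation. In this illustrative Python sketch (sample sizes, difficulty values, and all names are arbitrary assumptions), the same middle-difficulty item receives a much lower classical p value in a low-ability sample than in a high-ability one:

```python
import math
import random

def p_value(responses):
    """Classical item difficulty: proportion answering correctly."""
    return sum(responses) / len(responses)

def point_biserial(responses, totals):
    """Classical item discrimination: correlation between the 0/1 item
    score and the total test score."""
    n = len(responses)
    mx, my = sum(responses) / n, sum(totals) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(responses, totals)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in responses) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in totals) / n)
    return cov / (sx * sy)

def simulate(thetas, difficulties):
    """Rasch-model responses for each person on each item."""
    return [[int(random.random() < 1 / (1 + math.exp(-(t - b))))
             for b in difficulties] for t in thetas]

random.seed(0)
items = [-1.0, 0.0, 1.0]
low  = simulate([random.gauss(-1, 1) for _ in range(2000)], items)
high = simulate([random.gauss(+1, 1) for _ in range(2000)], items)

# The same item (difficulty b = 0) gets a different classical p value
# in the two samples, because p depends on the ability distribution.
p_low  = p_value([r[1] for r in low])
p_high = p_value([r[1] for r in high])
disc   = point_biserial([r[1] for r in low], [sum(r) for r in low])
print(p_low, p_high, round(disc, 2))
```

An IRT calibration applied to either sample would recover (within sampling error) the same item difficulty of 0, which is the invariance property the paragraph above describes.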
Unfortunately, textbooks do not elaborate how applying IRT fundamentally changes testing concepts and results. In fact, most general measurement and testing textbooks still emphasize classical test theory. Although texts are available that are exclusively devoted to IRT, many such textbooks are inaccessible due to their statistically oriented presentation. A notable exception is Hambleton, Swaminathan, and Rogers's (1989) textbook, which is quite readable without advanced statistical knowledge. However, the topics covered are pertinent mainly to large-scale ability or achievement tests. Yet IRT has important applications to individual ability tests, personality traits, and psychopathology and clinical tests, as well as to behavioral rating scales. Unfortunately, prototypic applications to these areas are available only in specialized journals or readings.
Particularly neglected in readily available intellectual resources is coverage of the special IRT scores that do not require norm referencing for meaningful interpretation. However, descriptions of these scores, and the alternative standards, often are available only in the technical manual (or an appendix of the manual) of specific tests or in isolated technical studies.
Last, but not least, the importance of measurement scale properties in interpreting individual differences in constructs has been treated only in passing in most textbooks. Although the importance of obtaining interval-level measurement is mentioned, the potential for IRT scaling to yield fundamental measurement is inadequately elaborated.
IRT is not the only significant change in the new generation of tests. Generalizability theory is increasingly applied to organize results on the accuracy of measurements over varying conditions. Generalizability theory is readily applicable to both cognitive and personality tests. One could argue that pedagogical resources for generalizability theory are even scarcer than those for IRT. Not only has generalizability theory received insufficient attention in many textbooks, but fewer scientific papers are available as well. Generalizability theory represents a major extension of classical test theory. Traditionally, generalizability theory has been used to analyze the effects of different measurement conditions on the psychometric properties of tests. In most applications of IRT, the impact of varying measurement conditions on trait level is not estimated. Thus, generalizability theory addresses psychometric issues that are not included in IRT and that are important in the new generation of tests.
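As a concrete illustration of the kind of analysis generalizability theory performs, the following Python sketch works through a hypothetical one-facet persons-by-raters design with simulated data (the effect sizes, sample sizes, and function names are all assumptions): it estimates variance components for persons, raters, and residual error from ANOVA mean squares, then computes a generalizability coefficient for a k-rater average.

```python
import numpy as np

def one_facet_g_study(scores):
    """Variance components for a persons x raters (p x r) design,
    estimated from the usual two-way ANOVA mean squares."""
    n_p, n_r = scores.shape
    grand = scores.mean()
    p_means = scores.mean(axis=1)
    r_means = scores.mean(axis=0)
    ms_p = n_r * ((p_means - grand) ** 2).sum() / (n_p - 1)
    ms_r = n_p * ((r_means - grand) ** 2).sum() / (n_r - 1)
    resid = scores - p_means[:, None] - r_means[None, :] + grand
    ms_e = (resid ** 2).sum() / ((n_p - 1) * (n_r - 1))
    var_e = ms_e
    var_p = max((ms_p - ms_e) / n_r, 0.0)   # person (universe-score) variance
    var_r = max((ms_r - ms_e) / n_p, 0.0)   # rater-severity variance
    return var_p, var_r, var_e

rng = np.random.default_rng(0)
n_p, n_r = 200, 5
person = rng.normal(0.0, 1.0, n_p)        # true person effects
rater  = rng.normal(0.0, 0.5, n_r)        # rater severity effects
noise  = rng.normal(0.0, 0.7, (n_p, n_r))
scores = person[:, None] + rater[None, :] + noise

var_p, var_r, var_e = one_facet_g_study(scores)
# Generalizability coefficient for a k-rater average (relative decisions):
g_coef = var_p / (var_p + var_e / n_r)
print(var_p, var_r, var_e, g_coef)
```

The decomposition makes explicit how much of the observed-score variance is attributable to the measurement condition (here, raters) versus the persons being measured, which is exactly the question IRT models, as typically applied, leave unaddressed.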
CHAPTERS AND AUTHORS
Four chapters in the cognitive ability measurement section concern IRT and its role in testing. Robert M. Thorndike examines the future of IRT in the context of its historical origins. Thorndike has unique resources to examine the context in which psychometric methods were developed and applied. His grandfather, E. L. Thorndike, was the pioneering force behind the earliest standardized tests, which required decisions about measurement methods. Further, his father, R. L. Thorndike, is well known for his contributions to ability testing. R. M. Thorndike has authored several books on measurement and statistics, including A Century of Ability Testing (1990). Furthermore, he has published numerous papers and monographs on measurement and multivariate methods. For the last 7 years, he has worked with Riverside Publishing Company in capacities related to the Stanford-Binet.
Daniel elaborates how IRT was applied in some popular individualized cognitive tests, the Kaufman Adolescent and Adult Intelligence Test (KAIT; Kaufman & Kaufman, 1993) and the Differential Ability Scales (DAS; Elliot, 1990). Daniel has been actively involved in applying IRT to develop major tests. Currently, Daniel is senior scientist in test development at American Guidance Service (AGS). He has also worked in test development at the Psychological Corporation. Daniel has directed psychometric developments on ability tests (e.g., DAS, KAIT), behavior rating scales, neuropsychological tests, and other instruments.
Wright examines some fundamental measurement issues and how applying a special IRT model, the Rasch model, results in person measurements that have scale properties. Wright is particularly able to comment on measurement scale issues because he is well known for making the family of Rasch models available and meaningful in American psychometrics. Wright has conducted numerous workshops on the Rasch model and directed 70 doctoral students, many of whom are now contemporary leaders in psychometrics. As a former student of both Thurstone and the Danish mathematician Rasch, Wright has published extensively on Rasch measurement models, including nearly 150 scientific papers, 12 books (including Best Test Design, Wright & Stone, 1979, and Rating Scale Analysis, Wright & Masters, 1982), and 2 computer programs for Rasch models, BIGSTEPS (Wright & Linacre, 1997) and FACETS (Linacre & Wright, 1977).
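The scale property at issue in Wright's chapter can be seen directly from the Rasch model's form. In this short illustrative Python check (variable names are mine, not from the chapter), the log-odds difference between two persons equals their ability difference for any item difficulty, the "specific objectivity" that gives Rasch person measures their interval-scale character:

```python
import math

def rasch_p(theta, b):
    """Rasch model: P(correct) = exp(theta - b) / (1 + exp(theta - b))."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1 - p))

# Specific objectivity: the log-odds difference between two persons is
# theta1 - theta2 for *any* item difficulty b, so ability differences
# mean the same thing everywhere on the scale (an interval property).
theta1, theta2 = 1.5, -0.5
for b in (-2.0, 0.0, 2.0):
    diff = logit(rasch_p(theta1, b)) - logit(rasch_p(theta2, b))
    print(round(diff, 6))   # 2.0 each time, independent of b
```

The same algebra comparing two items for a fixed person shows that item-difficulty comparisons are likewise independent of which persons took the items.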
Woodcock shows how indices based on Rasch measurement models can enhance interpretations of a person's performance. Woodcock is particularly able to comment on how these indices aid score interpretations because he pioneered the application of Rasch measurement models in American testing. Woodcock has authored several tests, including the Woodcock-Johnson Psycho-Educational Battery (1977), the Woodcock Reading Mastery Tests (1973), the Woodcock Language Proficiency Battery (1991), and the Woodcock-Munoz Language Survey (Woodcock & Munoz-Sandoval, 1993).
The last chapter in the cognitive measurement section concerns generalizability theory. Marcoulides describes how generalizability theory picks up where IRT (e.g., the Rasch model) leaves off. Marcoulides is especially able to elucidate how generalizability can contribute to testing because he has published extensiv...
Table of contents
- Cover
- Half Title
- Title Page
- Copyright Page
- Table of Contents
- Preface
- 1 Issues in the Measurement of Cognitive Abilities
- 2 IRT and Intelligence Testing: Past, Present, and Future
- 3 Behind the Scenes: Using New Measurement Methods on DAS and KAIT
- 4 Fundamental Measurement for Psychology
- 5 What Can Rasch-Based Scores Convey About a Person's Test Performance?
- 6 Generalizability Theory: Picking Up Where the Rasch IRT Model Leaves Off?
- 7 Introduction to Personality Measurement
- 8 The Rorschach: Measurement Concepts and Issues of Validity
- 9 Searching for Structure in the MMPI
- 10 Personality Measurement Issues Viewed Through the Eyes of IRT
- 11 Summary and Future of Psychometric Methods in Testing
- Author Index
- Subject Index