Applying the Rasch Model
eBook - ePub

Fundamental Measurement in the Human Sciences, Third Edition

  1. 384 pages
  2. English
About this book

Cited over 1900 times, this classic text facilitates a deep understanding of the Rasch model. The authors review the crucial properties of the model and demonstrate its use with a variety of examples from education, psychology, and health. A glossary and numerous illustrations aid the reader's understanding. Readers learn how to apply Rasch analysis so they can perform their own analyses and interpret the results. The authors present an accessible overview that does not require a mathematical background.

Highlights of the new edition include:

-More learning tools to strengthen readers' understanding including chapter introductions, boldfaced key terms, chapter summaries, activities, and suggested readings.

-Chapters 4, 6, 7, and 8 are divided into basic and extended understanding sections, so readers can select the level most appropriate for their needs and find more in-depth investigations of key topics.

-A website at www.routledge.com/9780415833424 that features free Rasch software, data sets, an Invariance worksheet, detailed instructions for key analyses, and links to related sources.

-Greater emphasis on the role of Rasch measurement as a priori in the construction of scales and its use post hoc to reveal the extent to which interval scale measurement is instantiated in existing data sets.

-Emphasizes the importance of interval level measurement data and demonstrates how Rasch measurement is used to examine measurement invariance.

-Insights from other Rasch scholars via innovative applications (Ch. 9).

-Extended discussion of invariance now reviews DIF, DPF, and anchoring (Ch. 5).

-Revised Rating Scale Model material now based on the analysis of the CEAQ (Ch. 6).

-Clarifies the relationships between Rasch measurement, True Score Theory, and Item Response Theory by reviewing their commonalities and differences (Ch. 13).

-Provides more detail on how to conduct a Rasch analysis so readers can use the techniques on their own (Appendix B).

The book is intended as a text for graduate courses in measurement, item response theory, (advanced) research methods, or quantitative analysis taught in psychology, education, human development, business, and other social and health sciences. Professionals in these areas also appreciate its accessible introduction.

Authors: Trevor Bond, Christine M. Fox

Information

Publisher
Routledge
Year
2015
eBook ISBN
9781317805250
Edition
3

1

Why Measurement Is Fundamental

Doctoral students in Malaysia tell of a group of rather hard-nosed social science professors who, during dissertation defenses, insist on cross-examining candidates on the nature of the data they are analyzing. In particular, they enquire as to whether the data are really interval or merely ordinal in nature. Apparently, this rather old-fashioned disposition has been the undoing of a number of doctoral defenses; candidates who could not argue for the interval nature of their data were required to redo their statistical analyses, replacing Pearson’s r with Spearman’s rho and so on. Most professors in the Western world, at least in education, psychology, and the other human sciences, seem to have given up quibbling about such niceties: Pearson’s r seems to work just as well with all sorts of data; SPSS doesn’t know where the data come from, and apparently many of its users don’t either. The upside of this difficulty is that many of these hard-nosed professors now realize that measures derived from Rasch analyses may be considered as interval and therefore permit the use of the wide array of statistical calculations that abound in the social sciences. Unfortunately, however, measurement is not routinely taught in standard curricula in the Western world, and the fallback position is to analyze ordinal data as if they were interval measures.
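The contrast between Pearson’s r and Spearman’s rho can be made concrete with a small sketch. The data below are invented for illustration (hypothetical five-point ratings and an outcome score), and the helper names are ours, not the book’s. The sketch recodes the rating categories with a different, equally order-preserving set of numbers: Spearman’s rho, which uses only rank order, should stay the same, whereas Pearson’s r, which treats the codes as interval values, should shift.

```python
from statistics import mean

def pearson_r(x, y):
    """Product-moment correlation; treats the numbers as interval values."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def average_ranks(values):
    """Rank from 1 upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Rank-order correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: rating categories 1-5 and an outcome score.
likert = [1, 2, 2, 3, 4, 5, 5, 3]
outcome = [10, 12, 15, 14, 20, 30, 28, 13]

# An arbitrary recoding that preserves category order but not spacing.
recode = {1: 1, 2: 2, 3: 4, 4: 8, 5: 16}
likert_recoded = [recode[v] for v in likert]

print("Pearson r   :", pearson_r(likert, outcome), pearson_r(likert_recoded, outcome))
print("Spearman rho:", spearman_rho(likert, outcome), spearman_rho(likert_recoded, outcome))
```

If the category codes carry only order, nothing privileges 1, 2, 3, 4, 5 over 1, 2, 4, 8, 16, so a statistic whose value depends on that choice is answering a question the data cannot support.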
It seems, notwithstanding those old-fashioned professors and a small number of measurement theorists, that for more than half a century, social science researchers have managed to delude themselves about what measurement actually is. In our everyday lives, we rely both explicitly and implicitly on calibrated measurement systems to purchase gasoline, buy water, measure and cut timber, buy lengths of cloth, assemble the ingredients for cooking, and administer appropriate doses of medicine to ailing relatives. So how is it that when we go to university or the testing company to conduct social science research, undertake some psychological investigation, or implement a standardized survey, we then go about treating and analyzing those data as if the requirements for measurement that served us so well at home in the morning no longer apply in the afternoon? Why do we change our definition of and standards for measurement when the human condition is the focus of our attention?
Measurement systems are ignored when we routinely express the results of our research interventions in terms of either probability levels of p < 0.01 or p < 0.05, or—better yet—as effect sizes. Probability levels indicate only how un/likely it is that A is more than B or that C is different from B, and effect size is meant to tell us by how much the two samples under scrutiny differ. Instead of focusing on constructing measures of the human condition, psychologists and others in the human sciences have focused on applying sophisticated statistical procedures to their data. Although statistical analysis is a necessary and important part of the scientific process, and the authors in no way would ever wish to replace the role that statistics play in examining relations between variables, the argument throughout this book is that quantitative researchers in the human sciences are focused too narrowly on statistical analysis and not concerned nearly enough about the nature of the data on which they use these statistics. Therefore, it is not the authors’ purpose to replace quantitative statistics with Rasch measurement but rather to refocus some of the time and energy used for data analysis on the prerequisite construction of quality scientific measures.
Those hard-nosed professors mentioned earlier, of course, recur to the guidelines learned from S. S. Stevens (1946). Every student of Psychometrics 101 or Quantitative Methods 101 has Stevens’s lesson ingrained forever. In short, Stevens defined measurement as the assignment of numbers to objects or events according to a rule and, thereby, some form of measurement exists at each of four levels: nominal, ordinal, interval, and ratio. By now, most of us accept that ratio-level measurement is likely to remain beyond our capacity in the human sciences, yet most of us assume the data that we have collected belong to interval-level scales.
Still, it remains puzzling that those who set themselves up as scientists of the human condition, especially those in psychological, health, and educational research, would accept their ordinal-level ‘measures’ without any apparent critical reflection, when they are not really measures at all. Perhaps we should all read Stevens himself (1946) a little more closely. “As a matter of fact, most of the scales used widely and effectively by psychologists are ordinal scales” (p. 679). He then specified that the only statistics ‘permissible’ for ordinal data were medians and percentiles, leaving means, standard deviations, and correlations appropriate for interval or ratio data only. And, even more surprisingly, “The rank-order correlation coefficient is usually deemed appropriate to an ordinal scale, but actually this statistic assumes equal intervals between successive ranks and therefore calls for an interval scale” (p. 678). Can it be clearer than this: “With the interval scale we come to a form that is ‘quantitative’ in the ordinary sense of the word” (p. 679)? This is also our point: only with ‘interval’ do we get ‘quantitative’ in the ordinary sense, the sense in which we use scientific measures in our everyday lives. So why are social scientists left in a state of confusion?
Unfortunately, in this same seminal article, Stevens then blurred these ordinal/interval distinctions by allowing us to invoke “a kind of pragmatic sanction: In numerous instances it leads to fruitful results” (p. 679). He added a hint of a proviso: “When only rank order of data is known, we should proceed cautiously with our statistics, and especially with the conclusions we draw from them” (p. 679). It appears that his implicit ‘permission’ to treat ordinal data as if they were interval was the only conclusion to reach the social scientists, scientists who were so obviously desperate to use their sophisticated statistics on their profusion of attitude scales.
One reasonably might expect that those who see themselves as social scientists would aspire to be open-minded, reflective, and, most importantly, critical researchers. In empirical science, it would seem that this issue of measurement might be paramount. However, many attempts to raise the question of whether our data constitute measures result in the abrupt termination of further discussion, even in forums specifically identified as focusing on measurement, quantitative methods, or psychometrics. Is the attachment of our field to the (mis?)interpretation of Stevens, with its blatant disregard of the fact that ordinal data do not constitute measurement, merely another case of the emperor’s new clothes (Stone, 2002)? Let’s look at the individual components of that tradition: what is routine practice, what the definition of measurement implies, and the status of each of the ubiquitous four levels of measurement.
Under the pretense of measuring, the common practice has been for psychologists to describe the raw data at hand. They report how many people answered the item correctly (or agreed with the prompt), how highly related one response is to another, and what the correlation is between each item and total score. These mere descriptions have chained our thinking to the level of raw data, and raw data are not measures. Although psychologists generally accept counts as ‘measurement’ in the human sciences, this usage cannot replace measurement as it is known in the physical sciences. The flurry of activity and the weight of scientific importance have been unduly assigned to statistical analyses rather than to measurement. This misemphasis, coupled with unbounded faith in the attributions of numbers to events as sufficing for measurement, has blinded psychologists, in particular, to the inadequacy of these methods. Michell (1997) is quite blunt about this in his paper, titled ‘Quantitative Science and the Definition of Measurement in Psychology’, in which psychologists’ “sustained failure to cognize relatively obvious methodological facts” is termed “methodological thought disorder” (p. 374). The question remains: Is it possible that as scientists specializing in the human sciences, we might open our minds to the possibility that we haven’t been measuring anything at all? Or that, if we have, it has been due as much to good intentions and good fortune as to our invocation of appropriate measurement methodology?

Children Can Construct Measures

It is clear that in the social sciences, the term ‘measurement’ has a cachet not shared by the terms ‘quantitative’ or ‘statistics’. Perhaps we could learn something about what measurement really entails if we could look at the development of measurement concepts among those in whom these developments are still taking place: children. Part of Jean Piaget’s research agenda in Geneva was stimulated by discussions he had in Davos, Switzerland, with Albert Einstein during a meeting in 1928 (Ducret, 1990). Einstein counseled Piaget to examine the development of the concepts of speed, distance, and time in young children to see which of those concepts was logically primitive (i.e., if speed = distance/time, which of them could possibly develop before the others?). Piaget went on to examine the progressive construction of the concepts of length (and measurement) in children and reported those findings in 1948 (Piaget, Inhelder, & Szeminska, 1948/1960).
Piaget’s assistants provided children with sets of materials to use in their investigations and, through a series of loosely structured questions, asked children to develop a rudimentary measurement system for the materials (usually wooden dowels) provided. The authors reported the following temporal and logical sequence in the acquisition of lineal measurement concepts in children:
  1. Children classified (grouped) the supplied objects into a class with at least one common attribute suitable for measurement (e.g., wooden rods) and put aside the rest (e.g., tumbler, ball, etc.).
  2. They then seriated (ordered) those selected objects according to the variation of that attribute (e.g., length).
  3. Children then identified an arbitrary unit of difference between two successive lengths (e.g., a small piece of rod, A, such that rod C − rod B = rod A). Their iteration of that unit was used to calculate length relationships, so that rod B = 2 × rod A; and rod B + rod A = rod C, and so forth. Their measurement attempts revealed that the generalized difference between any two adjacent rods, Xn and the next largest X(n + 1), was the arbitrary unit, A.
  4. In time, each child realized that the iterated unit of measurement must be standardized across all appropriate measurement contexts so all lengths could be measured against a common linear measurement scale.
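The sequence above can be walked through in a few lines of code. This is only an illustrative sketch: the object list, the field names, and the numeric lengths are hypothetical stand-ins for the physical comparisons the children made by hand, chosen so that rod C − rod B = rod A and rod B = 2 × rod A, as in the example.

```python
# Hypothetical materials; "length" stands in for what a child finds by
# laying objects side by side (None = no usable length attribute).
objects = [
    {"name": "rod C", "kind": "rod", "length": 15},
    {"name": "tumbler", "kind": "tumbler", "length": None},
    {"name": "rod A", "kind": "rod", "length": 5},
    {"name": "ball", "kind": "ball", "length": None},
    {"name": "rod B", "kind": "rod", "length": 10},
]

# Step 1, classification: keep the objects that share the attribute.
rods = [o for o in objects if o["kind"] == "rod"]

# Step 2, seriation: order them by the variation of that attribute.
rods.sort(key=lambda o: o["length"])

# Step 3, iteration: the difference between successive rods is the
# arbitrary unit, and it must be the same at every step of the series.
lengths = [o["length"] for o in rods]
differences = [b - a for a, b in zip(lengths, lengths[1:])]
unit = differences[0]
unit_is_constant = all(d == unit for d in differences)

# Step 4, toward standardization: express every rod as a count of the
# same unit, so all lengths sit on one common scale.
in_units = {o["name"]: o["length"] // unit for o in rods}

print([o["name"] for o in rods])  # seriated order
print(unit_is_constant, in_units)
```

Steps 1 and 2 produce only a grouping and an ordering; quantitative statements such as "rod B = 2 × rod A" become available only at step 3, once a repeatable unit exists.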
Of course, it does not take Schopenhauer to detect the parallels between the outcomes of the investigations with young children and what we have learned about levels of measurement from Stevens in our introductory college classes. For Piaget and the children, the hierarchical developmental sequence is classification, seriation, iteration, and then standardization; for Stevens’s levels, nominal, ordinal, interval, and ratio. The interesting but crucial difference—which is well known to mature grade school children and seems to be ignored by many in the human sciences—is that although classification and seriation are necessary precursors to the development of measurement systems, they, in and of themselves, are not sufficient for measurement. The distinctive attribute of the lineal measurement system is the requirement for an arbitrary unit of difference that can be iterated between successive lengths. School children are quick to insist that convenient lineal measurement units such as hand width and foot length are inadequate even for classroom projects; they demand the use of a standard length of stick, at least.
It is on this latter point that proponents of the Rasch models for measurement focus their attention: How do we develop units of measurement, which at first must be arbitrary but can be iterated along a scale of interest so the unit values remain the same? This is the prime focus for Rasch measurement. The cover of a handout from the Rasch measurement Special Interest Group of the American Educational Research Association bore the motto ‘Making Measures’. Each cover of the Journal of Applied Measurement states the same objective in a different way: ‘Constructing Variables’. It might be a very long time before those of us in the human sciences can happily adopt a genuine zero starting point for the measurement of math achievement or cognitive development or decide what zero introversion or quality of life looks like, but those who work painstakingly toward making measures so that the resultant scales have interval measurement properties are making an important contribution to scientific progress. In keeping with the development of instruments in the physical sciences, we need to spend more time investigating our scales than investigating with our scales. These attempts at the construction of measures go beyond merely naming and ordering indicators toward the perhaps unattainable Holy Grail of genuine ratio measures.
In terms of Stevens’s levels, the authors then would conclude that the nominal and ordinal levels are NOT any form of measurement in and of themselves. Of course, we concur that his interval and ratio levels actually would constitute some form of genuine measurement. However, the scales to which we routinely ascribe that measurement status in the human sciences are often merely presumed to have interval-level measurement properties; those measurement properties are almost never tested empirically. It is not good enough to allocate numbers to human behaviors and then merely to assert that this is measurement in the social sciences.
Up to this point in the chapter, the authors have ignored a crucial aspect of Stevens’s definition (because we want to direct particular attention to it). Stevens reminded us that the numerical allocations have to be made ‘according to a rule’, and therein lies the rub. What his definition fails to specify is that scientific measurement requires the allocations to be carried out according to a set of rules that will produce, at minimum, a resultant scale with a unit value that will maintain its value along the whole scale. Numerical allocations made ‘according to just a(ny) rule’ produce many of the very useful indicators of the human condition we habitually use in our research, but only some of those would qualify as ‘measurement’ so defined.
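What it means for a unit to "maintain its value along the whole scale" can be sketched numerically. The chapter has not yet stated the model at this point, so the sketch assumes the standard dichotomous Rasch form, in which the probability of success depends only on the difference between person ability and item difficulty, both expressed in logits. The function names and the chosen ability values are ours, for illustration only.

```python
import math

def rasch_probability(ability, difficulty):
    """Success probability under the dichotomous Rasch model.

    Assumed form: P = exp(B - D) / (1 + exp(B - D)), with ability B and
    difficulty D both in logits.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def odds(p):
    """Odds of success corresponding to probability p."""
    return p / (1.0 - p)

def odds_ratio_for_one_unit(ability, difficulty=0.0):
    """Factor by which the odds change when ability rises by one logit."""
    before = odds(rasch_probability(ability, difficulty))
    after = odds(rasch_probability(ability + 1.0, difficulty))
    return after / before

# One extra logit at a low point and at a high point of the scale.
low = odds_ratio_for_one_unit(-2.0)
high = odds_ratio_for_one_unit(1.5)

# The same one-logit step measured in raw probability.
gain_low = rasch_probability(-1.0, 0.0) - rasch_probability(-2.0, 0.0)
gain_high = rasch_probability(2.5, 0.0) - rasch_probability(1.5, 0.0)

print(low, high)            # same multiplicative change in odds
print(gain_low, gain_high)  # different changes in probability
```

Under this assumed form, a one-logit step multiplies the odds of success by the same factor wherever it is taken, which is the sense in which the logit behaves as an iterable unit; the corresponding change in raw probability (and hence in expected raw score) is not constant along the scale.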

Statistics and/or Measurement

One regrettable consequence of the Stevens tradition, and the position of others on this matter, is that statistical analysis has dominated social sciences to the almost complete exclusion of the concept of measurement. Introductory texts and courses about social science measurement are, routinely, about statis...

Table of contents

  1. Cover
  2. Title
  3. Copyright
  4. Dedication
  5. Contents
  6. Foreword
  7. Preface
  8. About the Authors
  9. 1 Why Measurement Is Fundamental
  10. 2 Important Principles of Measurement Made Explicit
  11. 3 Basic Principles of the Rasch Model
  12. 4 Building a Set of Items for Measurement
  13. 5 Invariance: A Crucial Property of Scientific Measurement
  14. 6 Measurement Using Likert Scales
  15. 7 The Partial Credit Rasch Model
  16. 8 Measuring Facets Beyond Ability and Difficulty
  17. 9 Making Measures, Setting Standards, and Rasch Regression
  18. 10 The Rasch Model Applied Across the Human Sciences
  19. 11 Rasch Modeling Applied: Rating Scale Design
  20. 12 Rasch Model Requirements: Model Fit and Unidimensionality
  21. 13 A Synthetic Overview
  22. Appendix A: Getting Started
  23. Appendix B: Technical Aspects of the Rasch Model
  24. Glossary
  25. Author Index
  26. Subject Index