Measurement Theory in Action

Case Studies and Exercises

Kenneth S Shultz, David J. Whitney, Michael J Zickar

  1. 416 pages
  2. English

About This Book

Measurement Theory in Action, Third Edition, helps readers apply testing and measurement theories and features 22 self-contained modules which instructors can match to their courses. Each module features an overview of a measurement issue and a step-by-step application of that theory. Best Practices provide recommendations for ensuring the appropriate application of the theory. Practical Questions help students assess their understanding of the topic. Students can apply the material using real data in the Exercises, some of which require no computer access, while others involve the use of statistical software to solve the problem. Case Studies in each module depict typical dilemmas faced when applying measurement theory followed by Questions to Ponder to encourage critical examination of the issues noted in the cases. The book's website houses the data sets, additional exercises, PowerPoints, and more. Other features include suggested readings to further one's understanding of the topics, a glossary, and a comprehensive exercise in Appendix A that incorporates many of the steps in the development of a measure of typical performance.

Updated throughout to reflect recent changes in the field, the new edition also features:

  • Recent changes in understanding measurement, with over 50 new and updated references
  • Explanations of why each chapter, article, or book in each module's Further Readings section is recommended
  • Instructors will find suggested answers to the book's questions and exercises; detailed solutions to the exercises; test bank with 10 multiple choice and 5 short answer questions for each module; and PowerPoint slides. Students and instructors can access SPSS data sets; additional exercises; the glossary; and additional information helpful in understanding psychometric concepts.

It is ideal as a text for any psychometrics or testing and measurement course taught in psychology, education, marketing, and management. It is also an invaluable reference for professional researchers in need of a quick refresher on applying measurement theory.

Information

Publisher: Routledge
Year: 2020
ISBN: 9781000287950
Edition: 3

Part I
Introduction

Module 1

Introduction and Overview

Thousands of important, and oftentimes life-altering, decisions are made every day. Who should we hire? Which students should be placed in accelerated or remedial programs? Which defendants should be incarcerated and which paroled? Which treatment regimen will work best for a given client? Should custody of this child be granted to the mother or the father or the grandparents? In each of these situations, a “test” may be used to help provide guidance. There are many vocal opponents to the use of standardized tests to make such decisions. However, the bottom line is that these critical decisions will ultimately be made with or without the use of test information. The question we have to ask ourselves is, “Can a better decision be made with the use of relevant test information?” In many, although not all, instances, the answer will be yes, if a well-developed and appropriate test is used in combination with other relevant, well-justified information available to the decision maker. What many individuals object to is the use of standardized tests as the sole basis for making an important, sometimes life-altering, decision. Thus, it would behoove any decision maker to take full advantage of other relevant, well-justified information, where available, to make the best and most informed decision possible.
A quick point regarding “other relevant and well-justified information” is in order. What one decision maker sees as “relevant” may not seem relevant and well justified to another constituent in the testing process. For example, as one of the reviewers of an earlier edition of this book pointed out, a manager in an organization may be willing to use tests that demonstrate validity and reliability for selecting workers in his organization. However, he may ultimately decide to rely more heavily on what he deems to be “other relevant information” but is in fact simply his belief in his own biased intuition about people or non-job-relevant information obtained from social media profiles. To this supervisor, his intuitions, or non-systematic information gathered from social media profiles, are legitimate “other relevant information” beyond test scores. However, others in the testing process may not view the supervisor’s intuitions, or non-systematic information obtained from social media profiles, as relevant. Thus, when we say that other relevant information beyond well-developed and validated tests should be used when appropriate, we are not talking about information such as intuition (which should be distinguished from professional judgment, which, more often than not, is in fact relevant) or non-systematic information obtained from, say, casually perusing a job applicant’s social media profiles. Rather, we are referring to additional relevant information such as professional references, systematic background checks, structured observations, professional judgments, and the like; that is, additional information that can be well justified, as well as systematically developed, collected, and evaluated. Thus, we are not recommending collecting and using additional information beyond tests simply for the sake of doing so. Rather, any “other relevant information” that is used in addition to test information to make critical decisions should be well justified and supported by professional standards, as well as appropriate for the context it is being proposed for.

What Makes Tests Useful

Tests can take many forms, from traditional paper-and-pencil exams to portfolio assessments, job interviews, case histories, behavioral observations, computer adaptive assessments, and peer ratings, to name just a few. The common theme in all of these assessment procedures is that they represent a sample of behaviors from the test taker. Thus, psychological testing is similar to any science in that a sample is taken to make inferences about a population. In this case, the sample consists of behaviors (e.g., test responses on a paper-and-pencil test or performance of physical tasks on a physical ability test) from a larger domain of all possible behaviors representing a construct. For example, the first test we take when we come into the world is called the APGAR test. That’s right, just one minute into the world we get our first test. You probably do not remember your score on your APGAR test, but our guess is your mother does, given the importance this first test has in revealing your initial physical functioning. The purpose of the APGAR test is to assess a newborn’s general functioning right after birth. Table 1.1 displays the five categories that newborn infants are tested on at one and five minutes after birth: Appearance, Pulse, Grimace, Activity, and Respiration (hence, the acronym APGAR). A score is obtained by summing the newborn infant’s assessed value on each of the dimensions. Scores can range from 0 to 10. A score of 7–10 is considered normal. A score of 4–6 indicates that the newborn infant may require some resuscitation, while a score of 3 or less means the newborn would require immediate and intensive resuscitation. The infant is then assessed again at five minutes, and if the score is still below 7, the infant may be assessed again at 10 minutes. If the infant’s APGAR score is 7 or above five minutes after birth, which is typical, then no further intervention is called for. Hence, by taking a relatively small sampling of behavior, we are (or at least a competent obstetrics nurse or doctor is) able to quickly, and quite accurately, assess the functioning of a newborn infant to determine if resuscitation interventions are required to help the newborn function properly.
Table 1.1 The APGAR Test Scoring Table
Sign                    | 0 points       | 1 point                     | 2 points
Appearance (color)      | Pale or blue   | Body pink, extremities blue | Pink (normal for non-Caucasian)
Pulse (heartbeat)       | Not detectible | Lower than 100 bpm          | Higher than 100 bpm
Grimace (reflex)        | No response    | Grimace                     | Lusty cry
Activity (muscle tone)  | Flaccid        | Some movement               | A lot of activity
Respiration (breathing) | None           | Slow, irregular             | Good (crying)
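To make the scoring rule concrete, the summation and interpretation bands described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names (`apgar_total`, `apgar_band`) and the sample ratings are our own assumptions and are not part of any clinical software or of the book's materials.

```python
from typing import Dict

# The five signs from Table 1.1; each is rated 0, 1, or 2.
APGAR_SIGNS = ("appearance", "pulse", "grimace", "activity", "respiration")


def apgar_total(ratings: Dict[str, int]) -> int:
    """Sum the five sign ratings into a total score from 0 to 10."""
    missing = [sign for sign in APGAR_SIGNS if sign not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    for sign in APGAR_SIGNS:
        if ratings[sign] not in (0, 1, 2):
            raise ValueError(f"{sign} must be rated 0, 1, or 2")
    return sum(ratings[sign] for sign in APGAR_SIGNS)


def apgar_band(total: int) -> str:
    """Map a total score to the interpretation bands given in the text."""
    if not 0 <= total <= 10:
        raise ValueError("total must be between 0 and 10")
    if total >= 7:
        return "normal"
    if total >= 4:
        return "may require some resuscitation"
    return "immediate and intensive resuscitation"


# Hypothetical one-minute assessment (made-up values for illustration only).
one_minute = {
    "appearance": 1,
    "pulse": 2,
    "grimace": 1,
    "activity": 1,
    "respiration": 1,
}
score = apgar_total(one_minute)
print(score, "->", apgar_band(score))
```

The same two functions could be applied again to the five-minute ratings to decide whether a further assessment at 10 minutes is warranted.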
The utility of any assessment device, however, will depend on the qualities of the test and the intended use of the test. Test information can be used for a variety of purposes from making predictions about the likelihood that a patient will commit suicide to making personnel selection decisions by determining which entry-level workers to hire. Tests can also be used for classification purposes, as when students are designated as remedial, gifted, or somewhere in between. Tests can also be used for evaluation purposes, as in the use of a classroom test to evaluate performance of students in a given subject matter. Counseling psychologists routinely use tests to assess clients for emotional adjustment problems or possibly for help in providing vocational and career counseling. Finally, tests can also be used for research-only purposes such as when an experimenter uses a test to prescreen study participants to assign each one to an experimental condition. If the test is not used for its intended purpose, however, it will not be very useful and, in fact, may actually be harmful. As Anastasi and Urbina (1997) note, “Psychological tests are tools ... Any tool can be an instrument of good or harm, depending on how it is used” (p. 2).
For example, most American children in grades 2–12 are required to take standardized tests on a yearly basis. These tests were initially intended for the sole purpose of assessing students’ learning outcomes. Over time, however, a variety of misuses of these tests has emerged. For instance, they are frequently used to determine school funding and, in some cases, teachers’ or school administrators’ “merit” pay. However, given that determining the pay levels of educational employees was not the intended use of such standardized educational tests when they were developed, they almost always serve poorly in this capacity. Thus, a test that was developed with good (i.e., appropriate) intentions can be (mis)used for inappropriate purposes, limiting the usefulness of the test. In this instance, however, not only is the test of little use in setting pay for teachers and administrators, it may actually be causing harm to students by coercing teachers to “teach to the test,” thereby trading long-term gains in learning for short-term increases in standardized test performance.
In addition, no matter how the test is used, it will only be useful if it meets certain psychometric and practical requirements. From a psychometric or measurement standpoint, we want to know if the test is accurate, standardized, and reliable; if it demonstrates evidence of validity; and if it is free of both measurement and predictive bias. Procedures for determining these psychometric qualities form the core of the rest of this book. From a practical standpoint, the test must be cost effective as well as relatively easy to administer and score. Reflecting on our earlier example, we would surmise that the APGAR meets most of these practical requirements. Trained doctors and nurses in a hospital delivery room can administer the APGAR quickly and efficiently. Our key psychometric concern in this situation may be how often different doctors and nurses are able to provide similar APGAR scores in a given situation (i.e., the inter-rater reliability of the APGAR).
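As a rough illustration of what checking inter-rater agreement might look like, the sketch below compares APGAR totals assigned by two raters to the same newborns. The text does not say which agreement index should be used; exact percent agreement and Cohen's kappa are our assumption here, and kappa as written treats each score as an unordered category. The ratings are invented for the example.

```python
from collections import Counter
from typing import Sequence


def percent_agreement(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Proportion of cases on which the two raters gave the identical score."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of cases")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Agreement corrected for chance, treating scores as nominal categories."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: product of each rater's marginal proportions per category.
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one category only")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical APGAR totals from two nurses for the same eight newborns.
nurse_a = [9, 8, 7, 9, 6, 8, 9, 7]
nurse_b = [9, 8, 8, 9, 6, 8, 9, 6]

print("exact agreement:", percent_agreement(nurse_a, nurse_b))
print("Cohen's kappa:", round(cohens_kappa(nurse_a, nurse_b), 3))
```

Because APGAR totals are ordered, an index that credits near misses (for example, a weighted kappa or an intraclass correlation) may be more appropriate in practice; the unweighted version is used here only to keep the sketch short.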

Individual Differences

Ultimately, when it comes right down to it, those interested in applied psychological measurement are usually interested in some form of individual differences (i.e., how individuals differ on test scores and the underlying traits being measured by those tests). If there are no differences in how target individuals score on the test, then the test will have little value to us. For example, if we give a group of elite athletes the standard physical ability test given to candidates for a police officer job, there will likely be very little variability in scores with all the athletes scoring extremely high on the test. Thus, the test data would provide little value in predicting which athletes would make good police officers. On the other hand, if we had a more typical group of job candidates who passed previous hurdles in the personnel selection process for police officer (e.g., cognitive tests, background checks, psychological evaluations) and administered them the same physical ability test, we would see much wider variability in scores. Thus, the test would at least have the potential to be a useful predictor of job success, as we would have at least some variability in the observed test scores.
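The contrast between the two groups can be seen by comparing the spread of their scores. The sketch below uses Python's standard `statistics` module; the score values and the `spread` helper are hypothetical and serve only to illustrate a restricted versus a wider range of scores.

```python
import statistics
from typing import Dict, Sequence


def spread(scores: Sequence[float]) -> Dict[str, float]:
    """Summarize how much a set of test scores varies."""
    return {
        "min": min(scores),
        "max": max(scores),
        "sd": statistics.stdev(scores),  # sample standard deviation
    }


# Made-up physical ability test scores (0-100 scale) for illustration only.
elite_athletes = [98, 99, 100, 99, 100, 98, 99, 100]   # nearly everyone at ceiling
typical_candidates = [62, 75, 88, 54, 91, 70, 83, 66]  # scores spread widely

print("athletes:  ", spread(elite_athletes))
print("candidates:", spread(typical_candidates))
```

With almost no variability among the athletes, the scores cannot distinguish one athlete from another, so they have little potential to predict who would perform better on the job.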
Individual differences on psychological tests can take several different forms. Typically, we look at inter-individual differences where we examine differences on the same construct across individuals. In such cases, the desire is usually prediction. That is, how well does the test predict some criterion of interest? For example, in the preceding scenario, we would u...
