eBook - ePub

Handbook of Test Security

Name: Handbook of Test Security
ISBN: 9781136747991

362 pages
English

About this book

High-stakes tests are the gatekeepers to many educational and professional goals. As such, the incentive to cheat is high. This Handbook is the first to offer insights from experts within the testing community, psychometricians, and policymakers to identify and develop best practice guidelines for the design of test security systems for a variety of testing genres. Until now this information was scattered and often resided inside testing companies. As a result, rather than being able to learn from each other's experiences, each testing entity was left to re-create its own test security wheel.

As a whole the book provides invaluable insight into the prevalence of cheating and "best practices" for designing security plans, training personnel, and detecting and investigating misconduct, to help develop more secure testing systems and reduce the likelihood of future security breaches. Actual case studies from a variety of settings bring to life how security systems really work. Examples from both domestic and international programs are provided. Highlights of coverage include:
• Best practices for designing secure tests
• Analysis of security vulnerabilities for all genres of testing
• Practical cheating prevention and detection strategies
• Lessons learned in actual security violations in high profile testing programs.

Part I focuses on how tests are delivered for paper-and-pencil, technology-based, and classroom testing and writing assessment. Each chapter addresses the prevalence of the problem and threats to security, prevention, and detection. Part II addresses issues essential to maintaining a secure testing program such as planning and monitoring, physical security, the detection of group-based cheating, investigating misconduct, and communicating about security-related issues. Part III examines actual examples of cheating: how the cheating was done, how it was detected, and the lessons learned. It provides insight into security issues within each of the Association of Test Publishers' four divisions: certification/licensure, clinical, educational, and industrial/organizational testing. Its conclusion revisits the issues addressed in the case studies and identifies common themes.

Intended for organizations, professionals, educators, policy makers, researchers, and advanced students who design, develop, or use high-stakes tests, this book is also ideal for graduate-level courses on test development, educational measurement, or educational policy.

Handbook of Test Security by James A. Wollack and John J. Fremer is available in PDF and/or ePUB format and is categorized under Education & Business Education.

Introduction

The Test Security Threat

James A. Wollack and John J. Fremer

Introduction

As a result of the public’s seemingly insatiable hunger for scandal, it is difficult to pick up a national newspaper or watch a national news program without learning about athletes taking performance-enhancing drugs, teams spying on other teams’ practices or trying to steal their signals, individuals falsifying tax documents to avoid paying taxes, stock brokers engaged in insider trading, investment firms conducting Ponzi schemes, seemingly happily married folk cheating on their partners, or, as is the focus in the Handbook, examinees, educators, or entrepreneurs cheating or helping others to cheat on standardized examinations. In light of the popular media’s penchant for sensationalizing stories, it is easy to become desensitized to test fraud. However, cheating on tests is very real, and the impact it is having on test score interpretations, the public’s confidence in the testing industry, and our economy ought not to be underestimated. In this chapter, we discuss the magnitude of the security problem in all areas of testing, and attempt to set the stage for the chapters that follow.

How Prevalent is Cheating?

As long as there have been tests for which important, high-stakes decisions are made, there have been people endeavoring to find a means for artificially inflating their scores. Indeed, cheating on tests was detected in the Keju Chinese civil service exams which began in AD 606. These exams were used to find the “best” individuals to serve in the administration of the country, were extremely challenging, and very few individuals passed. At the height of the Chinese civil service testing program at the end of the 19th century, it is estimated that, at most, one in 250,000 people sitting for the exams would ever achieve the marks necessary to become eligible for an official government appointment (Suen & Yu, 2006). Because the tests were so selective and so critical to their mission, the Chinese government went to great lengths to protect the integrity of the exams, including instituting a number of preventive measures, such as restricting clothing and resources allowed into the testing rooms, subjecting examinees to body searches, and sequestering examinees in heavily guarded, prison-like exam compounds. In addition, the government established severe sanctions for any individual caught cheating, including stripping examinees of all previously earned credentials, caning, or even execution (Suen & Yu, 2006; Taylor & Taylor, 1995). However, the rewards bestowed upon those who passed (and their families) were tremendous: power, fortune, fame. As a result, in spite of the government’s best efforts, cheating was still rampant.

Many of the methods used to cheat were the same methods used today: bringing crib notes and cheat sheets, writing notes on clothing, body parts, or other “materials,” and collaborating with co-conspirators (either inside or outside the testing room). Perhaps the most frequent form of cheating was impersonation, or as it has come to be known, proxy testing. Because authentication of examinee identity consisted of verbal descriptions of candidates, an estimated 30–40 percent of candidates sitting for the first phase of exams were believed to be paid body-doubles (Suen & Yu, 2006).

Today, though impersonation is by no means extinct, advances with respect to video surveillance equipment and biometric technologies, such as fingerprinting, retinal scans, and keystroke analytics, provide the tools to successfully combat proxy test taking, even if such measures are utilized primarily in large international testing programs. Unfortunately, the same cannot be said for other types of cheating. In fact, it is almost certainly the case that cheating on tests is more prevalent now than ever before.

Making matters more challenging is the fact that cheating approaches continue to evolve and expand. Cohen & Wollack (2006) emphasized that an unintended consequence of improving technology is that there are many new and more sophisticated ways to cheat on tests. Pagers, cell phones, personal digital assistants, voice recorders, iPads, MP3 players, laptop and tablet computers, advanced calculators, two-way radios, and tiny wireless microphones packaged with earpieces and transmitters all make it routine to communicate with individuals outside (and inside) the testing room. Many of these devices may also be used to access the Internet and provide access to a huge supply of information, which can include extremely elaborate notes and other handy resources that an examinee may have posted to his/her website prior to the exam. Video cameras disguised as jewelry, pens, or shirt buttons can be used to reproduce exact copies of a test, which can then be transmitted instantly to locations all around the world. Written assessments, particularly in the case of high school or college students who are asked to complete their papers outside of class, are now easier than ever to plagiarize, because the Internet exposes students to an unlimited number of untraceable references, not to mention countless sites where completed papers of any length, format, and quality may be ordered up, in much the same way that one goes on-line to order a computer that is customized to meet certain specifications.

Furthermore, because of the perceived administrative and psychometric advantages, an increasing number of tests are now being delivered on computer, be it computer-based linear testing (CBT), computer adaptive testing (CAT), or web-based Internet testing. In the third edition of Educational Measurement, it was argued that computer-based delivery systems, which were then in their infancy, offered improved test security relative to traditional paper-and-pencil tests (Bunderson, Inouye, & Olsen, 1989). After all, they said, computer tests result in no paper copies of exams or keys, and may be stored electronically in encrypted or password-protected files which grant access to only authorized parties, and can automatically shuffle items and corresponding keys to make it more difficult for a student to follow the screen of a neighboring examinee. In the case of CAT, they added, because each examinee receives a different equated test, copying is near impossible and “[i]t would be difficult to steal and memorize each of the hundred or so items for each of several such tests” (p. 386). As it turns out, in the quarter century since the third edition of Educational Measurement was published, we have learned the hard way that cheating on computer-based tests is not as “difficult” as was originally believed. In fact, although many of the specifics of what Bunderson et al. (1989) noted about the security of CBT and CAT were true, the validity issues surrounding item and test compromise have proven so monumental that it would be difficult in light of the experience of computer-based programs to conclude that computerized tests are more secure than paper-and-pencil exams.
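To make the shuffling idea concrete, the following is a minimal sketch, not a description of any real delivery system. It assumes a hypothetical `Item` record that carries its own answer key, so that reordering items per examinee never separates an item from its key. The names `Item`, `build_form`, `score`, and the tiny item bank are all illustrative assumptions.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical test item; the key is stored with the item itself."""
    item_id: str
    stem: str
    key: str  # label of the correct option


def build_form(items, examinee_seed):
    """Return an examinee-specific ordering of the same items.

    A per-examinee seed makes the order reproducible for later review.
    """
    rng = random.Random(examinee_seed)
    form = list(items)  # copy so the shared bank is not reordered
    rng.shuffle(form)
    return form


def score(form, responses):
    """Count correct responses, matching each response to its item's key."""
    return sum(1 for item, resp in zip(form, responses) if resp == item.key)


# Illustrative item bank (invented content).
BANK = [
    Item("Q1", "2 + 2 = ?", "B"),
    Item("Q2", "Capital of France?", "A"),
    Item("Q3", "Largest planet?", "D"),
    Item("Q4", "H2O is commonly called?", "C"),
]

form_a = build_form(BANK, examinee_seed=101)
form_b = build_form(BANK, examinee_seed=202)
print([item.item_id for item in form_a])
print([item.item_id for item in form_b])
```

Because scoring reads the key from each item rather than from a fixed position, neighboring examinees can see different item orders while being scored identically. As the surrounding discussion makes clear, this addresses screen copying only; it does nothing against items that have been memorized and shared.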

So where, exactly, does that leave us? High-stakes tests are not just for evaluating students in the classroom any more. Testing is a vitally important part of our culture. Tests are routinely used to evaluate people for graduation, admission to universities or graduate/professional programs, scholarships, employment, promotion, and licensure/certification. They are used to evaluate how well individuals perform their jobs, such as evaluating teachers, administrators, schools, districts, and states as part of the No Child Left Behind Act (2001) accountability criteria. They are used to award college credit or to exempt students from graduation requirements. And they are used to diagnose educational and psychological disabilities or relative strengths and weaknesses so that education may be tailored to suit individuals’ needs. The world-wide emphasis on testing, and particularly in the United States, has increased at such a staggering pace that psychometrics is regarded as one of the hottest growing fields in America (Herszenhorn, 2006).

But all this testing can be undermined if test users and developers cannot vouch for the validity of the scores for their designated purposes. In recent years, many major testing programs have had to deal with extensive organized cheating scandals that have caused the validity of the scores for large sets of examinees to be questioned.

• In 2002, Educational Testing Service (ETS) discovered Chinese- and Korean-language braindump websites, in which students posted questions and answers they had memorized from the computerized version of the Graduate Record Examination (GRE), causing average test scores to increase by as much as 100 points (Steinberg, 2002).
• In 2008, Advanced Placement (AP) scores for nearly 400 students in Orange County, CA, were voided because the administrative oversight and proctoring of the exams was particularly poor. Although not all examinees were known to have benefitted from the lax proctoring, all students were required to retake their exams a couple of weeks later. According to reports, students were allowed to use cell phones during the original administration of the exam to send text messages, were seated too close together, and were tested in configurations that were inappropriate (such as facing one another). It was also reported that the number of proctors was inadequate for the number of students testing, and that many proctors were inattentive, including a few who allegedly were reading, fell asleep, or left the room (Mehta, 2008).
• In 2010, the U.S. Justice Department found that 22 Federal Bureau of Investigation (FBI) agents cheated on an exam assessing knowledge of counterterrorism procedures. Although examinees were allowed to use their book and notes during the exam, examinees were also found to have consulted with supervisors and a legal advisor (Stein, 2010).
• In 2011, 15 high school students in New York were arrested (although up to 50 were believed to be involved) for hiring proxy examinees, for between $500 and $3,600 each, to take the SAT and ACT assessments for them (Anderson, 2012). This scandal prompted the New York State legislature to hold a hearing on ways these testing programs could improve their security (Phillips, 2012), and ultimately led to ACT and SAT requiring students to provide photographs of themselves as part of the registration process.
• In 2010, the American Board of Internal Medicine (ABIM) accused 140 doctors of having acquired or assisted in the acquisition of preknowledge of live questions for the Board’s certification exams (Hobson, 2010). Examinees were accused of having either shared test content following an exam or actively sought out such content (including some being paid to acquire actual test questions). At both the point of exam registration and again immediately before taking the exam, ABIM requires exam candidates to agree to adhere to a Board policy strictly forbidding the sharing of test content. A year and a half later, in early 2012, a CNN investigation revealed a widespread practice within the radiology community of residents preparing for their ABIM Board exams using “recalls” or large banks of memorized test questions (Zamost, Griffin, & Ansari, 2012). According to the report, these recall banks were maintained and provided by the residency programs, which encouraged their soon-to-test residents to memorize items to contribute to the banks. Approximately half the items on the radiology exam had appeared on previous forms. Within some programs, CNN found over 15 years’ worth of questions and answers, neatly prepared by the training program as PowerPoint presentations.

Similar scandals have been seen with increasing frequency on State Accountability Testing programs. In 2003, two-thirds of the elementary school teachers in a troubled Dallas school district were found to have improperly helped their students on the Texas Assessment of Knowledge and Skills (TAKS), resulting in nearly perfect scores for many students (Benton, 2006). Two years later, nearly 20,000 TAKS booklets went missing following the test (Benton, 2005). This snafu prompted an investigation which conservatively estimated that there were statistical anomalies consistent with cheating in 8.6 percent of Texas’ schools (Benton, 2006). The cheating on the TAKS was not the first reported incident of educators cheating on behalf of their students, but it was the first highly publicized case in the No Child Left Behind era, and was clearly a sign of things to come.

In 2011 and 2012, there were media reports of educator cheating in many large cities across the United States. In March 2011, six charter schools in Los Angeles were closed after teachers and principals opened sealed copies of the state’s exam so that they could prepare students with the actual test questions (Blume, 2011). Later that month, USA Today published a story revealing staggeringly high numbers of erasures, as well as math and reading gain scores that seemed too good to be true, throughout the Washington, DC, school system from 2006 to 2010 (Gillum & Bello, 2011). Then in early July 2011, a team of special investigators appointed by the Governor of Georgia to probe allegations of test misconduct throughout the Atlanta Public School System published its findings. In their report, the investigators identified 178 educators in at least 44 schools who engaged in cheating in 2008–9 (Vogell, 2011).¹ It is unquestionably the most thoroughly investigated and quite possibly the most widespread instance of school-based cheating uncovered to date. Immediately on the heels of Atlanta, 89 schools in Pennsylvania, including 28 in Philadelphia, were identified as suspicious in July 2011, also based on patterns of erasures and gain scores (Winerip, 2011). Using open records laws to obtain state test data for all states, in March 2012, the Atlanta Journal Constitution (AJC) published the results of an analysis looking into anomalous score gains (and drops) across 69,000 public schools across the country (Vogell, Perry, Judd, & Pell, 2012). The AJC report concluded that approximately 200 school districts had test score patterns that very much resembled those found in Atlanta, and suggested that test scores for some tens of thousands of students in 2010 alone may have been invalid. 
A follow-up analysis published in the AJC a month later revealed what was described as an unusual tendency for schools that had received the prestigious Blue Ribbon Award – given annually to the schools that achieve at the highest levels or that demonstrate the largest growth despite serving largely disadvantaged students – to be dramatically over-represented on the list of anomalous schools (Judd, Vogell, & Perry, 2012b). In many cases, the unusually large gains were followed by equally unusual score drops in the year immediately following the bestowing of the Blue Ribbon Award (Judd, Vogell, & Perry, 2012a).
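The analyses cited above are not described here in methodological detail, so the following is only a rough sketch of what screening for anomalous score changes can look like. It is not the method used by any of the investigations mentioned. It assumes a simple z-score rule on year-over-year change in each school's mean score; the function name `flag_anomalous_gains`, the threshold of 2.0, and all of the score data are invented for illustration.

```python
from statistics import mean, pstdev


def flag_anomalous_gains(scores_by_school, threshold=2.0):
    """Flag schools whose year-over-year change is an outlier.

    scores_by_school maps a school name to (prior_year_mean, current_year_mean).
    A school is flagged when its gain (or drop) lies more than `threshold`
    population standard deviations from the average change.
    """
    gains = {s: after - before for s, (before, after) in scores_by_school.items()}
    mu = mean(gains.values())
    sigma = pstdev(gains.values())
    if sigma == 0:  # every school changed by the same amount
        return []
    return sorted(s for s, g in gains.items() if abs(g - mu) / sigma > threshold)


# Invented data: eight schools with small gains, one with a very large gain.
SCORES = {
    "A": (300, 301), "B": (310, 312), "C": (295, 298),
    "D": (305, 307), "E": (299, 300), "F": (302, 305),
    "G": (308, 310), "H": (297, 299), "I": (290, 320),
}

print(flag_anomalous_gains(SCORES))
```

A flag of this kind is a reason to look more closely, not evidence of misconduct in itself; unusual gains can have legitimate explanations. Because one extreme value inflates the standard deviation, a screen in practice would more plausibly rely on robust statistics and on several indicators together, such as the erasure patterns mentioned above.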

And such problems are not limited to testing programs in the United States. Cheating is a huge problem on college entrance exams in China and Vietnam. These national exams identify the students who may enroll in a four-year university course. Because job prospects are much improved by having a college degree, many students take these tests. However, because space in the universities is limited, pass rates are relatively low, often around 25 percent. The Chinese Ministry of Education estimated that 3,000 students cheated on the 2006 college entrance exam. Common types of cheating included exchanging information with fellow students and carrying mobile phones (People’s Daily Online, 2006). In Vietnam, immediately following the administration of the college entrance exams, the halls and floors are u...

Table of contents

Cover
Half Title
Title Page
Copyright
Contents
List of Figures
List of Tables
About the Editors
Preface
1. Introduction: The Test Security Threat
Part I: Designing Secure Delivery Systems
Part II: Important Issues in Test Security
Part III: Lessons Learned from Practice: Case Studies from a Variety of Disciplines
Author Index
Subject Index