Computerized Adaptive Testing
eBook - ePub

Computerized Adaptive Testing

A Primer

Howard Wainer, Neil J. Dorans, Ronald Flaugher, Bert F. Green, Robert J. Mislevy

  1. 360 pages
  2. English
  3. ePUB (mobile friendly)

About This Book

This celebrated primer presents an introduction to all of the key ingredients in understanding computerized adaptive testing technology, test development, statistics, and mental test theory. Based on years of research, this accessible book educates the novice and serves as a compendium of state-of-the-art information for professionals interested in computerized testing in the areas of education, psychology, and other related social sciences. A hypothetical test taken as a prelude to employment is used as a common example throughout to highlight this book's most important features and problems. Changes in the new edition include:
*a completely rewritten chapter 2 on the system considerations needed for modern computerized adaptive testing;
*a revised chapter 4 to include the latest in methodology surrounding online calibration and in the modeling of testlets; and
*a new chapter 10 with helpful information on how test items are really selected, usage patterns, how usage patterns influence the number of new items required, and tools for managing item pools.

Information

Publisher
Routledge
Year
2000
ISBN
9781135660819

1



Introduction and History



Howard Wainer

PROLOGUE

As we approach the end of the twentieth century we see the influence of computers all around us. In the 1970s computers worked behind the scenes to balance books, write paychecks, prepare weather reports, and do any number of tasks whose characteristics usually included odious repetitive operations. In the 1980s there was a change. Computers came out of the basement. The bank’s computer began to deal with the customer first hand, without the human intervention of bank employees. On most desks was a personal computer that processed both words and data, and could be connected with others through telephone networks, which themselves were run by computers. Tasks that computers now do are starting to get more complex. Machine intelligence, Inference engines, and Expert Systems are terms that are increasingly in vogue.
The use of computers within the context of mental testing has paralleled this development. In the 1970s large testing programs used computers to score tests and process score reports. In the 1980s we have begun to see computers administer exams. The increasingly broad availability of high-powered computing has made possible the administration of types of exam questions that were previously impractical. Moreover, exams could be individualized to suit the person taking them. Of course the development of procedures that adapt to the proficiency of the examinee required the solution of many difficult statistical and psychometric problems. These problems have presented challenges that have only now been solved sufficiently well for practical large-scale application. This volume is a description of how to build, maintain, and use a computerized adaptive testing system (a CAT).
Aristotle, in his Metaphysics, pointed out, “We understand best those things we see grow from their very beginnings.” We agree. Thus, our description of what we believe is the future of testing begins with a brief glimpse into its past.

THE FIRST FOUR MILLENNIA OF MENTAL TESTING

The use of mental tests appears to be almost as ancient as western civilization. The Bible (Judges 12:4–6) provides an early reference in western culture. It describes a short verbal test that the Gileadites used to uncover the fleeing Ephraimites hiding in their midst. The test was one item long. Candidates had to pronounce the word shibboleth; Ephraimites apparently pronounced the initial sh as s. Although the consequences of this test were quite severe (the banks of the Jordan were strewn with the bodies of the 42,000 who failed), there is no record of any validity study.
Some rudimentary proficiency testing that took place in China around 2200 B.C. predated the biblical program by almost a thousand years. The emperor of China is said to have examined his officials every third year. This set a precedent for periodic exams in China that was to persist for a very long time. In 1115 B.C., at the beginning of the Chan dynasty, formal testing procedures were instituted for candidates for office. Job sample tests were used, with proficiency required in archery, arithmetic, horsemanship, music, writing, and skill in the rites and ceremonies of public and social life.
The Chinese discovered that a relatively small sample of an individual’s performance, measured under carefully controlled conditions, could yield an accurate picture of that individual’s ability to perform under much broader conditions for a longer period of time. The procedures developed by the Chinese (Têng, 1943) are quite similar to the canons of good testing practice used today. For example, they required objectivity—candidates’ names were concealed to insure anonymity; they sometimes went so far as to have the answers redrafted by another individual to hide the handwriting. Tests were often read by two independent examiners, with a third brought in to adjudicate differences. Test conditions were as uniform as could be managed—proctors watched over the exams given in special examination halls that were large permanent structures consisting of hundreds of small cells. Sometimes candidates died during the course of the exams.
This testing program was augmented and modified through the years and has been praised by many western scholars. Voltaire and Quesnay advocated its use in France, where it was adopted in 1791 only to be (temporarily) abolished by Napoleon. It was cited by British reformers as their model for the system set up in 1833 to select trainees for the Indian civil service—the precursor to the British civil service. The success of the British system influenced Senator Charles Sumner and Representative Thomas Jenckes in developing the examination system they introduced into Congress in 1868. There was a careful description of the British and Chinese system in Jenckes’ report “Civil Service in the United States,” which laid the foundation for the establishment of the Civil Service Act passed in January 1883.
Universities lagged far behind in their efforts to install examination systems. The first appears to be the formal exams begun at the University of Bologna in 1219. This was exclusively an oral exam. This structure was also described by Robert de Sorbon, the chaplain of Louis IX, as being used in that court. It was adopted for use in 1257 in the community of scholars that evolved into the Sorbonne. Written tests within universities seem to have their genesis much later with the sixteenth century Jesuits. The first pioneering effort at the development of formal test standards came from this order. In 1599, after several preliminary drafts, eleven rules for the conduct of exams were published. These rules (see McGucken, 1932) are almost indistinguishable from those used today.
The tradition of oral exams spread quickly, and by the mid-seventeenth century such exams were a standard part of an Oxford education. Written exams were also used and by the middle of the nineteenth century were widely applied in the United States and Western Europe. By the beginning of the twentieth century, serious research efforts had begun on the use and usefulness of various testing procedures. These were done in the United States by Cattell, Farrand (later president of Cornell), Jastrow, Thorndike, Wissler, and Witmer (who founded the first psychological clinic) and in Europe, where Kraepelin (one of Wundt’s first students) and Ebbinghaus did important work that eventually led to Binet’s intelligence test and Terman’s use of it to study “Genius and Stupidity” in his dissertation.
The flurry of activity in testing at the beginning of the twentieth century spanned a broader range of disciplines than just psychology. One of the most crucial contributions was from statistics, when Spearman provided the rudiments of psychometrics. He invented reliability coefficients and much of the ancillary statistical machinery that allowed their estimation and interpretation.
Tests of all descriptions began to appear to measure performance on such diverse tasks as verbal analogies (devised by Burt, 1911), shoving various shapes through holes (Woodworth, 1910), solving mazes (Porteus, 1915), and drawing a man (Goodenough, 1926). A major change in test administration was occurring at this same time, when there was a shift in practice from individualized to mass administration. This had positive and negative aspects. It allowed much more efficient testing and provided the possibility of a homogeneous testing environment. But it also increased the possibility of examinees not following the directions properly or for some other reason not performing up to their ability.
As the group administered test was evolving, the multiple choice format became increasingly widespread. E. L. Thorndike, at Columbia, and L. L. Thurstone, at Chicago, arranged test material so that items could be scored with a key. Otis, working with Terman at Stanford, was the first to develop an intelligence test that could be scored completely objectively. Prior to the formal publication of Otis’ test, the United States entered World War I; nevertheless Otis’ test became the prototype of the Army Alpha—the instrument that inaugurated large-scale mental testing.

THE ORIGINS OF MENTAL TESTING IN THE U.S. MILITARY

Robert M. Yerkes, president of the American Psychological Association, took the lead in involving psychologists in the war effort. One major contribution was the implementation of a program for the psychological examination of recruits. Yerkes formed a committee for this purpose which met in May of 1917 at the Vineland Training School. His committee included: W. V. Bingham, H. H. Goddard, T. H. Haines, L. M. Terman, F. L. Wells, and G. M. Whipple. This group debated the relative merits of very brief individual tests versus longer group tests. For reasons of objectivity, uniformity and reliability, they decided to develop a group test of intelligence.
The criteria they adopted (from DuBois, 1970, p. 62) for the development of the new group test were:
1. Adaptability for group use.
2. Correlation with measures of intelligence known to be valid.
3. Measurement of a wide range of ability.
4. Objectivity of scoring, preferably by stencils.
5. Rapidity of scoring.
6. Possibility of many alternate forms so as to discourage coaching.
7. Unfavorableness to malingering.
8. Unfavorableness to cheating.
9. Independence of school training.
10. Minimum of writing in making responses.
11. Material intrinsically interesting.
12. Economy of time.
In just 7 working days they constructed ten subtests with enough items for ten different forms. They then prepared one form for printing and experimental administration. The pilot testing was done with fewer than 500 subjects. These subjects were broadly sampled, coming from such diverse sources as a school for the retarded, a psychopathic hospital, a reformatory, some aviation recruits, some men in an officers’ training camp, 60 high school students and 114 Marines at a Navy yard. They also administered either the Stanford-Binet intelligence test or an abbreviated form of it. The researchers found that their test correlated .9 with the Stanford-Binet and .8 with the abbreviated Binet.
The items and instructions were then edited, time limits revised, and scoring formulas developed to maximize the correlation of the total score with the Binet. Items within each subtest were ordered by difficulty and four alternate forms were prepared for mass administration.
By August, statistical workers under Thorndike’s direction had analyzed the results of the revised test after it had been administered to 3,129 soldiers and 372 inmates of institutions for mental defectives. The results prompted Thorndike to call this the “best group test ever devised.” It yielded good distributions of scores, correlated about .7 with schooling and .5 with ratings by superior officers. This test was dubbed Examination a.
In December of the same year, Examination a was revised once again. It became the famous Army Alpha. This version had only eight subtests; two of the original ten were dropped because of low correlation w...

Citation styles for Computerized Adaptive Testing

APA 6 Citation

Wainer, H., Dorans, N., Flaugher, R., Green, B., & Mislevy, R. (2000). Computerized Adaptive Testing (2nd ed.). Taylor and Francis. Retrieved from https://www.perlego.com/book/1553661/computerized-adaptive-testing-a-primer-pdf (Original work published 2000)