Signal Detection Theory and ROC Analysis in Psychology and Diagnostics
eBook - ePub

Collected Papers

  1. 324 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android

About this book

Signal detection theory--as developed in electrical engineering and based on statistical decision theory--was first applied to human sensory discrimination 40 years ago. The theoretical intent was to provide a valid model of the discrimination process; the methodological intent was to provide reliable measures of discrimination acuity in specific sensory tasks. An analytic method of detection theory, called the relative operating characteristic (ROC), can isolate the effect of the placement of the decision criterion, which may be variable and idiosyncratic, so that a pure measure of intrinsic discrimination acuity is obtained. For the past 20 years, ROC analysis has also been used to measure the discrimination acuity or inherent accuracy of a broad range of practical diagnostic systems. It was widely adopted by methodologists in the field of information retrieval, is increasingly used in weather forecasting, and is the generally preferred method in clinical medicine, primarily in radiology. This book attends to both themes, ROC analysis in the psychology laboratory and in practical diagnostic settings, and to their essential unity.

The focus of this book is on detection and recognition as fundamental tasks that underlie most complex behaviors. As defined here, they serve to distinguish between two alternative, confusable stimulus categories, which may be perceptual or cognitive categories in the psychology laboratory, or different states of the world in practical diagnostic tasks.

This book on signal detection theory in psychology was written by one of the developers of the theory, who co-authored with D.M. Green the classic work published in this area in 1966 (reprinted in 1974 and 1988). This volume reviews the history of the theory in engineering, statistics, and psychology, leading to the separate measurement of the two independent factors in all discrimination tasks, discrimination acuity and decision criterion. It extends the previous book to show how in several areas of psychology--in vigilance and memory--what had been thought to be discrimination effects were, in reality, effects of a changing criterion.

The book shows that data plotted in terms of the relative operating characteristic have essentially the same form across the wide range of discrimination tasks in psychology. It develops the implications of this ROC form for measures of discrimination acuity, pointing up the valid ones and identifying several common, but invalid, ones. The area under the binormal ROC is seen to be supported by the data; the popular measures d' and percent correct are not. An appendix describes the best, current programs for fitting ROCs and estimating their parameters, indices, and standard errors.

The application of ROC analysis to diagnostic tasks is also described. Diagnostic accuracy in a wide range of tasks can be expressed in terms of the ROC area index. Choosing the appropriate decision criterion for a given diagnostic setting--rather than considering some single criterion to be natural and fixed--has a major impact on the efficacy of a diagnostic process or system. Illustrated here by separate chapters are diagnostic systems in radiology, information retrieval, aptitude testing, survey research, and environments in which imminent dangerous conditions must be detected. Data from weather forecasting, blood testing, and polygraph lie detection are also reported. One of these chapters describes a general approach to enhancing the accuracy of diagnostic systems.

Author: John A. Swets. Subject: Psychology (Cognitive Psychology & Cognition).
PART I. THEORY, DATA, AND MEASURES
The three chapters of Part I describe: (a) the origins of signal detection theory and ROC analysis in statistics and engineering and the relation of these concepts to historical concepts in psychophysics and psychology; (b) experimental data in the form of empirical ROCs that support signal detection theory and ROC analysis in psychology and diagnostics; and (c) the implications of those ROC data both for psychological theory and for the several measures of discrimination performance that have been used in psychology and in diagnostic fields.
Chapter 1 describes the relevant psychophysical theory beginning with Gustav Fechner in 1860. It acknowledges Louis Leon Thurstone’s 1920s conception of the two stimulus categories to be distinguished as leading to two overlapping (bell-shaped) distributions on an observation variable. In Thurstone’s theory, the two stimuli are symmetrical as far as distinguishing between them is concerned, and so a criterion is set on the observation variable where their distributions cross one another. This chapter goes on to show how H. Richard Blackwell in the 1950s extended the conception of the overlapping distributions from Thurstone’s consideration of the “paired-comparison,” or recognition, task (which Blackwell termed “two-alternative forced-choice”) to the “yes-no,” or detection, task. This extension was made in the interests of threshold theory, which detection theory replaces, but it was a step along the way, inasmuch as the yes-no task lies at the heart of signal detection theory and is the basis for the ROC.
As the last piece of relevant history, this chapter shows how statistical theory developed by Jerzy Neyman and Egon Pearson in 1933, and extended by Abraham Wald in 1950, formed the basis for signal detection theory. In statistical theory, the two overlapping distributions are statistical hypotheses: a null hypothesis and an alternative. In classical hypothesis testing, a decision criterion is selected to yield some small probability, usually .05 or .01, of rejecting the null hypothesis when it is true (that is, of making a false-positive decision or Type I error). Similarly, Blackwell assumed a fixed sensory threshold that would lead to a negligible proportion of positive responses when only noise is present. In going from hypothesis testing to a broader class of statistical decisions, Wald made it clear that a decision criterion could be set anywhere along a decision variable. This was the same variable, the likelihood ratio, for any task and for any definition of the optimal criterion. (A detailed treatment of the statistical heritage of detection theory is given by Gigerenzer, G., and Murray, D. J., Cognition as intuitive statistics. Hillsdale, NJ: Lawrence Erlbaum Associates, 1987.)
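The classical fixed-criterion rule described above can be illustrated with a small sketch. It assumes, purely for illustration, that noise observations follow a standard normal distribution and that the signal shifts the mean by `signal_strength` standard deviations; the function names are hypothetical and not taken from the book.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed noise distribution: mean 0, sd 1

def criterion_for_alpha(alpha):
    """Criterion on the observation axis giving false-positive probability alpha."""
    return STD_NORMAL.inv_cdf(1.0 - alpha)

def true_positive_probability(alpha, signal_strength):
    """True-positive probability at the fixed-alpha criterion.

    Assumes signal observations ~ N(signal_strength, 1).
    """
    criterion = criterion_for_alpha(alpha)
    return 1.0 - STD_NORMAL.cdf(criterion - signal_strength)

for alpha in (0.05, 0.01):
    c = criterion_for_alpha(alpha)
    tpp = true_positive_probability(alpha, 1.0)
    print(f"alpha={alpha:.2f}  criterion={c:.3f}  TPP at strength 1.0 = {tpp:.3f}")
```

Under these assumptions the criterion is pinned by the chosen false-positive rate alone, which is the restriction that Wald's more general treatment removes.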
Chapter 1 points up that the signal detection theory of interest here was developed in the early 1950s by Wesley Peterson and Theodore Birdsall, who were then graduate students in electrical engineering at the University of Michigan. Wilson Tanner and I, graduate students there in psychology, joined them in research and made the first application of the theory to human observers in a study of visual discrimination. Though unaware of Wald’s work a few years earlier, Peterson and Birdsall also conceived of a decision criterion that could vary across the range of a decision variable that is the likelihood ratio. To show the consequences in performance of a variable criterion, they devised the ROC. The ROC shows, for a given discrimination acuity (and a given signal strength), how the true-positive proportion (TPP) varies with the false-positive proportion (FPP) as the criterion, or the observer’s willingness to make the positive response, is varied. On ordinary arithmetic scales, the ROC extends from 0 to 1.0 on each scale, concave downward, that is, with decreasing slope. An ROC lying on the positive diagonal, with TPP = FPP, shows zero discrimination; an ROC following the left and upper borders, with TPP = 1 for all FPP, shows perfect discrimination. Statisticians sometimes remark that the ROC is simply the “power function” of statistical theory, but the two functions differ fundamentally. In fact, the power function—which shows how TPP increases with increasing signal strength for some selected, small, fixed FPP—is the century-old “psychometric function” of psychological theory.
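As a minimal sketch of how an ROC is traced out, the code below sweeps a criterion across an observation variable and records the resulting (FPP, TPP) pairs. It assumes an equal-variance normal model (noise ~ N(0, 1), signal ~ N(d', 1)), which is only one possible model; the function names and the criterion values are illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1

def roc_point(d_prime, criterion):
    """Return (FPP, TPP) for one criterion placement.

    Assumed model: noise ~ N(0, 1), signal ~ N(d_prime, 1); the observer
    responds "yes" whenever the observation exceeds `criterion`.
    """
    fpp = 1.0 - STD_NORMAL.cdf(criterion)
    tpp = 1.0 - STD_NORMAL.cdf(criterion - d_prime)
    return fpp, tpp

def roc_curve(d_prime, criteria):
    """ROC points for a fixed acuity as the criterion is varied."""
    return [roc_point(d_prime, c) for c in criteria]

# Strict criteria first (few "yes" responses), lenient criteria last.
criteria = [2.0, 1.5, 1.0, 0.5, 0.0, -0.5, -1.0]
for fpp, tpp in roc_curve(1.0, criteria):
    print(f"FPP={fpp:.3f}  TPP={tpp:.3f}")
```

With `d_prime` set to 0 every point falls on the positive diagonal (TPP = FPP), and larger values push the curve toward the left and upper borders.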
Chapter 1 proceeds to describe computational procedures for the index of discrimination acuity called d′, as popularized in the early applications of signal detection theory in psychology. This chapter anticipates the diminished value of d′ suggested by the accumulating data. Specifically, it shows the theoretical ROC on a bivariate-normal graph, that is, on normal-deviate scales that provide a linear ROC. The measure d′ is appropriate for ROCs of slope = 1, but empirical data show ROCs of other slopes, varying primarily between 0.5 and 1.0. This chapter mentions the area under the ROC as a “non-parametric” discrimination index appropriate to varying ROC slopes; it does not anticipate the later prominence of an area measure based on bivariate-normal distributions.
Chapter 2 shows the robustness of the linear ROC on the binormal graph, by displaying dozens of empirical ROCs that are fitted well by a linear function, with varying slope. The appropriate index is termed Az, the “A” for “area” and the “z” to connote the normal-deviate scales of the ROC plot. This index varies from .50 at chance performance to 1.0 at perfect performance. It is now the index of general choice in diagnostic applications of ROC theory and also should be, I suggest, in psychology.
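The two indices can be sketched in a few lines. The code assumes the usual textbook definitions: d′ as the difference of the normal deviates of TPP and FPP, and Az as the area under a linear binormal ROC z(TPP) = a + b·z(FPP), computed as Φ(a / √(1 + b²)). The function names, and the intercept and slope values used in the example, are illustrative and not taken from the book.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def d_prime(tpp, fpp):
    """d' from a single ROC point; appropriate only if the ROC slope is 1."""
    return STD_NORMAL.inv_cdf(tpp) - STD_NORMAL.inv_cdf(fpp)

def area_az(intercept, slope):
    """Area under a linear binormal ROC with the given intercept and slope
    on normal-deviate axes."""
    return STD_NORMAL.cdf(intercept / sqrt(1.0 + slope ** 2))

print(f"d' at TPP=.80, FPP=.20: {d_prime(0.80, 0.20):.3f}")
# Same intercept, different slopes within the range reported for empirical ROCs.
for slope in (1.0, 0.75, 0.5):
    print(f"intercept=1.0 slope={slope:.2f}  Az={area_az(1.0, slope):.3f}")
```

An intercept of 0 gives Az = .50 regardless of slope, matching the chance value quoted above.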
The index of the decision criterion called β (beta) is also described in Chapter 1. It is defined as the criterion value on the likelihood-ratio decision variable and also as the slope of the tangent to the ROC (on ordinary arithmetic scales) at the point that is generated by the given criterion. In contrast to d′, the index β has held up well in my opinion (but see Macmillan, N. A., and Creelman, C. D. Response bias: Characteristics of detection theory, threshold theory, and “non-parametric” indexes. Psychological Bulletin, 1990, 107(3), 401–413). A strong point is that optimal decision criteria can be specified by β.
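A rough sketch of β under an assumed equal-variance normal model follows. The likelihood ratio at a criterion is computed directly as a ratio of densities, and the optimal β uses the standard expected-value rule (prior odds against signal times the ratio of the stakes for noise trials to the stakes for signal trials). All names and the payoff parameters are hypothetical illustrations.

```python
from math import log
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed noise distribution

def beta_at(criterion, d_prime):
    """Likelihood ratio (signal density / noise density) at the criterion."""
    signal = NormalDist(mu=d_prime, sigma=1.0)
    return signal.pdf(criterion) / STD_NORMAL.pdf(criterion)

def optimal_beta(p_signal, value_hit, cost_miss, value_cr, cost_fa):
    """Beta that maximizes expected value for the given prior and payoffs."""
    prior_odds_noise = (1.0 - p_signal) / p_signal
    return prior_odds_noise * (value_cr + cost_fa) / (value_hit + cost_miss)

def criterion_for_beta(beta, d_prime):
    """Criterion location yielding `beta` in the equal-variance model,
    from solving exp(d'x - d'^2 / 2) = beta for x."""
    return log(beta) / d_prime + d_prime / 2.0

# Rare signal (10% of trials) with symmetric payoffs calls for a strict criterion.
b = optimal_beta(p_signal=0.1, value_hit=1, cost_miss=1, value_cr=1, cost_fa=1)
c = criterion_for_beta(b, d_prime=1.0)
print(f"optimal beta={b:.2f}  criterion={c:.3f}  check beta={beta_at(c, 1.0):.2f}")
```

In this model the same ratio of densities is also the slope of the ROC at the corresponding point, since TPP and FPP change with the criterion at rates given by the signal and noise densities.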
Chapter 1 concludes with a review of conclusions drawn from applications of the ROC in psychology, highlighting areas in which the ability to separate discrimination and decision processes led to revised psychological conceptions. An example is sensory vigilance, in which performance effects long thought to represent declines in discrimination acuity were found in most instances to represent a change in the placement of the decision criterion. Similarly, many established findings thought to represent effects of memory and forgetting in recognition tasks were shown to be effects of differences in the decision criterion.
The empirical ROCs of Chapter 2 sample the psychological topics of human visual detection, recognition memory for odors and for words, conceptual judgment, and animal learning. The chapter’s ROCs from diagnostic applications include some from medical imaging, information retrieval, weather forecasting, aptitude testing, and polygraph lie detection. They demonstrate the use of the “rating” task to calculate ROCs based on the adoption of several decision criteria simultaneously, as opposed to successive adoption of single criteria in successive conditions of a yes-no task. The conclusion to be drawn from the survey of empirical ROCs is not that deviations from the linear binormal form never appear, but that the few deviant ROCs do not show any apparent pattern and hence do not support any other particular form. For practical purposes, the linear binormal ROC is apparently adequate and satisfactory and the discrimination index Az is simple and generally useful. For conceptual calibration, it may help to know that Az is theoretically equal to the percentage of correct responses in a paired-comparison, or two-alternative forced-choice, task. That is, an observer represented in a yes-no or rating task by an Az = .80 will state correctly on 80% of the trials which of a pair of stimuli is signal (vs. noise) or Signal A (vs. Signal B).
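Two points in this paragraph lend themselves to a short sketch: turning rating-task counts into several ROC points at once, and checking the stated equivalence between Az and two-alternative forced-choice accuracy. The rating counts below are made up for illustration, the simulation assumes an equal-variance normal model, and the function names are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist

def rating_roc(signal_counts, noise_counts):
    """ROC points from rating counts ordered from 'surely signal' to
    'surely noise'. Each category boundary acts as one decision criterion."""
    total_signal = sum(signal_counts)
    total_noise = sum(noise_counts)
    points = []
    hits = false_alarms = 0
    # The last category is skipped: including it always gives the point (1, 1).
    for s, n in zip(signal_counts[:-1], noise_counts[:-1]):
        hits += s
        false_alarms += n
        points.append((false_alarms / total_noise, hits / total_signal))
    return points

def simulate_2afc(d_prime, trials=20000, seed=1):
    """Proportion of trials on which the signal observation exceeds the
    noise observation (assumed model: N(d_prime, 1) vs. N(0, 1))."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        if rng.gauss(d_prime, 1.0) > rng.gauss(0.0, 1.0):
            correct += 1
    return correct / trials

# Hypothetical five-category rating data, 100 signal and 100 noise trials.
for fpp, tpp in rating_roc([50, 25, 15, 7, 3], [5, 10, 20, 30, 35]):
    print(f"FPP={fpp:.2f}  TPP={tpp:.2f}")

d = 1.2
az = NormalDist().cdf(d / sqrt(2.0))  # binormal area for slope 1, intercept d
print(f"Az={az:.3f}  simulated 2AFC proportion correct={simulate_2afc(d):.3f}")
```

A five-category scale yields four ROC points from a single experimental condition, and with d′ = 1.2 both the area and the simulated forced-choice accuracy come out near .80.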
Chapter 3 is fundamental to measurement of discrimination acuity. It shows that non-ROC indices of discrimination acuity drawn from a 2-by-2 table of stimulus and response are invalid. Included are the percentage of correct responses (that is, the overall percentage of correct positive and negative responses); the true-positive (or “hit”) probability corrected for chance success (corrected in either of two ways); the measure of association called the Kappa statistic, used also as a measure of observer agreement; the correlation coefficient derived from 2-by-2 tables, called phi; and an index representing those developed in the field of weather forecasting, called the skill test. In addition to invalidity, these indices suffer from the inconvenience of not accounting for a variable decision criterion; their use assumes that the criterion placement on which they are based is fixed.
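The criterion dependence of such 2-by-2 indices can be illustrated with a sketch that holds discrimination acuity fixed and moves only the criterion. It assumes an equal-variance normal model for simplicity (the book reports that empirical slopes often differ from 1), uses the standard formulas for percent correct, kappa, and phi, and omits the chance-corrected hit probability and the skill test. Function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def two_by_two(d_prime, criterion, p_signal):
    """Joint proportions (hit, miss, false alarm, correct rejection)
    under the assumed model noise ~ N(0, 1), signal ~ N(d_prime, 1)."""
    tpp = 1.0 - STD_NORMAL.cdf(criterion - d_prime)
    fpp = 1.0 - STD_NORMAL.cdf(criterion)
    return (p_signal * tpp, p_signal * (1.0 - tpp),
            (1.0 - p_signal) * fpp, (1.0 - p_signal) * (1.0 - fpp))

def table_indices(hit, miss, false_alarm, correct_rejection):
    """Percent correct, kappa, and phi from the four joint proportions."""
    p_correct = hit + correct_rejection
    p_stim_pos = hit + miss
    p_resp_pos = hit + false_alarm
    chance = p_stim_pos * p_resp_pos + (1 - p_stim_pos) * (1 - p_resp_pos)
    kappa = (p_correct - chance) / (1 - chance)
    phi = (hit * correct_rejection - miss * false_alarm) / sqrt(
        p_stim_pos * (1 - p_stim_pos) * p_resp_pos * (1 - p_resp_pos))
    return p_correct, kappa, phi

# Same acuity (d' = 1) and base rate; only the criterion changes.
for criterion in (0.0, 0.5, 1.0, 1.5):
    pc, kappa, phi = table_indices(*two_by_two(1.0, criterion, 0.5))
    print(f"criterion={criterion:.1f}  P(C)={pc:.3f}  kappa={kappa:.3f}  phi={phi:.3f}")
```

All three indices change across rows even though the underlying acuity is identical, which is the sense in which they confound discrimination with criterion placement.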
The percentage of correct responses is probably the index most difficult to give up. It seems close to the data, unencumbered by theoretical considerations. Yet, it is the easiest to dismiss, even on arithmetic grounds. And it can be shown to make strong theoretical assumptions, as strong as those made by d′ and Az. If empirical ROCs for given observers and tasks look anything like the ROCs shown in Chapter 2 to be representative, the percentage of correct responses will be a highly variable and undependable index of discrimination acuity for those observers and tasks.
To illustrate, the percentage of correct responses, P(C), varies substantially with the prior probabilities (or base rates) of the stimuli. Indeed, it may be defined as the prior probability of a positive stimulus, P(S+), times the conditional probability of a positive response given a positive stimulus, P(R+|S+) (hence, P(S+) P(R+|S+)), added to the p...
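The dependence of P(C) on base rates can be shown with simple arithmetic. The sketch below uses the standard definition of overall proportion correct, P(S+) P(R+|S+) + P(S-) P(R-|S-), and holds a single hypothetical ROC point (TPP = .90, FPP = .40) fixed while the base rate varies; the function name and the numbers are illustrative only.

```python
def percent_correct(p_signal, tpp, fpp):
    """Overall proportion correct: correct positives plus correct negatives.

    p_signal is the prior probability of a positive stimulus, P(S+);
    tpp is P(R+|S+); 1 - fpp is P(R-|S-).
    """
    return p_signal * tpp + (1.0 - p_signal) * (1.0 - fpp)

# One fixed operating point, three base rates.
for p_signal in (0.1, 0.5, 0.9):
    pc = percent_correct(p_signal, tpp=0.90, fpp=0.40)
    print(f"P(S+)={p_signal:.1f}  P(C)={pc:.2f}")
```

With the observer's behavior unchanged, P(C) moves from .63 to .75 to .87 as the base rate of positive stimuli rises from .1 to .5 to .9.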

Table of contents

  1. Cover
  2. Half Title
  3. Title Page
  4. Copyright Page
  5. Table of Contents
  6. Preface
  7. Acknowledgments
  8. Introduction
  9. Part I. Theory, Data, and Measures
  10. Part II. Accuracy and Efficacy of Diagnoses
  11. Part III. Applications in Various Diagnostic Fields
  12. Appendix: Computer Programs for Fitting ROCs
  13. Author Index
  14. Subject Index