About this book

Featuring a timely presentation of total survey error (TSE), this edited volume introduces valuable tools for understanding and improving survey data quality in the context of evolving large-scale data sets.

This book provides an overview of the TSE framework and current TSE research as related to survey design, data collection, estimation, and analysis. It recognizes that survey data affect many public policy and business decisions and thus focuses on the framework for understanding and improving survey data quality. The book also addresses issues with data quality in official statistics and in social, opinion, and market research as these fields continue to evolve, leading to larger and messier data sets. This perspective challenges survey organizations to find ways to collect and process data more efficiently without sacrificing quality. The volume consists of the most up-to-date research and reporting from over 70 contributors, representing leading academics and researchers from a range of fields. The chapters are organized into five main sections: The Concept of TSE and the TSE Paradigm, Implications for Survey Design, Data Collection and Data Processing Applications, Evaluation and Improvement, and Estimation and Analysis. Each chapter introduces and examines multiple error sources, such as sampling error, measurement error, and nonresponse error, which often pose the greatest risks to data quality, while also encouraging readers not to lose sight of the less commonly studied error sources, such as coverage error, processing error, and specification error. The book also notes the relationships between errors and the ways in which efforts to reduce one type can increase another, resulting in an estimate with larger total error.

This book:

• Features various error sources, and the complex relationships between them, in 25 high-quality chapters on the most up-to-date research in the field of TSE

• Provides comprehensive reviews of the literature on error sources as well as data collection approaches and estimation methods to reduce their effects

• Presents examples of recent international events that demonstrate the effects of data error, the importance of survey data quality, and the real-world issues that arise from these errors

• Spans the four pillars of the total survey error paradigm (design, data collection, evaluation, and analysis) to address key data quality issues in official statistics and survey research

Total Survey Error in Practice is a reference for survey researchers and data scientists in research areas that include social science, public opinion, public policy, and business. It can also be used as a textbook or supplementary material for a graduate-level course in survey research methods.

Yes, you can access Total Survey Error in Practice, edited by Paul P. Biemer, Edith D. de Leeuw, Stephanie Eckman, Brad Edwards, Frauke Kreuter, Lars E. Lyberg, N. Clyde Tucker, and Brady T. West, in PDF and/or ePUB format, as well as other popular books in Social Sciences & Research & Methodology in Psychology.


Section 1
The Concept of TSE and the TSE Paradigm

1
The Roots and Evolution of the Total Survey Error Concept

Lars E. Lyberg1 and Diana Maria Stukel2
1 Inizio, Stockholm, Sweden
2 FHI 360, Washington, DC, USA

1.1 Introduction and Historical Backdrop

Photo: Sir Ronald Fisher
Photo: Jerzy Neyman
In this chapter, we discuss the concept of total survey error (TSE), how it originated, and how it developed both as a mindset for survey researchers and as a criterion for designing surveys. The interest in TSE has fluctuated over the years. When Jerzy Neyman published the basic sampling theory and some of its associated sampling schemes from 1934 onward, it constituted the first building block of a theory and methodology for surveys. However, the idea that a sample could be used to represent an entire population was not new. The oldest known reference to estimating a finite population total on the basis of a sample dates back to 1000 BC and is found in the Indian epic Mahabharata (Hacking, 1975; Rao, 2005). Crude attempts at measuring parts of a population rather than the whole had been used quite extensively in England and some other European countries between 1650 and 1800. The methods on which these attempts were based were referred to as political arithmetic (Fienberg and Tanur, 2001), and they resembled ratio estimation using information on birth rates, family size, the average number of persons living in selected buildings, and other observations. In 1895, at an International Statistical Institute meeting, Kiaer argued for developing a representative or partial investigation method (Kiaer, 1897). The representative method aimed at creating a sample that would reflect the composition of the population of interest. This could be achieved by using balanced sampling through purposive selection or various forms of random sampling. During the period 1900–1920, the representative method was used extensively, at least in Russia and the U.S.A. In 1925, the International Statistical Institute released a report on various aspects of random sampling (Rao, 2005, 2013; Rao and Fuller, 2015). The main consideration regarding sampling was likely monetary, given that it was resource‐intensive and time‐consuming to collect data from an entire population.
Statistical information compiled using a representative sample was an enormous breakthrough. But it would be almost 40 years after Kiaer’s proposal before Neyman published his landmark 1934 paper, “On the Two Different Aspects of the Representative Method: The Method of Stratified Sampling and the Method of Purposive Selection.” At this time, there existed some earlier work by the Russian statistician Tschuprow (1923a, b) on stratified sampling and optimal allocation. It is not clear whether Neyman was aware of this work when he started to develop the sampling theory in the 1920s (Fienberg and Tanur, 1996), since he did not mention Tschuprow’s work when discussing optimal allocation. Neyman definitely had access to Ronald Fisher’s (1925) ideas on randomization (as opposed to various kinds of purposive selection) and their importance for the design and analysis of experiments, and also to Bowley’s (1926) work on stratified random sampling.
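Optimal (Neyman) allocation, mentioned above, assigns the total sample size n across strata in proportion to N_h·S_h, the stratum population size times the stratum standard deviation, which minimizes the variance of the stratified estimator for a fixed n. A minimal Python sketch, with hypothetical stratum figures supplied only for illustration:

```python
def neyman_allocation(n, strata):
    """Allocate total sample size n across strata in proportion to
    N_h * S_h (stratum size times stratum standard deviation), which
    minimizes the variance of the stratified mean for a fixed n."""
    weights = [N * S for N, S in strata]
    total = sum(weights)
    return [round(n * w / total) for w in weights]

# Hypothetical strata: (population size N_h, standard deviation S_h).
# The most variable stratum receives a disproportionately large share.
strata = [(5000, 10.0), (3000, 20.0), (2000, 40.0)]
print(neyman_allocation(300, strata))  # [79, 95, 126]
```

Note that the smallest stratum here gets the largest allocation because its standard deviation dominates the product N_h·S_h; rounding can leave the allocations summing to slightly more or less than n in general.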
Photo: Prasanta Mahalanobis
Photo: Morris Hansen
Photo: Edwards Deming
The sampling methods proposed by Neyman were soon implemented in agencies such as the Indian Statistical Institute and the U.S. Bureau of the Census (currently named the U.S. Census Bureau). Prasanta Mahalanobis, the founder of the Indian Statistical Institute, and Morris Hansen and colleagues at the U.S. Census Bureau became the main proponents of scientific sampling in a number of surveys in the 1940s. The development was spurred on by Literary Digest’s disastrously inaccurate prediction in the 1936 U.S. presidential election poll, which was based on a seriously deficient sampling frame. However, Neyman’s sampling theory did not take into account nonsampling errors; it relied on the assumption that sampling was the only major error source affecting estimates of population parameters and associated calculations of confidence intervals or margins of error. Neyman and his peers understood that this was indeed an unrealistic assumption that might lead to understated margins of error. The effect of nonsampling errors on censuses was acknowledged and discussed relatively early on in a German textbook on census methodology (Zizek, 1921), whose author discussed what he called control of contents and coverage. Even earlier, Karl Pearson (1902) had discussed observer errors. An early example of interviewer influence on survey response was a study on the consumption of hard liquor during Prohibition in the U.S.A., in which Rice (1929) showed that interviewers who were prohibitionists tended to obtain responses that mirrored their own views and that differed from those of respondents who were interviewed by other interviewers.
In 1944, Edwards Deming published the first typology of sources of error beyond sampling. He listed 13 factors that he believed might affect the utility of a survey. The main purpose of the typology was to demonstrate the need to direct efforts toward all potential error sources in the survey planning process while considering the resources available. This first typology included some error sources that are not frequently referenced today, such as bias of the auspices (i.e., the tendency to give a particular response because of the organization sponsoring the study). However, others that currently receive more attention, such as coverage error, were not included. Even though Deming did not explicitly reference TSE, he emphasized the limitations of concentrating on only a few error sources and highlighted the need for theories of bias and variability based on accumulated experience.
Rapid development of the area followed shortly thereafter. Mahalanobis (1946) developed the method of interpenetration, which could be used to estimate the variability generated by interviewers and other data collectors. Another error source recognized early on was nonresponse. Hansen and Hurwitz (1946) published an article in the Journal of the American Statistical Association on follow‐up sampling from the stratum of initial nonrespondents. While the basic assumption of 100% participation in the follow‐up sample was understood not to be realistic, nonresponse rates at the time were relatively small, and it was possible to estimate, at least approximately, the characteristics of those in the nonresponse stratum.
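The Hansen–Hurwitz two-phase idea can be sketched as follows: the initial sample splits into respondents and nonrespondents, a follow-up subsample is drawn from the nonrespondents, and the two stratum means are combined with weights proportional to the stratum sample sizes. A minimal Python illustration with hypothetical data, assuming (as the original estimator does) complete response in the follow-up subsample:

```python
def hansen_hurwitz_mean(respondent_values, n_nonrespondents, followup_values):
    """Two-phase estimator of the population mean in the spirit of
    Hansen and Hurwitz (1946): the initial sample splits into n1
    respondents and n2 nonrespondents; a follow-up subsample of the
    nonrespondents estimates the nonresponse-stratum mean, and the two
    stratum means are combined with weights proportional to n1 and n2."""
    n1 = len(respondent_values)
    n2 = n_nonrespondents
    n = n1 + n2
    ybar1 = sum(respondent_values) / n1          # respondent-stratum mean
    ybar2 = sum(followup_values) / len(followup_values)  # follow-up mean
    return (n1 * ybar1 + n2 * ybar2) / n

# Hypothetical data: 6 initial respondents, 4 initial nonrespondents,
# 2 of whom were measured in the follow-up.
est = hansen_hurwitz_mean([10, 12, 11, 13, 9, 11], 4, [20, 22])
print(est)  # 15.0
```

The example makes the bias risk of ignoring nonresponse visible: the respondent mean alone is 11.0, while the combined estimate is 15.0 because the nonresponse stratum differs markedly from the respondents.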
Even though it is not explicitly stated in the early literature, TSE has its roots in cautioning against focusing solely on sampling error, along with possibly one or two other error sources, rather than on the entire scope of potential errors. In response, two lines of strategic development occurred. One strategy entailed identifying specific error sources and attempting to control, or at least minimize, them. The other entailed developing the so‐called survey error models, in which the TSE is decomposed so that the magnitude of different error components, and ultimately their combination (i.e., the TSE), can be estimated. The two strategies were intertwined in the sense that a survey error model could be applied not only to the entire set of survey operations but also to a subset of specific survey operations.
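As a simple illustration of such a decomposition (an assumed additive model for exposition, not one taken from this chapter), component biases can be summed into a total bias and independent component variances into a total variance, giving mean squared error MSE = bias² + variance:

```python
def total_mse(bias_components, variance_components):
    """Illustrative additive TSE model: component biases add to a total
    bias, independent component variances add to a total variance, and
    MSE = (total bias)^2 + total variance."""
    total_bias = sum(bias_components)
    total_variance = sum(variance_components)
    return total_bias ** 2 + total_variance

# Hypothetical components: nonresponse and measurement bias (which may
# partially offset each other); sampling, interviewer, and processing
# variance.
mse = total_mse(bias_components=[0.5, -0.2],
                variance_components=[1.2, 0.3, 0.1])
print(mse)  # 1.69
```

Such a model also makes the trade-offs noted earlier concrete: an intervention that shrinks one variance component but enlarges a bias component can raise the total MSE even though the targeted component improved.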

1.2 Specific Error Sources and Their Control or Evaluation

Photo: Leslie Kish
Apart from that of Deming (1944), there are a number of typologies described in the survey literature. Examples include Kish (1965), Groves (1989), Biemer and Lyberg (2003), Groves et al. (2009), Smith (2011), and Pennell et al. (Chapter 9 in this volume). Some of them are explicitly labeled TSE, while others consist of listings of different types of errors; however, all are incomplete. In some cases, known error sources (as well as their interactions with other error sources) are simply omitted, and in other cases, not all possible error sources are known or the sources defy expression. For instance, new error structures have emerged when new data collection modes or new data sources, such as Big Data (...

Table of contents

  1. Cover
  2. Title Page
  3. Table of Contents
  4. Notes on Contributors
  5. Preface
  6. Section 1: The Concept of TSE and the TSE Paradigm
  7. Section 2: Implications for Survey Design
  8. Section 3: Data Collection and Data Processing Applications
  9. Section 4: Evaluation and Improvement
  10. Section 5: Estimation and Analysis
  11. Wiley Series in Survey Methodology
  12. Index
  13. End User License Agreement