This book presents a collection of snapshots from two sides of the Big Data perspective. It assembles an array of tangible tools, methods, and approaches that illustrate how Big Data sources and methods are being used in the survey and social sciences to improve official statistics and estimates for human populations. It also provides examples of how survey data are being used to evaluate and improve the quality of insights derived from Big Data.

Big Data Meets Survey Science: A Collection of Innovative Methods shows how survey data and Big Data are used together for the benefit of one or more sources of data, with numerous chapters providing consistent illustrations and examples of survey data enriching the evaluation of Big Data sources. It also presents examples of how machine learning, data mining, and other data science techniques are applied at virtually every stage of the survey lifecycle. Topics covered include: Total Error Frameworks for Found Data; Performance and Sensitivities of Home Detection on Mobile Phone Data; Assessing Community Wellbeing Using Google Street View and Satellite Imagery; Using Surveys to Build and Assess RBS Religious Flag; and more.

Presents groundbreaking survey methods being utilized today in the field of Big Data
Explores how machine learning methods can be applied to the design, collection, and analysis of social science data
Filled with examples and illustrations that show how survey data benefits Big Data evaluation
Covers methods and applications used in combining Big Data with survey statistics
Examines regulations as well as ethical and privacy issues

Big Data Meets Survey Science: A Collection of Innovative Methods is an excellent book for both the survey and social science communities as they learn to capitalize on this new revolution. It will also appeal to the broader data and computer science communities looking for new areas of application for emerging methods and data sources.

Publisher: Wiley
Year: 2020
Print ISBN: 9781118976326
eBook ISBN: 9781118976340
Topic: Social Sciences
Subtopic: Social Science Research & Methodology
Index: Social Sciences

Section 1
The New Survey Landscape

1
Why Machines Matter for Survey and Social Science Researchers: Exploring Applications of Machine Learning Methods for Design, Data Collection, and Analysis

Trent D. Buskirk¹ and Antje Kirchner²

¹Applied Statistics and Operations Research Department, College of Business, Bowling Green State University, Bowling Green, OH, USA

²RTI International, Research Triangle Park, NC, USA

1.1 Introduction

The earliest hard drives on personal computers had the capacity to store roughly 5 MB – now typical personal computers can store roughly 100,000 times more (around 500 GB). Data and computing are not limited to supercomputing centers, mainframes, or personal computers, but have become more mobile and virtual. In fact, the very definition of “computer” is evolving to encompass more and more aspects of our lives, from transportation (with computers in our cars and cars that drive themselves) to everyday living with televisions, refrigerators, doorbells, thermostats, and many other devices that are Internet‐enabled and “smart.” The rise of the Internet of things, smart devices, and personal computing power relegated to mobile environments and devices, like our smartphones, has certainly created an unprecedented opportunity for survey and social researchers and government officials to track, measure, and better understand public opinion, social phenomena, and the world around us.

A review of the current social science and survey research literature reveals that these fields are indeed at a crossroads, transitioning from methods of active observation and data collection to a new landscape where researchers are exploring, considering, and using some of these new data sources for measuring public opinion and social phenomena. Connelly et al. (2016) comment that “whilst there may be a ‘Big Data revolution’ underway, it is not the size or quantity of these data that is revolutionary. The revolution centers on the increased availability of new types of data which have not previously been available for social research.” In our view, advances in this new landscape are taking place in seven major dimensions:

(1) reimagining traditional survey research by leveraging new machine learning methods (MLMs) that improve efficiencies of traditional survey data collection, processing, and analysis;
(2) augmenting traditional survey data with nonsurvey data (administrative, social media, or other Big Data sources) to improve estimates of public opinion and official statistics;
(3) enhancing official statistics or estimates of public opinion derived from Big Data or other nonsurvey data;
(4) comparing estimates of public opinion and official statistics derived from survey data sources to those generated from Big Data or other nonsurvey data exclusively;
(5) exploring new methods for enhancing survey and nonsurvey data collection and gathering, processing, and analysis;
(6) adapting and modifying current methods for use with new data sources and developing new techniques suitable for design and model‐based inference with these data sources;
(7) contributing survey data, methods, and techniques to the Big Data ecosystem.

Woven into this new landscape is the perspective of assimilating these new data sources into the process as substitutes for, supplements to, or auxiliary sources alongside existing survey data. Computational social science continues to evolve as a social sciences subfield that uses computational methods, models, and advanced information technology to understand social phenomena. This subfield is particularly suited to take advantage of alternative data sources, including social media and other sources of Big Data such as digital footprint data generated as part of social online activities. Survey researchers are exploring the potential of these alternative data sources, especially social media data. Most recently, Burke et al. (2018) explored how social media data create opportunities not only for sampling and recruiting specific populations but also for understanding a growing proportion of the population who are active on social media sites by mining the data on such sites.

No matter the Big Data source, we need data science methods and approaches that are well suited to deal with these types of data. Although some have made the case that administrative data be considered as Big Data (Connelly et al. 2016), the general consensus in both the data science and survey research communities is that they are not. However, with the increased collection of paradata and the increased use of sensors and other similar peripherals in data collection, one could argue that survey data are getting bigger – not in the number of cases, but in the number of variables that are available per case. Put another way, some survey datasets are getting bigger because they are “wider” rather than “longer.” So we could also make the case that surveys themselves are creating bigger data and could benefit from applying these types of methods.
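
The “wider” rather than “longer” distinction can be made concrete with a small sketch. The Python example below is only an illustration under stated assumptions: the variable names (`q1`, `q1_seconds`, `device`), the values, and the helper functions `shape` and `add_paradata` are all hypothetical and do not come from any dataset discussed in this chapter. It shows that attaching paradata to survey responses leaves the number of cases unchanged while increasing the number of variables per case.

```python
# Hypothetical survey responses: one dict per case (respondent).
responses = [
    {"case_id": 1, "q1": 4, "q2": "yes"},
    {"case_id": 2, "q1": 2, "q2": "no"},
    {"case_id": 3, "q1": 5, "q2": "yes"},
]

# Hypothetical paradata (response timings, device type) keyed by case_id.
paradata = {
    1: {"q1_seconds": 3.2, "q2_seconds": 1.1, "device": "phone"},
    2: {"q1_seconds": 7.9, "q2_seconds": 2.4, "device": "desktop"},
    3: {"q1_seconds": 2.5, "q2_seconds": 0.9, "device": "phone"},
}


def shape(rows):
    """Return (number of cases, number of distinct variables)."""
    variables = set()
    for row in rows:
        variables.update(row)  # adds the dict's keys
    return len(rows), len(variables)


def add_paradata(rows, extra):
    """Merge extra variables onto each case without adding any cases."""
    return [{**row, **extra.get(row["case_id"], {})} for row in rows]


before = shape(responses)
widened = add_paradata(responses, paradata)
after = shape(widened)

print("cases, variables before:", before)
print("cases, variables after: ", after)
```

In this toy example the case count stays at three while the variable count doubles, which is the sense in which such a dataset grows “wider” and not “longer.”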

This chapter explores how techniques from data science, known collectively as MLMs, can be leveraged in each phase of the survey research process. Success in the new landscape will require adapting some of the analytic and data collection approaches traditionally used by survey and social scientists to handle this more data‐rich reality. In his recent MIT Press Essential Knowledge book entitled Machine Learning, computer engineering professor Ethem Alpaydin notes, “Machine learning will help us make sense of an increasingly complex world. Already we are exposed to more data than what our sensors can cope with or our brains can process.” Although the use of MLMs seems fairly new in survey research, machines and technology have long been part of the survey and social sciences' DNA. At the 1957 AAPOR conference, speaking in the context of data collection, Frederick Stephan pointed out that “Computers will tax the ingenuity, judgment and skill of technically proficient people to (a) put the job on the machine and (b) put the results in form for comprehension of human beings; and determine the courses of action we might take based on what the machines have told us.” This adage still holds today, but we are moving from machines to MLMs.

This chapter provides a brief overview of MLMs and a deeper exploration of how these data science techniques are applied to the social sciences from the perspective of the survey research process including sample design and constructing sampling frames; questionnaire design and evaluation; survey recruitment and data collection; survey data coding and processing; sample weighting and survey adjustment; and data analysis and es...

Cover
Table of Contents
List of Contributors
Introduction
Section 1: The New Survey Landscape
Section 2: Total Error and Data Quality
Section 3: Big Data in Official Statistics
Section 4: Combining Big Data with Survey Statistics: Methods and Applications
Section 5: Combining Big Data with Survey Statistics: Tools
Section 6: The Fourth Paradigm, Regulations, Ethics, Privacy
Index
End User License Agreement

Big Data Meets Survey Science by Craig A. Hill, Paul P. Biemer, Trent D. Buskirk, Lilli Japec, Antje Kirchner, Stas Kolenikov, and Lars E. Lyberg is available in PDF and/or ePUB format.
