eBook - ePub

Statistical Methods for Handling Incomplete Data

ISBN: 9781000466348

Jae Kwang Kim, Jun Shao

364 pages
English


About this book

Due to recent theoretical findings and advances in statistical computing, there has been a rapid development of techniques and applications in the area of missing data analysis. Statistical Methods for Handling Incomplete Data covers the most up-to-date statistical theories and computational methods for analyzing incomplete data.

Features

Uses the mean score equation as a building block for developing the theory for missing data analysis

Provides comprehensive coverage of computational techniques for missing data analysis

Presents a rigorous treatment of imputation techniques, including multiple imputation and fractional imputation

Explores the most recent advances of the propensity score method and estimation techniques for nonignorable missing data

Describes a survey sampling application

Updated with a new chapter on Data Integration

Now includes a chapter on Advanced Topics, including kernel ridge regression imputation and neural network model imputation

The book is primarily aimed at researchers and graduate students from statistics, and could be used as a reference by applied researchers with a good quantitative background. It includes many real data examples and simulated examples to help readers understand the methodologies.


Publisher

Chapman and Hall/CRC

Year

2021

Print ISBN

9781032118130

9780367280543

eBook ISBN

9781000466348

Edition

Topic

Mathematics

Subtopic

Probability and Statistics

1 Introduction

DOI: 10.1201/9780429321740-1

1.1 Introduction

Missing data, or incomplete data, is frequently encountered in many disciplines. Statistical analysis with missing data has been an area of considerable interest in the statistical community. Many tools, generic or tailor-made, have already been developed, and many more will be forthcoming to handle missing data problems. Missing data analysis is particularly useful because many statistical issues can be treated as special cases of the missing data problem. For example, data with measurement error can be viewed as a special case of missing data where an imperfect measurement is available instead of the true measurement. Two-phase sampling can also be viewed as a planned missing data problem where the key items are observed only in the second-phase sample by design. Many statistical problems employing a latent variable can also be viewed as missing data problems. Furthermore, advances in statistical computing have made the computational aspects of missing data analysis techniques more feasible. This book aims to cover the most up-to-date statistical theories and computational methods of missing data analysis.

Generally speaking, let z be the study variable with density function f(z; θ). We are interested in estimating the parameter θ. If z were observed throughout the sample, then θ could be estimated by the maximum likelihood method. Instead of observing z, however, we only observe y = T(z, δ) and δ, where y = T(z, δ) is an incomplete version of z satisfying T(z, δ = 1) = z and δ is an indicator that takes the value one or zero, depending on the response status. Estimating θ from the observation of (y, δ) is the core problem in missing data analysis.
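The setup above can be made concrete with a small simulation. This is only a sketch under stated assumptions: the normal model for z, the 0.7 response rate, the independence of δ from z, and the helper name `observe` are illustrative choices, not taken from the book.

```python
import numpy as np

def observe(z, delta):
    """Incomplete version y = T(z, delta): y equals z when delta == 1, NaN otherwise."""
    return np.where(delta == 1, z, np.nan)

rng = np.random.default_rng(0)
n = 1000
theta_true = 2.0                               # hypothetical parameter (mean of z)
z = rng.normal(loc=theta_true, scale=1.0, size=n)  # complete data, not fully observed
delta = rng.binomial(1, 0.7, size=n)           # response indicator, independent of z (assumption)
y = observe(z, delta)

# T(z, delta = 1) = z: respondents' values pass through unchanged.
assert np.array_equal(y[delta == 1], z[delta == 1])

# When response is independent of z, the normal-mean MLE based on (y, delta)
# is simply the mean over respondents.
theta_hat = np.nanmean(y)
print(theta_hat)
```

When the response indicator depends on z itself, the respondent mean is no longer a valid estimator in general, which is why the distribution of (y, δ) has to be modeled explicitly.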

To handle this problem, the marginal density function of (y, δ) needs to be expressed as a function of the original distribution f(z; θ). The maximum likelihood estimator can be obtained under some identifying assumptions, and statistical theory can be developed for the maximum likelihood estimator obtained from the observed sample. Computational tools for producing the maximum likelihood estimator need to be introduced. How to assess the uncertainty of the resulting maximum likelihood estimator is also an important topic.
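As an illustration of how the density of (y, δ) can be written in terms of f(z; θ), consider the simplest case: independent scalar z_i, with y_i = z_i when δ_i = 1 and nothing observed when δ_i = 0. The response model π(z; φ) = P(δ = 1 | z) and its parameter φ are notation introduced here for the sketch; they are not defined in the text above.

```latex
% Contribution of unit i to the observed likelihood
%   delta_i = 1:  f(y_i; \theta)\,\pi(y_i; \phi)
%   delta_i = 0:  \int f(z; \theta)\,\{1 - \pi(z; \phi)\}\,dz
L_{\mathrm{obs}}(\theta, \phi)
  = \prod_{i:\,\delta_i = 1} f(y_i; \theta)\,\pi(y_i; \phi)
    \;\times\;
    \prod_{i:\,\delta_i = 0} \int f(z; \theta)\,\{1 - \pi(z; \phi)\}\,dz .

% If \pi(z; \phi) = \pi(\phi) does not depend on z, the integral equals 1 - \pi(\phi) and
L_{\mathrm{obs}}(\theta, \phi)
  \propto \prod_{i:\,\delta_i = 1} f(y_i; \theta)
  \quad \text{as a function of } \theta .
```

In the second case the response model factors out, so maximizing over θ uses the respondents only. When π depends on z, θ and φ are entangled through the integral, which is where identifying assumptions become necessary.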

When z is a vector, there will be more complications. Because several random variables are subject to missingness, the missing data pattern can be used to simplify modeling and estimation. The monotone missing pattern refers to the situation where the set of respondents for one variable is always a subset of the set of respondents for another variable, which may itself be a subset of the respondents for a further variable. See Table 1.1 for an illustration of the monotone missing pattern.

**TABLE 1.1** Monotone Missing Pattern
Y₁	Y₂	Y₃
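The definition of a monotone missing pattern can be checked mechanically. The following is a minimal sketch, assuming the response status is stored as a 0/1 matrix with one row per unit and one column per variable; the function name `is_monotone` and the two small example matrices are hypothetical and do not reproduce the contents of Table 1.1.

```python
import numpy as np

def is_monotone(delta):
    """Return True if the columns of a 0/1 response matrix can be ordered so that
    each variable's respondent set contains the respondent set of the next one."""
    delta = np.asarray(delta, dtype=bool)
    # Order variables from most observed to least observed.
    order = np.argsort(-delta.sum(axis=0), kind="stable")
    d = delta[:, order]
    # Nested respondent sets: whoever responds to a later variable
    # must also respond to the variable just before it.
    return bool(np.all(d[:, 1:] <= d[:, :-1]))

# Hypothetical patterns (1 = observed, 0 = missing), columns Y1, Y2, Y3.
nested = [[1, 1, 1],
          [1, 1, 0],
          [1, 0, 0]]
not_nested = [[1, 0, 1],
              [0, 1, 1]]

print(is_monotone(nested))      # True
print(is_monotone(not_nested))  # False
```

Sorting by response counts is enough here because nested sets are necessarily ordered by size; if the sorted columns are not nested, no other ordering can be.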

1.2 Outline

Maximum likelihood estimation with missing data serves as the starting point of this book. Chapter 2 is about defining the observed likelihood function from the marginal density of the observed part of the data, finding the maximum of the observed likelihood by solving the mean score equation, and obtaining the observed information matrix from the observed likelihood. Chapter 3 deals with computational tools to arrive at the maximum likelihood estimator, especially the EM algorithm.

Imputation, covered in Chapter 4, is also a popular tool for handling missing data. Imputation can be viewed as a computational technique for the Monte Carlo approximation of the conditional expectation of the original complete-sample estimator given the observed data. For variance estimation of the imputation estimator, an important subject in missing data analysis, the Taylor linearization or replication method can be used. Multiple imputation, introduced in Chapter 5, has been proposed as a general tool for imputation and simplified variance estimation, but it requires some special conditions, called congeniality and self-efficiency. Fractional imputation is an alternative general-purpose estimation tool for imputation and is covered in Chapter 6.
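The view of imputation as a Monte Carlo approximation of a conditional expectation can be sketched in a few lines. Everything specific below is an assumption made for illustration: the linear normal model for y given a fully observed covariate x, the logistic response mechanism depending on x only, and the choice of the sample mean as the complete-sample estimator. Parameter estimates are plugged in as if known, so this is not the book's procedure and it says nothing about variance estimation.

```python
import numpy as np

rng = np.random.default_rng(1)
n, M = 2000, 200                      # sample size, number of imputed data sets

# Simulated data (hypothetical model): x always observed, y subject to missingness.
x = rng.normal(size=n)
y_full = 1.0 + 2.0 * x + rng.normal(size=n)
p = 1.0 / (1.0 + np.exp(-(0.5 + x)))  # response probability depends on x only (assumption)
resp = rng.binomial(1, p) == 1
y_obs = np.where(resp, y_full, np.nan)

# Fit the regression of y on x using respondents.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X[resp], y_obs[resp], rcond=None)
resid = y_obs[resp] - X[resp] @ beta
sigma = np.sqrt(resid @ resid / (resp.sum() - 2))

# Monte Carlo step: draw missing y's from the fitted model, apply the
# complete-sample estimator (the mean) to each completed data set, then average.
n_mis = int((~resp).sum())
estimates = np.empty(M)
for m in range(M):
    y_imp = y_obs.copy()
    y_imp[~resp] = X[~resp] @ beta + sigma * rng.normal(size=n_mis)
    estimates[m] = y_imp.mean()
mc_estimate = estimates.mean()

# For the mean, the conditional expectation is available in closed form:
# replace each missing y by its fitted value.
exact = np.where(resp, y_obs, X @ beta).mean()
print(mc_estimate, exact)
```

For the sample mean the Monte Carlo step is unnecessary, since the closed form exists; the simulation only shows that the two agree. The Monte Carlo route becomes useful when the complete-sample estimator is nonlinear and its conditional expectation has no simple expression.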

Propensity score weighting, covered in Chapter 7, is another tool for handling missing data. The responding units are assigned propensity score weights so that the weighted analysis can lead to valid inference. The propensity score weighting method is often based on an assumption about the response mechanism, and the resulting estimator can be made more efficient by properly taking into account the auxiliary information available ...
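A minimal sketch of the weighting idea follows. It assumes, purely for illustration, that the response probabilities follow a logistic model in a covariate x and are known exactly; in practice they would have to be estimated from an assumed response model. The data-generating model and the ratio-type (normalized) form of the weighted mean are choices made here, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000

# Hypothetical population model: the mean of y is 1.0.
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

# Response probabilities depend on x; treated as known here (assumption).
pi = 1.0 / (1.0 + np.exp(-(0.5 + x)))
resp = rng.binomial(1, pi) == 1

# Unweighted respondent mean: biased, because units with large x
# (and hence large y) respond more often.
naive = y[resp].mean()

# Propensity score weighting: each respondent is weighted by 1 / pi.
w = 1.0 / pi[resp]
psw = np.sum(w * y[resp]) / np.sum(w)

print(naive, psw)
```

The weighted mean should land near the population mean while the unweighted respondent mean overshoots it. Using auxiliary information such as x to calibrate the weights is one route to the efficiency gain mentioned above.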

Cover
Title Page
Half Title
Copyright Page
Dedication
Contents
List of Figures
List of Tables
Preface
1 Introduction
2 Likelihood-Based Approach
3 Computation
4 Imputation
5 Multiple Imputation
6 Fractional Imputation
7 Propensity Scoring Approach
8 Nonignorable Missing Data
9 Longitudinal and Clustered Data
10 Application to Survey Sampling
11 Data Integration
12 Advanced Topics
Bibliography
Index
