Geospatial Health Data
eBook - ePub

Geospatial Health Data

Modeling and Visualization with R-INLA and Shiny

  1. 274 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android

About this book

Geospatial health data are essential to inform public health and policy. These data can be used to quantify disease burden, understand geographic and temporal patterns, identify risk factors, and measure inequalities. Geospatial Health Data: Modeling and Visualization with R-INLA and Shiny describes spatial and spatio-temporal statistical methods and visualization techniques to analyze georeferenced health data in R. The book covers the following topics:

  • Manipulating and transforming point, areal, and raster data,
  • Bayesian hierarchical models for disease mapping using areal and geostatistical data,
  • Fitting and interpreting spatial and spatio-temporal models with the integrated nested Laplace approximation (INLA) and the stochastic partial differential equation (SPDE) approaches,
  • Creating interactive and static visualizations such as disease maps and time plots,
  • Reproducible R Markdown reports, interactive dashboards, and Shiny web applications that facilitate the communication of insights to collaborators and policymakers.

The book features fully reproducible examples of several disease and environmental applications using real-world data such as malaria in The Gambia, cancer in Scotland and the USA, and air pollution in Spain. Examples in the book focus on health applications, but the approaches covered are also applicable to other fields that use georeferenced data, including epidemiology, ecology, demography, and criminology. The book provides clear descriptions of the R code for data importing, manipulation, modeling, and visualization, as well as the interpretation of the results. This ensures that the contents are fully reproducible and accessible to students, researchers, and practitioners.

Geospatial Health Data by Paula Moraga is available in PDF and/or ePUB format and is listed under Medicine and Probability & Statistics.

Part II

Modeling and visualization

5 Areal data

Areal or lattice data arise when a fixed domain is partitioned into a finite number of subregions at which outcomes are aggregated. Examples of areal data are the number of cancer cases in counties, the number of road accidents in provinces, and the proportion of people living in poverty in census tracts. Often, disease risk models aim to obtain disease risk estimates within the same areas where data are available. A simple measure of disease risk in areas is the standardized incidence ratio (SIR), which is defined as the ratio of the observed to the expected counts. However, in many situations small areas may present extreme SIRs due to low population sizes or small samples. In these situations, SIRs may be misleading and insufficiently reliable for reporting, and it is preferable to estimate disease risk using Bayesian hierarchical models, which make it possible to borrow information from neighboring areas and to incorporate covariate information, resulting in the smoothing or shrinking of extreme values.
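To make the SIR and its instability in small areas concrete, the following is a minimal sketch in base R. The four areas, their populations, and their counts are hypothetical, and the expected counts are obtained by applying a single overall rate to each population, which is a simplifying assumption (no age or sex strata).

```r
# Hypothetical observed counts and populations for four areas (illustrative only).
d <- data.frame(
  area = c("A", "B", "C", "D"),
  population = c(500, 250000, 1200, 90000),
  observed = c(2, 310, 0, 95)
)

# Assumption: one overall rate applied to every area (no stratification).
overall_rate <- sum(d$observed) / sum(d$population)
d$expected <- d$population * overall_rate

# SIR = observed / expected; values above 1 indicate more cases than expected.
d$SIR <- d$observed / d$expected

print(d)
```

In this toy data set the two small areas give the extreme values: area A has an SIR above 3 on the basis of only two cases, and area C has an SIR of exactly 0, while the two large areas stay close to 1.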
A popular spatial model is the Besag-York-Mollié (BYM) model (Besag et al., 1991), which takes into account that data may be spatially correlated and that observations in neighboring areas may be more similar than observations in areas that are farther apart. This model includes a spatial random effect that smooths the data according to a neighborhood structure, and an unstructured exchangeable component that models uncorrelated noise. In spatio-temporal settings where disease counts are observed over time, the models used account not only for spatial structure but also for temporal correlations and spatio-temporal interactions.
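One common way to write the BYM model for disease counts is sketched below. The notation is an assumption made for illustration: $Y_i$ and $E_i$ are the observed and expected counts in area $i$, $\theta_i$ is the relative risk, $u_i$ is the spatially structured effect with an intrinsic conditional autoregressive prior, $v_i$ is the unstructured effect, and $\delta_i$ and $n_{\delta_i}$ are the set and number of neighbors of area $i$.

```latex
\begin{align*}
Y_i \mid \theta_i &\sim \text{Poisson}(E_i \theta_i), \quad i = 1, \ldots, n, \\
\log(\theta_i) &= \beta_0 + u_i + v_i, \\
u_i \mid \mathbf{u}_{-i} &\sim N\!\left(\bar{u}_{\delta_i}, \frac{\sigma_u^2}{n_{\delta_i}}\right),
  \qquad \bar{u}_{\delta_i} = \frac{1}{n_{\delta_i}} \sum_{j \in \delta_i} u_j, \\
v_i &\sim N(0, \sigma_v^2).
\end{align*}
```

Under this formulation, the conditional mean of $u_i$ is the average of the effects in the neighboring areas, which is what pulls extreme estimates toward their neighbors.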
This chapter shows how to compute neighborhood matrices, expected counts, and SIRs. Then it shows how to fit spatial and spatio-temporal disease risk models using the R-INLA package (Rue et al., 2018). The examples in this chapter use data on lung cancer in Pennsylvania counties, USA, obtained from the SpatialEpi package (Kim and Wakefield, 2018), and show results with maps created with the ggplot2 package (Wickham et al., 2019a). At the end of the chapter, areal data issues are discussed, including the Misaligned Data Problem (MIDP), which occurs when spatial data are analyzed at a scale different from that at which they were originally collected, and the Modifiable Areal Unit Problem (MAUP) and the ecological fallacy, whereby conclusions may change if one aggregates the same underlying data to a new level of spatial aggregation.
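The effect of the aggregation level can be illustrated with a toy example in base R. The sketch below is not taken from the chapter: the four units, their counts, and the two zonings are hypothetical, and it only shows that the same underlying data can yield different zone-level rates under different groupings.

```r
# Hypothetical cases and populations in four small units (illustrative only).
cases <- c(1, 9, 5, 5)
pop <- c(100, 100, 100, 100)

# Two alternative ways of grouping the same four units into two zones.
zoning_a <- c("north", "north", "south", "south")
zoning_b <- c("west", "east", "west", "east")

# Rate per zone = total cases / total population within the zone.
zone_rates <- function(zone) {
  tapply(cases, zone, sum) / tapply(pop, zone, sum)
}

rates_a <- zone_rates(zoning_a)
rates_b <- zone_rates(zoning_b)
print(rates_a)
print(rates_b)
```

Under the first zoning both zones have the same rate (0.05), whereas under the second zoning the rates are 0.07 in the east and 0.03 in the west, although the unit-level data are identical.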

5.1 Spatial neighborhood matrices

The concept of a spatial neighborhood or proximity matrix is useful in the exploration of areal data. The (i,j)th element of a spatial neighborhood matrix W, denoted by wij, spatially connects areas i and j in some fashion, i, j ∈ {1, …, n}. W defines a neighborhood structure over the entire study region, and its elements can be viewed as weights. More weight is associated with j’s closer to i than those farther away from i. The simplest neighborhood definition is provided by the binary matrix where wij = 1 if regions i and j share some common boundary, perhaps a vertex, and wij = 0 otherwise. Customarily, wii is set to 0 for i = 1, …, n. Note that this choice of neighborhood definition results in a symmetric spatial neighborhood matrix.
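As an illustration of the binary definition, here is a minimal base R sketch that builds W for a hypothetical set of six regions laid out on a 2 x 3 grid. The grid layout and variable names are assumptions made for this example; with real polygon data one would typically derive neighbors from the map itself, for example with the poly2nb() and nb2mat() functions of the spdep package.

```r
# Hypothetical layout: 6 regions on a 2 x 3 grid, numbered row by row.
#   1 2 3
#   4 5 6
nrows <- 2
ncols <- 3
n <- nrows * ncols
row_id <- rep(seq_len(nrows), each = ncols)
col_id <- rep(seq_len(ncols), times = nrows)

# Regions i and j are neighbors if their cells touch, including at a corner
# (the "perhaps a vertex" case): row and column indices differ by at most 1.
W <- matrix(0, n, n)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    touching <- abs(row_id[i] - row_id[j]) <= 1 &&
      abs(col_id[i] - col_id[j]) <= 1
    if (i != j && touching) W[i, j] <- 1
  }
}

print(W)
print(isSymmetric(W))  # binary contiguity gives a symmetric matrix
print(rowSums(W))      # number of neighbors of each region
```

The diagonal is left at 0, matching the convention wii = 0, and the row sums give the number of neighbors of each region (3 for the corner regions and 5 for regions 2 and 5 in this layout).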
The code below sh...

Table of contents

  1. Cover
  2. Half Title
  3. Series Page
  4. Title Page
  5. Copyright Page
  6. Dedication
  7. Table of Contents
  8. Preface
  9. About the author
  10. I Geospatial health data and INLA
  11. II Modeling and visualization
  12. III Communication of results
  13. Appendix
  14. Bibliography
  15. Index