Spatial Statistics and Geostatistics
eBook - ePub

Spatial Statistics and Geostatistics

Theory and Applications for Geographic Information Science and Technology

  1. 200 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android

About this book

"Ideal for anyone who wishes to gain a practical understanding of spatial statistics and geostatistics. Difficult concepts are well explained and supported by excellent examples in R code, allowing readers to see how each of the methods is implemented in practice"
- Professor Tao Cheng, University College London

Focusing specifically on spatial statistics and including components for ArcGIS, R, SAS and WinBUGS, this book illustrates the use of basic spatial statistics and geostatistics, as well as the spatial filtering techniques used in all relevant programs and software. It explains and demonstrates techniques in:

  • spatial sampling
  • spatial autocorrelation
  • local statistics
  • spatial interpolation in two dimensions
  • advanced topics including Bayesian methods, Monte Carlo simulation, error and uncertainty.

It is a systematic overview of the fundamental spatial statistical methods used by applied researchers in geography, environmental science, health and epidemiology, population and demography, and planning.

A companion website provides R code for implementing the analyses in specific chapters, along with the data sets needed to run that code.

Spatial Statistics and Geostatistics by Yongwan Chun and Daniel A. Griffith is available in PDF and/or ePUB format.

1

Introduction

1.1. Spatial statistics and geostatistics

As its title indicates, the theme of this book is spatial statistics and geostatistics, with emphasis on selected classical topics from these two subdisciplines in order to highlight theory and applications for geographic information science and technology. Computer code windows in almost all chapters report R code for implementing many of the techniques discussed – the criteria employed to select this software are as follows: it is free; and it contains modules for most techniques as well as geographic information system (GIS) capabilities. Computer code windows in Section 9.1 are the exception; these windows report WinBUGS code because this software package is free and is widely used for Bayesian analysis. When possible, analyses were verified with SAS and ArcGIS implementations. A sizable amount of human and physical geography data was assembled for the main island of Puerto Rico and used to formulate the empirical examples. These include remotely sensed (Landsat 7 ETM+) data, digital elevation model (DEM) data, climatological station precipitation and temperature data, United States socio-economic/demographic and agricultural census data, and computer-generated pseudo-random number data. Each chapter ends with a set of relevant references.
The set of eight chapters surveys a wide range of techniques, all of which have spatial autocorrelation as their common factor. Probability models range from conventional (bell-shaped) normal curve theory to generalized linear models (e.g., Poisson and binomial probability models). Techniques range from graphically portraying spatial autocorrelation, through spatial autoregression and semi-variogram analysis, to eigenvector spatial filtering. Selected special topics, such as determining effective geographic sample size and multiple testing for local spatial autocorrelation statistics, are interspersed with these standard topics. One goal in many chapters is to uncover, illustrate, and exploit impacts of spatial autocorrelation in georeferenced data analysis. Perhaps missing-value imputation, whether via kriging or via regression-based predicted values, best underscores the importance and utility of spatial autocorrelation.
Chapter 2 reviews the foundation indices quantifying the nature and degree of spatial autocorrelation. One unique feature of this chapter is its summary of simulation experiment results exemplifying the relationship between the join count statistics and the Moran coefficient. This chapter also presents the basic graphical portrayals and illustrates salient impacts of spatial autocorrelation. Foremost is variance inflation, a long-recognized property for the bell-shaped curve, which is extended here to Poisson and binomial random variables. Inclusion of an empirical comparison between a semi-variogram and a Geary ratio-based correlogram is another distinctive feature of this chapter. It concludes with an overview of the well-known statistical distribution theory for linear regression residual spatial autocorrelation.
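To make the two foundation indices concrete, the following minimal sketch computes the Moran coefficient and the Geary ratio for a toy example. It is not taken from the book, whose code windows use R; plain Python is used here for illustration only. The six-unit chain, the binary contiguity weights, and the function names `moran_geary` and `chain_weights` are all assumptions made for this example.

```python
def moran_geary(values, weights):
    """Return (Moran coefficient, Geary ratio) for values and a weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    ss = sum(d * d for d in dev)              # sum of squared deviations
    w_sum = sum(sum(row) for row in weights)  # total of all weights
    # Moran: weighted cross-products of deviations from the mean
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    # Geary: weighted squared differences between paired values
    sqdiff = sum(weights[i][j] * (values[i] - values[j]) ** 2
                 for i in range(n) for j in range(n))
    mc = (n / w_sum) * cross / ss
    gr = ((n - 1) / (2 * w_sum)) * sqdiff / ss
    return mc, gr


def chain_weights(n):
    """Binary contiguity for n units on a line (hypothetical toy geography)."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


w = chain_weights(6)
smooth = [1, 2, 3, 4, 5, 6]        # similar values adjacent to each other
alternating = [1, 6, 1, 6, 1, 6]   # dissimilar values adjacent to each other

mc_s, gr_s = moran_geary(smooth, w)
mc_a, gr_a = moran_geary(alternating, w)
print(mc_s, gr_s)
print(mc_a, gr_a)
```

In this toy case the smooth gradient yields a positive Moran coefficient (0.6) with a Geary ratio below 1, whereas the alternating pattern yields a Moran coefficient of -1 with a Geary ratio above 1, which illustrates the inverse relationship between the two indices.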
Chapter 3 reviews the basic random sampling designs for collecting spatial data: unconstrained, systematic, geographic stratified, and cluster. Each of these is applied to the Puerto Rico DEM, which has a population comprising nearly 10,000 elevation locations. Sample results are compared with those for the non-probability sample of existing climatological weather station locations. A summary of simulation experiment results calls attention to selected properties of these sampling designs. One original feature of this chapter is its implementation of the hexagon tessellation stratified random sampling design: partial hexagons materializing along the boundary of the island are treated in a way that merges subsets of them for sampling purposes into comparable single hexagons. This chapter also describes how to implement the bootstrap and jackknife resampling techniques. It concludes by outlining the concept of effective geographic sample size, with special reference to its Puerto Rico DEM elevation sampling exercise.
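The two resampling techniques named above can be sketched in a few lines. This is a generic illustration in plain Python rather than the book's R implementation, and the ten elevation values are made up for the example, not Puerto Rico DEM data. The function names `jackknife_se` and `bootstrap_se` are hypothetical. Both functions treat observations as independent, which is exactly the assumption that spatial autocorrelation undermines and that the notion of effective geographic sample size is meant to address.

```python
import random
import statistics


def jackknife_se(sample, stat=statistics.mean):
    """Jackknife standard error: recompute stat with each observation left out."""
    n = len(sample)
    leave_one_out = [stat(sample[:i] + sample[i + 1:]) for i in range(n)]
    centre = sum(leave_one_out) / n
    return ((n - 1) / n * sum((t - centre) ** 2 for t in leave_one_out)) ** 0.5


def bootstrap_se(sample, stat=statistics.mean, replicates=2000, seed=0):
    """Bootstrap standard error: recompute stat on resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    stats = [stat(rng.choices(sample, k=n)) for _ in range(replicates)]
    return statistics.stdev(stats)


# Made-up elevation values (metres), for illustration only
elevations = [12.0, 85.0, 40.0, 230.0, 310.0, 95.0, 150.0, 20.0, 410.0, 60.0]

jk = jackknife_se(elevations)
bs = bootstrap_se(elevations)
print(jk, bs)
```

For the sample mean, the jackknife standard error reproduces the textbook value s/√n exactly, and the bootstrap value should land close to it; the resampling approach becomes more useful for statistics that lack a simple standard error formula.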
Chapter 4 presents differentiations between homogeneity and heterogeneity in spatial data, in terms of both a mean response and its variance. This chapter’s discussion is couched in terms of Box–Cox power transformations to normality, as well as their accompanying back-transformations, analysis of variance (ANOVA), and model-based inference. One innovative feature of this chapter is its inclusion of the extremely accurate approximate back-transformation equation. This chapter also describes common ways to quantify geographic contiguity, and then introduces the eigenvector spatial filtering methodology, differentiating between static geographic distributions and spatial interaction cases. It concludes with a discussion of anisotropy.
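As a brief illustration of why back-transformation needs care, the sketch below applies the standard Box–Cox power transformation and its naive inverse. It is written in plain Python for illustration (the book uses R), the three data values are made up, and the function names are hypothetical. It does not reproduce the book's approximate back-transformation equation, which is not given in this excerpt.

```python
import math


def box_cox(y, lam):
    """Box-Cox power transformation of a positive value y."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive values")
    return math.log(y) if lam == 0 else (y ** lam - 1) / lam


def inverse_box_cox(z, lam):
    """Naive back-transformation: the algebraic inverse of box_cox."""
    return math.exp(z) if lam == 0 else (lam * z + 1) ** (1 / lam)


values = [1.0, 10.0, 100.0]   # made-up, strongly right-skewed data
lam = 0                       # the logarithmic special case

transformed = [box_cox(v, lam) for v in values]
mean_transformed = sum(transformed) / len(transformed)

naive_back = inverse_box_cox(mean_transformed, lam)
arithmetic_mean = sum(values) / len(values)
print(naive_back, arithmetic_mean)
```

Here the naive back-transformed mean is 10 (the geometric mean), well below the arithmetic mean of 37. This gap is the reason a mean estimated on the transformed scale cannot simply be inverted to obtain a mean on the original scale.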
The focus of Chapter 5 is on converting the specifications of conventional statistical models into ones that take spatial autocorrelation into account. It begins by presenting a reformulation of linear regression models to spatial autoregressive and eigenvector spatial filter versions; in doing so, this chapter extends the discussion of eigenvector spatial filtering methodology initiated in Chapter 4. Eigenvector spatial filtering conceptualizations enable respecifications of binomial/logistic and Poisson regression models to account for spatial autocorrelation. One original feature of this chapter is a detailed presentation of how to implement eigenvector spatial filtering with the R package. Comparisons between results obtained with conventional and spatial model specifications corroborate the importance of accounting for spatial autocorrelation in georeferenced data analyses. This chapter concludes with a further differentiation between static geographic distributions and spatial interaction cases, with specific reference to a journey-to-work example.
Global statistics and global analyses establish the foundation for Chapters 2–5. Chapter 6 shifts attention to local geographic statistics. Its unique feature is an innovative treatment of multiple testing based upon effective sample size. Quantification of small-scale spatial clustering is achieved with both local indices of spatial association and the Getis–Ord statistics. A presentation of spatially varying coefficients extends this perspective, helping to relate local to global measures of spatial autocorrelation, and illustrating how bivariate relationships can vary across geographic space.
Geostatistics constitutes the theme of Chapter 7. The presentation casts semi-variogram models as a tool to quantify spatial variance, and co-kriging as a tool to quantify spatial covariance. One special feature of this chapter is a comparison of co-kriging results with increasing resolution of a covariate: first 112 weather stations; then 9,181 DEM rasters; and, finally, 8,987,017 satellite pixels. The next presentation is of techniques (e.g., Cochrane–Orcutt type spatial filtering) that differentiate between spatial and aspatial variance, followed by techniques that differentiate between spatial and aspatial covariance. Treatment of these latter topics includes comparisons between data analyses ignoring and accounting for spatial autocorrelation. The product moment correlation coefficient decomposition is a novel feature of this chapter. Overall, this chapter allows spatial scientists to answer the following two questions: What is special about spatial data? Do spatial autocorrelation/dependency effects matter? Model diagnostic statistics can change, statistical decisions can change, correlation coefficients (including their signs) can change, and factor structure can change.
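The empirical semi-variogram that underlies these models can be sketched for the simplest possible setting. The example below assumes a one-dimensional transect with equally spaced observations, so that lags are whole numbers of steps; real applications work with two-dimensional coordinates and distance bins. It is plain Python for illustration (the book uses R), and the function name and data are hypothetical.

```python
def empirical_semivariogram(z, max_lag):
    """Semi-variance by lag for equally spaced observations on a transect.

    gamma(h) = sum over pairs h steps apart of (z_i - z_{i+h})^2, divided by 2 N(h)
    """
    gamma = {}
    for h in range(1, max_lag + 1):
        pairs = [(z[i], z[i + h]) for i in range(len(z) - h)]
        gamma[h] = sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))
    return gamma


transect = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # made-up values with a linear trend
gamma = empirical_semivariogram(transect, max_lag=3)
print(gamma)
```

For this linear trend the semi-variance grows as h²/2 (0.5, 2.0, and 4.5 for lags 1 to 3) and never levels off at a sill, a pattern usually read as a sign that a trend should be removed before a semi-variogram model is fitted.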
Chapter 8 reviews the principal use of geostatistics, which is to predict unknown values of an attribute at some locations from known values of the same attribute at other locations; in other words, interpolation. Kriging is the best linear unbiased predictor (BLUP), exploiting the sufficient statistics of a sample, and hence is equivalent to an expectation–maximization (EM) solution. The importance of this chapter lies in many spatial analysts being interested in filling holes/gaps in their maps created by incomplete data (i.e., small-area estimation). This chapter reviews the equivalence between the kriging and the spatial autoregressive missing-value imputation solutions. It also presents connections between imputation and regression prediction of new observation values. In either case, the techniques discussed support map generalization, especially from a small sample of locationally tagged values (i.e., a massive number of imputations). An innovative feature of this chapter is the extension of eigenvector spatial filtering methodology to calculate imputations for missing georeferenced values. In all cases, techniques are presented for calculating the accompanying uncertainty for mean response imputation. Although this chapter focuses on imputations, mathematical spatial statistics often dismisses this problem. Rather, the theoretical problem of interest is to estimate parameters in the presence of missing values. Chapter 8 addresses this topic, too.
Chapter 9, the final chapter, introduces three additional, more advanced topics in spatial statistics. Although many additional topics could be selected for treatment in a final chapter, this chapter spotlights three that currently possess very high profiles. Foremost is Bayesian map analysis. WinBUGS, and more specifically its GeoBUGS module, are used to present this methodology. Of note is that SAS increasingly supports this type of analysis. The R package contains modules, such as MCMC, necessary for this type of analysis, and can be used to interface with WinBUGS. Another prominent topic outlined and demonstrated in this chapter is the designing of Monte Carlo spatial simulation experiments. This tool is particularly useful in spatial statistics, where many problems defy analytical solution. This discussion complements the bootstrap and jackknife discussions appearing in Chapter 3. The Monte Carlo experiment investigating eigenvector selection from a restricted candidate set of vectors to construct a spatial filter is an exclusive feature of this chapter. The final section of the chapter presents an overview of spatial error and uncertainty, in order to raise awareness of these contemporary topics.
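One common form of Monte Carlo experiment in spatial statistics is the permutation test, in which observed values are randomly reassigned to locations in order to build a reference distribution for a statistic. The sketch below applies this idea to the Moran coefficient. It is a generic illustration in plain Python, not the book's eigenvector selection experiment; the twelve-unit chain, the seed, the number of replicates, and the function names are assumptions made for the example.

```python
import random


def moran(values, neighbors):
    """Moran coefficient; neighbors[i] lists the units adjacent to unit i."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(len(nb) for nb in neighbors)   # total of binary weights
    cross = sum(dev[i] * dev[j] for i in range(n) for j in neighbors[i])
    return (n / w_sum) * cross / sum(d * d for d in dev)


def permutation_p_value(values, neighbors, replicates=999, seed=1):
    """One-sided pseudo p-value for positive spatial autocorrelation."""
    rng = random.Random(seed)
    observed = moran(values, neighbors)
    shuffled = list(values)
    at_least = 0
    for _ in range(replicates):
        rng.shuffle(shuffled)                  # random reassignment to locations
        if moran(shuffled, neighbors) >= observed:
            at_least += 1
    # Count the observed arrangement as one member of the reference set
    return observed, (at_least + 1) / (replicates + 1)


n = 12
# Hypothetical toy geography: units on a line, each adjacent to its neighbors
neighbors = [[j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)]
gradient = [float(i) for i in range(n)]        # made-up smooth gradient

observed, p_value = permutation_p_value(gradient, neighbors)
print(observed, p_value)
```

Fixing the seed makes the experiment reproducible, and adding one to both the numerator and the denominator keeps the pseudo p-value strictly above zero.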

1.2. R basics

R is a free software environment for statistical computing and graphics. Because R is an implementation of the S programming language developed at Bell Laboratories, it is similar to S-Plus, a commercial implementation of S marketed by TIBCO Software Inc. R is available free of charge from the R project website (www.r-project.org) under the terms of the GNU General Public License, and its installation file can be downloaded from this site. R's source code is also open to the public.
R comprises packages. The R core system includes about eight packages. One of these, the base package, supports basic procedures. Many more R packages are available from the Comprehensive R Archive Network (CRAN) mirror sites. Specific procedures require additional packages to be installed and loaded. For example, the spdep package contains functions for spatial data analysis, such as spatial weights management, spatial autocorrelation quantifications, and spatial regression. R can be further extended with packages; new packages are developed, and old packages are upgraded regularly and then provided through CRAN. The great popularity of R among members of academic societies promotes its development with new functions and packages from researchers in various fields.
A user needs to know selected basics to start working with R. Fi...

Table of contents

  1. Cover Page
  2. Title Page
  3. Copyright
  4. Dedication
  5. Table of Contents
  6. About the Authors
  7. Preface
  8. 1 Introduction
  9. 2 Spatial Autocorrelation
  10. 3 Spatial Sampling
  11. 4 Spatial Composition and Configuration
  12. 5 Spatially Adjusted Regression and Related Spatial Econometrics
  13. 6 Local Statistics: Hot and Cold Spots
  14. 7 Analyzing Spatial Variance and Covariance with Geostatistics and Related Techniques
  15. 8 Methods for Spatial Interpolation in Two Dimensions
  16. 9 More Advanced Topics in Spatial Statistics
  17. References
  18. Index