Geospatial Data Science With R: Applications in Environmental Geography
We are living in a time of unprecedented environmental change, driven by the effects of fossil fuels on the Earth’s climate and the expanding footprint of human land use. To mitigate and adapt to these changes, there is a need to understand their myriad impacts on human and natural systems. Achieving this goal requires geospatial data on a wide variety of environmental factors, including climate, vegetation, biodiversity, soils, terrain, water, and human land use. Large volumes of these data are collected by Earth-observing satellites and ground-based sensors. But the data alone are not enough. Using them effectively requires tools for appropriate manipulation and analysis.
The burgeoning field of data science has provided a wealth of techniques for analyzing large and complex datasets, including methods for descriptive, explanatory, and predictive analysis. However, actually applying these methods is typically a small part of the overall data science workflow. Other critical tasks include screening for suspect data values, handling missing data, combining data from multiple sources, summarizing variables for analysis, and visualizing data and analysis results. Although there are many books available on analytical methods, there are far fewer that cover the overall process of working with geospatial data to address scientific questions and develop practical applications.
The purpose of this book is to fill this gap by providing a series of tutorials aimed at outlining best practices for using geospatial data to address problems in environmental geography. It is based on the R language and environment, which currently provides the best option for working with diverse spatial and non-spatial data in a single platform. The book is not intended to provide a comprehensive overview of R. Instead, it uses an example-based approach to present what I believe are the most practical and useful approaches for working with geospatial data.
The first chapter provides a brief overview of important concepts in R that can serve as an introduction for readers who have not used R before, or as a refresher for more experienced readers. The subsequent chapters each focus on a particular topic and build upon the material in preceding chapters. The methods presented make extensive use of the tidyverse collection of R packages, including ggplot2, dplyr, and tidyr. For geospatial data, the sf package is primarily used for vector data and the raster package is used for raster data. The examples draw upon a variety of data sources, including meteorological station data, gridded climate data, classified land cover data, and digital elevation models. Each chapter ends with a set of practice questions that use the same data and methods covered in the chapter.