Chapter 4 Mapping with ggplot2

In addition to creating other types of scientific graphs, the ggplot() function can be used to generate a variety of maps by writing relatively short blocks of code. This chapter will use the ggplot2 package along with several other packages that provide functions for processing geospatial and tabular datasets. If these packages are not already installed, you can install them using the install.packages() function or by selecting Tools > Install Packages in the RStudio menu.

library(sf)           # Objects and functions for geospatial data
library(rgdal)        # Functions for spatial data input/output
library(ggplot2)      # Graphing functions
library(dplyr)        # Functions for processing tabular data
library(tidyr)        # Functions for processing tabular data
library(scales)       # Additional graphics functions
library(RColorBrewer) # Color ramps for graphs and maps
library(gridExtra)    # Functions for arranging multiple plots on a page
library(readr)        # Functions for reading data

4.1 Importing shapefiles

There are several packages for importing, processing, and analyzing vector GIS data in R. This book mainly uses the sf package, which interfaces well with ggplot2 and other tidyverse packages. The biggest advantage of sf is that it stores geographic data in data frames, so we can continue to use all of the tools that we have learned to far to work with geospatial data. In this example, a shapefile of counties in South Dakota is imported to an sf object using the st_read() function. This shapefile has a very large number of columns. For simplicity, the select() function to narrow it down to a much smaller subset of columns.

  • sdcounty is the county name
  • `FIPS is the 5-digit county FIPS code (2 digits for the state plus three digits for the county)
  • c2003 is the number of West Nile virus cases reported to the CDC in 2003
  • p2003 is the total population in 2003
sdcounty <- st_read(dsn = "SD_counties.shp")
## Reading layer `SD_counties' from data source `E:\work\projects\gdswr-book\SD_counties.shp' using driver `ESRI Shapefile'
## Simple feature collection with 66 features and 18 fields
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: -167412.2 ymin: -264522.5 xmax: 456081.5 ymax: 118174.9
## projected CRS:  Custom
class(sdcounty)
## [1] "sf"         "data.frame"
summary(sdcounty)
##     STATE              COUNTY              FIPS               C2002    
##  Length:66          Length:66          Length:66          Min.   :0.0  
##  Class :character   Class :character   Class :character   1st Qu.:0.0  
##  Mode  :character   Mode  :character   Mode  :character   Median :0.0  
##                                                           Mean   :0.5  
##                                                           3rd Qu.:1.0  
##                                                           Max.   :7.0  
##      C2003            C2004            C2005           C2006       
##  Min.   :  1.00   Min.   :0.0000   Min.   : 0.00   Min.   : 0.000  
##  1st Qu.:  5.00   1st Qu.:0.0000   1st Qu.: 1.00   1st Qu.: 0.000  
##  Median : 10.00   Median :0.0000   Median : 2.00   Median : 1.000  
##  Mean   : 15.74   Mean   :0.7727   Mean   : 3.47   Mean   : 1.712  
##  3rd Qu.: 18.00   3rd Qu.:1.0000   3rd Qu.: 4.00   3rd Qu.: 2.000  
##  Max.   :139.00   Max.   :5.0000   Max.   :43.00   Max.   :17.000  
##      C2007            P2002            P2003            P2004       
##  Min.   : 0.000   Min.   :  1116   Min.   :  1081   Min.   :  1079  
##  1st Qu.: 1.000   1st Qu.:  2872   1st Qu.:  2881   1st Qu.:  2892  
##  Median : 2.000   Median :  5652   Median :  5576   Median :  5556  
##  Mean   : 3.136   Mean   : 11520   Mean   : 11574   Mean   : 11670  
##  3rd Qu.: 4.000   3rd Qu.:  9772   3rd Qu.:  9918   3rd Qu.: 10003  
##  Max.   :34.000   Max.   :152681   Max.   :154758   Max.   :157198  
##      P2005            P2006          Perimeter             Area          
##  Min.   :  1050   Min.   :  1067   Min.   : 5349759   Min.   :1.677e+12  
##  1st Qu.:  2891   1st Qu.:  2901   1st Qu.: 6850569   1st Qu.:2.498e+12  
##  Median :  5566   Median :  5540   Median : 8448195   Median :3.749e+12  
##  Mean   : 11741   Mean   : 11847   Mean   : 9409556   Mean   :4.691e+12  
##  3rd Qu.:  9983   3rd Qu.: 10072   3rd Qu.:11499945   3rd Qu.:6.105e+12  
##  Max.   :160051   Max.   :163281   Max.   :21090183   Max.   :1.398e+13  
##      Acres            Hectares               geometry 
##  Min.   : 267346   Min.   :108191   POLYGON      :66  
##  1st Qu.: 398257   1st Qu.:161169   epsg:NA      : 0  
##  Median : 597746   Median :241899   +proj=aea ...: 0  
##  Mean   : 747906   Mean   :302667                     
##  3rd Qu.: 973312   3rd Qu.:393885                     
##  Max.   :2228239   Max.   :901737
head(sdcounty)
## Simple feature collection with 6 features and 18 fields
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: -160176.7 ymin: 4272.666 xmax: 315294.6 ymax: 111714.9
## projected CRS:  Custom
##   STATE           COUNTY  FIPS C2002 C2003 C2004 C2005 C2006 C2007 P2002 P2003
## 1    SD   Perkins County 46105     0    18     0     4     0     2  3280  3205
## 2    SD    Corson County 46031     0     7     1     1     0     1  4267  4308
## 3    SD   Harding County 46063     0     4     0     0     0     2  1295  1292
## 4    SD  Campbell County 46021     0     6     0     2     0     1  1714  1658
## 5    SD McPherson County 46089     0     5     1     1     0     0  2782  2754
## 6    SD     Brown County 46013     4    90     5    43    17    34 34874 34764
##   P2004 P2005 P2006 Perimeter         Area     Acres Hectares
## 1  3110  3047  3025  13825361 1.163562e+13 1854980.4 750683.9
## 2  4364  4327  4288  14980700 1.016352e+13 1620294.0 655709.7
## 3  1237  1204  1205  13150608 1.071416e+13 1708077.9 691234.6
## 4  1601  1549  1494   8469958 3.078621e+12  490801.5 198620.3
## 5  2680  2638  2565   9104449 4.609571e+12  734869.4 297391.1
## 6 34709 34684 34645  10671400 6.960392e+12 1109643.1 449056.6
##                         geometry
## 1 POLYGON ((40.80017 105012.2...
## 2 POLYGON ((115305.8 106021.4...
## 3 POLYGON ((-77157.53 105450....
## 4 POLYGON ((115305.8 106021.4...
## 5 POLYGON ((176602.8 107250.8...
## 6 POLYGON ((253607.5 109345.3...
#sdcounty <- select(sdcounty, FIPS, C2003, P2003, p3001, p3002, p3003, p3004,
#                   p3005, p3006, p3007, p3008, p3009, p3010, p3011, p3012)
#names(sdcounty)
#head(sdcounty)

The sf object contains a column called geometry. This column remains in the data frame after running the select() function, even though it was not one of the columns selected. This is a special columns that contains the geographic data characterizing each feature (either a point, line, or polygon). Don’t try to modify this column directly - use the functions in the sf package for geospatial data processing.

Next, we important a CSV file that contains climate variables for each county into new a new data frame called sdclimate. The columns p3001-p3012 contain the 30-year mean precipitation totals for each month (1-12) and will be used to generate precipitation maps later in the chapter. The sdclimate data frame does not contain a geometry columns, so to map precipitaton it will need to be jointed to sdcounty. Both sdcounty and sdclimate have a column that contains a unique code for each county and can be used to join the tables. This column is called “FIPS” in sdcounty and “fips” (lowercase) in sdclimate. The left_join() function retains all the rows in sdcountyand joins any matching rows from the new tablesdclimate`.

sdclimate <- read_csv(file = "sd_climate.csv")
summary(sdclimate)
##       fips           t3001            t3002               t3003      
##  Min.   :46003   Min.   :-676.9   Min.   :-358.5140   Min.   :316.8  
##  1st Qu.:46036   1st Qu.:-478.3   1st Qu.:-186.6987   1st Qu.:473.5  
##  Median :46068   Median :-289.0   Median :   0.0325   Median :609.9  
##  Mean   :46068   Mean   :-289.5   Mean   :  -2.6165   Mean   :601.1  
##  3rd Qu.:46101   3rd Qu.:-123.4   3rd Qu.: 156.9818   3rd Qu.:718.9  
##  Max.   :46137   Max.   : 169.4   Max.   : 438.3400   Max.   :900.5  
##      t3004          t3005          t3006          t3007          t3008     
##  Min.   :1144   Min.   :1692   Min.   :2242   Min.   :2719   Min.   :2661  
##  1st Qu.:1380   1st Qu.:2067   1st Qu.:2564   1st Qu.:2970   1st Qu.:2892  
##  Median :1501   Median :2135   Median :2667   Median :3068   Median :2983  
##  Mean   :1477   Mean   :2130   Mean   :2650   Mean   :3050   Mean   :2971  
##  3rd Qu.:1581   3rd Qu.:2217   3rd Qu.:2739   3rd Qu.:3141   3rd Qu.:3064  
##  Max.   :1633   Max.   :2272   Max.   :2800   Max.   :3241   Max.   :3175  
##      t3009          t3010          t3011           t3012            t30sum    
##  Min.   :2089   Min.   :1417   Min.   :388.9   Min.   :-360.8   Min.   :1227  
##  1st Qu.:2317   1st Qu.:1599   1st Qu.:543.0   1st Qu.:-209.8   1st Qu.:1334  
##  Median :2431   Median :1698   Median :659.7   Median : -40.8   Median :1462  
##  Mean   :2417   Mean   :1696   Mean   :651.3   Mean   : -41.3   Mean   :1443  
##  3rd Qu.:2519   3rd Qu.:1797   3rd Qu.:769.4   3rd Qu.: 106.9   3rd Qu.:1540  
##  Max.   :2603   Max.   :1871   Max.   :869.4   Max.   : 312.0   Max.   :1646  
##      p3001            p3002            p3003          p3004     
##  Min.   : 906.8   Min.   : 952.8   Min.   :1877   Min.   :4192  
##  1st Qu.:1005.9   1st Qu.:1266.3   1st Qu.:3102   1st Qu.:4942  
##  Median :1144.7   Median :1380.7   Median :3454   Median :5365  
##  Mean   :1198.6   Mean   :1379.6   Mean   :3542   Mean   :5599  
##  3rd Qu.:1279.5   3rd Qu.:1461.7   3rd Qu.:4099   3rd Qu.:6391  
##  Max.   :2227.6   Max.   :2291.8   Max.   :4920   Max.   :7146  
##      p3005           p3006           p3007          p3008          p3009     
##  Min.   : 6570   Min.   : 6921   Min.   :5225   Min.   :3355   Min.   :2788  
##  1st Qu.: 7246   1st Qu.: 7820   1st Qu.:6755   1st Qu.:4953   1st Qu.:3589  
##  Median : 7967   Median : 8465   Median :7381   Median :5673   Median :4757  
##  Mean   : 8028   Mean   : 8532   Mean   :7339   Mean   :5840   Mean   :4729  
##  3rd Qu.: 8612   3rd Qu.: 9153   3rd Qu.:8123   3rd Qu.:6918   3rd Qu.:5858  
##  Max.   :10019   Max.   :10184   Max.   :8923   Max.   :8077   Max.   :6588  
##      p3010          p3011          p3012            p30sum     
##  Min.   :3252   Min.   :1437   Min.   : 884.8   Min.   :397.8  
##  1st Qu.:3878   1st Qu.:1738   1st Qu.:1004.1   1st Qu.:478.9  
##  Median :4256   Median :2199   Median :1062.9   Median :540.5  
##  Mean   :4256   Mean   :2348   Mean   :1141.4   Mean   :538.8  
##  3rd Qu.:4663   3rd Qu.:2898   3rd Qu.:1214.9   3rd Qu.:605.0  
##  Max.   :5118   Max.   :3770   Max.   :2531.5   Max.   :673.8
sdclimate <- mutate(sdclimate, fips = as.character(fips))
sdcounty <- left_join(sdcounty, sdclimate, by = c("FIPS" = "fips"))
names(sdcounty)
##  [1] "STATE"     "COUNTY"    "FIPS"      "C2002"     "C2003"     "C2004"    
##  [7] "C2005"     "C2006"     "C2007"     "P2002"     "P2003"     "P2004"    
## [13] "P2005"     "P2006"     "Perimeter" "Area"      "Acres"     "Hectares" 
## [19] "t3001"     "t3002"     "t3003"     "t3004"     "t3005"     "t3006"    
## [25] "t3007"     "t3008"     "t3009"     "t3010"     "t3011"     "t3012"    
## [31] "t30sum"    "p3001"     "p3002"     "p3003"     "p3004"     "p3005"    
## [37] "p3006"     "p3007"     "p3008"     "p3009"     "p3010"     "p3011"    
## [43] "p3012"     "p30sum"    "geometry"

The FIPS column in sdcounty is a character object, and the fips column in sdclimate is a numeric object. To be able to join these data frames, the fips column is converted from numeric to character using the mutate() function.

Another package for working with geospatial data in R is sp. This package has been around for much longer than sf, and there are several other geospatial packages that require data to be input as an sp object. The following examples show how to import a shapefile to an sp object, and how to convert between sf and sp objects. Note that the rgdal package is required to import shapefiles to an sp object.

sdcounty_sp <- readOGR(dsn = ".", layer = "SD_counties")
## OGR data source with driver: ESRI Shapefile 
## Source: "E:\work\projects\gdswr-book", layer: "SD_counties"
## with 66 features
## It has 18 fields
class(sdcounty_sp)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
sdcounty_sf <- st_as_sf(sdcounty_sp)
class(sdcounty_sf)
## [1] "sf"         "data.frame"
sdcounty_sp2 <- as(sdcounty, 'Spatial')
class(sdcounty_sp2)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"

To generate a map of counties using ggplot() with a sf object, use the geom_sf() function. Key arguments include the variable to be used to fill in the polygons and the color and size of the polygon borders. By default, ggplot() will use a default color ramp (light to dark blue) to represent the number of cases in each polygon.

ggplot(data = sdcounty) +
  geom_sf(aes(fill = C2003), color = "black", size = 0.25) 

4.2 Modifying the appearance of the map

Additional ggplot2 functions can be added to improve the appearance of the map. The scale_fill_distiller() function allows the specification of a different color ramp. In this example, the “YlGn” palette from the RColorBrewer package and the pretty_breaks() function from the scales package to automatically select a set of breaks for the legend. The theme_bw() eliminates the gray background grid to produce a nicer looking map layout. Finally, a title is added using the labs() function.

ggplot(data = sdcounty) +
  geom_sf(aes(fill = C2003), size = 0.25) + 
  scale_fill_distiller(name="Cases", 
                       palette = "YlGn", 
                       breaks = pretty_breaks()) +
  theme_bw() + 
  labs(title="2003 West Nile Virus Cases in South Dakota")

To have more control over how the map looks, the theme() function can be used with additional arguments. For example, the following version of the map removes the bounding box and the axis ticks and labels entirely.

ggplot(data = sdcounty) +
  geom_sf(aes(fill = C2003), size = 0.25) + 
  scale_fill_distiller(name="Cases", palette = "YlGn", breaks = pretty_breaks()) +
  theme(axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank(),
        panel.background = element_rect(fill = "white", color = NA)) +
  labs(title="2003 West Nile Virus Cases in South Dakota")

The RColorBrewer package provides a variety of color palettes that are effective for choropleth mapping. The following functions can be used to explore the different palettes that are available. More information is available at http://colorbrewer2.org.

display.brewer.all()

display.brewer.pal(5, "YlGnBu")

display.brewer.pal(7, "Reds")

The geographic pattern of West Nile virus cases reflects underlying risk factors related to the different physical and social environments in each county. However, case numbers are also related to the size of the human population in each county. Even if the risk of WNV was homogeneous across the state, we would still expect to see more cases in counties with higher population. To explore these relationships, we will plot the population of each county using varying-sized dots located on the county centroids.

To generate a point map, a new data frame containing the latitude and longitude of each county along with the county-level attributes must be created. Next, the geom_point() function is used to plot the points sizes proportional to the county populations. The scale_size() function generates a legend for the population points. In this example, the data argument is no longer provided in the initial call to ggplot(). Instead, two data frames are provided - one in each call to geom_sf().

sdcntrd = st_centroid(sdcounty)
head(sdcntrd)
## Simple feature collection with 6 features and 44 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: -116807.9 ymin: 54524.71 xmax: 284344.1 ymax: 89006.6
## projected CRS:  Custom
##   STATE           COUNTY  FIPS C2002 C2003 C2004 C2005 C2006 C2007 P2002 P2003
## 1    SD   Perkins County 46105     0    18     0     4     0     2  3280  3205
## 2    SD    Corson County 46031     0     7     1     1     0     1  4267  4308
## 3    SD   Harding County 46063     0     4     0     0     0     2  1295  1292
## 4    SD  Campbell County 46021     0     6     0     2     0     1  1714  1658
## 5    SD McPherson County 46089     0     5     1     1     0     0  2782  2754
## 6    SD     Brown County 46013     4    90     5    43    17    34 34874 34764
##   P2004 P2005 P2006 Perimeter         Area     Acres Hectares    t3001    t3002
## 1  3110  3047  3025  13825361 1.163562e+13 1854980.4 750683.9 -279.557    3.004
## 2  4364  4327  4288  14980700 1.016352e+13 1620294.0 655709.7 -450.446 -139.509
## 3  1237  1204  1205  13150608 1.071416e+13 1708077.9 691234.6 -199.345   64.800
## 4  1601  1549  1494   8469958 3.078621e+12  490801.5 198620.3 -570.374 -259.721
## 5  2680  2638  2565   9104449 4.609571e+12  734869.4 297391.1 -645.038 -351.214
## 6 34709 34684 34645  10671400 6.960392e+12 1109643.1 449056.6 -632.268 -310.498
##     t3003   t3004   t3005   t3006   t3007   t3008   t3009   t3010   t3011
## 1 537.272 1376.88 2032.19 2547.50 3037.38 2999.96 2363.36 1612.88 592.336
## 2 465.169 1381.77 2063.46 2560.29 3015.03 2958.07 2328.10 1597.46 524.725
## 3 543.073 1338.58 1951.09 2474.08 2997.77 2959.60 2313.91 1581.34 617.346
## 4 408.672 1380.07 2080.63 2572.33 2985.49 2923.18 2315.87 1581.31 484.309
## 5 335.534 1324.12 2030.93 2517.76 2917.22 2857.97 2263.05 1527.08 422.059
## 6 385.872 1384.75 2087.71 2587.21 2959.30 2889.27 2310.16 1574.48 459.825
##      t3012  t30sum    p3001    p3002   p3003   p3004   p3005   p3006   p3007
## 1  -43.555 1398.35  920.638 1127.050 2299.75 4618.26 7095.08 7365.01 5788.26
## 2 -170.649 1344.50  969.241 1139.210 2514.39 4433.97 6797.22 7630.82 5973.71
## 3   38.097 1390.07  981.429  952.777 1877.16 4192.13 7060.06 7410.31 5224.71
## 4 -264.731 1303.14  963.588 1141.570 2502.02 4291.78 6569.56 7822.22 6500.67
## 5 -349.982 1237.50 1142.690 1256.730 2925.70 4580.94 6884.06 8235.93 7048.24
## 6 -307.957 1282.36 1276.680 1254.370 3113.37 4762.04 6938.38 8541.65 7729.54
##     p3008   p3009   p3010   p3011    p3012  p30sum                   geometry
## 1 3888.68 3064.84 3474.34 1436.55  958.519 419.877 POINT (-37256.53 54524.71)
## 2 4511.33 3292.64 3736.00 1469.21 1031.160 434.495  POINT (62487.36 79145.04)
## 3 3355.43 2916.34 3459.64 1450.20  953.584 397.843 POINT (-116807.9 65558.57)
## 4 5328.33 3603.61 3920.38 1651.33  899.942 451.451    POINT (151277 87684.18)
## 5 5628.95 4180.97 4009.65 1952.56  918.467 487.154     POINT (215717 89006.6)
## 6 6184.40 4745.59 4167.08 2136.99 1025.430 518.259  POINT (284344.1 71966.24)
ggplot() +
  geom_sf(data = sdcounty, aes(fill = C2003), color = "black", size = 0.25) + 
  scale_fill_distiller(name="Cases", palette = "YlGn", breaks = pretty_breaks())+
  geom_sf(data = sdcntrd, aes(size = P2003), color = "black", show.legend = "point") +
  scale_size(name = "Population") + 
  theme_bw() +
  labs(title="2003 West Nile Virus Cases in South Dakota")

Intead of mapping case numbers, it preferable map the incidence rate, which is the number of cases per unit of population (often per 100,000 population) and time period (usually per year). A new incidence variable can be calculated and added to the data frame using the mutate() function from the dplyr package.

Rather than using continuous scales for color and size, it is usually recommended to aggregate the data into a relatively small number of classes (typically 3-6). To accomplish this step, we need to add a couple of new classified variables using mutate(). The cut() function is used to split the incidence rate into four quantiles, and to split population into three classes using manually-selected breaks. Note that a different scale() functions is now needed to handle the discrete variables instead of continuous variables.

sdcounty <- mutate(sdcounty, 
                   inc2003 = 100000 * C2003/P2003, 
                   inc2003_c1 = cut(inc2003, breaks = quantile(inc2003),
                                    include.lowest = T))
sdcntrd <- mutate(sdcntrd,
                  pop2003_c1 = cut(P2003, breaks = c(0, 10000, 50000, 200000),
                                   include.lowest = T, dig.lab=6))

ggplot() +
  geom_sf(data = sdcounty, aes(fill = inc2003_c1), color = "black", size = 0.25) + 
  scale_fill_brewer(name="Incidence", palette = "YlGn",
                    guide = guide_legend(override.aes = list(linetype = "blank", shape = NA))) +
  geom_sf(data = sdcntrd, aes(size = pop2003_c1), color = "black", show.legend = "point") +
  scale_size_discrete(name="Population") + 
  theme_bw() +
  labs(title="2003 West Nile Virus Incidence in South Dakota")

4.3 Layouts with multiple maps

It is often useful to be able to generate mutipanel map layouts, similar to the multipanel graphs created in the previous chapter In this example, we will use the county-level climatology data for SD to generate maps of the 30-year average precipitation by month.

The attribute table imported with the shapefile is in a ‘wide’ format. This means that there is one row for every observation (county) and a separate column for each variable, so the twelve monthly precipitation variables that we want to map are in twelve different columns. To map them using facets with ggplot(), they need to be converted into ‘long’ format, where all the precipitation variables are stored in a single column (the ‘values_to’ column) and there is an additional column that indicates which month is associated with each record (the ‘names_to’ column). As illustrated in the preceding chapter, this transformation can be accomplished using the pivot_long() function in the tidyr package. The subset of variables required for mapping is chosen using select(), and the key variables, monthvar, is converted into a factor that has month names as labels using the mutate() function.

monthly_prec <- sdcounty %>%
  pivot_longer(sdcounty,
               cols = p3001:p3012,
               values_to = "precip",
               names_to = "monthvar") %>%
  select(FIPS, monthvar, precip, geometry) %>%
  mutate(monthvar = factor(monthvar, labels = month.name))
## Warning in gsub(paste0("^", names_prefix), "", names(cols)): argument 'pattern'
## has length > 1 and only the first element will be used
## Warning in val_cols[col_id] <- unname(as.list(data[cols])): number of items to
## replace is not a multiple of replacement length
glimpse(monthly_prec)
## Rows: 792
## Columns: 4
## $ FIPS     <chr> "46105", "46105", "46105", "46105", "46105", "46105", "461...
## $ monthvar <fct> January, February, March, April, May, June, July, August, ...
## $ precip   <dbl> 920.638, 1127.050, 2299.750, 4618.260, 7095.080, 7365.010,...
## $ geometry <POLYGON [m]> POLYGON ((40.80017 105012.2..., POLYGON ((40.80017...

As of the writing of this book, the pivot_longer() and pivot_wider() function have the undesirable side effect of removing the sfclass from the output object, even through the geometry data is still present. As a workaround, be sure to use st_as_sf() function to convert the object back to an sf class.

class(monthly_prec)
## [1] "tbl_df"     "tbl"        "data.frame"
monthly_prec <- st_as_sf(monthly_prec)
class(monthly_prec)
## [1] "sf"         "tbl_df"     "tbl"        "data.frame"

Now ggplot() can be used with the facet_wrap() function to generate a layout with the twelve monthly maps arranged in four columns.

ggplot() +
  geom_sf(data = monthly_prec, aes(fill = precip), color = "black", size = 0.25) + 
  scale_fill_distiller(name="Cases", palette = "Blues", breaks = pretty_breaks())+
  facet_wrap(~ monthvar, ncol = 4) +
  labs(title="Monthly Climatology of Precipitaton in South Dakota")

The appearance of the strips containing the facet labels can be changed using the theme() function.

ggplot() +
  geom_sf(data = monthly_prec, aes(fill = precip), color = "black", size = 0.25) + 
  scale_fill_distiller(name="Cases", palette = "Blues", breaks = pretty_breaks())+
  facet_wrap(~ monthvar, ncol = 4) +
  theme(strip.text.x = element_text(size=8, face="bold"),
        strip.background = element_rect(colour="black", fill="white")) +
  labs(title="Monthly Climatology of Precipitaton in South Dakota")

Another way to create a layout with multiple maps is to make multiple calls to ggplot() and then arrange the resulting maps using the grid.arrange() function from the gridExtra package. This approach is necessary to combine different types of charts and/or maps. This example generates a layout with two maps: a choropleth map of West Nile virus incidence and a graduated symbol map of population.

map1 <- ggplot() +
  geom_sf(data = sdcounty, 
               aes(fill = inc2003_c1), color = "black", size = 0.25) + 
  scale_fill_brewer(name="Incidence", palette = "YlGn")+
  theme_bw() +
  labs(title="A) 2003 West Nile Virus Incidence (per 100,000)")

map2 <- ggplot() +
  geom_sf(data = sdcounty, fill = "white", color = "black", size = 0.25) + 
  geom_sf(data = sdcntrd, aes(size = pop2003_c1), color = "blue", show.legend = "point") +
  scale_size_discrete(name="Population") + 
  theme_bw() +
  labs(title="B) 2003 County Population")

class(map1)
## [1] "gg"     "ggplot"
class(map2)
## [1] "gg"     "ggplot"
grid.arrange(map1, map2, ncol=1)

When the ggplot() function is called without assigning the output to an object, the resulting chart or map is plotted to the current graphics device. However, the function output can also be assgned to create a new object of class “ggplot”. In the preceding example, ggplot() output is assigned to create two new objects called map1 and map2, and these objects are then provided as arguments to grid.arrange() to generate a two-map layout.

4.4 Output to Graphics Devices

By default, maps and charts generated using ggplot() are output to the Plots tab in the lower right-hand corner of the RStudio GUI. However, it is sometimes necessary to print maps to other graphics devices and explicitly specify the dimensions and resolution of the resulting files. This is often the case when generating graphics for publications. To output the map to a new graphics window, the windows() function is used to open a new device with the specified size.

windows(width = 6, height=4)
map1 +
  theme(plot.margin = unit(c(0.3, 0.3, 0.3, 0.3), "in"))

In this example, the windows() function is used to open a 6 inch wide by 4 inch high graphics window. As with other types of R objects, entering the object name invokes the default print() method and plots the chart to the current graphics device.

Alternately, the png() function can be used to write the map directly to a PNG image. There are also bmp(), jpeg(), and tiff() functions for writing to other image formats. There are various ways to specify image size and resolution. In the example below, the figure size is 6 inches wide by 4 inches high, and the resolution is 300 pixels per inch. Thus, the resulting image will be 6 x 300 = 1,800 pixels wide by 4 x 300 = 1,200 pixels high.

png(filename = "testgraph.png", width = 6, height=4, units="in", res=300)
map1 +
  theme(plot.margin = unit(c(0.3, 0.3, 0.3, 0.3), "in"))
dev.off()
## png 
##   2

The png() graphics device needs to be turned off using the dev.off() function if you want subsequent plots to revert to the default RStudio graphics window.

It is also possible to export a map directly to a pdf file using the pdf() function.

pdf(file = "testgraph.pdf", width=8.5, height=11)
newmap1 <- map1 + theme(plot.margin = unit(c(1,1,0.1,1), "in"))
newmap2 <- map1 + theme(plot.margin = unit(c(0.1,1,1,1), "in"))
grid.arrange(newmap1, newmap2, ncol=1 )
dev.off()
## png 
##   2

4.5 Practice

  1. Experiment with different versions of the West Nile virus incidence map. Try different color ramps from the RColorBrewer package and different numbers and breakpoints for classes in the choropleth maps.

  2. Generate mutli-panel map of mean monthly temperatures. The monthly temperature variables are t3001, t3002… t3012.