Chapter 4 Mapping with ggplot2
In addition to creating other types of scientific graphs, the ggplot()
function can be used to generate a variety of maps by writing relatively short blocks of code. This chapter will use the ggplot2 package along with several other packages that provide functions for processing geospatial and tabular datasets. If these packages are not already installed, you can install them using the install.packages()
function or by selecting Tools > Install Packages in the RStudio menu.
library(sf) # Objects and functions for geospatial data
library(rgdal) # Functions for spatial data input/output
library(ggplot2) # Graphing functions
library(dplyr) # Functions for processing tabular data
library(tidyr) # Functions for processing tabular data
library(scales) # Additional graphics functions
library(RColorBrewer) # Color ramps for graphs and maps
library(gridExtra) # Functions for arranging multiple plots on a page
library(readr) # Functions for reading data
4.1 Importing shapefiles
There are several packages for importing, processing, and analyzing vector GIS data in R. This book mainly uses the sf package, which interfaces well with ggplot2 and other tidyverse packages. The biggest advantage of sf is that it stores geographic data in data frames, so we can continue to use all of the tools that we have learned to far to work with geospatial data. In this example, a shapefile of counties in South Dakota is imported to an sf object using the st_read()
function. This shapefile has a very large number of columns. For simplicity, the select()
function to narrow it down to a much smaller subset of columns.
sdcounty
is the county name- `FIPS is the 5-digit county FIPS code (2 digits for the state plus three digits for the county)
c2003
is the number of West Nile virus cases reported to the CDC in 2003p2003
is the total population in 2003
st_read(dsn = "SD_counties.shp") sdcounty <-
## Reading layer `SD_counties' from data source `E:\work\projects\gdswr-book\SD_counties.shp' using driver `ESRI Shapefile'
## Simple feature collection with 66 features and 18 fields
## geometry type: POLYGON
## dimension: XY
## bbox: xmin: -167412.2 ymin: -264522.5 xmax: 456081.5 ymax: 118174.9
## projected CRS: Custom
class(sdcounty)
## [1] "sf" "data.frame"
summary(sdcounty)
## STATE COUNTY FIPS C2002
## Length:66 Length:66 Length:66 Min. :0.0
## Class :character Class :character Class :character 1st Qu.:0.0
## Mode :character Mode :character Mode :character Median :0.0
## Mean :0.5
## 3rd Qu.:1.0
## Max. :7.0
## C2003 C2004 C2005 C2006
## Min. : 1.00 Min. :0.0000 Min. : 0.00 Min. : 0.000
## 1st Qu.: 5.00 1st Qu.:0.0000 1st Qu.: 1.00 1st Qu.: 0.000
## Median : 10.00 Median :0.0000 Median : 2.00 Median : 1.000
## Mean : 15.74 Mean :0.7727 Mean : 3.47 Mean : 1.712
## 3rd Qu.: 18.00 3rd Qu.:1.0000 3rd Qu.: 4.00 3rd Qu.: 2.000
## Max. :139.00 Max. :5.0000 Max. :43.00 Max. :17.000
## C2007 P2002 P2003 P2004
## Min. : 0.000 Min. : 1116 Min. : 1081 Min. : 1079
## 1st Qu.: 1.000 1st Qu.: 2872 1st Qu.: 2881 1st Qu.: 2892
## Median : 2.000 Median : 5652 Median : 5576 Median : 5556
## Mean : 3.136 Mean : 11520 Mean : 11574 Mean : 11670
## 3rd Qu.: 4.000 3rd Qu.: 9772 3rd Qu.: 9918 3rd Qu.: 10003
## Max. :34.000 Max. :152681 Max. :154758 Max. :157198
## P2005 P2006 Perimeter Area
## Min. : 1050 Min. : 1067 Min. : 5349759 Min. :1.677e+12
## 1st Qu.: 2891 1st Qu.: 2901 1st Qu.: 6850569 1st Qu.:2.498e+12
## Median : 5566 Median : 5540 Median : 8448195 Median :3.749e+12
## Mean : 11741 Mean : 11847 Mean : 9409556 Mean :4.691e+12
## 3rd Qu.: 9983 3rd Qu.: 10072 3rd Qu.:11499945 3rd Qu.:6.105e+12
## Max. :160051 Max. :163281 Max. :21090183 Max. :1.398e+13
## Acres Hectares geometry
## Min. : 267346 Min. :108191 POLYGON :66
## 1st Qu.: 398257 1st Qu.:161169 epsg:NA : 0
## Median : 597746 Median :241899 +proj=aea ...: 0
## Mean : 747906 Mean :302667
## 3rd Qu.: 973312 3rd Qu.:393885
## Max. :2228239 Max. :901737
head(sdcounty)
## Simple feature collection with 6 features and 18 fields
## geometry type: POLYGON
## dimension: XY
## bbox: xmin: -160176.7 ymin: 4272.666 xmax: 315294.6 ymax: 111714.9
## projected CRS: Custom
## STATE COUNTY FIPS C2002 C2003 C2004 C2005 C2006 C2007 P2002 P2003
## 1 SD Perkins County 46105 0 18 0 4 0 2 3280 3205
## 2 SD Corson County 46031 0 7 1 1 0 1 4267 4308
## 3 SD Harding County 46063 0 4 0 0 0 2 1295 1292
## 4 SD Campbell County 46021 0 6 0 2 0 1 1714 1658
## 5 SD McPherson County 46089 0 5 1 1 0 0 2782 2754
## 6 SD Brown County 46013 4 90 5 43 17 34 34874 34764
## P2004 P2005 P2006 Perimeter Area Acres Hectares
## 1 3110 3047 3025 13825361 1.163562e+13 1854980.4 750683.9
## 2 4364 4327 4288 14980700 1.016352e+13 1620294.0 655709.7
## 3 1237 1204 1205 13150608 1.071416e+13 1708077.9 691234.6
## 4 1601 1549 1494 8469958 3.078621e+12 490801.5 198620.3
## 5 2680 2638 2565 9104449 4.609571e+12 734869.4 297391.1
## 6 34709 34684 34645 10671400 6.960392e+12 1109643.1 449056.6
## geometry
## 1 POLYGON ((40.80017 105012.2...
## 2 POLYGON ((115305.8 106021.4...
## 3 POLYGON ((-77157.53 105450....
## 4 POLYGON ((115305.8 106021.4...
## 5 POLYGON ((176602.8 107250.8...
## 6 POLYGON ((253607.5 109345.3...
#sdcounty <- select(sdcounty, FIPS, C2003, P2003, p3001, p3002, p3003, p3004,
# p3005, p3006, p3007, p3008, p3009, p3010, p3011, p3012)
#names(sdcounty)
#head(sdcounty)
The sf object contains a column called geometry
. This column remains in the data frame after running the select()
function, even though it was not one of the columns selected. This is a special columns that contains the geographic data characterizing each feature (either a point, line, or polygon). Don’t try to modify this column directly - use the functions in the sf package for geospatial data processing.
Next, we important a CSV file that contains climate variables for each county into new a new data frame called sdclimate
. The columns p3001-p3012 contain the 30-year mean precipitation totals for each month (1-12) and will be used to generate precipitation maps later in the chapter. The sdclimate
data frame does not contain a geometry columns, so to map precipitaton it will need to be jointed to sdcounty
. Both sdcounty
and sdclimate
have a column that contains a unique code for each county and can be used to join the tables. This column is called “FIPS” in sdcounty
and “fips” (lowercase) in sdclimate
. The left_join()
function retains all the rows in sdcounty
and joins any matching rows from the new table
sdclimate`.
read_csv(file = "sd_climate.csv")
sdclimate <-summary(sdclimate)
## fips t3001 t3002 t3003
## Min. :46003 Min. :-676.9 Min. :-358.5140 Min. :316.8
## 1st Qu.:46036 1st Qu.:-478.3 1st Qu.:-186.6987 1st Qu.:473.5
## Median :46068 Median :-289.0 Median : 0.0325 Median :609.9
## Mean :46068 Mean :-289.5 Mean : -2.6165 Mean :601.1
## 3rd Qu.:46101 3rd Qu.:-123.4 3rd Qu.: 156.9818 3rd Qu.:718.9
## Max. :46137 Max. : 169.4 Max. : 438.3400 Max. :900.5
## t3004 t3005 t3006 t3007 t3008
## Min. :1144 Min. :1692 Min. :2242 Min. :2719 Min. :2661
## 1st Qu.:1380 1st Qu.:2067 1st Qu.:2564 1st Qu.:2970 1st Qu.:2892
## Median :1501 Median :2135 Median :2667 Median :3068 Median :2983
## Mean :1477 Mean :2130 Mean :2650 Mean :3050 Mean :2971
## 3rd Qu.:1581 3rd Qu.:2217 3rd Qu.:2739 3rd Qu.:3141 3rd Qu.:3064
## Max. :1633 Max. :2272 Max. :2800 Max. :3241 Max. :3175
## t3009 t3010 t3011 t3012 t30sum
## Min. :2089 Min. :1417 Min. :388.9 Min. :-360.8 Min. :1227
## 1st Qu.:2317 1st Qu.:1599 1st Qu.:543.0 1st Qu.:-209.8 1st Qu.:1334
## Median :2431 Median :1698 Median :659.7 Median : -40.8 Median :1462
## Mean :2417 Mean :1696 Mean :651.3 Mean : -41.3 Mean :1443
## 3rd Qu.:2519 3rd Qu.:1797 3rd Qu.:769.4 3rd Qu.: 106.9 3rd Qu.:1540
## Max. :2603 Max. :1871 Max. :869.4 Max. : 312.0 Max. :1646
## p3001 p3002 p3003 p3004
## Min. : 906.8 Min. : 952.8 Min. :1877 Min. :4192
## 1st Qu.:1005.9 1st Qu.:1266.3 1st Qu.:3102 1st Qu.:4942
## Median :1144.7 Median :1380.7 Median :3454 Median :5365
## Mean :1198.6 Mean :1379.6 Mean :3542 Mean :5599
## 3rd Qu.:1279.5 3rd Qu.:1461.7 3rd Qu.:4099 3rd Qu.:6391
## Max. :2227.6 Max. :2291.8 Max. :4920 Max. :7146
## p3005 p3006 p3007 p3008 p3009
## Min. : 6570 Min. : 6921 Min. :5225 Min. :3355 Min. :2788
## 1st Qu.: 7246 1st Qu.: 7820 1st Qu.:6755 1st Qu.:4953 1st Qu.:3589
## Median : 7967 Median : 8465 Median :7381 Median :5673 Median :4757
## Mean : 8028 Mean : 8532 Mean :7339 Mean :5840 Mean :4729
## 3rd Qu.: 8612 3rd Qu.: 9153 3rd Qu.:8123 3rd Qu.:6918 3rd Qu.:5858
## Max. :10019 Max. :10184 Max. :8923 Max. :8077 Max. :6588
## p3010 p3011 p3012 p30sum
## Min. :3252 Min. :1437 Min. : 884.8 Min. :397.8
## 1st Qu.:3878 1st Qu.:1738 1st Qu.:1004.1 1st Qu.:478.9
## Median :4256 Median :2199 Median :1062.9 Median :540.5
## Mean :4256 Mean :2348 Mean :1141.4 Mean :538.8
## 3rd Qu.:4663 3rd Qu.:2898 3rd Qu.:1214.9 3rd Qu.:605.0
## Max. :5118 Max. :3770 Max. :2531.5 Max. :673.8
mutate(sdclimate, fips = as.character(fips))
sdclimate <- left_join(sdcounty, sdclimate, by = c("FIPS" = "fips"))
sdcounty <-names(sdcounty)
## [1] "STATE" "COUNTY" "FIPS" "C2002" "C2003" "C2004"
## [7] "C2005" "C2006" "C2007" "P2002" "P2003" "P2004"
## [13] "P2005" "P2006" "Perimeter" "Area" "Acres" "Hectares"
## [19] "t3001" "t3002" "t3003" "t3004" "t3005" "t3006"
## [25] "t3007" "t3008" "t3009" "t3010" "t3011" "t3012"
## [31] "t30sum" "p3001" "p3002" "p3003" "p3004" "p3005"
## [37] "p3006" "p3007" "p3008" "p3009" "p3010" "p3011"
## [43] "p3012" "p30sum" "geometry"
The FIPS column in sdcounty
is a character object, and the fips column in sdclimate
is a numeric object. To be able to join these data frames, the fips column is converted from numeric to character using the mutate()
function.
Another package for working with geospatial data in R is sp. This package has been around for much longer than sf, and there are several other geospatial packages that require data to be input as an sp object. The following examples show how to import a shapefile to an sp object, and how to convert between sf and sp objects. Note that the rgdal package is required to import shapefiles to an sp object.
readOGR(dsn = ".", layer = "SD_counties") sdcounty_sp <-
## OGR data source with driver: ESRI Shapefile
## Source: "E:\work\projects\gdswr-book", layer: "SD_counties"
## with 66 features
## It has 18 fields
class(sdcounty_sp)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
st_as_sf(sdcounty_sp)
sdcounty_sf <-class(sdcounty_sf)
## [1] "sf" "data.frame"
as(sdcounty, 'Spatial')
sdcounty_sp2 <-class(sdcounty_sp2)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
To generate a map of counties using ggplot()
with a sf object, use the geom_sf()
function. Key arguments include the variable to be used to fill in the polygons and the color and size of the polygon borders. By default, ggplot()
will use a default color ramp (light to dark blue) to represent the number of cases in each polygon.
ggplot(data = sdcounty) +
geom_sf(aes(fill = C2003), color = "black", size = 0.25)
4.2 Modifying the appearance of the map
Additional ggplot2 functions can be added to improve the appearance of the map. The scale_fill_distiller()
function allows the specification of a different color ramp. In this example, the “YlGn” palette from the RColorBrewer package and the pretty_breaks()
function from the scales package to automatically select a set of breaks for the legend. The theme_bw()
eliminates the gray background grid to produce a nicer looking map layout. Finally, a title is added using the labs()
function.
ggplot(data = sdcounty) +
geom_sf(aes(fill = C2003), size = 0.25) +
scale_fill_distiller(name="Cases",
palette = "YlGn",
breaks = pretty_breaks()) +
theme_bw() +
labs(title="2003 West Nile Virus Cases in South Dakota")
To have more control over how the map looks, the theme()
function can be used with additional arguments. For example, the following version of the map removes the bounding box and the axis ticks and labels entirely.
ggplot(data = sdcounty) +
geom_sf(aes(fill = C2003), size = 0.25) +
scale_fill_distiller(name="Cases", palette = "YlGn", breaks = pretty_breaks()) +
theme(axis.text.x = element_blank(),
axis.text.y = element_blank(),
axis.ticks = element_blank(),
panel.background = element_rect(fill = "white", color = NA)) +
labs(title="2003 West Nile Virus Cases in South Dakota")
The RColorBrewer package provides a variety of color palettes that are effective for choropleth mapping. The following functions can be used to explore the different palettes that are available. More information is available at http://colorbrewer2.org.
display.brewer.all()
display.brewer.pal(5, "YlGnBu")
display.brewer.pal(7, "Reds")
The geographic pattern of West Nile virus cases reflects underlying risk factors related to the different physical and social environments in each county. However, case numbers are also related to the size of the human population in each county. Even if the risk of WNV was homogeneous across the state, we would still expect to see more cases in counties with higher population. To explore these relationships, we will plot the population of each county using varying-sized dots located on the county centroids.
To generate a point map, a new data frame containing the latitude and longitude of each county along with the county-level attributes must be created. Next, the geom_point()
function is used to plot the points sizes proportional to the county populations. The scale_size()
function generates a legend for the population points. In this example, the data
argument is no longer provided in the initial call to ggplot()
. Instead, two data frames are provided - one in each call to geom_sf()
.
st_centroid(sdcounty)
sdcntrd =head(sdcntrd)
## Simple feature collection with 6 features and 44 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: -116807.9 ymin: 54524.71 xmax: 284344.1 ymax: 89006.6
## projected CRS: Custom
## STATE COUNTY FIPS C2002 C2003 C2004 C2005 C2006 C2007 P2002 P2003
## 1 SD Perkins County 46105 0 18 0 4 0 2 3280 3205
## 2 SD Corson County 46031 0 7 1 1 0 1 4267 4308
## 3 SD Harding County 46063 0 4 0 0 0 2 1295 1292
## 4 SD Campbell County 46021 0 6 0 2 0 1 1714 1658
## 5 SD McPherson County 46089 0 5 1 1 0 0 2782 2754
## 6 SD Brown County 46013 4 90 5 43 17 34 34874 34764
## P2004 P2005 P2006 Perimeter Area Acres Hectares t3001 t3002
## 1 3110 3047 3025 13825361 1.163562e+13 1854980.4 750683.9 -279.557 3.004
## 2 4364 4327 4288 14980700 1.016352e+13 1620294.0 655709.7 -450.446 -139.509
## 3 1237 1204 1205 13150608 1.071416e+13 1708077.9 691234.6 -199.345 64.800
## 4 1601 1549 1494 8469958 3.078621e+12 490801.5 198620.3 -570.374 -259.721
## 5 2680 2638 2565 9104449 4.609571e+12 734869.4 297391.1 -645.038 -351.214
## 6 34709 34684 34645 10671400 6.960392e+12 1109643.1 449056.6 -632.268 -310.498
## t3003 t3004 t3005 t3006 t3007 t3008 t3009 t3010 t3011
## 1 537.272 1376.88 2032.19 2547.50 3037.38 2999.96 2363.36 1612.88 592.336
## 2 465.169 1381.77 2063.46 2560.29 3015.03 2958.07 2328.10 1597.46 524.725
## 3 543.073 1338.58 1951.09 2474.08 2997.77 2959.60 2313.91 1581.34 617.346
## 4 408.672 1380.07 2080.63 2572.33 2985.49 2923.18 2315.87 1581.31 484.309
## 5 335.534 1324.12 2030.93 2517.76 2917.22 2857.97 2263.05 1527.08 422.059
## 6 385.872 1384.75 2087.71 2587.21 2959.30 2889.27 2310.16 1574.48 459.825
## t3012 t30sum p3001 p3002 p3003 p3004 p3005 p3006 p3007
## 1 -43.555 1398.35 920.638 1127.050 2299.75 4618.26 7095.08 7365.01 5788.26
## 2 -170.649 1344.50 969.241 1139.210 2514.39 4433.97 6797.22 7630.82 5973.71
## 3 38.097 1390.07 981.429 952.777 1877.16 4192.13 7060.06 7410.31 5224.71
## 4 -264.731 1303.14 963.588 1141.570 2502.02 4291.78 6569.56 7822.22 6500.67
## 5 -349.982 1237.50 1142.690 1256.730 2925.70 4580.94 6884.06 8235.93 7048.24
## 6 -307.957 1282.36 1276.680 1254.370 3113.37 4762.04 6938.38 8541.65 7729.54
## p3008 p3009 p3010 p3011 p3012 p30sum geometry
## 1 3888.68 3064.84 3474.34 1436.55 958.519 419.877 POINT (-37256.53 54524.71)
## 2 4511.33 3292.64 3736.00 1469.21 1031.160 434.495 POINT (62487.36 79145.04)
## 3 3355.43 2916.34 3459.64 1450.20 953.584 397.843 POINT (-116807.9 65558.57)
## 4 5328.33 3603.61 3920.38 1651.33 899.942 451.451 POINT (151277 87684.18)
## 5 5628.95 4180.97 4009.65 1952.56 918.467 487.154 POINT (215717 89006.6)
## 6 6184.40 4745.59 4167.08 2136.99 1025.430 518.259 POINT (284344.1 71966.24)
ggplot() +
geom_sf(data = sdcounty, aes(fill = C2003), color = "black", size = 0.25) +
scale_fill_distiller(name="Cases", palette = "YlGn", breaks = pretty_breaks())+
geom_sf(data = sdcntrd, aes(size = P2003), color = "black", show.legend = "point") +
scale_size(name = "Population") +
theme_bw() +
labs(title="2003 West Nile Virus Cases in South Dakota")
Intead of mapping case numbers, it preferable map the incidence rate, which is the number of cases per unit of population (often per 100,000 population) and time period (usually per year). A new incidence variable can be calculated and added to the data frame using the mutate()
function from the dplyr package.
Rather than using continuous scales for color and size, it is usually recommended to aggregate the data into a relatively small number of classes (typically 3-6). To accomplish this step, we need to add a couple of new classified variables using mutate()
. The cut()
function is used to split the incidence rate into four quantiles, and to split population into three classes using manually-selected breaks. Note that a different scale()
functions is now needed to handle the discrete variables instead of continuous variables.
mutate(sdcounty,
sdcounty <-inc2003 = 100000 * C2003/P2003,
inc2003_c1 = cut(inc2003, breaks = quantile(inc2003),
include.lowest = T))
mutate(sdcntrd,
sdcntrd <-pop2003_c1 = cut(P2003, breaks = c(0, 10000, 50000, 200000),
include.lowest = T, dig.lab=6))
ggplot() +
geom_sf(data = sdcounty, aes(fill = inc2003_c1), color = "black", size = 0.25) +
scale_fill_brewer(name="Incidence", palette = "YlGn",
guide = guide_legend(override.aes = list(linetype = "blank", shape = NA))) +
geom_sf(data = sdcntrd, aes(size = pop2003_c1), color = "black", show.legend = "point") +
scale_size_discrete(name="Population") +
theme_bw() +
labs(title="2003 West Nile Virus Incidence in South Dakota")
4.3 Layouts with multiple maps
It is often useful to be able to generate mutipanel map layouts, similar to the multipanel graphs created in the previous chapter In this example, we will use the county-level climatology data for SD to generate maps of the 30-year average precipitation by month.
The attribute table imported with the shapefile is in a ‘wide’ format. This means that there is one row for every observation (county) and a separate column for each variable, so the twelve monthly precipitation variables that we want to map are in twelve different columns. To map them using facets with ggplot()
, they need to be converted into ‘long’ format, where all the precipitation variables are stored in a single column (the ‘values_to’ column) and there is an additional column that indicates which month is associated with each record (the ‘names_to’ column). As illustrated in the preceding chapter, this transformation can be accomplished using the pivot_long()
function in the tidyr package. The subset of variables required for mapping is chosen using select()
, and the key variables, monthvar
, is converted into a factor that has month names as labels using the mutate()
function.
sdcounty %>%
monthly_prec <- pivot_longer(sdcounty,
cols = p3001:p3012,
values_to = "precip",
names_to = "monthvar") %>%
select(FIPS, monthvar, precip, geometry) %>%
mutate(monthvar = factor(monthvar, labels = month.name))
## Warning in gsub(paste0("^", names_prefix), "", names(cols)): argument 'pattern'
## has length > 1 and only the first element will be used
## Warning in val_cols[col_id] <- unname(as.list(data[cols])): number of items to
## replace is not a multiple of replacement length
glimpse(monthly_prec)
## Rows: 792
## Columns: 4
## $ FIPS <chr> "46105", "46105", "46105", "46105", "46105", "46105", "461...
## $ monthvar <fct> January, February, March, April, May, June, July, August, ...
## $ precip <dbl> 920.638, 1127.050, 2299.750, 4618.260, 7095.080, 7365.010,...
## $ geometry <POLYGON [m]> POLYGON ((40.80017 105012.2..., POLYGON ((40.80017...
As of the writing of this book, the pivot_longer()
and pivot_wider()
function have the undesirable side effect of removing the sf
class from the output object, even through the geometry data is still present. As a workaround, be sure to use st_as_sf()
function to convert the object back to an sf
class.
class(monthly_prec)
## [1] "tbl_df" "tbl" "data.frame"
st_as_sf(monthly_prec)
monthly_prec <-class(monthly_prec)
## [1] "sf" "tbl_df" "tbl" "data.frame"
Now ggplot()
can be used with the facet_wrap()
function to generate a layout with the twelve monthly maps arranged in four columns.
ggplot() +
geom_sf(data = monthly_prec, aes(fill = precip), color = "black", size = 0.25) +
scale_fill_distiller(name="Cases", palette = "Blues", breaks = pretty_breaks())+
facet_wrap(~ monthvar, ncol = 4) +
labs(title="Monthly Climatology of Precipitaton in South Dakota")
The appearance of the strips containing the facet labels can be changed using the theme()
function.
ggplot() +
geom_sf(data = monthly_prec, aes(fill = precip), color = "black", size = 0.25) +
scale_fill_distiller(name="Cases", palette = "Blues", breaks = pretty_breaks())+
facet_wrap(~ monthvar, ncol = 4) +
theme(strip.text.x = element_text(size=8, face="bold"),
strip.background = element_rect(colour="black", fill="white")) +
labs(title="Monthly Climatology of Precipitaton in South Dakota")
Another way to create a layout with multiple maps is to make multiple calls to ggplot()
and then arrange the resulting maps using the grid.arrange()
function from the gridExtra package. This approach is necessary to combine different types of charts and/or maps. This example generates a layout with two maps: a choropleth map of West Nile virus incidence and a graduated symbol map of population.
ggplot() +
map1 <- geom_sf(data = sdcounty,
aes(fill = inc2003_c1), color = "black", size = 0.25) +
scale_fill_brewer(name="Incidence", palette = "YlGn")+
theme_bw() +
labs(title="A) 2003 West Nile Virus Incidence (per 100,000)")
ggplot() +
map2 <- geom_sf(data = sdcounty, fill = "white", color = "black", size = 0.25) +
geom_sf(data = sdcntrd, aes(size = pop2003_c1), color = "blue", show.legend = "point") +
scale_size_discrete(name="Population") +
theme_bw() +
labs(title="B) 2003 County Population")
class(map1)
## [1] "gg" "ggplot"
class(map2)
## [1] "gg" "ggplot"
grid.arrange(map1, map2, ncol=1)
When the ggplot()
function is called without assigning the output to an object, the resulting chart or map is plotted to the current graphics device. However, the function output can also be assgned to create a new object of class “ggplot”. In the preceding example, ggplot()
output is assigned to create two new objects called map1
and map2
, and these objects are then provided as arguments to grid.arrange()
to generate a two-map layout.
4.4 Output to Graphics Devices
By default, maps and charts generated using ggplot()
are output to the Plots tab in the lower right-hand corner of the RStudio GUI. However, it is sometimes necessary to print maps to other graphics devices and explicitly specify the dimensions and resolution of the resulting files. This is often the case when generating graphics for publications. To output the map to a new graphics window, the windows()
function is used to open a new device with the specified size.
windows(width = 6, height=4)
+
map1 theme(plot.margin = unit(c(0.3, 0.3, 0.3, 0.3), "in"))
In this example, the windows()
function is used to open a 6 inch wide by 4 inch high graphics window. As with other types of R objects, entering the object name invokes the default print()
method and plots the chart to the current graphics device.
Alternately, the png()
function can be used to write the map directly to a PNG image. There are also bmp()
, jpeg()
, and tiff()
functions for writing to other image formats. There are various ways to specify image size and resolution. In the example below, the figure size is 6 inches wide by 4 inches high, and the resolution is 300 pixels per inch. Thus, the resulting image will be 6 x 300 = 1,800 pixels wide by 4 x 300 = 1,200 pixels high.
png(filename = "testgraph.png", width = 6, height=4, units="in", res=300)
+
map1 theme(plot.margin = unit(c(0.3, 0.3, 0.3, 0.3), "in"))
dev.off()
## png
## 2
The png()
graphics device needs to be turned off using the dev.off()
function if you want subsequent plots to revert to the default RStudio graphics window.
It is also possible to export a map directly to a pdf file using the pdf()
function.
pdf(file = "testgraph.pdf", width=8.5, height=11)
map1 + theme(plot.margin = unit(c(1,1,0.1,1), "in"))
newmap1 <- map1 + theme(plot.margin = unit(c(0.1,1,1,1), "in"))
newmap2 <-grid.arrange(newmap1, newmap2, ncol=1 )
dev.off()
## png
## 2
4.5 Practice
Experiment with different versions of the West Nile virus incidence map. Try different color ramps from the RColorBrewer package and different numbers and breakpoints for classes in the choropleth maps.
Generate mutli-panel map of mean monthly temperatures. The monthly temperature variables are t3001, t3002… t3012.