Chapter 12 Imagery and Classification Models
This chapter explores the use of satellite imagery, such as this Sentinel-2 scene (from 28 June 2021) of part of the northern Sierra Nevada, CA, to Pyramid Lake, NV, including display methods (such as the “false color” above) and imagery analysis methods. After we’ve learned how to read the data and cropped it to fit our study area in Red Clover Valley, we’ll try some machine learning classifiers to see if we can make sense of the patterns of hydric to xeric vegetation in a meadow under active restoration.
Image analysis methods are important for environmental research, and are covered much more thoroughly in the discipline of remote sensing. Some key things to learn more about are the nature of the electromagnetic spectrum and especially bands of that spectrum that are informative for land cover and especially vegetation detection, and the process of imagery classification. We’ll be using satellite imagery that includes a range of the electromagnetic spectrum from visible to short-wave infrared.
12.1 Reading and displaying Sentinel-2 imagery
We’ll work with 10 and 20 m Sentinel-2 data downloaded from Copernicus Open Access Hub (n.d.) as entire scenes from 20210603 and 20210628, providing cloud-free imagery of Red Clover Valley in the northern Sierra Nevada, where we are doing research on biogeochemical cycles and carbon sequestration from meadow restoration with The Sierra Fund (Clover Valley Ranch Restoration, the Sierra Fund (n.d.)).
Download the image zip files from…
https://sfsu.box.com/s/zx9rvbxdss03rammy7d72rb3uumfzjbf
…and extract into the folder ~/sentinel2/
– the ~ referring to your Home folder (typically Documents). (See the code below to see how one of the two images is read.) Make sure the folders stored there have names ending in .SAFE. The following are the two we’ll be using:
S2B_MSIL2A_20210603T184919_N0300_R113_T10TGK_20210603T213609.SAFE
S2A_MSIL2A_20210628T184921_N0300_R113_T10TGK_20210628T230915.SAFE
library(stringr)
library(terra)
imgFolder <- paste0("S2A_MSIL2A_20210628T184921_N0300_R113_T10TGK_20210628T230915.",
                    "SAFE\\GRANULE\\L2A_T10TGK_A031427_20210628T185628")
img20mFolder <- paste0("~/sentinel2/",imgFolder,"\\IMG_DATA\\R20m")
imgDateTime <- str_sub(imgFolder,12,26)
imgDateTime
## [1] "20210628T184921"
imgDate <- str_sub(imgDateTime,1,8)
As documented at Copernicus Open Access Hub (n.d.), Sentinel-2 imagery is collected at three resolutions, with the most bands at the coarsest (60m) resolution. The bands added at that coarsest resolution are not critical for our work as they relate to oceanographic and atmospheric research, and our focus will be on land cover and vegetation in a terrestrial area. So we’ll work with four bands at 10m and an additional six bands at 20m resolution:
10 m bands
- B02 - Blue 0.490 \(\mu\)m
- B03 - Green 0.560 \(\mu\)m
- B04 - Red 0.665 \(\mu\)m
- B08 - NIR 0.842 \(\mu\)m
20 m bands
- B02 - Blue 0.490 \(\mu\)m
- B03 - Green 0.560 \(\mu\)m
- B04 - Red 0.665 \(\mu\)m
- B05 - Red Edge 0.705 \(\mu\)m
- B06 - Red Edge 0.740 \(\mu\)m
- B07 - Red Edge 0.783 \(\mu\)m
- B8A - NIR 0.865 \(\mu\)m
- B11 - SWIR 1.610 \(\mu\)m
- B12 - SWIR 2.190 \(\mu\)m
sentinelBands <- paste0("B",c("02","03","04","05","06","07","8A","11","12"))
sentinelFiles <- paste0(img20mFolder,"/T10TGK_",imgDateTime,"_",sentinelBands,"_20m.jp2")
sentinel <- rast(sentinelFiles)
sentinelBands
## [1] "B02" "B03" "B04" "B05" "B06" "B07" "B8A" "B11" "B12"
names(sentinel) <- sentinelBands
12.1.1 Individual bands
The Sentinel-2 data are in a 12-bit format, ranging as integers from 0 to 4095, but are preprocessed to represent reflectances at the base of the atmosphere.
We’ll start by looking at 4 key bands – green, red, NIR, SWIR (B11) – and plot these in a gray scale (Figure 12.1).
par(mfrow=c(2,2))
plot(sentinel$B03, main = "Green", col=gray(0:4095 / 4095))
plot(sentinel$B04, main = "Red", col=gray(0:4095 / 4095))
plot(sentinel$B8A, main = "NIR", col=gray(0:4095 / 4095))
plot(sentinel$B11, main = "SWIR (B11)", col=gray(0:4095 / 4095))
12.1.2 Spectral subsets to create 3-band R-G-B & NIR-R-G for visualization
Displaying the imagery 3 bands at a time (as RGB on our computer screen) is always a good place to start (Figure 12.2). Two especially useful band sets are RGB itself – looking like a conventional color aerial photograph – and “false color”, which includes a band normally invisible to our eyes, such as the near infrared strongly reflected by healthy vegetation. In standard “false color”, surface-reflected near infrared is displayed as red, reflected red is displayed as green, and green is displayed as blue. Some advantages of this false color display include:
- Blue is not included at all, which helps reduce haze
- Water bodies absorb NIR, so appear very dark in the image
- Chlorophyll strongly reflects NIR, so healthy vegetation is bright red.
sentRGB <- subset(sentinel,3:1)
plotRGB(sentRGB,stretch="lin")
We can also spectrally subset to get NIR (B8A, the 7th band of the 20 m multispectral raster we’ve created). See the first figure in this chapter for the NIR-R-G image displayed with plotRGB.
sentFCC <- subset(sentinel,c(7,3,2))
12.1.3 Crop to study area extent
The Sentinel-2 scene covers a very large area, more than 100 km on each side, 12,000 km2 in area. The area we are interested in – Red Clover Valley – is enclosed in an area of about 100 km2, so we’ll crop the scene to fit this area (Figures 12.3 and 12.4).
RCVext <- ext(715680,725040,4419120,4429980)
sentRCV <- crop(sentinel,RCVext)
sentRGB <- subset(sentRCV,3:1)
plotRGB(sentRGB,stretch="lin")
NIR-R-G:
plotRGB(subset(sentRCV,c(7,3,2)),stretch="lin")
12.1.4 Saving results
Optionally, you may want to save this cropped image so you can call it up more quickly later. We’ll build a simple name with the date and time as part of it.
writeRaster(sentRCV, filename=paste0("~/sentinel2/",imgDateTime,".tif"),overwrite=T)
12.1.5 Band scatter plots
Using scatter plots helps us to understand how we might employ multiple bands in our analysis. Red vs NIR is particularly intriguing, commonly creating a triangular pattern reflecting the high absorption of NIR by water yet high reflection by healthy vegetation, which is visually green, so low in red (Figure 12.5).
pairs(c(sentRCV$B04,sentRCV$B8A), main="Red vs NIR")
12.2 Spectral profiles
A useful way to look at how our imagery responds to known vegetation/land cover is to extract values at points within the training polygons we’ll use later for supervised classification, and create a spectral profile (Figure 12.6). We start by deriving points from the polygons we created based on field observations. One approach is to use polygon centroids (with centroids()), but to get more points from larger polygons, we’ll instead sample the polygons to get a total of ~1000 points.
library(igisci)
trainPolys9 <- vect(ex("RCVimagery/train_polys9.shp"))
ptsRCV <- spatSample(trainPolys9, 1000, method="random")
# for just centroids: ptsRCV <- centroids(trainPolys9)
extRCV <- terra::extract(sentRCV, ptsRCV)
head(extRCV)
## ID B02 B03 B04 B05 B06 B07 B8A B11 B12
## 1 1 194 479 352 890 2467 2954 3312 1800 923
## 2 2 194 479 352 890 2467 2954 3312 1800 923
## 3 3 213 494 410 939 2404 2845 3231 1825 978
## 4 4 276 580 473 991 2436 2882 3247 1944 1060
## 5 5 213 494 410 939 2404 2845 3231 1825 978
## 6 6 228 500 377 909 2407 2862 3225 1780 970
specRCV <- aggregate(extRCV[,-1], list(ptsRCV$class), mean)
rownames(specRCV) <- specRCV[,1]
specRCV <- specRCV[,-1]
library(RColorBrewer)
LCcolors <- brewer.pal(length(unique(trainPolys9$class)),"Set1")
specRCV <- as.matrix(specRCV)
# Start with an empty plot
plot(0, ylim=c(0,4195), xlim=c(1,9), type='n',
     xlab="Bands", ylab="Reflectance 12 bit", xaxt='n')
axis(side=1, at=1:9, labels=sentinelBands)
# Add the classes
for (i in 1:nrow(specRCV)){
  lines(specRCV[i,], type='l', lwd=4, lty=1, col=LCcolors[i])
}
title(main="Spectral signatures Sentinel2 bands, Red Clover Valley")
legend('topleft', rownames(specRCV), cex=0.9, lwd=3, col=LCcolors) #bty='n'
One thing we can observe in the spectral signature is that some vegetation types are pretty similar; for instance ‘mesic’ and ‘mesicRush’ are very similar, as are ‘hydric’ and ‘CARNEB’, so we might want to combine them, and have fewer categories. We also have a 7-level polygon classification, so we’ll use that (later we’ll combine values in code) (Figure 12.7).
trainPolys7 <- vect(ex("RCVimagery/train_polys7.shp"))
ptsRCV <- spatSample(trainPolys7, 1000, method="random")
extRCV <- terra::extract(sentRCV, ptsRCV)
specRCV <- aggregate(extRCV[,-1], list(ptsRCV$class), mean)
rownames(specRCV) <- specRCV[,1]
specRCV <- specRCV[,-1]
LCcolors <- c("cyan","gold","darkgreen","black","blue","lawngreen","red")
specRCV <- as.matrix(specRCV)
plot(0, ylim=c(0,4195), xlim=c(1,9), type='n',
     xlab="Bands", ylab="Reflectance 12 bit", xaxt='n')
axis(side=1, at=1:9, labels=sentinelBands)
for (i in 1:nrow(specRCV)){
  lines(specRCV[i,], type='l', lwd=4, lty=1, col=LCcolors[i])
}
title(main="Spectral signatures Sentinel2 bands, 7 land cover classes", font.main=2)
legend('topleft', rownames(specRCV), cex=0.9, lwd=3, col=LCcolors)
From this spectral signature, we can see that, at least at this scale, there’s really no difference between the spectral responses of ‘hydric’ and ‘willow’. The structure of the willow copses can’t be distinguished from the herbaceous hydric sedges with 20 m pixels. Fortunately, in terms of hydrologic response, willows are also hydric plants, so we can just recode willow as hydric (Figure 12.8).
trainPolys6 <- trainPolys7
trainPolys6[trainPolys6$class=="willow"]$class <- "hydric"
ptsRCV <- spatSample(trainPolys6, 1000, method="random")
extRCV <- terra::extract(sentRCV, ptsRCV)
specRCV <- aggregate(extRCV[,-1], list(ptsRCV$class), mean)
rownames(specRCV) <- specRCV[,1]
specRCV <- specRCV[,-1]
specRCV <- as.matrix(specRCV)
plot(0, ylim=c(0,4195), xlim=c(1,9), type='n',
     xlab="Bands", ylab="Reflectance 12 bit", xaxt='n')
axis(side=1, at=1:9, labels=sentinelBands)
for (i in 1:nrow(specRCV)){
  lines(specRCV[i,], type='l', lwd=4, lty=1, col=LCcolors[i])
}
title(main="Spectral signatures, 6 land-cover classes, RCV", font.main=2)
legend('topleft', rownames(specRCV), cex=0.9, lwd=3, col=LCcolors)
12.3 Map algebra & vegetation indices
We can also do pixel-wise calculations, the map algebra method (Tomlin (1990)) used in raster GIS. We’ll use it to create vegetation indices, which are widely used as single quantitative measures that relate to phenological change, moisture, and nutrients.
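As a minimal sketch of the idea, here is map algebra on plain matrices standing in for raster bands (the values are made up); terra’s SpatRaster arithmetic behaves the same way, cell by cell, producing the normalized-difference indices we compute next.

```r
# Map algebra applies an expression to every cell at once.
# Toy 2x2 "bands" with hypothetical reflectance values:
nir <- matrix(c(3600, 2800, 500, 400), nrow = 2)
red <- matrix(c(600, 700, 450, 380), nrow = 2)
ndvi <- (nir - red) / (nir + red)  # element-wise normalized difference
round(ndvi, 2)  # high where the toy "vegetation" reflects strongly in NIR
```

With SpatRasters, the same expression returns a new raster of the same dimensions, which is exactly what the vegetation index code below does with the real bands.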
12.3.1 Vegetation indices
The venerable Normalized Difference Vegetation Index (NDVI) uses NIR (Sentinel2 20m: B8A) and RED (Sentinel2 20m: B04) in a ratio that normalizes NIR with respect to visible (usually red) (Figure 12.9).
\[ NDVI = \frac{NIR-RED}{NIR+RED} \]
ndviRCV <- (sentRCV$B8A - sentRCV$B04)/
           (sentRCV$B8A + sentRCV$B04)
plot(ndviRCV, col=rev(terrain.colors(10)), main=paste('NDVI',imgDateTime))
12.3.2 Histogram
The histogram for an NDVI is ideally above zero as this one is (Figure 12.10), with the values approaching 1 representing pixels with a lot of reflection in the near infrared, so typically healthy vegetation. Standing water can cause values to go negative, but with 20 m pixels there’s not enough water in the creeks to create a complete pixel of open water.
hist(ndviRCV, main='Normalized Difference Vegetation Index', xlab='NDVI',ylab='Frequency',
col='#6B702B',xlim=c(-1,1),breaks=40,xaxt='n') # -0.5,1
axis(side=1,at=seq(-1,1,0.2), labels=seq(-1,1,0.2))
12.3.3 Other vegetation indices
With the variety of spectral bands available on satellite imagery comes a lot of possibilities for other vegetation (and non-vegetation) indices.
One that has been used to detect moisture content in plants is the Normalized Difference Moisture Index \(NDMI = \frac{NIR-SWIR}{NIR+SWIR}\), which for Sentinel-2 20 m imagery would be \(\frac{B8A-B11}{B8A+B11}\) (Figure 12.11).
ndmiRCV <- (sentRCV$B8A - sentRCV$B11)/
           (sentRCV$B8A + sentRCV$B11)
plot(ndmiRCV, col=rev(terrain.colors(10)), main=paste('NDMI',imgDateTime))
In contrast to NDVI, the NDMI histogram extends to negative values (Figure 12.12).
hist(ndmiRCV, main='Normalized Difference Moisture Index', xlab='NDMI',ylab='Frequency',
col='#26547C',xlim=c(-1,1),breaks=40,xaxt='n')
axis(side=1,at=seq(-1,1,0.2), labels=seq(-1,1,0.2))
Another useful index, the Normalized Difference Greenness Index (NDGI), uses three bands: NIR, red, and green (Yang et al. (2019)). It was first proposed for MODIS but can be varied slightly for Sentinel-2. It has the advantage of using the more widely available 4-band imagery, such as the 10 m Sentinel-2 product, 3 m PlanetScope, or multispectral drone imagery from MicaSense, which is commonly 5 cm. We’ll stick with the 20 m Sentinel-2 product for now and produce a plot (Figure 12.13) and a histogram (Figure 12.14).
\[ NDGI=\frac{0.616GREEN+0.384NIR-RED}{0.616GREEN+0.384NIR+RED} \]
ndgiRCV <- (0.616 * sentRCV$B03 + 0.384 * sentRCV$B8A - sentRCV$B04)/
           (0.616 * sentRCV$B03 + 0.384 * sentRCV$B8A + sentRCV$B04)
plot(ndgiRCV, col=rev(terrain.colors(10)), main=paste('NDGI',imgDateTime))
hist(ndgiRCV, main='Normalized Difference Greenness Index', xlab='NDGI',ylab='Frequency',
col='#38635A',xlim=c(-1,1),breaks=40,xaxt='n')
axis(side=1,at=seq(-1,1,0.2), labels=seq(-1,1,0.2))
12.4 Unsupervised Classification with k-means
We’ll look at supervised classification next, where we’ll provide the classifier with known vegetation or land cover types, but with unsupervised classification we just provide the number of classes. We’ll use the NDGI we derived above, and start by creating a vector from the values:
vals <- values(ndgiRCV)
str(vals)
## num [1:254124, 1] 0.356 0.429 0.4 0.359 0.266 ...
## - attr(*, "dimnames")=List of 2
## ..$ : NULL
## ..$ : chr "B03"
We’ll then run the kmeans function to derive clusters:
kmc <- kmeans(na.omit(vals), centers=5, iter.max=500, nstart=5, algorithm="Lloyd")
Now we convert the vector of cluster assignments back into a raster with the same dimensions as the original image…
kimg <- ndgiRCV
values(kimg) <- kmc$cluster
… and map the unsupervised classification (Figure 12.15). For the 5 classes generated, we’d need to follow up by overlaying with known land covers to see what it’s picking up, but we’ll leave that for the supervised classification next.
library(RColorBrewer)
LCcolors <- brewer.pal(length(unique(values(kimg))),"Set1")
plot(kimg, main = 'Unsupervised classification', col = LCcolors, type="classes")
12.5 Machine Learning for Supervised Classification of Imagery
Using machine learning algorithms is one approach to imagery classification, whether that classification is object-based or pixel-based. We’re going to focus on a pixel-based approach to identify land cover from imagery or other continuous raster variables (such as elevation or elevation-derived rasters such as slope, curvature, and roughness), employing samples of those variables called training samples in a “supervised” imagery classification.
It’s useful to realize that the modeling methods we use for this type of imagery classification are really no different at the core from methods we’d use to work with continuous variables that might not even be spatial. For example, if we leave off the classification process for a moment, a machine learning algorithm might be used to predict a response result from a series of predictor variables, like predicting temperature from elevation and latitude, or acceleration might be predicted by some force acting on a body. So the first model might be used to predict the temperature of any location (within the boundary of the study area) given an elevation and latitude; or the second model might predict an acceleration given the magnitude of a force applied to the body (maybe of a given mass). A classification model varies on this by predicting a nominal variable like type of land cover; some other types of responses might be counts (using a Poisson model) or probabilities (using a logistic model).
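To make the parallel concrete, here is a minimal sketch with synthetic data (every variable name and value here is made up for illustration): a continuous-response model predicting temperature from elevation and latitude, then a classification model using the same predictors for a nominal response.

```r
library(rpart)  # CART, the classifier used later in this chapter
set.seed(1)
elev <- runif(200, 0, 3000)       # synthetic elevations (m)
lat <- runif(200, 35, 42)         # synthetic latitudes
temp <- 30 - 0.0065*elev - 0.8*(lat - 35) + rnorm(200)  # made-up response

# Continuous response: predict temperature anywhere from elevation & latitude
tempModel <- lm(temp ~ elev + lat)
predict(tempModel, data.frame(elev = 1500, lat = 40))

# Nominal response: same predictors, but predicting a (hypothetical) class
siteType <- factor(ifelse(temp > 15, "warm", "cool"))
siteModel <- rpart(siteType ~ elev + lat, method = "class")
predict(siteModel, data.frame(elev = 1500, lat = 40), type = "class")
```

The only structural difference is the type of response; the supervised imagery classification below is the second case, with spectral bands as the predictors.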
The imagery classification approach adds to this model an input preparation process and an output prediction process:
- A training set of points and polygons is created that represents areas of known classification, such as land cover like forest or wetland, and is used to identify values of the predictor variables (e.g. imagery bands)
- A predicted raster is created using the model applied to the original rasters.
12.6 Supervised Classification Using Classification and Regression Trees (CART)
We’ll use a machine learning model called Classification and Regression Trees (CART), one of the simplest to understand in terms of the results it produces, which include a display of the resulting classification decision “tree”. We have collected field observations of vegetation/land cover types in our meadow, so we can see if we can use these to train a classification model. You’ll recognize our vegetation types (Figure 12.16) from the spectral signatures above. (We’ll start by reading in the data again so this section doesn’t depend on the above code.)
12.6.1 Read imagery and training data and extract sample values for training
library(terra); library(stringr)
imgFolder <- paste0("S2A_MSIL2A_20210628T184921_N0300_R113_T10TGK_20210628T230915.",
                    "SAFE\\GRANULE\\L2A_T10TGK_A031427_20210628T185628")
img20mFolder <- paste0("~/sentinel2/",imgFolder,"\\IMG_DATA\\R20m")
imgDateTime <- str_sub(imgFolder,12,26)
sentinelBands <- paste0("B",c("02","03","04","05","06","07","8A","11","12"))
sentinelFiles <- paste0(img20mFolder,"/T10TGK_",imgDateTime,"_",sentinelBands,"_20m.jp2")
sentinel <- rast(sentinelFiles)
names(sentinel) <- sentinelBands
RCVext <- ext(715680,725040,4419120,4429980)
sentRCV <- crop(sentinel,RCVext)
trainPolys <- vect(ex("RCVimagery/train_polys7.shp"))
trainPolys[trainPolys$class=="willow"]$class <- "hydric"
ptsRCV <- spatSample(trainPolys, 1000, method="random")
plot(ptsRCV, "class")
LCclass <- c("forest","hydric","mesic","rocky","water","xeric")
classdf <- data.frame(value=1:length(LCclass),names=LCclass)
extRCV <- extract(sentRCV, ptsRCV)[,-1] # REMOVE ID
sampdataRCV <- data.frame(class = ptsRCV$class, extRCV)
12.6.2 Training the CART model
The Recursive Partitioning and Regression Trees package rpart fits the model.
library(rpart)
cartmodel <- rpart(as.factor(class)~., data = sampdataRCV, method = 'class', minsplit = 5)
Now we can display the decision tree using rpart.plot (Figure 12.17). This plot clearly shows where each variable (in our case imagery spectral bands) contributes to the classification, and together with the spectral signature analysis above, helps us evaluate how the process works and possibly how to improve our training data where distinction among classes is unclear.
library(rpart.plot)
rpart.plot(cartmodel, fallen.leaves=F)
12.6.3 Prediction using the CART model
Then we can use the model to predict the probabilities of each vegetation class for each pixel (Figure 12.18).
CARTpred <- predict(sentRCV, cartmodel, na.rm = TRUE)
CARTpred
## class : SpatRaster
## dimensions : 543, 468, 6 (nrow, ncol, nlyr)
## resolution : 20, 20 (x, y)
## extent : 715680, 725040, 4419120, 4429980 (xmin, xmax, ymin, ymax)
## coord. ref. : WGS 84 / UTM zone 10N (EPSG:32610)
## source : memory
## names : forest, hydric, mesic, rocky, water, xeric
## min values : 0, 0, 0.000000, 0, 0, 0.00
## max values : 1, 1, 0.990991, 1, 1, 0.96
plot(CARTpred)
Note the ranges of values, each extending to values approaching 1.0. If we had used the 9-category training set described earlier, we would find some categories with very low scores, such that they will never dominate a pixel. We can be sure from the above that we’ll get all categories used in the final output, and this will avoid potential problems in validation.
Now we’ll make a single SpatRaster showing the vegetation/land cover with the highest probability (Figure 12.19).
LC20m <- which.max(CARTpred)
LC20m
## class : SpatRaster
## dimensions : 543, 468, 1 (nrow, ncol, nlyr)
## resolution : 20, 20 (x, y)
## extent : 715680, 725040, 4419120, 4429980 (xmin, xmax, ymin, ymax)
## coord. ref. : WGS 84 / UTM zone 10N (EPSG:32610)
## source : memory
## name : which.max
## min value : 1
## max value : 6
cls <- names(CARTpred)
df <- data.frame(id = 1:length(cls), class=cls)
levels(LC20m) <- df
LC20m
## class : SpatRaster
## dimensions : 543, 468, 1 (nrow, ncol, nlyr)
## resolution : 20, 20 (x, y)
## extent : 715680, 725040, 4419120, 4429980 (xmin, xmax, ymin, ymax)
## coord. ref. : WGS 84 / UTM zone 10N (EPSG:32610)
## source : memory
## categories : class
## name : class
## min value : forest
## max value : xeric
LCcolors <- c("green4","cyan","gold","grey90","mediumblue","red")
plot(LC20m, col=LCcolors)
12.6.4 Validating the model
An important part of the imagery classification process is validation, where we look at how well the model works. The way this is done is pretty easy to understand, and requires having testing data in addition to the training data mentioned above. Testing data can be created in a variety of ways, commonly through field observations but also with finer-resolution imagery like drone imagery. Since these observations come from the same sources as our training data, we often select some for training and a separate set for testing.
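A minimal sketch of that split, using a small synthetic data frame as a stand-in for our extracted samples (the column names mimic ours, but the values are random):

```r
set.seed(42)
# Synthetic stand-in for extracted training samples (made-up values)
samp <- data.frame(class = sample(c("hydric","mesic","xeric"), 300, replace = TRUE),
                   B04 = runif(300), B8A = runif(300))
trainIdx <- sample(nrow(samp), size = round(0.8 * nrow(samp)))
train <- samp[trainIdx, ]    # 80% used to fit the model
test <- samp[-trainIdx, ]    # 20% held back to check predictions
c(nrow(train), nrow(test))
```

Cross-validation, which we use below, repeats this idea k times so that every observation gets a turn in the test set.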
12.6.4.1 The “overfit model” problem
It’s important to realize that the accuracy we determine is only based on the training and testing data. The accuracy of the prediction of the classification elsewhere will likely be somewhat less than this, and if this is substantial our model is “overfit”. We don’t actually know how overfit a model truly is because that depends on how likely the conditions seen in our training and testing data also occur throughout the rest of the image; if those conditions are common, just not sampled, then the model might actually be pretty well fit.
In thinking about the concept of overfit models and selecting our training (and testing) sets, it’s useful to consider the purpose of our classification and how important it is that our predictions are absolutely reliable. In choosing training sets, accuracy is also important, so we will want to make sure that they are good representatives of the land cover type (assuming that’s our response variable). While some land covers are pretty clear (like streets or buildings), there’s a lot of fuzziness in the world: you might be trying to identify wetland conditions based on the type of vegetation growing, but in a meadow you can commonly find wetland species mixed in with more mesic species – so to pick a reliable wetland sample we might want to only pick areas with only wetland species (and this can get tricky given the many “obligate wetland” and “facultative wetland” species). The CART model applied a probability model for each response value, which we then used to derive a single prediction based on the maximum probability.
12.6.4.2 Cross Validation : 20 m Sentinel CART model
In cross-validation, we use one set of data and run the model multiple times, each time validating with a part of the data not used for training the model. In k-fold cross-validation, k represents the number of groups and the number of models. The k value can be up to the number of observations, but you do need to consider processing time, and you may not get much more reliable assessments from using all of the observations. We’ll use 5 folds.
The method we’ll use to cross-validate the model and derive various accuracy metrics is based on code provided in the “Remote Sensing with terra” section of https://rspatial.org by Aniruddha Ghosh and Robert J. Hijmans (Hijmans (n.d.)).
set.seed(42)
k <- 5 # number of folds
j <- sample(rep(1:k, each = round(nrow(sampdataRCV))/k))
table(j)
## j
## 1 2 3 4 5
## 71 71 71 71 71
x <- list()
for (k in 1:5) {
  train <- sampdataRCV[j != k, ]
  test <- sampdataRCV[j == k, ]
  cart <- rpart(as.factor(class)~., data=train, method = 'class',
                minsplit = 5)
  pclass <- predict(cart, test, na.rm = TRUE)
  # assign class to maximum probability
  pclass <- apply(pclass, 1, which.max)
  # create a data.frame using the reference and prediction
  x[[k]] <- cbind(test$class, as.integer(pclass))
}
12.6.4.3 Accuracy and Error Metrics
Now that we have our test data with observed and predicted values, we can derive accuracy metrics. Many metrics have been developed to assess model classification accuracy and error. We can look at accuracy either in terms of how well a model works or what kinds of errors it has; these are complements. We’ll look at Producer’s and User’s accuracies (and their complements, omission and commission errors) below, and we’ll start by building a confusion matrix which will allow us to derive the various metrics.
Confusion Matrix One way to look at (and in the process, derive) these metrics is to build a confusion matrix, which puts the observed reference data and predicted classified data in rows and columns. These can be set up either way, but in both cases the diagonal shows those correctly classified, with the row and column names from the same list of land cover classes that we’ve provided to train the model.
y <- do.call(rbind, x)
y <- data.frame(y)
colnames(y) <- c('observed', 'predicted')
conmat <- table(y) # confusion matrix, a contingency table
# change the name of the classes
colnames(conmat) <- LCclass # sort(unique(sampdataRCV$class))
rownames(conmat) <- LCclass
print(conmat)
## predicted
## observed forest hydric mesic rocky water xeric
## forest 87 0 0 0 1 0
## hydric 0 32 0 0 0 0
## mesic 0 0 107 0 0 6
## rocky 0 0 0 7 0 1
## water 1 0 0 0 20 0
## xeric 0 0 2 0 0 95
Overall Accuracy The overall accuracy is simply the overall ratio of the number correctly classified divided by the total, so the sum of the diagonal divided by n.
n <- sum(conmat) # number of total cases
diag <- diag(conmat) # number of correctly classified cases per class -- on the diagonal
OA <- sum(diag) / n # overall accuracy
OA
## [1] 0.9693593
kappa Another overall measure of accuracy is Cohen’s kappa statistic (Cohen (1960)), which measures inter-rater reliability for categorical data. It’s somewhat similar to the goal of Chi square in that it compares expected vs. observed conditions, only Chi square is limited to two cases or Boolean data.
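In formula terms, with \(p_o\) the observed (overall) accuracy and \(p_e\) the accuracy expected by chance, computed from the row (observed) proportions \(p_i\) and column (predicted) proportions \(q_i\) of the confusion matrix:

\[ \kappa = \frac{p_o - p_e}{1 - p_e}, \qquad p_e = \sum_{i} p_i q_i \]

The code below computes exactly these quantities, with OA as \(p_o\) and expAccuracy as \(p_e\).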
rowsums <- apply(conmat, 1, sum)
p <- rowsums / n # observed (true) cases per class
colsums <- apply(conmat, 2, sum)
q <- colsums / n # predicted cases per class
expAccuracy <- sum(p*q)
kappa <- (OA - expAccuracy) / (1 - expAccuracy)
kappa
## [1] 0.9594579
Table of Producer’s and User’s accuracies
We can look at accuracy either in terms of how well a model works or what kinds of errors it has; these are complements. Both of these allow us to look at accuracy of classification for each class.
- The so-called “Producer’s Accuracy” refers to how often real conditions on the ground are correctly displayed in the classification, so is from the perspective of the map producer, and is the complement of errors of omission.
- The so-called “User’s Accuracy” refers to how often the class shown on the map will actually be found on the ground, so is from the perspective of the map user (though this is a bit confusing), and is the complement of errors of commission.
PA <- diag / colsums # Producer accuracy
UA <- diag / rowsums # User accuracy
outAcc <- data.frame(producerAccuracy = PA, userAccuracy = UA)
outAcc
## producerAccuracy userAccuracy
## forest 0.9886364 0.9886364
## hydric 1.0000000 1.0000000
## mesic 0.9816514 0.9469027
## rocky 1.0000000 0.8750000
## water 0.9523810 0.9523810
## xeric 0.9313725 0.9793814
12.7 Classifying with 10 m Sentinel 2 imagery
The 10 m product of Sentinel 2 has far fewer bands (four), but the finer resolution improves its visual precision. The four bands are the most common baseline set used in multispectral imagery: blue (B02, 490 nm), green (B03, 560 nm), red (B04, 665 nm), and NIR (B08, 842 nm). From these we can create color and NIR-R-G images, and the vegetation indices NDVI and NDGI, among others.
For the 10 m product, we won’t repeat many of the exploration steps we looked at above, such as the histograms, but go right to training and classification using CART.
imgFolder <- paste0("S2A_MSIL2A_20210628T184921_N0300_R113_T10TGK_20210628T230915.",
                    "SAFE\\GRANULE\\L2A_T10TGK_A031427_20210628T185628")
img10mFolder <- paste0("~/sentinel2/",imgFolder,"\\IMG_DATA\\R10m")
imgDateTime <- str_sub(imgFolder,12,26)
imgDateTime
## [1] "20210628T184921"
imgDate <- str_sub(imgDateTime,1,8)
12.7.1 Subset bands (10 m)
The 10 m Sentinel-2 data include only four bands: blue (B02), green (B03), red (B04), and NIR (B08).
sentinelBands <- paste0("B",c("02","03","04","08"))
sentinelFiles <- paste0(img10mFolder,"/T10TGK_",imgDateTime,"_",sentinelBands,"_10m.jp2")
sentinel <- rast(sentinelFiles)
names(sentinel) <- sentinelBands
12.7.2 Crop to RCV extent and extract pixel values
This is similar to what we did for the 20 m data.
RCVext <- ext(715680,725040,4419120,4429980)
sentRCV <- crop(sentinel,RCVext)
extRCV <- extract(sentRCV, ptsRCV)[,-1] # REMOVE ID
sampdataRCV <- data.frame(class = ptsRCV$class, extRCV)
12.7.3 Training the CART model (10 m) and plot the tree
library(rpart); library(rpart.plot)
cartmodel <- rpart(as.factor(class)~., data = sampdataRCV, method = 'class', minsplit = 5)
rpart.plot(cartmodel, fallen.leaves=F)
12.7.4 Prediction using the CART model (10 m)
As with the 20 m prediction, we’ll start by producing a composite of all class probabilities by pixel (Figure 12.20)…
CARTpred <- predict(sentRCV, cartmodel, na.rm = TRUE)
plot(CARTpred)
LC10m <- which.max(CARTpred)
cls <- names(CARTpred)
df <- data.frame(id = 1:length(cls), class=cls)
levels(LC10m) <- df
… and then map the class with the highest probability, for each pixel (Figure 12.21).
plot(LC10m, col=LCcolors)
12.7.4.1 Cross Validation : 10 m Sentinel CART model
set.seed(42)
k <- 5 # number of folds
j <- sample(rep(1:k, each = round(nrow(sampdataRCV))/k))
x <- list()
for (k in 1:5) {
  train <- sampdataRCV[j != k, ]
  test <- sampdataRCV[j == k, ]
  cart <- rpart(as.factor(class)~., data=train, method = 'class',
                minsplit = 5)
  pclass <- predict(cart, test, na.rm = TRUE)
  # assign class to maximum probability
  pclass <- apply(pclass, 1, which.max)
  # create a data.frame using the reference and prediction
  x[[k]] <- cbind(test$class, as.integer(pclass))
}
12.7.4.2 Accuracy metrics : 10 m Sentinel CART model
As with the 20 m model, we’ll look at the confusion matrix and the various accuracy metrics.
Confusion Matrix (10 m)
y <- do.call(rbind, x)
y <- data.frame(y)
colnames(y) <- c('observed', 'predicted')
conmat <- table(y) # confusion matrix
# change the name of the classes
colnames(conmat) <- LCclass # sort(unique(sampdataRCV$class))
rownames(conmat) <- LCclass # sort(unique(sampdataRCV$class))
print(conmat)
## predicted
## observed forest hydric mesic rocky water xeric
## forest 87 0 0 0 0 1
## hydric 0 30 0 0 0 2
## mesic 0 0 109 0 0 4
## rocky 0 0 1 5 0 2
## water 1 0 0 0 20 0
## xeric 0 0 0 0 0 97
Overall Accuracy (10 m)
n <- sum(conmat) # number of total cases
diag <- diag(conmat) # number of correctly classified cases per class -- on the diagonal
OA <- sum(diag) / n # overall accuracy
OA
## [1] 0.9693593
kappa (10 m)
rowsums <- apply(conmat, 1, sum)
p <- rowsums / n # observed (true) cases per class
colsums <- apply(conmat, 2, sum)
q <- colsums / n # predicted cases per class
expAccuracy <- sum(p*q)
kappa <- (OA - expAccuracy) / (1 - expAccuracy)
kappa
## [1] 0.9592908
User and Producer accuracy (10 m)
PA <- diag / colsums # Producer accuracy
UA <- diag / rowsums # User accuracy
outAcc <- data.frame(producerAccuracy = PA, userAccuracy = UA)
outAcc
## producerAccuracy userAccuracy
## forest 0.9886364 0.9886364
## hydric 1.0000000 0.9375000
## mesic 0.9909091 0.9646018
## rocky 1.0000000 0.6250000
## water 1.0000000 0.9523810
## xeric 0.9150943 1.0000000
12.8 Using two images to capture spring green-up and summer senescence.
Most plants change their spectral response seasonally, a process referred to as phenology. In these montane meadows, phenological changes start with snowmelt, followed by “green-up”, then senescence. 2021 was a particularly dry year, so the period from green-up to senescence was short. We’ll use a 3 June image as green-up, then a 28 June image as senescence.
Spring
library(terra); library(stringr)
imgFolderSpring <- paste0("S2B_MSIL2A_20210603T184919_N0300_R113_T10TGK_20210603T213609.",
                          "SAFE\\GRANULE\\L2A_T10TGK_A022161_20210603T185928")
img10mFolderSpring <- paste0("~/sentinel2/",imgFolderSpring,"\\IMG_DATA\\R10m")
imgDateTimeSpring <- str_sub(imgFolderSpring,12,26)
Summer
imgFolderSummer <- paste0("S2A_MSIL2A_20210628T184921_N0300_R113_T10TGK_20210628T230915.",
                          "SAFE\\GRANULE\\L2A_T10TGK_A031427_20210628T185628")
img10mFolderSummer <- paste0("~/sentinel2/",imgFolderSummer,"\\IMG_DATA\\R10m")
imgDateTimeSummer <- str_sub(imgFolderSummer,12,26)
sentinelBands <- paste0("B",c("02","03","04","08"))
sentinelFilesSpring <- paste0(img10mFolderSpring,"/T10TGK_",
                              imgDateTimeSpring,"_",sentinelBands,"_10m.jp2")
sentinelSpring <- rast(sentinelFilesSpring)
names(sentinelSpring) <- paste0("spring",sentinelBands)
sentinelFilesSummer <- paste0(img10mFolderSummer,"/T10TGK_",
                              imgDateTimeSummer,"_",sentinelBands,"_10m.jp2")
sentinelSummer <- rast(sentinelFilesSummer)
names(sentinelSummer) <- paste0("summer",sentinelBands)
RCVext <- ext(715680,725040,4419120,4429980)
sentRCVspring <- crop(sentinelSpring,RCVext)
sentRCVsummer <- crop(sentinelSummer,RCVext)
blueSp  <- sentRCVspring$springB02
greenSp <- sentRCVspring$springB03
redSp   <- sentRCVspring$springB04
nirSp   <- sentRCVspring$springB08
ndgiSp  <- (0.616 * greenSp + 0.384 * nirSp - redSp)/
           (0.616 * greenSp + 0.384 * nirSp + redSp)
blueSu  <- sentRCVsummer$summerB02
greenSu <- sentRCVsummer$summerB03
redSu   <- sentRCVsummer$summerB04
nirSu   <- sentRCVsummer$summerB08
ndgiSu  <- (0.616 * greenSu + 0.384 * nirSu - redSu)/
           (0.616 * greenSu + 0.384 * nirSu + redSu)
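For comparison, the more familiar NDVI can be built from the same red and NIR bands; the following is a sketch using the band objects just created:

```r
# NDVI = (NIR - red)/(NIR + red), for comparison with NDGI
ndviSp <- (nirSp - redSp) / (nirSp + redSp)
ndviSu <- (nirSu - redSu) / (nirSu + redSu)
plot(c(ndviSp, ndviSu))  # green-up vs. senescence, side by side
```

NDGI's weighted green+NIR numerator is designed to better separate green vegetation from snow and standing water, which is useful in these meadows, but NDVI remains the more widely reported index.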
12.8.1 Create a 10-band stack from both images
Since we have two imagery dates and also a vegetation index (NDGI) for each, we have a total of 10 variables to work with. So we’ll put these together as a multi-band stack simply using c()
with the various bands.
sentRCV <- c(blueSp,greenSp,redSp,nirSp,ndgiSp,blueSu,greenSu,redSu,nirSu,ndgiSu)
names(sentRCV) <- c("blueSp","greenSp","redSp","nirSp","ndgiSp",
                    "blueSu","greenSu","redSu","nirSu","ndgiSu")
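A quick sanity check that the stack was assembled as intended (a sketch; layers 3, 2, and 1 are the spring red, green, and blue bands):

```r
nlyr(sentRCV)   # should report 10 layers
names(sentRCV)  # confirm the band naming
# Spring true-color display from the stack, with a linear stretch
plotRGB(sentRCV, r=3, g=2, b=1, stretch="lin")
```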
12.8.2 Extract the training data (10 m spring + summer)
We’ll repeat some of the steps where we read in the training data and set up the land cover classes, reducing the 7 classes to 6, then extract the training data from our new 10-band image stack.
trainPolys <- vect(ex("RCVimagery/train_polys7.shp"))
ptsRCV <- spatSample(trainPolys, 1000, method="random")
LCclass <- c("forest","hydric","mesic","rocky","water","willow","xeric")
classdf <- data.frame(value=1:length(LCclass),names=LCclass)
trainPolys <- vect(ex("RCVimagery/train_polys7.shp"))
trainPolys[trainPolys$class=="willow"]$class <- "hydric"
ptsRCV <- spatSample(trainPolys, 1000, method="random")
LCclass <- c("forest","hydric","mesic","rocky","water","xeric")
classdf <- data.frame(value=1:length(LCclass),names=LCclass)
extRCV <- terra::extract(sentRCV, ptsRCV)[,-1] # REMOVE ID
sampdataRCV <- data.frame(class = ptsRCV$class, extRCV)
12.8.3 CART model and prediction (10 m spring + summer)
library(rpart)
# Train the model
cartmodel <- rpart(as.factor(class)~., data = sampdataRCV, method = 'class', minsplit = 5)
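Before plotting the tree, rpart's complexity table can show how the tree grew and where it might be pruned; a brief optional sketch:

```r
# Complexity parameter (cp) table: cross-validated error at each split
printcp(cartmodel)
# A pruned alternative could then be made with prune(); the cp value
# here is purely illustrative -- pick one from the printcp() table.
# cartPruned <- prune(cartmodel, cp = 0.01)
```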
The CART model tree for the 2-season 10 m example (Figure 12.22) interestingly doesn’t end up with anything in the “rocky” pixels…
library(rpart.plot)
rpart.plot(cartmodel, fallen.leaves=F)
… and we can see in the probability map set that all “rocky” pixel probabilities are very low, below 0.03 (Figure 12.23) …
CARTpred <- predict(sentRCV, cartmodel, na.rm = TRUE)
plot(CARTpred)
LCpheno <- which.max(CARTpred)
cls <- names(CARTpred)
df <- data.frame(id = 1:length(cls), class=cls)
levels(LCpheno) <- df
… and thus no “rocky” classes are assigned (Figure 12.24).
LCcolors <- c("green4","cyan","gold","grey90","mediumblue","red")
plot(LCpheno, col=LCcolors)
We might compare this plot with the two previous – the 20 m (Figure 12.25) and 10 m (Figure 12.26) classifications from 28 June.
plot(LC20m, col=LCcolors)
plot(LC10m, col=LCcolors)
12.8.3.1 Cross Validation : CART model for 10 m spring + summer
set.seed(42)
k <- 5 # number of folds
j <- sample(rep(1:k, each = round(nrow(sampdataRCV)/k)))
# table(j)
x <- list()
for (k in 1:5) {
  train <- sampdataRCV[j != k, ]
  test <- sampdataRCV[j == k, ]
  cart <- rpart(as.factor(class)~., data=train, method = 'class',
                minsplit = 5)
  pclass <- predict(cart, test, na.rm = TRUE)
  # assign class to maximum probability
  pclass <- apply(pclass, 1, which.max)
  # create a data.frame using the reference and prediction
  x[[k]] <- cbind(test$class, as.integer(pclass))
}
12.8.3.2 Accuracy metrics : CART model for 10 m spring + summer
Finally, as we did above, we’ll look at the various accuracy metrics for the classification using the two-image (peak-growth spring and senescent summer) source.
Confusion Matrix
y <- do.call(rbind, x)
y <- data.frame(y)
colnames(y) <- c('observed', 'predicted')
conmat <- table(y) # confusion matrix
# change the name of the classes
colnames(conmat) <- LCclass # sort(unique(sampdataRCV$class))
rownames(conmat) <- LCclass # sort(unique(sampdataRCV$class))
print(conmat)
## predicted
## observed forest hydric mesic rocky water xeric
## forest 96 2 0 0 1 0
## hydric 2 53 0 0 1 0
## mesic 0 0 89 1 0 1
## rocky 0 0 2 0 0 2
## water 1 1 0 0 22 0
## xeric 0 0 0 0 0 99
Overall Accuracy
n <- sum(conmat) # number of total cases
diag <- diag(conmat) # number of correctly classified cases per class -- on the diagonal
OA <- sum(diag) / n # overall accuracy
OA
## [1] 0.9624665
kappa
rowsums <- apply(conmat, 1, sum)
p <- rowsums / n # observed (true) cases per class
colsums <- apply(conmat, 2, sum)
q <- colsums / n # predicted cases per class
expAccuracy <- sum(p*q)
kappa <- (OA - expAccuracy) / (1 - expAccuracy)
kappa
## [1] 0.9513023
User and Producer accuracy
PA <- diag / colsums # Producer accuracy
UA <- diag / rowsums # User accuracy
outAcc <- data.frame(producerAccuracy = PA, userAccuracy = UA)
outAcc
## producerAccuracy userAccuracy
## forest 0.9696970 0.9696970
## hydric 0.9464286 0.9464286
## mesic 0.9780220 0.9780220
## rocky 0.0000000 0.0000000
## water 0.9166667 0.9166667
## xeric 0.9705882 1.0000000
12.9 Conclusions and Next Steps for Imagery Classification
In this chapter, we’ve seen that we can effectively classify imagery (at least produce convincing maps!), but you can probably sense from the varying results that classification isn’t always as consistent as we might expect. We saw something similar with interpolation models; you can get very different results with different methods, and so you need to consider what method and settings might be most appropriate, based on the nature of your data – its scale and density, for instance – and the goals of your research.
The next steps might be to compare some of these classifications with our field data (basically go back to the training set), but I would be tempted to do that in ArcGIS, ERDAS, or maybe QGIS after writing out our classification rasters to geotiffs. We could then also take advantage of the better cartographic capabilities of GIS and Remote Sensing software when working with multi-band imagery.
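The write-out step might look like the following sketch (the file names are hypothetical; terra's writeRaster() infers the GeoTIFF format from the .tif extension):

```r
library(terra)
# Hypothetical output names -- adjust the paths to your project folder
writeRaster(LCpheno, "RCV_LCpheno.tif", overwrite = TRUE)
writeRaster(LC10m,   "RCV_LC10m.tif",   overwrite = TRUE)
writeRaster(LC20m,   "RCV_LC20m.tif",   overwrite = TRUE)
```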
We’ve really only scratched the surface of remote sensing methods for imagery classification, and as noted above, readers are encouraged to enroll in remote sensing classes at their local university to learn more of the theory and relevant technology. For instance, why did we look at Sentinel-2, and not Landsat, PlanetScope, ASTER, SPOT, … (the list goes on)? There are lots of satellites out there, more all the time, and understanding the spectral-sensing capabilities of their sensors, and the spatial and temporal properties of their missions, is essential for making the right choice for your research.
There’s also the consideration of relative cost, both for the imagery itself and for your time in processing it. We chose Sentinel-2 because it’s free and served the purpose well, since the 10- and 20-m resolution worked at the scale of the study area in question. It also actually classifies more readily than finer resolution imagery. But there’s a lot more to that decision, and we also employ 1- and 5-cm pixel imagery from drone surveys of the same study area, but for a different purpose (erosion studies) and at much lower image capture frequency, given the significant field time required.
Finally, we focused on pixel classifiers. One advantage of these is that they’re reasonably similar to general classification models, whether using machine learning or not. But there are other imagery classification methods developed in the remote-sensing as well as medical-imaging communities, such as object-based classifiers which use vector geometry to aid in the classification of features. As with so much else in remote sensing methods, we’ll leave those and other methods for you to explore.
Exercises
The exercises for this chapter are to apply some of the same methods above to your own downloaded Sentinel-2 scene. We won’t use training sets for spectral signatures or supervised classification, but that would be the next step once you’ve collected training data, either in the field, from high-resolution imagery, or from a combination of sources.
Exercise 12.1 Download another Sentinel-2 image from Copernicus (you’ll need to create a free account), and use it to create RGB and NIR-R-G image displays of the entire imagery scene.
Exercise 12.2 From your imagery, find coordinates to crop with for an area of interest, then as we’ve done for Red Clover Valley, create clipped RGB and NIR-R-G displays.
Exercise 12.3 Create and display NDVI and NDGI indices from your cropped image.
Exercise 12.4 From the same cropped image, create a k-means unsupervised classification, with your choice of number of classes.
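As a starting point for the k-means exercise, here's a sketch run on the Red Clover Valley stack (substitute your own cropped image, and note that the number of centers, 6 here, is your choice):

```r
library(terra)
v <- values(sentRCV)          # cell values as a matrix, one row per cell
ok <- complete.cases(v)       # kmeans() can't handle NA cells
km <- kmeans(v[ok, ], centers = 6, iter.max = 50, nstart = 5)
cl <- rep(NA_integer_, nrow(v))
cl[ok] <- km$cluster          # put cluster ids back in cell order
kmRast <- setValues(rast(sentRCV, nlyrs = 1), cl)
plot(kmRast)
```

Unlike the CART classification, the cluster numbers carry no land cover meaning until you interpret them against imagery or field knowledge.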