3 Imagery Processing & Classification


Figure 3.1: Sentinel-2 NIR-R-G image of part of the northern Sierra Nevada and Honey Lake, CA, and Pyramid Lake, NV, 20210628

This chapter explores the use of satellite imagery, including display and analysis methods and machine-learning classification of vegetation and other land cover. These methods are important for environmental analysis, and are covered much more thoroughly in the discipline of remote sensing. Key things to learn more about are the nature of the electromagnetic spectrum, especially the bands that are informative for detecting land cover and vegetation, and the process of imagery classification. We'll be using satellite imagery that spans the electromagnetic spectrum from visible to short-wave infrared.

3.1 Reading and displaying Sentinel-2 imagery

We’ll work with 10 and 20 m Sentinel-2 data downloaded from Copernicus Open Access Hub (n.d.) as entire scenes from 20210603 and 20210628, providing cloud-free imagery of Red Clover Valley in the northern Sierra Nevada, where we are doing research on biogeochemical cycles and carbon sequestration from meadow restoration with The Sierra Fund (Clover Valley Ranch Restoration, the Sierra Fund (n.d.)).

Download image zip files from

https://sfsu.box.com/s/zx9rvbxdss03rammy7d72rb3uumfzjbf

and extract into the folder ~/sentinel2/ – the ~ referring to your Home folder (typically Documents). (The code below shows how one of the two images is read.) Make sure the folders stored there have names ending in .SAFE. The following are the two we’ll be using:

  • S2B_MSIL2A_20210603T184919_N0300_R113_T10TGK_20210603T213609.SAFE
  • S2A_MSIL2A_20210628T184921_N0300_R113_T10TGK_20210628T230915.SAFE
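
A quick way to confirm the extraction worked (a sketch; adjust the path if your Home folder differs):

list.files("~/sentinel2", pattern="\\.SAFE$") # should list the two .SAFE folders above
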
library(stringr)
library(terra)
imgFolder <- "S2A_MSIL2A_20210628T184921_N0300_R113_T10TGK_20210628T230915.SAFE\\GRANULE\\L2A_T10TGK_A031427_20210628T185628"
img20mFolder <- paste0("~/sentinel2/",imgFolder,"\\IMG_DATA\\R20m")
imgDateTime <- str_sub(imgFolder,12,26) # extract the date-time stamp from the folder name
imgDateTime
## [1] "20210628T184921"
imgDate <- str_sub(imgDateTime,1,8) # just the date portion

As documented at Copernicus Open Access Hub (n.d.), Sentinel-2 imagery is collected at three resolutions, with the most bands at the coarsest (60 m) resolution. The bands added at that coarsest resolution are not critical for our work, as they relate to oceanographic and atmospheric research, and our focus is on land cover and vegetation in a terrestrial area. So we’ll work with four bands at 10 m and an additional six bands at 20 m resolution:

10 m bands

  • B02 - Blue 0.490 \(\mu\)m
  • B03 - Green 0.560 \(\mu\)m
  • B04 - Red 0.665 \(\mu\)m
  • B08 - NIR 0.842 \(\mu\)m

20 m bands

  • B02 - Blue 0.490 \(\mu\)m
  • B03 - Green 0.560 \(\mu\)m
  • B04 - Red 0.665 \(\mu\)m
  • B05 - Red Edge 0.705 \(\mu\)m
  • B06 - Red Edge 0.740 \(\mu\)m
  • B07 - Red Edge 0.783 \(\mu\)m
  • B8A - NIR 0.865 \(\mu\)m
  • B11 - SWIR 1.610 \(\mu\)m
  • B12 - SWIR 2.190 \(\mu\)m
sentinelBands <- paste0("B",c("02","03","04","05","06","07","8A","11","12"))
sentinelFiles <- paste0(img20mFolder,"/T10TGK_",imgDateTime,"_",sentinelBands,"_20m.jp2")
sentinel <- rast(sentinelFiles) # read all nine 20 m bands as one multilayer SpatRaster
sentinelBands
## [1] "B02" "B03" "B04" "B05" "B06" "B07" "B8A" "B11" "B12"
names(sentinel) <- sentinelBands

3.1.1 Individual bands

The Sentinel-2 data are in a 12-bit format, ranging as integers from 0 to 4095, and are preprocessed to represent reflectances at the bottom of the atmosphere.
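
Given that 12-bit range, a quick check of the stored values (a sketch using terra's minmax()):

minmax(sentinel$B04) # smallest and largest stored values in the red band
# reflB04 <- sentinel$B04 / 4095 # would rescale to 0-1 reflectance under this convention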

We’ll start by looking at four key bands – green, red, NIR, and SWIR (B11) – and plot them in gray scale.

par(mfrow=c(2,2))
plot(sentinel$B03, main = "Green", col=gray(0:4095 / 4095))
plot(sentinel$B04, main = "Red", col=gray(0:4095 / 4095))
plot(sentinel$B8A, main = "NIR", col=gray(0:4095 / 4095))
plot(sentinel$B11, main = "SWIR (B11)", col=gray(0:4095 / 4095))

Figure 3.2: Four bands of a Sentinel-2 scene from 20210628.

3.1.2 Spectral subsets to create 3-band R-G-B & NIR-R-G for visualization

Displaying the imagery 3 bands at a time (displayed as RGB on our computer screen) is always a good place to start. Two especially useful band sets are RGB itself – looking like conventional color aerial photography – and “false color”, which includes a band normally invisible to our eyes, such as near infrared, strongly reflected by healthy vegetation. In standard false color, surface-reflected near infrared is displayed as red, reflected red is displayed as green, and green is displayed as blue. Some advantages of this false-color display include:

  • Blue is not included at all, which helps reduce haze
  • Water bodies absorb NIR, so appear very dark in the image
  • Healthy vegetation strongly reflects NIR, so appears bright red.
sentRGB <- subset(sentinel,3:1)
plotRGB(sentRGB,stretch="lin")

Figure 3.3: R-G-B image from Sentinel-2 scene 20210628.

We can also spectrally subset to create NIR-R-G, with NIR being B8A, the 7th band of the 20 m multispectral raster we’ve created. See the first figure in this chapter for the NIR-R-G image displayed with plotRGB.

sentFCC <- subset(sentinel,c(7,3,2))

3.1.3 Crop to study area extent

The Sentinel-2 scene covers a very large area, more than 100 km on each side and over 12,000 km² in area. The area we are interested in – Red Clover Valley – fits within an area of about 100 km², so we’ll crop the scene to this extent.

RCVext <- ext(715680,725040,4419120,4429980)
sentRCV <- crop(sentinel,RCVext)
sentRGB <- subset(sentRCV,3:1)
plotRGB(sentRGB,stretch="lin")

Figure 3.4: Color image from Sentinel-2 of Red Clover Valley, 20210628.

NIR-R-G:

plotRGB(subset(sentRCV,c(7,3,2)),stretch="lin")

Figure 3.5: NIR-R-G image from Sentinel-2 of Red Clover Valley, 20210628.

3.1.4 Saving results

Optionally, you may want to save this cropped image so you can call it up more quickly later. We’ll build a simple name with the date and time as part of it.

writeRaster(sentRCV, filename=paste0("~/sentinel2/",imgDateTime,".tif"),overwrite=T)
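
Reading it back in later is then a one-liner (assuming the same imgDateTime):

sentRCV <- rast(paste0("~/sentinel2/",imgDateTime,".tif")) # reads the GeoTIFF saved above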

3.1.5 Band scatter plots

Using scatter plots helps us to understand how we might employ multiple bands in our analysis. Red vs NIR is particularly intriguing, commonly creating a triangular pattern reflecting the high absorption of NIR by water and its high reflection by healthy vegetation, which is visually green and so low in red.

pairs(c(sentRCV$B04,sentRCV$B8A), main="Red vs NIR")

Figure 3.6: Relations between Red and NIR bands, Red Clover Valley Sentinel-2 image, 20210628.

3.2 Spectral profiles

A useful way to see how our imagery responds to known vegetation / land cover is to extract values at points within the training polygons we’ll use later for supervised classification and plot them as a spectral profile. We start by deriving points from the polygons we created based on field observations. One approach is to use polygon centroids (with centroids()), but to get more points from larger polygons, we’ll instead sample the polygons to get a total of ~1000 points.

library(igisci)
sampRCV <- vect(ex("RCVimagery/train_polys9.shp"))
ptsampRCV <- spatSample(sampRCV, 1000, method="random")
# for just centroids:  ptsampRCV <- centroids(sampRCV)
dfRCV <- terra::extract(sentRCV, ptsampRCV)
head(dfRCV)
##   ID B02 B03 B04  B05  B06  B07  B8A  B11  B12
## 1  1 260 559 490 1049 2425 2850 3204 2139 1134
## 2  2 194 479 352  890 2467 2954 3312 1800  923
## 3  3 217 519 387  947 2482 2981 3319 1936 1005
## 4  4 339 639 608 1167 2385 2772 3126 2357 1304
## 5  5 260 559 490 1049 2425 2850 3204 2139 1134
## 6  6 293 584 547 1074 2345 2740 3144 2201 1216
msRCV <- aggregate(dfRCV[,-1], list(ptsampRCV$class), mean)
rownames(msRCV) <- msRCV[,1]
msRCV <- msRCV[,-1]
library(RColorBrewer)
LCcolors <- brewer.pal(length(unique(sampRCV$class)),"Set1")
msRCV <- as.matrix(msRCV)
# Start with an empty plot
plot(0, ylim=c(0,4195), xlim=c(1,9),type='n',
     xlab="Bands",ylab="Reflectance 12 bit", xaxt='n')
axis(side=1,at=1:9,labels=sentinelBands)
# Add the classes
for (i in 1:nrow(msRCV)){
  lines(msRCV[i,],type='l',lwd=4,lty=1, col=LCcolors[i])
}
title(main="Spectral signatures Sentinel2 bands, Red Clover Valley", font.main=2)
legend('topleft',rownames(msRCV),cex=0.8,lty=1,lwd=3,bty='n', col=LCcolors)

Figure 3.7: Spectral signature of 9-level training polygons, 20 m Sentinel-2 imagery from 20210628.

One thing we can observe in the spectral signatures is that some vegetation types are pretty similar; for instance ‘mesic’ and ‘mesicRush’ are very similar, as are ‘hydric’ and ‘CARNEB’, so we might want to combine them and have fewer categories. We also have a 7-level polygon classification, so we’ll use that (later we’ll combine values in code).

sampRCV <- vect(ex("RCVimagery/train_polys7.shp"))
ptsampRCV <- spatSample(sampRCV, 1000, method="random")
dfRCV <- terra::extract(sentRCV, ptsampRCV)
msRCV <- aggregate(dfRCV[,-1], list(ptsampRCV$class), mean)
rownames(msRCV) <- msRCV[,1]
msRCV <- msRCV[,-1]
LCcolors <- c("cyan","gold","darkgreen","black","blue","lawngreen","red")
msRCV <- as.matrix(msRCV)
plot(0, ylim=c(0,4195), xlim=c(1,9),type='n',
     xlab="Bands",ylab="Reflectance 12 bit", xaxt='n')
axis(side=1,at=1:9,labels=sentinelBands)
for (i in 1:nrow(msRCV)){
  lines(msRCV[i,],type='l',lwd=4,lty=1, col=LCcolors[i])
}
title(main="Spectral signatures Sentinel2 bands, 7 land cover classes", font.main=2)
legend('topleft',rownames(msRCV),cex=0.8,lty=1,lwd=3,bty='n', col=LCcolors)

Figure 3.8: Spectral signature of 7-level training polygons, 20 m Sentinel-2 imagery from 20210628.

From this spectral signature, we can see that, at least at this scale, there’s really no difference between the spectral responses of ‘hydric’ and ‘willow’. The structure of the willow copses can’t be distinguished from the herbaceous hydric sedges with 20 m pixels. Fortunately, in terms of hydrologic response, willows are also hydric plants, so we can just recode willow as hydric:

sampRCV[sampRCV$class=="willow"]$class <- "hydric"
ptsampRCV <- spatSample(sampRCV, 1000, method="random")
dfRCV <- terra::extract(sentRCV, ptsampRCV)
msRCV <- aggregate(dfRCV[,-1], list(ptsampRCV$class), mean)
rownames(msRCV) <- msRCV[,1]
msRCV <- msRCV[,-1]
msRCV <- as.matrix(msRCV)
plot(0, ylim=c(0,4195), xlim=c(1,9),type='n',
     xlab="Bands",ylab="Reflectance 12 bit", xaxt='n')
axis(side=1,at=1:9,labels=sentinelBands)
for (i in 1:nrow(msRCV)){
  lines(msRCV[i,],type='l',lwd=4,lty=1, col=LCcolors[i])
}
title(main="Spectral signatures, 6 land-cover classes, RCV", font.main=2)
legend('topleft',rownames(msRCV),cex=0.8,lty=1,lwd=3,bty='n', col=LCcolors)

Figure 3.9: Spectral signature of 6-level training polygons, 20 m Sentinel-2 imagery from 20210628.

3.3 Map algebra & vegetation indices

We can also do pixel-wise calculations, the map algebra method (Tomlin (1990)) used in raster GIS. We’ll use it to create vegetation indices, which are widely used as single quantitative measures relating to phenological change, moisture, and nutrients.
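
In terra, map algebra is ordinary arithmetic applied cell by cell. As a trivial sketch (brightness is our own illustrative name, assuming the cropped sentRCV from above):

brightness <- (sentRCV$B02 + sentRCV$B03 + sentRCV$B04)/3 # mean visible reflectance per cell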

3.3.1 Vegetation indices

The venerable Normalized Difference Vegetation Index (NDVI) uses NIR (Sentinel-2 20 m: B8A) and RED (Sentinel-2 20 m: B04) in a ratio that normalizes NIR with respect to visible (usually red):

\[ NDVI = \frac{NIR-RED}{NIR+RED} \]

ndviRCV <- (sentRCV$B8A - sentRCV$B04)/
           (sentRCV$B8A + sentRCV$B04)
plot(ndviRCV, col=rev(terrain.colors(10)), main=paste('NDVI',imgDateTime))

Figure 3.10: NDVI from Sentinel-2 image, 20210628.

3.3.2 Histogram

hist(ndviRCV, main='NDVI values', xlab='NDVI',ylab='Frequency',
     col='wheat',xlim=c(-0.6,1),breaks=30,xaxt='n')
axis(side=1,at=seq(-0.6,1,0.2), labels=seq(-0.6,1,0.2))

Figure 3.11: NDVI histogram, Sentinel-2 image, 20210628.

3.3.3 Other vegetation indices

With the variety of spectral bands available on satellite imagery come many possibilities for other vegetation (and non-vegetation) indices.

One that has been used to detect moisture content in plants is the Normalized Difference Moisture Index \(NDMI = \frac{NIR-SWIR}{NIR+SWIR}\), which for Sentinel-2 20 m imagery would be \(\frac{B8A-B11}{B8A+B11}\):

ndmiRCV <- (sentRCV$B8A - sentRCV$B11)/
           (sentRCV$B8A + sentRCV$B11)
plot(ndmiRCV, col=rev(terrain.colors(10)), main=paste('NDMI',imgDateTime))

Figure 3.12: NDMI from Sentinel-2 image, 20210628.

hist(ndmiRCV, main='NDMI values', xlab='NDMI',ylab='Frequency',
     col='wheat',xlim=c(-0.6,1),breaks=30,xaxt='n')
axis(side=1,at=seq(-0.6,1,0.2), labels=seq(-0.6,1,0.2))

Figure 3.13: NDMI histogram, Sentinel-2 image, 20210628.

Another useful index, the Normalized Difference Greenness Index (NDGI), uses three bands: NIR, Red, and Green (Yang et al. (2019)). It was first proposed for MODIS but can be varied slightly for Sentinel-2. It has the advantage of using the more widely available 4-band imagery, such as the 10 m Sentinel-2 product, 3 m PlanetScope, or multispectral drone imagery from MicaSense, which is commonly 5 cm. We’ll stick with the 20 m Sentinel-2 product for now.

\[ NDGI=\frac{0.616GREEN+0.384NIR-RED}{0.616GREEN+0.384NIR+RED} \]

ndgiRCV <- (0.616 * sentRCV$B03 + 0.384 * sentRCV$B8A - sentRCV$B04)/
           (0.616 * sentRCV$B03 + 0.384 * sentRCV$B8A + sentRCV$B04)
hist(ndgiRCV, main='NDGI values', xlab='NDGI',ylab='Frequency',
     col='wheat',xlim=c(-0.6,1),breaks=30,xaxt='n')
axis(side=1,at=seq(-0.6,1,0.2), labels=seq(-0.6,1,0.2))

Figure 3.14: NDGI histogram, Sentinel-2 image, 20210628.

3.4 Unsupervised Classification with k-means

We’ll look at supervised classification next, where we’ll provide the classifier with known vegetation or land cover types; with unsupervised classification we just provide the number of classes. We’ll use the NDGI we just derived above, and start by creating a vector from the values:

vals <- values(ndgiRCV)
str(vals)
##  num [1:254124, 1] 0.356 0.429 0.4 0.359 0.266 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : NULL
##   ..$ : chr "B03"

We’ll then run the kmeans function to derive clusters:

kmc <- kmeans(na.omit(vals), centers=5, iter.max=500, nstart=5, algorithm="Lloyd")
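
The choice of five clusters is a judgment call; a quick elbow plot of total within-cluster sum of squares against the number of clusters can help (a sketch; it reruns kmeans several times, so it may be slow):

wss <- sapply(2:8, function(k)
  kmeans(na.omit(vals), centers=k, iter.max=100, algorithm="Lloyd")$tot.withinss)
plot(2:8, wss, type="b", xlab="k (clusters)", ylab="Total within-cluster SS")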

Now we put the cluster numbers back into a raster with the same dimensions as the original image. (This works because na.omit() dropped no cells here; if the image had NA cells, we’d need to assign clusters by cell index.)

kimg <- ndgiRCV
values(kimg) <- kmc$cluster
plot(ndgiRCV, col=rev(terrain.colors(10)), main=paste('NDGI',imgDateTime))

Figure 3.15: NDGI from Sentinel-2 image, 20210628.

library(RColorBrewer)
LCcolors <- brewer.pal(length(unique(values(kimg))),"Set1")
plot(kimg, main = 'Unsupervised classification', col = LCcolors, type="classes")

Figure 3.16: Unsupervised k-means classification, Red Clover Valley, Sentinel-2, 20210628.

3.5 Machine Learning for Supervised Classification of Imagery

Using machine learning algorithms is one approach to imagery classification, whether that classification is object-based or pixel-based. We’re going to focus on a pixel-based approach to identify land cover from imagery or other continuous raster variables (such as elevation or elevation-derived rasters such as slope, curvature, and roughness), employing samples of those variables called training samples in a “supervised” imagery classification.

It’s useful to realize that the modeling methods we use for this type of imagery classification are really no different at the core from methods we’d use to work with continuous variables that might not even be spatial. For example, if we leave off the classification process for a moment, a machine learning algorithm might be used to predict a response result from a series of predictor variables, like predicting temperature from elevation and latitude, or acceleration might be predicted by some force acting on a body. So the first model might be used to predict the temperature of any location (within the boundary of the study area) given an elevation and latitude; or the second model might predict an acceleration given the magnitude of a force applied to the body (maybe of a given mass). A classification model varies on this by predicting a nominal variable like type of land cover; some other types of responses might be counts (using a Poisson model) or probabilities (using a logistic model).
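
To make that concrete, here’s a minimal non-spatial sketch with entirely synthetic, illustrative data (the data frame, coefficients, and names like tempC are invented):

# Predict temperature from elevation and latitude with a simple linear model
set.seed(1)
df <- data.frame(elev = runif(100, 0, 3000), lat = runif(100, 32, 42))
df$tempC <- 30 - 0.0065*df$elev - 0.5*(df$lat - 32) + rnorm(100)
fit <- lm(tempC ~ elev + lat, data = df)
predict(fit, data.frame(elev = 1500, lat = 36)) # predicted temperature at a new site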

The imagery classification approach adds to this model an input preparation process and an output prediction process:

  • A training set of points and polygons is created that represents areas of known classification, such as land cover like forest or wetland, and is used to identify values of the predictor variables (e.g. imagery bands)
  • A predicted raster is created using the model applied to the original rasters.

3.6 Supervised Classification Using Classification and Regression Trees (CART)

We’ll use a machine learning model called Classification and Regression Trees (CART), one of the simplest to understand in terms of the results it produces, which include a display of the resulting classification decision “tree”. We have collected field observations of vegetation/land cover types in our meadow, so we can see if we can use these to train a classification model.

We’ll start by reading in the data again so this section doesn’t depend on the above code.

3.6.1 Read imagery and training data and extract sample values for training

library(terra); library(stringr); library(igisci) # igisci provides ex() used below
imgFolder <- paste0("S2A_MSIL2A_20210628T184921_N0300_R113_T10TGK_20210628T230915.",
                    "SAFE\\GRANULE\\L2A_T10TGK_A031427_20210628T185628")
img20mFolder <- paste0("~/sentinel2/",imgFolder,"\\IMG_DATA\\R20m")
imgDateTime <- str_sub(imgFolder,12,26)
sentinelBands <- paste0("B",c("02","03","04","05","06","07","8A","11","12"))
sentinelFiles <- paste0(img20mFolder,"/T10TGK_",imgDateTime,"_",sentinelBands,"_20m.jp2")
sentinel <- rast(sentinelFiles)
names(sentinel) <- sentinelBands
RCVext <- ext(715680,725040,4419120,4429980)
sentRCV <- crop(sentinel,RCVext)
sampRCV <- vect(ex("RCVimagery/train_polys7.shp"))
sampRCV[sampRCV$class=="willow"]$class <- "hydric"
ptsampRCV <- spatSample(sampRCV, 1000, method="random")
plot(ptsampRCV, "class")

Figure 3.17: Training samples

LCclass <- c("forest","hydric","mesic","rocky","water","xeric")
classdf <- data.frame(value=1:length(LCclass),names=LCclass)
dfRCV <- extract(sentRCV, ptsampRCV)[,-1] # REMOVE ID
sampdataRCV <- data.frame(class = ptsampRCV$class, dfRCV)

3.6.2 Training the CART model

The Recursive Partitioning and Regression Trees package rpart fits the model.

library(rpart)
cartmodel <- rpart(as.factor(class)~., data = sampdataRCV, method = 'class', minsplit = 5)
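
If the tree looks overgrown, rpart’s complexity-parameter table can guide pruning (optional; the cp value below is just a hypothetical threshold):

printcp(cartmodel) # cross-validated error at each complexity level
# prunedmodel <- prune(cartmodel, cp = 0.02) # would prune splits weaker than this cp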

Now we can display the decision tree using rpart.plot. This plot clearly shows where each variable (in our case, imagery spectral bands) contributes to the classification, and together with the spectral signature analysis above it helps us evaluate how the process works, and possibly how to improve our training data where distinction among classes is unclear.

library(rpart.plot)
rpart.plot(cartmodel, fallen.leaves=F)

Figure 3.18: CART Decision Tree, Sentinel-2 20 m 20210628.

3.6.3 Prediction using the CART model

Then we can use the model to predict the probabilities of each vegetation class for each pixel.

classified <- predict(sentRCV, cartmodel, na.rm = TRUE)
classified
## class       : SpatRaster 
## dimensions  : 543, 468, 6  (nrow, ncol, nlyr)
## resolution  : 20, 20  (x, y)
## extent      : 715680, 725040, 4419120, 4429980  (xmin, xmax, ymin, ymax)
## coord. ref. : WGS 84 / UTM zone 10N (EPSG:32610) 
## source      : memory 
## names       :    forest,    hydric,     mesic,     rocky,     water,     xeric 
## min values  :         0,         0,         0,         0,         0,         0 
## max values  : 0.9646018, 1.0000000, 1.0000000, 1.0000000, 1.0000000, 0.9560440
plot(classified)

Figure 3.19: CART classification, probabilities of each class, Sentinel-2 20 m 20210628.

Note the ranges of values, each extending to values approaching 1.0. If we had used the 9-category training set described earlier, we would find some categories with very low scores, such that they would never dominate a pixel. From the above we can be sure that all categories will be used in the final output, which avoids potential problems in validation.

Now we’ll make a single SpatRaster showing the vegetation/land cover with the highest probability.

LC20m <- which.max(classified)
LC20m
## class       : SpatRaster 
## dimensions  : 543, 468, 1  (nrow, ncol, nlyr)
## resolution  : 20, 20  (x, y)
## extent      : 715680, 725040, 4419120, 4429980  (xmin, xmax, ymin, ymax)
## coord. ref. : WGS 84 / UTM zone 10N (EPSG:32610) 
## source      : memory 
## name        : which.max 
## min value   :         1 
## max value   :         6
cls <- names(classified)
df <- data.frame(id = 1:length(cls), class=cls)
levels(LC20m) <- df
LC20m
## class       : SpatRaster 
## dimensions  : 543, 468, 1  (nrow, ncol, nlyr)
## resolution  : 20, 20  (x, y)
## extent      : 715680, 725040, 4419120, 4429980  (xmin, xmax, ymin, ymax)
## coord. ref. : WGS 84 / UTM zone 10N (EPSG:32610) 
## source      : memory 
## name        :  class 
## min value   : forest 
## max value   :  xeric
LCcolors <- c("green4","cyan","gold","grey90","mediumblue","red")
plot(LC20m, col=LCcolors)

Figure 3.20: CART classification, highest probability class, Sentinel-2 20 m 20210628.

3.6.4 Validating the model

An important part of the imagery classification process is validation, where we look at how well the model works. The way this is done is pretty easy to understand, and it requires testing data in addition to the training data mentioned above. Testing data can be created in a variety of ways, commonly through field observations but also with finer-resolution imagery like drone imagery. Since this is also how we create training data, often you’re selecting some observations for training and a separate set for testing.
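
Before the cross-validation we’ll actually use below, here’s what a simple one-off holdout split looks like (a sketch only; trainSet and testSet are our names):

set.seed(1)
i <- sample(nrow(sampdataRCV), round(0.8 * nrow(sampdataRCV))) # 80% of rows for training
trainSet <- sampdataRCV[i, ]
testSet <- sampdataRCV[-i, ]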

3.6.4.1 The “overfit model” problem

It’s important to realize that the accuracy we determine is only based on the training and testing data. The accuracy of the prediction elsewhere will likely be somewhat less than this, and if the drop is substantial our model is “overfit”. We don’t actually know how overfit a model truly is, because that depends on how likely the conditions seen in our training and testing data are to occur throughout the rest of the image; if those conditions are common, just not sampled, then the model might actually be pretty well fit.

In thinking about the concept of overfit models and selecting our training (and testing) sets, it’s useful to consider the purpose of our classification and how important it is that our predictions are absolutely reliable. In choosing training sets, accuracy is also important, so we will want to make sure that they are good representatives of the land cover type (assuming that’s our response variable). While some land covers are pretty clear (like streets or buildings), there’s a lot of fuzziness in the world: you might be trying to identify wetland conditions based on the type of vegetation growing, but in a meadow you can commonly find wetland species mixed in with more mesic species. To pick a reliable wetland sample we might want to only pick areas with only wetland species (and this can get tricky, since there are many challenges in distinguishing “obligate wetland” from “facultative wetland” species). The CART model applied a probability model for each response value, which we then used to derive a single prediction based on the maximum probability.

3.6.4.2 Cross Validation : 20 m Sentinel CART model

In cross-validation, we use one set of data and run the model multiple times, each time validating with a part of the data not used for training the model. In k-fold cross-validation, k is the number of groups and thus the number of models. The k value can be as high as the number of observations, but you do need to consider processing time, and you may not get much more reliable assessments from using all of the observations. We’ll use 5 folds.

set.seed(42)
k <- 5 # number of folds
j <- sample(rep(1:k, each = round(nrow(sampdataRCV))/k))
table(j)
## j
##  1  2  3  4  5 
## 74 74 74 74 74
x <- list()
for (k in 1:5) {
    train <- sampdataRCV[j!= k, ]
    test <- sampdataRCV[j == k, ]
    cart <- rpart(as.factor(class)~., data=train, method = 'class',
                  minsplit = 5)
    pclass <- predict(cart, test, na.rm = TRUE)
    # assign class to maximum probability
    pclass <- apply(pclass, 1, which.max)
    # create a data.frame using the reference and prediction
    x[[k]] <- cbind(test$class, as.integer(pclass))
}

Confusion Matrix

y <- do.call(rbind, x)
y <- data.frame(y)
colnames(y) <- c('observed', 'predicted')
conmat <- table(y) # confusion matrix
# change the name of the classes
colnames(conmat) <- LCclass # sort(unique(sampdataRCV$class))
rownames(conmat) <- LCclass # sort(unique(sampdataRCV$class))
print(conmat)
##         predicted
## observed forest hydric mesic rocky water xeric
##   forest    109      0     0     0     0     0
##   hydric      1     41     0     0     0     1
##   mesic       0      0    97     0     0     3
##   rocky       0      0     1     6     0     1
##   water       2      0     0     0    23     0
##   xeric       0      0     1     0     0    86

Overall Accuracy

n <- sum(conmat) # number of total cases
diag <- diag(conmat) # number of correctly classified cases per class -- on the diagonal
OA <- sum(diag) / n # overall accuracy
OA
## [1] 0.9731183

kappa
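
Kappa compares the observed overall accuracy \(p_o\) (OA above) with the accuracy \(p_e\) expected from chance agreement (expAccuracy in the code below, computed from the row and column marginals):

\[ \kappa = \frac{p_o - p_e}{1 - p_e} \]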

rowsums <- apply(conmat, 1, sum) 
p <- rowsums / n # observed (true) cases per class
colsums <- apply(conmat, 2, sum)
q <- colsums / n # predicted cases per class
expAccuracy <- sum(p*q)
kappa <- (OA - expAccuracy) / (1 - expAccuracy)
kappa
## [1] 0.9648967

Producer’s and User’s accuracies

We can look at accuracy either in terms of how well a model works or what kinds of errors it has; these are complements.

  • The so-called “Producer’s Accuracy” is the proportion of the reference (ground) samples of a class that the map classifies correctly; it is from the perspective of the map producer, and is the complement of errors of omission.
  • The so-called “User’s Accuracy” is the proportion of pixels mapped as a class that actually are that class on the ground; it is from the perspective of the map user, and is the complement of errors of commission.
PA <- diag / rowsums # Producer's accuracy: correct / total observed (reference) per class
UA <- diag / colsums # User's accuracy: correct / total predicted per class
outAcc <- data.frame(producerAccuracy = PA, userAccuracy = UA)
outAcc
##        producerAccuracy userAccuracy
## forest        1.0000000    0.9732143
## hydric        0.9534884    1.0000000
## mesic         0.9700000    0.9797980
## rocky         0.7500000    1.0000000
## water         0.9200000    1.0000000
## xeric         0.9885057    0.9450549

3.7 Classifying with 10 m Sentinel-2 imagery

The 10 m product of Sentinel-2 has far fewer bands (four), but the finer resolution improves its visual precision. The four bands are the most common baseline set used in multispectral imagery: blue (B02, 490 nm), green (B03, 560 nm), red (B04, 665 nm), and NIR (B08, 842 nm). From these we can create color and NIR-R-G images, and the vegetation indices NDVI and NDGI, among others.

For the 10 m product, we won’t repeat many of the exploration steps we looked at above, such as the histograms, but will go right to training and classification using CART.

imgFolder <- paste0("S2A_MSIL2A_20210628T184921_N0300_R113_T10TGK_20210628T230915.",
                    "SAFE\\GRANULE\\L2A_T10TGK_A031427_20210628T185628")
img10mFolder <- paste0("~/sentinel2/",imgFolder,"\\IMG_DATA\\R10m")
imgDateTime <- str_sub(imgFolder,12,26)
imgDateTime
## [1] "20210628T184921"
imgDate <- str_sub(imgDateTime,1,8)

3.7.1 Extract bands (10 m)

sentinelBands <- paste0("B",c("02","03","04","08"))
sentinelFiles <- paste0(img10mFolder,"/T10TGK_",imgDateTime,"_",sentinelBands,"_10m.jp2")
sentinel <- rast(sentinelFiles)
names(sentinel) <- sentinelBands

3.7.2 Crop to RCV extent (10 m)

RCVext <- ext(715680,725040,4419120,4429980)
sentRCV <- crop(sentinel,RCVext)
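
The 10 m bands support the same map algebra we used earlier; for instance, a 10 m NDVI for comparison with the 20 m version (an optional aside, not used in what follows):

ndvi10RCV <- (sentRCV$B08 - sentRCV$B04)/(sentRCV$B08 + sentRCV$B04)
plot(ndvi10RCV, col=rev(terrain.colors(10)), main=paste('NDVI 10 m',imgDateTime))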

3.7.3 Extract 10 m pixel values as dfRCV

dfRCV <- extract(sentRCV, ptsampRCV)[,-1] # REMOVE ID; reuses ptsampRCV from the 20 m section
head(dfRCV,n=10)
##    B02 B03 B04  B08
## 1  179 460 360 3122
## 2  192 504 341 3022
## 3  169 463 347 3008
## 4  191 510 390 3068
## 5  183 471 340 3022
## 6  274 570 436 3096
## 7  239 526 437 2988
## 8  232 525 393 3054
## 9  184 469 363 3150
## 10 299 590 486 2930
sampdataRCV <- data.frame(class = ptsampRCV$class, dfRCV)

3.7.4 Training the CART model (10 m)

library(rpart)
# Train the model
cartmodel <- rpart(as.factor(class)~., data = sampdataRCV, method = 'class', minsplit = 5)
#print(cartmodel)
library(rpart.plot)
rpart.plot(cartmodel, fallen.leaves=F)

3.7.5 Prediction using the CART model (10 m)

classified <- predict(sentRCV, cartmodel, na.rm = TRUE)
plot(classified)

Figure 3.21: CART classification, probabilities of each class, Sentinel-2 10 m 20210628.

LC10m <- which.max(classified)
cls <- names(classified)
df <- data.frame(id = 1:length(cls), class=cls)
levels(LC10m) <- df
plot(LC10m, col=LCcolors)

Figure 3.22: CART classification, highest probability class, Sentinel-2 10 m 20210628.

3.7.5.1 Cross Validation : 10 m Sentinel CART model

set.seed(42)
k <- 5 # number of folds
j <- sample(rep(1:k, each = round(nrow(sampdataRCV))/k))
table(j)
## j
##  1  2  3  4  5 
## 74 74 74 74 74
x <- list()
for (k in 1:5) {
    train <- sampdataRCV[j!= k, ]
    test <- sampdataRCV[j == k, ]
    cart <- rpart(as.factor(class)~., data=train, method = 'class',
                  minsplit = 5)
    pclass <- predict(cart, test, na.rm = TRUE)
    # assign class to maximum probability
    pclass <- apply(pclass, 1, which.max)
    # create a data.frame using the reference and prediction
    x[[k]] <- cbind(test$class, as.integer(pclass))
}

Confusion Matrix

y <- do.call(rbind, x)
y <- data.frame(y)
colnames(y) <- c('observed', 'predicted')
conmat <- table(y) # confusion matrix
# change the name of the classes
colnames(conmat) <- LCclass # sort(unique(sampdataRCV$class))
rownames(conmat) <- LCclass # sort(unique(sampdataRCV$class))
print(conmat)
##         predicted
## observed forest hydric mesic rocky water xeric
##   forest    109      0     0     0     0     0
##   hydric      0     42     0     0     0     1
##   mesic       0      0    97     0     0     3
##   rocky       0      0     2     6     0     0
##   water       1      0     0     1    23     0
##   xeric       1      0     0     0     0    86

Overall Accuracy

n <- sum(conmat) # number of total cases
diag <- diag(conmat) # number of correctly classified cases per class -- on the diagonal
OA <- sum(diag) / n # overall accuracy
OA
## [1] 0.9758065

kappa

rowsums <- apply(conmat, 1, sum) 
p <- rowsums / n # observed (true) cases per class
colsums <- apply(conmat, 2, sum)
q <- colsums / n # predicted cases per class
expAccuracy <- sum(p*q)
kappa <- (OA - expAccuracy) / (1 - expAccuracy)
kappa
## [1] 0.9684502

Producer’s and User’s accuracies

PA <- diag / rowsums # Producer's accuracy: correct / total observed (reference) per class
UA <- diag / colsums # User's accuracy: correct / total predicted per class
outAcc <- data.frame(producerAccuracy = PA, userAccuracy = UA)
outAcc
##        producerAccuracy userAccuracy
## forest        1.0000000    0.9819820
## hydric        0.9767442    1.0000000
## mesic         0.9700000    0.9797980
## rocky         0.7500000    0.8571429
## water         0.9200000    1.0000000
## xeric         0.9885057    0.9555556

3.8 Using two images to capture spring green-up and summer senescence

Most plants change their spectral response seasonally, a process referred to as phenology. In these montane meadows, phenological changes start with snowmelt, followed by “green-up”, then senescence. 2021 was a particularly dry year, so the period from green-up to senescence was short. We’ll use a 3 June image to capture green-up, then a 28 June image for senescence.

Spring

library(terra); library(stringr)
imgFolderSpring <- "S2B_MSIL2A_20210603T184919_N0300_R113_T10TGK_20210603T213609.SAFE\\GRANULE\\L2A_T10TGK_A022161_20210603T185928"
img10mFolderSpring <- paste0("~/sentinel2/",imgFolderSpring,"\\IMG_DATA\\R10m")
imgDateTimeSpring <- str_sub(imgFolderSpring,12,26)

Summer

imgFolderSummer <- "S2A_MSIL2A_20210628T184921_N0300_R113_T10TGK_20210628T230915.SAFE\\GRANULE\\L2A_T10TGK_A031427_20210628T185628"
img10mFolderSummer <- paste0("~/sentinel2/",imgFolderSummer,"\\IMG_DATA\\R10m")
imgDateTimeSummer <- str_sub(imgFolderSummer,12,26)
sentinelBands <- paste0("B",c("02","03","04","08"))
sentinelFilesSpring <- paste0(img10mFolderSpring,"/T10TGK_",imgDateTimeSpring,"_",sentinelBands,"_10m.jp2")
sentinelSpring <- rast(sentinelFilesSpring)
names(sentinelSpring) <- paste0("spring",sentinelBands)

sentinelFilesSummer <- paste0(img10mFolderSummer,"/T10TGK_",imgDateTimeSummer,"_",sentinelBands,"_10m.jp2")
sentinelSummer <- rast(sentinelFilesSummer)
names(sentinelSummer) <- paste0("summer",sentinelBands)
RCVext <- ext(715680,725040,4419120,4429980)
sentRCVspring <- crop(sentinelSpring,RCVext)
sentRCVsummer <- crop(sentinelSummer,RCVext)
blueSp <- sentRCVspring$springB02
greenSp <- sentRCVspring$springB03
redSp <- sentRCVspring$springB04
nirSp <- sentRCVspring$springB08
ndgiSp <- (0.616 * greenSp + 0.384 * nirSp - redSp)/(0.616 * greenSp + 0.384 * nirSp + redSp)
blueSu <- sentRCVsummer$summerB02
greenSu <- sentRCVsummer$summerB03
redSu <- sentRCVsummer$summerB04
nirSu <- sentRCVsummer$summerB08
ndgiSu <- (0.616 * greenSu + 0.384 * nirSu - redSu)/(0.616 * greenSu + 0.384 * nirSu + redSu)

3.8.1 Create a 10-band stack from both images

sentRCV <- c(blueSp,greenSp,redSp,nirSp,ndgiSp,blueSu,greenSu,redSu,nirSu,ndgiSu)
names(sentRCV) <- c("blueSp","greenSp","redSp","nirSp","ndgiSp",
                    "blueSu","greenSu","redSu","nirSu","ndgiSu")

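With both dates in one stack, a quick map-algebra aside shows where greenness declined over June (negative change suggests senescence; ndgiChange is our own name):

ndgiChange <- ndgiSu - ndgiSp # NDGI change from 3 June to 28 June
plot(ndgiChange, main="NDGI change, 20210603 to 20210628")
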
3.8.2 Read in training data

sampRCV <- vect(ex("RCVimagery/train_polys7.shp"))
sampRCV[sampRCV$class=="willow"]$class <- "hydric" # recode willow as hydric, as before
ptsampRCV <- spatSample(sampRCV, 1000, method="random")

3.8.3 Identify LC classes and assign colors

LCclass <- c("forest","hydric","mesic","rocky","water","xeric")
classdf <- data.frame(value=1:length(LCclass),names=LCclass)
dfRCV <- terra::extract(sentRCV, ptsampRCV)[,-1] # REMOVE ID
sampdataRCV <- data.frame(class = ptsampRCV$class, dfRCV)

3.8.4 CART model (10 m spring + summer)

library(rpart)
# Train the model
cartmodel <- rpart(as.factor(class)~., data = sampdataRCV, method = 'class', minsplit = 5)
library(rpart.plot)
rpart.plot(cartmodel, fallen.leaves=F)

Figure 3.23: CART decision tree, Sentinel 10-m, spring and summer 2021 images

3.8.5 Prediction using the CART model (10 m spring + summer)

classified <- predict(sentRCV, cartmodel, na.rm = TRUE)
plot(classified)

Figure 3.24: CART classification, probabilities of each class, Sentinel-2 10 m, 2021 spring and summer phenology

LCpheno <- which.max(classified)
cls <- names(classified)
df <- data.frame(id = 1:length(cls), class=cls)
levels(LCpheno) <- df
LCcolors <- c("green4","cyan","gold","grey90","mediumblue","red")
plot(LCpheno, col=LCcolors)

Figure 3.25: CART classification, highest probability class, Sentinel-2 10 m, 2021 spring and summer phenology

We might compare this plot with the two previous ones – the 20 m and 10 m classifications from 28 June.

plot(LC20m, col=LCcolors)

Figure 3.26: Classification of Sentinel-2 20 m image.

plot(LC10m, col=LCcolors)

Figure 3.27: Classification of Sentinel-2 10 m image.

3.8.5.1 Cross Validation : CART model for 10 m spring + summer

set.seed(42)
k <- 5 # number of folds
j <- sample(rep(1:k, each = round(nrow(sampdataRCV))/k))
table(j)
## j
##  1  2  3  4  5 
## 79 79 79 79 79
x <- list()
for (k in 1:5) {
    train <- sampdataRCV[j!= k, ]
    test <- sampdataRCV[j == k, ]
    cart <- rpart(as.factor(class)~., data=train, method = 'class',
                  minsplit = 5)
    pclass <- predict(cart, test, na.rm = TRUE)
    # assign class to maximum probability
    pclass <- apply(pclass, 1, which.max)
    # create a data.frame using the reference and prediction
    x[[k]] <- cbind(test$class, as.integer(pclass))
}

Confusion Matrix

y <- do.call(rbind, x)
y <- data.frame(y)
colnames(y) <- c('observed', 'predicted')
conmat <- table(y) # confusion matrix
# change the name of the classes
colnames(conmat) <- LCclass # sort(unique(sampdataRCV$class))
rownames(conmat) <- LCclass # sort(unique(sampdataRCV$class))
print(conmat)
##         predicted
## observed forest hydric mesic rocky water xeric
##   forest    112      3     0     0     0     0
##   hydric      1     50     0     0     0     0
##   mesic       0      0   114     0     0     1
##   rocky       0      0     0     2     0     2
##   water       0      1     0     0    21     0
##   xeric       0      0     1     4     0    86

Overall Accuracy

n <- sum(conmat) # number of total cases
diag <- diag(conmat) # number of correctly classified cases per class -- on the diagonal
OA <- sum(diag) / n # overall accuracy
OA
## [1] 0.9673367

kappa

rowsums <- apply(conmat, 1, sum) 
p <- rowsums / n # observed (true) cases per class
colsums <- apply(conmat, 2, sum)
q <- colsums / n # predicted cases per class
expAccuracy <- sum(p*q)
kappa <- (OA - expAccuracy) / (1 - expAccuracy)
kappa
## [1] 0.9571848

Producer’s and User’s accuracies

PA <- diag / rowsums # Producer's accuracy: correct / total observed (reference) per class
UA <- diag / colsums # User's accuracy: correct / total predicted per class
outAcc <- data.frame(producerAccuracy = PA, userAccuracy = UA)
outAcc
##        producerAccuracy userAccuracy
## forest        0.9739130    0.9911504
## hydric        0.9803922    0.9259259
## mesic         0.9913043    0.9913043
## rocky         0.5000000    0.3333333
## water         0.9545455    1.0000000
## xeric         0.9450549    0.9662921

References

Clover Valley Ranch Restoration, the Sierra Fund. n.d. https://sierrafund.org/clover-valley-ranch/.
Copernicus Open Access Hub. n.d. https://scihub.copernicus.eu/.
Tomlin, C. Dana. 1990. Geographic Information Systems and Cartographic Modeling. Englewood Cliffs, N.J: Prentice Hall.
Yang, W., H. Kobayashi, C. Wang, M. Shen, J. Chen, B. Matsushita, Y. Tang, Y. Kim, et al. 2019. “A Semi-Analytical Snow-Free Vegetation Index for Improving Estimation of Plant Phenology in Tundra and Grassland Ecosystems.” Remote Sensing of Environment 228: 31–44.