Chapter 15 Further Issues IV: Quality Control

When someone is honestly 55% right, that’s very good and there’s no use wrangling. And if someone is 60% right, it’s wonderful, it’s great luck, and let him thank God. But what’s to be said about 75% right? Wise people say this is suspicious. Well, and what about 100% right? Whoever says he’s 100% right is a fanatic, a thug, and the worst kind of rascal.

— Quote attributed to an elder Jewish man from Galicia, The Captive Mind by Czesław Miłosz (1953), translation by Jane Zielonko

Stand up / You’ve got to manage / I won’t sympathize any more. / And if you complain once more / You’ll meet an army of me.

— Björk, Army of Me (1995)

Quality control is essential to making good inferences from matrix projection analysis. Package lefko3 was made with quality control as a top priority. In this chapter, we will look at a number of the quality control tools and options that are included in this package.

15.1 Quality control in life history models and vertical datasets

Vertical datasets are standardized using the functions verticalize3() and historicalize3(). These functions also offer quality control options, particularly to assess whether the standardized datasets and the life history models match properly.

Let’s try an example of using the quality control features of these functions. In the code below, we set up a stageframe for the Cypripedium dataset. However, we have deliberately assigned a smaller bin halfwidth to the Small adult class (0.5 instead of the original 1.5). This leaves some sizes outside every stage’s size bin, so function verticalize3() will fail in stage assignment for some portion of the data. Let’s see this in action.
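To see why the narrower bin matters, here is a minimal sketch in base R that does not use lefko3 at all. It assumes that each stage covers sizes from size minus halfwidth (exclusive) to size plus halfwidth (inclusive), and it ignores reproductive and observation status, so it is a simplification of what the package actually checks. The helper assign_stage() is hypothetical and exists only for this illustration.

```r
# Sizes and bin halfwidths for the five observed adult stages, copied from
# entries 7 to 11 of the stageframe vectors used in this section.
stages    <- c("XSm", "Sm", "Md", "Lg", "XLg")
sizes     <- c(1, 3, 6, 11, 19.5)
halfwidth <- c(0.5, 0.5, 1.5, 3.5, 5)  # "Sm" narrowed from 1.5 to 0.5

# Assumed bin convention: lower bound exclusive, upper bound inclusive.
bin_min <- sizes - halfwidth
bin_max <- sizes + halfwidth

# Hypothetical helper: return the stage whose bin contains x, or "NoMatch".
assign_stage <- function(x) {
  hit <- which(x > bin_min & x <= bin_max)
  if (length(hit) == 1) stages[hit] else "NoMatch"
}

# Try every whole-number size from 1 to 24 and list those with no stage.
shoots    <- 1:24
assigned  <- vapply(shoots, assign_stage, character(1))
unmatched <- shoots[assigned == "NoMatch"]
print(unmatched)
```

Under these assumptions, sizes 2 and 4 fall into the gaps on either side of the narrowed Small adult bin. With the original halfwidth of 1.5, that bin would run from 1.5 to 4.5 and no gap would remain.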

data(cypdata)

sizevector <- c(0, 0, 0, 0, 0, 0, 1, 3, 6, 11, 19.5)
stagevector <- c("SD", "P1", "P2", "P3", "SL", "D", "XSm", "Sm", "Md", "Lg",
  "XLg")
repvector <- c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1)
obsvector <- c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1)
matvector <- c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1)
immvector <- c(0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
propvector <- c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
indataset <- c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1)
binvec <- c(0, 0, 0, 0, 0, 0.5, 0.5, 0.5, 1.5, 3.5, 5) # 8th entry originally 1.5
comments <- c("Dormant seed", "1st yr protocorm", "2nd yr protocorm",
  "3rd yr protocorm", "Seedling", "Dormant adult",
  "Extra small adult (1 shoot)", "Small adult (2-4 shoots)",
  "Medium adult (5-7 shoots)", "Large adult (8-14 shoots)",
  "Extra large adult (>14 shoots)")
cypframe_raw <- sf_create(sizes = sizevector, stagenames = stagevector, 
  repstatus = repvector, obsstatus = obsvector, matstatus = matvector,
  propstatus = propvector, immstatus = immvector, indataset = indataset, 
  binhalfwidth = binvec, comments = comments)

cypraw_v1 <- verticalize3(data = cypdata, noyears = 6, firstyear = 2004, 
  patchidcol = "patch", individcol = "plantid", blocksize = 4,
  sizeacol = "Inf2.04", sizebcol = "Inf.04", sizeccol = "Veg.04",
  repstracol = "Inf.04", repstrbcol = "Inf2.04", fecacol = "Pod.04",
  stageassign = cypframe_raw, stagesize = "sizeadded", NAas0 = TRUE,
  NRasRep = TRUE, age_offset = 4)
> Warning: Some stages occurring in the dataset do not match any characteristics
> in the input stageframe.

> [The same warning is repeated 86 more times.]

The output shows many repeated warnings that some stages in the dataset do not match the life history model as programmed in the stageframe. Let’s take a closer look using the summary_hfv() function. The data frame of errors is quite large, so only part of it is printed.

> 
> This hfv dataset contains 320 rows, 57 variables, 1 population, 
> 3 patches, 74 individuals, and 5 time steps.
> Problems in stage assignment identified in rows:
> 
>   [1]   2   5   9  10  11  12  14  16  19  21  24  26  27  28  29  33  34  38
>  [19]  39  40  42  46  48  49  52  53  54  59  64  66  69  73  74  75  76  78
>  [37]  80  83  85  88  90  91  92  93  94  98  99 101 102 103 104 105 106 107
>  [55] 111 112 113 114 115 116 117 118 119 120 125 129 131 134 135 136 140 141
>  [73] 142 143 145 148 149 150 152 155 157 158 159 160 161 166 168 169 170 171
>  [91] 172 173 174 178 179 180 181 182 183 184 185 188 190 194 195 199 200 204
> [109] 205 209 212 213 216 220 221 222 223 224 229 232 233 234 236 237 239 241
> [127] 243 244 245 246 247 248 250 252 256 257 259 260 261 262 265 266 270 271
> [145] 272 274 275 276 283 285 286 291 294 295 296 298 300 303 304 306 309 310
> [163] 312 316 317 318 319 320
>      rowid          popid           patchid    individ           year2     
>  Min.   : 1.00   Length:320         A: 93   Min.   : 164.0   Min.   :2004  
>  1st Qu.:21.00   Class :character   B:154   1st Qu.: 391.0   1st Qu.:2005  
>  Median :37.50   Mode  :character   C: 73   Median : 453.0   Median :2006  
>  Mean   :38.45                              Mean   : 651.5   Mean   :2006  
>  3rd Qu.:56.00                              3rd Qu.: 476.0   3rd Qu.:2007  
>  Max.   :77.00                              Max.   :1560.0   Max.   :2008  
>    firstseen       lastseen        obsage       obslifespan   
>  Min.   :2004   Min.   :2004   Min.   :5.000   Min.   :0.000  
>  1st Qu.:2004   1st Qu.:2009   1st Qu.:6.000   1st Qu.:5.000  
>  Median :2004   Median :2009   Median :7.000   Median :5.000  
>  Mean   :2004   Mean   :2009   Mean   :6.853   Mean   :4.556  
>  3rd Qu.:2004   3rd Qu.:2009   3rd Qu.:8.000   3rd Qu.:5.000  
>  Max.   :2008   Max.   :2009   Max.   :9.000   Max.   :5.000  
>      sizea1             sizeb1            sizec1       size1added    
>  Min.   :0.000000   Min.   : 0.0000   Min.   : 0.0   Min.   : 0.000  
>  1st Qu.:0.000000   1st Qu.: 0.0000   1st Qu.: 0.0   1st Qu.: 0.000  
>  Median :0.000000   Median : 0.0000   Median : 1.0   Median : 2.000  
>  Mean   :0.009375   Mean   : 0.7469   Mean   : 1.9   Mean   : 2.656  
>  3rd Qu.:0.000000   3rd Qu.: 1.0000   3rd Qu.: 3.0   3rd Qu.: 4.000  
>  Max.   :1.000000   Max.   :18.0000   Max.   :13.0   Max.   :21.000  
>     repstra1          repstrb1         repstr1added         feca1       
>  Min.   : 0.0000   Min.   :0.000000   Min.   : 0.0000   Min.   :0.0000  
>  1st Qu.: 0.0000   1st Qu.:0.000000   1st Qu.: 0.0000   1st Qu.:0.0000  
>  Median : 0.0000   Median :0.000000   Median : 0.0000   Median :0.0000  
>  Mean   : 0.7469   Mean   :0.009375   Mean   : 0.7562   Mean   :0.2656  
>  3rd Qu.: 1.0000   3rd Qu.:0.000000   3rd Qu.: 1.0000   3rd Qu.:0.0000  
>  Max.   :18.0000   Max.   :1.000000   Max.   :18.0000   Max.   :7.0000  
>    fec1added        obsstatus1       repstatus1       fecstatus1    
>  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
>  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
>  Median :0.0000   Median :1.0000   Median :0.0000   Median :0.0000  
>  Mean   :0.2656   Mean   :0.7469   Mean   :0.2875   Mean   :0.1344  
>  3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:0.0000  
>  Max.   :7.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
>    matstatus1         alive1          stage1           stage1index    
>  Min.   :0.0000   Min.   :0.0000   Length:320         Min.   : 0.000  
>  1st Qu.:0.0000   1st Qu.:1.0000   Class :character   1st Qu.: 0.000  
>  Median :1.0000   Median :1.0000   Mode  :character   Median : 7.000  
>  Mean   :0.5469   Mean   :0.7688                      Mean   : 4.369  
>  3rd Qu.:1.0000   3rd Qu.:1.0000                      3rd Qu.: 8.000  
>  Max.   :1.0000   Max.   :1.0000                      Max.   :11.000  
>      sizea2             sizeb2            sizec2         size2added    
>  Min.   :0.000000   Min.   : 0.0000   Min.   : 0.000   Min.   : 0.000  
>  1st Qu.:0.000000   1st Qu.: 0.0000   1st Qu.: 1.000   1st Qu.: 1.000  
>  Median :0.000000   Median : 0.0000   Median : 2.000   Median : 2.000  
>  Mean   :0.009375   Mean   : 0.8969   Mean   : 2.416   Mean   : 3.322  
>  3rd Qu.:0.000000   3rd Qu.: 1.0000   3rd Qu.: 3.000   3rd Qu.: 4.000  
>  Max.   :1.000000   Max.   :18.0000   Max.   :13.000   Max.   :24.000  
>     repstra2          repstrb2         repstr2added         feca2       
>  Min.   : 0.0000   Min.   :0.000000   Min.   : 0.0000   Min.   :0.0000  
>  1st Qu.: 0.0000   1st Qu.:0.000000   1st Qu.: 0.0000   1st Qu.:0.0000  
>  Median : 0.0000   Median :0.000000   Median : 0.0000   Median :0.0000  
>  Mean   : 0.8969   Mean   :0.009375   Mean   : 0.9062   Mean   :0.2906  
>  3rd Qu.: 1.0000   3rd Qu.:0.000000   3rd Qu.: 1.0000   3rd Qu.:0.0000  
>  Max.   :18.0000   Max.   :1.000000   Max.   :18.0000   Max.   :7.0000  
>    fec2added        obsstatus2       repstatus2       fecstatus2    
>  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
>  1st Qu.:0.0000   1st Qu.:1.0000   1st Qu.:0.0000   1st Qu.:0.0000  
>  Median :0.0000   Median :1.0000   Median :0.0000   Median :0.0000  
>  Mean   :0.2906   Mean   :0.9531   Mean   :0.3688   Mean   :0.1562  
>  3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:0.0000  
>  Max.   :7.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
>    matstatus2     alive2     stage2           stage2index    
>  Min.   :1    Min.   :1   Length:320         Min.   : 0.000  
>  1st Qu.:1    1st Qu.:1   Class :character   1st Qu.: 0.000  
>  Median :1    Median :1   Mode  :character   Median : 7.000  
>  Mean   :1    Mean   :1                      Mean   : 5.769  
>  3rd Qu.:1    3rd Qu.:1                      3rd Qu.: 8.000  
>  Max.   :1    Max.   :1                      Max.   :11.000  
>      sizea3             sizeb3           sizec3         size3added    
>  Min.   :0.000000   Min.   : 0.000   Min.   : 0.000   Min.   : 0.000  
>  1st Qu.:0.000000   1st Qu.: 0.000   1st Qu.: 1.000   1st Qu.: 1.000  
>  Median :0.000000   Median : 0.000   Median : 1.000   Median : 2.000  
>  Mean   :0.009375   Mean   : 1.069   Mean   : 2.209   Mean   : 3.288  
>  3rd Qu.:0.000000   3rd Qu.: 1.000   3rd Qu.: 3.000   3rd Qu.: 4.000  
>  Max.   :1.000000   Max.   :18.000   Max.   :13.000   Max.   :24.000  
>     repstra3         repstrb3         repstr3added        feca3       
>  Min.   : 0.000   Min.   :0.000000   Min.   : 0.000   Min.   :0.0000  
>  1st Qu.: 0.000   1st Qu.:0.000000   1st Qu.: 0.000   1st Qu.:0.0000  
>  Median : 0.000   Median :0.000000   Median : 0.000   Median :0.0000  
>  Mean   : 1.069   Mean   :0.009375   Mean   : 1.078   Mean   :0.4562  
>  3rd Qu.: 1.000   3rd Qu.:0.000000   3rd Qu.: 1.000   3rd Qu.:0.0000  
>  Max.   :18.000   Max.   :1.000000   Max.   :18.000   Max.   :8.0000  
>    fec3added        obsstatus3    repstatus3    fecstatus3       matstatus3
>  Min.   :0.0000   Min.   :0.0   Min.   :0.0   Min.   :0.0000   Min.   :1   
>  1st Qu.:0.0000   1st Qu.:1.0   1st Qu.:0.0   1st Qu.:0.0000   1st Qu.:1   
>  Median :0.0000   Median :1.0   Median :0.0   Median :0.0000   Median :1   
>  Mean   :0.4562   Mean   :0.9   Mean   :0.4   Mean   :0.2219   Mean   :1   
>  3rd Qu.:0.0000   3rd Qu.:1.0   3rd Qu.:1.0   3rd Qu.:0.0000   3rd Qu.:1   
>  Max.   :8.0000   Max.   :1.0   Max.   :1.0   Max.   :1.0000   Max.   :1   
>      alive3          stage3           stage3index    
>  Min.   :0.0000   Length:320         Min.   : 0.000  
>  1st Qu.:1.0000   Class :character   1st Qu.: 0.000  
>  Median :1.0000   Mode  :character   Median : 7.000  
>  Mean   :0.9469                      Mean   : 5.419  
>  3rd Qu.:1.0000                      3rd Qu.: 8.000  
>  Max.   :1.0000                      Max.   :11.000
>    rowid popid patchid individ year2 firstseen lastseen obsage obslifespan
> 2      2             A     165  2004      2004     2009      5           5
> 5      5             A     243  2004      2004     2009      5           5
> 9      9             A     251  2004      2004     2009      5           5
> 10    10             A     252  2004      2004     2009      5           5
> 11    11             A     253  2004      2004     2009      5           5
> 12    12             A     255  2004      2004     2008      5           4
> 14    15             A     259  2004      2004     2009      5           5
> 16    19             A     264  2004      2004     2007      5           3
> 19    22             A     393  2004      2004     2009      5           5
> 21    24             B     431  2004      2004     2009      5           5
> 24    27             B     437  2004      2004     2009      5           5
> 26    30             B     441  2004      2004     2009      5           5
> 27    31             B     442  2004      2004     2009      5           5
> 28    32             B     443  2004      2004     2009      5           5
> 29    33             B     445  2004      2004     2009      5           5
> 33    37             B     452  2004      2004     2009      5           5
> 34    38             B     454  2004      2004     2009      5           5
>    sizea1 sizeb1 sizec1 size1added repstra1 repstrb1 repstr1added feca1
> 2       0      0      0          0        0        0            0     0
> 5       0      0      0          0        0        0            0     0
> 9       0      0      0          0        0        0            0     0
> 10      0      0      0          0        0        0            0     0
> 11      0      0      0          0        0        0            0     0
> 12      0      0      0          0        0        0            0     0
> 14      0      0      0          0        0        0            0     0
> 16      0      0      0          0        0        0            0     0
> 19      0      0      0          0        0        0            0     0
> 21      0      0      0          0        0        0            0     0
> 24      0      0      0          0        0        0            0     0
> 26      0      0      0          0        0        0            0     0
> 27      0      0      0          0        0        0            0     0
> 28      0      0      0          0        0        0            0     0
> 29      0      0      0          0        0        0            0     0
> 33      0      0      0          0        0        0            0     0
> 34      0      0      0          0        0        0            0     0
>    fec1added obsstatus1 repstatus1 fecstatus1 matstatus1 alive1   stage1
> 2          0          0          0          0          0      0 NotAlive
> 5          0          0          0          0          0      0 NotAlive
> 9          0          0          0          0          0      0 NotAlive
> 10         0          0          0          0          0      0 NotAlive
> 11         0          0          0          0          0      0 NotAlive
> 12         0          0          0          0          0      0 NotAlive
> 14         0          0          0          0          0      0 NotAlive
> 16         0          0          0          0          0      0 NotAlive
> 19         0          0          0          0          0      0 NotAlive
> 21         0          0          0          0          0      0 NotAlive
> 24         0          0          0          0          0      0 NotAlive
> 26         0          0          0          0          0      0 NotAlive
> 27         0          0          0          0          0      0 NotAlive
> 28         0          0          0          0          0      0 NotAlive
> 29         0          0          0          0          0      0 NotAlive
> 33         0          0          0          0          0      0 NotAlive
> 34         0          0          0          0          0      0 NotAlive
>    stage1index sizea2 sizeb2 sizec2 size2added repstra2 repstrb2 repstr2added
> 2            0      0      2      1          3        2        0            2
> 5            0      0      0      5          5        0        0            0
> 9            0      0      0      2          2        0        0            0
> 10           0      0      0      1          1        0        0            0
> 11           0      0      0      1          1        0        0            0
> 12           0      0      0      8          8        0        0            0
> 14           0      0      0      2          2        0        0            0
> 16           0      0      0      2          2        0        0            0
> 19           0      0      2      3          5        2        0            2
> 21           0      0      0      6          6        0        0            0
> 24           0      0      1      3          4        1        0            1
> 26           0      0      0      4          4        0        0            0
> 27           0      0      0      4          4        0        0            0
> 28           0      0      0      4          4        0        0            0
> 29           0      0      0      2          2        0        0            0
> 33           0      0      1      3          4        1        0            1
> 34           0      0      1      2          3        1        0            1
>    feca2 fec2added obsstatus2 repstatus2 fecstatus2 matstatus2 alive2   stage2
> 2      1         1          1          1          1          1      1       Sm
> 5      0         0          1          0          0          1      1       Md
> 9      0         0          1          0          0          1      1 NotAlive
> 10     0         0          1          0          0          1      1      XSm
> 11     0         0          1          0          0          1      1      XSm
> 12     0         0          1          0          0          1      1       Lg
> 14     0         0          1          0          0          1      1 NotAlive
> 16     0         0          1          0          0          1      1 NotAlive
> 19     2         2          1          1          1          1      1       Md
> 21     0         0          1          0          0          1      1       Md
> 24     1         1          1          1          1          1      1 NotAlive
> 26     0         0          1          0          0          1      1 NotAlive
> 27     0         0          1          0          0          1      1 NotAlive
> 28     0         0          1          0          0          1      1 NotAlive
> 29     0         0          1          0          0          1      1 NotAlive
> 33     0         0          1          1          0          1      1 NotAlive
> 34     0         0          1          1          0          1      1       Sm
>    stage2index sizea3 sizeb3 sizec3 size3added repstra3 repstrb3 repstr3added
> 2            8      0      2      0          2        2        0            2
> 5            9      0      0      2          2        0        0            0
> 9            0      0      0      2          2        0        0            0
> 10           7      0      2      0          2        2        0            2
> 11           7      0      1      1          2        1        0            1
> 12          10      0      1      3          4        1        0            1
> 14           0      0      1      2          3        1        0            1
> 16           0      0      1      0          1        1        0            1
> 19           9      0      3      1          4        3        0            3
> 21           9      0      0      4          4        0        0            0
> 24           0      0      3      1          4        3        0            3
> 26           0      0      2      4          6        2        0            2
> 27           0      0      0      2          2        0        0            0
> 28           0      0      0      2          2        0        0            0
> 29           0      0      0      1          1        0        0            0
> 33           0      0      5      0          5        5        0            5
> 34           8      0      0      2          2        0        0            0
>    feca3 fec3added obsstatus3 repstatus3 fecstatus3 matstatus3 alive3  stage3
> 2      0         0          1          1          0          1      1 NoMatch
> 5      0         0          1          0          0          1      1 NoMatch
> 9      0         0          1          0          0          1      1 NoMatch
> 10     0         0          1          1          0          1      1 NoMatch
> 11     0         0          1          1          0          1      1 NoMatch
> 12     1         1          1          1          1          1      1 NoMatch
> 14     0         0          1          1          0          1      1      Sm
> 16     0         0          1          1          0          1      1     XSm
> 19     1         1          1          1          1          1      1 NoMatch
> 21     0         0          1          0          0          1      1 NoMatch
> 24     2         2          1          1          1          1      1 NoMatch
> 26     2         2          1          1          1          1      1      Md
> 27     0         0          1          0          0          1      1 NoMatch
> 28     0         0          1          0          0          1      1 NoMatch
> 29     0         0          1          0          0          1      1     XSm
> 33     2         2          1          1          1          1      1      Md
> 34     0         0          1          0          0          1      1 NoMatch
>    stage3index
> 2            0
> 5            0
> 9            0
> 10           0
> 11           0
> 12           0
> 14           8
> 16           7
> 19           0
> 21           0
> 24           0
> 26           9
> 27           0
> 28           0
> 29           7
> 33           9
> 34           0
>  [ reached 'max' / getOption("max.print") -- omitted 151 rows ]
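Rather than reading the printout row by row, it can help to tabulate the sizes of the rows that failed to match. The sketch below is base R only and works on a small data frame whose values are copied by hand from the first 17 problem rows printed above; the object names problem_rows, nomatch, and size_counts are made up for this illustration.

```r
# Row index, size3added, and stage3 copied from the first 17 problem rows
# in the printed output above.
problem_rows <- data.frame(
  row = c(2, 5, 9, 10, 11, 12, 14, 16, 19, 21, 24, 26, 27, 28, 29, 33, 34),
  size3added = c(2, 2, 2, 2, 2, 4, 3, 1, 4, 4, 4, 6, 2, 2, 1, 5, 2),
  stage3 = c("NoMatch", "NoMatch", "NoMatch", "NoMatch", "NoMatch", "NoMatch",
    "Sm", "XSm", "NoMatch", "NoMatch", "NoMatch", "Md", "NoMatch", "NoMatch",
    "XSm", "Md", "NoMatch"),
  stringsAsFactors = FALSE
)

# Keep only rows with no matching stage in occasion t+1, then count by size.
nomatch     <- problem_rows[problem_rows$stage3 == "NoMatch", ]
size_counts <- table(nomatch$size3added)
print(size_counts)
```

In this subset, every unmatched row has a size of 2 or 4 (8 rows and 4 rows, respectively), which points at the stages bordering those sizes. Assuming the cypraw_v1 object created earlier is available, the same idea applied to the full dataset would be table(cypraw_v1$size3added[cypraw_v1$stage3 == "NoMatch"]).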

Near the top of the output above, just after the description of the dataset, is a list of the rows in the standardized dataset that have problems in stage assignment. We can use this list to take a look at some of these rows and try to determine where our mistake is, as below.

cypraw_v1[c(2, 5, 9, 10, 11),]
>    rowid popid patchid individ year2 firstseen lastseen obsage obslifespan
> 2      2             A     165  2004      2004     2009      5           5
> 5      5             A     243  2004      2004     2009      5           5
> 9      9             A     251  2004      2004     2009      5           5
> 10    10             A     252  2004      2004     2009      5           5
> 11    11             A     253  2004      2004     2009      5           5
>    sizea1 sizeb1 sizec1 size1added repstra1 repstrb1 repstr1added feca1
> 2       0      0      0          0        0        0            0     0
> 5       0      0      0          0        0        0            0     0
> 9       0      0      0          0        0        0            0     0
> 10      0      0      0          0        0        0            0     0
> 11      0      0      0          0        0        0            0     0
>    fec1added obsstatus1 repstatus1 fecstatus1 matstatus1 alive1   stage1
> 2          0          0          0          0          0      0 NotAlive
> 5          0          0          0          0          0      0 NotAlive
> 9          0          0          0          0          0      0 NotAlive
> 10         0          0          0          0          0      0 NotAlive
> 11         0          0          0          0          0      0 NotAlive
>    stage1index sizea2 sizeb2 sizec2 size2added repstra2 repstrb2 repstr2added
> 2            0      0      2      1          3        2        0            2
> 5            0      0      0      5          5        0        0            0
> 9            0      0      0      2          2        0        0            0
> 10           0      0      0      1          1        0        0            0
> 11           0      0      0      1          1        0        0            0
>    feca2 fec2added obsstatus2 repstatus2 fecstatus2 matstatus2 alive2   stage2
> 2      1         1          1          1          1          1      1       Sm
> 5      0         0          1          0          0          1      1       Md
> 9      0         0          1          0          0          1      1 NotAlive
> 10     0         0          1          0          0          1      1      XSm
> 11     0         0          1          0          0          1      1      XSm
>    stage2index sizea3 sizeb3 sizec3 size3added repstra3 repstrb3 repstr3added
> 2            8      0      2      0          2        2        0            2
> 5            9      0      0      2          2        0        0            0
> 9            0      0      0      2          2        0        0            0
> 10           7      0      2      0          2        2        0            2
> 11           7      0      1      1          2        1        0            1
>    feca3 fec3added obsstatus3 repstatus3 fecstatus3 matstatus3 alive3  stage3
> 2      0         0          1          1          0          1      1 NoMatch
> 5      0         0          1          0          0          1      1 NoMatch
> 9      0         0          1          0          0          1      1 NoMatch
> 10     0         0          1          1          0          1      1 NoMatch
> 11     0         0          1          1          0          1      1 NoMatch
>    stage3index
> 2            0
> 5            0
> 9            0
> 10           0
> 11           0

In the output above, we find that the five rows we have chosen to investigate show NoMatch under the stage3 column, meaning that R could not assign stages in time t+1 here. In these five cases, the individuals were observable and mature, and some were reproductive while others were not. The common feature appears to be size, which is 2 in all five rows (see the size3added column).
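A quick tabulation can confirm a hunch like this across all problem rows rather than just five. The sketch below uses a small hand-built data frame, toy_hfv, as a stand-in for a standardized dataset. Its values are hypothetical; only the column names stage3 and size3added follow the output above.

```r
# Hypothetical stand-in for a few rows of a standardized (hfv) dataset
toy_hfv <- data.frame(
  rowid = 1:6,
  size3added = c(2, 2, 2, 3, 1, 4),
  stage3 = c("NoMatch", "NoMatch", "NoMatch", "Sm", "XSm", "NoMatch"),
  stringsAsFactors = FALSE
)

# Keep only rows in which no stage could be assigned in time t+1
unmatched <- toy_hfv[toy_hfv$stage3 == "NoMatch", ]

# Tabulate the sizes of the unmatched rows to see which sizes lack a stage
nomatch_sizes <- table(unmatched$size3added)
print(nomatch_sizes)
```

If only one or two sizes appear in such a table, the stageframe's size bins are the natural next place to look.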

Let’s see if we can get to the bottom of the problem by looking at the stageframe.

cypframe_raw
>    stage size size_b size_c min_age max_age repstatus obsstatus propstatus
> 1     SD  0.0     NA     NA      NA      NA         0         0          1
> 2     P1  0.0     NA     NA      NA      NA         0         0          0
> 3     P2  0.0     NA     NA      NA      NA         0         0          0
> 4     P3  0.0     NA     NA      NA      NA         0         0          0
> 5     SL  0.0     NA     NA      NA      NA         0         0          0
> 6      D  0.0     NA     NA      NA      NA         0         0          0
> 7    XSm  1.0     NA     NA      NA      NA         1         1          0
> 8     Sm  3.0     NA     NA      NA      NA         1         1          0
> 9     Md  6.0     NA     NA      NA      NA         1         1          0
> 10    Lg 11.0     NA     NA      NA      NA         1         1          0
> 11   XLg 19.5     NA     NA      NA      NA         1         1          0
>    immstatus matstatus indataset binhalfwidth_raw sizebin_min sizebin_max
> 1          0         0         0              0.0         0.0         0.0
> 2          1         0         0              0.0         0.0         0.0
> 3          1         0         0              0.0         0.0         0.0
> 4          1         0         0              0.0         0.0         0.0
> 5          1         0         0              0.0         0.0         0.0
> 6          0         1         1              0.5        -0.5         0.5
> 7          0         1         1              0.5         0.5         1.5
> 8          0         1         1              0.5         2.5         3.5
> 9          0         1         1              1.5         4.5         7.5
> 10         0         1         1              3.5         7.5        14.5
> 11         0         1         1              5.0        14.5        24.5
>    sizebin_center sizebin_width binhalfwidthb_raw sizebinb_min sizebinb_max
> 1             0.0             0                NA           NA           NA
> 2             0.0             0                NA           NA           NA
> 3             0.0             0                NA           NA           NA
> 4             0.0             0                NA           NA           NA
> 5             0.0             0                NA           NA           NA
> 6             0.0             1                NA           NA           NA
> 7             1.0             1                NA           NA           NA
> 8             3.0             1                NA           NA           NA
> 9             6.0             3                NA           NA           NA
> 10           11.0             7                NA           NA           NA
> 11           19.5            10                NA           NA           NA
>    sizebinb_center sizebinb_width binhalfwidthc_raw sizebinc_min sizebinc_max
> 1               NA             NA                NA           NA           NA
> 2               NA             NA                NA           NA           NA
> 3               NA             NA                NA           NA           NA
> 4               NA             NA                NA           NA           NA
> 5               NA             NA                NA           NA           NA
> 6               NA             NA                NA           NA           NA
> 7               NA             NA                NA           NA           NA
> 8               NA             NA                NA           NA           NA
> 9               NA             NA                NA           NA           NA
> 10              NA             NA                NA           NA           NA
> 11              NA             NA                NA           NA           NA
>    sizebinc_center sizebinc_width group                       comments
> 1               NA             NA     0                   Dormant seed
> 2               NA             NA     0               1st yr protocorm
> 3               NA             NA     0               2nd yr protocorm
> 4               NA             NA     0               3rd yr protocorm
> 5               NA             NA     0                       Seedling
> 6               NA             NA     0                  Dormant adult
> 7               NA             NA     0    Extra small adult (1 shoot)
> 8               NA             NA     0       Small adult (2-4 shoots)
> 9               NA             NA     0      Medium adult (5-7 shoots)
> 10              NA             NA     0      Large adult (8-14 shoots)
> 11              NA             NA     0 Extra large adult (>14 shoots)

The key to finding the problem lies in working out what is missing from the size bins here. To do so, we can look at the sizebin_min and sizebin_max columns. These show us that, in the adult stages, stage XSm ranges in size from 0.5 to 1.5, while the next bigger stage, Sm, ranges in size from 2.5 to 3.5. A size of 2 therefore falls in the gap between these two bins and cannot be assigned to any stage, which explains the NoMatch rows. Looking further, we also see that stage Md ranges in size from 4.5 to 7.5, meaning that a size of 4 is also not included in any stage. With this knowledge in hand, we can revise our stageframe to expand the bin width of stage Sm by an extra 2 sprouts, so that it spans sizes 1.5 to 4.5, as below.

binvec <- c(0, 0, 0, 0, 0, 0.5, 0.5, 1.5, 1.5, 3.5, 5)
cypframe_raw <- sf_create(sizes = sizevector, stagenames = stagevector, 
  repstatus = repvector, obsstatus = obsvector, matstatus = matvector,
  propstatus = propvector, immstatus = immvector, indataset = indataset, 
  binhalfwidth = binvec, comments = comments)
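Before rerunning the standardization, we may wish to confirm that the revised bins leave no sizes unassigned. The following is a minimal base R sketch, not a lefko3 function: the helper uncovered_sizes() is hypothetical, the bin centers and half-widths are copied from the stageframe output and from binvec for the six stages found in the dataset, and treating both bin limits as inclusive is an assumption.

```r
# Return the integer sizes that fall within no size bin
# (treating both bin limits as inclusive is an assumption)
uncovered_sizes <- function(bin_min, bin_max, sizes) {
  covered <- sapply(sizes, function(s) any(s >= bin_min & s <= bin_max))
  sizes[!covered]
}

# Bin centers for stages D, XSm, Sm, Md, Lg, and XLg
centers <- c(0, 1, 3, 6, 11, 19.5)

# Half-widths in the original stageframe and in the revised one (binvec)
halfwidth_old <- c(0.5, 0.5, 0.5, 1.5, 3.5, 5)
halfwidth_new <- c(0.5, 0.5, 1.5, 1.5, 3.5, 5)

old_gaps <- uncovered_sizes(centers - halfwidth_old, centers + halfwidth_old,
  sizes = 0:24)
new_gaps <- uncovered_sizes(centers - halfwidth_new, centers + halfwidth_new,
  sizes = 0:24)

print(old_gaps)
print(new_gaps)
```

With the original half-widths, sizes 2 and 4 are returned as uncovered. With the revised half-widths, the result is empty.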

cypraw_v2 <- verticalize3(data = cypdata, noyears = 6, firstyear = 2004, 
  patchidcol = "patch", individcol = "plantid", blocksize = 4,
  sizeacol = "Inf2.04", sizebcol = "Inf.04", sizeccol = "Veg.04",
  repstracol = "Inf.04", repstrbcol = "Inf2.04", fecacol = "Pod.04",
  stageassign = cypframe_raw, stagesize = "sizeadded", NAas0 = TRUE,
  NRasRep = TRUE, age_offset = 4)
summary_hfv(cypraw_v2)
> 
> This hfv dataset contains 320 rows, 57 variables, 1 population, 
> 3 patches, 74 individuals, and 5 time steps.
>      rowid          popid           patchid    individ           year2     
>  Min.   : 1.00   Length:320         A: 93   Min.   : 164.0   Min.   :2004  
>  1st Qu.:21.00   Class :character   B:154   1st Qu.: 391.0   1st Qu.:2005  
>  Median :37.50   Mode  :character   C: 73   Median : 453.0   Median :2006  
>  Mean   :38.45                              Mean   : 651.5   Mean   :2006  
>  3rd Qu.:56.00                              3rd Qu.: 476.0   3rd Qu.:2007  
>  Max.   :77.00                              Max.   :1560.0   Max.   :2008  
>    firstseen       lastseen        obsage       obslifespan   
>  Min.   :2004   Min.   :2004   Min.   :5.000   Min.   :0.000  
>  1st Qu.:2004   1st Qu.:2009   1st Qu.:6.000   1st Qu.:5.000  
>  Median :2004   Median :2009   Median :7.000   Median :5.000  
>  Mean   :2004   Mean   :2009   Mean   :6.853   Mean   :4.556  
>  3rd Qu.:2004   3rd Qu.:2009   3rd Qu.:8.000   3rd Qu.:5.000  
>  Max.   :2008   Max.   :2009   Max.   :9.000   Max.   :5.000  
>      sizea1             sizeb1            sizec1       size1added    
>  Min.   :0.000000   Min.   : 0.0000   Min.   : 0.0   Min.   : 0.000  
>  1st Qu.:0.000000   1st Qu.: 0.0000   1st Qu.: 0.0   1st Qu.: 0.000  
>  Median :0.000000   Median : 0.0000   Median : 1.0   Median : 2.000  
>  Mean   :0.009375   Mean   : 0.7469   Mean   : 1.9   Mean   : 2.656  
>  3rd Qu.:0.000000   3rd Qu.: 1.0000   3rd Qu.: 3.0   3rd Qu.: 4.000  
>  Max.   :1.000000   Max.   :18.0000   Max.   :13.0   Max.   :21.000  
>     repstra1          repstrb1         repstr1added         feca1       
>  Min.   : 0.0000   Min.   :0.000000   Min.   : 0.0000   Min.   :0.0000  
>  1st Qu.: 0.0000   1st Qu.:0.000000   1st Qu.: 0.0000   1st Qu.:0.0000  
>  Median : 0.0000   Median :0.000000   Median : 0.0000   Median :0.0000  
>  Mean   : 0.7469   Mean   :0.009375   Mean   : 0.7562   Mean   :0.2656  
>  3rd Qu.: 1.0000   3rd Qu.:0.000000   3rd Qu.: 1.0000   3rd Qu.:0.0000  
>  Max.   :18.0000   Max.   :1.000000   Max.   :18.0000   Max.   :7.0000  
>    fec1added        obsstatus1       repstatus1       fecstatus1    
>  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
>  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
>  Median :0.0000   Median :1.0000   Median :0.0000   Median :0.0000  
>  Mean   :0.2656   Mean   :0.7469   Mean   :0.2875   Mean   :0.1344  
>  3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:0.0000  
>  Max.   :7.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
>    matstatus1         alive1          stage1           stage1index    
>  Min.   :0.0000   Min.   :0.0000   Length:320         Min.   : 0.000  
>  1st Qu.:1.0000   1st Qu.:1.0000   Class :character   1st Qu.: 6.000  
>  Median :1.0000   Median :1.0000   Mode  :character   Median : 8.000  
>  Mean   :0.7688   Mean   :0.7688                      Mean   : 6.144  
>  3rd Qu.:1.0000   3rd Qu.:1.0000                      3rd Qu.: 8.000  
>  Max.   :1.0000   Max.   :1.0000                      Max.   :11.000  
>      sizea2             sizeb2            sizec2         size2added    
>  Min.   :0.000000   Min.   : 0.0000   Min.   : 0.000   Min.   : 0.000  
>  1st Qu.:0.000000   1st Qu.: 0.0000   1st Qu.: 1.000   1st Qu.: 1.000  
>  Median :0.000000   Median : 0.0000   Median : 2.000   Median : 2.000  
>  Mean   :0.009375   Mean   : 0.8969   Mean   : 2.416   Mean   : 3.322  
>  3rd Qu.:0.000000   3rd Qu.: 1.0000   3rd Qu.: 3.000   3rd Qu.: 4.000  
>  Max.   :1.000000   Max.   :18.0000   Max.   :13.000   Max.   :24.000  
>     repstra2          repstrb2         repstr2added         feca2       
>  Min.   : 0.0000   Min.   :0.000000   Min.   : 0.0000   Min.   :0.0000  
>  1st Qu.: 0.0000   1st Qu.:0.000000   1st Qu.: 0.0000   1st Qu.:0.0000  
>  Median : 0.0000   Median :0.000000   Median : 0.0000   Median :0.0000  
>  Mean   : 0.8969   Mean   :0.009375   Mean   : 0.9062   Mean   :0.2906  
>  3rd Qu.: 1.0000   3rd Qu.:0.000000   3rd Qu.: 1.0000   3rd Qu.:0.0000  
>  Max.   :18.0000   Max.   :1.000000   Max.   :18.0000   Max.   :7.0000  
>    fec2added        obsstatus2       repstatus2       fecstatus2    
>  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
>  1st Qu.:0.0000   1st Qu.:1.0000   1st Qu.:0.0000   1st Qu.:0.0000  
>  Median :0.0000   Median :1.0000   Median :0.0000   Median :0.0000  
>  Mean   :0.2906   Mean   :0.9531   Mean   :0.3688   Mean   :0.1562  
>  3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:0.0000  
>  Max.   :7.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
>    matstatus2     alive2     stage2           stage2index    
>  Min.   :1    Min.   :1   Length:320         Min.   : 6.000  
>  1st Qu.:1    1st Qu.:1   Class :character   1st Qu.: 7.000  
>  Median :1    Median :1   Mode  :character   Median : 8.000  
>  Mean   :1    Mean   :1                      Mean   : 7.919  
>  3rd Qu.:1    3rd Qu.:1                      3rd Qu.: 8.000  
>  Max.   :1    Max.   :1                      Max.   :11.000  
>      sizea3             sizeb3           sizec3         size3added    
>  Min.   :0.000000   Min.   : 0.000   Min.   : 0.000   Min.   : 0.000  
>  1st Qu.:0.000000   1st Qu.: 0.000   1st Qu.: 1.000   1st Qu.: 1.000  
>  Median :0.000000   Median : 0.000   Median : 1.000   Median : 2.000  
>  Mean   :0.009375   Mean   : 1.069   Mean   : 2.209   Mean   : 3.288  
>  3rd Qu.:0.000000   3rd Qu.: 1.000   3rd Qu.: 3.000   3rd Qu.: 4.000  
>  Max.   :1.000000   Max.   :18.000   Max.   :13.000   Max.   :24.000  
>     repstra3         repstrb3         repstr3added        feca3       
>  Min.   : 0.000   Min.   :0.000000   Min.   : 0.000   Min.   :0.0000  
>  1st Qu.: 0.000   1st Qu.:0.000000   1st Qu.: 0.000   1st Qu.:0.0000  
>  Median : 0.000   Median :0.000000   Median : 0.000   Median :0.0000  
>  Mean   : 1.069   Mean   :0.009375   Mean   : 1.078   Mean   :0.4562  
>  3rd Qu.: 1.000   3rd Qu.:0.000000   3rd Qu.: 1.000   3rd Qu.:0.0000  
>  Max.   :18.000   Max.   :1.000000   Max.   :18.000   Max.   :8.0000  
>    fec3added        obsstatus3    repstatus3    fecstatus3       matstatus3
>  Min.   :0.0000   Min.   :0.0   Min.   :0.0   Min.   :0.0000   Min.   :1   
>  1st Qu.:0.0000   1st Qu.:1.0   1st Qu.:0.0   1st Qu.:0.0000   1st Qu.:1   
>  Median :0.0000   Median :1.0   Median :0.0   Median :0.0000   Median :1   
>  Mean   :0.4562   Mean   :0.9   Mean   :0.4   Mean   :0.2219   Mean   :1   
>  3rd Qu.:0.0000   3rd Qu.:1.0   3rd Qu.:1.0   3rd Qu.:0.0000   3rd Qu.:1   
>  Max.   :8.0000   Max.   :1.0   Max.   :1.0   Max.   :1.0000   Max.   :1   
>      alive3          stage3           stage3index    
>  Min.   :0.0000   Length:320         Min.   : 0.000  
>  1st Qu.:1.0000   Class :character   1st Qu.: 7.000  
>  Median :1.0000   Mode  :character   Median : 8.000  
>  Mean   :0.9469                      Mean   : 7.544  
>  3rd Qu.:1.0000                      3rd Qu.: 8.000  
>  Max.   :1.0000                      Max.   :11.000

We no longer see any issues popping up in the summary_hfv() output.

In addition to the above, the function hfv_qc() is extremely useful in assessing the quality of our data. Let’s use this function to explore our vertical dataset.

hfv_qc(cypraw_v2)
> Survival:
> 
>   Data subset has 58 variables and 320 transitions.
> 
>   Variable alive3 has 0 missing values.
>   Variable alive3 is a binomial variable.
> 
> 
> Primary size:
> 
>   Data subset has 58 variables and 303 transitions.
> 
>   Variable sizea3 has 0 missing values.
>   Variable sizea3 appears to be an integer variable.
> 
>   Variable sizea3 is fully non-negative.
> 
>   Overdispersion test:
>     Mean sizea3 is 0.009901
>     The variance in sizea3 is 0.009835
>     The probability of this dispersion level by chance assuming that
>     the true mean sizea3 = variance in sizea3,
>     and an alternative hypothesis of overdispersion, is 1
>     Dispersion level in sizea3 matches expectation.
> 
>   Zero-inflation and truncation tests:
>     Mean lambda in sizea3 is 0.9901
>     The actual number of 0s in sizea3 is 300
>     The expected number of 0s in sizea3 under the null hypothesis is 300
>     The probability of this deviation in 0s from expectation by chance is 0.9025
>     Variable sizea3 is not significantly zero-inflated.
> 
> 
> Fecundity:
> 
>   Data subset has 58 variables and 320 transitions.
> 
>   Variable feca2 has 0 missing values.
>   Variable feca2 appears to be an integer variable.
> 
>   Variable feca2 is fully non-negative.
> 
>   Overdispersion test:
>     Mean feca2 is 0.2906
>     The variance in feca2 is 0.7084
>     The probability of this dispersion level by chance assuming that
>     the true mean feca2 = variance in feca2,
>     and an alternative hypothesis of overdispersion, is 1
>     Dispersion level in feca2 matches expectation.
> 
>   Zero-inflation and truncation tests:
>     Mean lambda in feca2 is 0.7478
>     The actual number of 0s in feca2 is 270
>     The expected number of 0s in feca2 under the null hypothesis is 239.3
>     The probability of this deviation in 0s from expectation by chance is 2.189e-26
>     Variable feca2 is significantly zero-inflated.

The output gives us quite a lot to work with. All of the variables that we might be interested in assessing as vital rates are examined. Variables coding for probabilities need to be binomial, and so these variables are tested for whether they fit the characteristics of a binomial variable. Size and fecundity are explored to assess whether they fit the characteristics required of the associated distribution. They are examined for whether they are count variables or continuous, and they are also assessed to see whether they match the characteristics of distributions such as the Gaussian, the Poisson, and the negative binomial. Lastly, the output includes information about the data subsets that will be used to assess the various vital rates, including the numbers of individuals and the numbers of transitions (standardized dataset rows) available to parameterize the vital rate models.
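The zero-inflation figures reported for feca2 can be checked by hand. The sketch below is only our own arithmetic under a Poisson assumption, not the code that hfv_qc() runs. It uses the values printed above and the fact that a Poisson variable with mean m takes the value 0 with probability exp(-m). The value that the output labels "Mean lambda" appears to correspond to this probability.

```r
# Summary values as reported in the hfv_qc() output above for feca2
n_obs <- 320        # transitions in the fecundity subset
mean_fec <- 0.2906  # mean of feca2
zeros_obs <- 270    # observed number of 0s

# Under a Poisson distribution with this mean, P(0) = exp(-mean)
p_zero <- exp(-mean_fec)

# Expected number of 0s among n_obs Poisson-distributed observations
zeros_exp <- n_obs * p_zero

round(c(p_zero = p_zero, zeros_exp = zeros_exp, zeros_obs = zeros_obs), 4)
```

The expected count of roughly 239.3 zeros is well below the 270 observed, which is consistent with the conclusion that feca2 is zero-inflated.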

15.2 Quality control in vital rate models

The function modelsearch(), and its associated summary() function for lefkoMod objects, both provide critical quality control for vital rate models used to develop function-based MPMs, including discretized MPMs. The two key processes are actually conducted by function modelsearch() itself, but summary() provides easy access to the results. In particular, modelsearch() assesses the numbers of individuals and transitions used to develop each vital rate model, and the overall accuracy of each model.

Let’s take a look at how this works using the function-based version of the Cypripedium analysis, as given in Chapter 5. Here, we load all of the preliminaries for the historical analysis.

data(cypdata)

stagevector <- c("SD", "P1", "P2", "P3", "SL", "D", "V1", "V2", "V3", "V4",
  "V5", "V6", "V7", "V8", "V9", "V10", "V11", "V12", "V13", "V14", "V15", "V16",
  "V17", "V18", "V19", "V20", "V21", "V22", "V23", "V24", "F1", "F2", "F3",
  "F4", "F5", "F6", "F7", "F8", "F9", "F10", "F11", "F12", "F13", "F14", "F15",
  "F16", "F17", "F18", "F19", "F20", "F21", "F22", "F23", "F24")
indataset <- c(0, 0, 0, 0, 0, rep(1, 49))
sizevector <- c(0, 0, 0, 0, 0, seq(from = 0, to = 24), seq(from = 1, to = 24))
repvector <- c(0, 0, 0, 0, 0, rep(0, 25), rep(1, 24))
obsvector <- c(0, 0, 0, 0, 0, 0, rep(1, 48))
matvector <- c(0, 0, 0, 0, 0, rep(1, 49))
immvector <- c(0, 1, 1, 1, 1, rep(0, 49))
propvector <- c(1, rep(0, 53))
comments <- c("Dormant seed", "Yr1 protocorm", "Yr2 protocorm", "Yr3 protocorm",
  "Seedling", "Veg dorm", "Veg adult 1 stem", "Veg adult 2 stems",
  "Veg adult 3 stems", "Veg adult 4 stems", "Veg adult 5 stems",
  "Veg adult 6 stems", "Veg adult 7 stems", "Veg adult 8 stems",
  "Veg adult 9 stems", "Veg adult 10 stems", "Veg adult 11 stems",
  "Veg adult 12 stems", "Veg adult 13 stems", "Veg adult 14 stems",
  "Veg adult 15 stems", "Veg adult 16 stems", "Veg adult 17 stems",
  "Veg adult 18 stems", "Veg adult 19 stems", "Veg adult 20 stems",
  "Veg adult 21 stems", "Veg adult 22 stems", "Veg adult 23 stems",
  "Veg adult 24 stems", "Flo adult 1 stem", "Flo adult 2 stems",
  "Flo adult 3 stems", "Flo adult 4 stems", "Flo adult 5 stems",
  "Flo adult 6 stems", "Flo adult 7 stems", "Flo adult 8 stems",
  "Flo adult 9 stems", "Flo adult 10 stems", "Flo adult 11 stems",
  "Flo adult 12 stems", "Flo adult 13 stems", "Flo adult 14 stems",
  "Flo adult 15 stems", "Flo adult 16 stems", "Flo adult 17 stems",
  "Flo adult 18 stems", "Flo adult 19 stems", "Flo adult 20 stems",
  "Flo adult 21 stems", "Flo adult 22 stems", "Flo adult 23 stems",
  "Flo adult 24 stems")

cypframe_fb <- sf_create(sizes = sizevector, stagenames = stagevector, 
  repstatus = repvector, obsstatus = obsvector, matstatus = matvector, 
  propstatus = propvector, immstatus = immvector, indataset = indataset,
  comments = comments)

cypfb_v1 <- verticalize3(data = cypdata, noyears = 6, firstyear = 2004, 
  patchidcol = "patch", individcol = "plantid", blocksize = 4, 
  sizeacol = "Inf2.04", sizebcol = "Inf.04", sizeccol = "Veg.04", 
  repstracol = "Inf.04", repstrbcol = "Inf2.04", fecacol = "Pod.04", 
  stageassign = cypframe_fb, stagesize = "sizeadded", NAas0 = TRUE,
  age_offset = 4)

seeds_per_fruit <- 5000
sl_mult <- 0.7

cypsupp3_fb <- supplemental(stage3 = c("SD", "SD", "P1", "P1", "P2", "P3", "SL",
    "SL", "SL", "D", "V1", "V2", "V3", "D", "V1", "V2", "V3", "mat", "mat",
    "mat", "mat", "SD", "P1"), 
  stage2 = c("SD", "SD", "SD", "SD", "P1", "P2", "P3", "SL", "SL", "SL", "SL", 
    "SL", "SL", "SL", "SL", "SL", "SL", "D", "V1", "V2", "V3", "rep", "rep"), 
  stage1 = c("SD", "rep", "SD", "rep", "SD", "P1", "P2", "P3", "SL", "P3", "P3",
    "P3", "P3", "SL", "SL", "SL", "SL", "SL", "SL", "SL", "SL", "mat", "mat"), 
  eststage3 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, "D", "V1", "V2", "V3", "D",
    "V1", "V2", "V3", "mat", "mat", "mat", "mat", NA, NA), 
  eststage2 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, "D", "D", "D", "D", "D", 
    "D", "D", "D", "D", "V1", "V2", "V3", NA, NA), 
  eststage1 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, "D", "D", "D", "D", "D", 
    "D", "D", "D", "V1", "V1", "V1", "V1", NA, NA), 
  givenrate = c(0.08, 0.08, 0.1, 0.1, 0.1, 0.1, 0.1, 0.05, 0.05, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA),
  multiplier = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, sl_mult, sl_mult,
    sl_mult, sl_mult, sl_mult, sl_mult, sl_mult, sl_mult, 1, 1, 1, 1,
    0.5 * seeds_per_fruit, 0.5 * seeds_per_fruit),
  type = c("S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S",
    "S", "S", "S", "S", "S", "S", "S", "R", "R"), 
  type_t12 = c("S", "F", "S", "F", "S", "S", "S", "S", "S", "S", "S", "S", "S",
    "S", "S", "S", "S", "S", "S", "S", "S", "S", "S"),
  stageframe = cypframe_fb, historical = TRUE)

Now let’s run the vital rate models for a historical MPM.

cypmodels3p <- modelsearch(cypfb_v1, historical = TRUE, approach = "mixed", 
  vitalrates = c("surv", "obs", "size", "repst", "fec"), patch = "patchid",
  sizedist = "negbin", size.trunc = TRUE, fecdist = "poisson", fec.zero = TRUE,
  suite = "main", size = c("size3added", "size2added", "size1added"),
  quiet = "partial")

Let’s take a peek at the summary of the resulting lefkoMod object.

summary(cypmodels3p)
> This LefkoMod object includes 5 linear models.
> Best-fit model criterion used: aicc&k
> 
> 
> 
> Survival model:
> Generalized linear mixed model fit by maximum likelihood (Laplace
>   Approximation) [glmerMod]
>  Family: binomial  ( logit )
> Formula: alive3 ~ size2added + (1 | year2) + (1 | patchid) + (1 | individ)
>    Data: subdata
>      AIC      BIC   logLik deviance df.resid 
> 130.1321 148.9737 -60.0660 120.1321      315 
> Random effects:
>  Groups  Name        Std.Dev. 
>  individ (Intercept) 1.199e+00
>  year2   (Intercept) 5.117e-05
>  patchid (Intercept) 1.172e-05
> Number of obs: 320, groups:  individ, 74; year2, 5; patchid, 3
> Fixed Effects:
> (Intercept)   size2added  
>      2.0356       0.6343  
> optimizer (Nelder_Mead) convergence code: 0 (OK) ; 0 optimizer warnings; 1 lme4 warnings 
> 
> 
> 
> Observation model:
> Generalized linear mixed model fit by maximum likelihood (Laplace
>   Approximation) [glmerMod]
>  Family: binomial  ( logit )
> Formula: obsstatus3 ~ size2added + (1 | year2) + (1 | patchid) + (1 |  
>     individ)
>    Data: subdata
>      AIC      BIC   logLik deviance df.resid 
> 120.2567 138.8254 -55.1284 110.2567      298 
> Random effects:
>  Groups  Name        Std.Dev.
>  individ (Intercept) 0.0000  
>  year2   (Intercept) 0.8776  
>  patchid (Intercept) 0.0000  
> Number of obs: 303, groups:  individ, 70; year2, 5; patchid, 3
> Fixed Effects:
> (Intercept)   size2added  
>      2.4904       0.3134  
> optimizer (Nelder_Mead) convergence code: 0 (OK) ; 0 optimizer warnings; 1 lme4 warnings 
> 
> 
> 
> Size model:
> Formula:          size3added ~ (1 | year2) + (1 | patchid) + (1 | individ)
> Data: subdata
>       AIC       BIC    logLik  df.resid 
> 1009.9750 1028.2898 -499.9875       283 
> Random-effects (co)variances:
> 
> Conditional model:
>  Groups  Name        Std.Dev.
>  year2   (Intercept) 0.1133  
>  patchid (Intercept) 0.2118  
>  individ (Intercept) 1.0320  
> 
> Number of obs: 288 / Conditional model: year2, 5; patchid, 3; individ, 70
> 
> Dispersion parameter for truncated_nbinom2 family (): 2.73e+07 
> 
> Fixed Effects:
> 
> Conditional model:
> (Intercept)  
>       0.587  
> 
> 
> 
> Secondary size model:
> [1] 1
> 
> 
> 
> Tertiary size model:
> [1] 1
> 
> 
> 
> Reproductive status model:
> Generalized linear mixed model fit by maximum likelihood (Laplace
>   Approximation) [glmerMod]
>  Family: binomial  ( logit )
> Formula: repstatus3 ~ repstatus2 + size2added + (1 | year2) + (1 | patchid) +  
>     (1 | individ)
>    Data: subdata
>       AIC       BIC    logLik  deviance  df.resid 
>  333.4037  355.3815 -160.7019  321.4037       282 
> Random effects:
>  Groups  Name        Std.Dev.
>  individ (Intercept) 0.1776  
>  year2   (Intercept) 0.6636  
>  patchid (Intercept) 0.3501  
> Number of obs: 288, groups:  individ, 70; year2, 5; patchid, 3
> Fixed Effects:
> (Intercept)   repstatus2   size2added  
>     -1.3836       1.5543       0.1788  
> 
> 
> 
> Fecundity model:
> Formula:          
> feca2 ~ size2added + (1 | year2) + (1 | patchid) + (1 | individ)
> Zero inflation:         
> ~size2added + (1 | year2) + (1 | patchid) + (1 | individ)
> Data: subdata
>       AIC       BIC    logLik  df.resid 
>  251.4551  279.1619 -115.7275       108 
> Random-effects (co)variances:
> 
> Conditional model:
>  Groups  Name        Std.Dev. 
>  year2   (Intercept) 5.610e-01
>  patchid (Intercept) 2.283e-01
>  individ (Intercept) 4.630e-08
> 
> Zero-inflation model:
>  Groups  Name        Std.Dev. 
>  year2   (Intercept) 3.340e-07
>  patchid (Intercept) 1.724e-12
>  individ (Intercept) 2.057e-04
> 
> Number of obs: 118 / Conditional model: year2, 5; patchid, 3; individ, 51 / Zero-inflation model: year2, 5; patchid, 3; individ, 51
> 
> Fixed Effects:
> 
> Conditional model:
> (Intercept)   size2added  
>    -0.56501      0.06247  
> 
> Zero-inflation model:
> (Intercept)   size2added  
>       3.840       -1.588  
> 
> 
> Juvenile survival model:
> [1] 1
> 
> 
> 
> Juvenile observation model:
> [1] 1
> 
> 
> 
> Juvenile size model:
> [1] 1
> 
> 
> 
> Juvenile secondary size model:
> [1] 1
> 
> 
> 
> Juvenile tertiary size model:
> [1] 1
> 
> 
> 
> Juvenile reproduction model:
> [1] 1
> 
> 
> 
> Juvenile maturity model:
> [1] 1
> 
> 
> 
> 
> 
> Number of models in survival table: 16
> 
> Number of models in observation table: 16
> 
> Number of models in size table: 16
> 
> Number of models in secondary size table: 1
> 
> Number of models in tertiary size table: 1
> 
> Number of models in reproduction status table: 16
> 
> Number of models in fecundity table: 241
> 
> Number of models in juvenile survival table: 1
> 
> Number of models in juvenile observation table: 1
> 
> Number of models in juvenile size table: 1
> 
> Number of models in juvenile secondary size table: 1
> 
> Number of models in juvenile tertiary size table: 1
> 
> Number of models in juvenile reproduction table: 1
> 
> Number of models in juvenile maturity table: 1
> 
> 
> 
> 
> 
> General model parameter names (column 1), and 
> specific names used in these models (column 2): 
>                       parameter_names mainparams
> 1                              time t      year2
> 2                          individual    individ
> 3                               patch      patch
> 4                   alive in time t+1      surv3
> 5                observed in time t+1       obs3
> 6                   sizea in time t+1      size3
> 7                   sizeb in time t+1     sizeb3
> 8                   sizec in time t+1     sizec3
> 9     reproductive status in time t+1     repst3
> 10              fecundity in time t+1       fec3
> 11                fecundity in time t       fec2
> 12                    sizea in time t      size2
> 13                  sizea in time t-1      size1
> 14                    sizeb in time t     sizeb2
> 15                  sizeb in time t-1     sizeb1
> 16                    sizec in time t     sizec2
> 17                  sizec in time t-1     sizec1
> 18      reproductive status in time t     repst2
> 19    reproductive status in time t-1     repst1
> 20        maturity status in time t+1     matst3
> 21          maturity status in time t     matst2
> 22                      age in time t        age
> 23                  density in time t    density
> 24   individual covariate a in time t   indcova2
> 25 individual covariate a in time t-1   indcova1
> 26   individual covariate b in time t   indcovb2
> 27 individual covariate b in time t-1   indcovb1
> 28   individual covariate c in time t   indcovc2
> 29 individual covariate c in time t-1   indcovc1
> 30              stage group in time t     group2
> 31            stage group in time t-1     group1
> 
> 
> 
> 
> 
> Quality control:
> 
> Survival model estimated with 74 individuals and 320 individual transitions.
> Survival model accuracy is 0.947.
> Observation status model estimated with 70 individuals and 303 individual transitions.
> Observation status model accuracy is 0.95.
> Primary size model estimated with 70 individuals and 288 individual transitions.
> Primary size model R-squared is 0.82.
> Secondary size model not estimated.
> Tertiary size model not estimated.
> Reproductive status model estimated with 70 individuals and 288 individual transitions.
> Reproductive status model accuracy is 0.74.
> Fecundity model estimated with 51 individuals and 118 individual transitions.
> Fecundity model R-squared is 0.535.
> Juvenile survival model not estimated.
> Juvenile observation status model not estimated.
> Juvenile primary size model not estimated.
> Juvenile secondary size model not estimated.
> Juvenile tertiary size model not estimated.
> Juvenile reproductive status model not estimated.
> Juvenile maturity status model not estimated.

In the summary output above, there is of course a section labeled Quality control, but there is more quality control in the output than just this section. Let’s first explore some of the other parts of this output.

First, the best-fit model output is worth studying. This is the output of whichever package and function was used to estimate each model. In this case, where the models were mixed models, the output comes from the packages lme4 and glmmTMB. The most important quality control information here is the number of levels within each random factor, together with the variance or standard deviation of each random factor. If a best-fit model has 118 observations, and the numbers of levels of its random factors sum to around this number, then we likely cannot estimate the random factors properly. For example, the conditional model of the fecundity model has 118 observations, and the random factors account for 5 + 3 + 51 = 59 levels, so everything looks OK. While the models listed above look OK in general in this regard, the observation model does suggest some problems, since the standard deviations associated with individual and patch are equal to 0.0.
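
The comparison described above is simple arithmetic, and can be sketched in a few lines of base R. The helper check_random_levels() below is hypothetical (it is not a lefko3 function), and the input values are the ones quoted above for the conditional fecundity model: 118 observations, and 5 + 3 + 51 for the random factors.

```r
# Hypothetical helper, not part of lefko3: compares the number of
# observations in a model with the summed number of random-factor levels.
check_random_levels <- function(n_obs, n_levels) {
  total_levels <- sum(n_levels)
  list(total_levels = total_levels,
       ratio = total_levels / n_obs)
}

# Values quoted in the text for the conditional fecundity model
fec_check <- check_random_levels(n_obs = 118, n_levels = c(5, 3, 51))

fec_check$total_levels  # 59
fec_check$ratio         # 0.5
```

A ratio approaching 1 would mean that the random factors have nearly as many levels as there are observations, which is the situation to watch out for.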

Next, let’s look over the section labeled Quality control. Here we see a good deal of information that might be useful to us. First, we see the numbers of individuals and transitions (standardized dataset rows) used to develop each best-fit model. Generally, the higher the numbers of individuals and transitions, the better the overall quality of the model.

Second, the accuracy or R² of the best-fit model is shown. Accuracy is estimated as the number of predicted responses that are equal to the observed responses, divided by the total number of responses in the dataset used to parameterize the model, and is applied to situations in which the response is binomial or a count. R² is a simple R² and is applied to all continuous response models. Accuracy works best with binomial models, because when applied to a count, accuracy does not distinguish models in which the predicted response is very wrong from models in which the predicted response is still very close. So, it may be worth exploring the predictions in count models a bit. However, regardless of this, we would argue that the best models have accuracy or R² greater than or equal to 0.90. Lower values make accurate prediction virtually impossible, and may impede inference.
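
To see why accuracy can mislead with counts, consider the minimal base R sketch below. All data are made up for illustration, the function names accuracy() and r_squared() are our own, and the R-squared formula used (one minus the ratio of residual to total sum of squares) is an assumption about what a simple R-squared means here, since other definitions exist.

```r
# Made-up observed and predicted counts, for illustration only
observed  <- c(2, 0, 5, 3, 1)
pred_near <- c(2, 0, 4, 3, 2)    # the two misses are off by one
pred_far  <- c(2, 0, 12, 3, 9)   # the two misses are far off

# Accuracy: proportion of predictions exactly equal to the observed values
accuracy <- function(obs, pred) mean(pred == obs)

accuracy(observed, pred_near)  # 0.6
accuracy(observed, pred_far)   # 0.6, the same despite much worse predictions

# One common definition of R-squared for a continuous response
# (an assumption here, not necessarily the formula used by lefko3)
r_squared <- function(obs, pred) {
  1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
}

# Made-up continuous sizes
size_obs  <- c(1.0, 2.0, 3.0, 4.0, 5.0)
size_pred <- c(1.5, 2.0, 2.5, 4.0, 5.0)
r_squared(size_obs, size_pred)  # 0.95
```

Both sets of count predictions score the same accuracy, which is why a closer look at the predicted values themselves is worthwhile in count models.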

Let’s now look at quality control in MPMs themselves.

15.3 Quality control in MPMs and discretized IPMs

Let’s first load some MPMs. Here, we will load some ahistorical Cypripedium MPMs from Chapter 4.

seeds_per_fruit <- 5000
sl_mult <- 0.7

cypsupp2_raw <- supplemental(stage3 = c("SD", "P1", "P2", "P3", "SL", "SL", "D",
    "XSm", "Sm", "SD", "P1"),
  stage2 = c("SD", "SD", "P1", "P2", "P3", "SL", "SL", "SL", "SL", "rep", "rep"), 
  eststage3 = c(NA, NA, NA, NA, NA, NA, "D", "XSm", "Sm", NA, NA), 
  eststage2 = c(NA, NA, NA, NA, NA, NA, "XSm", "XSm", "XSm", NA, NA), 
  givenrate = c(0.08, 0.10, 0.10, 0.10, 0.05, 0.05, NA, NA, NA, NA, NA),
  multiplier = c(NA, NA, NA, NA, NA, NA, sl_mult, sl_mult, sl_mult,
    0.5 * seeds_per_fruit, 0.5 * seeds_per_fruit),
  type = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3),
  stageframe = cypframe_raw, historical = FALSE)

cypmatrix2rp <- rlefko2(data = cypraw_v2, stageframe = cypframe_raw,
  year = "all", patch = "all", stages = c("stage3", "stage2"),
  size = c("size3added", "size2added"), supplement = cypsupp2_raw, 
  yearcol = "year2", patchcol = "patchid", indivcol = "individ")

Now let’s take a look at a summary of this lefkoMat object.

summary(cypmatrix2rp)
> 
> This ahistorical lefkoMat object contains 15 matrices.
> 
> Each matrix is square with 11 rows and columns, and a total of 121 elements.
> A total of 266 survival transitions were estimated, with 17.733 per matrix.
> A total of 70 fecundity transitions were estimated, with 4.667 per matrix.
> This lefkoMat object covers 1 population, 3 patches, and 5 time steps.
> 
> The dataset contains a total of 74 unique individuals and 320 unique transitions.
> 
> Survival probability sum check (each matrix represented by column in order):
>          [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10] [,11] [,12]
> Min.    0.000 0.000 0.000 0.000 0.000 0.000 0.050 0.050 0.000 0.050 0.000 0.000
> 1st Qu. 0.075 0.025 0.075 0.025 0.075 0.075 0.140 0.140 0.100 0.140 0.100 0.100
> Median  0.180 0.100 0.180 0.100 0.180 0.180 0.909 0.778 0.686 0.857 0.750 0.575
> Mean    0.457 0.361 0.471 0.328 0.417 0.464 0.631 0.611 0.530 0.631 0.562 0.523
> 3rd Qu. 0.955 0.769 1.000 0.592 0.781 1.000 1.000 1.000 0.955 1.000 1.000 1.000
> Max.    1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
>         [,13] [,14] [,15]
> Min.    0.000 0.000 0.000
> 1st Qu. 0.075 0.075 0.100
> Median  0.180 0.180 0.750
> Mean    0.432 0.450 0.562
> 3rd Qu. 0.875 1.000 1.000
> Max.    1.000 1.000 1.000

There are a few key portions of this output to look at when assessing MPM quality. First, it is extremely useful to take a look at the number of individuals and transitions used to develop the MPM. The larger the numbers for both, the stronger the inference possible with an MPM. Here, we see that we have a small dataset, and so we need to bear that in mind when assessing our MPMs.

Next, let’s look at the number of estimated transitions per matrix. We notice that this ahistorical MPM has 17.733 + 4.667 = 22.4 estimated non-zero elements per matrix, but there are also 121 elements per matrix overall. So, our dataset is very sparse relative to our stageframe, the latter of which probably requires a larger dataset than we have access to.
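
The arithmetic behind this comparison can be reproduced directly from the totals in the summary output. The variable names below are our own, chosen for illustration.

```r
# Values taken from the summary output above
n_matrices <- 15
surv_total <- 266                # estimated survival transitions
fec_total <- 70                  # estimated fecundity transitions
elements_per_matrix <- 11 * 11   # 121

surv_per_matrix <- surv_total / n_matrices           # about 17.733
fec_per_matrix <- fec_total / n_matrices             # about 4.667
est_per_matrix <- surv_per_matrix + fec_per_matrix   # 22.4

# Proportion of matrix elements that were estimated
est_per_matrix / elements_per_matrix   # about 0.185
```

So, on average, fewer than one in five elements of each matrix was actually estimated.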

Finally, the survival probability sum check gives us the quartile summary of the column sums of the survival transition matrices in the MPM. This is very important, because the column sums give the survival probabilities of the stages in the stageframe (or the stage-pairs in a historical MPM, ages in a Leslie MPM, or age-stages in an age-by-stage MPM). So, the summaries should never show survival values greater than 1.0 or less than 0.0. If they do, then there is an error in the MPM construction, and the user should most definitely go back to the drawing board. Note that many of the matrices in the COMPADRE and COMADRE databases have this problem, and so will be flagged if imported into lefko3.
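
The logic of the column sum check can be sketched in base R with a toy matrix. The matrices and the helper check_survival_sums() below are made up for illustration and are not part of lefko3.

```r
# Toy 3-stage survival-transition matrices, made up for illustration.
# matrix() fills by column, so each group of three values is one column
# (a stage in time t), and each row is a stage in time t+1.
U_good <- matrix(c(0.0, 0.3, 0.0,
                   0.0, 0.4, 0.5,
                   0.0, 0.1, 0.8), nrow = 3)

U_bad <- matrix(c(0.0, 0.3, 0.0,
                  0.0, 0.4, 0.5,
                  0.0, 0.4, 0.8), nrow = 3)   # third column sums to 1.2

# Hypothetical helper: summarizes column sums and flags impossible values
check_survival_sums <- function(U, tol = 1e-8) {
  sums <- colSums(U)
  list(summary = summary(sums),
       bad_columns = which(sums > 1 + tol | sums < -tol))
}

check_survival_sums(U_good)$bad_columns   # integer(0), all sums within [0, 1]
check_survival_sums(U_bad)$bad_columns    # 3
```

If we assume that the survival transition matrices of a lefkoMat object are held as a list in its U element, then the same helper could be applied to each of them with something like lapply(cypmatrix2rp$U, check_survival_sums). Treat that as an untested sketch, particularly if the matrices are stored in a sparse format.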

We plan to add further quality control protocols to package lefko3, and will update this manual as we do. Stay tuned!

15.4 Points to remember

  1. Quality control in standardized datasets can be assessed with the summary_hfv() function.
  2. Errors in the development of life history models can be assessed when datasets are standardized via the functions verticalize3() and historicalize3().
  3. Quality control in vital rate models can be assessed by looking at the numbers of observations utilized in the models and their random factors, as well as by exploring the accuracy or R² of each model.
  4. MPM quality can be explored with the summary() function applied to a lefkoMat object of interest.