Chapter 15 Further Issues IV: Quality Control
When someone is honestly 55% right, that’s very good and there’s no use wrangling. And if someone is 60% right, it’s wonderful, it’s great luck, and let him thank God. But what’s to be said about 75% right? Wise people say this is suspicious. Well, and what about 100% right? Whoever says he’s 100% right is a fanatic, a thug, and the worst kind of rascal.
Stand up You’ve got to manage I won’t sympathize any more. And if you complain once more You’ll meet an army of me.
Quality control is essential to making good inferences from matrix projection analysis. Package lefko3 was made with quality control as a top priority. In this chapter, we will look at a number of the quality control tools and options included in this package.
15.1 Quality control in life history models and vertical datasets
Vertical datasets are standardized using the functions verticalize3() and historicalize3(). These functions also offer quality control options, particularly to assess whether the standardized datasets and the life history models match properly.
Let’s try an example of using the quality control features of these functions. In the code below, we set up a stageframe for the Cypripedium dataset. However, we have deliberately assigned a smaller bin half-width for the Small adult class. This will cause function verticalize3() to fail in stage assignment for some portion of the data. Let’s see this in action.
data(cypdata)
sizevector <- c(0, 0, 0, 0, 0, 0, 1, 3, 6, 11, 19.5)
stagevector <- c("SD", "P1", "P2", "P3", "SL", "D", "XSm", "Sm", "Md", "Lg",
"XLg")
repvector <- c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1)
obsvector <- c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1)
matvector <- c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1)
immvector <- c(0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
propvector <- c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
indataset <- c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1)
binvec <- c(0, 0, 0, 0, 0, 0.5, 0.5, 0.5, 1.5, 3.5, 5) # 8th entry originally 1.5
comments <- c("Dormant seed", "1st yr protocorm", "2nd yr protocorm",
"3rd yr protocorm", "Seedling", "Dormant adult",
"Extra small adult (1 shoot)", "Small adult (2-4 shoots)",
"Medium adult (5-7 shoots)", "Large adult (8-14 shoots)",
"Extra large adult (>14 shoots)")
cypframe_raw <- sf_create(sizes = sizevector, stagenames = stagevector,
repstatus = repvector, obsstatus = obsvector, matstatus = matvector,
propstatus = propvector, immstatus = immvector, indataset = indataset,
binhalfwidth = binvec, comments = comments)
cypraw_v1 <- verticalize3(data = cypdata, noyears = 6, firstyear = 2004,
patchidcol = "patch", individcol = "plantid", blocksize = 4,
sizeacol = "Inf2.04", sizebcol = "Inf.04", sizeccol = "Veg.04",
repstracol = "Inf.04", repstrbcol = "Inf2.04", fecacol = "Pod.04",
stageassign = cypframe_raw, stagesize = "sizeadded", NAas0 = TRUE,
NRasRep = TRUE, age_offset = 4)
> Warning: Some stages occurring in the dataset do not match any characteristics
> in the input stageframe.
> [The same warning is repeated 86 more times.]
The output shows quite a few repeated warnings about some stages in the dataset not matching the input life history model as programmed in the stageframe. Let’s take a closer look using the summary_hfv() function. The output is long because the data frame of problem rows is quite large, so R truncates it.
summary_hfv(cypraw_v1)
>
> This hfv dataset contains 320 rows, 57 variables, 1 population,
> 3 patches, 74 individuals, and 5 time steps.
> Problems in stage assignment identified in rows:
>
> [1] 2 5 9 10 11 12 14 16 19 21 24 26 27 28 29 33 34 38
> [19] 39 40 42 46 48 49 52 53 54 59 64 66 69 73 74 75 76 78
> [37] 80 83 85 88 90 91 92 93 94 98 99 101 102 103 104 105 106 107
> [55] 111 112 113 114 115 116 117 118 119 120 125 129 131 134 135 136 140 141
> [73] 142 143 145 148 149 150 152 155 157 158 159 160 161 166 168 169 170 171
> [91] 172 173 174 178 179 180 181 182 183 184 185 188 190 194 195 199 200 204
> [109] 205 209 212 213 216 220 221 222 223 224 229 232 233 234 236 237 239 241
> [127] 243 244 245 246 247 248 250 252 256 257 259 260 261 262 265 266 270 271
> [145] 272 274 275 276 283 285 286 291 294 295 296 298 300 303 304 306 309 310
> [163] 312 316 317 318 319 320
> rowid popid patchid individ year2
> Min. : 1.00 Length:320 A: 93 Min. : 164.0 Min. :2004
> 1st Qu.:21.00 Class :character B:154 1st Qu.: 391.0 1st Qu.:2005
> Median :37.50 Mode :character C: 73 Median : 453.0 Median :2006
> Mean :38.45 Mean : 651.5 Mean :2006
> 3rd Qu.:56.00 3rd Qu.: 476.0 3rd Qu.:2007
> Max. :77.00 Max. :1560.0 Max. :2008
> firstseen lastseen obsage obslifespan
> Min. :2004 Min. :2004 Min. :5.000 Min. :0.000
> 1st Qu.:2004 1st Qu.:2009 1st Qu.:6.000 1st Qu.:5.000
> Median :2004 Median :2009 Median :7.000 Median :5.000
> Mean :2004 Mean :2009 Mean :6.853 Mean :4.556
> 3rd Qu.:2004 3rd Qu.:2009 3rd Qu.:8.000 3rd Qu.:5.000
> Max. :2008 Max. :2009 Max. :9.000 Max. :5.000
> sizea1 sizeb1 sizec1 size1added
> Min. :0.000000 Min. : 0.0000 Min. : 0.0 Min. : 0.000
> 1st Qu.:0.000000 1st Qu.: 0.0000 1st Qu.: 0.0 1st Qu.: 0.000
> Median :0.000000 Median : 0.0000 Median : 1.0 Median : 2.000
> Mean :0.009375 Mean : 0.7469 Mean : 1.9 Mean : 2.656
> 3rd Qu.:0.000000 3rd Qu.: 1.0000 3rd Qu.: 3.0 3rd Qu.: 4.000
> Max. :1.000000 Max. :18.0000 Max. :13.0 Max. :21.000
> repstra1 repstrb1 repstr1added feca1
> Min. : 0.0000 Min. :0.000000 Min. : 0.0000 Min. :0.0000
> 1st Qu.: 0.0000 1st Qu.:0.000000 1st Qu.: 0.0000 1st Qu.:0.0000
> Median : 0.0000 Median :0.000000 Median : 0.0000 Median :0.0000
> Mean : 0.7469 Mean :0.009375 Mean : 0.7562 Mean :0.2656
> 3rd Qu.: 1.0000 3rd Qu.:0.000000 3rd Qu.: 1.0000 3rd Qu.:0.0000
> Max. :18.0000 Max. :1.000000 Max. :18.0000 Max. :7.0000
> fec1added obsstatus1 repstatus1 fecstatus1
> Min. :0.0000 Min. :0.0000 Min. :0.0000 Min. :0.0000
> 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000
> Median :0.0000 Median :1.0000 Median :0.0000 Median :0.0000
> Mean :0.2656 Mean :0.7469 Mean :0.2875 Mean :0.1344
> 3rd Qu.:0.0000 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:0.0000
> Max. :7.0000 Max. :1.0000 Max. :1.0000 Max. :1.0000
> matstatus1 alive1 stage1 stage1index
> Min. :0.0000 Min. :0.0000 Length:320 Min. : 0.000
> 1st Qu.:0.0000 1st Qu.:1.0000 Class :character 1st Qu.: 0.000
> Median :1.0000 Median :1.0000 Mode :character Median : 7.000
> Mean :0.5469 Mean :0.7688 Mean : 4.369
> 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.: 8.000
> Max. :1.0000 Max. :1.0000 Max. :11.000
> sizea2 sizeb2 sizec2 size2added
> Min. :0.000000 Min. : 0.0000 Min. : 0.000 Min. : 0.000
> 1st Qu.:0.000000 1st Qu.: 0.0000 1st Qu.: 1.000 1st Qu.: 1.000
> Median :0.000000 Median : 0.0000 Median : 2.000 Median : 2.000
> Mean :0.009375 Mean : 0.8969 Mean : 2.416 Mean : 3.322
> 3rd Qu.:0.000000 3rd Qu.: 1.0000 3rd Qu.: 3.000 3rd Qu.: 4.000
> Max. :1.000000 Max. :18.0000 Max. :13.000 Max. :24.000
> repstra2 repstrb2 repstr2added feca2
> Min. : 0.0000 Min. :0.000000 Min. : 0.0000 Min. :0.0000
> 1st Qu.: 0.0000 1st Qu.:0.000000 1st Qu.: 0.0000 1st Qu.:0.0000
> Median : 0.0000 Median :0.000000 Median : 0.0000 Median :0.0000
> Mean : 0.8969 Mean :0.009375 Mean : 0.9062 Mean :0.2906
> 3rd Qu.: 1.0000 3rd Qu.:0.000000 3rd Qu.: 1.0000 3rd Qu.:0.0000
> Max. :18.0000 Max. :1.000000 Max. :18.0000 Max. :7.0000
> fec2added obsstatus2 repstatus2 fecstatus2
> Min. :0.0000 Min. :0.0000 Min. :0.0000 Min. :0.0000
> 1st Qu.:0.0000 1st Qu.:1.0000 1st Qu.:0.0000 1st Qu.:0.0000
> Median :0.0000 Median :1.0000 Median :0.0000 Median :0.0000
> Mean :0.2906 Mean :0.9531 Mean :0.3688 Mean :0.1562
> 3rd Qu.:0.0000 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:0.0000
> Max. :7.0000 Max. :1.0000 Max. :1.0000 Max. :1.0000
> matstatus2 alive2 stage2 stage2index
> Min. :1 Min. :1 Length:320 Min. : 0.000
> 1st Qu.:1 1st Qu.:1 Class :character 1st Qu.: 0.000
> Median :1 Median :1 Mode :character Median : 7.000
> Mean :1 Mean :1 Mean : 5.769
> 3rd Qu.:1 3rd Qu.:1 3rd Qu.: 8.000
> Max. :1 Max. :1 Max. :11.000
> sizea3 sizeb3 sizec3 size3added
> Min. :0.000000 Min. : 0.000 Min. : 0.000 Min. : 0.000
> 1st Qu.:0.000000 1st Qu.: 0.000 1st Qu.: 1.000 1st Qu.: 1.000
> Median :0.000000 Median : 0.000 Median : 1.000 Median : 2.000
> Mean :0.009375 Mean : 1.069 Mean : 2.209 Mean : 3.288
> 3rd Qu.:0.000000 3rd Qu.: 1.000 3rd Qu.: 3.000 3rd Qu.: 4.000
> Max. :1.000000 Max. :18.000 Max. :13.000 Max. :24.000
> repstra3 repstrb3 repstr3added feca3
> Min. : 0.000 Min. :0.000000 Min. : 0.000 Min. :0.0000
> 1st Qu.: 0.000 1st Qu.:0.000000 1st Qu.: 0.000 1st Qu.:0.0000
> Median : 0.000 Median :0.000000 Median : 0.000 Median :0.0000
> Mean : 1.069 Mean :0.009375 Mean : 1.078 Mean :0.4562
> 3rd Qu.: 1.000 3rd Qu.:0.000000 3rd Qu.: 1.000 3rd Qu.:0.0000
> Max. :18.000 Max. :1.000000 Max. :18.000 Max. :8.0000
> fec3added obsstatus3 repstatus3 fecstatus3 matstatus3
> Min. :0.0000 Min. :0.0 Min. :0.0 Min. :0.0000 Min. :1
> 1st Qu.:0.0000 1st Qu.:1.0 1st Qu.:0.0 1st Qu.:0.0000 1st Qu.:1
> Median :0.0000 Median :1.0 Median :0.0 Median :0.0000 Median :1
> Mean :0.4562 Mean :0.9 Mean :0.4 Mean :0.2219 Mean :1
> 3rd Qu.:0.0000 3rd Qu.:1.0 3rd Qu.:1.0 3rd Qu.:0.0000 3rd Qu.:1
> Max. :8.0000 Max. :1.0 Max. :1.0 Max. :1.0000 Max. :1
> alive3 stage3 stage3index
> Min. :0.0000 Length:320 Min. : 0.000
> 1st Qu.:1.0000 Class :character 1st Qu.: 0.000
> Median :1.0000 Mode :character Median : 7.000
> Mean :0.9469 Mean : 5.419
> 3rd Qu.:1.0000 3rd Qu.: 8.000
> Max. :1.0000 Max. :11.000
> rowid popid patchid individ year2 firstseen lastseen obsage obslifespan
> 2 2 A 165 2004 2004 2009 5 5
> 5 5 A 243 2004 2004 2009 5 5
> 9 9 A 251 2004 2004 2009 5 5
> 10 10 A 252 2004 2004 2009 5 5
> 11 11 A 253 2004 2004 2009 5 5
> 12 12 A 255 2004 2004 2008 5 4
> 14 15 A 259 2004 2004 2009 5 5
> 16 19 A 264 2004 2004 2007 5 3
> 19 22 A 393 2004 2004 2009 5 5
> 21 24 B 431 2004 2004 2009 5 5
> 24 27 B 437 2004 2004 2009 5 5
> 26 30 B 441 2004 2004 2009 5 5
> 27 31 B 442 2004 2004 2009 5 5
> 28 32 B 443 2004 2004 2009 5 5
> 29 33 B 445 2004 2004 2009 5 5
> 33 37 B 452 2004 2004 2009 5 5
> 34 38 B 454 2004 2004 2009 5 5
> sizea1 sizeb1 sizec1 size1added repstra1 repstrb1 repstr1added feca1
> 2 0 0 0 0 0 0 0 0
> 5 0 0 0 0 0 0 0 0
> 9 0 0 0 0 0 0 0 0
> 10 0 0 0 0 0 0 0 0
> 11 0 0 0 0 0 0 0 0
> 12 0 0 0 0 0 0 0 0
> 14 0 0 0 0 0 0 0 0
> 16 0 0 0 0 0 0 0 0
> 19 0 0 0 0 0 0 0 0
> 21 0 0 0 0 0 0 0 0
> 24 0 0 0 0 0 0 0 0
> 26 0 0 0 0 0 0 0 0
> 27 0 0 0 0 0 0 0 0
> 28 0 0 0 0 0 0 0 0
> 29 0 0 0 0 0 0 0 0
> 33 0 0 0 0 0 0 0 0
> 34 0 0 0 0 0 0 0 0
> fec1added obsstatus1 repstatus1 fecstatus1 matstatus1 alive1 stage1
> 2 0 0 0 0 0 0 NotAlive
> 5 0 0 0 0 0 0 NotAlive
> 9 0 0 0 0 0 0 NotAlive
> 10 0 0 0 0 0 0 NotAlive
> 11 0 0 0 0 0 0 NotAlive
> 12 0 0 0 0 0 0 NotAlive
> 14 0 0 0 0 0 0 NotAlive
> 16 0 0 0 0 0 0 NotAlive
> 19 0 0 0 0 0 0 NotAlive
> 21 0 0 0 0 0 0 NotAlive
> 24 0 0 0 0 0 0 NotAlive
> 26 0 0 0 0 0 0 NotAlive
> 27 0 0 0 0 0 0 NotAlive
> 28 0 0 0 0 0 0 NotAlive
> 29 0 0 0 0 0 0 NotAlive
> 33 0 0 0 0 0 0 NotAlive
> 34 0 0 0 0 0 0 NotAlive
> stage1index sizea2 sizeb2 sizec2 size2added repstra2 repstrb2 repstr2added
> 2 0 0 2 1 3 2 0 2
> 5 0 0 0 5 5 0 0 0
> 9 0 0 0 2 2 0 0 0
> 10 0 0 0 1 1 0 0 0
> 11 0 0 0 1 1 0 0 0
> 12 0 0 0 8 8 0 0 0
> 14 0 0 0 2 2 0 0 0
> 16 0 0 0 2 2 0 0 0
> 19 0 0 2 3 5 2 0 2
> 21 0 0 0 6 6 0 0 0
> 24 0 0 1 3 4 1 0 1
> 26 0 0 0 4 4 0 0 0
> 27 0 0 0 4 4 0 0 0
> 28 0 0 0 4 4 0 0 0
> 29 0 0 0 2 2 0 0 0
> 33 0 0 1 3 4 1 0 1
> 34 0 0 1 2 3 1 0 1
> feca2 fec2added obsstatus2 repstatus2 fecstatus2 matstatus2 alive2 stage2
> 2 1 1 1 1 1 1 1 Sm
> 5 0 0 1 0 0 1 1 Md
> 9 0 0 1 0 0 1 1 NotAlive
> 10 0 0 1 0 0 1 1 XSm
> 11 0 0 1 0 0 1 1 XSm
> 12 0 0 1 0 0 1 1 Lg
> 14 0 0 1 0 0 1 1 NotAlive
> 16 0 0 1 0 0 1 1 NotAlive
> 19 2 2 1 1 1 1 1 Md
> 21 0 0 1 0 0 1 1 Md
> 24 1 1 1 1 1 1 1 NotAlive
> 26 0 0 1 0 0 1 1 NotAlive
> 27 0 0 1 0 0 1 1 NotAlive
> 28 0 0 1 0 0 1 1 NotAlive
> 29 0 0 1 0 0 1 1 NotAlive
> 33 0 0 1 1 0 1 1 NotAlive
> 34 0 0 1 1 0 1 1 Sm
> stage2index sizea3 sizeb3 sizec3 size3added repstra3 repstrb3 repstr3added
> 2 8 0 2 0 2 2 0 2
> 5 9 0 0 2 2 0 0 0
> 9 0 0 0 2 2 0 0 0
> 10 7 0 2 0 2 2 0 2
> 11 7 0 1 1 2 1 0 1
> 12 10 0 1 3 4 1 0 1
> 14 0 0 1 2 3 1 0 1
> 16 0 0 1 0 1 1 0 1
> 19 9 0 3 1 4 3 0 3
> 21 9 0 0 4 4 0 0 0
> 24 0 0 3 1 4 3 0 3
> 26 0 0 2 4 6 2 0 2
> 27 0 0 0 2 2 0 0 0
> 28 0 0 0 2 2 0 0 0
> 29 0 0 0 1 1 0 0 0
> 33 0 0 5 0 5 5 0 5
> 34 8 0 0 2 2 0 0 0
> feca3 fec3added obsstatus3 repstatus3 fecstatus3 matstatus3 alive3 stage3
> 2 0 0 1 1 0 1 1 NoMatch
> 5 0 0 1 0 0 1 1 NoMatch
> 9 0 0 1 0 0 1 1 NoMatch
> 10 0 0 1 1 0 1 1 NoMatch
> 11 0 0 1 1 0 1 1 NoMatch
> 12 1 1 1 1 1 1 1 NoMatch
> 14 0 0 1 1 0 1 1 Sm
> 16 0 0 1 1 0 1 1 XSm
> 19 1 1 1 1 1 1 1 NoMatch
> 21 0 0 1 0 0 1 1 NoMatch
> 24 2 2 1 1 1 1 1 NoMatch
> 26 2 2 1 1 1 1 1 Md
> 27 0 0 1 0 0 1 1 NoMatch
> 28 0 0 1 0 0 1 1 NoMatch
> 29 0 0 1 0 0 1 1 XSm
> 33 2 2 1 1 1 1 1 Md
> 34 0 0 1 0 0 1 1 NoMatch
> stage3index
> 2 0
> 5 0
> 9 0
> 10 0
> 11 0
> 12 0
> 14 8
> 16 7
> 19 0
> 21 0
> 24 0
> 26 9
> 27 0
> 28 0
> 29 7
> 33 9
> 34 0
> [ reached 'max' / getOption("max.print") -- omitted 151 rows ]
Near the top of the output above, under the line "Problems in stage assignment identified in rows:", we see the rows in the standardized dataset that have problems in stage assignment. We can use this list to look at some of these rows and try to determine where our mistake is, as below.
cypraw_v1[c(2, 5, 9, 10, 11),]
> rowid popid patchid individ year2 firstseen lastseen obsage obslifespan
> 2 2 A 165 2004 2004 2009 5 5
> 5 5 A 243 2004 2004 2009 5 5
> 9 9 A 251 2004 2004 2009 5 5
> 10 10 A 252 2004 2004 2009 5 5
> 11 11 A 253 2004 2004 2009 5 5
> sizea1 sizeb1 sizec1 size1added repstra1 repstrb1 repstr1added feca1
> 2 0 0 0 0 0 0 0 0
> 5 0 0 0 0 0 0 0 0
> 9 0 0 0 0 0 0 0 0
> 10 0 0 0 0 0 0 0 0
> 11 0 0 0 0 0 0 0 0
> fec1added obsstatus1 repstatus1 fecstatus1 matstatus1 alive1 stage1
> 2 0 0 0 0 0 0 NotAlive
> 5 0 0 0 0 0 0 NotAlive
> 9 0 0 0 0 0 0 NotAlive
> 10 0 0 0 0 0 0 NotAlive
> 11 0 0 0 0 0 0 NotAlive
> stage1index sizea2 sizeb2 sizec2 size2added repstra2 repstrb2 repstr2added
> 2 0 0 2 1 3 2 0 2
> 5 0 0 0 5 5 0 0 0
> 9 0 0 0 2 2 0 0 0
> 10 0 0 0 1 1 0 0 0
> 11 0 0 0 1 1 0 0 0
> feca2 fec2added obsstatus2 repstatus2 fecstatus2 matstatus2 alive2 stage2
> 2 1 1 1 1 1 1 1 Sm
> 5 0 0 1 0 0 1 1 Md
> 9 0 0 1 0 0 1 1 NotAlive
> 10 0 0 1 0 0 1 1 XSm
> 11 0 0 1 0 0 1 1 XSm
> stage2index sizea3 sizeb3 sizec3 size3added repstra3 repstrb3 repstr3added
> 2 8 0 2 0 2 2 0 2
> 5 9 0 0 2 2 0 0 0
> 9 0 0 0 2 2 0 0 0
> 10 7 0 2 0 2 2 0 2
> 11 7 0 1 1 2 1 0 1
> feca3 fec3added obsstatus3 repstatus3 fecstatus3 matstatus3 alive3 stage3
> 2 0 0 1 1 0 1 1 NoMatch
> 5 0 0 1 0 0 1 1 NoMatch
> 9 0 0 1 0 0 1 1 NoMatch
> 10 0 0 1 1 0 1 1 NoMatch
> 11 0 0 1 1 0 1 1 NoMatch
> stage3index
> 2 0
> 5 0
> 9 0
> 10 0
> 11 0
In the output above, we find that the five rows we have chosen to investigate show NoMatch under the stage3 column, meaning that R could not assign stages in time t+1 here. In these five cases, the individuals were observable and mature, and some were reproductive while others were not. Size seems to be the common feature: it is 2 in all five rows (see the size3added column).
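One way to confirm that size is the common feature is to tabulate size3added among the unmatched rows. The sketch below is only illustrative: it uses a small hand-built data frame that copies a few rows from the output above (the object name toy_hfv is hypothetical and stands in for cypraw_v1), so that it runs in base R without lefko3.

```r
# Toy stand-in for a few rows of cypraw_v1, copied from the output above
# (data frame rows 2, 5, 9, 10, 11, 14, and 16). The name toy_hfv is
# hypothetical and is not an object created elsewhere in this chapter.
toy_hfv <- data.frame(
  row        = c(2, 5, 9, 10, 11, 14, 16),
  size3added = c(2, 2, 2, 2, 2, 3, 1),
  stage3     = c("NoMatch", "NoMatch", "NoMatch", "NoMatch", "NoMatch",
                 "Sm", "XSm"),
  stringsAsFactors = FALSE
)

# Keep only the rows whose stage in time t+1 could not be assigned
unmatched <- toy_hfv[toy_hfv$stage3 == "NoMatch", ]

# Tabulate the sizes of the unmatched rows; a single repeated value
# suggests that this size falls outside every size bin
size_counts <- table(unmatched$size3added)
print(size_counts)
```

In this toy subset, every unmatched row has a size of 2, while the rows with sizes 1 and 3 were assigned to stages XSm and Sm, respectively.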
Let’s see if we can get to the bottom of the problem by looking at the stageframe.
cypframe_raw
> stage size size_b size_c min_age max_age repstatus obsstatus propstatus
> 1 SD 0.0 NA NA NA NA 0 0 1
> 2 P1 0.0 NA NA NA NA 0 0 0
> 3 P2 0.0 NA NA NA NA 0 0 0
> 4 P3 0.0 NA NA NA NA 0 0 0
> 5 SL 0.0 NA NA NA NA 0 0 0
> 6 D 0.0 NA NA NA NA 0 0 0
> 7 XSm 1.0 NA NA NA NA 1 1 0
> 8 Sm 3.0 NA NA NA NA 1 1 0
> 9 Md 6.0 NA NA NA NA 1 1 0
> 10 Lg 11.0 NA NA NA NA 1 1 0
> 11 XLg 19.5 NA NA NA NA 1 1 0
> immstatus matstatus indataset binhalfwidth_raw sizebin_min sizebin_max
> 1 0 0 0 0.0 0.0 0.0
> 2 1 0 0 0.0 0.0 0.0
> 3 1 0 0 0.0 0.0 0.0
> 4 1 0 0 0.0 0.0 0.0
> 5 1 0 0 0.0 0.0 0.0
> 6 0 1 1 0.5 -0.5 0.5
> 7 0 1 1 0.5 0.5 1.5
> 8 0 1 1 0.5 2.5 3.5
> 9 0 1 1 1.5 4.5 7.5
> 10 0 1 1 3.5 7.5 14.5
> 11 0 1 1 5.0 14.5 24.5
> sizebin_center sizebin_width binhalfwidthb_raw sizebinb_min sizebinb_max
> 1 0.0 0 NA NA NA
> 2 0.0 0 NA NA NA
> 3 0.0 0 NA NA NA
> 4 0.0 0 NA NA NA
> 5 0.0 0 NA NA NA
> 6 0.0 1 NA NA NA
> 7 1.0 1 NA NA NA
> 8 3.0 1 NA NA NA
> 9 6.0 3 NA NA NA
> 10 11.0 7 NA NA NA
> 11 19.5 10 NA NA NA
> sizebinb_center sizebinb_width binhalfwidthc_raw sizebinc_min sizebinc_max
> 1 NA NA NA NA NA
> 2 NA NA NA NA NA
> 3 NA NA NA NA NA
> 4 NA NA NA NA NA
> 5 NA NA NA NA NA
> 6 NA NA NA NA NA
> 7 NA NA NA NA NA
> 8 NA NA NA NA NA
> 9 NA NA NA NA NA
> 10 NA NA NA NA NA
> 11 NA NA NA NA NA
> sizebinc_center sizebinc_width group comments
> 1 NA NA 0 Dormant seed
> 2 NA NA 0 1st yr protocorm
> 3 NA NA 0 2nd yr protocorm
> 4 NA NA 0 3rd yr protocorm
> 5 NA NA 0 Seedling
> 6 NA NA 0 Dormant adult
> 7 NA NA 0 Extra small adult (1 shoot)
> 8 NA NA 0 Small adult (2-4 shoots)
> 9 NA NA 0 Medium adult (5-7 shoots)
> 10 NA NA 0 Large adult (8-14 shoots)
> 11 NA NA 0 Extra large adult (>14 shoots)
The key to finding the problem is to identify which sizes are missing from the size bins. To do so, we can look at the sizebin_min and sizebin_max columns. Among the adult stages, stage XSm ranges in size from 0.5 to 1.5, and the next bigger stage, Sm, ranges in size from 2.5 to 3.5, so a size of 2 is not included in any stage. Looking further, we also see that stage Md ranges in size from 4.5 to 7.5, meaning that a size of 4 is also not included in any stage. With this knowledge in hand, we can revise our stageframe to widen the bin of stage Sm by an extra 2 sprouts (1 on each side), so that it ranges from 1.5 to 4.5, as below.
binvec <- c(0, 0, 0, 0, 0, 0.5, 0.5, 1.5, 1.5, 3.5, 5)
cypframe_raw <- sf_create(sizes = sizevector, stagenames = stagevector,
repstatus = repvector, obsstatus = obsvector, matstatus = matvector,
propstatus = propvector, immstatus = immvector, indataset = indataset,
binhalfwidth = binvec, comments = comments)
cypraw_v2 <- verticalize3(data = cypdata, noyears = 6, firstyear = 2004,
patchidcol = "patch", individcol = "plantid", blocksize = 4,
sizeacol = "Inf2.04", sizebcol = "Inf.04", sizeccol = "Veg.04",
repstracol = "Inf.04", repstrbcol = "Inf2.04", fecacol = "Pod.04",
stageassign = cypframe_raw, stagesize = "sizeadded", NAas0 = TRUE,
NRasRep = TRUE, age_offset = 4)
summary_hfv(cypraw_v2)
>
> This hfv dataset contains 320 rows, 57 variables, 1 population,
> 3 patches, 74 individuals, and 5 time steps.
> rowid popid patchid individ year2
> Min. : 1.00 Length:320 A: 93 Min. : 164.0 Min. :2004
> 1st Qu.:21.00 Class :character B:154 1st Qu.: 391.0 1st Qu.:2005
> Median :37.50 Mode :character C: 73 Median : 453.0 Median :2006
> Mean :38.45 Mean : 651.5 Mean :2006
> 3rd Qu.:56.00 3rd Qu.: 476.0 3rd Qu.:2007
> Max. :77.00 Max. :1560.0 Max. :2008
> firstseen lastseen obsage obslifespan
> Min. :2004 Min. :2004 Min. :5.000 Min. :0.000
> 1st Qu.:2004 1st Qu.:2009 1st Qu.:6.000 1st Qu.:5.000
> Median :2004 Median :2009 Median :7.000 Median :5.000
> Mean :2004 Mean :2009 Mean :6.853 Mean :4.556
> 3rd Qu.:2004 3rd Qu.:2009 3rd Qu.:8.000 3rd Qu.:5.000
> Max. :2008 Max. :2009 Max. :9.000 Max. :5.000
> sizea1 sizeb1 sizec1 size1added
> Min. :0.000000 Min. : 0.0000 Min. : 0.0 Min. : 0.000
> 1st Qu.:0.000000 1st Qu.: 0.0000 1st Qu.: 0.0 1st Qu.: 0.000
> Median :0.000000 Median : 0.0000 Median : 1.0 Median : 2.000
> Mean :0.009375 Mean : 0.7469 Mean : 1.9 Mean : 2.656
> 3rd Qu.:0.000000 3rd Qu.: 1.0000 3rd Qu.: 3.0 3rd Qu.: 4.000
> Max. :1.000000 Max. :18.0000 Max. :13.0 Max. :21.000
> repstra1 repstrb1 repstr1added feca1
> Min. : 0.0000 Min. :0.000000 Min. : 0.0000 Min. :0.0000
> 1st Qu.: 0.0000 1st Qu.:0.000000 1st Qu.: 0.0000 1st Qu.:0.0000
> Median : 0.0000 Median :0.000000 Median : 0.0000 Median :0.0000
> Mean : 0.7469 Mean :0.009375 Mean : 0.7562 Mean :0.2656
> 3rd Qu.: 1.0000 3rd Qu.:0.000000 3rd Qu.: 1.0000 3rd Qu.:0.0000
> Max. :18.0000 Max. :1.000000 Max. :18.0000 Max. :7.0000
> fec1added obsstatus1 repstatus1 fecstatus1
> Min. :0.0000 Min. :0.0000 Min. :0.0000 Min. :0.0000
> 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000
> Median :0.0000 Median :1.0000 Median :0.0000 Median :0.0000
> Mean :0.2656 Mean :0.7469 Mean :0.2875 Mean :0.1344
> 3rd Qu.:0.0000 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:0.0000
> Max. :7.0000 Max. :1.0000 Max. :1.0000 Max. :1.0000
> matstatus1 alive1 stage1 stage1index
> Min. :0.0000 Min. :0.0000 Length:320 Min. : 0.000
> 1st Qu.:1.0000 1st Qu.:1.0000 Class :character 1st Qu.: 6.000
> Median :1.0000 Median :1.0000 Mode :character Median : 8.000
> Mean :0.7688 Mean :0.7688 Mean : 6.144
> 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.: 8.000
> Max. :1.0000 Max. :1.0000 Max. :11.000
> sizea2 sizeb2 sizec2 size2added
> Min. :0.000000 Min. : 0.0000 Min. : 0.000 Min. : 0.000
> 1st Qu.:0.000000 1st Qu.: 0.0000 1st Qu.: 1.000 1st Qu.: 1.000
> Median :0.000000 Median : 0.0000 Median : 2.000 Median : 2.000
> Mean :0.009375 Mean : 0.8969 Mean : 2.416 Mean : 3.322
> 3rd Qu.:0.000000 3rd Qu.: 1.0000 3rd Qu.: 3.000 3rd Qu.: 4.000
> Max. :1.000000 Max. :18.0000 Max. :13.000 Max. :24.000
> repstra2 repstrb2 repstr2added feca2
> Min. : 0.0000 Min. :0.000000 Min. : 0.0000 Min. :0.0000
> 1st Qu.: 0.0000 1st Qu.:0.000000 1st Qu.: 0.0000 1st Qu.:0.0000
> Median : 0.0000 Median :0.000000 Median : 0.0000 Median :0.0000
> Mean : 0.8969 Mean :0.009375 Mean : 0.9062 Mean :0.2906
> 3rd Qu.: 1.0000 3rd Qu.:0.000000 3rd Qu.: 1.0000 3rd Qu.:0.0000
> Max. :18.0000 Max. :1.000000 Max. :18.0000 Max. :7.0000
> fec2added obsstatus2 repstatus2 fecstatus2
> Min. :0.0000 Min. :0.0000 Min. :0.0000 Min. :0.0000
> 1st Qu.:0.0000 1st Qu.:1.0000 1st Qu.:0.0000 1st Qu.:0.0000
> Median :0.0000 Median :1.0000 Median :0.0000 Median :0.0000
> Mean :0.2906 Mean :0.9531 Mean :0.3688 Mean :0.1562
> 3rd Qu.:0.0000 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:0.0000
> Max. :7.0000 Max. :1.0000 Max. :1.0000 Max. :1.0000
> matstatus2 alive2 stage2 stage2index
> Min. :1 Min. :1 Length:320 Min. : 6.000
> 1st Qu.:1 1st Qu.:1 Class :character 1st Qu.: 7.000
> Median :1 Median :1 Mode :character Median : 8.000
> Mean :1 Mean :1 Mean : 7.919
> 3rd Qu.:1 3rd Qu.:1 3rd Qu.: 8.000
> Max. :1 Max. :1 Max. :11.000
> sizea3 sizeb3 sizec3 size3added
> Min. :0.000000 Min. : 0.000 Min. : 0.000 Min. : 0.000
> 1st Qu.:0.000000 1st Qu.: 0.000 1st Qu.: 1.000 1st Qu.: 1.000
> Median :0.000000 Median : 0.000 Median : 1.000 Median : 2.000
> Mean :0.009375 Mean : 1.069 Mean : 2.209 Mean : 3.288
> 3rd Qu.:0.000000 3rd Qu.: 1.000 3rd Qu.: 3.000 3rd Qu.: 4.000
> Max. :1.000000 Max. :18.000 Max. :13.000 Max. :24.000
> repstra3 repstrb3 repstr3added feca3
> Min. : 0.000 Min. :0.000000 Min. : 0.000 Min. :0.0000
> 1st Qu.: 0.000 1st Qu.:0.000000 1st Qu.: 0.000 1st Qu.:0.0000
> Median : 0.000 Median :0.000000 Median : 0.000 Median :0.0000
> Mean : 1.069 Mean :0.009375 Mean : 1.078 Mean :0.4562
> 3rd Qu.: 1.000 3rd Qu.:0.000000 3rd Qu.: 1.000 3rd Qu.:0.0000
> Max. :18.000 Max. :1.000000 Max. :18.000 Max. :8.0000
> fec3added obsstatus3 repstatus3 fecstatus3 matstatus3
> Min. :0.0000 Min. :0.0 Min. :0.0 Min. :0.0000 Min. :1
> 1st Qu.:0.0000 1st Qu.:1.0 1st Qu.:0.0 1st Qu.:0.0000 1st Qu.:1
> Median :0.0000 Median :1.0 Median :0.0 Median :0.0000 Median :1
> Mean :0.4562 Mean :0.9 Mean :0.4 Mean :0.2219 Mean :1
> 3rd Qu.:0.0000 3rd Qu.:1.0 3rd Qu.:1.0 3rd Qu.:0.0000 3rd Qu.:1
> Max. :8.0000 Max. :1.0 Max. :1.0 Max. :1.0000 Max. :1
> alive3 stage3 stage3index
> Min. :0.0000 Length:320 Min. : 0.000
> 1st Qu.:1.0000 Class :character 1st Qu.: 7.000
> Median :1.0000 Mode :character Median : 8.000
> Mean :0.9469 Mean : 7.544
> 3rd Qu.:1.0000 3rd Qu.: 8.000
> Max. :1.0000 Max. :11.000
We no longer see any issues popping up in the summary_hfv() output.
In addition to the above, the function hfv_qc() is extremely useful in assessing the quality of our data. Let’s use this function to explore our vertical dataset.
hfv_qc(cypraw_v2)
> Survival:
>
> Data subset has 58 variables and 320 transitions.
>
> Variable alive3 has 0 missing values.
> Variable alive3 is a binomial variable.
>
>
> Primary size:
>
> Data subset has 58 variables and 303 transitions.
>
> Variable sizea3 has 0 missing values.
> Variable sizea3 appears to be an integer variable.
>
> Variable sizea3 is fully non-negative.
>
> Overdispersion test:
> Mean sizea3 is 0.009901
> The variance in sizea3 is 0.009835
> The probability of this dispersion level by chance assuming that
> the true mean sizea3 = variance in sizea3,
> and an alternative hypothesis of overdispersion, is 1
> Dispersion level in sizea3 matches expectation.
>
> Zero-inflation and truncation tests:
> Mean lambda in sizea3 is 0.9901
> The actual number of 0s in sizea3 is 300
> The expected number of 0s in sizea3 under the null hypothesis is 300
> The probability of this deviation in 0s from expectation by chance is 0.9025
> Variable sizea3 is not significantly zero-inflated.
>
>
> Fecundity:
>
> Data subset has 58 variables and 320 transitions.
>
> Variable feca2 has 0 missing values.
> Variable feca2 appears to be an integer variable.
>
> Variable feca2 is fully non-negative.
>
> Overdispersion test:
> Mean feca2 is 0.2906
> The variance in feca2 is 0.7084
> The probability of this dispersion level by chance assuming that
> the true mean feca2 = variance in feca2,
> and an alternative hypothesis of overdispersion, is 1
> Dispersion level in feca2 matches expectation.
>
> Zero-inflation and truncation tests:
> Mean lambda in feca2 is 0.7478
> The actual number of 0s in feca2 is 270
> The expected number of 0s in feca2 under the null hypothesis is 239.3
> The probability of this deviation in 0s from expectation by chance is 2.189e-26
> Variable feca2 is significantly zero-inflated.
The output gives us quite a lot to work with. All of the variables that we might be interested in assessing as vital rates are examined. Variables coding for probabilities need to be binomial, and so these variables are tested for whether they fit the characteristics of a binomial variable. Size and fecundity are examined for whether they are count or continuous variables, and for whether they match the characteristics of distributions such as the Gaussian, the Poisson, and the negative binomial. For count variables, this includes tests of overdispersion and zero-inflation. Lastly, the output includes information about the data subsets that will be used to assess the various vital rates, including the numbers of variables and transitions (standardized dataset rows) available to parameterize the vital rate models.
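To see where the zero-inflation numbers for feca2 come from, we can reproduce them by hand from the values printed above. The sketch below assumes that the expected number of zeros is taken from a Poisson distribution with the observed mean. This assumption matches the printed figures, but it is our reading of the output rather than a description of the internals of hfv_qc(), and all variable names are our own.

```r
# Values copied from the hfv_qc() output above for feca2
n_rows    <- 320      # transitions in the fecundity data subset
mean_fec  <- 0.2906   # mean of feca2
var_fec   <- 0.7084   # variance of feca2
zeros_obs <- 270      # observed number of 0s

# Under a Poisson distribution, P(X = 0) = exp(-mean)
p_zero    <- exp(-mean_fec)
zeros_exp <- n_rows * p_zero

# Variance-to-mean ratio; a Poisson variable has a ratio of 1
disp_ratio <- var_fec / mean_fec

round(p_zero, 4)      # compare with the line labeled "Mean lambda"
round(zeros_exp, 1)   # compare with the expected number of 0s
round(disp_ratio, 2)  # values well above 1 hint at extra variance
```

The observed 270 zeros exceed the roughly 239 expected, which is the pattern that the output flags as significant zero-inflation.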
15.2 Quality control in vital rate models
The function modelsearch(), and its associated summary() function for lefkoMod objects, both provide critical quality control for vital rate models used to develop function-based MPMs, including discretized IPMs. The two key processes are actually conducted by function modelsearch() itself, but summary() provides easy access to the results. In particular, modelsearch() assesses the numbers of individuals and transitions used to develop each vital rate model, and the overall accuracy of each model.
Let’s take a look at how this works using the function-based version of the Cypripedium analysis, as given in Chapter 5. Here, we load all of the preliminaries for the historical analysis.
data(cypdata)
stagevector <- c("SD", "P1", "P2", "P3", "SL", "D", "V1", "V2", "V3", "V4",
"V5", "V6", "V7", "V8", "V9", "V10", "V11", "V12", "V13", "V14", "V15", "V16",
"V17", "V18", "V19", "V20", "V21", "V22", "V23", "V24", "F1", "F2", "F3",
"F4", "F5", "F6", "F7", "F8", "F9", "F10", "F11", "F12", "F13", "F14", "F15",
"F16", "F17", "F18", "F19", "F20", "F21", "F22", "F23", "F24")
indataset <- c(0, 0, 0, 0, 0, rep(1, 49))
sizevector <- c(0, 0, 0, 0, 0, seq(from = 0, to = 24), seq(from = 1, to = 24))
repvector <- c(0, 0, 0, 0, 0, rep(0, 25), rep(1, 24))
obsvector <- c(0, 0, 0, 0, 0, 0, rep(1, 48))
matvector <- c(0, 0, 0, 0, 0, rep(1, 49))
immvector <- c(0, 1, 1, 1, 1, rep(0, 49))
propvector <- c(1, rep(0, 53))
comments <- c("Dormant seed", "Yr1 protocorm", "Yr2 protocorm", "Yr3 protocorm",
"Seedling", "Veg dorm", "Veg adult 1 stem", "Veg adult 2 stems",
"Veg adult 3 stems", "Veg adult 4 stems", "Veg adult 5 stems",
"Veg adult 6 stems", "Veg adult 7 stems", "Veg adult 8 stems",
"Veg adult 9 stems", "Veg adult 10 stems", "Veg adult 11 stems",
"Veg adult 12 stems", "Veg adult 13 stems", "Veg adult 14 stems",
"Veg adult 15 stems", "Veg adult 16 stems", "Veg adult 17 stems",
"Veg adult 18 stems", "Veg adult 19 stems", "Veg adult 20 stems",
"Veg adult 21 stems", "Veg adult 22 stems", "Veg adult 23 stems",
"Veg adult 24 stems", "Flo adult 1 stem", "Flo adult 2 stems",
"Flo adult 3 stems", "Flo adult 4 stems", "Flo adult 5 stems",
"Flo adult 6 stems", "Flo adult 7 stems", "Flo adult 8 stems",
"Flo adult 9 stems", "Flo adult 10 stems", "Flo adult 11 stems",
"Flo adult 12 stems", "Flo adult 13 stems", "Flo adult 14 stems",
"Flo adult 15 stems", "Flo adult 16 stems", "Flo adult 17 stems",
"Flo adult 18 stems", "Flo adult 19 stems", "Flo adult 20 stems",
"Flo adult 21 stems", "Flo adult 22 stems", "Flo adult 23 stems",
"Flo adult 24 stems")
cypframe_fb <- sf_create(sizes = sizevector, stagenames = stagevector,
repstatus = repvector, obsstatus = obsvector, matstatus = matvector,
propstatus = propvector, immstatus = immvector, indataset = indataset,
comments = comments)
cypfb_v1 <- verticalize3(data = cypdata, noyears = 6, firstyear = 2004,
patchidcol = "patch", individcol = "plantid", blocksize = 4,
sizeacol = "Inf2.04", sizebcol = "Inf.04", sizeccol = "Veg.04",
repstracol = "Inf.04", repstrbcol = "Inf2.04", fecacol = "Pod.04",
stageassign = cypframe_fb, stagesize = "sizeadded", NAas0 = TRUE,
age_offset = 4)
seeds_per_fruit <- 5000
sl_mult <- 0.7
cypsupp3_fb <- supplemental(stage3 = c("SD", "SD", "P1", "P1", "P2", "P3", "SL",
"SL", "SL", "D", "V1", "V2", "V3", "D", "V1", "V2", "V3", "mat", "mat",
"mat", "mat", "SD", "P1"),
stage2 = c("SD", "SD", "SD", "SD", "P1", "P2", "P3", "SL", "SL", "SL", "SL",
"SL", "SL", "SL", "SL", "SL", "SL", "D", "V1", "V2", "V3", "rep", "rep"),
stage1 = c("SD", "rep", "SD", "rep", "SD", "P1", "P2", "P3", "SL", "P3", "P3",
"P3", "P3", "SL", "SL", "SL", "SL", "SL", "SL", "SL", "SL", "mat", "mat"),
eststage3 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, "D", "V1", "V2", "V3", "D",
"V1", "V2", "V3", "mat", "mat", "mat", "mat", NA, NA),
eststage2 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, "D", "D", "D", "D", "D",
"D", "D", "D", "D", "V1", "V2", "V3", NA, NA),
eststage1 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, "D", "D", "D", "D", "D",
"D", "D", "D", "V1", "V1", "V1", "V1", NA, NA),
givenrate = c(0.08, 0.08, 0.1, 0.1, 0.1, 0.1, 0.1, 0.05, 0.05, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA),
multiplier = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, sl_mult, sl_mult,
sl_mult, sl_mult, sl_mult, sl_mult, sl_mult, sl_mult, 1, 1, 1, 1,
0.5 * seeds_per_fruit, 0.5 * seeds_per_fruit),
type = c("S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S",
"S", "S", "S", "S", "S", "S", "S", "R", "R"),
type_t12 = c("S", "F", "S", "F", "S", "S", "S", "S", "S", "S", "S", "S", "S",
"S", "S", "S", "S", "S", "S", "S", "S", "S", "S"),
stageframe = cypframe_fb, historical = TRUE)
Now let’s run the vital rate models for a historical MPM.
cypmodels3p <- modelsearch(cypfb_v1, historical = TRUE, approach = "mixed",
vitalrates = c("surv", "obs", "size", "repst", "fec"), patch = "patchid",
sizedist = "negbin", size.trunc = TRUE, fecdist = "poisson", fec.zero = TRUE,
suite = "main", size = c("size3added", "size2added", "size1added"),
quiet = "partial")
Let’s take a peek at the summary of the resulting lefkoMod object.
summary(cypmodels3p)
> This LefkoMod object includes 5 linear models.
> Best-fit model criterion used: aicc&k
>
>
>
> Survival model:
> Generalized linear mixed model fit by maximum likelihood (Laplace
> Approximation) [glmerMod]
> Family: binomial ( logit )
> Formula: alive3 ~ size2added + (1 | year2) + (1 | patchid) + (1 | individ)
> Data: subdata
> AIC BIC logLik deviance df.resid
> 130.1321 148.9737 -60.0660 120.1321 315
> Random effects:
> Groups Name Std.Dev.
> individ (Intercept) 1.199e+00
> year2 (Intercept) 5.117e-05
> patchid (Intercept) 1.172e-05
> Number of obs: 320, groups: individ, 74; year2, 5; patchid, 3
> Fixed Effects:
> (Intercept) size2added
> 2.0356 0.6343
> optimizer (Nelder_Mead) convergence code: 0 (OK) ; 0 optimizer warnings; 1 lme4 warnings
>
>
>
> Observation model:
> Generalized linear mixed model fit by maximum likelihood (Laplace
> Approximation) [glmerMod]
> Family: binomial ( logit )
> Formula: obsstatus3 ~ size2added + (1 | year2) + (1 | patchid) + (1 |
> individ)
> Data: subdata
> AIC BIC logLik deviance df.resid
> 120.2567 138.8254 -55.1284 110.2567 298
> Random effects:
> Groups Name Std.Dev.
> individ (Intercept) 0.0000
> year2 (Intercept) 0.8776
> patchid (Intercept) 0.0000
> Number of obs: 303, groups: individ, 70; year2, 5; patchid, 3
> Fixed Effects:
> (Intercept) size2added
> 2.4904 0.3134
> optimizer (Nelder_Mead) convergence code: 0 (OK) ; 0 optimizer warnings; 1 lme4 warnings
>
>
>
> Size model:
> Formula: size3added ~ (1 | year2) + (1 | patchid) + (1 | individ)
> Data: subdata
> AIC BIC logLik df.resid
> 1009.9750 1028.2898 -499.9875 283
> Random-effects (co)variances:
>
> Conditional model:
> Groups Name Std.Dev.
> year2 (Intercept) 0.1133
> patchid (Intercept) 0.2118
> individ (Intercept) 1.0320
>
> Number of obs: 288 / Conditional model: year2, 5; patchid, 3; individ, 70
>
> Dispersion parameter for truncated_nbinom2 family (): 2.73e+07
>
> Fixed Effects:
>
> Conditional model:
> (Intercept)
> 0.587
>
>
>
> Secondary size model:
> [1] 1
>
>
>
> Tertiary size model:
> [1] 1
>
>
>
> Reproductive status model:
> Generalized linear mixed model fit by maximum likelihood (Laplace
> Approximation) [glmerMod]
> Family: binomial ( logit )
> Formula: repstatus3 ~ repstatus2 + size2added + (1 | year2) + (1 | patchid) +
> (1 | individ)
> Data: subdata
> AIC BIC logLik deviance df.resid
> 333.4037 355.3815 -160.7019 321.4037 282
> Random effects:
> Groups Name Std.Dev.
> individ (Intercept) 0.1776
> year2 (Intercept) 0.6636
> patchid (Intercept) 0.3501
> Number of obs: 288, groups: individ, 70; year2, 5; patchid, 3
> Fixed Effects:
> (Intercept) repstatus2 size2added
> -1.3836 1.5543 0.1788
>
>
>
> Fecundity model:
> Formula:
> feca2 ~ size2added + (1 | year2) + (1 | patchid) + (1 | individ)
> Zero inflation:
> ~size2added + (1 | year2) + (1 | patchid) + (1 | individ)
> Data: subdata
> AIC BIC logLik df.resid
> 251.4551 279.1619 -115.7275 108
> Random-effects (co)variances:
>
> Conditional model:
> Groups Name Std.Dev.
> year2 (Intercept) 5.610e-01
> patchid (Intercept) 2.283e-01
> individ (Intercept) 4.630e-08
>
> Zero-inflation model:
> Groups Name Std.Dev.
> year2 (Intercept) 3.340e-07
> patchid (Intercept) 1.724e-12
> individ (Intercept) 2.057e-04
>
> Number of obs: 118 / Conditional model: year2, 5; patchid, 3; individ, 51 / Zero-inflation model: year2, 5; patchid, 3; individ, 51
>
> Fixed Effects:
>
> Conditional model:
> (Intercept) size2added
> -0.56501 0.06247
>
> Zero-inflation model:
> (Intercept) size2added
> 3.840 -1.588
>
>
> Juvenile survival model:
> [1] 1
>
>
>
> Juvenile observation model:
> [1] 1
>
>
>
> Juvenile size model:
> [1] 1
>
>
>
> Juvenile secondary size model:
> [1] 1
>
>
>
> Juvenile tertiary size model:
> [1] 1
>
>
>
> Juvenile reproduction model:
> [1] 1
>
>
>
> Juvenile maturity model:
> [1] 1
>
>
>
>
>
> Number of models in survival table: 16
>
> Number of models in observation table: 16
>
> Number of models in size table: 16
>
> Number of models in secondary size table: 1
>
> Number of models in tertiary size table: 1
>
> Number of models in reproduction status table: 16
>
> Number of models in fecundity table: 241
>
> Number of models in juvenile survival table: 1
>
> Number of models in juvenile observation table: 1
>
> Number of models in juvenile size table: 1
>
> Number of models in juvenile secondary size table: 1
>
> Number of models in juvenile tertiary size table: 1
>
> Number of models in juvenile reproduction table: 1
>
> Number of models in juvenile maturity table: 1
>
>
>
>
>
> General model parameter names (column 1), and
> specific names used in these models (column 2):
> parameter_names mainparams
> 1 time t year2
> 2 individual individ
> 3 patch patch
> 4 alive in time t+1 surv3
> 5 observed in time t+1 obs3
> 6 sizea in time t+1 size3
> 7 sizeb in time t+1 sizeb3
> 8 sizec in time t+1 sizec3
> 9 reproductive status in time t+1 repst3
> 10 fecundity in time t+1 fec3
> 11 fecundity in time t fec2
> 12 sizea in time t size2
> 13 sizea in time t-1 size1
> 14 sizeb in time t sizeb2
> 15 sizeb in time t-1 sizeb1
> 16 sizec in time t sizec2
> 17 sizec in time t-1 sizec1
> 18 reproductive status in time t repst2
> 19 reproductive status in time t-1 repst1
> 20 maturity status in time t+1 matst3
> 21 maturity status in time t matst2
> 22 age in time t age
> 23 density in time t density
> 24 individual covariate a in time t indcova2
> 25 individual covariate a in time t-1 indcova1
> 26 individual covariate b in time t indcovb2
> 27 individual covariate b in time t-1 indcovb1
> 28 individual covariate c in time t indcovc2
> 29 individual covariate c in time t-1 indcovc1
> 30 stage group in time t group2
> 31 stage group in time t-1 group1
>
>
>
>
>
> Quality control:
>
> Survival model estimated with 74 individuals and 320 individual transitions.
> Survival model accuracy is 0.947.
> Observation status model estimated with 70 individuals and 303 individual transitions.
> Observation status model accuracy is 0.95.
> Primary size model estimated with 70 individuals and 288 individual transitions.
> Primary size model R-squared is 0.82.
> Secondary size model not estimated.
> Tertiary size model not estimated.
> Reproductive status model estimated with 70 individuals and 288 individual transitions.
> Reproductive status model accuracy is 0.74.
> Fecundity model estimated with 51 individuals and 118 individual transitions.
> Fecundity model R-squared is 0.535.
> Juvenile survival model not estimated.
> Juvenile observation status model not estimated.
> Juvenile primary size model not estimated.
> Juvenile secondary size model not estimated.
> Juvenile tertiary size model not estimated.
> Juvenile reproductive status model not estimated.
> Juvenile maturity status model not estimated.
In the summary output above, there is of course a section labeled Quality control, but there is more quality control in the output than just this section. Let’s first explore some of the other parts of this output.
First, the best-fit model output is worth studying. This is the output from whichever package and function was used to estimate the model. In this case, where the models were mixed models, the output comes from the packages lme4 and glmmTMB. The most important quality control output comes in the form of the number of levels of each random factor, and the variance or standard deviation of each random factor. If the summed number of levels across the random factors approaches the number of observations in a best-fit model, then we likely cannot estimate the random factors properly. For example, the conditional model of the fecundity model has 118 observations, and its random factors have 5 + 3 + 51 = 59 levels in total, so everything looks OK. While the models listed above look OK in general in this regard, the observation model does suggest some problems, since the standard deviations associated with individual and patch are equal to 0.0.
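The comparison between observations and random factors can be tabulated quickly for all of the estimated models. The sketch below copies the counts from the summary output above and uses a hypothetical helper, level_share(), which is our own and not part of lefko3. It simply divides the summed number of random-factor levels by the number of observations.

```r
# Hypothetical helper: summed random-factor levels relative to observations
level_share <- function(n_obs, n_levels) {
  sum(n_levels) / n_obs
}

# Numbers of observations and levels copied from the summary output above
models <- list(
  survival    = list(n_obs = 320, n_levels = c(individ = 74, year2 = 5, patchid = 3)),
  observation = list(n_obs = 303, n_levels = c(individ = 70, year2 = 5, patchid = 3)),
  size        = list(n_obs = 288, n_levels = c(individ = 70, year2 = 5, patchid = 3)),
  repstatus   = list(n_obs = 288, n_levels = c(individ = 70, year2 = 5, patchid = 3)),
  fecundity   = list(n_obs = 118, n_levels = c(individ = 51, year2 = 5, patchid = 3))
)

# One share per model; values approaching 1 would be a warning sign
shares <- sapply(models, function(m) level_share(m$n_obs, m$n_levels))
round(shares, 2)
```

In this table the fecundity model has the largest share, since it is fit to the smallest data subset, so it is the model to watch most closely if the dataset shrinks further.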
Next, let’s look over the section labeled Quality control. Here we see a good deal of information that might be useful to us. First, we see the numbers of individuals and transitions (standardized dataset rows) used to develop each best-fit model. Generally, the higher the numbers of individuals and transitions, the better the overall quality of the model.
Second, the accuracy or R2 of the best-fit model is shown. Accuracy is estimated as the number of predicted responses that are equal to the observed responses divided by the total number of responses in the dataset used to parameterize the model, and is applied to situations in which the response is binomial or a count. R2 is a simple R2 and is applied to all continuous response models (note that in the output above, the size and fecundity models, which use count distributions, are reported with R2 rather than accuracy). Accuracy works best with binomial models, because when applied to a count, accuracy does not distinguish situations in which the predicted response is very wrong from situations in which the predicted response is still very close. So, it may be worth exploring the predictions in count models a bit. However, regardless of this, we would argue that the best models have accuracy or R2 greater than or equal to 0.90. Lower values make accurate prediction virtually impossible, and may impede inference.
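The difference between the two statistics is easy to see with toy data. The sketch below defines two hypothetical helpers, accuracy() and r_squared(), which are our own and not lefko3 functions. We assume here that a simple R2 means one minus the ratio of the residual to the total sum of squares, since the exact formula is not given in the text.

```r
# Hypothetical helpers illustrating the two summary statistics
accuracy <- function(observed, predicted) {
  mean(observed == predicted)
}
r_squared <- function(observed, predicted) {
  1 - sum((observed - predicted)^2) / sum((observed - mean(observed))^2)
}

# Toy binomial response: 8 of 10 predictions match the observations
obs_bin  <- c(1, 0, 1, 1, 0, 1, 0, 1, 1, 1)
pred_bin <- c(1, 0, 1, 0, 0, 1, 1, 1, 1, 1)
accuracy(obs_bin, pred_bin)

# Toy count response: neither set of predictions matches exactly, so
# accuracy is 0 for both, even though one set is much closer than the other
obs_count  <- c(2, 5, 3, 8)
pred_close <- c(3, 4, 4, 7)
pred_far   <- c(9, 0, 10, 1)
accuracy(obs_count, pred_close)
accuracy(obs_count, pred_far)

# Mean absolute error and R2 do separate the two sets of predictions
mean(abs(obs_count - pred_close))
mean(abs(obs_count - pred_far))
r_squared(obs_count, pred_close)
r_squared(obs_count, pred_far)
```

This is the reason to look at the predictions themselves in count models rather than relying on exact-match accuracy alone.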
Let’s now look at quality control in MPMs themselves.
15.3 Quality control in MPMs and discretized IPMs
Let’s first load some MPMs. Here, we will load some ahistorical Cypripedium MPMs from Chapter 4.
seeds_per_fruit <- 5000
sl_mult <- 0.7
cypsupp2_raw <- supplemental(stage3 = c("SD", "P1", "P2", "P3", "SL", "SL", "D",
"XSm", "Sm", "SD", "P1"),
stage2 = c("SD", "SD", "P1", "P2", "P3", "SL", "SL", "SL", "SL", "rep", "rep"),
eststage3 = c(NA, NA, NA, NA, NA, NA, "D", "XSm", "Sm", NA, NA),
eststage2 = c(NA, NA, NA, NA, NA, NA, "XSm", "XSm", "XSm", NA, NA),
givenrate = c(0.08, 0.10, 0.10, 0.10, 0.05, 0.05, NA, NA, NA, NA, NA),
multiplier = c(NA, NA, NA, NA, NA, NA, sl_mult, sl_mult, sl_mult,
0.5 * seeds_per_fruit, 0.5 * seeds_per_fruit),
type = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3),
stageframe = cypframe_raw, historical = FALSE)
cypmatrix2rp <- rlefko2(data = cypraw_v2, stageframe = cypframe_raw,
year = "all", patch = "all", stages = c("stage3", "stage2"),
size = c("size3added", "size2added"), supplement = cypsupp2_raw,
yearcol = "year2", patchcol = "patchid", indivcol = "individ")
Now let’s take a look at a summary of this lefkoMat object.
summary(cypmatrix2rp)
>
> This ahistorical lefkoMat object contains 15 matrices.
>
> Each matrix is square with 11 rows and columns, and a total of 121 elements.
> A total of 266 survival transitions were estimated, with 17.733 per matrix.
> A total of 70 fecundity transitions were estimated, with 4.667 per matrix.
> This lefkoMat object covers 1 population, 3 patches, and 5 time steps.
>
> The dataset contains a total of 74 unique individuals and 320 unique transitions.
>
> Survival probability sum check (each matrix represented by column in order):
> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
> Min. 0.000 0.000 0.000 0.000 0.000 0.000 0.050 0.050 0.000 0.050 0.000 0.000
> 1st Qu. 0.075 0.025 0.075 0.025 0.075 0.075 0.140 0.140 0.100 0.140 0.100 0.100
> Median 0.180 0.100 0.180 0.100 0.180 0.180 0.909 0.778 0.686 0.857 0.750 0.575
> Mean 0.457 0.361 0.471 0.328 0.417 0.464 0.631 0.611 0.530 0.631 0.562 0.523
> 3rd Qu. 0.955 0.769 1.000 0.592 0.781 1.000 1.000 1.000 0.955 1.000 1.000 1.000
> Max. 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
> [,13] [,14] [,15]
> Min. 0.000 0.000 0.000
> 1st Qu. 0.075 0.075 0.100
> Median 0.180 0.180 0.750
> Mean 0.432 0.450 0.562
> 3rd Qu. 0.875 1.000 1.000
> Max. 1.000 1.000 1.000
There are a few key portions of this output to look at when assessing MPM quality. First, it is extremely useful to take a look at the number of individuals and transitions used to develop the MPM. The larger the numbers for both, the stronger the inference possible with an MPM. Here, we see that we have a small dataset, and so we need to bear that in mind when assessing our MPMs.
Next, let’s look at the number of estimated transitions per matrix. This ahistorical MPM averages 17.733 + 4.667 = 22.4 estimated non-zero elements per matrix, out of 121 elements per matrix overall. So, our dataset is very sparse relative to our stageframe, the latter of which probably requires a larger dataset than we have access to.
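The arithmetic behind these per-matrix figures is worth making explicit. The sketch below uses only the totals printed in the summary output above, with variable names of our own choosing.

```r
# Values copied from the summary() output above
n_matrices   <- 15
n_surv_total <- 266   # survival transitions estimated across all matrices
n_fec_total  <- 70    # fecundity transitions estimated across all matrices
n_stages     <- 11    # rows and columns per matrix

surv_per_mat  <- n_surv_total / n_matrices
fec_per_mat   <- n_fec_total / n_matrices
elems_per_mat <- n_stages^2

# Average share of each matrix that holds an estimated transition
filled_share <- (surv_per_mat + fec_per_mat) / elems_per_mat

round(c(surv_per_mat, fec_per_mat, surv_per_mat + fec_per_mat), 3)
elems_per_mat
round(filled_share, 3)
```

On average, fewer than one in five elements of each matrix holds an estimated transition.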
Finally, the survival probability sum check gives us the quartile summary of column sums of the survival transition matrices in the MPM. This is very important, because the column sums give the survival probabilities of the stages in the stageframe (or the stage-pairs in a historical MPM, ages in a Leslie MPM, or age-stages in an age-by-stage MPM). So, the summaries should never show survival values greater than 1.0 or less than 0.0. If they do, then there is an error in the MPM construction, and the user should most definitely go back to the drawing board (note that many of the matrices in the COMPADRE and COMADRE databases have this problem, and so will be flagged if imported into lefko3).
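The same kind of check can be run by hand on any survival transition matrix with base R. The sketch below uses a toy 3 x 3 matrix of our own, with one deliberately invalid column, rather than a matrix from the Cypripedium analysis. The small tolerance is our own addition to avoid flagging columns that sum to 1 only up to floating-point error.

```r
# Toy survival transition matrix: columns give the stage in time t, rows
# give the stage in time t+1. matrix() fills by column, so each line of
# values below is one column. The third column is deliberately invalid.
U <- matrix(c(0.20, 0.50, 0.00,
              0.00, 0.30, 0.60,
              0.00, 0.70, 0.50),
            nrow = 3,
            dimnames = list(c("juv", "small", "large"),
                            c("juv", "small", "large")))

# Column sums are the stage-specific survival probabilities
col_surv <- colSums(U)
summary(col_surv)

# Flag any stage whose survival falls outside the interval [0, 1]
tol <- 1e-8
bad_cols <- names(col_surv)[col_surv < -tol | col_surv > 1 + tol]
bad_cols
```

Any stage returned in bad_cols points to a column that needs to be traced back to the data or to the supplement table.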
We plan to add further quality control protocols to package lefko3, and will update this manual as we do. Stay tuned!
15.4 Points to remember
- Quality control in standardized datasets can be assessed with the summary_hfv() function.
- Errors in the development of life history models can be assessed when datasets are standardized via the functions verticalize3() and historicalize3().
- Quality control in vital rate models can be assessed by looking at the numbers of observations utilized in the models and their random factors, as well as by exploring the accuracy or R2 of each model.
- MPM quality can be explored with the summary() function applied to a lefkoMat object of interest.