6 Survival

Information about survival rate is valuable for managing exploited stocks and working to rebuild rare or declining populations. Survival is affected by two broad categories of threats: fishing and natural mortality. Fishing obviously refers to harvest, from commercial or recreational sectors, but can also include mortality associated with fish that are caught and released. Catch-and-release fishing primarily occurs in the recreational sector, but can occur in the commercial sector due to regulations or lack of markets (sometimes referred to as discard). Natural mortality is a generic term for mortality sources other than fishing. There is typically a strong negative correlation between natural mortality rate and body size, indicating the predominant role of predation (Lorenzen 1996). Other sources of natural mortality include starvation, disease, and senescence. It is important as a first step to estimate overall survival reliably, but management can be improved by partitioning total mortality into fishing and natural component rates. We begin with approaches for estimating overall survival, then in Section 7 consider tagging and telemetry methods that provide insights about sources of mortality.

6.1 Age-composition

The simplest approach for estimating survival is to examine the age composition of a sample of fish. Age would typically be determined through laboratory work using otoliths or other hard parts that show annual patterns (Campana 2001). This approach is usually applied to adult fish that would be expected to have similar survival rates. We begin with simulation code for generating the age distribution:

rm(list=ls()) # Clear Environment

# Simulate age composition sampling
MaxAge <- 10
S.true <- 0.7 # Annual survival probability

# Simulated age vector at equilibrium (expected values)
N <- array(data=NA, dim=MaxAge)
N[1] <- 1000
for (j in 2:(MaxAge)){
  N[j] <- N[j-1] * S.true
} #j
age.vec <- seq(1,MaxAge, by=1)  # Sequence function to provide x axis values for plotting
barplot(N, names.arg=age.vec, xlab="Age", ylab="Number")

SampleSize <- 30
AgeMatrix <- rmultinom(n=1, size=SampleSize, prob=N) # prob vector is rescaled internally
AgeSample <- as.vector(t(AgeMatrix)) # Transpose matrix then convert to vector
barplot(AgeSample, names.arg=age.vec, xlab="Age", ylab="Frequency")

The simulation tracks fish up to age 10 (maximum age tracked, not lifespan) with an arbitrary annual survival rate of 0.7. The looping code for age composition starts with 1000 fish at age 1 and reduces that number at each successive age by the annual survival rate. This creates an equilibrium age vector, based on assumptions of constant recruitment (1000 each year) and survival. For example, the 700 age-2 fish would have been the survivors from the previous year’s 1000 age-1 recruits. The initial number is arbitrary; the shape of the distribution and age proportions would be the same if we began with 1 or 1000 fish. The next section of code uses the equilibrium age vector as proportions and draws a sample of size 30 using a multinomial distribution (Section 3.1.3). To build some intuition about age sampling, try running the simulation code multiple times with different values for sample size, maximum age, and survival rate. As usual, keep in mind that this is a very optimistic scenario, with constant recruitment and survival and unbiased survey sampling. The only variation reflected in these samples is due to sampling from the multinomial distribution (as is evident if a large age sample is drawn).
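
As an optional aid for that experimentation (not part of the original workflow), the sketch below wraps the simulation steps into a small helper function so the age sample can be redrawn quickly for different settings. The function name sim.age.sample() and its arguments are ours, chosen for illustration.

sim.age.sample <- function(MaxAge, S, SampleSize){
  N <- 1000 * S^(0:(MaxAge-1))   # Equilibrium age vector (1000 fish at age 1)
  as.vector(rmultinom(n=1, size=SampleSize, prob=N)) # Random age sample
}
barplot(sim.age.sample(MaxAge=10, S=0.7, SampleSize=30),
        names.arg=1:10, xlab="Age", ylab="Frequency")
barplot(sim.age.sample(MaxAge=15, S=0.5, SampleSize=200),
        names.arg=1:15, xlab="Age", ylab="Frequency")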

The JAGS code for analysis is straightforward and mirrors the simulation code. Note that the equilibrium age vector used as proportions in the dmulti() function must be rescaled manually to sum to 1. Convergence is fast and the estimate of survival rate is moderately precise, despite the small (but perfectly unbiased) sample. For example, the 95% credible interval for one run was 0.57-0.80.

# Load necessary library
library(rjags)
library(R2jags)

# JAGS code
sink("AgeComp.txt")
cat("
model{
    # Priors
    S.est ~ dunif(0,1)  # Constant survival over sampled age range

    N[1] <- 1000
    # Generate equilibrium age proportions
    for (j in 2:MaxAge){
      N[j] <- N[j-1]*S.est
    } # j
    N.sum <- sum(N[]) # Rescale so that proportions sum to 1
    for (j in 1:MaxAge){
    p[j] <- N[j]/N.sum
    } #j
    # Likelihood
    AgeSample[] ~ dmulti(p[], SampleSize)  # Fit multinomial distribution based on constant survival
}
    ",fill=TRUE)
sink()

# Bundle data
jags.data <- list("AgeSample", "SampleSize", "MaxAge")

# Initial values
jags.inits <- function(){ list(S.est=runif(n=1, min=0, max=1))}

model.file <- 'AgeComp.txt'

# Parameters monitored
jags.params <- c("S.est")

# Call JAGS from R
jagsfit <- jags(data=jags.data, jags.params, inits=jags.inits,
                n.chains = 3, n.thin=1, n.iter = 4000, n.burnin=2000,
                model.file)
print(jagsfit)
plot(jagsfit)

# Look at posterior distribution versus true value
hist(jagsfit$BUGSoutput$sims.list$S.est, xlab="Estimated survival rate", main="")
abline(v=S.true, col="red")

Note that we are not estimating the age proportions as parameters but rather the single parameter for survival rate (which is used to predict the equilibrium age proportions). Thus the number of effective parameters (pD) should be close to one. A few lines of code can be added to compare the predicted and observed age distributions:

# Add code to compare observed and predicted age comp
N.hat <- array(data=NA, dim=MaxAge)
N.hat[1] <- 1000
for (j in 2:(MaxAge)){
  N.hat[j] <- N.hat[j-1] * jagsfit$BUGSoutput$median$S.est
} #j
par(mfrow=c(2,1))
barplot(AgeSample, names.arg=age.vec, xlab="Age", ylab="Number", main="Observed")
barplot(N.hat, names.arg=age.vec, xlab="Age", ylab="Number", main="Predicted")
par(mfrow=c(1,1)) # Reset plot window

The predicted age distribution looks like a smoothed version of the observed, which makes sense given that we are estimating average survival over the sampled range of ages. When a poor estimate of the survival rate occurs, you will likely notice an odd pattern in the simulated age composition sample. Try running the code several times, starting from where the random age sample is drawn. Increasing the sample size will bring the observed pattern closer to the equilibrium age composition.

Thus far, we have used small sample sizes for these simulated studies. They are sufficient here because the simulation only includes variation due to sampling from the multinomial distribution. It is also convenient to work with a small sample size when studying the code and simulation data. Sample sizes are sometimes small in field studies, and it is important to establish that a study design will provide reliable results when using an achievable sample size. But it can also be informative to test the analytical approach with a very large sample size. This provides a best-case scenario for study results, and ensures that estimates approach the true values (are essentially unbiased). For example, three replicate median estimates of the survival rate were 0.71, 0.69, and 0.70 for a sample size of 2000 fish.
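
A sketch of that large-sample check follows. It assumes the objects created above (N, MaxAge, jags.params, jags.inits, and the AgeComp.txt model file) are still in the Environment, and it passes the data as a named list for clarity.

SampleSize <- 2000
S.medians <- numeric(3)
for (r in 1:3){
  AgeSample <- as.vector(rmultinom(n=1, size=SampleSize, prob=N)) # Large age sample
  fit <- jags(data=list(AgeSample=AgeSample, SampleSize=SampleSize, MaxAge=MaxAge),
              parameters.to.save=jags.params, inits=jags.inits,
              n.chains=3, n.thin=1, n.iter=4000, n.burnin=2000,
              model.file='AgeComp.txt')
  S.medians[r] <- fit$BUGSoutput$median$S.est
} #r
S.medians # Each replicate median should be close to S.true (0.7)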

6.1.1 Accounting For Gear Selectivity

The traditional approach for estimating survival from age composition data was known as catch curve analysis. It could be done using linear regression (thus doable by hand in the pre-computer era) by log-transforming the age sample (Ricker 1975). Gear selectivity was addressed subjectively, by viewing a plot of the log-transformed catches. Ricker (1975) described the catch curve pattern as having ascending and descending limbs, with the ascending limb assumed to reflect selectivity and survival. The descending limb was assumed to reflect only survival (age range of “fully selected” fish). The recommended approach was to begin the catch curve analysis at the age of peak catch or the next oldest age, in order to limit the analysis to fully selected fish. For catch curve analysis or the approach given here, declining selectivity for older fish would cause bias in the estimate of survival.
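
For comparison, a minimal sketch of a regression-based catch curve is shown below, applied to the age sample from Section 6.1 (where all ages were fully selected, so the entire age range serves as the descending limb). The slope of the regression of log catch on age estimates -Z, and exp(slope) provides the implied survival rate.

keep <- AgeSample > 0                     # log() is undefined for zero catches
cc.fit <- lm(log(AgeSample[keep]) ~ age.vec[keep])
Z.hat <- -coef(cc.fit)[2]                 # Instantaneous total mortality rate
exp(-Z.hat)                               # Implied survival rate; compare to S.true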

Given sufficient data, it is possible to use age composition data to jointly estimate the survival rate and gear selectivity. A practical example might be a trawl survey where small (younger) fish are not retained by the net. Fish larger than the net mesh size might be assumed to be fully selected, so a logistic function could be used to model gear selectivity that increases up to an asymptote: \(Sel_{a} = \frac{1}{1+e^{-k(a-a_{50})}}\)

# Add in selectivity
k <- 2 # Slope for logistic function
a_50 <- 4 # Age at 0.5 selectivity

# Age selectivity
Sel <- array(data=NA, dim=MaxAge)
for (j in 1:(MaxAge)){
  Sel[j] <- 1/(1+exp(-k*(j-a_50)))
} #j
age.vec <- seq(1,MaxAge, by=1)  # Sequence function to provide x axis values for plotting
par(mfrow=c(1,1))
plot(age.vec, Sel, xlab="Age", ylab="Gear selectivity")

Values for the slope (k) and age at 50% selectivity (a_50) are arbitrary, but chosen so that selectivity reaches a well-defined asymptote before age 10. Try different values for the slope and a_50 to understand how the two parameters affect the shape of the curve (a sketch for doing so follows). Note that any analysis of gear selectivity will only be successful if all ages share the same survival rate and there is a sufficient age range to fully characterize the asymptote. For example, imagine the outcome of an analysis using data only for ages 1-5.
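
As a starting point, the sketch below overlays a few curves with different (hypothetical) parameter values so the effects of the slope and a_50 are easy to see.

age.vec <- 1:10
plot(age.vec, 1/(1+exp(-2*(age.vec-4))), type="l", ylim=c(0,1),
     xlab="Age", ylab="Gear selectivity")          # Base case: k=2, a_50=4
lines(age.vec, 1/(1+exp(-1*(age.vec-4))), lty=2)   # Shallower slope (k=1)
lines(age.vec, 1/(1+exp(-2*(age.vec-6))), lty=3)   # Later a_50=6; asymptote barely reached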

Now let’s extend the age composition simulation to include less than fully selected fish:

rm(list=ls()) # Clear Environment

# Simulate age composition sampling with age-specific selectivity
MaxAge <- 10
S.true <- 0.7 # Annual survival probability

# Parameters for selectivity curve
k <- 2 # Slope for logistic function
a_50 <- 4 # Age at 0.5 selectivity

# Age selectivity
Sel <- array(data=NA, dim=MaxAge)
for (j in 1:(MaxAge)){
  Sel[j] <- 1/(1+exp(-k*(j-a_50)))
} #j
age.vec <- seq(1,MaxAge, by=1)  # Sequence function to provide x axis values for plotting
par(mfrow=c(1,1))
plot(age.vec, Sel, xlab="Age", ylab="Gear selectivity")

# Simulated age vector, including gear selectivity
N <- array(data=NA, dim=MaxAge)
N[1] <- 1000
for (j in 2:(MaxAge)){
  N[j] <- N[j-1] * S.true
} #j
N_Sel <- N * Sel
age.vec <- seq(1,MaxAge, by=1)  # Sequence function to provide x axis values for plotting
barplot(N_Sel, names.arg=age.vec, xlab="Age", ylab="Number", main="Expected age distribution")

SampleSize <- 100
AgeMatrix <- rmultinom(n=1, size=SampleSize, prob=N_Sel) # prob vector is rescaled internally
AgeSample <- as.vector(t(AgeMatrix)) # Transpose matrix then convert to vector
barplot(AgeSample, names.arg=age.vec, xlab="Age", ylab="Frequency", main="Sample age distribution")

The sample age distribution now reflects both gear selectivity and survival. We use a relatively large sample size and test whether it is possible to estimate all three parameters (survival rate and two parameters for selectivity):

# JAGS analysis
library(rjags)
library(R2jags)

sink("AgeComp.txt")
cat("
model{
    # Priors
    S.est ~ dunif(0,1)  # Constant survival over sampled age range
    a50.est ~ dunif(0, MaxAge)
    k.est ~ dlnorm(0, 1E-6) # Slope > 0 for increasing selectivity

    N[1] <- 1000
    # Generate equilibrium age proportions
    for (j in 2:MaxAge){
      N[j] <- N[j-1]*S.est
    } # j
    for (j in 1:MaxAge){
    Sel[j] <- 1/(1+exp(-k.est*(j-a50.est)))
     N.Sel[j] <- N[j] * Sel[j]
    }
    N.sum <- sum(N.Sel[]) # Rescale so that proportions sum to 1
    for (j in 1:MaxAge){
    p[j] <- N.Sel[j]/N.sum
    } #j
    # Likelihood
    AgeSample[] ~ dmulti(p[], SampleSize)  # Fit multinomial distribution based on constant survival
}
    ",fill=TRUE)
sink()

# Bundle data
jags.data <- list("AgeSample", "SampleSize", "MaxAge")

# Initial values
jags.inits <- function(){ list(S.est=runif(n=1, min=0, max=1),
                               a50.est=runif(n=1, min=0, max=MaxAge),
                               k.est=rlnorm(n=1, meanlog=0, sdlog=1))} # Arbitrary low initial values

model.file <- 'AgeComp.txt'

# Parameters monitored
jags.params <- c("S.est", "a50.est", "k.est")

# Call JAGS from R
jagsfit <- jags(data=jags.data, jags.params, inits=jags.inits,
                n.chains = 3, n.thin=1, n.iter = 20000,
                model.file)
print(jagsfit)
plot(jagsfit)

The age at 50% selectivity should be less than the maximum age, so a uniform prior distribution is appropriate. The slope of the selectivity curve does not have an obvious upper bound, so following McCarthy (2007), we use an uninformative lognormal prior distribution. Restricting the slope to positive values is consistent with a survey gear that underrepresents smaller fish. Convergence can be slow when using a lognormal prior distribution, so check the Rhat values and increase the number of updates (n.iter) if needed.
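
One quick way to check is to look at the Rhat column of the summary from the fitted object (a one-line sketch; values near 1 indicate convergence).

round(jagsfit$BUGSoutput$summary[, "Rhat"], 3) # Increase n.iter if values exceed ~1.1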

There are several reasons why this analysis works (at least for these assumed parameter values). We have an unbiased sample of size 100 from the expected age distribution (which reflects both survival and selectivity). The most important reason is that survival is constant over the entire age range. Selectivity and survival are confounded for young fish but not for older fish that are fully vulnerable to the survey. Thus the shape of the selectivity function matters, as does the age range included in the analysis. It would not be possible to jointly estimate selectivity and survival if selectivity continued to increase across the entire surveyed age range (i.e., if no ages were fully selected). And as before, the analysis is a best-case scenario in that the only variation is due to sampling from the multinomial distribution. Selectivity varies by age but all ages share the same selectivity parameters.

One way to improve on this analysis would be to include auxiliary data on selectivity. We could accomplish this by conducting a short-term tagging study with (for simplicity) equal numbers tagged by age. Having a tagged subpopulation with a known age structure provides direct information on selectivity, unlike the age composition data where selectivity and mortality are partially confounded. The logistical challenge in adding this auxiliary study is in obtaining sufficient tag returns through survey sampling.

The following version of the code includes auxiliary tag-return data:

rm(list=ls()) # Clear Environment

# Simulate age composition sampling with age-specific selectivity
MaxAge <- 10
S.true <- 0.7 # Annual survival probability

# Parameters for selectivity curve
k <- 2 # Slope for logistic function
a_50 <- 4 # Age at 0.5 selectivity

# Age selectivity
Sel <- array(data=NA, dim=MaxAge)
for (j in 1:(MaxAge)){
  Sel[j] <- 1/(1+exp(-k*(j-a_50)))
} #j
age.vec <- seq(1,MaxAge, by=1)  # Sequence function to provide x axis values for plotting
par(mfrow=c(1,1))
plot(age.vec, Sel, xlab="Age", ylab="Gear selectivity")

# Simulated age vector, including gear selectivity
N <- array(data=NA, dim=MaxAge)
N[1] <- 1000
for (j in 2:(MaxAge)){
  N[j] <- N[j-1] * S.true
} #j
N_Sel <- N * Sel
age.vec <- seq(1,MaxAge, by=1)  # Sequence function to provide x axis values for plotting
barplot(N_Sel, names.arg=age.vec, xlab="Age", ylab="Number", main="Expected age distribution")

SampleSize <- 100
AgeMatrix <- rmultinom(n=1, size=SampleSize, prob=N_Sel) # prob vector is rescaled internally
AgeSample <- as.vector(t(AgeMatrix)) # Transpose matrix then convert to vector
barplot(AgeSample, names.arg=age.vec, xlab="Age", ylab="Frequency", main="Sample age distribution")

# Auxiliary tag-return data
TagReturns <- 30
RetMatrix <- rmultinom(n=1, size=TagReturns, prob=Sel) # prob vector is rescaled internally
RetVector <- as.vector(t(RetMatrix)) # Transpose matrix then convert to vector
barplot(RetVector, names.arg=age.vec, xlab="Age", ylab="Frequency", main="Tag returns")

# JAGS analysis
library(rjags)
library(R2jags)

sink("AgeComp.txt")
cat("
model{
    # Priors
    S.est ~ dunif(0,1)  # Constant survival over sampled age range
    a50.est ~ dunif(0, MaxAge)
    k.est ~ dlnorm(0, 1E-6) # Slope > 0 for increasing selectivity

    N[1] <- 1000
    # Generate equilibrium age proportions
    for (j in 2:MaxAge){
      N[j] <- N[j-1]*S.est
    } # j
    for (j in 1:MaxAge){
    Sel[j] <- 1/(1+exp(-k.est*(j-a50.est)))
     N.Sel[j] <- N[j] * Sel[j]
    }
    N.sum <- sum(N.Sel[]) # Rescale so that proportions sum to 1
    for (j in 1:MaxAge){
    p[j] <- N.Sel[j]/N.sum
    } #j
    # Likelihood
    AgeSample[] ~ dmulti(p[], SampleSize)  # Fit multinomial distribution based on constant survival

    # Additional likelihood for tag-return data
    Sel.sum <- sum(Sel[])
    for (j in 1:MaxAge){
    Sel.p[j] <- Sel[j]/Sel.sum
    } #j
    RetVector[] ~ dmulti(Sel.p[], TagReturns)    
}
    ",fill=TRUE)
sink()

# Bundle data
jags.data <- list("AgeSample", "SampleSize", "MaxAge", "TagReturns", "RetVector")

# Initial values
jags.inits <- function(){ list(S.est=runif(n=1, min=0, max=1),
                               a50.est=runif(n=1, min=0, max=MaxAge),
                               k.est=rlnorm(n=1, meanlog=0, sdlog=1))} # Arbitrary low initial values

model.file <- 'AgeComp.txt'

# Parameters monitored
jags.params <- c("S.est", "a50.est", "k.est")

# Call JAGS from R
jagsfit <- jags(data=jags.data, jags.params, inits=jags.inits,
                n.chains = 3, n.thin=1, n.iter = 20000,
                model.file)
print(jagsfit)
plot(jagsfit)

There are three changes to the code in this version. First, the vector of random tag returns is generated just below the random survey catch vector. You can examine the tag-return vector in the Environment window to get a sense of what the auxiliary study contributes to the overall analysis. The next code change is adding a second likelihood component to make use of the tag-return data. The third change is to pass these additional data (tag return total and by age) to JAGS.

Having direct information on gear selectivity improves estimates of survival and selectivity, especially if the age composition sample size is small. And more importantly, this modification illustrates how auxiliary data can be added in JAGS to refine an analysis. Writing code to fit the specifics of a field study is valuable for planning and analysis, and is a major advantage over using canned software. The remaining chapters will include other examples where multiple data sources can be combined to improve estimates.

6.1.2 Cohort model

One limitation of the above analysis (Section 6.1) or a traditional catch curve is that recruitment is assumed to be constant (or can be averaged over). Catch curve analyses are generally robust to this assumption (Allen 1997), but year-class variation can sometimes be quite extreme. The following approach addresses that concern by modeling each year-class (or cohort) separately. The approach is similar to a virtual population analysis (VPA) or cohort analysis (Ricker 1975) in that a catch matrix provides age- and year-specific information that is used to track cohorts across years. A VPA is traditionally done using harvest data, but here, we assume the catch-at-age matrix contains survey data. Survey information allows for modeling relative abundance across time, whereas a traditional VPA is used to estimate absolute abundance.

The following simulation code provides the survey catch-at-age matrix. The number of ages and years is arbitrary, but we are estimating survival rate for each year so there should be enough ages to provide a decent sample size. The annual survival rate is drawn from a uniform distribution between arbitrary lower and upper bounds. Having a fair degree of annual variation in survival makes for a better test of our ability to estimate those values.

rm(list=ls()) # Clear Environment

# Simulation to create survey catch-at-age matrix
# Assume ages included in matrix are fully selected
A <- 4 # Arbitrary number of ages in survey catch matrix
Y <- 10
LowS <- 0.2 # Arbitrary lower bound
HighS <- 0.9 # Arbitrary upper bound
MeanS <- (LowS+HighS)/2
AnnualS <- array(data=NA, dim=(Y-1))
AnnualS <- runif((Y-1), min=LowS, max=HighS)

MeanR <- 400 # Arbitrary mean initial cohort size (relative)
N <- array(data=NA, dim=c(A, Y))
N[1,] <- rpois(Y, MeanR) # Starting size of each cohort at first age
for (a in 2:A){
  N[a,1] <- rpois(1,MeanS^(a-1)*MeanR) # Starting sizes for ages 2+, year 1
  for (y in 2:Y){
    N[a,y] <- rbinom(1, N[a-1, y-1], AnnualS[y-1]) # Survivors from previous age and year
    } #y
  } #a

SurveyCapProb <- 0.05 # Capture probability
SurveyC <- array(data=NA, dim=c(A, Y))
# Generate survey catches (account for survey catchability)
for (a in 1:A){
  for (y in 1:Y){
    SurveyC[a,y] <- rbinom(1, N[a, y], SurveyCapProb)
  } #y
} #a

The model tracks coded ages 1:A, which could represent a range of older ages with comparable survival rates. True abundance at age 1 (N[1,]) is drawn from a Poisson distribution, based on an arbitrary mean initial cohort size. Initial numbers for ages 2:A in the first year are also obtained from a Poisson distribution, where the expected number at age a is mean recruitment multiplied by the appropriate power of the mean survival rate. For example, the expected number is MeanR * S at age 2 (adjusting for survival over one year), MeanR * S^2 at age 3, etc. The code then loops over years (columns) and ages (rows) to fill in the rest of the matrix, using a binomial distribution to simulate the survival process (with the year-specific survival rate). The survey catch-at-age matrix includes variation due to sampling (binomially distributed, using an assumed capture probability of 0.05), as well as year-class strength and survival. Click on N and SurveyC in the Environment window to compare the two matrices in Source windows.

The JAGS code for estimating cohort size and annual survival is generally similar to the simulation code. Prior distributions and initial values for predicted survey catches use lognormal distributions. The log-scale mean for the initial values was set at a high value to reduce the risk of a runtime JAGS error due to an inconsistent parent node (e.g., estimated survey “abundance” at time y-1 less than observed survey catch at time y). The likelihood for the survey catch matrix uses a Poisson distribution.

# Load necessary library packages
library(rjags)   # Package for fitting JAGS models from within R
library(R2jags)  # Package for fitting JAGS models. Requires rjags

# JAGS code
sink("CatchAge.txt")
cat("
model {
    # Priors
    for (y in 1:Y) {
      Recruits[y] ~ dlnorm(0, 1E-6) # Non-negative values
      SurveyC.est[1,y] <- Recruits[y]
    } #y

    for (a in 2:A) {
      Initials[a-1] ~ dlnorm(0, 1E-6) # Non-negative; offset index to run vector from 1 to A-1
      SurveyC.est[a,1] <- Initials[a-1]
    } #a

    meanS <- prod(S.est)^(1/(Y-1))
    for (y in 1:(Y-1)) {
      S.est[y] ~ dunif(0,1)
    } #y

    # Population projection
    for (a in 2:A) {
      for (y in 2:Y) {
        SurveyC.est[a,y] ~ dbin(S.est[y-1], trunc(SurveyC.est[a-1,y-1]))
      } #y
    } #a

    # Likelihood
    for (a in 1:A) {
      for (y in 1:Y) {
        SurveyC[a,y] ~ dpois(SurveyC.est[a,y])
      } #y
    } #a
}
    ",fill=TRUE)
sink()

# Bundle data
jags.data <- list("Y", "A", "SurveyC")

# Initial values
jags.inits <- function(){ list(Recruits=rlnorm(n=Y, meanlog=10, sdlog=1),
                               Initials=rlnorm(n=(A-1), meanlog=10, sdlog=1))}

model.file <- 'CatchAge.txt'

# Parameters monitored
jags.params <- c("S.est", "meanS")

# Call JAGS from R
jagsfit <- jags(data=jags.data, jags.params, inits=jags.inits,
                n.chains = 3, n.thin=1, n.iter = 10000,
                model.file)
print(jagsfit)
plot(jagsfit)

plot(seq(1:(Y-1)), jagsfit$BUGSoutput$mean$S.est, ylab="Survival", xlab="Year",
     type="b", pch=19, ylim=c(0,1))
points(AnnualS, col="blue", type="b")

The final three lines of code compare the annual survival estimates and true values. The plot option “type=b” produces a line of connected dots (i.e., both points and line segments), and the “pch=19” option selects a filled dot. An online search for “r plot options pch” will provide the various choices for plotting symbols. The range for the y axis is fixed at 0-1 to ensure that the plot will contain the full range of both the estimates and the true values. The points() function adds the true values to the plot. The model provides reliable estimates of annual (and mean) survival and initial cohort size for the simulated data. Run the simulation and model fitting code multiple times to observe model performance. It would be worthwhile to test the approach using more variability in initial cohort sizes; for example, by using a negative binomial distribution rather than a Poisson (see the sketch below).
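
A sketch of that extension is given below. The dispersion parameter (size.nb) is a hypothetical value introduced for illustration; smaller values produce more variable cohort sizes. After replacing the Poisson draws, rerun the projection and survey-sampling code above to regenerate SurveyC.

size.nb <- 5 # Hypothetical dispersion parameter (smaller = more variable cohorts)
N[1,] <- rnbinom(Y, size=size.nb, mu=MeanR) # Starting size of each cohort at first age
for (a in 2:A){
  N[a,1] <- rnbinom(1, size=size.nb, mu=MeanS^(a-1)*MeanR) # Starting sizes for ages 2+, year 1
} #a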

It is important to keep in mind our predicted survey catches are not estimates of true abundance but simply scaled to the survey catches. Millar and Meyer (2000) illustrate a Bayesian approach for estimating true abundance from survey and harvest catch-at-age matrices.

6.2 Telemetry-based

The focus for this section will be on estimating survival using fixed receiver stations and telemetry tags such as transmitters or PIT tags. Letting fish “detect themselves” eliminates the need for costly field surveys that often result in low detection probabilities. For example, Hewitt et al. (2010) showed that survival estimates using fixed stations and automated detections were much better than those obtained using only survey recaptures. Similarly, Hightower et al. (2015) estimated survival of sonic-tagged Atlantic sturgeon which were detected as they migrated past receivers located in estuaries and rivers. The key in studies of this type is either to have many monitoring sites or to position sites in areas where tagged fish are likely to pass.

These studies produce detections that can be organized as a capture history matrix. Each row of the matrix is a tagged fish, and each column is a time period. The period can be chosen based on whatever time scale is relevant and practical; for example, monthly or quarterly detections would provide insights about seasonal survival. Each matrix cell indicates whether a fish was detected (1) or not (0). A fish that was not detected could be dead, or it may simply not have moved within range of a receiver during that period. Interpreting the capture history for an individual depends on the pattern. There is no ambiguity about an individual with a capture history of 11011: it was alive to the end of the study but was not detected on the third occasion. A sequence such as 11000 is not as straightforward because the fish's fate is uncertain for the final three occasions. A model is essential for making inferences in such cases; a toy example matrix is shown below.
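
The following toy capture-history matrix (illustrative values only) shows three fish monitored over five periods.

CH.example <- matrix(c(1,1,0,1,1,   # Known alive throughout; missed on occasion 3
                       1,1,0,0,0,   # Fate uncertain after occasion 2
                       1,0,1,1,0),  # Known alive through occasion 4; fate then uncertain
                     nrow=3, byrow=TRUE)
CH.example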

The capture history data can be analyzed using a Cormack-Jolly-Seber (CJS) model (Kéry and Schaub 2011). The model is based only on captures or detections of tagged fish, which are assumed to be representative of the population as a whole. Technically, the model provides an estimate of apparent survival because we are not able to distinguish between mortality and permanent emigration. If the study area is closed to migration, then the estimate can be interpreted as a true survival rate.

Our simulation is for the most basic case, with survival and detection probabilities that are constant over time. Extended models that allow for time variation are presented by Kéry and Schaub (2011). The number of occasions and sample size are arbitrary but they serve as useful replicates here because the underlying parameters are constant over time and individuals. We assume a moderate survival rate (0.5) and a high detection probability (0.8). Detection probabilities of that magnitude are difficult to achieve by survey sampling but are reasonable when using an automated monitoring system (Hewitt et al. 2010).

rm(list=ls()) # Clear Environment

# Modified from Kery and Schaub 2011, Section 7.3
# Define parameter values
n.occasions <- 5            # Number of capture occasions
marked <- 50                # Number marked (all at time 1)
S.true <- 0.5
p.true <- 0.8

# Define function to simulate a capture-history (CH) matrix
simul.cjs <- function(S.true, p.true, marked, n.occasions){
  Latent <- CH <- matrix(0, ncol = n.occasions, nrow = marked)
  # Fill the CH matrix
  for (i in 1:marked){
    Latent[i,1] <- CH[i,1] <- 1    # Known alive state at release occasion
    for (t in 2:n.occasions){
      # Bernoulli trial: does individual survive occasion?
      Latent[i,t] <- rbinom(1, 1, (Latent[i,(t-1)]*S.true))
      # Bernoulli trial: is individual recaptured?
      CH[i,t] <- rbinom(1, 1, (Latent[i,t]*p.true))
    } #t
  } #i
  return(CH)
}

# Execute function
CH <- simul.cjs(S.true, p.true, marked, n.occasions)

Our code, modified from Kéry and Schaub (2011), introduces the creation of a user-defined function, which is a way of setting off a discrete block of code that has a specific purpose. We pass to the function the necessary arguments (S.true, p.true, marked, n.occasions) and the function returns the desired result (capture history matrix CH). The latent (true) state and the capture history observation are modeled using Bernoulli trials. The latent state at time t is based on the survival rate between times t-1 and t, but multiplied by the true state at time t-1. That results in live fish having a survival probability of 1 * S.true, whereas the probability for dead fish is 0 * S.true (ensuring that dead fish remain dead). A similar trick is used to generate the observations, given the known true states. Detection probability is 1 * p.true for live fish and 0 * p.true for dead fish. The matrix for latent state is internal to the function, but (for instructional purposes) can be compared to the observed matrix by selecting and running all lines between the second and next-to-last lines of the function. You should see the connection between the true latent state and observations (e.g., a true sequence of 11100 versus an observed sequence of 10100).

# Load necessary library
library(rjags)
library(R2jags)

# Specify model in JAGS
sink("CJS_StateSpace.txt")
cat("
model {

# Priors and constraints
S.est ~ dunif(0, 1)
p.est ~ dunif(0, 1)

# Likelihood
for (i in 1:marked){
   z[i,1] <- 1 # Latent state 1 (alive) at time of release (time 1 here)
   for (t in 2:n.occasions){
      # State process. If z[t-1]=1, prob=S. If z=0, prob=0
      z[i,t] ~ dbern(z[i,t-1]*S.est)

      # Observation process. If z[t]=1, prob=p. If z=0, prob=0
      CH[i,t] ~ dbern(z[i,t]*p.est)
      } #t
   } #i
}
",fill = TRUE)
sink()

# Bundle data
jags.data <- list(CH = CH, marked = marked, n.occasions = n.occasions)

# Initial values for latent state z
init.z <- function(ch){
  state <- ch
  for (i in 1:dim(ch)[1]){
    n1 <- min(which(ch[i,]==1)) # Column with first detection (col 1 here)
    n2 <- max(which(ch[i,]==1)) # Last detection (so known to be alive between n1 and n2)
    state[i,n1:n2] <- 1
    state[i,n1] <- NA
  }
  state[state==0] <- NA
  return(state)
}

jags.inits <- function(){list(S.est = runif(1, 0, 1), p.est = runif(1, 0, 1),
                              z = init.z(CH))}

# Parameters monitored
jags.params <- c("S.est", "p.est")

model.file <- 'CJS_StateSpace.txt'

jagsfit <- jags(data=jags.data, jags.params, inits=jags.inits,
                n.chains = 3, n.thin=1, n.iter = 4000, n.burnin=2000,
                model.file)
print(jagsfit)
plot(jagsfit)

The JAGS code for fitting the model uses our standard uninformative prior distribution for the two probabilities (survival, detection probability). The likelihood mirrors the simulation code. The matrix z contains the latent state, which is known in some cases but must be inferred when the detection pattern is ambiguous. JAGS requires good initial values (or will terminate execution if it encounters an initial value that is inconsistent with the observations, such as detection of a fish that the model considers to be dead). To avoid that, we use the Kéry and Schaub (2011) approach where the latent state is set to 1 between the first and last detections of an individual (user-defined function init.z()).

Estimates appear to be very reliable for this scenario, where capture histories for 50 fish are used to estimate only two time-independent parameters. Would you obtain usefully precise results with a sample size of 25 or 30 fish? How does reliability change as you reduce the assumed capture probability?
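
A starting sketch for exploring those questions follows; it reuses the simulation function, JAGS model, and initial-value function defined above, changing only the number of fish marked.

marked <- 25 # Try a smaller release
CH <- simul.cjs(S.true, p.true, marked, n.occasions)
jags.data <- list(CH = CH, marked = marked, n.occasions = n.occasions)
jagsfit.small <- jags(data=jags.data, jags.params, inits=jags.inits,
                      n.chains = 3, n.thin=1, n.iter = 4000, n.burnin=2000,
                      model.file='CJS_StateSpace.txt')
print(jagsfit.small) # Compare credible interval width to the 50-fish run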

One important advantage of this telemetry method is that estimates are essentially produced in real-time and on any time scale. For example, the method could be used to estimate average annual survival over the first few years after a regulation change. Even better would be to modify the code to generate annual survival estimates, so that monitoring could be done over a block of years with a management action taken midway.

6.3 Tag-based

Survival can be estimated by a multi-period study using tag returns. An important advantage of this approach is that field effort is limited to the tagging process because the fishery generates the tag returns. The design requires a release of tagged fish at the start of each period. The releases need not be the same but here we assume a constant number released for simplicity. Like the telemetry method (Section 6.2), estimates can be produced for whatever time interval is relevant (e.g., monthly, yearly).

This study design is similar to band-recovery models used in wildlife studies (Brownie et al. 1985; Kéry and Schaub 2011). Those methods were pioneered for migratory birds, with bands returned from individuals that were harvested. In fisheries tagging studies, tags may also be reported from fish that are caught and released, but here we assume that returned tags reflect only harvest. Catch-and-release complicates the analysis because the tag “dies” but the fish is not killed. Jiang et al. (2007) provide analytical methods for addressing the combination of harvest and catch-and-release.

We begin by setting up matrices for tags-at-risk (tagged fish still alive) and the outcomes from each release. The matrix of outcomes or fates is analyzed row-by-row using a multinomial distribution. For example, a fish tagged in period 1 could be caught in any period up to the end of the study. It could also end up in the final column for fish not seen again, which would account for fish that were still alive or died of natural causes or were harvested but the tag was not reported. We do not assume here any knowledge of the tag-reporting rate. If that information were available (e.g., through use of high-reward tags that provide an assumed reporting rate of 100%), then it would be possible to partition total mortality into fishing and natural component rates (Brownie et al. 1985). Here we keep things simple and estimate only the tag-recovery rate, which is the product of the exploitation rate and tag-reporting rate. In planning a tag-return study, the best results will be obtained by ensuring a tag-reporting rate close to 100%. Non-reporting results in lost information (a reduced sample size) and poorer precision.

rm(list=ls()) # Clear Environment

# Parameter values for three-yr experiment
Periods <- 3  # Equal number of release and return periods assumed
NTags <- 1000 # Number released each period
S.true <- 0.8 # Survival rate
r.true <- 0.1 # Tag recovery rate (exploitation rate * tag-reporting rate)

# Calculated value(s), array creation
TagsAtRisk <- array(data=NA, dim=c(Periods, Periods))
TagFate <- array(data=NA, dim=c(Periods, Periods+1))   # Extra column for tags not seen again

for (i in 1:Periods){ # Expected values for tags-at-risk, returned/not-seen-again
    TagsAtRisk[i,i] <- NTags
    TagFate[i,i] <-  TagsAtRisk[i,i] * r.true # Returns in release period
  j <- i+1
  while(j <= Periods) {
      TagsAtRisk[i,j] <- TagsAtRisk[i,(j-1)]*S.true
      TagFate[i,j] <- TagsAtRisk[i,j] * r.true
      j <- j+1
     } #while
   TagFate[i, Periods+1] <- NTags-sum(TagFate[i,i:Periods]) # Tags not seen again
  } #i

#Random trial (tags returned by period or not seen again)
RandRet <- array(data=NA, dim=c(Periods, Periods+1))
  for (i in 1:Periods) # Create upper diagonal matrix of random outcomes
  {
    RandRet[i,i:(Periods+1)] <- t(rmultinom(1, NTags, TagFate[i,i:(Periods+1)]))
  } #i

Tags at risk are contained in an upper triangular matrix, with the diagonal element being the number released (NTags) and subsequent elements obtained as the previous number at risk multiplied by the survival rate. We arbitrarily assume Periods=3, a true survival rate (S.true) of 0.8 and a tag-recovery rate of 0.1. You should inspect the matrix by clicking on the name in the Environment window. The above parameters result in a vector of 1000, 800, and 640 tags at risk from a release in period 1. Cells in the matrix of fates are obtained as the number at risk times the tag-recovery rate, plus a final column for fish not seen again. The number released (1000) is arbitrary (and ambitious), but using a large value is helpful for reviewing the calculations that produce the two matrices. The parameter values can be changed to mimic any planned study design.

This section of code introduces a while loop. This different looping construct is needed because all rows except the last one include periods after the release. We need to be able to skip the additional processing in the last row, which has only the diagonal element. The final column for each row is the not-seen-again calculation. That value is highest for the most recent release, because most of those fish will still be alive. That uncertainty about whether fish are still at risk or have died is reduced by extending the number of periods for tag returns. Our example has the same number of releases and return periods, but the code could be modified to allow for additional periods of tag returns.

Inspecting the matrix of fates is helpful for understanding how survival is estimated. The expected tag returns within any column form ratios that equal the survival rate. For example, we would expect 80 tags returned in the second period from the first release compared to 100 from the second release, because the first release has been at large for one additional period. Similarly, the ratios of returns from successive rows in period 3 also equal the survival rate. Thus we have several cells (and ratios) that provide information about survival, which we assumed to be constant over the three periods.
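
The sketch below checks that logic directly using the expected-value matrix created above.

TagFate[1,2] / TagFate[2,2] # Expected returns: 80/100 = S.true
TagFate[2,3] / TagFate[3,3] # Same ratio one column to the right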

The final lines of code in this section introduce the stochasticity associated with a multinomial distribution. The matrix of fates (expected values) provides the cell probabilities (rescaled internally by the rmultinom() function). It is instructive to compare the expected values and random realizations to see the level of variability associated with the multinomial distribution. Now the ratios of random values within a column no longer equal (but will vary around) the true survival rate.
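
A quick comparison of the expected and realized matrices (a sketch using the objects above) makes the added variability obvious.

round(TagFate, 1)           # Expected fates
RandRet                     # One random realization
RandRet[1,2] / RandRet[2,2] # Varies around S.true rather than equaling it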

# Load necessary library
library(rjags)
library(R2jags)

# Specify model in JAGS
sink("TagReturn.txt")
cat("
model {
  # Priors
  S.est ~ dunif(0,1) # Time-independent survival rate
  r.est ~ dunif(0,1) # Time-independent tag-recovery rate

  # Calculated value
  A.est <- 1-S.est # Rate of mortality

# Cell probabilities
for (i in 1:Periods) {
   p.est[i,i] <- r.est
   for (j in (i+1):Periods) {
      p.est[i,j] <- p.est[i,(j-1)] * S.est
      } #j
    p.est[i,Periods+1] <- 1 - sum(p.est[i, i:Periods])   # Prob of not being seen again
    } #i

# Likelihood
  for (i in 1:Periods) {
    RandRet[i,i:(Periods+1)] ~ dmulti(p.est[i,i:(Periods+1)], NTags)
    }#i
 } # model

",fill=TRUE)
sink()

     # Bundle data
  jags.data <- list("RandRet", "Periods", "NTags")

  S.init <- runif(1,min=0,max=1)
  r.init <- 1-S.init

  # Initial values
  jags.inits <- function(){list(S.est = S.init, r.est=r.init)}

  model.file <- 'TagReturn.txt'

  # Parameters monitored
  jags.params <- c("S.est", "r.est", "A.est")

   # Call JAGS from R
  jagsfit <- jags(data=jags.data, jags.params, inits=jags.inits,
                  n.chains = 3, n.thin=1, n.iter = 4000, n.burnin=2000,
                  model.file)
  print(jagsfit)
  plot(jagsfit)

The analysis begins by establishing the usual uninformative prior distributions for the two rates. We calculate the total mortality rate (A.est) as the complement of the survival rate (1-S.est). One important advantage of a Bayesian analysis is that we automatically obtain measures of uncertainty for functions of model parameters. Here the function is simply 1-S.est, but this advantage applies regardless of the function’s complexity.
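
For example, the posterior distribution for the derived quantity A.est is available directly from the fitted object (a one-line sketch):

quantile(jagsfit$BUGSoutput$sims.list$A.est, probs=c(0.025, 0.5, 0.975)) # Median and 95% credible interval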

The likelihood calculation (dmulti()) requires cell probabilities for the multinomial distribution, so we calculate those probabilities directly rather than keeping track of numbers of tags by fate. The probability of a tag from release i being returned in period j is p.est[i,j] <- p.est[i,(j-1)] * S.est. Make sure that you understand this expression for specific values of i and j. Can you think of another way of coding this calculation? (One alternative is sketched below.) Note that for loops work differently in JAGS compared to R, so we are able to use them for row and column processing. In JAGS, the ‘j’ loop is not executed when i=3 because j>Periods. Note that our initial value for the tag-recovery rate is obtained as 1-S.init. The tag-recovery rate cannot exceed 1 minus the survival rate, and would typically be less because natural death is another possible outcome.
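
One equivalent alternative, sketched here in R with example values, computes each cell probability directly as the probability of surviving j-i periods and then being recovered (in JAGS, pow(S.est, j-i) would play the role of the ^ operator).

S.example <- 0.8; r.example <- 0.1 # Example values, matching the simulation
i <- 1; j <- 3
r.example * S.example^(j - i) # Same value as the recursive form p.est[i,j]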

Estimates are good for this case, which is not surprising given the very large releases and only two time-independent parameters to estimate. The estimated number of effective parameters was 1.9 for one run, which is consistent with the two apparent model parameters (S.est, r.est) and A.est which is simply a function of S.est.

If you think about a local system with which you are familiar, how practical would it be to tag 1000 fish at the start of each period (e.g., annually)? If you were responsible for planning the tagging operation, what sampling gear would you recommend? Would there be any potential concern about mortality due to handling? What size field crew would be needed and how much time would the tagging operation require? Could you use simulation to explore different sample sizes before recommending a specific field design?

6.4 Exercises

  1. For the default age-composition analysis without gear selectivity (Section 6.1), compare credible intervals for survival rate at sample sizes of 30 versus 200 fish.

  2. For the telemetry method of estimating survival (Section 6.2), approximately what sample size would produce an estimate with the levels of precision recommended by Robson and Regier (1964) for management (25%) or preliminary studies (50%)?

  3. For the tag-return method (Section 6.3), compare credible intervals for survival rate at releases of 50, 100, and 200 fish. How do the results compare to the Robson and Regier (1964) recommendations? In what ways is this simulated field study optimistic about the reliability of results?

References

Allen, M. S. 1997. Effects of Variable Recruitment on Catch-Curve Analysis for Crappie Populations. North American Journal of Fisheries Management 17(1):202–205.
Brownie, C., D. R. Anderson, K. P. Burnham, and D. S. Robson. 1985. Statistical inference from band recovery data: A handbook. U.S. Fish and Wildlife, Federal Government Series 156, Washington, D.C.
Campana, S. E. 2001. Accuracy, precision and quality control in age determination, including a review of the use and abuse of age validation methods. Journal of Fish Biology 59(2):197–242.
Hewitt, D. A., E. C. Janney, B. S. Hayes, and R. S. Shively. 2010. Improving Inferences from Fisheries Capture-Recapture Studies through Remote Detection of PIT Tags. Fisheries 35(5):217–231.
Hightower, J. E., M. Loeffler, W. C. Post, and D. L. Peterson. 2015. Estimated Survival of Subadult and Adult Atlantic Sturgeon in Four River Basins in the Southeastern United States. Marine and Coastal Fisheries 7(1):514–522.
Jiang, H., K. H. Pollock, C. Brownie, J. M. Hoenig, R. J. Latour, B. K. Wells, and J. E. Hightower. 2007. Tag Return Models Allowing for Harvest and Catch and Release: Evidence of Environmental and Management Impacts on Striped Bass Fishing and Natural Mortality Rates. North American Journal of Fisheries Management 27(2):387–396.
Kéry, M., and M. Schaub. 2011. Bayesian Population Analysis using WinBUGS: A Hierarchical Perspective, 1st edition. Academic Press, Boston.
Lorenzen, K. 1996. The relationship between body weight and natural mortality in juvenile and adult fish: A comparison of natural ecosystems and aquaculture. Journal of Fish Biology 49(4):627–642.
McCarthy, M. A. 2007. Bayesian Methods for Ecology, 1st edition. Cambridge University Press, Cambridge, UK; New York.
Millar, R. B., and R. Meyer. 2000. Bayesian state-space modeling of age-structured data: Fitting a model is just the beginning. Canadian Journal of Fisheries and Aquatic Sciences 57(1):43–50.
Ricker, W. E. 1975. Computation and interpretation of biological statistics of fish populations. Dept. of Fisheries and Oceans: Minister of Supply and Services Canada, Ottawa.
Robson, D. S., and H. A. Regier. 1964. Sample Size in Petersen Mark-Recapture Experiments. Transactions of the American Fisheries Society 93(3):215–226.