Abstract
Estimates of age‐specific mortality are regularly used in ecology, evolution, and conservation research. However, estimating mortality of the dispersing sex, in species where one sex undergoes natal dispersal, is difficult. This is because it is often unclear whether members of the dispersing sex that disappear from monitored areas have died or dispersed. Here, we develop an extension of a multievent model that imputes dispersal state (i.e., died or dispersed) for uncertain records of the dispersing sex as a latent state and estimates age‐specific mortality and dispersal parameters in a Bayesian hierarchical framework. To check the performance of our model, we first conduct a simulation study. We then apply our model to a long‐term data set of African lions. Using these data, we further study how well our model estimates mortality of the dispersing sex by incrementally reducing the level of uncertainty in the records of male lions. We achieve this by taking advantage of an expert's indication on the likely fate of each missing male (i.e., likely died or dispersed). We find that our model produces accurate mortality estimates for simulated data of varying sample sizes and proportions of uncertain male records. From the empirical study, we learned that our model provides similar mortality estimates for different levels of uncertainty in records. However, a sensitivity of the mortality estimates to varying uncertainty is, as can be expected, detectable. We conclude that our model provides a solution to the challenge of estimating mortality of the dispersing sex in species with data deficiency due to natal dispersal. Given the utility of sex‐specific mortality estimates in biological and conservation research, and the virtual ubiquity of sex‐biased dispersal, our model will be useful to a wide variety of applications.
Keywords: African lion, age‐specific mortality, dispersal, sex differences in mortality, Siler model, true survival
Introduction
Mortality estimates of both sexes for wild animal populations are fundamental for testing hypotheses derived from ecological and evolutionary theory, and for predicting population size and structure for population management purposes. However, estimating mortality of at least one of the sexes is commonly hindered by incomplete data on dispersing individuals. For example, in many large mammal species, males leave their natal place or social group around the age of maturity, while females are philopatric. If individuals of the dispersing sex, in this case males, leave the areas monitored by field studies that collect resighting data on marked individuals, the quality of the gathered data suffers in the following way.
Dispersing individuals are usually unavailable for collecting age‐at‐death data because following dispersing individuals using telemetry or GPS technology is costly and labor‐intensive. Furthermore, for many species deaths are rarely observed in the field. Instead, deaths are inferred from permanent disappearances of individuals from the study area. However, missing members of the dispersing sex, which were old enough for dispersal, may have died or dispersed. This uncertain fate of disappeared members of the dispersing sex makes the estimation of the mortality difficult using existing methods. The estimation of mortality for the philopatric sex is in comparison relatively straightforward because missing members of the philopatric sex are likely dead, even if their bodies are not found, as these individuals do not disperse.
Models to infer mortality using capture–mark–recapture/resighting (CMRR) data derived from the Cormack–Jolly–Seber framework (CJS; after Cormack 1964; Jolly 1965; Seber 1965) can accommodate both uncensored and right‐censored records (i.e., individuals known to be alive after the last observation). These approaches exploit the fact that each type of record contributes different information (White and Burnham 1999). Extensions to the initial models have been developed that accommodate species‐specific life histories and data issues arising from the movement of the individuals in relation to the spatial and temporal distribution of the marking and resighting effort. Accordingly, these models, known as multistate models (Arnason 1973; Schwarz et al. 1993), incorporate incomplete and heterogeneous resighting probabilities, multiple states, and multiple locations (e.g., Lebreton and Pradel 2002; Mackenzie et al. 2009; Cubaynes et al. 2010). Pradel (2005) extended the multistate framework to account for unobservable states, particularly in the context of movement between sites. This extension, known as multievent models, incorporates the estimation of uncertain states into the modeling of survival while accounting for dispersal rates and site fidelity (Avril et al. 2012; Lagrange et al. 2014). Alternatively, Ergon and Gardner (2014) extended the CJS model into a robust‐design spatial capture–recapture (RD‐SCR) model to jointly model survival and dispersal where the activity centers are treated as a latent state. Similarly, Schaub and Royle (2014) have recently developed a spatially explicit Cormack–Jolly–Seber approach that jointly models mortality and dispersal using movement data for species in which dispersal can be described as a random walk.
These approaches provide a fundamental framework to estimate survival under state uncertainty, particularly in the context of dispersal. Further complications arise when information on sex or ages is missing. In order to address issues with missing records in CMRR data, Bayesian approaches have been developed that estimate survival probabilities and transition probabilities between states and locations while augmenting data (Dupuis 1995, 2002; King and Brooks 2002). Some of these approaches estimate latent (unknown) states jointly with all other model parameters in a hierarchical framework using Markov chain Monte Carlo (MCMC) algorithms (Clark et al. 2005; Colchero and Clark 2012; Colchero et al. 2012). As latent states can be either finite sets of discrete states (e.g., locations or stages) or continuous variables (e.g., date of birth or death), this framework is suitable for developing a survival model that treats dispersal as a latent state, and can therefore accommodate uncertain records due to natal dispersal. This is particularly important for data sets where individuals of one or both sexes disperse but information on their movements is missing. In such cases, there is no information on the fate of potentially dispersing individuals at the last time they are detected. At this time, their dispersal state (i.e., either dispersed or died) is unknown, and thus, the estimation of survival can be biased if this latent state is not explicitly modeled.
Here, we present a Bayesian hierarchical model that builds upon the multievent framework (Pradel 2005) and that estimates age‐specific mortality and dispersal for species where one sex is philopatric and one sex undergoes natal dispersal. The model fits a parametric age‐specific mortality model as a function of age and sex jointly with the estimation of the distribution of ages at dispersal, treating potential dispersal as a latent state. Using simulated data, we first validated the model. We then applied the model to estimate age‐specific mortality of both sexes for Serengeti lions (Panthera leo) in Tanzania. As this particular data set contains the expert opinion from the head of the study (C. Packer, unpublished data) on whether a missing male is likely to have dispersed or died, we used this information to gain further insights into the workings of our method. In particular, using the expert's opinion, we varied whether missing males entered the model as potential or known dispersers, and compared the mortality estimates among the different models in order to evaluate the influence of the imputation of dispersal as a latent state on our mortality estimates. For simplicity, we will refer to the philopatric sex as being female, and to the dispersing sex as being male. However, the model is flexible as to which sex is the dispersing sex, while it can be extended for the case where both sexes disperse.
Methods
We focus on species in which individuals disperse out of the study area only once at around the age of maturity (“natal dispersal”) and where information on individual dispersal events is unavailable. In addition, our model is developed for data sets where movements within the study area are missing. To isolate the effect of uncertainty in male records on mortality estimates from other effects, we focus on data that meet the following assumptions. We assume that individuals are resighted with certainty if they are alive and in the study area. For estimating the age‐specific probabilities of dispersal for the dispersing sex, we further assume that mortality inside and outside the study area is equal and that individuals born outside the study area disperse into the study area with the same probabilities as individuals born in the study area disperse out of it. We also assume that ages of individuals whose birth was not observed (left‐truncated records) can be estimated with sufficient certainty by a trained observer, so that we do not need to include time of birth as a latent state in the model and can model ages at death as a continuous variable. Time of birth can, however, be included as a latent state following Colchero and Clark (2012) and Colchero et al. (2012). As the data available to us for the empirical application contained individuals that died before sexing was possible, we did construct the model to accommodate this type of record, treating the sex of unsexed individuals as another latent state. Finally, we make one assumption that we know is not met for data from wild animal populations, namely that mortality only depends on age and sex and not on any other covariates. However, this assumption allows us to develop a model to estimate baseline mortality for pooled data, which can later on be easily extended to incorporate other covariates.
Life history data
Data structure
The life history data used to estimate age‐ and sex‐specific mortality included records for native‐borns and immigrants. Native‐borns were born in the study population, defined as all individually recognizable and constantly monitored individuals. Immigrants entered the study population some time after their birth due to migration (Fig. 1). Similarly, individuals that were located in the study area at the time the study began had a first detection age $x^F_i > 0$. The recorded types of departure from the population included death, censoring due to being alive at the end of the study, or uncertain fate (death or censoring through dispersal). Uncertain fates through dispersal were only caused by dispersals from the study population to an external population, and not by dispersals within the study population. Here, we refer to this out‐migration from the study population when we use the term “dispersal.”
Serengeti population
The study population occupied a 2000 km² region of Serengeti National Park, Tanzania, that lies at the heart of the Serengeti–Mara ecosystem. The study site is characterized by seasonal rainfall and a southeast to northwest gradient in vegetation from short to tall grassland to open woodlands (Packer 2005; Mosser et al. 2009). We analyzed demographic data collected between 1966 and 2013. Observations were opportunistic between 1966 and 1984, and most animals were sighted 1–3 times per month. Study prides have been monitored with radio telemetry since 1984, allowing each animal to be observed 2–6 times per month. All individuals are identified from natural markings (Packer et al. 1991), and birth dates of cubs born in the study area are deduced from lactation stains on the mothers. A large number of nomadic males enter the area, and a small proportion become resident in one or more of the resident prides. Our analyses exclude all nomadic males that never became residents in the study area (N = 548, ∼25% of all observations on males). These left‐truncated and right‐censored records contain little survival information. As a consequence, a model that included these records did not converge. Individuals with unknown dates of birth were assigned an estimated age by a trained observer, using age indicators (e.g., relative body size, nose coloration, and eruption and wear of teeth) (Smuts et al. 1978; Whitman et al. 2004). The data set contained a large number of individuals of unknown sex. As the vast majority of these unsexed individuals died within the first weeks after birth, we excluded all individuals with last detection ages younger than 0.25 years of age. The final data set contained observations on 1341 females, 1263 native‐born males, 316 immigrants, and 269 unsexed native‐born individuals.
The proportion of females among all native‐born individuals (excluding immigrants), assuming a sex ratio of 1 to 1 among individuals that died before their sex could be determined, was 0.51.
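The reported proportion can be checked against the counts given for the final data set. The short calculation below assumes that all 1341 females are native‐born and splits the 269 unsexed native‐born individuals evenly between the sexes, as stated above:

```latex
\frac{1341 + 269/2}{1341 + 1263 + 269} = \frac{1475.5}{2873} \approx 0.51
```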
Mortality analysis
Model variables and functions
We developed a model that estimates both age‐specific mortality and dispersal where the dispersal state is unknown. Thus, at the time of last detection the dispersal state, $d_i$, for an individual i that belongs to the dispersing sex is treated as a latent state that needs to be estimated. Accordingly, the model requires defining random variables and probability functions for the age at death, X, and for the age at natal dispersal, Y, as well as for the binary latent state, D, with support given by $d_i = 1$ if an individual is imputed to have dispersed and $d_i = 0$ otherwise. Furthermore, we have extended the model to account for unknown sex, S (see Table 1 for a summary of all random variables, parameters, and indicators).
Table 1. Summary of the random variables, indicators, parameters, and functions used in the model.
Symbol | Description
Modeled random variables
X | Random variable for age at death, where x is any age element
Y | Random variable for age at natal dispersal with elements y
D | Binary random variable for disperser or nondisperser
S | Binary random variable for sex
Observed variables and indicators
$t^F$ | Vector of times of first detection
$t^L$ | Vector of times of last detection
b | Vector of times of birth
$x^F$ | Vector of ages at first detection ($x^F = t^F - b$)
$x^L$ | Vector of ages at last detection ($x^L = t^L - b$)
m | Indicator vector for immigrants ($m_i = 1$ if immigrant)
Updated indicators
d | Indicator vector for dispersers ($d_i = 1$ if disperser and $d_i = 0$ otherwise)
s | Indicator vector for sex ($s_i = 1$ if female and $s_i = 0$ otherwise)
Parameters
θ | Vector of mortality parameters
γ | Vector of natal dispersal parameters
Functions: mortality
$\mu(x \mid \theta)$ | Mortality (Siler model)
$S(x \mid \theta)$ | Survival
$F(x \mid \theta)$ | CDF for age at death ($F(x) = 1 - S(x)$)
$f(x \mid \theta)$ | PDF for age at death
Functions: dispersal
$g(y \mid \gamma)$ | PDF for age at natal dispersal (gamma distribution)
$G(y \mid \gamma)$ | CDF for age at natal dispersal
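The age‐specific mortality model requires defining the mortality function or hazard rate as
$$\mu(x \mid \theta) = \lim_{\Delta x \to 0} \frac{\Pr(x \le X < x + \Delta x \mid X \ge x, \theta)}{\Delta x}, \tag{1}$$
where θ is a vector of mortality parameters to be estimated. The estimated mortality can be used to calculate the probability of survival from birth to age x, or survival function,
$$S(x \mid \theta) = \exp\left[-\int_0^x \mu(t \mid \theta)\,dt\right], \tag{2a}$$
the probability that death occurs before age x, or the cumulative distribution function (CDF),
$$F(x \mid \theta) = 1 - S(x \mid \theta), \tag{2b}$$
and the probability density function (PDF) for age at death
$$f(x \mid \theta) = \mu(x \mid \theta)\,S(x \mid \theta). \tag{2c}$$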
To capture the bathtub‐shaped mortality rates typical of large mammals, we used the Siler model (Siler 1979) in the form
$$\mu(x \mid \theta) = \exp(a_0 - a_1 x) + c + \exp(b_0 + b_1 x), \tag{3}$$
where $\theta = [a_0, a_1, c, b_0, b_1]$, with $a_0, b_0 \in (-\infty, \infty)$ and $a_1, c, b_1 \ge 0$. The Siler model is a competing risk model constituted by three additive mortality hazards. The parameters capture different aspects of the shape of the age trajectory with the exponential of $a_0$ being the initial level of mortality rates and $a_1$ governing the exponential decrease in mortality over infant and juvenile ages. The c parameter scales mortality rates up or down and is usually interpreted as reflecting age‐independent causes of mortality. This parameter is also dominant in capturing mortality in early adult ages when infant mortality has declined and senescence mortality has not yet risen. The exponential of the parameter $b_0$ represents the initial mortality of the age‐dependent increase of mortality and $b_1$ determines the rate of this increase (Siler 1979).
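The functions defined in equations 1 to 3 can be sketched in a few lines. The block below is an illustration in Python rather than the R code used for the study, and it assumes the common Siler parameterization in which the infant hazard declines at rate a1 and the senescent hazard rises at rate b1. The parameter values in `THETA` are hypothetical and were chosen only to produce a bathtub shape.

```python
import math

def siler_hazard(x, a0, a1, c, b0, b1):
    """Siler mortality: declining infant term + constant + rising senescent term."""
    return math.exp(a0 - a1 * x) + c + math.exp(b0 + b1 * x)

def siler_cum_hazard(x, a0, a1, c, b0, b1):
    """Closed-form integral of the hazard from 0 to x (needs a1 > 0 and b1 > 0)."""
    infant = math.exp(a0) / a1 * (1.0 - math.exp(-a1 * x))
    senescent = math.exp(b0) / b1 * (math.exp(b1 * x) - 1.0)
    return infant + c * x + senescent

def siler_survival(x, *theta):
    """S(x): probability of surviving from birth to age x."""
    return math.exp(-siler_cum_hazard(x, *theta))

def siler_cdf(x, *theta):
    """F(x) = 1 - S(x): probability that death occurs before age x."""
    return 1.0 - siler_survival(x, *theta)

def siler_pdf(x, *theta):
    """f(x) = mu(x) * S(x): density of age at death."""
    return siler_hazard(x, *theta) * siler_survival(x, *theta)

# Hypothetical values (a0, a1, c, b0, b1); not estimates from the study.
THETA = (-1.0, 1.0, 0.05, -5.0, 0.3)

for age in (0.0, 3.0, 15.0):
    print(age, round(siler_hazard(age, *THETA), 4), round(siler_survival(age, *THETA), 4))
```

With these illustrative values the hazard is high at birth, lowest in early adulthood, and rises again at old ages, which is the U shape the Siler model is meant to capture.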
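To model the ages at dispersal, we defined the random variable Y for age at dispersal, where the age at natal dispersal was $Y \sim G(y \mid \gamma)$ for ages y > 0, with $G$ being the Gamma distribution function with parameter vector $\gamma = [\gamma_1, \gamma_2]$. This distribution yields the probability density function (PDF) of age at natal dispersal given by
$$g(y \mid \gamma) = \frac{(y - \alpha)^{\gamma_1 - 1}}{\Gamma(\gamma_1)\,\gamma_2^{\gamma_1}} \exp\left(-\frac{y - \alpha}{\gamma_2}\right) \quad \text{for } y > \alpha, \tag{4}$$
with $g(y \mid \gamma) = 0$ for $y \le \alpha$, where α is the minimum age at natal dispersal and $\gamma_1, \gamma_2 > 0$ are the shape and scale parameters.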
At the age of last detection, $x^L$, individuals belonging to the dispersing sex can have dispersed, with a probability conditioned on X and Y given by
$$p(x^L, D = 1 \mid \theta, \gamma) = S(x^L \mid \theta)\, g(x^L \mid \gamma). \tag{5a}$$
It is the joint probability that these individuals have not died and have dispersed shortly after the last detection age. The probability that these individuals have died and have not dispersed, but would have dispersed at later ages, is accordingly
$$p(x^L, D = 0 \mid \theta, \gamma) = f(x^L \mid \theta)\,[1 - G(x^L \mid \gamma)]. \tag{5b}$$
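To make the two competing explanations for a disappearance concrete, the sketch below evaluates both joint probabilities for one last detection age and normalizes them. It is an illustration only: it assumes a flat prior over the two states, a Gamma distribution with shape and scale parameters shifted by the minimum dispersal age, and hypothetical parameter values. The function names are not taken from the study's code.

```python
import math

def siler_hazard(x, a0, a1, c, b0, b1):
    return math.exp(a0 - a1 * x) + c + math.exp(b0 + b1 * x)

def siler_survival(x, a0, a1, c, b0, b1):
    cum = (math.exp(a0) / a1 * (1.0 - math.exp(-a1 * x)) + c * x
           + math.exp(b0) / b1 * (math.exp(b1 * x) - 1.0))
    return math.exp(-cum)

def gamma_pdf(t, shape, scale):
    """Gamma density at t (shape/scale form); zero for t <= 0."""
    if t <= 0.0:
        return 0.0
    return math.exp((shape - 1.0) * math.log(t) - t / scale
                    - math.lgamma(shape) - shape * math.log(scale))

def gamma_cdf(t, shape, scale):
    """Regularized lower incomplete gamma function via its power series."""
    if t <= 0.0:
        return 0.0
    z = t / scale
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (shape + n)
        total += term
        n += 1
    return min(1.0, total * math.exp(shape * math.log(z) - z - math.lgamma(shape)))

def prob_dispersed(x_last, theta, gamma, alpha):
    """Normalized weight of 'dispersed' versus 'died' (flat prior assumed)."""
    shape, scale = gamma
    surv = siler_survival(x_last, *theta)
    # Alive at x_last and dispersing at that age.
    w_dispersed = surv * gamma_pdf(x_last - alpha, shape, scale)
    # Dying at x_last without having dispersed yet.
    w_died = (siler_hazard(x_last, *theta) * surv
              * (1.0 - gamma_cdf(x_last - alpha, shape, scale)))
    return w_dispersed / (w_dispersed + w_died)

# Hypothetical values for illustration only.
THETA = (-1.0, 1.0, 0.05, -5.0, 0.3)
GAMMA = (4.0, 0.5)   # shape, scale
ALPHA = 1.5          # minimum age at dispersal in years

print(prob_dispersed(1.0, THETA, GAMMA, ALPHA))  # below the minimum dispersal age
print(prob_dispersed(3.0, THETA, GAMMA, ALPHA))
```

Because the survival term appears in both weights, the normalized probability depends only on the dispersal density, the hazard, and the probability of not yet having dispersed.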
As we specify above, the dispersal state is treated as a latent state and is therefore imputed. Below we explain how the likelihoods are specified and how the latent states are imputed. A summary of all the functions, parameters, indicators, and variables is provided in Table 1. R code to simulate data and fit the model can be downloaded from the link provided in the Supporting Information.
Likelihood and posterior
To construct the mortality likelihood, we assigned a different probability to each type of record in Figure 1. The likelihood for the nondispersing individuals (i.e., members of the nondispersing sex or members of the dispersing sex that disappeared at ages younger than the minimum age at dispersal α) is given by
$$p(x^L, x^F \mid \theta) = \frac{f(x^L \mid \theta)}{S(x^F \mid \theta)}, \tag{6a}$$
where $x^L$ corresponds to the age at last detection and $x^F$ is the age at first detection (i.e., $x^F = 0$ for individuals born in the study area and $x^F > 0$ for immigrants or individuals that were located in the study area when the study began). As we mentioned above, we defined dispersal state for all members of the dispersing sex with last seen ages older than the minimum age at dispersal α as a random variable D. It took the value $d_i = 1$ if an individual i, born at $b_i$ and last detected at $t^L_i$, dispersed at its last detection age, $x^L_i$, and 0 if otherwise. For some individuals, $d_i$ is known either because the individuals were known to be alive and in the study area at the end of the study (i.e., right‐censored observations), or because their disappearance was known to be a death or a dispersal. For all other individuals, $d_i$ was imputed as a latent state.
Based on equations (5), the joint mortality and dispersal likelihood for members of the dispersing sex with $x^L_i > \alpha$ is given by
(6b) |
where $m_i$ is an indicator for individuals that joined the study population as immigrants. These individuals contribute important information on ages at death and dispersal.
Furthermore, we also defined a binary variable S for the sex of the individual. With this, we could construct the full Bayesian model as
(7) |
where d was the vector of dispersal states and s was the indicator vector for sex ($s_i = 1$ if female and $s_i = 0$ if male), and $\theta_p$ and $\gamma_p$ are vectors of prior hyperparameters for the mortality and dispersal parameters. Each of the vectors d and s had two subsets represented by the subscripts u for unknown and k for known.
MCMC and conditional posteriors
We used a Markov chain Monte Carlo (MCMC) algorithm to fit the model in equation (7). For all implementations, we ran four parallel MCMC sequences with different randomly drawn starting values and set the number of iterations to 15,000 steps with a burn‐in of 5000 initial steps and a thinning factor of 20. We used a hierarchical framework that only needed the conditionals for posterior simulation by Metropolis sampling (Metropolis et al. 1953; Gelfand and Smith 1990; Clark 2007). This means that for this particular case, the algorithm divided the posterior for the joint distribution of unknowns into four sections: (a) estimation of mortality parameters, (b) estimation of dispersal parameters, (c) estimation of unknown dispersal state, and (d) estimation of unknown sexes. Here, we present each section, specifying the conditional posterior and the acceptance probability for the Metropolis Sampler algorithm.
Section a: Posterior for mortality parameters
The conditional posterior to estimate the mortality parameters θ required only the ages at first and last detection, $x^F$ and $x^L$, and the dispersal states d. The posterior for a given individual i was
(8) |
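where $\theta_p$ was a vector of prior hyperparameters. If the individual was a native‐born, then $x^F_i = 0$ and the denominator in both expressions was equal to 1. At every iteration and for a given parameter $\theta \in \boldsymbol{\theta}$ with conditional posterior $p(\theta \mid \cdots)$, the algorithm proposed a new parameter value $\theta'$ for each element of $\boldsymbol{\theta}$ and accepted it with acceptance probability
$$\min\left\{1, \frac{p(\theta' \mid \cdots)}{p(\theta \mid \cdots)}\right\}. \tag{9}$$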
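The update described above follows the usual Metropolis pattern: propose a new value for one parameter, compare the conditional posterior at the proposed and current values, and accept with probability equal to the smaller of 1 and their ratio. The Python sketch below illustrates that pattern on a toy target. It assumes a symmetric Gaussian random‐walk proposal, which the text does not specify, and `log_post` stands in for whichever conditional posterior is being sampled. The iteration settings match those reported for the study, and the usage example runs one chain.

```python
import math
import random

def metropolis(log_post, start, step_sd, n_iter, burn_in, thin, rng):
    """Component-wise random-walk Metropolis; returns the thinned draws."""
    current = list(start)
    current_lp = log_post(current)
    kept = []
    for it in range(n_iter):
        for j in range(len(current)):
            proposal = list(current)
            proposal[j] += rng.gauss(0.0, step_sd)
            proposal_lp = log_post(proposal)
            # Accept with probability min(1, p(proposal) / p(current)),
            # evaluated on the log scale; 1 - random() lies in (0, 1].
            if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
                current, current_lp = proposal, proposal_lp
        if it >= burn_in and (it - burn_in) % thin == 0:
            kept.append(list(current))
    return kept

def toy_log_post(params):
    """Toy target: independent standard normal densities (up to a constant)."""
    return -0.5 * sum(p * p for p in params)

# 15,000 iterations, burn-in of 5000, thinning factor of 20.
draws = metropolis(toy_log_post, start=[2.0], step_sd=1.0,
                   n_iter=15000, burn_in=5000, thin=20, rng=random.Random(1))
print(len(draws))  # (15000 - 5000) / 20 = 500 retained draws for this chain
```

With these settings, each chain contributes 500 retained draws, so four chains give 2000 draws for posterior summaries.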
Section b: Posterior for dispersal parameters
The conditional posterior to estimate the parameters γ for the distribution of ages at natal dispersal for a given individual i was
(10) |
where $\gamma_p$ was a vector of prior hyperparameters for γ, and the two indicators in the expression mark potential dispersers (1 if the individual belonged to the dispersing sex and disappeared at an age older than the minimum age at dispersal α) and immigrants ($m_i$). We set the minimum age at dispersal to α = 1.75 years for the simulated data and α = 1.5 years for the Serengeti data. The age α corresponded to the earliest age at which immigrants could be detected and potential dispersers could be last seen. For a parameter $\gamma \in \boldsymbol{\gamma}$ with conditional posterior density $p(\gamma \mid \cdots)$, the acceptance probability for a proposed parameter value $\gamma'$ was
$$\min\left\{1, \frac{p(\gamma' \mid \cdots)}{p(\gamma \mid \cdots)}\right\}. \tag{11}$$
Section c: Posterior for dispersal states
Dispersal state was evaluated for individuals that were potential dispersers. The joint probabilities for dispersal state were
(12) |
The first terms on the right‐hand side of equation (12) correspond to the likelihood function as defined in equations (6), while the second terms are the priors for dispersal state. For this section, the acceptance probability for the sampling given the last detection ages, the dispersal states, the potential disperser states, and the immigration states was
(13) |
Section d: Posterior for unknown sexes
Some individuals disappeared before the minimum age at dispersal without their sex being determined. The conditional posterior for the latent state of sex was
(14) |
where the second term on the right‐hand side is a prior for sex based on the sex ratio at birth, or if the analysis was conditioned on survival to age x, based on the sex ratio at age x.
The indicator for potential dispersers (see Section c) was updated in each iteration. Individuals of undetermined sex and last detection ages older than the minimum age at dispersal were assigned 1 if imputed to be male and 0 if imputed to be female. The acceptance probability given the last detection ages and the mortality parameters was
(15) |
Mortality and dispersal priors
We set the Siler parameters for the prior for both sexes to (σ = 0.5), (σ = 0.25), (σ = 0.25), (σ = 0.5), and (σ = 0.25). For dispersal, the Gamma parameters (shape and scale) for the prior were set to with . Priors were normally distributed and truncated at 0, apart from the level parameters of the Siler model ($a_0$ and $b_0$), which were not truncated. Both the mortality and dispersal priors were fairly uninformative. The prior probability of being female was 0.5 for the simulated data and 0.51 for the Serengeti data (see also subsection “Serengeti population”).
Model application and posterior analysis
We fitted the model to the Serengeti data with sex as a covariate, which was imputed for individuals with unknown sex. We included the covariate by making the mortality parameters contained in θ functions of the covariate, namely
(16) |
where $s_i = 1$ if female and 0 otherwise.
In order to gain deeper insights into the performance of our model, we further exploited a unique source of information that is contained in this data set. A Serengeti lion expert used the circumstances accompanying the disappearances of males to deduce whether the individuals may have dispersed (C. Packer, unpublished data). For example, as young males often leave their natal prides with brothers, a simultaneous disappearance of brothers hints that this is likely to be a dispersal event. We fitted the model with three different settings. First, all males with uncertain fates and last detection ages older than the minimum age at dispersal were assigned the state of “potential dispersers” and entered in the model as described in “Section c” above (Model A). Second, all males that were indicated by the expert to potentially have dispersed were entered as “known dispersers” (see equation 6b) (Model B). And third, all males that were indicated by the expert to potentially have dispersed were entered as “potential dispersers” while other uncertain male records were treated as having died at the last detection age (Model C).
To avoid problems arising from the large number of unsexed individuals that died within the first weeks after birth, we fitted the model from the start age of 0.25 years. We predicted mortality rates for each sex using the parameter estimates of every step of the MCMC after burn‐in and thinning and used these predictions to calculate mean and credible intervals of mortality rates. To compare the three models to each other, we computed the life expectancy at the model start age and the Kullback–Leibler (KL) divergences of the mortality parameter posterior densities (Kullback and Leibler 1951; McCulloch 1989; Burnham and Anderson 2001) (see Methods S1 for details on the calculation and the interpretation of KL values).
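Life expectancy at the model start age follows from the survival function: it is the area under the survival curve from the start age onward, divided by survival to the start age. The sketch below shows one way to compute it numerically in Python for a single parameter vector. The trapezoid rule, the upper integration limit `max_age`, and the parameter values are assumptions made for illustration. In a Bayesian fit the same calculation would be repeated for each retained MCMC draw to obtain a posterior distribution.

```python
import math

def siler_survival(x, a0, a1, c, b0, b1):
    """Siler survival function S(x), using the closed-form cumulative hazard."""
    cum = (math.exp(a0) / a1 * (1.0 - math.exp(-a1 * x)) + c * x
           + math.exp(b0) / b1 * (math.exp(b1 * x) - 1.0))
    return math.exp(-cum)

def life_expectancy(start_age, theta, max_age=60.0, step=0.01):
    """Remaining life expectancy at start_age by trapezoid integration."""
    n = int(round((max_age - start_age) / step))
    s_start = siler_survival(start_age, *theta)
    total = 0.0
    previous = 1.0  # survival conditional on reaching start_age
    for k in range(1, n + 1):
        current = siler_survival(start_age + k * step, *theta) / s_start
        total += 0.5 * (previous + current) * step
        previous = current
    return total

# Hypothetical Siler parameters (a0, a1, c, b0, b1); not estimates from the study.
THETA = (-1.0, 1.0, 0.05, -5.0, 0.3)
print(life_expectancy(0.25, THETA))
```

A quick sanity check is the constant‐hazard case: if the infant and senescent terms are negligible and c = 0.5, remaining life expectancy should be close to 1/c = 2 years at any start age.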
Simulated data
To validate the performance of our model, we used known mortality parameters to simulate data of the described structure and checked whether our model accurately retrieved these parameters. To simulate the data, we first randomly assigned a sex for an initial number of individuals by drawing from a binomial distribution, assuming an equal probability of being born male or female. We then randomly drew ages at death ($x_i$) for each individual i by inverse sampling from a Siler CDF (see equations 2b and 3) with parameters $\theta_f$ for females and $\theta_m$ for males. The subscripts f and m denote females and males, respectively. We then randomly drew ages at dispersal for all males by inverse sampling from a gamma CDF with parameters γ = {10,3} and adding the minimum age of dispersal α = 1.75. We assigned every individual a last detection age depending on its sex and dispersal status. For females and for those males whose ages at death were simulated to be younger than their ages at dispersal (i.e., they died before they could disperse), the last detection ages were the ages at death. For the other males, who were simulated to have died after dispersal, the last detection ages were set to be the ages at dispersal. Finally, to add immigrants to the data, we simulated the same number of males being born in the external population. For these males, as before, we randomly drew ages at death and ages at dispersal, and if they were simulated to have dispersed before death, we added them to the data as immigrants with their ages at death recorded as last detection ages $x^L$ and their ages at dispersal recorded as first detection ages $x^F$.
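The simulation steps can be sketched as follows. This is a Python illustration rather than the study's R code, and several details are assumptions: ages at death are drawn by solving the Siler CDF with bisection, dispersal ages use `random.gammavariate` with shape and scale arguments, the external population contains as many males as there are native‐born males, and all parameter values are placeholders rather than the values used in the study. The `fate` field records the simulated truth, which would be hidden from the model for uncertain records.

```python
import math
import random

def siler_cdf(x, a0, a1, c, b0, b1):
    cum = (math.exp(a0) / a1 * (1.0 - math.exp(-a1 * x)) + c * x
           + math.exp(b0) / b1 * (math.exp(b1 * x) - 1.0))
    return 1.0 - math.exp(-cum)

def draw_age_at_death(theta, rng, upper=100.0):
    """Inverse sampling: solve F(x) = u for x by bisection."""
    u = rng.random()
    lo, hi = 0.0, upper
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if siler_cdf(mid, *theta) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def draw_age_at_dispersal(gamma, alpha, rng):
    """Gamma draw (shape, scale) shifted by the minimum dispersal age."""
    shape, scale = gamma
    return alpha + rng.gammavariate(shape, scale)

def simulate_population(n, theta_f, theta_m, gamma, alpha, rng):
    records = []
    n_males = 0
    for _ in range(n):
        if rng.random() < 0.5:  # female: last detection age is age at death
            records.append({"sex": "f", "first": 0.0, "immigrant": False,
                            "last": draw_age_at_death(theta_f, rng), "fate": "died"})
            continue
        n_males += 1
        death = draw_age_at_death(theta_m, rng)
        dispersal = draw_age_at_dispersal(gamma, alpha, rng)
        if death < dispersal:   # died before it could disperse
            records.append({"sex": "m", "first": 0.0, "immigrant": False,
                            "last": death, "fate": "died"})
        else:                   # dispersed first: last seen at dispersal age
            records.append({"sex": "m", "first": 0.0, "immigrant": False,
                            "last": dispersal, "fate": "dispersed"})
    for _ in range(n_males):    # males born in the external population
        death = draw_age_at_death(theta_m, rng)
        dispersal = draw_age_at_dispersal(gamma, alpha, rng)
        if dispersal < death:   # arrives as an immigrant, later dies in the area
            records.append({"sex": "m", "first": dispersal, "immigrant": True,
                            "last": death, "fate": "died"})
    return records

# Placeholder parameter values for illustration only.
THETA_F = (-1.0, 1.0, 0.05, -5.0, 0.3)
THETA_M = (-1.0, 1.0, 0.10, -5.0, 0.3)
GAMMA = (4.0, 0.5)
ALPHA = 1.75

data = simulate_population(500, THETA_F, THETA_M, GAMMA, ALPHA, random.Random(42))
print(len(data), sum(r["immigrant"] for r in data))
```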
We simulated data sets of two different initial numbers of native‐borns (small sample size N = 500 and large sample size N = 2000). Within each sample size, we also produced further data sets where the sexes of all individuals were known, and data sets where we randomly assigned, with a probability of 0.3, the state of “unknown sex” to all individuals that died at <1 year of age. Finally, we simulated data that varied in the proportion of observed or “known” deaths among individuals that were no longer resighted. We used three settings: 1, 5, and 10% known deaths. In total, we thus simulated 12 data sets. All simulations and analyses were conducted using the statistical computing language R (R Core Team 2012).
Results
Simulation study
We used a simulation study to validate our model. For all 12 simulations, the mortality rates used to simulate the data lay within the 95% credible intervals of the estimated mortality for both sexes (Fig. 2). Of all the introduced variations in data quality (sample size, unsexed individuals, proportion “known” deaths), the only one with a marked effect on the performance of the model was varying the sample size. As could be expected, smaller sample sizes resulted in wider credible intervals particularly for males and for older ages of females. Due to the wider credible bands for smaller sample sizes, the respective estimated mortality rates could appear to be less variable over the life span than the mortality rates used to simulate the data. This manifested as a less pronounced U‐shape of the estimated mortality rates when compared to the “real” mortality rates (e.g., second panel in second row of Fig. 2). The proportion of unsexed individuals dying at <1 year of age, and the proportion of known deaths among disappearances did not discernibly affect the retrieval of the mortality parameters.
Application
The empirical models for Serengeti lions converged for all estimated parameters (Fig. 3; see also Figs. S1–S3 for traces). To supplement the visual inspection of the chains, we further confirmed convergence for the c parameters using the potential scale reduction factor (Gelman et al. 2013). We obtained values very close to 1 (between 0.999 and 1.002) for five of the six estimated c parameters (Models A, B, and C, for both sexes). Only one c parameter for females had a value of 1.05, which is still within the limits of having reached convergence. Overall mortality of both sexes was U‐shaped with high initial cub mortality, low mortality of prime‐aged adults, and an age‐dependent increase in mortality at older ages (Fig. 4). Mortality of males was higher than mortality of females across all ages (Fig. 4), except for very young ages, up until 1 year, during which credible bands of male and female mortality overlapped. However, this may be due to the large proportion of unsexed individuals at these ages (see data description) and the imputation of sex as a latent state for these individuals, which introduced uncertainty. Due to the higher male mortality rates across most ages, female life expectancy (4.7 years at model start age) exceeded that of males by approximately 2 years.
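The potential scale reduction factor compares the variance between parallel chains with the variance within them, and values near 1 indicate that the chains agree. The Python sketch below implements the basic form without chain splitting. Which variant was used in the study is not stated, so this is an assumption, and the chains are synthetic draws generated only to show the behavior of the statistic.

```python
import random
import statistics

def potential_scale_reduction(chains):
    """Basic Gelman-Rubin statistic for equal-length chains of one parameter."""
    n = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: average within-chain variance; B: between-chain variance scaled by n.
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5

rng = random.Random(0)

# Four synthetic chains sampling the same distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(4)]
rhat_mixed = potential_scale_reduction(mixed)

# Same setup, but one chain is stuck around a different value.
stuck = [[rng.gauss(3.0 if k == 0 else 0.0, 1.0) for _ in range(500)] for k in range(4)]
rhat_stuck = potential_scale_reduction(stuck)

print(round(rhat_mixed, 3), round(rhat_stuck, 3))
```

Because of the (n − 1)/n factor, the statistic can fall slightly below 1 when chains agree closely, which is consistent with reported values such as 0.999.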
Now we turn to the comparison between the models with varying settings for potential dispersers. Model A (Fig. 4A) treated the data as if no further information was available on dispersal status of males with uncertain fates (i.e., the default setting of the model). Model B took advantage of expert knowledge on lion behavior and treated all males that a lion expert believed were dispersers, as known dispersers (Fig. 4B). Finally, Model C treated all expert‐indicated potential dispersers as potential dispersers and thus considered all other uncertain male records to represent deaths (Fig. 4C). The number of potential dispersers whose dispersal state was imputed as a latent state was therefore smaller in Model C when compared to Model A.
We compare these models by examining the estimated mortality rates (Fig. 4), the posterior density distributions (Fig. 3), and the KL divergences (Fig. 5). As females were treated the same way in all three models, the posterior distributions of parameters for females were congruent among the three models (Fig. 3). Consequently, the KL divergences were close to, or equal to, 0.5 (Fig. 5), and female mortality rates were almost identical across all three models (Fig. 4).
For males, the three models gave slightly varying results. The different settings regarding potential dispersers mostly affected the estimation of the Siler parameters that describe initial mortality (), the age‐dependent decrease in mortality at young ages (), and the age‐independent mortality (c) (Fig. 5). The initial mortality was higher in Model B, and lower in Model C, when compared to the default model A (Fig. 3, Table S1). The age‐dependent decrease in mortality was steeper in Model B compared to Model A but similar between Model A and C. The age‐independent mortality was higher in both Model B and C when compared to the default Model A.
The differences among the three models can be more fully understood by comparing the male mortality rates predicted from the three models (Fig. 4). Due to the steep decline in age‐dependent mortality at younger ages when all expert‐indicated dispersers were treated as dispersers (Model B), mortality rates during the juvenile ages up to approximately three years of age were lower in Model B when compared to both models that imputed dispersal state for potential dispersers (Model A and C). However, for the prime‐adult ages, Model B gave the highest mortality estimates, followed by Model C, and then Model A, which gave the lowest estimates. Mortality rates at older ages were highest in Model A and B. Despite these differences in the shape of the mortality rate curves, the life expectancies at 0.25 years of age were predicted to be identical by Model A and B (2.7 years), and only slightly different by Model C (2.4 years).
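To make the link between the Siler parameters, the shape of the mortality curve, and life expectancy at a given starting age concrete, the sketch below assumes the Siler parameterization used in BaSTA (Colchero et al. 2012), mu(x) = exp(a0 - a1 x) + c + exp(b0 + b1 x). Only c is named in the text above; the names a0, a1, b0, and b1, the maximum age, and the parameter values are assumptions chosen for illustration and are not the estimates for lions.

```python
import math


def siler_hazard(x, a0, a1, c, b0, b1):
    """Siler mortality rate at age x: declining + constant + increasing term."""
    return math.exp(a0 - a1 * x) + c + math.exp(b0 + b1 * x)


def siler_cumulative_hazard(x, a0, a1, c, b0, b1):
    """Closed-form integral of the Siler hazard from age 0 to age x."""
    return (math.exp(a0) / a1 * (1.0 - math.exp(-a1 * x))
            + c * x
            + math.exp(b0) / b1 * (math.exp(b1 * x) - 1.0))


def life_expectancy(x0, params, max_age=40.0, step=0.001):
    """Remaining life expectancy at age x0 (trapezoid rule on S(x) / S(x0))."""
    h0 = siler_cumulative_hazard(x0, *params)
    n = int(round((max_age - x0) / step))
    total = 0.0
    for i in range(n + 1):
        x = x0 + i * step
        # Survival from x0 to x, conditional on being alive at x0
        s = math.exp(-(siler_cumulative_hazard(x, *params) - h0))
        total += (0.5 if i in (0, n) else 1.0) * s
    return total * step


if __name__ == "__main__":
    illustrative = (-0.5, 1.0, 0.05, -5.0, 0.3)  # made-up values, U-shaped hazard
    for age in (0.0, 5.0, 15.0):
        print(age, siler_hazard(age, *illustrative))
    print(life_expectancy(0.25, illustrative))
```

Because life expectancy integrates survival over all ages, two curves that differ in opposite directions at juvenile and prime-adult ages can yield very similar values.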
Discussion
Life history data of wild animals are often incomplete because animals, even though alive and well, may temporarily or permanently be absent when researchers try to observe them at a given location. This has far‐reaching consequences for the estimation of biological properties from these data. Accordingly, various statistical approaches have been developed that account for temporal and spatial heterogeneity in recapture probabilities. For example, multistate CMRR methods have been applied to estimate survival rates while accounting for migration between locations within study sites (Arnason 1973; Schwarz et al. 1993; Lebreton and Pradel 2002; Pradel 2005; Mackenzie et al. 2009; Lagrange et al. 2014). Spatially explicit CMRR methods have also been developed to estimate survival probabilities and population size (Borchers and Efford 2008; Efford and Mowat 2014; Ergon and Gardner 2014). Furthermore, a recently developed spatially explicit Cormack–Jolly–Seber approach jointly models dispersal and survival hierarchically for species in which dispersal movements can be assumed to follow a random walk (Schaub and Royle 2014).
However, these models require some information on movement within the study area to estimate mortality parameters and latent states. Our model is an alternative to these models for data sets where no information on movement within the study area is available and thus dispersal state is entirely unknown. In such data sets, potentially dispersing individuals are resighted with certainty as long as they are alive and in the study area, and they are not resighted after they have dispersed. To meet these challenges, our model does not model spatially heterogeneous detection probabilities and dispersal distances but rather imputes the dispersal state of the uncertain male records (i.e., died or dispersed) as a latent state variable in a Bayesian hierarchical framework (Clark et al. 2005; Colchero and Clark 2012; Colchero et al. 2012). We therefore show that for species with sex‐specific natal dispersal, mortality and dispersal can be jointly modeled without using movement data. Of course, movement data could potentially be used to inform the dispersal process. However, we decided to develop a model that does not rely on spatial data so that the model can easily be applied to data sets that differ in the structure of available spatial data.
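The idea of imputing dispersal state as a latent variable can be sketched under a simplifying assumption that is ours and not necessarily the exact likelihood of the model: death and dispersal are treated as independent competing risks. In that case, the probability that a male last seen at age x dispersed rather than died reduces to the dispersal hazard divided by the sum of the dispersal and death hazards at x. The hazard functions, the minimum dispersal age, and all numerical values below are hypothetical.

```python
import random


def prob_dispersed(x_last, death_hazard, dispersal_hazard):
    """P(dispersed | disappeared at age x_last) under independent competing risks."""
    h_disp = dispersal_hazard(x_last)
    if h_disp == 0.0:
        return 0.0
    return h_disp / (h_disp + death_hazard(x_last))


def impute_dispersal_states(last_seen_ages, death_hazard, dispersal_hazard, rng):
    """Draw one latent state per uncertain record (True means dispersed).

    In a full sampler, this step would alternate with updates of the
    mortality and dispersal parameters.
    """
    return [rng.random() < prob_dispersed(x, death_hazard, dispersal_hazard)
            for x in last_seen_ages]


def toy_death_hazard(x):
    # Hypothetical constant mortality rate per year
    return 0.2


def toy_dispersal_hazard(x):
    # Hypothetical: no dispersal before age 1.5 years, constant rate afterwards
    return 0.6 if x >= 1.5 else 0.0


if __name__ == "__main__":
    rng = random.Random(1)
    ages = [0.8, 2.0, 3.5, 4.2]
    print(impute_dispersal_states(ages, toy_death_hazard, toy_dispersal_hazard, rng))
```

Records below the minimum dispersal age are always imputed as deaths in this sketch, whereas older uncertain records are assigned to either state in proportion to the two hazards.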
To gauge the possibility of estimating sex‐ and age‐specific mortality in species with sex‐biased natal dispersal, we focused on data with incomplete records for sex and age at death. We assumed that this uncertainty could arise from one of two mechanisms. Firstly, native‐born males that disperse from the study area cause uncertainty in male records of age at death; secondly, individuals that die as juveniles before their sex can be determined result in uncertain sex records. Implicitly, the model therefore assumes that all birth dates are known and that all other types of records can be treated as complete records. Consequently, the model treated the last detection ages of potential dispersers that were imputed to be nondispersers and of immigrants as certain ages at death. The accuracy of the model therefore hinges on the assumption that potential dispersers disperse only once during their life. During our study, it became apparent that while this assumption holds for some lion populations (A. Loveridge, unpublished data), it does not hold for the Serengeti population.
Relaxing the assumption and accounting for higher‐order dispersal necessitates a customized extension of the mortality model we present here. The effectiveness of fitting this more complex model depends on the availability of information on both known deaths and dispersal events among immigrants. In the case of the Serengeti population, we took advantage of the expert's indication on likely dispersal state of disappearing immigrants and extended the default model (Model A) to treat all immigrants that were indicated to be likely dispersers as censored at last seen ages. The difference between the male mortality estimates from the default model and the extended model provides an indication of the amount by which male mortality is overestimated if secondary dispersal is not accounted for (Fig. 6). To improve mortality estimates, in future extensions of the model secondary dispersal can be imputed as a further latent state, similarly to what we have showcased here for natal dispersal.
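The effect of treating a disappearance as a death rather than as a censored record can be illustrated with a deliberately simple sketch that assumes a constant mortality rate instead of the Siler model. A death at age x contributes the log hazard minus the cumulative hazard accrued since entry, whereas a censored record contributes only the latter. The records, function names, and rates below are made up for illustration.

```python
import math


def log_likelihood(records, hazard, cumulative_hazard):
    """Log-likelihood for left-truncated records.

    records: list of (entry_age, last_seen_age, died) tuples, where died is
    False for records censored at the last seen age.
    """
    total = 0.0
    for entry_age, last_age, died in records:
        # Survival from entry into the study until the last seen age
        total -= cumulative_hazard(last_age) - cumulative_hazard(entry_age)
        if died:
            total += math.log(hazard(last_age))
    return total


def constant_hazard_mle(records):
    """Maximum likelihood mortality rate if the hazard is constant over age."""
    deaths = sum(1 for _, _, died in records if died)
    exposure = sum(last - entry for entry, last, _ in records)
    return deaths / exposure


if __name__ == "__main__":
    # Hypothetical immigrants: two known deaths, two likely secondary dispersers
    censored = [(2.0, 5.0, True), (3.0, 4.0, False),
                (2.5, 6.5, True), (4.0, 6.0, False)]
    # Same records with every disappearance treated as a death
    all_deaths = [(entry, last, True) for entry, last, _ in censored]
    print(constant_hazard_mle(censored), constant_hazard_mle(all_deaths))
```

With the same exposure time, counting the two censored records as deaths doubles the estimated rate in this toy example, which mirrors the direction of the bias expected when secondary dispersal is ignored.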
Another consequence of the treatment of immigrants’ last detection ages as ages at death is that the ratio of immigrants to dispersers is likely to influence the estimation of male mortality parameters. Problems may arise if the number of individuals that disperse out of the study area is much higher than the number of individuals that immigrate into it (see Fig. S4 for a simulation). This may be the case for field sites that are established in protected areas and act as a source population for surrounding habitats of lower quality. Mortality in these habitats, and mortality during the dispersal process itself, may also be higher than mortality within the study area. Our model cannot account for this heterogeneity because the data only contain information collected within the study area.
Finally, the comparison of the different models for the lion data allows us to draw some conclusions about the sensitivity of mortality estimates to varying levels of uncertainty in male records. If all expert‐indicated dispersers were in fact dispersers (Model B), then by comparing the mortality rates estimated by this model to the one with the default treatment of uncertain records (Model A), we learn that the default model may tend to overestimate mortality during juvenile ages (higher in Model A than B). The default model may furthermore slightly underestimate mortality during prime‐adult ages. As the model that treats all expert‐indicated dispersers as potential dispersers and treats all other uncertain records as deaths (Model C) shares properties of both Models A and B (similar c to Model B, and other Siler parameters similar to Model A), and may come closest to reality, directly including expert knowledge in the Bayesian framework via priors seems a promising avenue for future development. However, this information is an idiosyncrasy of the data set that we used here. Making the model dependent on this information would therefore preclude the application of the model to estimate mortality for other populations and species.
In conclusion, we have discussed here how the model hinges on various assumptions. If these are met, then the model performs well at estimating mortality of the dispersing sex, as we have shown in the simulation study. The assumptions appear to restrict the utility of the model because many ecological data sets may not comply with them. However, we have explained how the different assumptions can be relaxed by extending the basic model presented here. The hierarchical framework and the modeling of the joint probabilities of ages at death and dispersal for potential dispersers provide flexibility that can be exploited to adapt the model to the specific structure of each data set. Extensions can include other covariates, information on interval censoring, and imperfect detection probabilities. For example, an extension to account for secondary dispersal, dispersal of both sexes, and unknown times of birth is currently being developed for a comparative study of six primate populations (F. Colchero, unpublished data). Overall, we conclude that our model provides a good solution to the challenge of estimating mortality of the dispersing sex in species with data deficiency for the dispersing sex due to natal dispersal.
Conflict of Interest
None declared.
Acknowledgments
We thank Tim Coulson, Jacques Deere, Owen Jones, Jason Matthiopoulos, Ben Sheldon, and Emily Simmonds for helpful comments on previous versions of the manuscript.
References
- Arnason, A. N. 1973. The estimation of population size, migration rates and survival in a stratified population. Res. Popul. Ecol. 15:1–8.
- Avril, A., Letty J., Pradel R., Léonard Y., Santin‐Janin H., and Pontier D. 2012. A multi‐event model to study stage‐dependent dispersal in radio‐collared hares: when hunting promotes costly transience. Ecology 93:1305–1316.
- Borchers, D. L., and Efford M. G. 2008. Spatially explicit maximum likelihood methods for capture–recapture studies. Biometrics 64:377–385.
- Burnham, K. P., and Anderson D. R. 2001. Kullback‐Leibler information as a basis for strong inference in ecological studies. Wildl. Res. 28:111–119.
- Clark, J. S. 2007. Models for ecological data. Princeton University Press, Princeton, NJ.
- Clark, J. S., Ferraz G. A., Oguge N., Hays H., and DiCostanzo J. 2005. Hierarchical Bayes for structured, variable populations: from recapture data to life‐history prediction. Ecology 86:2232–2244.
- Colchero, F., and Clark J. S. 2012. Bayesian inference on age‐specific survival for censored and truncated data. J. Anim. Ecol. 81:139–149.
- Colchero, F., Jones O. R., and Rebke M. 2012. BaSTA: an R package for Bayesian estimation of age‐specific survival from incomplete mark‐recapture/recovery data with covariates. Methods Ecol. Evol. 3:466–470.
- Cormack, R. M. 1964. Estimates of survival from the sighting of marked animals. Biometrika 51:429–438.
- Cubaynes, S., Pradel R., Choquet R., Duchamp C., Gaillard J. M., Lebreton J. D., et al. 2010. Importance of accounting for detection heterogeneity when estimating abundance: the case of French wolves. Conserv. Biol. 24:621–626.
- Dupuis, J. A. 1995. Bayesian estimation of movement and survival probabilities from capture–recapture data. Biometrika 82:761–772.
- Dupuis, J. A. 2002. Prior distributions for stratified capture–recapture models. J. Appl. Stat. 29:225–237.
- Efford, M. G., and Mowat G. 2014. Compensatory heterogeneity in spatially explicit capture–recapture data. Ecology 95:1341–1348.
- Ergon, T., and Gardner B. 2014. Separating mortality and emigration: modelling space use, dispersal and survival with robust‐design spatial capture–recapture data. Methods Ecol. Evol. 5:1327–1336.
- Gelfand, A. E., and Smith A. F. M. 1990. Sampling‐based approaches to calculating marginal densities. J. Am. Stat. Assoc. 85:398–409.
- Gelman, A., Carlin J. B., Stern H. S., Dunson D. B., Vehtari A., and Rubin D. B. 2013. Bayesian data analysis. 3rd ed., CRC Press, Boca Raton, Florida, USA.
- Jolly, G. M. 1965. Explicit estimates from capture–recapture data with both death and immigration‐stochastic model. Biometrika 52:225–247.
- King, R., and Brooks S. P. 2002. Bayesian model discrimination for multiple strata capture–recapture data. Biometrika 89:785–806.
- Kullback, S., and Leibler R. A. 1951. On information and sufficiency. Ann. Mathemat. Stat. 22:79–86.
- Lagrange, P., Pradel R., Belisle M., and Gimenez O. 2014. Estimating dispersal among numerous sites using capture–recapture data. Ecology 95:2316–2323.
- Lebreton, J., and Pradel R. 2002. Multistate recapture models: modelling incomplete individual histories. J. Appl. Stat. 29:353–369.
- Mackenzie, D. I., Nichols J. D., Seamans M. E., and Gutierrez R. J. 2009. Modeling species occurrence dynamics with multiple states and imperfect detection. Ecology 90:823–835.
- McCulloch, R. E. 1989. Local model influence. J. Am. Stat. Assoc. 84:473–478.
- Metropolis, N., Rosenbluth A. W., Rosenbluth M. N., Teller A. H., and Teller E. 1953. Equation of state calculations by fast computing machines. J. Chem. Phys. 21:1087–1092.
- Mosser, A., Fryxell J. M., Eberly L., and Packer C. 2009. Serengeti real estate: density vs. fitness‐based indicators of lion habitat quality. Ecol. Lett. 12:1050–1060.
- Packer, C. 2005. Ecological change, group territoriality, and population dynamics in Serengeti lions. Science 307:390–393.
- Packer, C., Pusey A. E., Rowley H., Gilbert D. A., Martenson J., and O’Brien S. J. 1991. Case‐study of a population bottleneck – lions of the Ngorongoro Crater. Conserv. Biol. 5:219–230.
- Pradel, R. 2005. Multievent: an extension of multistate capture–recapture models to uncertain states. Biometrics 61:442–447.
- R Core Team. 2012. R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.
- Schaub, M., and Royle J. A. 2014. Estimating true instead of apparent survival using spatial Cormack‐Jolly‐Seber models. Methods Ecol. Evol. 5:1316–1326.
- Schwarz, C. J., Schweigert J. F., and Arnason A. N. 1993. Estimating migration rates using tag‐recovery data. Biometrics 49:177–193.
- Seber, G. A. 1965. A note on the multiple‐recapture census. Biometrika 52:249–259.
- Siler, W. 1979. A competing‐risk model for animal mortality. Ecology 60:750–757.
- Smuts, G. L., Anderson J. L., and Austin J. C. 1978. Age determination of the African lion (Panthera leo). J. Zool. 185:115–146.
- White, G. C., and Burnham K. P. 1999. Program MARK: survival estimation from populations of marked animals. Bird Study 46:120–139.
- Whitman, K., Starfield A. M., Quadling H. S., and Packer C. 2004. Sustainable trophy hunting of African lions. Nature 428:175–178.