Author manuscript; available in PMC: 2016 May 23.
Published in final edited form as: Ecology. 2016 Mar;97(3):765–775.

Integrating occupancy models and structural equation models to understand species occurrence

Maxwell B Joseph 1, Daniel L Preston 2, Pieter T J Johnson 3
PMCID: PMC4877056  NIHMSID: NIHMS780711  PMID: 27197402

Abstract

Understanding the drivers of species occurrence is a fundamental goal in basic and applied ecology. Occupancy models have emerged as a popular approach for inferring species occurrence because they account for problems associated with imperfect detection in field surveys. Current models, however, are limited because they assume covariates are independent (i.e., indirect effects do not occur). Here, we combined structural equation and occupancy models to investigate complex influences on species occurrence while accounting for imperfect detection. These two methods are inherently compatible because they both provide means to make inference on latent or unobserved quantities based on observed data. Our models evaluated the direct and indirect roles of cattle grazing, water chemistry, vegetation, nonnative fishes, and pond permanence on the occurrence of six pond-breeding amphibians, two of which are threatened: the California tiger salamander (Ambystoma californiense), and the California red-legged frog (Rana draytonii). While cattle had strong effects on pond vegetation and water chemistry, their overall effects on amphibian occurrence were small compared to the consistently negative effects of nonnative fish. Fish strongly reduced occurrence probabilities for four of five native amphibians, including both species of conservation concern. These results could help to identify drivers of amphibian declines and to prioritize strategies for amphibian conservation. More generally, this approach facilitates a more mechanistic representation of ideas about the causes of species distributions in space and time. As shown here, occupancy modeling and structural equation modeling are readily combined, and bring rich sets of techniques that may provide unique theoretical and applied insights into basic ecological questions.

Keywords: occupancy model, latent variable, imperfect detection, structural equation model, indirect effects, cattle grazing

Introduction

Much of ecology is concerned with explaining and predicting where species occur in space and time. While species distributions are directly and indirectly influenced by a diverse suite of abiotic and biotic factors, most statistical treatments do not formally differentiate between direct and indirect effects (Guisan and Thuiller 2005). Structural equation modeling (SEM) provides one means by which to formally represent causal assumptions in the form of direct and indirect causal pathways, helping to close the gap between biological mechanisms and statistical methodology (Bollen 1989, Grace 2006).

Structural equation models combine causal assumptions with observations to generate causal inferences (Pearl 2000). Often, these assumptions are displayed in a path diagram that represents model components and causal relationships (i.e. how one variable affects another). These path diagrams encode both direct and indirect effects, such as those involved in complex processes that structure communities (Clough 2012, Alsterberg et al. 2013). As a result, SEM provides a framework for clarifying, representing, and evaluating hypotheses in ecology that can outperform traditional associational statistical methods. For example, the relationship between productivity and biodiversity has historically been evaluated statistically in a bivariate context, despite the multidimensionality of the hypotheses that have been put forward to explain the relationship. By explicitly representing causal relationships between productivity and diversity, along with disturbance and stressors in a SEM, the mechanistic relationship between productivity and biodiversity becomes more clear, with more predictive power than bivariate approaches (Grace et al. 2014). In addition, SEM is flexible, with recent developments facilitating non-linear effects and a wide range of distributions, broadening the potential applicability of SEM in ecology (Lee 2007). Finally, SEM differentiates between latent and observable variables. Latent variables cannot be observed directly, but observations can provide information on latent quantities. A classic example is the presence or absence of a species.

In practice, studies of species occurrence are notoriously plagued by imperfect detection because species may be present but unobserved. This can be problematic when trying to explain species distributions because false absences create bias in estimated coefficients for covariates that explain occurrence. For instance, if turbidity increases the probability of occurrence of an aquatic organism, but decreases detectability (i.e. the organism is harder to observe), detection events will not appear to be related to turbidity. However, adjusting for imperfect detection facilitates unbiased estimation of the true relationship between such a factor and species occurrence. Such adjustments can be made via occupancy modeling, a method in which repeat surveys are used to estimate probabilities of detection and draw inferences on true occupancy states (MacKenzie et al. 2002). Occupancy models are classic hierarchical models, where the latent binary occurrence state is partly observed, and the extra zeros in detection histories are modeled as potentially arising from a combination of true and false absences. As the appreciation and application of occupancy models has increased, they have been extended to include multiple species, temporal dynamics including colonization and extinction, estimation of species richness, habitat ephemerality, and even disease dynamics (Dorazio et al. 2010, MacKenzie et al. 2011, Miller 2012). These developments have been instrumental in improving inferences about the processes that govern species distributions. However, many existing models require independent covariates for occurrence and detection. This limitation represents a challenge to understanding even moderately complex processes that drive species distributions that may include a combination of direct and indirect effects on occurrence and detectability.

Here, we show that combining SEM with occupancy models overcomes this limitation and provides a means by which to more directly represent processes that influence species occurrence. We aim to illustrate the practical and conceptual compatibility of occupancy modeling and structural equation modeling by developing a causal model of species occurrence and community composition that explicitly accounts for non-detection. We use pond-breeding amphibians as an applied case example. First, we outline the logic behind the model by summarizing the relevant empirical knowledge for our system in terms of the effects of pond permanence, cattle grazing, nonnative fishes, water chemistry, and vegetation on amphibian occurrence. We then discuss the data collection scheme and formally relate our observations to the processes under consideration by developing observation, process, and parameter (prior) models. As a model verification step, we demonstrate the recovery of parameters from our SEM using simulation (Rykiel 1996). Last, we present results for the case study and discuss the value, limitations, and future directions for this approach.

Study system

The amphibian community in freshwater ponds of the San Francisco Bay Area of California, USA includes the Pacific chorus frog (Pseudacris regilla), western toad (Anaxyrus boreas), California newt (Taricha torosa), California tiger salamander (Ambystoma californiense), California red-legged frog (Rana draytonii), and nonnative American bullfrog (Lithobates catesbeianus) (Stebbins 2003). Of the native species, two are federally protected and are of broad conservation concern (A. californiense and R. draytonii), and A. boreas has declined in specific regions (Lannoo 2005). Some populations of these species are currently threatened by invasive species including nonnative fishes and bullfrogs, which potentially act as predators and competitors for native amphibians (Fisher and Shaffer 1996, Lawler et al. 1999, Preston et al. 2012). Further, widespread cattle grazing in this region has the potential to affect the population dynamics of amphibians at breeding sites directly via trampling, and indirectly by removing vegetation and increasing the concentration of nitrogenous compounds in wetlands (Robins and Vollmar 2002, Roche et al. 2012). The ponds we considered included both temporary and permanent water bodies up to 7 meters in depth.

Data collection

From 2009 to 2013, field crews used standardized methods to assess amphibian site occupancy in 171 wetlands in the San Francisco Bay Area, USA (Contra Costa, Alameda, Santa Clara, and San Mateo counties) (Johnson et al. 2013). For simplicity, we used only one year of data from each site (with years chosen at random). The sites consisted of both artificially created livestock ponds and natural wetlands. Crews surveyed wetlands twice per summer using a combination of visual encounter, seine, and dipnet surveys (Crump and Bury 1994). Visual encounter surveys involved a single observer walking the perimeter of the wetland and recording all species observed. Dipnet sweeps were conducted every 3 m around the pond perimeter (1.4 mm mesh and 2600 cm² net area) and three to four seine net hauls were performed in the deepest regions of each pond (4 mm mesh size and 1 × 2 m net area). For each survey, we recorded all life stages of each observed amphibian species. We considered sites “occupied” if larval amphibians were present, which indicates adult presence, successful fertilization, some degree of larval development, and potential for metamorphosis (i.e. breeding activity).

We quantified cattle grazing intensity, percentage of shoreline vegetated, pond permanence, water chemistry, and the presence of nonnative fish. To measure cattle grazing intensity, we recorded the number of cow paddies within three meters of shoreline and recorded shoreline perimeter with a handheld GPS unit to quantify cow paddy density at each pond. As a second measure of grazing intensity, we made a qualitative judgment of whether the wetland was disturbed by cattle based on evidence of tracks and trampling. Water samples were collected in acid-washed Nalgene bottles, filtered, frozen, and analyzed for ammonium (NH4+) and total dissolved nitrogen (TDN) concentrations using standard protocols (see: http://snobear.colorado.edu/Kiowa/Kiowaref/procedure.html). Imagery from Google Earth was used to help determine whether wetlands were permanent, based on consistent presence of water year round. Last, crews estimated the percentage of pond shoreline that was vegetated and recorded whether fish were present or absent using observations from all sampling methods. The most common fish species were nonnative, including mosquitofish (Gambusia affinis), largemouth bass (Micropterus salmoides), and bluegill sunfish (Lepomis macrochirus).

Conceptual model

Drawing upon previous literature, we developed a multivariate hypothesis about the drivers of amphibian community composition in this system (Figure 1). We expected fish to strongly affect community composition, particularly species with poor avoidance strategies and high palatability (Kruse and Stone 1984, Kats et al. 1988, Adams 2000, Welsh et al. 2006). Further, pond permanence should increase bullfrog occurrence because bullfrogs have a multi-year larval development period that is longer than most western native species (Collins 1979). We hypothesized that livestock would physically alter wetland ecosystems via trampling and grazing, and chemically alter wetlands via inputs of nitrogenous waste products in urine and feces (Kauffman et al. 1983, Jansen and Healey 2003, Knutson et al. 2004, Schmutzer et al. 2008, Adams et al. 2009). Previous studies indicate that characteristics of shoreline vegetation influence amphibian breeding and that water chemistry can alter reproductive success and occupancy probability (Freda and Dunson 1986, Rowe and Dunson 1995, Rouse et al. 1999, Jansen and Healey 2003, Brodman et al. 2003, Egan and Paton 2004, Burne and Griffin 2005, Earl and Whiteman 2009). Separating out the causal pathways through which grazing alters amphibian occupancy is important from a practical standpoint, as each one might be targeted differently with management interventions (e.g., limiting cattle-induced damage to vegetation versus limiting the total intensity of grazing).

Figure 1.


Conceptual model showing influences on amphibian breeding success. Nodes represent concepts of interest, and directional arrows represent causal effects (e.g. grazing intensity affects shoreline vegetation).

Model formalization

To formalize the model of Figure 1, we adopted a hierarchical Bayesian approach, developing model components related to observations, processes in the system, and priors for unknown parameters (Lee 2007, Cressie et al. 2009, Dorazio et al. 2010).

Observation model

While early structural equation models assumed normally distributed observed indicator variables, current methods are extremely flexible in terms of likelihood functions and relationships between variables, broadening the applicability of these methods in ecology (Grace 2006, Lee 2007). We consider the presence/absence state of a species at a location to be a hidden (latent) binary variable which gives rise to imperfect detection/nondetection data. The use of a latent state variable is a key conceptual link between the fields of SEM and occupancy modeling. The flexibility of this approach is demonstrated in the following sections where we apply a variety of observation models from the exponential family, a moderately complex finite-mixture distribution, and a combination of continuous and discrete latent variables.

We considered grazing intensity by cattle to be a continuous latent quantity that cannot be observed directly, whose value is indicated by evidence of disturbance and the density of cow paddies in the vicinity of a wetland. Specifically, cattle disturbance was modeled as a Bernoulli random variable with latent grazing intensity (ξ) as a continuous covariate:

$$Y_1[j,k] \sim \mathrm{Bernoulli}(p_{Y_1}[j])$$
$$\mathrm{logit}(p_{Y_1}[j]) = \beta_{Y_1,0} + \beta_{Y_1,1}\,\xi[j]$$

for the jth site and the kth survey, with square brackets representing indexing. Observed cow paddy counts are treated as a second indicator (specifically, a multi-method indicator) (Grace 2006) and modeled as a Poisson random variable with an offset for shoreline perimeter of site j (μperim[j]) in meters:

$$Y_2[j,k] \sim \mathrm{Poisson}(\lambda[j,k])$$
$$\frac{\lambda[j,k]}{\mu_{\mathrm{perim}}[j,k]} = e^{\beta_{Y_2,0} + \beta_{Y_2,1}\,\xi[j]}$$

Perimeter observations are subject to measurement error and variation within a season. Therefore, we modeled latent mean perimeter values for each site (μperim), which represents the expected pond perimeter value for site j:

$$\log(\mathrm{perim}[j,k]) \sim N(\mu_{\mathrm{perim}}[j], \sigma_w)$$
$$\mu_{\mathrm{perim}}[j] \sim N(\alpha_{\mathrm{perim}}, \sigma_a)$$

where σw represents measurement error and variation within a season, αperim is the (log) average perimeter across all wetlands, and σa is the variability in perimeter among wetlands.
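The role of the perimeter offset in the count model can be illustrated with a minimal sketch. It assumes the offset enters as the site's perimeter in meters on the natural scale, so that the exponential term acts as a per-meter density; the function names and all numeric values are hypothetical and are not taken from the fitted model.

```python
import math

def expected_paddy_count(perimeter_m, xi, beta0, beta1):
    """Expected count = perimeter (the offset) times a per-meter density.

    perimeter_m  : shoreline perimeter in meters (assumed natural scale)
    xi           : latent grazing intensity at the site
    beta0, beta1 : hypothetical intercept and loading on the log scale
    """
    density_per_m = math.exp(beta0 + beta1 * xi)
    return perimeter_m * density_per_m

def poisson_log_pmf(count, lam):
    """Log probability of observing `count` under a Poisson with mean `lam`."""
    return count * math.log(lam) - lam - math.lgamma(count + 1)

# Two hypothetical ponds with equal grazing intensity but different perimeters.
small = expected_paddy_count(100.0, xi=0.0, beta0=math.log(0.05), beta1=0.8)
large = expected_paddy_count(200.0, xi=0.0, beta0=math.log(0.05), beta1=0.8)
```

With an offset, doubling the perimeter doubles the expected count while the latent intensity is unchanged, which is why larger ponds are not mistaken for more heavily grazed ponds.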

Shoreline vegetation, v, ranged from 0–100% with a non-negligible number of 0%, 100%, and intermediate observations. Therefore, we treated these observations as arising from a zero-one inflated beta distribution, which is a finite mixture distribution with a Bernoulli component that produces 0’s and 1’s, and a beta component that produces values on the interval (0, 1) (Ospina and Ferrari 2012):

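$$P(v; \alpha, \mu_v, \phi) = \begin{cases} \alpha (1 - \mu_v) & v = 0 \\ \alpha \mu_v & v = 1 \\ (1 - \alpha)\, f(v; \mu_v, \phi) & 0 < v < 1 \end{cases}$$
$$\mathrm{logit}(\mu_v[j]) = \beta_{v,0} + \beta_{v,1}\,\eta_1[j]$$
$$\mathrm{logit}(\alpha) = a_0 + a_2 \mu_v^2$$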

Here, α determines the extent to which the beta or Bernoulli mixture components dominate the probability density function. The second degree polynomial term with coefficient a2 causes extreme values of the logit-expected shoreline vegetation cover μv to increase the probability of an observer recording either 0% or 100% shoreline vegetation cover. This formulation essentially imposes a minimum probability of observers recording a discrete value when the true shoreline vegetation is 50%, and increases the probability of 0% or 100% observations as the true cover approaches those values. Last, f(v; μv, ϕ) is the probability density function of the beta distribution, parameterized in terms of its mean (μv) and variance (ϕ), which represents the combination of observation error and within-summer variation in true shoreline vegetation cover (Ospina and Ferrari 2012).
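The three-part mixture can be written as a short function. This is a sketch under stated assumptions: the beta component uses shape parameters μϕ and (1 − μ)ϕ, which is one common mean-based parameterization and may not match the one used for the fitted model, and α is passed in directly rather than computed from its link function. The function names are hypothetical.

```python
import math

def beta_density(v, mu, phi):
    """Beta density with shapes a = mu * phi and b = (1 - mu) * phi.

    This mean-based parameterization is an assumption made for illustration.
    """
    a, b = mu * phi, (1.0 - mu) * phi
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(v) + (b - 1.0) * math.log(1.0 - v))

def zoib_density(v, alpha, mu, phi):
    """Zero-one inflated beta: point masses at 0 and 1, beta density in between."""
    if v == 0:
        return alpha * (1.0 - mu)      # probability mass at exactly 0% cover
    if v == 1:
        return alpha * mu              # probability mass at exactly 100% cover
    if 0 < v < 1:
        return (1.0 - alpha) * beta_density(v, mu, phi)
    raise ValueError("v must lie in [0, 1]")
```

The two point masses sum to α, so α is the total probability that an observer records a discrete value of 0% or 100%.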

Log-transformed concentrations of ammonium and total dissolved N in the water were used as multi-method indicators of latent N concentration. Because both total dissolved N and NH4+ concentrations tend to increase over the course of the summer due to exogenous inputs and pond drying, we also included an effect of survey number k, coded as an indicator variable for the 2nd survey, such that I(k = 2) = 1, and I(k = 1) = 0.

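$$\log(\mathrm{NH}_4^+[j,k]) \sim N(\beta_{\mathrm{NH}_4^+,0} + \beta_{\mathrm{NH}_4^+,1}\,\eta_2[j] + \beta_{\mathrm{NH}_4^+,2}\,I(k),\ \sigma_{\mathrm{NH}_4^+})$$
$$\log(\mathrm{N}[j,k]) \sim N(\beta_{\mathrm{N},0} + \beta_{\mathrm{N},1}\,\eta_2[j] + \beta_{\mathrm{N},2}\,I(k),\ \sigma_{\mathrm{N}})$$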

We adopt an occupancy modeling approach for the data model describing amphibian detection and non-detection. Observations of the ith species at the jth site on the kth repeat survey are represented by Y [i, j, k]. We treated these observations as Bernoulli random variables with probability p[i, j, k]z[i, j], where p is the probability of detection and z is the latent binary presence/absence state. The true occurrence state z is only partly observed. If species i was seen at site j on any survey, it was present, but if it was not seen on any survey, it is possible that it was present but unobserved (MacKenzie et al. 2002):

$$Y[i,j,k] \sim \mathrm{Bernoulli}(p[i,j,k]\,z[i,j])$$
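Because z is binary and only partly observed, the likelihood of a detection history can be written with z summed out, which is a common way to fit occupancy models in software that cannot sample discrete parameters. The sketch below illustrates that marginalization for one species at one site; it is not necessarily how the model described here was coded, and the function names are hypothetical.

```python
import math

def site_log_likelihood(y, psi, p):
    """Log-likelihood of one detection history with the latent state z summed out.

    y   : list of 0/1 detections across repeat surveys at one site
    psi : occupancy probability, Pr(z = 1)
    p   : list of per-survey detection probabilities given presence
    """
    # Contribution when the site is occupied (z = 1).
    lik_present = psi
    for y_k, p_k in zip(y, p):
        lik_present *= p_k if y_k == 1 else (1.0 - p_k)
    # When the site is unoccupied (z = 0), only an all-zero history is possible.
    lik_absent = (1.0 - psi) if sum(y) == 0 else 0.0
    return math.log(lik_present + lik_absent)

def prob_present_given_no_detections(psi, p):
    """Pr(z = 1 | the species was never detected)."""
    missed = psi
    for p_k in p:
        missed *= (1.0 - p_k)
    return missed / (missed + (1.0 - psi))
```

For example, with ψ = 0.5 and a detection probability of 0.5 on each of two surveys, a site with no detections still has a 0.2 probability of being occupied.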

We treated fish presence and pond permanence as directly observed quantities, which is supported by consistency of observations within and across years in this system.

Process model

Our process model represents the latent processes connecting the latent quantities: grazing intensity ξ, shoreline vegetation η1, N concentration η2, the true occupancy states z, and the probability of detection p (Figure 2). In the SEM lexicon, process models are sometimes referred to as structural models. Note that there is an additional pathway from fish to nitrogen that was not initially hypothesized, but was revealed in the process of model evaluation (see Model Assessment below).

Figure 2.


Directed acyclic graph illustrating relationships between unknown quantities (circles) and observed indicator variables (rectangles).

We treat grazing intensity as an exogenous latent variable, unaffected by the other latent quantities, with mean 0 and standard deviation 1 as an identifiability constraint. Due to cattle grazing on and trampling of shoreline vegetation, we modeled a linear effect of cattle grazing intensity ξ on shoreline vegetation η1:

$$\eta_1 \sim N(\gamma_1 \xi, 1)$$

where the standard deviation term, similarly set to 1 for identifiability, represents the influence of other, unmodeled factors on shoreline vegetation cover. Nitrogenous inputs from cattle excretion in and around wetlands are treated similarly. Following a graphical check of independence assertions, we included an effect of fish presence (γ3) on nitrogen:

$$\eta_2 \sim N(\gamma_2 \xi + \gamma_3\,\mathrm{fish}, 1).$$

We represented true occupancy states as Bernoulli random variables with probability of occupancy ψ[i, j] for the ith species at the jth site, such that z[i, j] ∼ Bernoulli(ψ[i, j]). We use a logit-link to model the effects of observed and latent covariates on ψ:

$$\mathrm{logit}(\psi[i,j]) = \alpha_{\mathrm{species}}[i] + \alpha_{\mathrm{region}}[i, r[j]] + \alpha_{\mathrm{site}}[i,j]$$

where αspecies accounts for species-specific differences in overall occupancy across the entire study area, and αregion accounts for regional differences in occupancy rates within each species (with region r being indexed by site j). Last, αsite represents the local effects of fish, pond permanence, grazing, shoreline vegetation and nitrogen concentrations. These terms can be decomposed as follows:

$$\alpha_{\mathrm{region}}[i,j] \sim N(0, \sigma_{\alpha_{\mathrm{region}}}[i])$$

This varying intercept term accounts for among-region variation within species in occupancy, with varying species-specific standard deviations, to account for the fact that the degree of regional variation in occupancy rates varies among species. Local covariate effects enter the final term:

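$$\alpha_{\mathrm{site}}[i,j] = \beta_\psi[i,1]\,\mathrm{fish}[j] + \beta_\psi[i,2]\,\mathrm{perm}[j] + \beta_\psi[i,3]\,\eta_1[j] + \beta_\psi[i,4]\,\eta_2[j] + \beta_\psi[i,5]\,\xi[j]$$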
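Since grazing intensity ξ enters the occupancy predictor both directly and through η1 and η2, its effect on the logit scale can be decomposed with the usual product-of-paths rule. The sketch below assumes that rule applies to the linear paths defined above; how indirect effects were actually computed is not spelled out here, and the numeric values are hypothetical. In a Bayesian fit the same arithmetic would be applied to each posterior draw.

```python
def grazing_effects(gamma1, gamma2, beta_veg, beta_n, beta_graze):
    """Decompose the effect of grazing on logit-occupancy for one species.

    gamma1, gamma2 : effects of grazing on vegetation (eta1) and nitrogen (eta2)
    beta_veg       : effect of vegetation on logit-occupancy
    beta_n         : effect of nitrogen on logit-occupancy
    beta_graze     : direct effect of grazing on logit-occupancy
    """
    indirect_vegetation = gamma1 * beta_veg   # grazing -> vegetation -> occupancy
    indirect_nitrogen = gamma2 * beta_n       # grazing -> nitrogen -> occupancy
    return {
        "direct": beta_graze,
        "indirect_vegetation": indirect_vegetation,
        "indirect_nitrogen": indirect_nitrogen,
        "total": beta_graze + indirect_vegetation + indirect_nitrogen,
    }

# Hypothetical coefficient values for a single species.
effects = grazing_effects(gamma1=-0.4, gamma2=0.3, beta_veg=0.5, beta_n=-0.2, beta_graze=0.1)
```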

Species vary in their detection probabilities and p may also vary between first and second visits:

$$\mathrm{logit}(p[i,j,k]) = \alpha_p[i] + \beta_p k$$

where αp is a species-specific mean and the last term represents the effect of early vs. late summer surveys.
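A small simulation shows how the occupancy and detection components combine to produce survey data, and why the proportion of sites with at least one detection understates true occupancy. This is a sketch for a single species: the visit covariate is coded 0 for the first survey and 1 for the second as an assumption, and every parameter value except the number of sites (171) is hypothetical.

```python
import math
import random

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def simulate_surveys(n_sites, psi, alpha_p, beta_p, n_visits=2, seed=1):
    """Simulate true occupancy states and detection histories for one species."""
    rng = random.Random(seed)
    truth, histories = [], []
    for _ in range(n_sites):
        z = 1 if rng.random() < psi else 0          # latent occupancy state
        history = []
        for visit in range(n_visits):
            # Assumed coding: visit = 0 for the first survey, 1 for the second.
            p = inv_logit(alpha_p + beta_p * visit)
            # A detection is only possible when the site is occupied.
            history.append(1 if z == 1 and rng.random() < p else 0)
        truth.append(z)
        histories.append(history)
    return truth, histories

truth, histories = simulate_surveys(n_sites=171, psi=0.6, alpha_p=0.0, beta_p=-1.0)
true_occupancy = sum(truth) / len(truth)
naive_occupancy = sum(1 for h in histories if any(h)) / len(histories)
```

Because there are no false positives in this observation model, `naive_occupancy` can never exceed `true_occupancy`; the gap between them is what the occupancy model is designed to recover.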

Priors

We assumed logit-normal species responses to covariates with covariate-specific community means and variance parameters that represent among-species variability, such that:

$$\beta_\psi \sim N(\mu_{\beta_\psi}, \sigma_{\beta_\psi})$$
$$\beta_p \sim N(\mu_{\beta_p}, \sigma_{\beta_p})$$

Further, we assumed that for each species, mean detection probabilities would be logit-normally distributed around community level means:

$$\alpha_p \sim N(\mu_{\alpha_p}, \sigma_{\alpha_p})$$

Hierarchical parameters corresponding to community-level variance terms received semi-informative half-Cauchy priors that were weighted towards small values to reduce bias relative to commonly used uniform priors (Gelman 2006). We adopt vague priors for all other parameters except the loading terms for indicator variables which were constrained to be positive (e.g., increases in latent grazing intensity ξ correspond to increases in its indicators Y1 and Y2). Last, prior information based on previous work has entered the model in the form of included effect pathways (e.g., we assume that amphibian community composition does not affect grazing intensity).

Estimation

We used Stan and the R package rstan to draw samples from the joint posterior distribution of all parameters (R Core Team 2014, Stan Development Team 2014a, 2014b, 2014c). Although this model could be implemented with WinBUGS, OpenBUGS, or JAGS, the simultaneous updating of all parameters via the No-U-Turn sampler in Stan results in faster convergence and more efficient sampling (Hoffman and Gelman 2014). Running three chains in parallel with 30000 iterations took about 80 minutes on a quad-core i7 laptop. Convergence was assessed using visual inspections of trace plots and the Gelman-Rubin potential scale reduction factor (Gelman and Rubin 1992, Brooks and Gelman 1998).
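The potential scale reduction factor compares between-chain and within-chain variance for a single parameter. The sketch below implements the basic form of the statistic using only the Python standard library; it is an illustration rather than the routine in rstan, which may use a refined variant, and the function name is hypothetical.

```python
from statistics import mean, variance

def potential_scale_reduction(chains):
    """Basic Gelman-Rubin statistic for one scalar parameter.

    chains : list of equal-length lists of posterior draws, one list per chain
    """
    n = len(chains[0])                                    # draws per chain
    chain_means = [mean(chain) for chain in chains]
    within = mean([variance(chain) for chain in chains])  # W: mean within-chain variance
    between = n * variance(chain_means)                   # B: between-chain variance
    pooled = (n - 1) / n * within + between / n           # pooled variance estimate
    return (pooled / within) ** 0.5
```

Values close to 1 indicate that the chains are sampling the same distribution; chains stuck in different regions give values well above 1.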

Model assessment

As this is a new method, we conducted a simulation analysis to ensure adequate recovery of parameter estimates across a range of known values. We simulated approximately 100 datasets with structure identical to our model, and the same amount of information (observations) present in our dataset. We then attempted to recover the known parameters by fitting the model to our simulated datasets (Gimenez et al. 2012). Any simulations that did not reach convergence at the MCMC step were discarded.

Independence assumptions were evaluated graphically using scatter plots to detect missing causal pathways (Grace et al. 2012). This revealed a positive correlation between fish presence and N (both total dissolved N and NH4+), leading to the inclusion of an additional effect of fish on N that we had not initially hypothesized. Mechanistically, this pathway may represent the joint effects of fish locking up N in their tissues, and suppressing zooplankton through predation, leading to release of phytoplankton from grazing and greater uptake of N from the water (Andersson et al. 1978, Henrikson et al. 1980).

Results

Parameter recovery

Our simulations demonstrated parameter recovery with 95% highest density posterior intervals (HDIs) including the true population level parameters over 90% of the time (Figure A1). Furthermore, we were able to recover effects of local factors (βψ), identifying large effects as being non-zero (Figure A2). These results increased confidence in parameter estimates for the empirical data and gave an indication of the power that we might have to detect effects of local drivers of occurrence with a dataset of comparable size.
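Interval coverage of this kind can be checked by computing a highest density interval from posterior draws and testing whether it contains the known simulated value. The sketch below uses the narrowest-window method for a unimodal posterior; it is a generic illustration with hypothetical function names, not the code used for the simulations reported here.

```python
import math

def hdi(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the draws."""
    ordered = sorted(samples)
    n = len(ordered)
    window = max(1, math.ceil(mass * n))      # number of draws inside the interval
    best_lo, best_hi = ordered[0], ordered[window - 1]
    for start in range(1, n - window + 1):
        lo, hi = ordered[start], ordered[start + window - 1]
        if hi - lo < best_hi - best_lo:       # keep the narrowest window seen so far
            best_lo, best_hi = lo, hi
    return best_lo, best_hi

def covers(interval, true_value):
    """True when the interval contains the known (simulated) parameter value."""
    lo, hi = interval
    return lo <= true_value <= hi
```

Averaging `covers(...)` over simulated datasets gives the empirical coverage rate for a parameter.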

Empirical results

Nonnative fish and pond permanence directly affected amphibian community composition, with fish exerting the most consistent and strongest effects. Nonnative fish reduced the probability of occurrence for four of five native amphibians (all species except western toads and bullfrogs; Figure 3). Species-specific effects were observed for pond permanence, which had a strong positive effect on bullfrog and red-legged frog occurrence. Shoreline vegetation, nitrogenous compounds, and cattle grazing exerted relatively weak direct effects, with all HDIs including zero. Similarly, all HDIs for indirect effects of fish (via nitrogen) and cattle (via nitrogen and shoreline vegetation) on amphibian occurrence included zero.

Figure 3.


Estimated direct effects on occurrence probabilities for each species. Black corresponds to parameters for which HDIs excluded zero, and grey corresponds to HDIs including zero.

Consistent with our expectations and previous work, cattle grazing decreased shoreline vegetation (HDI: (−0.556, −0.206)) and increased nitrogenous compounds in the water column (HDI: (0.097, 0.450)). The grazing submodel combined information from cow paddy density counts and disturbance classifications to generate values of latent grazing intensity for each site (Figure 4A–B). As we expected based on field observations, many sites experience moderate to high levels of grazing, while fewer experience very low levels of grazing.

Figure 4.


Fit of the submodels to the observed indicator variables. Shaded regions encompass the 95% HDI, with observed data shown as jittered points (x-axis values represent posterior medians).

Regionally, some species were far more variable in their distribution among parks than others. For example, although local factors tended to have minimal effects on the distribution of western toads (Figure 3), among-region variability was quite high for this species (Figure A3). In contrast, Pacific chorus frogs had low among-park variability, being nearly ubiquitous.

Most amphibian species were easier to detect during the first survey, probably due to metamorphosis occurring before or during the late-summer visit when evidence of successful breeding is no longer detectable (Figure A4). The inclusion of the visit number covariate (first vs. second) accounted for this discrepancy in the sense that species-specific covariate effects can still be recovered (Figure A2).

The shoreline vegetation submodel performed well, capturing the fact that some shorelines were either devoid or completely covered by vegetation, with variability between these two extremes (Figure 4C). The latent N variable and observed log-transformed total dissolved N and NH4+ concentrations showed good fit, with increasing N concentrations in late summer compared to early summer (Figure 4D–E). Graphical checks of independence assertions indicated no further causal pathways for inclusion (Figure A5).

Discussion

Our approach demonstrates the integration of SEM with occupancy modeling using a large-scale survey of pond breeding amphibians to gain a better understanding of the drivers of community composition. Importantly, this approach allows for explicit differentiation between observed data and underlying processes, accounting for a variety of measurement error models including the imperfect measurement process that gives rise to species detection data. We also embedded other types of observation models to account for more complicated likelihood functions, including the zero-one inflated beta distribution that was used to model observations of shoreline vegetation cover. With this approach, latent ecological processes hypothesized to drive occurrence can be explicitly represented and their direct and indirect consequences formally evaluated.

The potential advantages of integrating occupancy and structural equation frameworks include: 1) developing more mechanistic approaches for understanding species distributions influenced by simultaneous related processes; 2) inheriting a formal method for evaluating potential outcomes that would result from management interventions; and 3) clarifying hypotheses by requiring that assumptions be represented in a formal causal model. Unlike purely associational methods, the causal assumptions embedded within structural equation models facilitate unique predictions that can be used to answer applied management questions.

Well-specified SEMs yield unique management insights based on “do” operators, which can estimate the consequences of targeted interventions from effect decompositions (Pearl 1998). “Do” operators could be used to predict changes in occupancy following a management intervention that reduces grazing intensity or removes fish. Indeed, this was one of the original arguments for the development of structural equation models in the mid-20th century (Marschak 1950, Koopmans 1953). Causal inferences drawn from such analyses, however, can be no better than the validity of the assumptions used to construct the model. Processes that drive species distributions can be represented best when there is substantive knowledge to construct such models. This brings up two issues: how mechanistic is “mechanistic enough”, and how valid are the assumed causal relationships? First, it would be unreasonable to expect to construct an infinitely accurate and specific causal model for the occurrence of any species (Shipley 2002). However, an adequate model may capture the most important drivers of occurrence and provide “good-enough” predictions for management interventions. This point underlines the role of interventional tests of such models: if the true outcome of management intervention deviates from anticipated effects, then the validity of the underlying causal model must be questioned.
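As a rough illustration of such a prediction, the sketch below propagates an intervention on fish presence through the structural equations, including the pathway from fish to nitrogen, and compares occupancy probabilities with and without fish. It assumes point values for all coefficients, sets the latent residuals to zero, and folds the species and region intercepts into a single term; all names and numbers are hypothetical and do not come from the fitted model.

```python
import math

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def occupancy_under_intervention(fish, perm, xi, coef):
    """Occupancy probability with fish presence set by intervention, do(fish = value)."""
    eta1 = coef["gamma1"] * xi                           # expected latent vegetation
    eta2 = coef["gamma2"] * xi + coef["gamma3"] * fish   # expected latent nitrogen
    linear = (coef["alpha"] + coef["b_fish"] * fish + coef["b_perm"] * perm
              + coef["b_veg"] * eta1 + coef["b_n"] * eta2 + coef["b_graze"] * xi)
    return inv_logit(linear)

# Hypothetical coefficients for one species at a permanent pond with average grazing.
coef = {"alpha": 0.5, "b_fish": -2.0, "b_perm": 0.0, "b_veg": 0.2, "b_n": -0.1,
        "b_graze": 0.0, "gamma1": -0.4, "gamma2": 0.3, "gamma3": 0.5}
with_fish = occupancy_under_intervention(fish=1, perm=1, xi=0.0, coef=coef)
without_fish = occupancy_under_intervention(fish=0, perm=1, xi=0.0, coef=coef)
change = without_fish - with_fish   # predicted gain in occupancy from fish removal
```

In practice the same calculation would be repeated over posterior draws so that the predicted change carries the uncertainty in every coefficient.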

Although not a substitute for interventions, tests of model fit to observational data can also be useful for identifying potentially missing pathways based on independence relationships among variables in the causal network. Here, we employed graphical checks of independence assumptions for observed variables, but better methods could be developed. Information theoretic and d-separation tests exist for path analytic models, and other indices of model fit have long been used for traditional linear structural equation models that assume multivariate normality of errors (Grace 2006, Shipley 2013). However, methods for evaluating occupancy model fit are still relatively new and mostly rely on out-of-sample data (Zipkin et al. 2012) or bootstrapping (MacKenzie and Bailey 2004). Developing methods to evaluate the fit of hierarchical occupancy-type structural equation models with non-normal indicators and binary, partially observed latent variables is a non-trivial task, but one that would increase the utility of this method. Advances in the study of probabilistic graphical models may provide solutions, but such advances are only just beginning to be applied by ecologists (Koller and Friedman 2009, Grace et al. 2012). Aside from d-separation tests, management experiments provide another way to test the predictive power of these types of models.

Management implications

Our results have direct relevance to the management of threatened amphibian populations within lowland wetlands in California, particularly when land managers are faced with multiple potential challenges simultaneously. Livestock grazing, which is common throughout the western United States (Fleischner 1994), has been a topic of uncertainty with regard to amphibian conservation. While several studies indicate negative overall effects of grazing on populations of specific amphibian species (Knutson et al. 2004, Schmutzer et al. 2008), others indicate the potential for positive effects of grazing on diversity and the perseverance of native communities (Marty 2005).

Based on our analysis, livestock grazing in the Bay Area of California had minimal effects on the occurrence of six amphibian species, two of which are native species of conservation concern. This finding suggests that current grazing levels in these parks may be compatible with management aimed at conserving threatened amphibians. However, we did not explore the effects of grazing on amphibian abundance or on temporal dynamics such as persistence and colonization. Analyses of these responses may reveal effects of grazing that were not seen in this study.

Most ponds in this data set were initially constructed to serve as watering sites for livestock. Because most original natural wetlands in California have been destroyed for agriculture and development, particularly in the Central Valley (Garone 2011), such livestock ponds may now serve as vital habitat refuges for declining species. This trend is particularly important for the California tiger salamander and California red-legged frog, which are the focus of considerable conservation efforts due largely to habitat destruction (Lannoo 2005).

In sharp contrast to the minimal effects of grazing, we found strong negative effects of nonnative fish on the occupancy of every amphibian species except western toads and nonnative bullfrogs. This finding supports a large body of research showing that native pond-breeding amphibians that lack evolutionary history with fishes are unlikely to persist once predatory fish have been introduced to a breeding site. The most common fish species at our field sites were mosquitofish and centrarchids (bass, bluegill, and other sunfish), which are native to the eastern United States. These species likely prevent amphibian reproduction through direct predation on multiple life stages of native amphibians. The lack of an effect on toads is consistent with prior work showing that the toxicity and schooling behavior of toad larvae provide resistance to predation by fish (Kruse and Stone 1984, Welsh et al. 2006). Bullfrogs, which require permanent water bodies to complete metamorphosis, have co-evolved with fish in their native range, are unpalatable, and are not strongly affected by fish presence (Walters 1975, Kruse and Francis 1977, Szuroczki and Richardson 2011).

Taken together, our results indicate that land management strategies should prioritize removal of nonnative fish rather than limitation of livestock grazing. Fish removal via pond draining has been shown to be effective in restoring populations of threatened amphibians within the study region (Alvarez et al. 2003). Considering the needs of the interest groups involved and the prevalence of grazing, this strategy is perhaps more feasible than dramatically limiting access of livestock to ponds, which may have unintended side effects including elimination of pond breeding habitat due to overgrowth by vegetation. While more research is needed to evaluate both the temporal effects of grazing and how grazing affects amphibian abundance, our analysis indicates that fish introductions have more severe impacts than cattle grazing.

Conclusion

Structural equation modeling provides a framework to evaluate why species occur in some areas and not others, allowing a more direct confrontation of some of the most fundamental questions in ecology. Occupancy modeling provides a solution to the problem of imperfect detection and can account for many different processes giving rise to detection data. Combining these approaches provides a means to evaluate complex causal processes driving occurrence while accounting for false absences in empirical occurrence data. This framework potentially facilitates deeper insights into biological processes by making a clear separation between imperfectly observed data and underlying states. From a pragmatic standpoint, it is clearly advantageous to be able to represent both direct and indirect determinants of species occurrence rather than being limited to treating covariates as independent. This approach also provides the practical advantage of inheriting a suite of methods to anticipate the effects of management interventions (“do” operators), and to account for many different sampling schemes in the occupancy modeling literature. Future extensions of this method could deepen connections between SEM and other classes of occupancy models, including dynamic multi-year occupancy models, models of abundance such as N-mixture models, Dail-Madsen models, their multi-species extensions, and spatial models (Royle 2004, Royle and Kéry 2007, Dail and Madsen 2011, Dorazio and Connor 2014, Lamb et al. 2014).
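To make the separation between latent occupancy states and imperfectly observed detections concrete, the following minimal simulation sketches a hypothetical occupancy-type structural equation model. Cattle affect occupancy only indirectly through vegetation, fish affect occupancy directly, and detections arise only at occupied sites. All path coefficients, the detection probability, the number of visits, and the function names are assumptions chosen for illustration, not estimates or code from our analysis. Passing a fixed value of `fish` mimics a “do” intervention in this simplified setting, where fish presence has no parents in the causal graph.

```python
import math
import random


def inv_logit(x):
    """Map a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_site(rng, fish=None, n_visits=3, p_detect=0.5):
    """Simulate one pond; pass fish=0 or fish=1 to mimic do(fish)."""
    cattle = rng.gauss(0.0, 1.0)                      # exogenous grazing intensity
    vegetation = -0.8 * cattle + rng.gauss(0.0, 0.5)  # path: cattle -> vegetation
    if fish is None:
        fish = 1 if rng.random() < 0.3 else 0         # observational fish presence
    # Paths: vegetation -> occupancy and fish -> occupancy (assumed values).
    psi = inv_logit(0.5 + 0.6 * vegetation - 2.0 * fish)
    z = 1 if rng.random() < psi else 0                # latent true occupancy
    # Detections are possible only at occupied sites (no false positives).
    y = [1 if (z == 1 and rng.random() < p_detect) else 0
         for _ in range(n_visits)]
    return z, y


def summarize(n_sites=5000, seed=1, **kwargs):
    """Return (true occupancy, naive occupancy) across simulated sites."""
    rng = random.Random(seed)
    sites = [simulate_site(rng, **kwargs) for _ in range(n_sites)]
    true_occ = sum(z for z, _ in sites) / n_sites
    # Naive estimate: share of sites with at least one detection.
    naive_occ = sum(1 for _, y in sites if any(y)) / n_sites
    return true_occ, naive_occ


true_occ, naive_occ = summarize()
occ_no_fish, _ = summarize(fish=0)   # do(fish = 0)
occ_fish, _ = summarize(fish=1)      # do(fish = 1)

print(f"true occupancy: {true_occ:.3f}")
print(f"naive occupancy (any detection): {naive_occ:.3f}")
print(f"occupancy under do(fish=0): {occ_no_fish:.3f}")
print(f"occupancy under do(fish=1): {occ_fish:.3f}")
```

The naive estimate falls below the true occupancy because occupied sites with no detections are counted as absences, which is the false-absence problem that occupancy models address. The contrast between the two interventions illustrates how a fitted model of this kind could be used to anticipate the effect of fish removal.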

The full value of this approach may be most apparent, and perhaps palatable to individuals unfamiliar with SEM, when coupled with controlled field experiments, where the validity of the underlying causal model may be tested directly. Compared to fields like sociology and economics, ecology as a field is perhaps in a unique position to reap benefits from SEM because it is both complex and relatively amenable to experimental manipulation.

Acknowledgments

This work used the Janus supercomputer, which is supported by the National Science Foundation (award number CNS-0821794) and the University of Colorado Boulder. The Janus supercomputer is a joint effort of the University of Colorado Boulder, the University of Colorado Denver and the National Center for Atmospheric Research. We thank Joseph Mihaljevic and Helen McCreery for comments on the manuscript. We also thank Travis McDevitt-Galles and Katherine Richgels for assistance in the field. This work was directly funded by grants from NSF (DEB-1311467, DEB-1149308, and DEB-0841758) and NIH (R01GM109499). Maxwell Joseph and Daniel Preston received support from the NSF Graduate Research Fellowship Program.

Appendix A

Supplemental figures (1–5) containing simulation results, among-region standard deviations, detection probabilities, and graphical independence checks.

Supplements

Supplement 1. Code supplement for model fitting. This includes the Stan model, data file, and an R script to fit the model.

Contributor Information

Maxwell B. Joseph, Email: maxwell.b.joseph@colorado.edu, Dept. of Ecology and Evolutionary Biology, University of Colorado, Boulder, CO, USA, 80309.

Daniel L. Preston, Email: daniel.preston@oregonstate.edu, Dept. of Integrative Biology, Oregon State University, Corvallis, OR, USA, 97331.

Pieter T. J. Johnson, Email: pieter.johnson@colorado.edu, Dept. of Ecology and Evolutionary Biology, University of Colorado, Boulder, CO, USA, 80309.

References

  1. Adams MJ. Pond permanence and the effects of exotic vertebrates on anurans. Ecological Applications. 2000;10:559–568.
  2. Adams MJ, Pearl CA, McCreary B, Galvan SK, Wessell SJ, Wente WH, Chauncey W, Kuehl AB. Short-term effect of cattle exclosures on Columbia spotted frog (Rana luteiventris) populations and habitat in northeastern Oregon. Journal of Herpetology. 2009;43:132–138.
  3. Alsterberg C, Eklöf JS, Gamfeldt L, Havenhand JN, Sundbäck K. Consumers mediate the effects of experimental ocean acidification and warming on primary producers. Proceedings of the National Academy of Sciences of the United States of America. 2013;110:8603–8. doi: 10.1073/pnas.1303797110.
  4. Alvarez JA, Dunn C, Zuur AF. Response of California Red-legged Frogs to removal of non-native fish. Transactions of the Western Section of the Wildlife Society. 2003;39:9–12.
  5. Andersson G, Berggren H, Cronberg G, Gelin C. Effects of planktivorous and benthivorous fish on organisms and water chemistry in eutrophic lakes. Hydrobiologia. 1978;59:9–15.
  6. Bollen KA. Structural Equations with Latent Variables. John Wiley & Sons, Inc; 1989.
  7. Brodman R, Ogger J, Bogard T, Long AJ, Pulver RA, Mancuso K, Falk D. Multivariate analyses of the influences of water chemistry and habitat parameters on the abundances of pond-breeding amphibians. Journal of Freshwater Ecology. 2003;18:425–436.
  8. Brooks S, Gelman A. General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics. 1998;7:434–455.
  9. Burne MR, Griffin CR. Habitat associations of pool-breeding amphibians in eastern Massachusetts, USA. Wetlands Ecology and Management. 2005;13:247–259.
  10. Clough Y. A generalized approach to modeling and estimating indirect effects in ecology. Ecology. 2012;93:1809–1815. doi: 10.1890/11-1899.1.
  11. Collins J. Intrapopulation variation in the body size at metamorphosis and timing of metamorphosis in the bullfrog, Rana catesbeiana. Ecology. 1979;60:738–749.
  12. Cressie N, Calder CA, Clark JS, Ver Hoef JM, Wikle CK. Accounting for uncertainty in ecological analysis: the strengths and limitations of hierarchical statistical modeling. Ecological Applications. 2009;19:553–70. doi: 10.1890/07-0744.1.
  13. Crump M, Bury R. Visual encounter surveys. In: Heyer W, Donnelly M, McDiarmid R, Hayek L, Foster M, editors. Measuring and monitoring biological diversity, standard methods for amphibians. Smithsonian Institution Press; Washington DC: 1994. pp. 84–92.
  14. Dail D, Madsen L. Models for estimating abundance from repeated counts of an open metapopulation. Biometrics. 2011;67:577–587. doi: 10.1111/j.1541-0420.2010.01465.x.
  15. Dorazio RM, Connor EF. Estimating abundances of interacting species using morphological traits, foraging guilds, and habitat. PLoS ONE. 2014;9:e94323. doi: 10.1371/journal.pone.0094323.
  16. Dorazio RM, Kéry M, Royle JA, Plattner M. Models for inference in dynamic metacommunity systems. Ecology. 2010;91:2466–75. doi: 10.1890/09-1033.1.
  17. Earl JE, Whiteman HH. Effects of pulsed nitrate exposure on amphibian development. Environmental Toxicology and Chemistry. 2009;28:1331–1337. doi: 10.1897/08-325.1.
  18. Egan R, Paton P. Within-pond parameters affecting oviposition by wood frogs and spotted salamanders. Wetlands. 2004;24:1–13.
  19. Fisher RN, Shaffer HB. The decline of amphibians in California’s Great Central Valley. Conservation Biology. 1996;10:1387–1397.
  20. Fleischner TL. Ecological costs of livestock grazing in western North America. 1994.
  21. Freda J, Dunson W. Effects of low pH and other chemical variables on the local distribution of amphibians. Copeia. 1986;1986:454–466.
  22. Garone P. The Fall and Rise of the Wetlands of California’s Great Central Valley. 1st. University of California Press; 2011.
  23. Gelman A. Prior distributions for variance parameters in hierarchical models. Bayesian Analysis. 2006;1:515–533.
  24. Gelman A, Rubin DB. Inference from iterative simulation using multiple sequences. Statistical Science. 1992;7:457–511.
  25. Gimenez O, Anker-Nilssen T, Grosbois V. Exploring causal pathways in demographic parameter variation: path analysis of mark-recapture data. Methods in Ecology and Evolution. 2012;3:427–432.
  26. Grace JB. Structural Equation Modeling and Natural Systems. Cambridge Univ Press; 2006.
  27. Grace JB, Adler PB, Stanley Harpole W, Borer ET, Seabloom EW. Causal networks clarify productivity-richness interrelations, bivariate plots do not. Functional Ecology. 2014;28:787–798.
  28. Grace JB, Schoolmaster DR, Guntenspergen GR, Little AM, Mitchell BR, Miller KM, Schweiger EW. Guidelines for a graph-theoretic implementation of structural equation modeling. Ecosphere. 2012;3:1–44.
  29. Guisan A, Thuiller W. Predicting species distribution: offering more than simple habitat models. Ecology Letters. 2005;8:993–1009. doi: 10.1111/j.1461-0248.2005.00792.x.
  30. Henrikson L, Nyman HG, Oscarson HG, Stenson JAE. Trophic changes, without changes in the external nutrient loading. Hydrobiologia. 1980;68:257–263.
  31. Hoffman MD, Gelman A. The No-U-Turn Sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. The Journal of Machine Learning Research. 2014;15:1593–1623.
  32. Jansen A, Healey M. Frog communities and wetland condition: relationships with grazing by domestic livestock along an Australian floodplain river. Biological Conservation. 2003;109:207–219.
  33. Johnson PTJ, Preston DL, Hoverman JT, Richgels KLD. Biodiversity decreases disease through predictable changes in host community competence. Nature. 2013;494:230–3. doi: 10.1038/nature11883.
  34. Kats LB, Petranka JW, Sih A. Antipredator defenses and the persistence of amphibian larvae with fishes. 1988.
  35. Kauffman J, Krueger W, Vavra M. Effects of late season cattle grazing on riparian plant communities. Journal of Range Management. 1983;36:685–691.
  36. Knutson MG, Richardson WB, Reineke DM, Gray BR, Parmelee JR, Weick SE. Agricultural ponds support amphibian populations. Ecological Applications. 2004;14:669–684.
  37. Koller D, Friedman N. In: Probabilistic Graphical Models: Principles and Techniques. Dietterich T, editor. MIT Press; 2009.
  38. Kruse KP, Stone BM. Largemouth bass (Micropterus salmoides) learn to avoid feeding on toad (Bufo) tadpoles. Animal Behaviour. 1984;32:1035–1039.
  39. Kruse K, Francis M. A predation deterrent in larvae of the bullfrog, Rana catesbeiana. Transactions of the American Fisheries Society. 1977;106:248–252.
  40. Lamb EG, Mengersen KL, Stewart KJ. Spatially Explicit Structural Equation Modeling. Ecology. 2014;95:2434–2442.
  41. Lannoo MJ. Amphibian Declines: The Conservation Status of United States Species. University of California Press; 2005.
  42. Lawler SP, Dritz D, Strange T, Holyoak M. Effects of introduced mosquitofish and bullfrogs on the threatened California red-legged frog. Conservation Biology. 1999;13:613–622.
  43. Lee S. Structural Equation Modeling: A Bayesian Approach. Wiley; 2007.
  44. MacKenzie DI, Bailey LL. Assessing the fit of site-occupancy models. Journal of Agricultural, Biological, and Environmental Statistics. 2004;9:300–318.
  45. MacKenzie DI, Bailey LL, Hines JE, Nichols JD. An integrated model of habitat and species occurrence dynamics. Methods in Ecology and Evolution. 2011;2:612–622.
  46. MacKenzie D, Nichols J, Lachman G. Estimating site occupancy rates when detection probabilities are less than one. Ecology. 2002;83:2248–2255.
  47. Marty JT. Effects of cattle grazing on diversity in ephemeral wetlands. Conservation Biology. 2005;19:1626–1632.
  48. Miller ZJ. Fungal pathogen species richness: why do some plant species have more pathogens than others? The American Naturalist. 2012;179:282–92. doi: 10.1086/663676.
  49. Ospina R, Ferrari SL. A general class of zero-or-one inflated beta regression models. Computational Statistics & Data Analysis. 2012;56:1609–1623.
  50. Pearl J. Graphs, Causality, and Structural Equation Models. Sociological Methods & Research. 1998;R-253:226–284.
  51. Pearl J. Causality. New York: Cambridge; 2000.
  52. Preston DL, Henderson JS, Johnson PT. Community ecology of invasions: direct and indirect effects of multiple invasive species on aquatic communities. Ecology. 2012;93:1254–1261. doi: 10.1890/11-1821.1.
  53. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing; Vienna, Austria: 2014.
  54. Robins JD, Vollmar JE. Livestock grazing and vernal pools. Wildlife and Rare Plant Ecology of Eastern Merced County’s Vernal Pool Grasslands. 2002:401–430.
  55. Roche LM, Latimer AM, Eastburn DJ, Tate KW. Cattle grazing and conservation of a meadow-dependent amphibian species in the Sierra Nevada. PLoS ONE. 2012;7:e35734. doi: 10.1371/journal.pone.0035734.
  56. Rouse JD, Bishop CA, Struger J. Nitrogen pollution: an assessment of its threat to amphibian survival. Environmental Health Perspectives. 1999;107:799–803. doi: 10.1289/ehp.99107799.
  57. Rowe C, Dunson W. Impacts of hydroperiod on growth and survival of larval amphibians in temporary ponds of central Pennsylvania, USA. Oecologia. 1995;102:397–403. doi: 10.1007/BF00341351.
  58. Royle JA. N-mixture models for estimating population size from spatially replicated counts. Biometrics. 2004;60:108–115. doi: 10.1111/j.0006-341X.2004.00142.x.
  59. Royle JA, Kéry M. A Bayesian state-space formulation of dynamic occupancy models. Ecology. 2007;88:1813–23. doi: 10.1890/06-0669.1.
  60. Rykiel EJ. Testing ecological models: The meaning of validation. Ecological Modelling. 1996;90:229–244.
  61. Schmutzer AC, Gray MJ, Burton EC, Miller DL. Impacts of cattle on amphibian larvae and the aquatic environment. Freshwater Biology. 2008;53:2613–2625.
  62. Shipley B. Cause and Correlation in Biology: A User’s Guide to Path Analysis, Structural Equations and Causal Inference. Cambridge University Press; 2002.
  63. Shipley B. The AIC model selection method applied to path analytic models compared using a d-separation test. Ecology. 2013;94:560–564. doi: 10.1890/12-0976.1.
  64. Stan Development Team. Stan Modeling Language Users Guide and Reference Manual, Version 2.5.0. 2014a.
  65. Stan Development Team. RStan: the R interface to Stan, Version 2.5.0. 2014b.
  66. Stan Development Team. Stan: A C++ Library for Probability and Sampling, Version 2.5.0. 2014c.
  67. Stebbins RC. A Field Guide to Western Reptiles and Amphibians. Houghton Mifflin; 2003.
  68. Szuroczki D, Richardson JML. Palatability of the larvae of three species of Lithobates. Herpetologica. 2011;67:213–221.
  69. Walters B. Studies of interspecific predation within an amphibian community. Journal of Herpetology. 1975;9:267–279.
  70. Welsh HH, Pope KL, Boiano D. Sub-alpine amphibian distributions related to species palatability to non-native salmonids in the Klamath mountains of northern California. Diversity and Distributions. 2006;12:298–309.
  71. Zipkin EF, Campbell Grant EH, Fagan WF. Evaluating the predictive abilities of community occupancy models using AUC while accounting for imperfect detection. Ecological Applications. 2012;22:1962–1972. doi: 10.1890/11-1936.1.
