2022 Aug 29;24(11):3395–3421. doi: 10.1007/s10530-022-02858-8

Box 2.

Estimating relative abundances

Consider a survey designed to quantify the abundances $N_l$ at locations $l=1,\ldots,L$ from abundances reported by observers $j=1,\ldots,J$ over a total of $V$ visits. Let $d_v$ denote the abundance reported during visit $v=1,\ldots,V$, conducted at location $l_v$ by observer $o_v$. Here, $d_v$ is affected by both the abundance $N_{l_v}$ at location $l_v$ and the detection probability $p_{o_v}$ of observer $o_v$ such that

$$P\left(d_v \mid N_{l_v}, p_{o_v}\right) = \binom{N_{l_v}}{d_v}\, p_{o_v}^{d_v} \left(1-p_{o_v}\right)^{N_{l_v}-d_v}$$

is given by binomial sampling. Since $N_{l_v}$ and $p_{o_v}$ are confounded, estimating them individually is difficult (DasGupta & Rubin 2005). To illustrate this, consider two locations with $N_1=100$ and $N_2=200$, each surveyed $m=5$ times by a single observer with detection probability $p=0.2$. As shown in Fig. 5A, the uncertainty associated with the abundances estimated from these data under mild priors $N_1, N_2 \sim \mathrm{Exp}(0.001)$ spans about two orders of magnitude. This is because the data are explained almost equally well by any abundance that is paired with a correspondingly scaled detection probability; more informative priors would be required to constrain the range of possible values. However, there is considerable evidence that $N_2$ is about twice $N_1$ (Fig. 5B), illustrating that relative abundances may be learned accurately from such surveys.
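The confounding can be illustrated with a minimal sketch of the two-location example. It is not the Bayesian analysis behind Fig. 5: as a simpler stand-in, it evaluates the binomial log-likelihood at the best-fitting shared detection probability (a profile likelihood) for several candidate abundance pairs. The function names, the random seed and the candidate scales are assumptions made for illustration. Because the expected count is $N p$, multiplying both abundances by a common factor and dividing $p$ by the same factor should leave the fit nearly unchanged, whereas changing the ratio $N_2/N_1$ should not.

```python
import math
import random

def binom_logpmf(d, n, p):
    """Log of the binomial probability P(d | n, p)."""
    if d > n:
        return float("-inf")
    log_choose = math.lgamma(n + 1) - math.lgamma(d + 1) - math.lgamma(n - d + 1)
    return log_choose + d * math.log(p) + (n - d) * math.log1p(-p)

def simulate_counts(n, p, m, rng):
    """Draw m binomial counts, one per visit (n Bernoulli trials each)."""
    return [sum(rng.random() < p for _ in range(n)) for _ in range(m)]

def profile_loglik(counts_by_loc, n_by_loc):
    """Log-likelihood at the best shared p for fixed candidate abundances."""
    total_d = sum(sum(c) for c in counts_by_loc)
    total_n = sum(n * len(c) for c, n in zip(counts_by_loc, n_by_loc))
    p_hat = total_d / total_n  # maximum-likelihood p given the abundances
    return sum(binom_logpmf(d, n, p_hat)
               for c, n in zip(counts_by_loc, n_by_loc) for d in c)

rng = random.Random(1)                  # arbitrary seed (assumption)
d1 = simulate_counts(100, 0.2, 5, rng)  # N1 = 100, p = 0.2, m = 5
d2 = simulate_counts(200, 0.2, 5, rng)  # N2 = 200, p = 0.2, m = 5

# Candidate abundances that keep the 1:2 ratio but differ in absolute scale.
ll_ratio = {s: profile_loglik([d1, d2], [100 * s, 200 * s]) for s in (1, 2, 5, 10)}
# Candidate with the same total abundance but a 1:1 ratio.
ll_equal = profile_loglik([d1, d2], [150, 150])
# Ratio of total counts: p cancels, so this estimates N2 / N1.
ratio_hat = sum(d2) / sum(d1)

for s, ll in ll_ratio.items():
    print(f"N = ({100 * s}, {200 * s}): profile log-likelihood = {ll:.2f}")
print(f"N = (150, 150): profile log-likelihood = {ll_equal:.2f}")
print(f"estimated N2 / N1 = {ratio_hat:.2f}")
```

In this sketch the absolute scale is only weakly constrained, through the slightly smaller variance of a binomial with few trials, while the count ratio carries most of the information.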

To benefit from this in a realistic setting, we here generalize the inference of relative abundances to many locations. Let us assume that the abundances $N_l = N_0 e^{\rho_l}$ are scaled by location-specific factors $\rho_l \sim N(0, \sigma_\rho^2)$ that are normally distributed with mean zero and variance $\sigma_\rho^2$. Similarly, we assume that the detection probabilities $p_j = \mathrm{logistic}(\pi_0 + \pi_j)$ are scaled by observer-specific effects $\pi_j \sim N(0, \sigma_\pi^2)$ that are also normally distributed, with mean zero and variance $\sigma_\pi^2$. Here, the logistic transformation ensures $0 < p_j < 1$. We further enforce the conditions $\frac{1}{L}\sum_l \rho_l = 0$ and $\frac{1}{J}\sum_j \pi_j = 0$ by scaling $N_0$ and $p_0$ accordingly. If observers do not visit multiple locations, the $\pi_j$ need to be modelled using informative covariates.
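The generative model described above can be sketched as a short simulation. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, abundances are rounded to integers so that binomial sampling is defined, each observer is assumed to visit $m$ distinct locations chosen uniformly at random, and the zero-mean conditions are imposed by centring the drawn effects and absorbing their means into $N_0$ and $\pi_0$. The parameter values are those of the simulations reported in this box.

```python
import math
import random

def logistic(x):
    """Standard logistic function, mapping the real line to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def binomial_draw(n, p, rng):
    """One binomial draw as the number of successes in n Bernoulli trials."""
    return sum(rng.random() < p for _ in range(n))

def simulate_survey(L, J, m, N0, var_rho, pi0, var_pi, rng):
    """Simulate reported abundances for J observers visiting m locations each."""
    rho = [rng.gauss(0.0, math.sqrt(var_rho)) for _ in range(L)]
    pi = [rng.gauss(0.0, math.sqrt(var_pi)) for _ in range(J)]
    # Centre the effects so they average to zero; shift N0 and pi0 so that
    # the implied abundances and detection probabilities are unchanged.
    mean_rho = sum(rho) / L
    mean_pi = sum(pi) / J
    rho = [r - mean_rho for r in rho]
    pi = [x - mean_pi for x in pi]
    N0_c = N0 * math.exp(mean_rho)
    pi0_c = pi0 + mean_pi
    # Rounding to integer abundances is an assumption of this sketch.
    N = [round(N0_c * math.exp(r)) for r in rho]
    p = [logistic(pi0_c + x) for x in pi]
    visits = []  # one (location, observer, reported abundance) tuple per visit
    for j in range(J):
        for l in rng.sample(range(L), m):  # m distinct locations (assumption)
            visits.append((l, j, binomial_draw(N[l], p[j], rng)))
    return rho, pi, N, p, visits

rng = random.Random(42)  # arbitrary seed (assumption)
rho, pi, N, p, visits = simulate_survey(
    L=20, J=20, m=5, N0=100, var_rho=0.2, pi0=-1.0, var_pi=0.5, rng=rng)
print(len(visits))  # number of visits V = J * m
```

Each observer contributes counts from several locations, which is what allows observer effects and location effects to be separated; this is why the text notes that covariates are needed when observers do not visit multiple locations.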

We conducted simulations with $N_0=100$, $\sigma_\rho^2=0.2$, $\pi_0=-1$ and $\sigma_\pi^2=0.5$, corresponding to an average detection probability $p_0=\mathrm{logistic}(\pi_0)=0.27$. As shown in Figs. 5C and 5D, neither $N_0$ nor $p_0$ can be inferred accurately, regardless of whether $L=20$ or $L=100$ locations were surveyed by $J=20$ or $J=100$ observers visiting $m=5$ different locations each, corresponding to $V=100$ and $V=500$ visits, respectively. In contrast, the relative abundances are estimated well and easily distinguish locations with high abundances from those with low abundances (Figs. 5E and 5F).
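As a quick arithmetic check of the quoted values (a worked calculation assuming the standard logistic function, not an additional result):

$$p_0 = \mathrm{logistic}(\pi_0) = \frac{1}{1+e^{-\pi_0}} = \frac{1}{1+e^{1}} \approx 0.269$$

$$V = J \cdot m: \qquad 20 \times 5 = 100, \qquad 100 \times 5 = 500$$

Likewise, $\sigma_\rho = \sqrt{0.2} \approx 0.447$, so a location one standard deviation above the mean has an abundance larger by a factor of about $e^{0.447} \approx 1.56$.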