TABLE 1.
Function | Type of bias assessed | Method (in brief) | Output | Dataset type | Additional data required |
---|---|---|---|---|---|
assessEnvBias | Environmental bias. Indicates whether the input data are likely to be sampled from the same portion of environmental space over time, or whether the data are sampled from a representative portion of environmental space in the spatial domain of interest | Reduces the dimensionality of environmental space using principal component analyses and maps the distribution of the data in this reduced environmental space | Maps of the distribution of the data on two user‐selected principal components of environmental space. Displayed as ellipses or points. One ellipse or set of points per period | Single or multispecies. Highly likely to indicate strong bias with one or a small number of species | Environmental data corresponding to each occurrence data point and, optionally, a background sample (e.g. a random sample from the domain of interest for inference) |
assessRarityBias | Indicates whether rare species are overrepresented in the data and whether the degree to which they are overrepresented changes over time | Measures the congruence of the number of times species have been recorded and their estimated commonness (range sizes). Drops records not identified to species level | Time series showing congruence in each period (correlation or r² from regression of the number of records on commonness) | Multispecies only | None |
assessRecordNumber | Identifies temporal variation in sampling intensity in the domain of interest | Sums the number of records in the dataset in each time period | Time series of counts | Single or multispecies | None |
assessSpatialBias | Indicates whether the data resemble a random distribution in the geographic space of interest for inference, and whether the extent to which the data resemble a random distribution changes over time | Compares the average nearest neighbour distance of the data with the average nearest neighbour distances of simulated random distributions of the same density | Time series showing nearest neighbour index in each period | Single or multispecies. Highly likely to indicate strong bias with one or a small number of species | Raster layer indicating which areas fall inside the study extent |
assessSpatialCov | Indicates whether a representative portion of the spatial domain of interest has been sampled and whether the same portion of geographic space has been sampled over time | Maps the data in geographical space | Either multiple (gridded) maps showing the distribution of the data in each period, or one map showing the number of periods in which grid cells have been sampled | Single or multispecies | If the input data are not in the WGS84 coordinate reference system, then any country or political boundaries that the user requires to be superimposed on the resultant plots must be supplied |
assessSpeciesID | Taxonomic resolution and whether it changes over time | Calculates proportions or counts of records identified to species level | Time series of proportions or counts | Multispecies only | None |
assessSpeciesNumber | Taxonomic coverage and how it changes over time | Sums the number of species recorded in each time period. Drops records not identified to species level | Time series of counts | Multispecies only | None |
Note that users can opt to split the data into multiple time periods; in this case all functions are temporally explicit and hence provide information on temporal variation in some characteristic of the data. See the worked example in the main text for more details of each function.
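
To make the nearest neighbour comparison summarised for assessSpatialBias more concrete, the following Python sketch illustrates the general idea. It is not the package's implementation. It assumes that the index is the ratio of the observed mean nearest neighbour distance to the mean obtained from simulated random distributions with the same number of points, and it uses a rectangular extent as a stand-in for the study-extent raster. All function and parameter names (`nearest_neighbour_index`, `index_by_period`, `extent`, `n_sim`, `seed`) are hypothetical.

```python
import math
import random


def mean_nearest_neighbour_distance(points):
    """Average distance from each point to its closest other point.

    Requires at least two points.
    """
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    return total / len(points)


def nearest_neighbour_index(points, extent, n_sim=50, seed=0):
    """Observed mean nearest neighbour distance divided by the simulated mean.

    Assumption: the index is a simple ratio, and `extent` is a rectangle
    (xmin, xmax, ymin, ymax) standing in for the study-extent raster.
    """
    rng = random.Random(seed)
    xmin, xmax, ymin, ymax = extent
    observed = mean_nearest_neighbour_distance(points)
    simulated = []
    for _ in range(n_sim):
        # Same number of points in the same extent, hence the same density.
        sim = [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in points]
        simulated.append(mean_nearest_neighbour_distance(sim))
    return observed / (sum(simulated) / n_sim)


def index_by_period(records, extent, **kwargs):
    """Compute the index separately for each period.

    `records` is an iterable of (period, x, y) tuples; each period needs
    at least two records.
    """
    by_period = {}
    for period, x, y in records:
        by_period.setdefault(period, []).append((x, y))
    return {
        period: nearest_neighbour_index(pts, extent, **kwargs)
        for period, pts in sorted(by_period.items())
    }
```

Under the ratio assumption used here, values well below 1 suggest that records are more clustered than a random distribution, values near 1 suggest an approximately random distribution, and values above 1 suggest more even spacing. Computing the index per period, as in `index_by_period`, gives a time series of the kind described in the Output column.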