Figure 1. Changing frequency of species-specific DNA barcodes in an expanding reference resource.
A model constructs a comprehensive reference barcode (CRB) resource from rbcL and matK sequences extracted from 721 whole chloroplast genomes, together with associated variants derived from the BOLD Systems database, to represent all barcodes in a hypothetical geographic area. The model then draws subsets of barcodes from the CRB to represent incomplete barcode coverage in a reference database (interim reference barcode resource, IRB) and records the proportion of barcodes perceived as species-specific (red line), ambiguous (blue line) or false (species-specific in the IRB but not in the CRB, orange line) for (a) rbcL, (b) matK and (c) rbcL + matK. Panel (d) shows the changing frequency of species-specific DNA barcodes with expanding species coverage, compiled from the literature listed in Supplementary Table S4.
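
The subsampling logic described in this caption can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each barcode record is a (species, sequence) pair, that an IRB is a uniform random sample of CRB records, that proportions are taken over the distinct sequences present in the IRB, and that "false" barcodes are counted as a subset of those perceived as species-specific. The function names `species_per_sequence`, `classify_irb` and `simulate` are hypothetical.

```python
import random
from collections import defaultdict


def species_per_sequence(records):
    """Map each barcode sequence to the set of species that carry it."""
    owners = defaultdict(set)
    for species, sequence in records:
        owners[sequence].add(species)
    return owners


def classify_irb(crb_records, irb_records):
    """Return the proportions of IRB sequences in each perceived category.

    Assumption: a sequence is 'species-specific' if exactly one species in the
    IRB carries it, 'ambiguous' if two or more do, and 'false' if it is
    species-specific in the IRB but shared between species in the full CRB.
    """
    crb_owners = species_per_sequence(crb_records)
    irb_owners = species_per_sequence(irb_records)
    total = len(irb_owners)
    if total == 0:
        return {"species_specific": 0.0, "ambiguous": 0.0, "false": 0.0}

    specific = sum(1 for spp in irb_owners.values() if len(spp) == 1)
    false = sum(
        1
        for seq, spp in irb_owners.items()
        if len(spp) == 1 and len(crb_owners.get(seq, ())) > 1
    )
    return {
        "species_specific": specific / total,
        "ambiguous": (total - specific) / total,
        "false": false / total,
    }


def simulate(crb_records, fractions, seed=0):
    """Classify one randomly drawn IRB per coverage fraction (assumed scheme)."""
    rng = random.Random(seed)
    results = []
    for fraction in fractions:
        sample_size = round(fraction * len(crb_records))
        irb = rng.sample(crb_records, sample_size)
        results.append((fraction, classify_irb(crb_records, irb)))
    return results


# Toy CRB: sequence "s1" is shared by species A and B; "s2" is unique to C.
toy_crb = [("A", "s1"), ("B", "s1"), ("C", "s2")]
full = classify_irb(toy_crb, toy_crb)
partial = classify_irb(toy_crb, [("A", "s1"), ("C", "s2")])
```

In the toy example, omitting species B from the IRB makes sequence "s1" appear species-specific even though it is shared in the CRB, which is the effect the orange line tracks.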