2022 Oct 13;10:e14055. doi: 10.7717/peerj.14055

Figure 3. Defining RdRp boundaries for sequence-based classification.

Schematic depiction of methods for defining RdRp segment boundaries for sequence analysis. As shown at the top, RdRp may be embedded in a multi-gene ORF (see also Fig. 1). Below are three alternative RdRp boundary schemes defined by Wolf et al. (2018) (“Wolf2018”), Zayed et al. (2022) (“Zayed2022”), and Edgar et al. (2022) (“Edgar2022”), respectively. Wolf2018 attempted to identify approximately full-length genes, discarding fragments unless they were close to full-length. This scheme is problematic because RdRp is often found in a longer ORF with other functional domains, and in such cases the boundary of the RdRp is often unclear. Zayed2022 used a similar scheme but additionally allowed fragments. Allowing fragments enables more sequences to be included but is problematic for classification, because pairs of sequences with little or no overlap may be assigned to different vOTUs even if they belong to the same species. Edgar2022 used palmprints, short segments of RdRp with well-defined boundaries.
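
The overlap problem described in the caption can be illustrated with a minimal sketch. This is not the method of any of the cited studies: the coordinates, the `Segment` type, the helper functions, and the `MIN_OVERLAP` threshold are all hypothetical, chosen only to show why two fragments of the same gene may share too few positions to be compared, whereas two segments with the same well-defined boundaries always cover the same positions.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """A region on a hypothetical full-length RdRp coordinate system."""
    start: int  # 0-based, inclusive
    end: int    # exclusive


def overlap_length(a: Segment, b: Segment) -> int:
    """Number of positions covered by both segments (0 if disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def comparable(a: Segment, b: Segment, min_overlap: int) -> bool:
    """True if the shared region is long enough to measure identity.

    min_overlap is an illustrative threshold, not a published value.
    """
    return overlap_length(a, b) >= min_overlap


MIN_OVERLAP = 80  # hypothetical threshold, for illustration only

# Two fragments from the same (hypothetical) species, covering
# different ends of the gene: they share only 20 positions.
frag_a = Segment(0, 300)
frag_b = Segment(280, 600)

# Two segments cut at the same well-defined boundaries
# (hypothetical coordinates): they share every position.
fixed_a = Segment(350, 450)
fixed_b = Segment(350, 450)

print(overlap_length(frag_a, frag_b), comparable(frag_a, frag_b, MIN_OVERLAP))
print(overlap_length(fixed_a, fixed_b), comparable(fixed_a, fixed_b, MIN_OVERLAP))
```

In this toy setting, the two fragments cannot be reliably compared, so a clustering step that depends on pairwise identity would have no basis for placing them in the same vOTU, while the two fixed-boundary segments are always directly comparable.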