Author manuscript; available in PMC: 2020 Aug 14.
Published in final edited form as: Cell Host Microbe. 2019 Aug 14;26(2):283–295.e8. doi: 10.1016/j.chom.2019.07.008

Table 1:

Table of definitions used in the paper.

| Term | Definition |
| --- | --- |
| metagenome | the total genomic potential of a microbial community (in this work, we use this term interchangeably with “sample” and “metagenomic sample”) |
| singleton gene | a gene detected in only one metagenomic sample across a defined collection of samples |
| non-singleton gene | a gene detected in more than one metagenomic sample across a defined collection of samples |
| ORFan gene | a gene that has no detectable homologs in other species and is distinct from all open reading frames (ORFs) in the genome |
| universe of genes | the set of all non-redundant genetic elements across all communities of organisms in a given niche |
| gene rarefaction curve | a curve tracking the accumulation of new genes as samples are incrementally added |
| gene discovery curve | the derivative of the gene rarefaction curve (it estimates the rate at which new genes are added to the catalog as samples are added incrementally, and it can be used to estimate the size of the universe of genes and the burden of sampling it) |
| singleton fraction curve | a curve estimating the fraction of a gene catalog that consists of singletons vs. non-singletons as samples are added incrementally (it is used to estimate the total number of samples that would be required for all singletons to be seen twice and thus no longer be singletons) |
| mixture contig | a contig from de novo assembly consisting of both singletons and non-singletons |
| singleton contig | a contig from de novo assembly consisting of only singletons |
| non-singleton contig | a contig from de novo assembly consisting of only non-singletons |
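
As an illustration of how the definitions in Table 1 relate to one another, the following minimal Python sketch computes them for a toy collection of samples. It is not the paper's pipeline. It assumes each sample is represented as a set of gene identifiers, that samples are added in one fixed order (no averaging over sample orderings), and that the derivative in the gene discovery curve is approximated by a finite difference. All function names and the toy data are hypothetical.

```python
from collections import Counter


def split_singletons(samples):
    """Split genes into singletons (seen in one sample) and non-singletons."""
    counts = Counter()
    for genes in samples:
        counts.update(set(genes))  # count each gene once per sample
    singletons = {g for g, n in counts.items() if n == 1}
    non_singletons = {g for g, n in counts.items() if n > 1}
    return singletons, non_singletons


def rarefaction_curve(samples):
    """Cumulative number of distinct genes after each sample is added."""
    seen, curve = set(), []
    for genes in samples:
        seen |= set(genes)
        curve.append(len(seen))
    return curve


def discovery_curve(rarefaction):
    """Finite-difference stand-in for the derivative (assumption):
    number of new genes contributed by each added sample."""
    previous = [0] + rarefaction[:-1]
    return [after - before for before, after in zip(previous, rarefaction)]


def singleton_fraction_curve(samples):
    """Fraction of the catalog that is singleton after each added sample."""
    counts, curve = Counter(), []
    for genes in samples:
        counts.update(set(genes))
        n_single = sum(1 for n in counts.values() if n == 1)
        curve.append(n_single / len(counts) if counts else 0.0)
    return curve


def classify_contig(contig_genes, singletons):
    """Label a contig by its gene content; genes not in `singletons`
    are assumed to be non-singletons from the same catalog."""
    if not contig_genes:
        raise ValueError("contig has no genes")
    flags = {g in singletons for g in contig_genes}
    if flags == {True}:
        return "singleton contig"
    if flags == {False}:
        return "non-singleton contig"
    return "mixture contig"


# Toy data (hypothetical): three samples with gene identifiers a to e.
samples = [{"a", "b"}, {"b", "c"}, {"c", "d", "e"}]
singletons, non_singletons = split_singletons(samples)
rarefaction = rarefaction_curve(samples)
discovery = discovery_curve(rarefaction)
fractions = singleton_fraction_curve(samples)
```

For this toy input, genes `a`, `d`, and `e` are singletons, the rarefaction curve is `[2, 3, 5]`, and the singleton fraction ends at 3/5. In practice the curves would typically be smoothed over many sample orderings before any extrapolation, which this sketch omits.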