2018 Jan 8;8:28. doi: 10.1038/s41598-017-18341-7

Table 3.

Protein family hits to described proteins.

| Family | DB | Tool | Best hit (protein) | Best hit (species) | Curated annotation |
|---|---|---|---|---|---|
| 1217 | nr | blastp & hmmsearch | unknown | Veillonella sp. CAG:933 | Bacterial protein |
| 532 | nr | hmmsearch | hypothetical protein | M. rupellensis | Bacterial protein |
| 565b | nr | blastp | hypothetical protein H257_12751 | A. astaci | Putative replication protein, viral or bacterial |
| | nr | hmmsearch | putative replication protein | Phytophthora parasitica virus | |
| 956b | nr | blastp & hmmsearch | hypothetical protein | C. trachomatis | Torque Teno virus ORF |

Four of the 32 ORFan protein families match proteins in the NR database. For each family, the table lists: (1) the database containing the hit, (2) the tool that detected the similarity, (3) the description of the highest-scoring hit, (4) the annotated species of the highest-scoring hit, and (5) our manually curated annotation, based on all significant hits for the protein family. Manual annotation was required because the best hit for a sequence does not always correspond to the most plausible annotation, either because of incorrect metadata or because the hit is a distant relative of a protein conserved in many different organisms.
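
The point that the single best hit can differ from the most plausible annotation can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the hit records, the family name `fam_A`, the function names, and the e-value cutoff of 1e-5 are all hypothetical, chosen only to show how tallying every significant hit for a family can give a curator a different picture than the top hit alone.

```python
from collections import Counter, defaultdict

# Hypothetical hit records: (family, hit description, hit species, e-value).
# These values are invented for illustration and are not data from the study.
HITS = [
    ("fam_A", "hypothetical protein", "species X", 1e-30),
    ("fam_A", "putative replication protein", "virus Y", 1e-25),
    ("fam_A", "putative replication protein", "virus Z", 1e-20),
    ("fam_A", "unrelated protein", "species W", 0.5),
]


def best_hit(hits, family):
    """Return the description of the lowest e-value hit for one family."""
    family_hits = [h for h in hits if h[0] == family]
    return min(family_hits, key=lambda h: h[3])[1]


def summarize_hits(hits, max_evalue=1e-5):
    """Group significant hits by family and count each hit description.

    max_evalue is an assumed cutoff, not a threshold taken from the paper.
    """
    summary = defaultdict(Counter)
    for family, description, _species, evalue in hits:
        if evalue <= max_evalue:
            summary[family][description] += 1
    # Most frequent descriptions first, for manual review.
    return {family: counts.most_common() for family, counts in summary.items()}


if __name__ == "__main__":
    print("Best hit:", best_hit(HITS, "fam_A"))
    print("All significant hits:", summarize_hits(HITS))
```

In this toy input the best hit is an uninformative "hypothetical protein", while the tally across all significant hits points toward a replication protein. The final call would still rest with a human curator.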