2017 Oct 24;8:1770. doi: 10.3389/fpls.2017.01770

Table 1.

Bioinformatics tools for the identification of viruses in RNA-sequence samples.

Tool | Reference | Strategy | Benchmarking | Seq Input | Availability
VirFind | Ho and Tzanetakis, 2014 | Web-based tool that maps and removes host reads and gives taxonomic information for virus reads
  • Quality Control

  • Mapping to host: Bowtie2 (Langmead, 2013)

  • Assembly of non-host reads: Velvet

  • Contigs compared with GenBank using Blastx

  • Unmapped reads translated and compared against NCBI conserved domain database (Marchler-Bauer et al., 2015)

38 Plant samples from 19 species | sRNA, mRNA | Web-based tool with GUI: http://virfind.org
Taxonomer | Flygare et al., 2016 | Fast web-based metagenomics analysis tool based on k-mer profiling. Comprised of 4 modules:
  • Binner: compares reads to a reference k-mer database [based on 21-mers created using Kanalyze (Audano and Vannberg, 2014)], assigning them to broad taxa (human, virus, bacteria)

  • Classifier: exact k-mer matching within taxonomic bins against k-mer databases created from UniRef datasets (Suzek et al., 2015) (e.g., viral UniRef90)

  • Protonomer: further classification in protein space

  • AfterBurner: discovery of novel taxa

Human | mRNA | Webserver: https://www.taxonomer.com/
VSD toolkit | Barrero et al., 2017 | Modules and workflows in the Yabi analytical environment for identification of viral sequences in plants
  • Quality Control

  • De novo assembly: SPAdes (Bankevich et al., 2012)

  • Overlapping contigs merged

  • Contigs >40 nt aligned to plant, virus and viroid GenBank databases using Blastn and Blastx

  • Unmapped contigs filtered and analyzed to identify putative circular viroids

21 Plant genomes | sRNA | Source code available to use with Yabi (Hunter et al., 2012): https://github.com/muccg/yabi
Metavisitor | Carissimo et al., 2017 | Modular tools and workflows within the Galaxy analytical environment, designed for detection and reconstruction of viral genomes
  • Quality control

  • Mapping to host, symbionts and parasites: Bowtie2

  • Unmapped reads assembled using Velvet + Oases (Schulz et al., 2012) or Trinity

  • Contigs compared to GenBank virus database using Blastx and Blastn

  • Blast guided scaffolding of selected virus sequences

Human, Drosophila, Mosquito | mRNA | Source code available to use within Galaxy (Afgan et al., 2016). Galaxy Toolshed: suite_metavisitor_1_2. Galaxy instance: https://mississippi.snv.jussieu.fr/
VIP | Li et al., 2016 | An integrated pipeline for metagenomics of virus identification and discovery
  • Quality Control

  • Mapping to host: Bowtie2

  • Fast Mode: reads mapped to Virus Pathogen Resource (ViPR) (Pickett et al., 2012) and Influenza Database (Squires et al., 2012)

  • Sense mode: reads mapped to virus RefSeq (O'Leary et al., 2016) nucleotide

  • Unmapped reads mapped to RefSeq protein: RAPSearch (Zhao et al., 2012)

  • Options for de novo assembly with Velvet-Oases

Human | mRNA | Local installation. Code available at https://github.com/keylabivdc/VIP
ViromeScan | Rampelli et al., 2016 | Tool for metagenomics viral community profiling
  • Mapping to built-in databases (includes plants): Bowtie2

  • Mapped reads processed for quality

  • Human Best Match Tagger (BMTagger) (Agarwala and Aleksandr, 2011) used to screen out human and bacterial sequences

  • Screened reads re-mapped to virus database: Bowtie2

Human | mRNA | Local installation. Code available at http://sourceforge.net/projects/viromescan
VirusHunter | Zhao et al., 2013 | Data analysis pipeline for novel virus identification from Roche 454 sequencers and other long read platforms
  • Similar reads clustered using CD-Hit (Li and Godzik, 2006) and longest sequence used as representative

  • Repeat regions masked with Repeat Masker (Smit et al., 2013)

  • Mapping to host (default: Human): Blastn

  • Unmapped sequences aligned to NCBI nucleotide database and classified into taxonomies

  • Unmapped sequences mapped to NCBI nr databases using Blastx

BHK (hamster) cell culture infected with viruses | mRNA | http://pathology.wustl.edu/VirusHunter/ Code available upon request for local installation.
ezVIR | Petty et al., 2014 | Bioinformatics pipeline to evaluate spectrum of known human viruses
  • Mapping to host (Human genome): Bowtie2

  • Non-host reads mapped to custom virus database: Bowtie2

  • Additional analysis on specific mapped classes provides targeted strain classification

Human | mRNA | Local installation. Code available: http://cegg.unige.ch/ezvir
Virus Detect | Zheng et al., 2017 | Bioinformatics pipeline to analyse sRNA datasets for both known and novel virus identification
  • Maps reads to virus reference sequences: BWA

  • Mapped reads assembled using references

  • Mapped reads de novo assembled using Velvet

  • Reference assemblies and de novo assemblies pooled and redundant sequences removed

  • Contigs compared to virus reference nucleotides: Blastn

  • Unmatched contigs compared to virus reference protein sequences: Blastx

Plants (Potato) | sRNA | Webserver: http://bioinfo.bti.cornell.edu/tool/VirusDetect
VirusFinder | Wang et al., 2013 | Software for detection of viruses and their host integration sites
  • Mapping to host (Human genome): Bowtie2

  • Non-host reads mapped to one of two virus databases (Hirahata et al., 2007; Bhaduri et al., 2012): Bowtie2

  • Mapped reads de novo assembled: Trinity

  • Contigs mapped to virus database and human genome

  • For a specific virus of interest, host integration sites are identified using BWA (Li and Durbin, 2009)

Human | RNA-seq, WGS, Targeted | https://bioinfo.uth.edu/VirusFinder/

References for databases and software with multiple entries in the table: GenBank (Benson et al., 2013), Blastx and Blastn (Altschul et al., 1990), Velvet (Zerbino and Birney, 2008), Bowtie2 (Langmead, 2013), Trinity (Grabherr et al., 2011).
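
The k-mer binning strategy listed for Taxonomer's Binner module (reads compared against a reference k-mer database and assigned to broad taxa) can be illustrated with a minimal Python sketch. This is not Taxonomer's implementation: the function names (`kmers`, `build_bin_index`, `bin_read`), the toy reference sequences, the tie-handling rule, and the use of k = 5 for the toy data (rather than the 21-mers cited in the table) are all assumptions made for illustration.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all k-mers in seq (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_bin_index(references, k):
    """Map each k-mer to the set of bins (e.g. 'host', 'virus') containing it."""
    index = {}
    for bin_name, seqs in references.items():
        for seq in seqs:
            for km in kmers(seq, k):
                index.setdefault(km, set()).add(bin_name)
    return index


def bin_read(read, index, k):
    """Assign a read to the bin sharing the most exact k-mers with it.

    Returns 'unassigned' when no k-mer matches or the top two bins tie
    (an assumed rule for this sketch).
    """
    counts = Counter()
    for km in kmers(read, k):
        for bin_name in index.get(km, ()):
            counts[bin_name] += 1
    if not counts:
        return "unassigned"
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return "unassigned"
    return ranked[0][0]


# Hypothetical toy references; a real database would hold full genomes.
references = {
    "host": ["ACGTACGTTAGCCGATAGCTAGGCTA"],
    "virus": ["TTGACCGGTTAACCGGATCCGATTGA"],
}
index = build_bin_index(references, k=5)

for read in ["CCGGTTAACCGG", "GTTAGCCGATAG", "GGGGGGGGGGGG"]:
    print(read, bin_read(read, index, k=5))
```

Exact k-mer matching avoids alignment, which is why this style of binning is fast. The trade-off is that divergent viral sequences share few exact k-mers with any reference, which is one reason the pipelines in the table also add protein-space comparison or de novo assembly steps.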