2017 Apr 16;19(5):1035–1050. doi: 10.1093/bib/bbx039

Table 5.

Some widely used DWFS, with their potential use cases, technologies and limitations, summarized from their Web sites and other literature, including [4, 28, 54, 98–100]

DWFS Potential use cases Technologies Limitations
Tavaxy Personalized medicine and NGS (short DNA reads, DNA segments, phylogenetic and taxonomic analysis, EMBOSS, SAMtools, etc.)
  • SCUFL, JSON, hierarchical workflow structure, asynchronous protocol and DAG-style workflow creation and execution

  1. Difficulty in combining bio-pipelines from Galaxy and Taverna workflows using SCUFL

  2. Lack of sufficient interoperability

  3. Does not support loops in workflow creation

  4. Lack of opportunity for workflow sharing

Taverna2-Galaxy Life sciences (e.g. eukaryotic genome biology)
  • SCUFL 2 (experimental), semantics, RDF, OWL and DAG

  1. SCUFL 2 is still in Apache incubation

  2. Does not support loops in workflow creation

  3. Lack of opportunity for workflow sharing

Galaxy NGS (QC and manipulation, Deep Tools, Mapping, RNA Analysis, SAMtools, BAM Tools, Picard, VCF Manipulation, Peak Calling, Variant Analysis, RNA Structure, Du Novo, Gemini, FASTA Manipulation, EMBOSS, etc.)
  • Python, JavaScript, Shell script

  • OS: Linux and Mac OS X

  1. No proper interlinking mechanism in pipeline functionalities between dependent modules

  2. Does not support loops in workflow creation

  3. Does not support control-flow operations and remote services

  4. No workflow language is available; an RDBMS is used instead

  5. Adding new tools requires advanced IT knowledge

KNIME Pharma and healthcare (virtual high-throughput screening, chemical library enumeration, outlier detection in BioMed data and NGS analysis with KNIME Extension [107])
  • Java/Eclipse, KNIME SDK and Spotfire (supports Python and Perl scripts)

  1. JDBC mechanism to access the databases is slow

  2. High latency in requests and responses

  3. Not scalable for large-scale data and heavy computation

  4. No reproducibility of the computational results

Taverna Domain-independent (bioinformatics, cheminformatics, gravitational wave analysis)
  • WSDL, Java and DAG

  1. Not scalable for large-scale data and heavy computation

  2. Slow response when creating and then submitting large-scale workflows

  3. No reproducibility of the computational results

Wings Multi-omics analysis and cancer omics
  • Java, Maven, DAG, Tomcat and Graphviz

  • OS: Unix and Mac OS X

  1. Not scalable for large-scale data and heavy computation

  2. No data integration support

  3. Lack of computational transparency

  4. Lack of interoperability with other DWFS

Anduril Cancer research and molecular biology, DNA, RNA and ChIP-seq, DNA and RNA microarrays, cytometry and image analysis
  • Workflows are constructed using Scala, DAG notation and AndurilScript; developed in Java

  • OS: Windows, Linux, and Mac OS X

  1. No data conversion support

  2. Lack of interoperability with other DWFS

  3. Cannot be configured on cloud infrastructure

  4. Not suitable for workflows containing loops

Unipro UGENE NGS: sequencing, annotations, multiple alignments, phylogenetic trees, assemblies, RNA/ChIP-seq, raw NGS, local sequence alignment, protein sequencing, plasmid, variant calling, evolutionary biology and virology
  • C++, Qt, DAG-style workflow creation and support

  • OS: cross-platform

  1. Does not support loops in workflow creation

  2. Data provenance cannot be ensured

  3. Not scalable for large-scale data and heavy computation

  4. Lack of computational transparency

  5. No reproducibility of the computational results

Pipeline Pilot NGS: gene expression and sequence data analysis, imaging; Pharma: drug–chemical material analysis, cheminformatics, ADMET, polymer properties synthesis, data modeling
  • Visual and data-flow oriented, written in C++

  • OS: Windows and Linux

  1. No control flow operation

  2. Not scalable for large-scale data and heavy computation

  3. Limited data provenance support

  4. No reproducibility of the computational results
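
Several DAG-based systems in the table are listed as not supporting loops. The following is a minimal Python sketch of why a DAG execution model rules loops out; it is not taken from any of the listed systems. The step names and the helpers `execution_order` and `is_schedulable` are hypothetical. The workflow is assumed to be a mapping from each step to the set of steps it depends on, and scheduling is done with the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(workflow):
    """Return one valid run order for {step: set of prerequisite steps}.

    Raises graphlib.CycleError if the dependencies contain a loop.
    """
    return list(TopologicalSorter(workflow).static_order())


def is_schedulable(workflow):
    """True if a DAG-style scheduler can order the workflow, False if it loops."""
    try:
        execution_order(workflow)
        return True
    except CycleError:
        return False


# Hypothetical three-step pipeline: each step depends on the previous one.
dag_workflow = {
    "quality_control": set(),
    "align": {"quality_control"},
    "call_variants": {"align"},
}

# The same pipeline with a back edge: quality control is re-run after
# variant calling, which turns the graph into a cycle (a loop).
looped_workflow = dict(dag_workflow)
looped_workflow["quality_control"] = {"call_variants"}

if __name__ == "__main__":
    print(execution_order(dag_workflow))
    print(is_schedulable(looped_workflow))
```

In this sketch the acyclic pipeline is ordered without error, while the looped variant is rejected because no topological order exists. A system that needs iteration therefore has to add an explicit control-flow construct on top of the DAG model.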