Highlights: Which WfMS to use day-to-day |
In light of this, a pragmatic approach to workflow choice could be the following: |
1. Assess: is there a need to build a new pipeline, or is there an existing reasonable pipeline in the Nextflow, CWL, or WDL repos? |
(a) If a workflow exists that follows good coding practices, it should be adopted and modified as per specific needs. |
(b) If starting fresh, unconstrained by collaborators’ preferences or an existing legacy codebase: |
i. If a quick development cycle is important, Nextflow is optimal. |
ii. If code readability is important, WDL is optimal. |
iii. If the execution environment is variable, or there is a need to work across heterogeneous hardware environments, CWL is optimal. |
iv. Table 1 is a quick overview of each language’s features at a crude level. |
2. Assess: what execution constraints are in place? |
(a) For HPC environments, pay particular attention to runners supporting different CRMs. Our recommended free, production-scale runners for these are Cromwell (for both WDL and CWL) and Nextflow (for Nextflow workflows). Toil was less performant in comparison (refer to section: Scalability). |
(b) For running in the cloud, pay particular attention to runners with support for different cloud APIs, and features like automatic rescaling, containerization, and security settings. Table 2 gives a quick overview of runners, language versions supported by each, and key performance aspects. |
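
The decision steps above can be sketched as a small lookup, purely as an illustration. The names `ProjectNeeds`, `recommend_language`, and `recommend_hpc_runner` are hypothetical and not part of any WfMS. The order in which the three criteria in 1(b) are checked is an assumption, since the text does not rank them. Cloud runners are left out because the text gives no single recommendation for them and defers to Table 2.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProjectNeeds:
    """Hypothetical container for the questions asked in steps 1 and 2."""
    # Language of an existing, well-written pipeline, if one was found (step 1a).
    existing_pipeline_language: Optional[str] = None
    quick_development: bool = False            # step 1(b)i
    code_readability: bool = False             # step 1(b)ii
    heterogeneous_environments: bool = False   # step 1(b)iii


def recommend_language(needs: ProjectNeeds) -> Optional[str]:
    """Return a suggested workflow language, or None if no criterion applies."""
    # Step 1(a): adopt and adapt an existing good pipeline when there is one.
    if needs.existing_pipeline_language is not None:
        return needs.existing_pipeline_language
    # Step 1(b). ASSUMPTION: criteria are checked in the order they are listed;
    # the text does not say which one wins when several apply.
    if needs.quick_development:
        return "Nextflow"
    if needs.code_readability:
        return "WDL"
    if needs.heterogeneous_environments:
        return "CWL"
    return None  # fall back to the feature overview in Table 1


# Step 2(a): recommended free, production-scale runners for HPC environments.
HPC_RUNNERS = {"WDL": "Cromwell", "CWL": "Cromwell", "Nextflow": "Nextflow"}


def recommend_hpc_runner(language: str) -> str:
    """Look up the HPC runner recommended for a given workflow language."""
    return HPC_RUNNERS[language]
```

For example, `recommend_language(ProjectNeeds(code_readability=True))` yields `"WDL"`, and `recommend_hpc_runner("WDL")` then yields `"Cromwell"`. A real choice would also weigh the details in Tables 1 and 2, which this sketch does not encode.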