Table 2.
Comparison summary of four different hardware accelerators for sequence alignment.
| Features | AligneR (Zokaee et al., 2018) | FPGASW (Fei et al., 2018) | Darwin (Turakhia et al., 2017, 2019) | ASAP (Banerjee et al., 2019) |
|---|---|---|---|---|
| Speed (reads/sec) | 483k* | – | 23k† | ∼10k§ |
| Max read length (bp) | 1024 | – | 10k | 128 |
| Data structure | FM-index | – | – | – |
| Hardware accelerator processor | ReRAM (specialist) | Xilinx Virtex-7 XC7VX485T FPGA | Xilinx Kintex-7 FPGA‡ | Xilinx Virtex-7 XC7VX690T FPGA |
| Operating frequency (MHz) | 100 | 200 | 250 | 250 |
| Processing elements (PE) per array | – | 512 | 64 | 256 |
| GCUPS | – | 105.9 | – | 609.6 |
| Data bus | – | – | NoC interconnect | Crossbar |
| External memory (DRAM) | No external memory dependence | 3 × 8 GB DDR3-1600 | 4 × 32 GB LPDDR4 | – |
| Host CPU | – | Intel i5 | Intel Xeon E5-26200 | IBM POWER8 |
| Host memory (GB) (DDR3 RAM) | – | 8 | 64 | – |
| Host interface | – | SFP+ Optical interface | ×16 PCIe 2.0 | CAPI interface |
| Search space reduction | – | – | D-SOFT | – |
| Edit distance function | Hamming | Levenshtein | – | Levenshtein |
| Gap penalty model | – | Affine | Affine | Constant |
| Edit distance implementation | Process-In-Memory (PIM) | Sequential logic | Sequential logic | Sequential logic |
| Power consumption (W) | 1.9 | 44 | 15 | 6.9 |
Note: Speed is quoted in reads per second for simulated reads. Max read length (bp) is the reported maximum read length that can be aligned. Data structure is the compression scheme used. Hardware accelerator processor is the main accelerator device used. Operating frequency (MHz) is the clock frequency of the accelerator hardware. Processing elements (PE) per array is the number of computational cells per dynamic programming (DP) matrix/array. GCUPS (giga cell updates per second) measures performance as the number of cell updates per second, in billions, performed by the processing elements of a single array. Data bus is the interconnection strategy used. External memory (DRAM) is the external DRAM available to support accelerator operation. Host CPU is the CPU of the host computer that interfaces with the accelerator. Host memory (GB) is the memory capacity of the host computer on which the accelerator can draw. Host interface is the communication interconnect between host and accelerator. Search space reduction is the strategy used to reduce the search space in the pre-alignment filtering stage. Edit distance function is the specific edit distance calculation method used. Gap penalty model is the gap (insertion or deletion) penalty method used in each implementation. Edit distance implementation is the mode in which each accelerator computes the edit distance function to determine the optimum alignment. Power consumption (W) is the power consumed by the accelerator during alignment. Information that could not be obtained is denoted by (–). Please refer to the respective article(s) cited in the table for further details.
* AligneR computing speed is based on 10 million simulated 100 bp short reads from the human reference genome hg19.
† Darwin computing speed is based on 3 million simulated 1000 bp short reads from the human reference genome GRCh38.
‡ Details of the actual device used by Darwin are unavailable beyond the fact that it belongs to the Xilinx Kintex-7 series.
§ ASAP computing speed is based on 100 million simulated 128 bp short reads from the human reference genome hg38.
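
To make the GCUPS definition in the note concrete, the following minimal Python sketch shows how the metric is typically computed. The function names `gcups` and `peak_gcups` are hypothetical, and the assumption that every processing element updates exactly one DP cell per clock cycle is an idealization introduced here for illustration; it is not taken from the cited articles, and the reported GCUPS values in Table 2 need not follow from it.

```python
def gcups(query_len: int, target_len: int, seconds: float) -> float:
    """Measured GCUPS for one pairwise alignment.

    A full DP matrix for a query of length m and a target of length n
    has m * n cells; GCUPS is that cell count divided by the run time,
    expressed in billions of cell updates per second.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    cells = query_len * target_len
    return cells / seconds / 1e9


def peak_gcups(num_pes: int, freq_mhz: float, num_arrays: int = 1) -> float:
    """Idealized upper bound on GCUPS.

    Assumption (illustrative only): each processing element updates
    one DP cell per clock cycle and is never idle.
    """
    updates_per_second = num_pes * freq_mhz * 1e6 * num_arrays
    return updates_per_second / 1e9


if __name__ == "__main__":
    # A 1000 bp query against a 1 Mbp target, aligned in 0.5 s.
    print(gcups(1_000, 1_000_000, 0.5))
    # A hypothetical single array of 512 PEs clocked at 200 MHz.
    print(peak_gcups(512, 200))
```

The first function describes a measured figure, while the second gives only a ceiling; real designs can fall below it because of pipeline fill and memory stalls, or report higher aggregate figures when several arrays run in parallel.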