2021 May 25;37(13):1785–1795. doi: 10.1093/bioinformatics/btab017

Table 2.

Comparison summary of four different hardware accelerators for sequence alignment.

Features	AligneR (Zokaee et al., 2018)	FPGASW (Fei et al., 2018)	Darwin (Turakhia et al., 2017, 2019)	ASAP (Banerjee et al., 2019)
Speed (reads/sec)	483k^*	–	23k^†	∼10k^§
Max read length (bp)	1024	–	10k	128
Data structure	FM-index	–	–	–
Hardware accelerator processor	ReRAM (specialist)	Xilinx Virtex-7 XC7VX485T FPGA	Xilinx Kintex-7 FPGA^‡	Xilinx Virtex-7 XC7VX690T FPGA
Operating frequency (MHz)	100	200	250	250
Processing elements (PE) per array	–	512	64	256
GCUPS	–	105.9	–	609.6
Data bus	–	–	NoC interconnect	Crossbar
External memory (DRAM)	No external memory dependence	3 × 8 GB DDR3-1600	4 × 32 GB LPDDR4	–
Host CPU	–	Intel i5	Intel Xeon E5-26200	IBM POWER8
Host memory (GB) (DDR3 RAM)	–	8	64	–
Host interface	–	SFP+ Optical interface	×16 PCIe 2.0	CAPI interface
Search space reduction	–	–	D-SOFT	–
Edit distance function	Hamming	Levenshtein	–	Levenshtein
Gap penalty model	–	Affine	Affine	Constant
Edit distance implementation	Process-In-Memory (PIM)	Sequential logic	Sequential logic	Sequential logic
Power consumption (W)	1.9	44	15	6.9

Note: Speed is quoted in reads per second for simulated reads. Max read length (bp) is the reported maximum read length that can be aligned. Data structure corresponds to the compression mode used. Hardware accelerator processor is the main accelerator device used. Operating frequency (MHz) is the clock frequency of the accelerator hardware. Processing elements (PE) per array is the number of computational cells per dynamic programming (DP) matrix/array. GCUPS (giga cell updates per second) is a performance measure of the number of processing element cell updates per second for a single array cell. Data bus is the interconnection strategy used. External memory (DRAM) corresponds to the DRAM available to support accelerator operation. Host CPU is the CPU of the host computer that interfaces with the accelerator. Host memory (GB) is the memory capacity of the host computer that the accelerator can draw upon. Host interface is the communication interconnect between host and accelerator. Search space reduction corresponds to the strategy used to reduce the search space in the pre-alignment filtering stage. Edit distance function corresponds to the specific edit distance calculation method used. Gap penalty model corresponds to the specific gap (insertion or deletion) penalty method used by each implementation. Edit distance implementation is the mode in which each accelerator computes the edit distance function to determine the optimum alignment. Power consumption (W) is the power consumed by the accelerator during alignment. Information that could not be obtained is denoted by (–). Please refer to the respective article(s) cited in the table for further details.
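The Edit distance function row distinguishes Hamming from Levenshtein distance. As a minimal software sketch of that difference (textbook definitions only, not the hardware implementation of any accelerator in the table; the function names are chosen here for illustration), Hamming distance counts mismatches position by position, whereas Levenshtein distance also allows insertions and deletions, each at unit cost:

```python
def hamming(a: str, b: str) -> int:
    """Count mismatching positions; defined only for equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Minimum number of substitutions, insertions and deletions (unit cost each)."""
    prev = list(range(len(b) + 1))       # DP row for the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]                       # aligning a[:i] against the empty string
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion from a
                curr[j - 1] + 1,         # insertion into a
                prev[j - 1] + (x != y),  # match (cost 0) or substitution (cost 1)
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    read, ref = "ACGTACGT", "CGTACGTA"   # ref is read shifted by one base
    print("Hamming:", hamming(read, ref))
    print("Levenshtein:", levenshtein(read, ref))
```

A one-base shift makes every position mismatch under Hamming distance, while Levenshtein distance explains the same pair with one deletion and one insertion. This is why a Hamming-based design is restricted to substitution-only comparisons of equal-length sequences.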

^* AligneR computing speed is based upon 10 million, 100 bp simulated short reads from human genome reference hg19.

^† Darwin computing speed is based upon 3 million, 1000 bp simulated short reads from human genome reference GRCh38.

^‡ No details on the exact device used by Darwin are available beyond the Xilinx Kintex-7 series.

^§ ASAP computing speed is based upon 100 million, 128 bp simulated short reads from human genome reference hg38.
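As a rough illustration of how the Speed and Power consumption rows can be combined, the sketch below divides reported reads per second by reported watts to give reads per joule, and multiplies by the read length from the footnotes to give bases per joule. The dictionary layout and the function name `per_watt` are assumptions made for this example. The three systems were measured on different read lengths, datasets and reference builds, so the output is an arithmetic illustration rather than a like-for-like benchmark.

```python
# Values copied from Table 2 and its footnotes.
# FPGASW is omitted because its speed is not reported.
accelerators = {
    "AligneR": {"reads_per_s": 483_000, "read_bp": 100, "power_w": 1.9},
    "Darwin": {"reads_per_s": 23_000, "read_bp": 1000, "power_w": 15.0},
    # ASAP speed is approximate (reported as ~10k reads/sec).
    "ASAP": {"reads_per_s": 10_000, "read_bp": 128, "power_w": 6.9},
}


def per_watt(entry: dict) -> tuple[float, float]:
    """Return (reads per joule, bases per joule) for one accelerator entry."""
    reads_per_joule = entry["reads_per_s"] / entry["power_w"]  # (reads/s) / (J/s)
    bases_per_joule = reads_per_joule * entry["read_bp"]
    return reads_per_joule, bases_per_joule


if __name__ == "__main__":
    for name, entry in accelerators.items():
        reads_j, bases_j = per_watt(entry)
        print(f"{name}: {reads_j:,.0f} reads/J, {bases_j:,.0f} bases/J")
```

Normalising by read length matters here because Darwin's reads are roughly ten times longer than those used for AligneR and ASAP, so reads per joule alone understates the work done per read.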