2011 May 20;12:181. doi: 10.1186/1471-2105-12-181

Table 7.

Performance of the algorithms depending on the hardware architecture

algorithm	avg. length	# of sequences	1 GT200 GPU GCUPS	1 GT200 GPU speedup	1 Fermi GPU GCUPS	1 Fermi GPU speedup	2 Fermi GPUs GCUPS	2 Fermi GPUs speedup
SW	51	4000	2.58	1.00	5.13	1.99	10.02	3.88
SW	51	8000	2.68	1.00	5.21	1.95	10.13	3.78
SW	51	12000	2.71	1.00	5.25	1.94	10.28	3.79
SW	154	2000	2.65	1.00	5.37	2.03	10.55	3.98
SW	154	4000	2.80	1.00	5.57	1.99	10.87	3.88
SW	154	6000	2.86	1.00	5.65	1.97	11.07	3.87
SW	257	2000	2.68	1.00	5.12	1.91	9.91	3.69
SW	257	4000	2.83	1.00	5.25	1.85	10.21	3.60
SW	257	6000	2.90	1.00	5.20	1.80	10.10	3.48
SW	459	2000	2.71	1.00	4.26	1.57	8.02	2.96
SW	459	4000	2.84	1.00	4.57	1.61	8.46	2.98
SW	459	6000	2.85	1.00	4.64	1.63	8.56	3.01
global NW	51	4000	3.04	1.00	5.68	1.87	11.13	3.66
global NW	51	8000	3.15	1.00	5.76	1.83	11.21	3.56
global NW	51	12000	3.17	1.00	5.81	1.83	11.36	3.58
global NW	154	2000	3.10	1.00	5.88	1.90	11.46	3.69
global NW	154	4000	3.29	1.00	6.16	1.87	12.06	3.67
global NW	154	6000	3.36	1.00	6.28	1.87	12.24	3.64
global NW	257	2000	3.15	1.00	5.68	1.80	10.85	3.44
global NW	257	4000	3.33	1.00	5.80	1.74	11.15	3.35
global NW	257	6000	3.55	1.00	5.78	1.63	11.15	3.14
global NW	459	2000	3.19	1.00	4.84	1.52	9.07	2.84
global NW	459	4000	3.35	1.00	5.14	1.54	9.63	2.88
global NW	459	6000	3.36	1.00	5.17	1.54	9.68	2.89
semiglobal NW	51	4000	2.88	1.00	5.50	1.91	10.75	3.74
semiglobal NW	51	8000	2.97	1.00	5.58	1.88	10.86	3.66
semiglobal NW	51	12000	3.01	1.00	5.62	1.87	11.04	3.67
semiglobal NW	154	2000	3.00	1.00	5.81	1.94	11.36	3.79
semiglobal NW	154	4000	3.18	1.00	6.02	1.90	11.87	3.74
semiglobal NW	154	6000	3.25	1.00	6.17	1.90	12.00	3.69
semiglobal NW	257	2000	3.05	1.00	5.60	1.84	10.90	3.57
semiglobal NW	257	4000	3.23	1.00	5.77	1.79	11.10	3.44
semiglobal NW	257	6000	3.39	1.00	5.73	1.69	11.03	3.26
semiglobal NW	459	2000	3.09	1.00	4.78	1.54	9.04	2.92
semiglobal NW	459	4000	3.24	1.00	5.04	1.56	9.55	2.95
semiglobal NW	459	6000	3.25	1.00	5.14	1.58	9.56	2.94

Mean performance in GCUPS (giga cell updates per second) of the algorithm variants (SW: Smith-Waterman; global and semiglobal NW: Needleman-Wunsch) running on two architectures: GT200 (Tesla S1070) and Fermi (GeForce GTX480). Each speedup value is relative to the one-GT200-GPU configuration in the same row.
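As a quick check on how the speedup columns relate to the GCUPS columns, the following minimal sketch recomputes the speedups for the first SW row (avg. length 51, 4000 sequences). The function name `speedup` and the dictionary layout are illustrative assumptions, not part of the original work, and the table values are written here with decimal points.

```python
def speedup(gcups: float, baseline_gcups: float) -> float:
    """Throughput relative to the baseline, rounded to two decimals like the table."""
    return round(gcups / baseline_gcups, 2)


# GCUPS values copied from the first SW row of the table
# (avg. length 51, 4000 sequences).
row = {"1 GT200 GPU": 2.58, "1 Fermi GPU": 5.13, "2 Fermi GPUs": 10.02}

# The caption states that speedup always refers to one GT200 GPU.
baseline = row["1 GT200 GPU"]

speedups = {config: speedup(gcups, baseline) for config, gcups in row.items()}
print(speedups)
```

For this row the recomputed values agree with the table (1.00, 1.99, 3.88). For some other rows a value recomputed from the rounded GCUPS figures can differ from the published one in the last digit, possibly because the published speedups were derived from unrounded measurements; that explanation is an assumption, not something the table states.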