Average time (in seconds) used by CPU and GPU solvers for eight selected test proteins and representative solvers. On the GPU, the CG and Jacobi-preconditioned CG solvers were run with the CUSP library in the DIA matrix format, the SA-AMG-preconditioned CG solver was run with the CUSP library in the COO matrix format, and the ICCG solver was implemented with the cuSPARSE library in the CSR matrix format. The timing for each solver includes the full execution time of the core routine, i.e., the time spent on the device (GPU), on the host (CPU), and on transferring data between the device and the host. Both the 10⁻³ and 10⁻⁶ convergence criteria were used for comparison.