Author manuscript; available in PMC: 2013 Oct 1.
Published in final edited form as: Physiol Meas. 2012 Sep 26;33(10):1703–1715. doi: 10.1088/0967-3334/33/10/1703

Figure 3.


Sketch of the multi-GPU architecture: the NVIDIA Tesla S1070 computing solution consists of 4 GPUs housed in a chassis that provides power and a dedicated interconnection to a PC. At the software level, 4 threads are instantiated on the CPU; each thread communicates with one GPU and is responsible for splitting the data, transferring its share to the associated GPU, launching the Jacobian computation on that GPU, and transferring the results back to the CPU. As each GPU processes one fourth of the Jacobian matrix, the results are re-assembled on the CPU to form the full matrix. With these overheads included, accelerations of 50 times have been possible using the 4 GPUs provided by the S1070 computing solution.
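
The one-thread-per-GPU pattern described in the caption can be sketched on the CPU alone. The following is a minimal illustration in Python, not the authors' implementation: `compute_block` is a hypothetical stand-in for the transfer, kernel launch, and copy-back steps; `jacobian_entry` is a placeholder formula; and splitting the matrix by contiguous rows is an assumption, since the caption states only that each GPU processes one fourth of the matrix.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_DEVICES = 4  # the S1070 houses 4 GPUs


def split_rows(n_rows, n_parts):
    """Partition n_rows into n_parts contiguous (start, stop) ranges."""
    base, extra = divmod(n_rows, n_parts)
    ranges, start = [], 0
    for i in range(n_parts):
        # The first `extra` parts take one additional row each.
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def jacobian_entry(i, j):
    """Placeholder for one Jacobian entry (hypothetical, not the paper's formula)."""
    return float(i * 10 + j)


def compute_block(device_id, row_range, n_cols):
    """Stand-in for the per-GPU work: copy inputs to device `device_id`,
    launch the Jacobian kernel, and copy the resulting block back."""
    start, stop = row_range
    return [[jacobian_entry(i, j) for j in range(n_cols)]
            for i in range(start, stop)]


def assemble_jacobian(n_rows, n_cols, n_devices=NUM_DEVICES):
    """Run one worker thread per device and re-assemble the full matrix."""
    ranges = split_rows(n_rows, n_devices)
    with ThreadPoolExecutor(max_workers=n_devices) as pool:
        futures = [pool.submit(compute_block, d, r, n_cols)
                   for d, r in enumerate(ranges)]
        # Collect results in device order so the row order is preserved.
        blocks = [f.result() for f in futures]
    return [row for block in blocks for row in block]
```

Calling `assemble_jacobian(10, 3)` returns a 10-row matrix whose rows were computed in four independent blocks. In a real multi-GPU setting, each worker would first bind to its own device and the splitting, transfer, and re-assembly steps would account for the overheads mentioned in the caption.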