GPU-powered model analysis with PySB/cupSODA

Leonard A Harris; Marco S Nobile; James C Pino; Alexander L R Lubbock; Daniela Besozzi; Giancarlo Mauri; Paolo Cazzaniga; Carlos F Lopez

doi:10.1093/bioinformatics/btx420

. 2017 Jun 28;33(21):3492–3494. doi: 10.1093/bioinformatics/btx420

GPU-powered model analysis with PySB/cupSODA

Leonard A Harris ^1,^2,², Marco S Nobile ^3,^4,², James C Pino ^2,^5,², Alexander L R Lubbock ^1,², Daniela Besozzi ^3,⁴, Giancarlo Mauri ^3,⁴, Paolo Cazzaniga ^4,^6,^✉, Carlos F Lopez ^1,^2,^✉

Editor: Jonathan Wren

PMCID: PMC5860165 PMID: 28666314

Abstract

Summary

A major barrier to the practical utilization of large, complex models of biochemical systems is the lack of open-source computational tools to evaluate model behaviors over high-dimensional parameter spaces. This is due to the high computational expense of performing thousands to millions of model simulations required for statistical analysis. To address this need, we have implemented a user-friendly interface between cupSODA, a GPU-powered kinetic simulator, and PySB, a Python-based modeling and simulation framework. For three example models of varying size, we show that for large numbers of simulations PySB/cupSODA achieves order-of-magnitude speedups relative to a CPU-based ordinary differential equation integrator.

Availability and implementation

The PySB/cupSODA interface has been integrated into the PySB modeling framework (version 1.4.0), which can be installed from the Python Package Index (PyPI) using a Python package manager such as pip. cupSODA source code and precompiled binaries (Linux, Mac OS/X, Windows) are available at github.com/aresio/cupSODA (requires an Nvidia GPU; developer.nvidia.com/cuda-gpus). Additional information about PySB is available at pysb.org.

Supplementary information

Supplementary data are available at Bioinformatics online.

1 Introduction

Kinetic modeling of complex biochemical systems is central to the emerging field of systems biology (Kitano, 2002; Le Novère, 2015). Kinetic models require definition of numerous free parameters, usually obtained by calibration to experimental data, that specify initial species concentrations and kinetic rate constants. Once calibrated, a model should be analyzed for its sensitivity and predictive power over ranges of parameter values (Fisher and Henzinger, 2007). Both model calibration and analysis can require thousands to millions of model simulations for statistical convergence and significance (Eydgahi et al., 2013; Gutenkunst et al., 2007). In many cases, the computational expense of simulation at this scale makes detailed model analysis infeasible.

Recently, efforts have been made to leverage the highly parallel structure of graphics processing units (GPUs) to accelerate scientific computations (Dematté and Prandi, 2010; Nobile et al., 2016). GPUs are well suited for applications in which the same arithmetic operations are applied to many independent data elements, e.g. solving independently parameterized systems of ordinary differential equations (ODEs). GPU-based kinetic simulators thus hold great promise for accelerating tasks such as model calibration and analysis, but are challenging for non-experts to use because they require specialized settings and inputs.

To address this problem, we have created a user-friendly interface between the GPU-based kinetic simulator cupSODA (Nobile et al., 2013, 2014) and PySB, a Python-based modeling and simulation platform (Lopez et al., 2013). cupSODA is built around the well-known adaptive stiff/non-stiff ODE integrator LSODA (Petzold, 1983). It is designed to perform thousands of parallel simulations, each independently parameterized, of mass-action kinetic models by leveraging the high-performance memories on the GPU, specifically the cached and non-mutable constant memory and the low-latency on-chip shared memory. PySB is a rule-based modeling (Chylek et al., 2014, 2015) platform for constructing and analyzing complex models of biochemical systems. Models can be constructed in native Python code or imported from various formats, including the Systems Biology Markup Language (SBML) (Hucka et al., 2003). PySB leverages powerful libraries within the Python ecosystem, such as NumPy, SymPy and SciPy (Perez et al., 2011), and provides user-friendly interfaces to numerous third-party simulation and analysis tools, including BioNetGen (Faeder et al., 2009; Harris et al., 2016), KaSim (Suderman and Deeds, 2013) and StochKit (Sanft et al., 2011).

Below, we briefly describe the main features of the PySB/cupSODA interface and showcase its utility by performing run time and sensitivity analyses for three model systems of varying size (Table 1).

Table 1.

Models used for PySB/cupSODA performance testing

Model	Species	Reactions	End time	Output steps
Cell cycle^a	5	7	100	100
Ras/cAMP/PKA^b	33	39	1500	100
EARM^c	77	105	20 000	100

Open in a new tab

Tyson (1991).

Besozzi et al. (2012).

Lopez et al. (2013).

2 Features and implementation

cupSODA is designed to exploit the massive parallelism of the CUDA architecture (Nickolls et al., 2008). To run simulations with cupSODA, one must construct multiple input files containing, e.g. the reaction stoichiometries and the initial species concentrations and rate parameter values for each specified simulation. Numerous simulator-specific parameters must also be defined, such as the number of CUDA ‘blocks’ to use and the desired cupSODA memory configuration (see Supplementary Information). The number of simulations that cupSODA can run in parallel is limited by the number of CUDA ‘cores’ on the GPU (usually a few thousand; see Supplementary Table S1), but the number of simulations that can be loaded onto the GPU at one time is usually many more than this, limited by the available memory (Nobile et al., 2013, 2014).

The PySB/cupSODA interface simplifies and streamlines the use of cupSODA via a CupSodaSimulator class, available within the PySB package. The class constructor accepts the following arguments:

model: A PySB model object (required)
tspan: A list of output time points (default: None)
initials: A list or dictionary of initial species concentrations for each simulation (default: None)
param_values: A list or dictionary of rate parameter values for each simulation (default: None)
verbose: Verbose output (default: False)

The CupSodaSimulator constructor also recognizes numerous keyword arguments (kwargs), such as n_blocks, the number of CUDA blocks and memory_usage, the desired memory configuration. Importantly, default values are defined for each kwarg, removing the need for user input. For example, if a user-defined value is not provided, the number of CUDA blocks is automatically calculated by querying the specifications of the GPU in use.

The CupSodaSimulator.run() method performs the simulations by constructing the cupSODA input files and invoking cupSODA as a subprocess (the method takes tspan, initials and param_values as optional arguments). Additionally, the method reads into a three-dimensional array the results of the simulations (species time courses), which cupSODA outputs to (typically thousands of) separate text files. The user then has the ability to analyze and/or visualize the results using tools available within the Python ecosystem, e.g. plotting the time courses using the Matplotlib library (Perez et al., 2011). For convenience, a run_cupSODA wrapper function has also been implemented that combines invocations of the CupSodaSimulator constructor and run method into a single step. A workflow diagram and example Python script using the run_cupSODA function are provided in Supplementary Figures S1 and S2, respectively.

3 Results

In Figure 1A–C and Supplementary Figure S3, we compare the run time efficiency of PySB/cupSODA to the CPU-bound ODE integrator LSODA, available in the Python package SciPy (Oliphant, 2007), for three example models listed in Table 1 (see Supplementary Information for descriptions). These include models of the eukaryotic cell cycle (Tyson, 1991), the Ras/cAMP/PKA signaling pathway in Saccharomyces cerevisiae (Besozzi et al., 2012), and extrinsically induced apoptosis in mammalian cells (EARM: extrinsic apoptosis reaction model) (Lopez et al., 2013). Run time comparisons show that in all cases SciPy/LSODA is faster for small numbers of simulations but PySB/cupSODA overtakes it for large numbers of simulations, achieving a maximum speedup of approximately one order of magnitude. Comparable speedups are achieved for other memory settings and GPUs (Supplementary Information and Supplementary Figs S4 and S5).

For each model in Table 1, we also performed sensitivity analyses (Fig. 1D and Supplementary Figs S6–S12) by quantifying changes in defined model outputs to variations (±20%) in initial protein concentrations around a set of reference values (Supplementary Tables S2–S4; see Supplementary Information for further details). The ability to efficiently perform such analyses is critical since non-genetic variability within isogenic cell populations has been attributed to significant variations in protein concentrations across cells (Spencer et al., 2009). In Figure 1D and Supplementary Figure S11, we analyze the sensitivity in ‘time-to-death’ in EARM (defined as the time at which Smac reaches 50% cleavage; see Supplementary Information and Supplementary Fig. S10) for a specific set of rate parameters. Our results show that time-to-death is sensitive to the initial levels of six of the 21 proteins considered. Of particular interest is the sensitivity to Bak. The same analysis performed for a different set of rate parameters (Supplementary Fig. S12) shows insensitivity to Bak but sensitivity to Bax. This indicates that the model harbors at least two alternative pathways to apoptosis induction. The analysis comprised 25 410 total simulations and took ∼11 min with PySB/cupSODA and ∼35 min with SciPy/LSODA (for both parameter sets). Similar accelerations were seen for the cell cycle and Ras/cAMP/PKA models (Supplementary Information).

4 Conclusion

The PySB/cupSODA interface provides the modeling community with a high-performance GPU-based kinetic simulator, that can run thousands of parallel simulations on a common desktop workstation, within the easy-to-use framework of a full-fledged, open-source programming and analysis environment in Python. This will greatly accelerate and streamline the process of analyzing complex biochemical models for systems biology applications.

Supplementary Material

Supplementary Data

Click here for additional data file.^{(1.1MB, pdf)}

Acknowledgment

We thank Jeremy L. Muhlich for useful feedback during the implementation of this work.

Funding

This work was supported by the National Science Foundation [MCB-1411482]; the VICC Young Ambassador Award [to C.F.L.]; Vanderbilt Center for Quantitative Sciences and Kleberg Foundation Funds [to C.F.L.]; and the National Institutes of Health Vanderbilt Biomedical Informatics Training Program [NLM 5T15-LM007450-14 to L.A.H.].

Conflict of Interest: none declared.

References

Besozzi D. et al. (2012) The role of feedback control mechanisms on the establishment of oscillatory regimes in the Ras/cAMP/PKA pathway in S. cerevisiae. EURASIP J. Bioinform. Syst. Biol., 2012, 1–17. [DOI] [PMC free article] [PubMed] [Google Scholar]
Chylek L.A. et al. (2014) Rule-based modeling: a computational approach for studying biomolecular site dynamics in cell signaling systems. WIRES Syst. Biol. Med., 6, 13–36. [DOI] [PMC free article] [PubMed] [Google Scholar]
Chylek L.A. et al. (2015) Modeling for (physical) biologists: an introduction to the rule-based approach. Phys. Biol., 12, 045007.. [DOI] [PMC free article] [PubMed] [Google Scholar]
Dematté L, Prandi D (2010) GPU computing for systems biology. Brief. Bioinform., 11, 323–333. [DOI] [PubMed] [Google Scholar]
Eydgahi H. et al. (2013) Properties of cell death models calibrated and compared using Bayesian approaches. Mol. Syst. Biol., 9, 644.. [DOI] [PMC free article] [PubMed] [Google Scholar]
Faeder J.R. et al. (2009) Rule-based modeling of biochemical systems with BioNetGen. Methods Mol. Biol., 500, 113–167. [DOI] [PubMed] [Google Scholar]
Fisher J, Henzinger T.A (2007) Executable cell biology. Nat. Biotechnol., 25, 1239–1249. [DOI] [PubMed] [Google Scholar]
Gutenkunst R.N. et al. (2007) Universally sloppy parameter sensitivities in systems biology models. PLoS Comput. Biol., 3, 1–8.. [DOI] [PMC free article] [PubMed] [Google Scholar]
Harris L.A. et al. (2016) BioNetGen 2.2: advances in rule-based modeling. Bioinformatics, 32, 3366–3368. [DOI] [PMC free article] [PubMed] [Google Scholar]
Hucka M. et al. (2003) The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics, 19, 524–531. [DOI] [PubMed] [Google Scholar]
Kitano H. (2002) Computational systems biology. Nature, 420, 206–210. [DOI] [PubMed] [Google Scholar]
Le Novère N. (2015) Quantitative and logic modelling of molecular and gene networks. Nat. Rev. Genet., 16, 146–158. [DOI] [PMC free article] [PubMed] [Google Scholar]
Lopez C.F. et al. (2013) Programming biological models in Python using PySB. Mol. Syst. Biol., 9, 646.. [DOI] [PMC free article] [PubMed] [Google Scholar]
Nickolls J. et al. (2008) Scalable parallel programming with CUDA. ACM Queue, 6, 40–53. [Google Scholar]
Nobile M.S. et al. (2013) cupSODA: a CUDA-powered simulator of mass-action kinetics. Lect. Notes Comput. Sci., 7979, 344–357. [Google Scholar]
Nobile M.S. et al. (2014) GPU-accelerated simulations of mass-action kinetics models with cupSODA. J. Supercomput., 69, 17–24. [Google Scholar]
Nobile M.S. et al. (2016) Graphics processing units in bioinformatics, computational biology and systems biology. Brief. Bioinform., 2016, bbw058.. [DOI] [PMC free article] [PubMed] [Google Scholar]
Oliphant T.E. (2007) Python for scientific computing. Comput. Sci. Eng., 9, 10–20. [Google Scholar]
Perez F. et al. (2011) Python: An ecosystem for scientific computing. Comput. Sci. Eng., 13, 13–21. [Google Scholar]
Petzold L. (1983) Automatic selection of methods for solving stiff and nonstiff systems of ordinary differential equations. SIAM J. Sci. Stat. Comput., 4, 136–148. [Google Scholar]
Sanft K.R. et al. (2011) StochKit2: software for discrete stochastic simulation of biochemical systems with events. Bioinformatics, 27, 2457–2458. [DOI] [PMC free article] [PubMed] [Google Scholar]
Spencer S.L. et al. (2009) Non-genetic origins of cell-to-cell variability in TRAIL-induced apoptosis. Nature, 459, 428–432. [DOI] [PMC free article] [PubMed] [Google Scholar]
Suderman R, Deeds E.J (2013) Machines vs. ensembles: effective MAPK signaling through heterogeneous sets of protein complexes. PLoS Comput. Biol., 9, e1003278.. [DOI] [PMC free article] [PubMed] [Google Scholar]
Tyson J.J. (1991) Modeling the cell division cycle: cdc2 and cyclin interactions. Proc. Natl. Acad. Sci. USA, 88, 7328–7332. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary Data

Click here for additional data file.^{(1.1MB, pdf)}

[btx420-B1] Besozzi D. et al. (2012) The role of feedback control mechanisms on the establishment of oscillatory regimes in the Ras/cAMP/PKA pathway in S. cerevisiae. EURASIP J. Bioinform. Syst. Biol., 2012, 1–17. [DOI] [PMC free article] [PubMed] [Google Scholar]

[btx420-B2] Chylek L.A. et al. (2014) Rule-based modeling: a computational approach for studying biomolecular site dynamics in cell signaling systems. WIRES Syst. Biol. Med., 6, 13–36. [DOI] [PMC free article] [PubMed] [Google Scholar]

[btx420-B3] Chylek L.A. et al. (2015) Modeling for (physical) biologists: an introduction to the rule-based approach. Phys. Biol., 12, 045007.. [DOI] [PMC free article] [PubMed] [Google Scholar]

[btx420-B4] Dematté L, Prandi D (2010) GPU computing for systems biology. Brief. Bioinform., 11, 323–333. [DOI] [PubMed] [Google Scholar]

[btx420-B5] Eydgahi H. et al. (2013) Properties of cell death models calibrated and compared using Bayesian approaches. Mol. Syst. Biol., 9, 644.. [DOI] [PMC free article] [PubMed] [Google Scholar]

[btx420-B6] Faeder J.R. et al. (2009) Rule-based modeling of biochemical systems with BioNetGen. Methods Mol. Biol., 500, 113–167. [DOI] [PubMed] [Google Scholar]

[btx420-B7] Fisher J, Henzinger T.A (2007) Executable cell biology. Nat. Biotechnol., 25, 1239–1249. [DOI] [PubMed] [Google Scholar]

[btx420-B8] Gutenkunst R.N. et al. (2007) Universally sloppy parameter sensitivities in systems biology models. PLoS Comput. Biol., 3, 1–8.. [DOI] [PMC free article] [PubMed] [Google Scholar]

[btx420-B9] Harris L.A. et al. (2016) BioNetGen 2.2: advances in rule-based modeling. Bioinformatics, 32, 3366–3368. [DOI] [PMC free article] [PubMed] [Google Scholar]

[btx420-B10] Hucka M. et al. (2003) The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics, 19, 524–531. [DOI] [PubMed] [Google Scholar]

[btx420-B11] Kitano H. (2002) Computational systems biology. Nature, 420, 206–210. [DOI] [PubMed] [Google Scholar]

[btx420-B12] Le Novère N. (2015) Quantitative and logic modelling of molecular and gene networks. Nat. Rev. Genet., 16, 146–158. [DOI] [PMC free article] [PubMed] [Google Scholar]

[btx420-B13] Lopez C.F. et al. (2013) Programming biological models in Python using PySB. Mol. Syst. Biol., 9, 646.. [DOI] [PMC free article] [PubMed] [Google Scholar]

[btx420-B14] Nickolls J. et al. (2008) Scalable parallel programming with CUDA. ACM Queue, 6, 40–53. [Google Scholar]

[btx420-B15] Nobile M.S. et al. (2013) cupSODA: a CUDA-powered simulator of mass-action kinetics. Lect. Notes Comput. Sci., 7979, 344–357. [Google Scholar]

[btx420-B16] Nobile M.S. et al. (2014) GPU-accelerated simulations of mass-action kinetics models with cupSODA. J. Supercomput., 69, 17–24. [Google Scholar]

[btx420-B17] Nobile M.S. et al. (2016) Graphics processing units in bioinformatics, computational biology and systems biology. Brief. Bioinform., 2016, bbw058.. [DOI] [PMC free article] [PubMed] [Google Scholar]

[btx420-B18] Oliphant T.E. (2007) Python for scientific computing. Comput. Sci. Eng., 9, 10–20. [Google Scholar]

[btx420-B19] Perez F. et al. (2011) Python: An ecosystem for scientific computing. Comput. Sci. Eng., 13, 13–21. [Google Scholar]

[btx420-B20] Petzold L. (1983) Automatic selection of methods for solving stiff and nonstiff systems of ordinary differential equations. SIAM J. Sci. Stat. Comput., 4, 136–148. [Google Scholar]

[btx420-B21] Sanft K.R. et al. (2011) StochKit2: software for discrete stochastic simulation of biochemical systems with events. Bioinformatics, 27, 2457–2458. [DOI] [PMC free article] [PubMed] [Google Scholar]

[btx420-B22] Spencer S.L. et al. (2009) Non-genetic origins of cell-to-cell variability in TRAIL-induced apoptosis. Nature, 459, 428–432. [DOI] [PMC free article] [PubMed] [Google Scholar]

[btx420-B23] Suderman R, Deeds E.J (2013) Machines vs. ensembles: effective MAPK signaling through heterogeneous sets of protein complexes. PLoS Comput. Biol., 9, e1003278.. [DOI] [PMC free article] [PubMed] [Google Scholar]

[btx420-B24] Tyson J.J. (1991) Modeling the cell division cycle: cdc2 and cyclin interactions. Proc. Natl. Acad. Sci. USA, 88, 7328–7332. [DOI] [PMC free article] [PubMed] [Google Scholar]

PERMALINK

GPU-powered model analysis with PySB/cupSODA

Leonard A Harris

Marco S Nobile

James C Pino

Alexander L R Lubbock

Daniela Besozzi

Giancarlo Mauri

Paolo Cazzaniga

Carlos F Lopez

Roles

Abstract

Summary

Availability and implementation

Supplementary information

1 Introduction

Table 1.

2 Features and implementation

3 Results

Fig. 1.

4 Conclusion

Supplementary Material

Acknowledgment

Funding

References

Associated Data

Supplementary Materials

ACTIONS

PERMALINK

RESOURCES

Cite

Add to Collections

PERMALINK

GPU-powered model analysis with PySB/cupSODA

Leonard A Harris

Marco S Nobile

James C Pino

Alexander L R Lubbock

Daniela Besozzi

Giancarlo Mauri

Paolo Cazzaniga

Carlos F Lopez

Roles

Abstract

Summary

Availability and implementation

Supplementary information

1 Introduction

Table 1.

2 Features and implementation

3 Results

Fig. 1.

4 Conclusion

Supplementary Material

Acknowledgment

Funding

References

Associated Data

Supplementary Materials

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases