Recently, it has been proposed that more complex models should routinely be compared to empirical or simpler models1; often, simple models can predict specific emergent properties of complex systems as well as more complex models can, which calls into question the added value of the latter. This perspective discusses this important point in the context of the purpose of modeling.
Discussion
Arguably, pharmaceutical industry productivity continues to decline, and some suggest it will soon cross the threshold of zero net return on investment.2 Analyses of this decline conclude that phase II failure is the key event, which shows that we did not understand the consequences of perturbing a biological system with a xenobiotic. Although it can be argued that novel technology will help,2 precedent indicates this can only achieve so much. Thus, the need for an improved understanding of the complexity of human disease biology remains. Quantitative Systems Pharmacology (QSP) is gaining traction as a tool to tackle this problem. However, one reasonable criticism of this approach is that confidence in such highly complex models is hard to quantify. Biological data are incomplete, constantly evolving, and potentially incorrect; thus, how can we be confident in models built upon this foundation, what is the added value, and how can the effort be resourced to deliver insight in a timely way?
To answer these questions, we need to think about the purpose of building models.
Simple models have been in use for decades in pharmaceutical research, typically as pharmacokinetic/pharmacodynamic (PK/PD) models. These have been successful in improving phase II/phase III efficiency but have had a limited effect on translational efficiency.3 One reason for this may be that empirical models parameterized with significant population data are good for extrapolating to the next clinical phase or patient cohort. Mechanistic insight is not necessarily required, as the empirical and probabilistic suffice. In contrast, extrapolating from a preclinical observation in an animal model or in vitro dataset is an entirely different proposition; it is a type of “far extrapolation” vs. the “near extrapolation” of interpatient prediction. Thus, questions arise as to whether we clearly understand how to extrapolate preclinical PK/PD or, indeed, whether the data themselves lack translational validity. Put simply, is the biology in the animal model similar enough to human disease to inform a useful prediction or not? Attrition data alone would indicate not.
Thus, a clear need exists for another methodology to extrapolate from the preclinical data and hypothesis. A logical step would be to explore the utility of more complex mathematical (e.g., QSP) models. In contrast to the empirical, the aim of QSP models is typically to generate mechanistic insight that can aid decision making. However, what can we do with a more complex model that we cannot do with a simple model?
Things we can do with a big model that we cannot with a simple/empirical model
Tools to investigate the drug targets in a specific pathway
As an example, consider the nerve growth factor (NGF) pathway currently of interest in drug discovery. QSP models of the NGF pathway have been developed using preclinical data.4 A sensitivity analysis of such a model identified NGF, TrkA kinase, and Ras as the optimal drug targets in the pathway and suggested efficacious doses for NGF and TrkA inhibitors. These predictions differed significantly from standard empirical predictions but have subsequently been supported by clinical data. The clinically efficacious dose for the NGF‐binding monoclonal antibody tanezumab was predicted by the QSP model to be ~10 mg, as was subsequently established via phase II clinical trials.5 The model predicted that TrkA kinase is also a target, but that > 99% maintained inhibition would be required to achieve efficacy on par with anti‐NGF monoclonal antibodies. This conclusion was recently supported by clinical trial data for PF‐06273340.6 Finally, the model predicted that Ras/Gap is one of the most important control points in the pathway. Human genetic evidence shows that patients bearing a loss of function mutation in neuronal Ras/Gap exhibit a chronic pain phenotype.7 Thus, the information content of the QSP model has led to targets and associated dose predictions that have been verified by clinical data. In this respect, the complex model “wins.” Simple PK/PD models have also been used to assist decision making for the clinical development of tanezumab.8 Sufficient understanding to extrapolate across patient groups can be achieved with a simple model and, thus, the simple “wins.” This does not show that a simple model is better than a more complex one but, rather, that they are different tools addressing different questions. One is focused on our understanding of the pathway biology; the other relates population PK/PD to a pain score in order to extrapolate dose response to the next patient cohort.
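To make the idea of a sensitivity analysis concrete, the following is a minimal sketch of a local, normalized sensitivity calculation. It is not the published NGF model: the three-step cascade, the function names (`pathway_output`, `normalized_sensitivities`), and all parameter names and values are invented purely for illustration.

```python
"""Toy local sensitivity analysis on a hypothetical ligand/receptor/kinase cascade."""


def pathway_output(p):
    """Steady-state response of an illustrative (not NGF-specific) cascade."""
    # Fraction of receptor occupied by ligand (simple saturable binding).
    occupancy = p["ligand"] / (p["ligand"] + p["kd"])
    # Fraction of kinase in the active state: activation driven by occupancy,
    # balanced against first-order deactivation.
    activation = p["k_act"] * occupancy
    active_kinase = activation / (activation + p["k_deact"])
    # Downstream response scales linearly with active kinase.
    return p["gain"] * active_kinase


def normalized_sensitivities(model, params, rel_step=0.01):
    """Central-difference estimate of d ln(output) / d ln(parameter)."""
    base = model(params)
    result = {}
    for name, value in params.items():
        up = dict(params, **{name: value * (1 + rel_step)})
        down = dict(params, **{name: value * (1 - rel_step)})
        result[name] = (model(up) - model(down)) / (2 * rel_step * base)
    return result


# All values below are arbitrary placeholders, not measured quantities.
params = {"ligand": 1.0, "kd": 5.0, "k_act": 2.0, "k_deact": 1.0, "gain": 10.0}
sens = normalized_sensitivities(pathway_output, params)

# Rank parameters by the magnitude of their influence on the output.
ranked = sorted(sens.items(), key=lambda kv: abs(kv[1]), reverse=True)
for name, s in ranked:
    print(f"{name:8s} {s:+.3f}")
```

Parameters with the largest absolute sensitivity are the candidate control points in this toy system. A real QSP analysis would typically work with time-course simulations of many more states and may use global rather than local methods.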
Store mixed data on structure, components, and process
A unique property of QSP models is that they enable mixed multiscale data types to be collected and summarized in one place. This can be subdivided into the tasks of capturing data, codifying data, clarifying data, and ultimately calculating or quantifying the implications (Figure 1 a). The result is a concise summary of all of the information a given project team believes to be the relevant biology. The pathways and interactions can be displayed graphically (Figure 1 b), facilitating discussions with domain experts. Pathways and parameters can be linked to sources, which allows rapid interrogation of the underlying data. Thus, such models act as a single repository of institutional information that is simple to access and easily updated and can prevent the loss of institutional know‐how. Empirical models cannot enable this type of mixed‐data capture and, hence, the complex “wins.”
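As a rough illustration of linking parameters to their sources, the sketch below stores each value alongside its unit and provenance. The class names (`SourcedParameter`, `ModelRepository`), the parameter names, the values, and the source strings are all hypothetical placeholders and do not describe any particular QSP software.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class SourcedParameter:
    """One model parameter together with the provenance of its value."""
    name: str
    value: float
    unit: str
    source: str = ""  # citation, report ID, or URL; empty if not yet traced
    note: str = ""


@dataclass
class ModelRepository:
    """Minimal store that keeps parameters and their sources together."""
    parameters: Dict[str, SourcedParameter] = field(default_factory=dict)

    def add(self, parameter: SourcedParameter) -> None:
        self.parameters[parameter.name] = parameter

    def get(self, name: str) -> SourcedParameter:
        return self.parameters[name]

    def unsourced(self) -> List[str]:
        """Names of parameters that still lack a documented source."""
        return sorted(n for n, p in self.parameters.items() if not p.source)


repo = ModelRepository()
# Values and sources here are invented placeholders for illustration only.
repo.add(SourcedParameter("kd_ligand_receptor", 1.0, "nM",
                          source="PLACEHOLDER: binding assay report",
                          note="invented value"))
repo.add(SourcedParameter("k_deact", 0.5, "1/h"))

print(repo.get("kd_ligand_receptor").source)
print(repo.unsourced())
```

Keeping the source next to the value means a reviewer can ask where a number came from, and the `unsourced` check flags values that still rest on expert opinion or assumption.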
Figure 1.

Added value of more complex models. (a) The “four C” value diamond of typical complex Quantitative Systems Pharmacology (QSP) models. In stage 1, input data are collected. These can come from text mining of literature corpora (both automated and manual). In addition, domain expert opinion should be utilized. In stage 2, these data are captured and codified in the model structure. Parameter and reactant values are hyperlinked to sources, thus preventing the loss of institutional data. To ensure scalability, ontologies may be used. In stage 3, a graphical user interface (GUI) of the model is presented to domain experts to initiate a dialogue and clarify the accuracy of the model. Finally, in stage 4, the model can be used for calculations, such as calibration, simulation, and sensitivity analysis exercises. The diamond can be reinitiated as new data emerge. Gray arrows indicate the typical order of execution of the stages. (b) An example representation of a QSP model for Alzheimer disease (AD). (Image reprinted from ref. 10, CPT: Pharmacometrics & Systems Pharmacology https://doi.org/10.1002/psp4.12351, image is licensed under CC BY‐NC‐ND 4.0. ©2018 The authors.) The visual representation of compartments, reactions, and reactants allows cross‐discipline dialogue concerning the model. The GUI can be examined, as shown, at the level of the holistic model, or specific areas can be visualized. APP, amyloid beta precursor protein; BACE1, Beta‐secretase 1; CSF, cerebrospinal fluid; PK, pharmacokinetic; S1PR5, Sphingosine‐1‐phosphate receptor 5.
Model reduction: large models can be reduced but simple/empirical models cannot necessarily describe new data
There are several examples of successful model reduction; for example, the full (complex pathway) NGF model was reduced from 99 to 11 state variables.9 In terms of simulating a given response to NGF pathway stimulation, the two models perform equally well and, in this case, the simpler model “wins.” However, some information content of the full model is lost. At a simple level, the known biological pathway information is replaced by a series of input/output boxes. This has pros and cons; a pro may be that the complexity is rendered simpler to view. A con is that the known true pathway connections are lost and parameters that are linked to external data sources are lumped. At a quantitative level, the individual key controlling elements cannot be identified in the reduced model. From a drug discovery perspective, this is valuable information content, as discussed earlier.
It is also important to note that this reduction is a closed process, in the sense that details can be lumped and expanded, but those that were not in the model originally cannot necessarily be inferred (Figure 2). Following on from this, an advantage of multispecies QSP models is that they can be calibrated to and can simulate multiple end points (Figure 2). In contrast, an empirical model is typically restricted to a limited number of emergent properties. In addition, if complex models can be lumped efficiently, then simple empirical models can be produced as required from more complex models (e.g., during clinical trials to fit clinical emergent property data and to simulate clinical trial designs).
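The trade‐off described above can be illustrated with a deliberately small example. This sketch assumes a hypothetical linear chain A → B → C with first‐order rate constants `k1` and `k2`, where the second step is much faster than the first; it is not the reduction algorithm of ref. 9. The full model tracks the intermediate B, whereas the reduced model lumps the two steps into one.

```python
import math


def full_model(t, k1, k2):
    """Analytic solution of A -> B -> C starting from A = 1 (requires k1 != k2)."""
    a = math.exp(-k1 * t)
    b = k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))
    c = 1.0 - a - b  # mass balance
    return a, b, c


def reduced_model(t, k1):
    """Lumped model A -> C: the fast intermediate B is no longer represented."""
    a = math.exp(-k1 * t)
    return a, 1.0 - a


# Arbitrary illustrative rate constants; the second step is 100-fold faster.
k1, k2 = 0.1, 10.0
times = [0.5 * i for i in range(101)]

# Compare the emergent output C between the full and the reduced model.
max_gap = max(
    abs(full_model(t, k1, k2)[2] - reduced_model(t, k1)[1]) for t in times
)
print(f"largest difference in output C: {max_gap:.4f}")
```

Both models give nearly the same time course for the output C, but only the full model can report a time course for B. A question such as "what happens if the second step is inhibited?" can therefore be asked of the full model and not of the reduced one.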
Figure 2.

Model A has three interlinked components, each describing the behavior of one or more reactants (e.g., binding proteins, enzymes, receptors, etc.). Model A can be reduced to model B with n components, where n < 3. Model B can be expanded back to give model A. Models A and B can simulate emergent property x, and model A can simulate time courses for the reactants in components 1–3. In this example, new data reveal that a new component θ exists and is interlinked with components 1 and 2. This is integrated to give model C. Model C can be reduced to model D with m components (m < 4), and the reverse. Models C and D can simulate emergent property y, and model C can simulate reactant time courses for components 1–3 and θ. It is possible that model C can also simulate emergent property x and reactants 1–3. Model A may not necessarily simulate emergent property y or θ. Black dashed arrows represent links between components, which could contain one or more reactants. Black solid arrows represent models that can be interchanged. Gray arrows indicate the simulations that could be produced. Dashed gray lines are dependent upon the influence of the new data θ.
Enable an enquiry into biological complexity
There are now many pathways for which the structure and reactions are in part agreed (e.g., the NGF pathway). A logical step is, therefore, to build models that reflect this agreed biology as closely as possible, rather than an abstraction of it. We may not currently understand what this is telling us, but this approach gives the best possible capture of the biology and, hence, an optimal chance of extracting useful knowledge. The example of model reduction for the NGF pathway model mentioned previously illustrates the point; nature has evolved a pathway for the NGF pain response containing multiple steps. Model reduction can lump these without loss of emergent property prediction. The question this raises, though, is the following: if a response can be produced with fewer steps, why did evolution not eliminate the redundant steps (proteins)? Making proteins requires energy, and biology tends to eliminate wasted energy expenditure. This would lead to the conclusion that the additional complexity has a purpose we are not aware of, for example, to create necessary robustness or a link to another pathway. Alternatively, could this be an example of inefficiency in evolution? In short, we do not know, but the complex model at least allows us to ask this crucially important question. In this regard, the complex “wins.”
Conclusions
Model predictions are dependent upon the assumptions inherent in them. As questions become more focused, models become simpler, and calibration datasets become richer, the risk of models providing misleading conclusions arguably decreases. A reasonable criticism of QSP models is that the influence of unknown‐unknowns and limited‐quality input data unacceptably increases the risk of using such models to explore complex biological questions. However, all models are “wrong,” and history is rich with examples of incorrect models leading to productive discussion and to a more detailed and realistic model. The Ptolemaic model of the universe was used to calculate planetary movements with some success for 1,500 years, before lack of concordance with key observations led to the current heliocentric model. Incorrect models can be powerful in scientific discovery, provided they are seen as tools to explore and are tested, debated, and revised systematically.
Overall, it is apparent that simple or empirical models “win” in some cases (simplicity, amenability to incorporating statistical parameters, ability to simulate an end point), but complex models “win” in others (richer information content, clearer link to actual biology, potential to gain mechanistic insight). The question then becomes: how do we assess relative value? An alternative view is that neither can “win,” merely that complex and simple/empirical models have different but complementary purposes. Thus, the model should be chosen for the use case. QSP models can perhaps best be looked at as tools to explore our understanding of disease biology in the earlier stages of drug discovery. As programs advance into the phase II and III domain, the questions change from “is this the optimal target?” to “how do we optimize dose, regimen, and patient numbers?” This latter question can be answered with a simple/empirical model. Indeed, this reduced model could be derived from the earlier complex QSP model using model reduction techniques and, thus, perhaps one is a natural evolution of the other.
Funding
No funding was received for this work.
Conflict of Interest
Neil Benson is an employee of Certara.
Acknowledgments
The author would like to thank Piet van der Graaf and Cesar Pichardo for valuable feedback.
References
- 1. Mistry, H.B. QSP versus the rest: let the competition commence! CPT Pharmacometrics Syst. Pharmacol. 7, 490 (2018).
- 2. Stott, K. Pharma's broken business model — Part 2: scraping the barrel in drug discovery. <https://endpts.com/pharmas-broken-business-model-an-industry-on-the-brink-of-terminal-decline/> (2018).
- 3. Milligan, P.A. et al. Model‐based drug development: a rational approach to efficiently accelerate drug development. Clin. Pharmacol. Ther. 93, 502–514 (2013).
- 4. Benson, N. et al. Systems pharmacology of the nerve growth factor pathway: use of a systems biology model for the identification of key drug targets using sensitivity analysis and the integration of physiology and pharmacology. Interface Focus 3, 20120071 (2013).
- 5. Lane, N.E. et al. Tanezumab for the treatment of pain from osteoarthritis of the knee. N. Engl. J. Med. 363, 1521–1531 (2010).
- 6. Loudon, P. et al. Demonstration of an anti‐hyperalgesic effect of a novel pan‐Trk inhibitor PF‐06273340 in a battery of human evoked pain models. Br. J. Clin. Pharmacol. 84, 301–309 (2018).
- 7. Gereau, R.W.T. Neurofibromatosis pain is in the membrane. Focus on “sensory neurons from Nf1 haploinsufficient mice exhibit increased excitability.” J. Neurophysiol. 94, 3659–3660 (2005).
- 8. Rujia Xie, R.A., Olson, S. & Marshall, S. Population Pharmacokinetic/Pharmacodynamic (Pk/Pd) Analysis Of The Effect Of Tanezumab On Overall Daily Pain Score Data In Adults With Moderate‐To‐Severe Pain Due To Osteoarthritis Of The Knee (Oxford University Press, Oxford, UK, 2009).
- 9. Snowden, T.J., van der Graaf, P.H. & Tindall, M.J. A combined model reduction algorithm for controlled biochemical systems. BMC Syst. Biol. 11, 17 (2017).
- 10. Clausznitzer, D. et al. Quantitative systems pharmacology model for Alzheimer disease indicates targeting sphingolipid dysregulation as potential treatment option. CPT Pharmacometrics Syst. Pharmacol. 7, 759–770 (2018).
