Abstract
The case is made for a much closer synergy between climate science, numerical analysis and computer science.
This article is part of a discussion meeting issue ‘Numerical algorithms for high-performance computational science’.
Keywords: climate modelling, stochastic parametrization, low-precision modelling, artificial intelligence, exascale computing
Numerical modelling is an essential part of modern science, and many researchers in both the physical and the biological sciences use numerical models, e.g. when the underpinning equations are too complicated to be solved analytically. To have confidence that the numerical results are faithful to these underpinning equations, it is in principle important for such scientists to understand the numerical techniques used in the models. In practice, however, many will simply assume that the numerics have been sufficiently well tested on similar systems and are therefore adequate. Moreover, if the time to solution became too long, many scientists would have little interest in redesigning the code to make it more efficient; instead, the task would be passed to numerical analysts trained in algorithmic design, and to computer scientists whose job is to implement the algorithms efficiently on the available hardware. In this way, a disconnect has developed between the scientists who use numerical models and those who develop and implement the numerical algorithms which underpin these models.
Focusing on climate modelling, I will argue that for some problems this disconnect is creating an obstacle to progress. More specifically, I will argue that the existence of a definable demarcation between the vast majority of climate scientists, who simply use models (e.g. to test hypotheses), and the numerical analysts and computer scientists who develop these models is holding back the development of climate science. Given the critical importance of reliable climate predictions for helping society become resilient to the changing extremes of weather and climate, society is the ultimate loser from this state of affairs. My own view is that the ideal climate scientist of the future will be something of a polymath: knowledgeable about climate theory, computer science and numerical methods.
A key reason for making such an assertion is that the traditional paradigm in numerical analysis does not really apply in climate science. This paradigm can be framed as follows: if I, the scientist, have a problem that I want solved to some prescribed level of accuracy, then I look to you, the numerical analyst, to give me an algorithm that will achieve this level of accuracy, and to you, the computer scientist, either to implement this algorithm on given hardware or to tell me what type of hardware I will need to implement it.
For climate, no one knows of an algorithm that will allow me to simulate climate to any given reasonable level of accuracy (1, 5 or 10%, say). As a measure of accuracy, let me demand that the algorithm simulates the climate of the last 50 years (for which there are reasonably plentiful observations) without substantial bias or systematic error when compared with observations. For example, one can estimate from observations of the last 50 years the average seasonal-mean rainfall in different regions of the world. I may reasonably demand of the numerical analyst that the systematic error in the simulation of such rainfall amounts be substantially smaller than the model's seasonal-mean rainfall response to the anticipated change in atmospheric CO2 concentration between the late twentieth and the late twenty-first century. That is to say, I may reasonably require some suitable ‘bias-to-signal’ ratio to be a small dimensionless number, say 0.1.
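To make the criterion concrete, here is a minimal sketch (in Python, with illustrative variable names and averaging choices that are not prescribed in the text) of how such a bias-to-signal ratio might be computed for a regional seasonal-mean quantity such as rainfall:

```python
import numpy as np

def bias_to_signal(model_hist, obs_hist, model_future):
    """Bias-to-signal ratio for one region's seasonal-mean rainfall.

    model_hist   : model seasonal means over the historical (last ~50 year) period
    obs_hist     : observed seasonal means over the same period
    model_future : model seasonal means under the late-21st-century CO2 scenario

    The argument names and the simple averaging are illustrative assumptions.
    """
    bias = np.mean(model_hist) - np.mean(obs_hist)        # systematic error
    signal = np.mean(model_future) - np.mean(model_hist)  # climate-change response
    return abs(bias) / abs(signal)

# The requirement in the text: this ratio should be small, say below 0.1.
```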
For reasons discussed below, no existing set of algorithms comes close to being able to simulate the statistical properties of the variables that determine our climate (rainfall, temperature, wind and so on) to such a level of accuracy. For some of the contemporary climate models used in the IPCC AR5 report [1], this bias-to-signal ratio can exceed a factor of 10 for some variables and regions [2]—a hundred times bigger than what we might desire. This reflects the fact that simulating climate accurately is an exceptionally challenging problem. We are a long way from passing the ‘Climate Turing Test’ [3] where we cannot easily tell whether we are looking at a simulation of climate, rather than the real thing.
Hence, instead of seeking algorithms which solve the climate simulation problem to some prescribed tolerance, we must instead ask: what algorithms can reduce systematic errors to the smallest possible values while not being so expensive as to be unaffordable? The notion of what is affordable raises the question of how much society values accurate forecasts of climate, and whether climate simulation and prediction is of such societal importance that it should be funded at a similar level of ambition to something like the Large Hadron Collider [4]. We do not answer such questions here, but nevertheless suggest that the development of algorithms which run in a reasonable time (say a day of wall-clock time for a decade of simulated climate) on a dedicated exascale high-performance computer would appear a plausible objective for the design of the next generation of climate simulation and prediction models.
Since it is believed that the underpinning equations of motion on which the algorithms are based are correct, an inability to solve these equations numerically without significant bias must arise from the fact that existing numerical representations are not sufficiently accurate. In practice, this means that the resolution of existing models is inadequate. As such, it should be possible to reduce these errors by increasing the resolution of the model sufficiently, thereby eliminating some of the semi-empirical subgrid parametrizations [5] that are required to represent processes which are not resolvable on the chosen grid. Currently, many climate models have horizontal grid-point spacings of around 100 km, though some can run with finer grids of around 20 km. If it were possible to reduce the grid spacing to around 1 km, it would be possible to eliminate the parametrizations of deep convection and orographic gravity-wave drag in the atmosphere, and of mesoscale eddy mixing in the oceans, and instead represent these processes with the proper laws of physics. Unfortunately, even with the anticipated increase in computing speed brought about by exascale computing, we will still be a couple of orders of magnitude short of the processing capability needed to reach a 1 km grid. We need to find further cost savings.
The core of a weather or climate model (sometimes referred to as the dynamical core) is its computational representation of the dynamical equations of fluid motion: the nonlinear Navier–Stokes partial differential equations, the laws of thermodynamics and the relevant conservation laws (e.g. for mass). Conventionally, these equations are solved numerically by projecting them onto some finite set of basis functions (e.g. spherical harmonics) and representing all unresolved processes by parametrizations: deterministic formulae slaved to the smallest-scale variables in the dynamical core. These parametrizations are not computationally cheap. In a contemporary numerical weather prediction model, about 50% of the overall computational cost of integrating the model arises from the parametrization schemes. For an Earth System Model (a climate model which includes comprehensive representations of the cryosphere and the biosphere), such parametrizations can account for the overwhelming majority of the computational cost of running the model.
The problem with this approach is that such a hard truncation does not respect one of the key symmetries of the Navier–Stokes equations, the so-called scaling symmetries [6]. If u(x,t) is the velocity field and p(x,t) is the pressure field associated with a solution to the Navier–Stokes equations, then
u_λ(x, t) = λ u(λx, λ²t)  and  p_λ(x, t) = λ² p(λx, λ²t)    (1.1)
are velocity and pressure fields which also solve the Navier–Stokes equations, where λ > 0 is a dimensionless scaling parameter. Such scaling symmetry provides the basis for the self-similar nature of fluid turbulence and is consistent with the ubiquitous existence of power-law behaviour in multi-scale observations of the atmosphere and oceans [7,8].
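For completeness, a quick check (not in the original text) that the rescaled fields do solve the equations: substituting u_λ and p_λ into the incompressible Navier–Stokes equations, every term acquires the same factor λ³, so the equations are left unchanged and the divergence-free condition is preserved:

```latex
\[
\begin{aligned}
\partial_t u_\lambda + (u_\lambda\cdot\nabla)\,u_\lambda
   &= \lambda^{3}\bigl[\partial_t u + (u\cdot\nabla)u\bigr](\lambda x,\lambda^{2}t),\\
-\nabla p_\lambda + \nu\,\Delta u_\lambda
   &= \lambda^{3}\bigl[-\nabla p + \nu\,\Delta u\bigr](\lambda x,\lambda^{2}t),\\
\nabla\cdot u_\lambda &= \lambda^{2}\,(\nabla\cdot u)(\lambda x,\lambda^{2}t)=0 .
\end{aligned}
\]
```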
A way to mitigate such numerical violation of these scaling symmetries is to make the subgrid parametrizations stochastic, where the stochasticity has spatio-temporal correlations which extend into the resolved spectrum of the model [9–11]. Research over the years has demonstrated that climate models with stochastic parametrization often have reduced systematic errors compared with the traditional deterministic numerical approach [12–14]. Stochastic parametrization is an important element of probabilistic ensemble prediction systems, providing explicit representations of model uncertainty in the ensembles. Without stochastic parametrization, ensembles would have insufficient spread and forecast probabilities would be overconfident [11].
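To fix ideas, here is a minimal Python sketch of a multiplicative stochastically perturbed tendency in the spirit of schemes such as that of [9]. Only temporal (AR(1)) correlation is included; operational schemes also impose spatial correlations (e.g. via a spectral pattern generator), and the parameter values below are illustrative assumptions rather than those of any operational model:

```python
import numpy as np

def ar1_pattern(n_points, n_steps, tau=10.0, sigma=0.3, seed=0):
    """Temporally correlated (AR(1)) perturbation pattern, one value per grid point.

    tau   : decorrelation time in model time steps (assumed value)
    sigma : stationary standard deviation of the pattern (assumed value)
    """
    rng = np.random.default_rng(seed)
    phi = np.exp(-1.0 / tau)                   # lag-1 autocorrelation
    r = np.zeros((n_steps, n_points))
    for t in range(1, n_steps):
        shocks = rng.normal(0.0, sigma * np.sqrt(1.0 - phi**2), n_points)
        r[t] = phi * r[t - 1] + shocks         # keeps the stationary variance at sigma**2
    return r

def perturbed_tendency(deterministic_tendency, r):
    """Multiply the parametrized tendency by (1 + r), with r bounded so that the
    perturbed tendency keeps the sign of the deterministic one."""
    return (1.0 + np.clip(r, -0.9, 0.9)) * deterministic_tendency
```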
If a model is integrated with stochastic rather than deterministic parametrizations, the stochasticity will percolate up the spectrum of spherical harmonics, eventually affecting even the largest scales. This raises an immediate question: if there are scientific reasons to treat the numerical representation of the Navier–Stokes equations in a partially stochastic manner, is it really necessary to integrate the equations on traditional deterministic, bit-reproducible hardware, especially since this constrains the type of supercomputing hardware which can be used for climate prediction [15]? Of course, many parts of the computational process do have to be performed precisely, and there are good reasons to want determinism when developing a model prior to operational release. However, in a stochastic model, many of the lower-order bits in the significands of floating-point variables can tolerate stochastic perturbations, such as those caused by thermal noise inside the chips [16], without inducing a significant loss of skill in probabilistic forecasts. If the computational energy saved can be reinvested to improve model resolution, then imprecision can be used to increase forecast accuracy, paradoxical as that might sound at first sight. Although stochastic chips may yet have their day in high-performance computing, at present the main constraint on overall performance is not flop rate but energy consumption, and the principal determinant of energy consumption is the amount of data transported inside an HPC system: from processor to processor and from processor to memory.
Traditionally (at least over the professional research time scale of the author), the variables in a weather or climate model have been represented by 64-bit double-precision floating-point numbers. However, the presence of stochasticity in climate model parametrizations suggests that the useful information in these variables may be much less than 64 bits. Instead of transporting all 64 bits, how much energy could be saved by transporting only the bits which carry useful information? With the explosive interest in artificial intelligence, mixed-precision chips have been developed which can perform arithmetic efficiently with 16-, 32- and 64-bit floating-point representations of real-number variables. Over the past few years, the author and colleagues have been developing a research version of the spectral dynamical core of the European Centre for Medium-Range Weather Forecasts Integrated Forecasting System model, in which the majority of the model is coded with 16-bit (rather than 64-bit) floating-point reals. To do this, an emulator of reduced-precision arithmetic was coded [17], and research has focused primarily on reducing the number of significand bits. A spectral model is particularly interesting in this respect because numerical precision can be made scale-dependent (i.e. precision can be reduced with increasing wavenumber). Results are extremely encouraging: an optimally reduced, scale-dependent precision has been developed in which most of the high wavenumbers (those which contribute most to the computational load) can be run with 7-bit significands. Typically, the difference between the reduced-precision and double-precision models is smaller than the impact of the stochastic parametrization scheme [18–20]. On top of this, preliminary results [21] indicate that it should also be possible to integrate the parametrizations with 16-bit precision without significant loss of forecast accuracy.
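The emulator itself is described in [17]; the sketch below is not that code, but it illustrates the two ideas in Python: rounding a variable to a reduced number of significand bits, and letting the number of retained bits fall with spectral wavenumber (the bit counts and the cutoff wavenumber are illustrative assumptions):

```python
import numpy as np

def round_significand(x, bits):
    """Round x to roughly 'bits' significand bits (round-to-nearest).

    Minimal emulation of reduced precision: split off the exponent, round the
    mantissa to the requested number of binary digits, and reassemble.
    """
    x = np.asarray(x, dtype=np.float64)
    mantissa, exponent = np.frexp(x)          # x = mantissa * 2**exponent, |mantissa| in [0.5, 1)
    scale = 2.0 ** bits
    return np.ldexp(np.round(mantissa * scale) / scale, exponent)

def scale_dependent_round(coeffs, wavenumbers, cutoff=21,
                          large_scale_bits=52, small_scale_bits=7):
    """Keep near-double precision at low wavenumbers and ~7 significand bits at
    high wavenumbers -- the scale-selective idea tested in refs [19,20], with
    illustrative settings."""
    coeffs = np.asarray(coeffs, dtype=np.float64)
    out = np.empty_like(coeffs)
    for i, n in enumerate(wavenumbers):
        bits = large_scale_bits if n < cutoff else small_scale_bits
        out[i] = round_significand(coeffs[i], bits)
    return out
```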
The principal purpose of this discussion is to note that this entire programme of work did not arise from considering, out of the blue, ways of making the code more algorithmically efficient, nor did it arise from the development of low-precision chips per se; it arose from the introduction of stochastic parametrizations, themselves aimed at a better algorithmic representation of the Navier–Stokes equations in the models. When I first started discussing with modelling colleagues the reasons why their models were typically run with 64-bit floating-point reals, a typical answer would be: ‘The models will blow up with anything less'. And indeed this was the case, though the reasons lay in poor coding practice rather than anything intrinsic. In particular, there was initially little enthusiasm to invest the time to recode the model to test whether it really would run at lower precision. It needed good scientific arguments to justify that investment.
It is now becoming commonplace to run weather forecast models with 32-bit precision [22]. However, to run the parametrizations, and the high-wavenumber parts of the spectral dynamical core and its Legendre transforms, with 16-bit reals, further work is needed to rescale the basic prognostic variables so as to reduce their dynamic range [23]. This work is in progress. Again, in the past little attention was given to ensuring that the range of prognostic variables lies as close to unity as possible: with 64 bits available, there was simply no incentive to do such work.
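A minimal sketch of why such rescaling matters: IEEE half precision overflows above roughly 65 504 and loses accuracy for very small numbers, so a field is best mapped towards unity before it is cast to 16 bits. The single global scale factor below is a much cruder device than the two-sided scalings of [23], and the function names are hypothetical:

```python
import numpy as np

def to_half_with_rescale(field):
    """Cast a field to float16 after rescaling its values into [-1, 1].

    Returns the half-precision field and the scale factor needed to undo
    the rescaling after the 16-bit computation.
    """
    field = np.asarray(field, dtype=np.float64)
    scale = np.max(np.abs(field))
    if scale == 0.0:
        scale = 1.0                              # all-zero field: nothing to rescale
    return (field / scale).astype(np.float16), scale

def from_half(field16, scale):
    """Return to double precision and undo the rescaling."""
    return field16.astype(np.float64) * scale
```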
Of course, this is not all. The notion that parametrizations are best represented stochastically suggests that it may be possible to represent them adequately using Neural Nets, run at low numerical precision [24,25]. Work to assess the viability of such a possibility is currently in progress.
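Purely as an illustration of the idea (and not of any scheme in [24,25], nor of the work actually under way), the sketch below evaluates a small multilayer perceptron entirely in half precision; the weights are assumed to have been trained offline against input-output pairs from an existing parametrization scheme:

```python
import numpy as np

def mlp_tendency_fp16(x, weights, biases):
    """Forward pass of a small neural-net emulator of a parametrized tendency,
    with all arithmetic carried out in float16.

    x        : resolved-scale inputs for one grid column
    weights  : list of weight matrices (hypothetical, trained offline)
    biases   : list of bias vectors
    """
    h = np.asarray(x).astype(np.float16)
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(W.astype(np.float16) @ h + b.astype(np.float16),
                       np.float16(0.0))                    # ReLU hidden layers
    W, b = weights[-1], biases[-1]
    return W.astype(np.float16) @ h + b.astype(np.float16)  # linear output layer
```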
Finally, since stochasticity manifestly blurs the boundary between the resolved and unresolved scales, a critical question is whether it really is vital to reduce the grid spacing all the way to 1 km, or whether the benefits of explicitly describing deep convection, orographic gravity waves and ocean mesoscale eddies (as far as their influence on the larger-scale climate is concerned) are retained with a grid spacing of 2 km or more. After all, each halving of resolution (horizontal and vertical) reduces the computational load by a factor of up to 16. And if it is possible to run with grids of a few kilometres instead of 1 km, then it will be possible to use the hydrostatic equations of motion instead of the more costly non-hydrostatic equations, again saving computational time. This is a scientific question which can only be answered by detailed numerical experimentation. Initial results suggest that it will not be necessary to reduce grid spacing all the way to 1 km to benefit from switching off the parametrization of convective cloud systems [26,27].
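The back-of-envelope cost scaling behind these numbers, as a short sketch (the usual assumptions: cost proportional to the number of grid columns, vertical levels and time steps, with the time step tied to the horizontal spacing by the CFL condition; not a measured benchmark):

```python
def relative_cost(dx_km, dx_ref_km, refine_vertical=True):
    """Computational cost of running at grid spacing dx_km relative to dx_ref_km,
    under standard back-of-envelope scaling assumptions."""
    ratio = dx_ref_km / dx_km
    horizontal = ratio ** 2                      # number of grid columns
    timestep = ratio                             # CFL-limited time step
    vertical = ratio if refine_vertical else 1.0
    return horizontal * vertical * timestep

print(relative_cost(1.0, 2.0))   # 16.0: halving horizontal and vertical spacing
print(relative_cost(1.0, 4.0))   # 256.0: a 4 km model is ~256 times cheaper than 1 km
```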
In this way, the algorithmic design of cloud-permitting global climate models can be envisaged for the next generation of exascale computers: models based on stochastic parametrization using Neural Nets, mixed-precision arithmetic for both the parametrizations and the dynamical core, and a grid spacing coarse enough to allow 10 years of integration per day of wall-clock time yet fine enough to allow the parametrizations of deep convection, orographic gravity-wave drag and ocean mesoscale eddies to be switched off. This picture of a next-generation climate model has arisen from a detailed interplay between basic science, numerical analysis and computational design.
As such, the notion of climate scientists as mere users of algorithms, developed and implemented by largely independent numerical analysts and computer scientists, is an ineffective route to progress. Not only do these groups have to work together; one could argue that any young scientist entering the field of climate modelling should seek to be knowledgeable in each of these three fields. Progress often arises from the transfer of ideas and information between different fields. The imposition of barriers which constrain such transfer of ideas implies, in turn, barriers to progress.
I believe this paradigm may also be relevant in other areas of science.
Data accessibility
This article has no additional data.
Competing interests
I declare I have no competing interests.
Funding
The author acknowledges financial support from the ERC Advanced Grant ITHACA [grant no. 741112].
References
- 1. IPCC. 2013. Climate change 2013: the physical science basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change (eds Stocker TF et al.), 1535 pp. Cambridge, UK and New York, NY: Cambridge University Press.
- 2. Palmer TN, Stevens B. 2019. Is climate science doing its part to address the challenge of climate change? Proc. Natl Acad. Sci. USA 116, 24 390–24 395. (doi:10.1073/pnas.1906691116)
- 3. Palmer TN. 2016. A personal perspective on modelling the climate system. Proc. R. Soc. A 472, 20150772. (doi:10.1098/rspa.2015.0772)
- 4. Palmer TN. 2011. A CERN for climate change. Phys. World 24, 14–15. (doi:10.1088/2058-7058/24/03/24)
- 5. Arakawa A. 2004. The cumulus parameterization problem: past, present, and future. J. Clim. 17, 2493–2525.
- 6. Majda AJ, Bertozzi AL. 2001. Vorticity and incompressible flow. Cambridge Texts in Applied Mathematics. Cambridge, UK: Cambridge University Press.
- 7. Nastrom GD, Gage KS. 1985. A climatology of atmospheric wavenumber spectra observed by commercial aircraft. J. Atmos. Sci. 42, 950–960.
- 8. Lovejoy S, Schertzer D. 2013. The weather and climate: emergent laws and multifractal cascades. Cambridge, UK: Cambridge University Press.
- 9. Buizza R, Miller MJ, Palmer TN. 1999. Stochastic simulation of model uncertainties in the ECMWF Ensemble Prediction System. Q. J. R. Meteorol. Soc. 125, 2887–2908. (doi:10.1002/qj.49712556006)
- 10. Palmer TN. 2001. A nonlinear dynamical perspective on model error: a proposal for nonlocal stochastic-dynamic parametrisation in weather and climate prediction models. Q. J. R. Meteorol. Soc. 127, 279–304. (doi:10.1002/qj.49712757202)
- 11. Palmer TN. 2019. Stochastic weather and climate models. Nat. Rev. Phys. 1, 463–471.
- 12. Weisheimer A, Corti S, Palmer T, Vitart F. 2014. Addressing model error through atmospheric stochastic physical parametrizations: impact on the coupled ECMWF seasonal forecasting system. Phil. Trans. R. Soc. A 372, 20130290. (doi:10.1098/rsta.2013.0290)
- 13. Christensen HM, Berner J, Coleman DRB, Palmer TN. 2017. Stochastic parameterization and El Niño–Southern Oscillation. J. Clim. 30, 17–38. (doi:10.1175/JCLI-D-16-0122.1)
- 14. Berner J, Achatz U, Batte L, Bengtsson L, de la Camara A, Christensen HM, Yano JI. 2017. Stochastic parameterization: toward a new view of weather and climate models. Bull. Am. Meteorol. Soc. 98, 565–587. (doi:10.1175/BAMS-D-15-00268.1)
- 15. Palmer T, Düben P, McNamara H. 2014. Stochastic modelling and energy-efficient computing for weather and climate prediction. Phil. Trans. R. Soc. A 372, 20140118. (doi:10.1098/rsta.2014.0118)
- 16. Palem KV. 2014. Inexactness and a future of computing. Phil. Trans. R. Soc. A 372, 20130281. (doi:10.1098/rsta.2013.0281)
- 17. Dawson A, Düben PD. 2017. An emulator for reduced floating-point precision in large numerical simulations. Geosci. Model Dev. 10, 2221–2230. (doi:10.5194/gmd-10-2221-2017)
- 18. Düben PD, Palmer TN. 2014. Benchmark tests for numerical weather forecasts on inexact hardware. Mon. Weather Rev. 142, 3809–3829. (doi:10.1175/MWR-D-14-00110.1)
- 19. Thornes T, Düben PD, Palmer TN. 2018. A power law for reduced precision at small spatial scales: experiments with an SQG model. Q. J. R. Meteorol. Soc. 144, 1179–1188. (doi:10.1002/qj.3303)
- 20. Chantry M, Thornes T, Palmer T, Düben P. 2019. Scale-selective precision for weather and climate forecasting. Mon. Weather Rev. 147, 645–655. (doi:10.1175/MWR-D-18-0308.1)
- 21. Saffin L. Submitted. Using stochastic physics to determine the required numerical precision for the parametrization schemes of a global atmospheric model.
- 22. Vána F, Düben PD, Lang S, Palmer TN, Leutbecher M, Salmond D, Carver G. 2017. Single precision in weather forecasting models: an evaluation with the IFS. Mon. Weather Rev. 145, 495–502. (doi:10.1175/MWR-D-16-0228.1)
- 23. Higham NJ, Pranesh S, Zounon M. 2019. Squeezing a matrix into half precision, with an application to solving linear systems. SIAM J. Sci. Comput. 41, A2536–A2551. (doi:10.1137/18M1229511)
- 24. Chevallier F, Chéruy F, Scott NA, Chédin A. 1998. A neural network approach for a fast and accurate computation of longwave radiative budget. J. Appl. Meteorol. 37, 1385–1397.
- 25. Krasnopolsky VM. 2013. The application of neural networks in the Earth system sciences. Atmospheric and Oceanographic Sciences Library, vol. 46. Berlin, Germany: Springer.
- 26. Hohenegger C, Kornblueh L, Klocke D, Becker T, Cioni G, Engels JF, Schulzweida U, Stevens B. In press. Climate statistics in global simulations of the atmosphere, from 80 to 2.5 km spacing. J. Meteorol. Soc. Jpn. (doi:10.2151/jmsj.2020-005)
- 27. Vergara-Temprado J, Ban N, Panosetti D, Schlemmer L, Schär C. In press. Climate models permit convection at much coarser resolutions than previously considered. J. Clim. (doi:10.1175/JCLI-D-19-0286.1)