Abstract
We seek statistical methods to study the occurrence of multiple rain types observed by satellite on a global scale. The main scientific interests are to relate rainfall occurrence with various atmospheric state variables and to study the dependence between the occurrences of multiple types of rainfall (e.g. short-lived and intense versus long-lived and weak; the heights of the rain clouds are also considered). Commonly in point process model literature, the spatial domain is assumed to be a small, and thus planar domain. We consider the log-Gaussian Cox Process (LGCP) models on the surface of a sphere and take advantage of cross-covariance models for spatial processes on a global scale to model the stochastic intensity function of the LGCP models. We present analysis results for rainfall observations from the TRMM satellite and atmospheric state variables from MERRA-2 reanalysis data over the tropical Eastern and Western Pacific Ocean, as well as over the entire tropical and subtropical ocean regions. Statistical inference is done through Monte Carlo likelihood approximation for LGCP models. We employ covariance approximation to deal with massive data.
Keywords: Global spatial data, Log-Gaussian Cox process, Point process models, Rainfall occurrence, TRMM precipitation radar
1. INTRODUCTION
Despite tremendous efforts by researchers to understand the global atmospheric circulation and climate, state-of-the-art climate models, that is General Circulation Models (GCMs), still exhibit pervasive biases. For example, climate models used for understanding the human influence on climate change, namely the Coupled Model Intercomparison Project phase 3 and 5 models (CMIP3 and CMIP5, respectively), show only a slight improvement in terms of their representation of rainfall (Flato, G. et al. 2013). Accurate understanding of rainfall distribution over space and time is crucial, as it is not just a matter of local rainfall but entails the forcing of atmospheric circulations around the globe (Hartmann, Hendon, and Houze Jr. 1984; Schumacher, Houze Jr., and Kraucunas 2004) and the sensitivity to anthropogenic climate change (Sherwood, Bony, and Dufresne 2014). Poor rainfall representation in models also degrades the simulation of tropical phenomena such as the Madden-Julian Oscillation (MJO) and El Niño that contribute to atmospheric predictability (Hung et al. 2013; Zhu et al. 2017).
Over the last 19 years, high quality measurements of rainfall over the tropics and extratropics have become available via NASA’s Tropical Rainfall Measuring Mission (TRMM; Kummerow, Barnes, Kozu, Shiue, and Simpson 1998) and the Global Precipitation Measurement (GPM; Hou et al. 2014) mission satellites. These high quality data sets provide a better understanding of rainfall characteristics around the globe, which can help improve climate model simulations of rainfall. The radars onboard the TRMM and GPM satellites provide rainfall occurrence and amount for three different types of rain, namely stratiform, deep convective, and shallow convective. Each of these rain types has different properties in terms of intensity, duration, and height in the atmosphere; they are further described in Section 2.1.
Little work has been done in developing flexible statistical models and methods to understand how rainfall happens, let alone how different rain types are characterized and interact with each other. Many statistical studies regarding rainfall focus on relatively small regional domains (e.g., Seo 1998; Frei and Schar 2001; Cowpertwait, Isham, and Onof 2007; Sun and Stein 2015). Also most of these studies focus on the rainfall data itself, with less focus on understanding how rainfall is related to atmospheric state variables such as temperature and humidity. These state variables have strong physical connections to rainfall amounts and rain types (e.g., Johnson, Rickenbach, Rutledge, Ciesielski, and Schubert 1999; Bretherton, Peters, and Back 2004; Ahmed and Schumacher 2015), and statistical modeling of these connections can shed light on the processes that control rainfall occurrence and strength. This underscores the need to develop flexible statistical models to understand not only how each rain type occurs but also the joint distributional structure for these three different types of rainfall.
Statistical methods for point processes are concerned with arrangements (or patterns) of points in a random set (temporal, spatial or spatio-temporal domain). There are numerous types of data that come as point patterns in physical, environmental, and biological applications. Statistical methods for spatial (or spatio-temporal) point patterns have been developed for various aspects of the analyses, such as stochastic models and methods (Møller, Syversveen, and Waagepetersen 1998; Schlather, Ribeiro, Jr., and Diggle 2004), model fitting and inference (Diggle 1985; Guan 2006; Waagepetersen and Guan 2009), and goodness-of-fit methods for the statistical models (Guan 2008). Various point process models have been used in a wide variety of applications (Schoenberg 2003; Diggle, Rowlingson, and Su 2005; Peng, Schoenberg, and Woods 2005; Zammit-Mangion, Dewar, Kadirkamanathan, and Sanguinetti 2012). Diggle (2014) and Møller and Waagepetersen (2004, 2017) provide nice overviews of the field with further references.
In statistical modeling of spatial point patterns, a Poisson process often serves as a building block for more complex models. When the intensity function of a Poisson process is constant over the spatial domain we call it a homogeneous Poisson process, and when it varies over space we call it an inhomogeneous Poisson process. Spatial point pattern data in most real applications are not suitable for being modeled with homogeneous Poisson process models due to the models’ obvious limitations of spatially constant intensity functions.
One prominent approach for dealing with inhomogeneous spatial point patterns is through the so-called Cox process (or “doubly stochastic” process, Cox 1955). A spatial Cox process, Φ, in a planar domain, D ⊂ ℝ², is defined via the following two properties:
{Λ(x) : x ∈ D} is a non-negative-valued stochastic process
conditional on {Λ(x) = λ(x) : x ∈ D}, Φ is an inhomogeneous Poisson process with intensity function λ.
A Cox process is particularly suitable for Poisson processes with intensity functions that vary over space, which is usually the case for environmental applications. Chapter 25 of Gelfand, Diggle, Fuentes, and Guttorp (2010) states that Cox processes provide natural models when the point process in question arises as a consequence of environmental variation in intensity that cannot be described completely by available explanatory variables. A particular kind of Cox process, the log-Gaussian Cox process (LGCP), is defined by taking log{Λ(x)} to be a Gaussian spatial random field (Møller et al. 1998). LGCP models are effective and convenient in the sense that we can exploit the rich literature on spatial and spatio-temporal models for Gaussian random fields in geostatistics (Diggle, Moraga, Rowlingson, and Taylor 2013). In particular, stationary and nonstationary parametric mean and covariance functions, which have been developed in the geostatistical literature for both univariate and multivariate settings for spatial and spatio-temporal processes (e.g., Cressie and Huang 1999; Gneiting 2002; Stein 2005b; Apanasovich and Genton 2010; Gneiting, Kleiber, and Schlather 2010; Jun 2014), can be used to model the stochastic intensity function.
On the other hand, in the literature on multivariate point patterns with a multivariate stochastic intensity function, cross-covariance structures of the stochastic intensity functions have been quite limited. Suppose Λ = (Λ1,…,Λm) denotes the multivariate stochastic intensity function. Diggle (2014, p. 126) lets Λ1(x) = ξ·Λ2(x) (ξ > 0) for a bivariate case, which is too restrictive. Møller and Waagepetersen (2004) use a Linear Model of Coregionalization (Gelfand, Schmidt, Banerjee, and Sirmans 2004), which essentially writes each process as a linear combination of several common independent processes.
Traditionally, most spatial point process models were not developed for spherical domains. Common application examples for point processes in the literature concern spatial domains with sizes as small as an agricultural field or a small forest area, and the application domains are at most the size of a country (e.g., Schoenberg 2003; Diggle et al. 2005; Shirota and Gelfand 2017). Recently, there have been developments in point pattern modeling on a global scale. Robeson, Li, and Huang (2014) discussed how Ripley’s K-function needs to be adjusted on spheres. Lawrence, Baddeley, Milne, and Nair (2016) discussed estimation and modeling of the K-function on spheres and applied Thomas process models on spheres to a galaxy data set. Møller, Nielsen, Porcu, and Rubak (2018) developed determinantal point process models on spheres. However, as far as the authors are aware, there has been little work on LGCP models on a global scale, despite recent rapid developments in methods and models for geostatistical (continuous) spatial data on spheres (e.g., Heaton, Katzfuss, Berrett, and Nychka 2014; Jun 2014; Jeong and Jun 2015; Guinness and Fuentes 2016; Porcu, Bevilacqua, and Genton 2016; see Jeong, Jun, and Genton (2017) for a review with more references). There are a few prominent exceptions. Simpson, Illian, Lindgren, Sørbye, and Rue (2016) give a data analysis example applying LGCP models to a point process over the ocean, using integrated nested Laplace approximation for fast approximate inference. A very recent work by Cuevas-Pacheco and Møller (2018) provides a theoretical study of (univariate) LGCP models on the sphere, especially regarding the definition and existence of LGCP models on the sphere. They showed a data example of fitting an LGCP model to the sky positions of galaxies. Diggle et al. (2013) list a series of application examples for LGCP models and all of these are assumed to be defined on ℝ².
As far as the authors can tell, an R package, lgcp (Taylor, Davies, Rowlingson, and Diggle 2015), which provides a nice tool box for LGCP, is not built for spatial patterns on a global scale.
Our scientific interests in this paper are in understanding the spatial patterns of rainfall occurrences for three rain types and how they relate to various atmospheric state variables. We present an analysis of multivariate point patterns on a global scale through multivariate LGCP models. The nonstationary nature of occurrences of multiple rain types is dealt with by incorporating atmospheric state variables in the mean structure of the log of the stochastic intensity functions. The cross-covariance structure of the multivariate (log) intensity functions for the three rain types is modeled by a multivariate Matérn covariance function. We employ a Monte Carlo approximation of the likelihood function for parameter estimation, with the help of covariance approximation to deal with massive data.
Although the rainfall data we use in this work is given on a gridded domain, we treat locations of grid points with rainfall as point patterns, rather than a lattice, as our main interest is in modelling the occurrences of rain. This is reasonable given the high spatial and temporal resolution (we consider 0.5 degree, 6 hourly data) and the fact that global climate data is commonly a gridded product. In fact the rain data sets we use in this paper are originally from satellite and they are post processed on a high-resolution gridded domain. Rain pattern data have been analyzed using point process models in the literature. For instance, Cowpertwait et al. (2007) and Kaczmarska, Isham, and Onof (2014) applied point process models to deal with fine scale structure of rainfall process, although they focused on the temporal aspect of rainfall process. Cowpertwait (2010) applied a spatio-temporal point process model for rainfall processes over the Rome region, Italy.
Although the gridded data sets provide information every six hours, which results in four data points on a given day, we only use the first 6-hour period each day. This is mainly to avoid variations associated with the diurnal cycle. Indeed, we do not consider the temporal aspect of the data in this paper.
The interest in this work is to see how much information atmospheric state variables contain about rain occurrence (not rain intensity). The conditions that impact rainfall occurrence are often different from the conditions that impact rain intensity, and knowing whether it will rain or not is relevant to both weather forecasting and climate modeling. For example, weather forecasts commonly provide the chance of rain, and many climate models produce rain far too often compared to observations, which impacts the rest of the climate system through changes to boundary layer conditions, radiative processes, etc.
The rest of the paper is organized as follows. Section 2 describes the data used for the analysis. Details of the statistical models and inference, along with computational techniques for handling large data, are given in Section 3. Section 4 provides analysis results for modeling three rainfall types. Section 5 concludes the paper with some remarks.
2. DATA
2.1. Rainfall data
The TRMM satellite (Kummerow et al. 1998) operated from late 1997 to early 2015 and yielded almost 17 years of continuous high-resolution measurements of the 3-dimensional structure of tropical and subtropical rainfall using its precipitation radar (PR). The PR had a footprint of 5 km at nadir and a 240 km swath width (these values were 4.3 km and 215 km before the 2001 altitude boost). About 2 million rain measurements were produced per day. The GPM satellite (Hou et al. 2014) has been operating since 2014 and has begun providing this information with the dual frequency precipitation radar (DPR) into the extratropics.
We use Version 7 TRMM PR rainfall data for the first two weeks of June 2003 placed in 6-hourly, 0.5° grids. June is the start of the Northern Hemisphere summer and is when rain begins to maximize in the tropical Pacific. The year 2003 was chosen because it was a neutral year for the El Niño-Southern Oscillation. Large rain changes happen across the Pacific when there is a warm El Niño event or a cold La Niña event. During this time period, only radar observations over the tropics and subtropics (from 35° S to 35° N) are available. Although this range does not cover the entire globe, it gives 360° coverage in terms of longitude, and thus statistical models on a global scale (as opposed to models for a planar domain) are needed. The outer rectangle of Figure 1 shows the entire domain of TRMM data and the two inner rectangles show the domains of local analysis (the Eastern Pacific and Western Pacific) presented in Section 4.1. The shaded region in Figure 1 shows the spatial coverage of the TRMM path for the first week of June 2003. Note that there are approximately 16 orbits per day and that a 0.5° grid cell will be visited by the PR 1–2 times per day at most (and often not at all) during a 6-hour period. The PR makes observations over both land and ocean, but the shaded region in Figure 1 only includes points over land that have atmospheric information at 1000 mb (atmospheric information at 1000 mb is used as covariates in the statistical analysis; see Section 4 for more details). It makes sense to focus our analysis mostly on the ocean portions of the domain because rain type occurrence over land is strongly related to topography and the diurnal cycle of the sun (Ahmed and Schumacher 2017), thus complicating the statistical models.
Figure 1.
TRMM domain and the Eastern Pacific (EP; 15° S - 15° N, 180° W - 100.25°W) and the Western Pacific (WP; same latitude range as EP, 130.25° E - 180° E) regions. Shaded area shows the region for global analysis presented in Section 4.2.
The three rain types of interest are: deep convective (DC), shallow convective (SC), and stratiform (Str). Deep convection is associated with strong, intermittent rain and constitutes a large portion of rainfall over tropical land and oceans and the extratropical storm tracks. Stratiform cloud systems are associated with weaker, widespread rainfall that can either form as a result of deep convective clouds, as is common in tropics, or from large-scale lifting as found in fronts at higher latitudes (Houze Jr. 2004). Convective rain in general can be separated into shallow and deep, where all of the shallow convective rain forms from warm rain processes and cloud tops do not exceed the 0° C height level (Schumacher and Houze Jr. 2003). Deep convective cloud tops often exceed 10 km and cold rain processes play an important role in overall intensity and rain production. Shallow convection often occurs outside of the heavy rain regions in the tropics, unlike deep convection. All three rain types are differentiated in the PR observations using texture and height information (Awaka, Iguchi, Kumagai, and Okamoto 1997; Funk, Schumacher, and Awaka 2013).
2.2. Atmospheric state variables
The atmospheric state variables that will help describe the rainfall distributions are generated by a global climate model that assimilates data to provide a dynamically consistent set of fields constrained by observations. We use NASA’s Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2; Molod, Takacs, Suarez, and Bacmeister 2015). We use 6-hourly data at 2/3° × 1/2° horizontal grid resolution. The reanalysis fields utilized are temperature, humidity, horizontal winds, and surface latent heat flux. These variables are interpolated to a common horizontal grid with 0.5° spatial resolution to match the spatial resolution of the gridded TRMM rainfall data every 6 hours. Care is taken to preserve the predictive temporal relationships (for instance, if atmospheric state variables are observed at time 00 UTC, then rainfall data is accumulated from 00 UTC to 06 UTC). This permits the attribution of causal interpretations to any statistical relationships that are identified.
Some atmospheric state variables, such as temperature, humidity, and horizontal winds, are given for multiple vertical levels from the reanalysis. One of the main scientific interests in this paper is to relate the vertical profile of these state variables with the rainfall data near the surface. Atmospheric scientists often use a technique called Empirical Orthogonal Function (EOF) decomposition. This is essentially the same as Principal Component (PC) analysis, and each EOF is a vector of weights (or loadings) for each level of a PC. That is, let t(s, h) denote the temperature value at spatial location s and vertical height (i.e. pressure) h, and suppose temperature is observed at r pressure levels, h1,…, hr. If the ith PC (i = 1,…, r) is expressed as ti(s) = a1 t(s,h1) + ··· + ar t(s,hr), the corresponding EOF, Ei(s), is given by (a1,…,ar).
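As a minimal sketch of the decomposition just described, the snippet below extracts the leading EOF (loading vector) and its PC scores from a toy table of vertical profiles using power iteration on the between-level sample covariance matrix. The function name `first_eof`, the row-per-location data layout, and the choice to center each level before computing the covariance are assumptions made for illustration; the paper does not state how its EOFs were computed.

```python
import math
import random

def first_eof(profiles, n_iter=200, seed=0):
    """Return (eof, pcs): the leading unit-length loading vector and its PC scores.

    profiles: list of rows, one per spatial location, each holding the values
    at the r vertical levels (hypothetical toy layout).
    """
    n, r = len(profiles), len(profiles[0])
    # Center each vertical level across locations (an assumed preprocessing step).
    means = [sum(row[k] for row in profiles) / n for k in range(r)]
    centered = [[row[k] - means[k] for k in range(r)] for row in profiles]
    # r x r sample covariance matrix between levels.
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(r)]
           for a in range(r)]
    # Power iteration: repeated multiplication converges to the eigenvector
    # with the largest eigenvalue, i.e. the first EOF (defined up to sign).
    rng = random.Random(seed)
    vec = [rng.gauss(0.0, 1.0) for _ in range(r)]
    for _ in range(n_iter):
        new = [sum(cov[a][b] * vec[b] for b in range(r)) for a in range(r)]
        norm = math.sqrt(sum(x * x for x in new))
        vec = [x / norm for x in new]
    # PC score at each location: weighted sum of the centered profile.
    pcs = [sum(vec[k] * c[k] for k in range(r)) for c in centered]
    return vec, pcs
```

In practice one would use a library eigen-solver or singular value decomposition and keep the first three components; power iteration is used here only to keep the sketch dependency-free.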
Figure 2 shows first three EOFs of temperature (corresponding to t1,t2,t3), humidity (q1, q2, q3), zonal (east-west) winds (u1,u2, u3) and meridional (north-south) winds (v1, v2, v3) over the Eastern Pacific (EP; 15° S - 15° N, 180° W - 100.25° W) and the Western Pacific (WP; same latitude range as EP, 130.25° E - 180° E) domains from Figure 1. Particular features of note are the differing heights and depth of the inversion (i.e. when temperature increases with height) at low levels in each temperature profile, the strong drying or moistening at mid levels in the humidity profiles, and the strength and direction of the winds near the surface and in the upper atmosphere. Negative zonal values indicate easterly winds (i.e., winds from the east) and positive zonal values indicate westerly winds (i.e., winds from the west). Similarly, negative meridional values indicate northerly winds (or winds from the north) and positive meridional values indicate southerly winds (or winds from the south). Temperature inversions are generally detrimental to convection unless they can be broken through (e.g., by daytime surface heating) allowing convection to attain great strength. A dry mid atmosphere is considered detrimental to both deep convective and stratiform rain, but is commonly associated with shallow convection (e.g., Jensen and Genio 2006). The relationship between wind and rainfall is often linked to the change of wind speed and/or direction with height (i.e., wind shear), which is further discussed below.
Figure 2.
First three EOFs of temperature, humidity, and wind data over the Eastern and Western Pacific regions.
Additional horizontally-varying state variables considered in the study are three definitions of vertical wind shear (ls, dp, and dds), surface latent heat flux (lh), and latitude (lat). If u[z] and v[z] denote zonal and meridional wind speeds at pressure level z, then shear variables are defined in the following way:
| (low-level shear), |
| (deep shear), |
| (deep directional shear), |
where 900 denotes the 900 mb pressure level and so on. The three vertical shear variables are meant to highlight different mechanisms in the atmosphere that promote convective and stratiform rain production. For example, low-level shear typically helps initiate convective cells (Rotunno, Klemp, and Weisman 1988), while deep shear is thought to assist the formation of stratiform rain regions (Li and Schumacher 2011). Deep directional shear represents situations when low-level zonal winds are going the opposite direction of the upper level zonal winds, which would cause the low-level convective cloud bases to rapidly move in a different direction than any upper level cloud, potentially impacting the occurrence of deep convective and stratiform rain.
3. STATISTICAL METHOD
The statistical challenges for this work come from the following: (i) multivariate spatial point patterns, (ii) point pattern models on a global scale, (iii) statistical inference for such point pattern models, and (iv) computational difficulties due to large number of data points.
3.1. LGCP model for multivariate point patterns on a sphere
Suppose X is a spatial point pattern on the surface of a sphere, and Y is a Gaussian random field on the sphere 𝕊². Let the mean and covariance function of Y be
$$m(s) = \mathrm{E}\{Y(s)\}, \qquad C(s_1, s_2) = \mathrm{Cov}\{Y(s_1), Y(s_2)\},$$
for s, s1, s2 ∈ 𝕊². We assume X is a log-Gaussian Cox process driven by Λ = exp(Y). Depending on the structure we give to m and C, the resulting LGCP may have various properties. For example, one might give a simple structure by assuming m(s) = μ and C(s1, s2) = C0(d12), where C0 is an isotropic covariance function, μ is an unknown constant mean, and d12 is the great circle distance between s1 and s2 on 𝕊². Note that Chakraborty, Gelfand, Wilson, Latimer, and Silander (2011) used this structure except they did not assume a sphere as their spatial domain. On the other hand, one might assume that the mean structure of Y varies over the surface of a sphere with an isotropic covariance structure for Y, or assume a constant mean structure but a nonstationary covariance structure for Y.
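The simple isotropic structure above can be sketched in a few lines. The snippet computes the great circle distance between two latitude/longitude points with the haversine formula and evaluates an exponential covariance, which is the Matérn form with smoothness 0.5. The Earth radius, the default range value, and the parameterization exp(−d/β) are assumptions chosen for illustration, not values taken from the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; assumed value for illustration

def great_circle_distance(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Great circle distance between two (lat, lon) points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    # Haversine form, numerically stable for small separations.
    h = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return 2 * radius * math.asin(min(1.0, math.sqrt(h)))

def exponential_cov(d, sigma2=1.0, beta=500.0):
    """Isotropic covariance C0(d) = sigma2 * exp(-d / beta).

    This is the Matern covariance with smoothness 0.5; sigma2 and beta are
    hypothetical variance and range parameters.
    """
    return sigma2 * math.exp(-d / beta)

# Covariance between two equatorial grid points 0.5 degrees apart.
d12 = great_circle_distance(0.0, 150.0, 0.0, 150.5)
c12 = exponential_cov(d12)
```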
Let X = (X1,X2,X3) now be a tri-variate LGCP on a sphere in which (log Λ1, log Λ2, log Λ3) is a vector-valued Gaussian random field on a sphere. Similarly to the univariate case, we write Λ = exp(Y) for a tri-variate Gaussian random field Y = (Y1,Y2,Y3) on a sphere. Therefore, the marginal as well as joint distributions of the log intensities of the tri-variate LGCP are determined by the marginal as well as joint distributions of the multivariate Gaussian random field, Y.
For the application in this paper, we utilize atmospheric state variables (described in Section 2.2) to account for the nonstationary mean structure of the log of the intensity process, Yi(i = 1, 2, 3 for each rain type). Further, the natural clustering of points for the locations of rainfall will be dealt with through an isotropic Matérn covariance model. Indeed, we use the parsimonious version of the Matérn covariance model originally introduced by Gneiting et al. (2010). We not only are able to estimate the contribution of each state variable to the occurrence of rainfall for each type, but are able to estimate the cross-correlation of pairs of rain types through this multivariate LGCP model.
Although the multivariate Matérn models originally introduced by Gneiting et al. (2010) were not developed for processes on spheres, Gneiting (2013) (for the univariate case) and Porcu et al. (2016) (for the multivariate case) recently showed conditions on the parameters that ensure positive definiteness of the multivariate Matérn model on spheres. There are a total of 6 smoothness parameters for the Matérn model used for this application: νi for i = 1, 2, 3 and νij for 1 ≤ i < j ≤ 3. We fix all of them equal to 0.5, and the resulting model satisfies the condition for positive definiteness according to Porcu et al. (2016). We could extend the model further by employing a nonstationary covariance function for modeling the Yi’s, such as the models introduced in Stein (2005a) and Jun (2014).
3.2. Monte Carlo likelihood approximation
Commonly, statistical inference for point process models is done through either a moment-based method or likelihood. The commonly used moment-based method, called minimum contrast, finds parameter estimates by minimizing the squared difference between the empirical and theoretical versions of Ripley’s K-function. See chapter 19 of Gelfand et al. (2010) for more details. Recently, Robeson et al. (2014) pointed out that the K function needs to be adjusted for point patterns on spheres. Cuevas-Pacheco and Møller (2018) fitted an LGCP model on the sphere to the sky position of galaxies by fitting the K function on the sphere. The minimum contrast method is computationally efficient and can be useful for provisional estimates of parameters of models. Nevertheless, similar to the least squares method for variogram estimation, moment-based methods for inference for point pattern models are known to be less efficient compared to likelihood-based methods (Diggle 2014).
We now present the idea for Monte Carlo likelihood approximation. We first present the main idea for the univariate case for notational convenience, and then discuss the extension to the multivariate case. The likelihood for an LGCP model with data X = {xi ∈ W : i = 1,…, n} defined on a spatial domain W is given as (Diggle et al. 2013)
| (1) |
with
| (2) |
Calculation of the likelihood function as in (1) and (2) involves two computational difficulties. The first and main obstacle in performing maximum likelihood estimation for LGCP models has been that the evaluation of (1) involves integration over the infinite-dimensional distribution of Λ. The second obstacle is that, for a given realization of Λ, the integral term in (2) on the surface of a sphere adds further computational complication.
A natural solution to the first problem, that is, integrating the likelihood function over the stochastic intensity function (i.e., calculating the expected value of the stochastic likelihood function in (1)), is Monte Carlo approximation. That is, the expectation is approximated by an empirical average over simulated realizations. For simulated realizations of Λ, λ(j) = { λ(j)(sk) : k = 1,…, N }, j = 1,…, s, on finite “grid” points s1,…, sN that cover the spatial region of interest, one may approximate (1) with
| (3) |
Here, we need to consider finite grid points since we cannot simulate random fields continuously over space. The accuracy of the approximation in (3) depends on s, the number of simulated realizations. The idea of a Monte Carlo approximation of the likelihood for Cox processes has not been utilized much in the past, mainly because of its computational intensiveness. For an LGCP model, such a likelihood approximation requires a large number of simulated Gaussian random fields over dense grid points.
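For orientation, the quantities discussed above can be sketched in a generic form. The display assumes the standard Poisson log-likelihood conditional on a realized intensity; the symbols θ, ℓ*, and L_MC are assumed notation for illustration and may differ from the exact expressions in (1) to (3).

```latex
\begin{aligned}
L(\theta; X) &= \mathrm{E}_{\Lambda \mid \theta}\{\ell^{*}(\Lambda; X)\}, \\
\log \ell^{*}(\Lambda; X) &= \text{const} + \sum_{i=1}^{n} \log \Lambda(x_i) - \int_{W} \Lambda(x)\, dx, \\
L_{\mathrm{MC}}(\theta) &= \frac{1}{s} \sum_{j=1}^{s} \ell^{*}(\lambda^{(j)}; X).
\end{aligned}
```

In the last line each λ(j) is known only at the grid points, so the values at the observed events and the integral over W both have to be evaluated from finitely many simulated values.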
With the recent development of technology and computing power, however, simulating a large number of Gaussian random fields over dense grid points has become more feasible. An R package, RandomFields (Schlather, Malinowski, Menck, Oesting, and Strokorb 2015), provides tools to simulate Gaussian random fields over a large number of locations on Euclidean domains as well as spheres. Computational techniques for approximating likelihood (Stein, Chi, and Welty 2004; Fuentes 2007) or composite likelihood (Cox and Reid 2004) may not be directly applicable since we need to simulate random fields, rather than calculate likelihood values. The second problem is dealt with by approximating the integral in (2) by a Riemann sum. In particular, we approximate the integral with a summation using simulated values over 5,000 locations chosen randomly.
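The two numerical ingredients described above can be sketched as follows: a sum over randomly chosen locations standing in for the surface integral, and a numerically stable average of per-realization likelihoods. This is only an illustration under stated assumptions. It integrates over the whole unit sphere rather than the 35° S to 35° N band, it assumes the Poisson form of the conditional log-likelihood, and all function names are hypothetical.

```python
import math
import random

def uniform_points_on_sphere(n, rng):
    """Draw n points uniformly on the unit sphere as (lat, lon) in radians."""
    pts = []
    for _ in range(n):
        z = rng.uniform(-1.0, 1.0)            # sin(latitude) is uniform on [-1, 1]
        lon = rng.uniform(-math.pi, math.pi)
        pts.append((math.asin(z), lon))
    return pts

def sphere_integral(values, radius=1.0):
    """Approximate a surface integral by (sphere area) x (mean of sampled values)."""
    return 4.0 * math.pi * radius ** 2 * sum(values) / len(values)

def log_lstar(log_lam_events, lam_quadrature, radius=1.0):
    """Assumed Poisson log-likelihood (up to a constant) for one realized intensity.

    log_lam_events: log-intensity at each observed event location.
    lam_quadrature: intensity at the randomly chosen integration locations.
    """
    return sum(log_lam_events) - sphere_integral(lam_quadrature, radius)

def log_mean_exp(values):
    """log of the average of exp(values), shifted by the maximum to avoid underflow."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))

# Toy usage: 5,000 random locations, intensity sin(lat)^2 (true integral 4*pi/3).
pts = uniform_points_on_sphere(5000, random.Random(1))
approx = sphere_integral([math.sin(lat) ** 2 for lat, lon in pts])
```

Given s simulated fields, one would compute `log_lstar` for each and pass the resulting list to `log_mean_exp` to obtain the log of the Monte Carlo likelihood; working on the log scale matters because individual likelihood values can be extremely small.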
The method described above extends naturally to multivariate cases. With a multivariate point pattern X = (X1,…,Xm) with log intensity Y = (Y1,…,Ym) and stochastic intensity Λ = exp(Y), we simulate an m-variate Gaussian random field, Y, with an m × m block covariance matrix accounting for the cross-covariances. Then, the realized stochastic intensity function is given by the exponential of the simulated values, as in the univariate case. The loglikelihood function for the multivariate case can still be written similarly to (2),
where Xl = {xi,l, i = 1,…,nl} and nl is the number of events for the lth point pattern. For our application, m = 3. Any cross-covariance function defined on spheres can be used for Y, and we use the trivariate Matérn model, as discussed in Section 3.1.
3.3. Covariance approximation
Suppose one needs to simulate a Gaussian random field over s1,…,sN a total of s times. Let Y = (Y(s1),…,Y(sN))ᵀ ∼ N(0, Σ) with Σ an N × N covariance matrix. For computationally efficient simulation of a large number of Gaussian random fields over a large number of locations (that is, large N), we will use the idea of a predictive process (PP) model (Banerjee, Gelfand, Finley, and Sang 2008) in a non-Bayesian context. PP models have been shown to provide computationally efficient tools for dealing with Gaussian random fields observed over a large number of spatial locations. However, their weakness in dealing with small-scale spatial variations has also been observed (e.g., Sang, Jun, and Huang 2011; Stein 2014). We will use a modified version of PP, a full scale approximation method, proposed in Sang et al. (2011) and Sang and Huang (2012), to account for the large-scale, as well as small-scale, variation of each Gaussian random field.
We approximate Σ by
$$\Sigma \approx A R^{-1} A^{T} + V, \tag{4}$$
a part resulting from a PP model and an approximated remainder. That is, we introduce a set of knots, u1,…, um, for m ≪ N that cover the entire domain. Then, R = Var{Y(u)}, u = {u1,…, um} and A = Cov{Y(s), Y(u)}, s = {s1,…, sN}. Furthermore, the “remainder” covariance matrix, Σ − AR⁻¹Aᵀ, after subtracting the covariance matrix for a PP model, is approximated by the “block independent” adjustment (cf. adjustment using taper functions as in Sang and Huang (2012)). That is, the remainder matrix is approximated by a block diagonal matrix, denoted by V.
We then simulate multiple Gaussian random fields based on the approximation in (4). Let S be an s × N matrix holding s simulated random fields over N locations, S0 an s × m matrix with iid N(0,1) elements, and S1 an s × N matrix with iid N(0,1) elements (each element in S1 is independent of the elements in S0). Then, we write
$$S = S_0 B + S_1 U_V, \tag{5}$$
where $B = (U_R^{T})^{-1} A^{T}$, and UR and UV are upper triangular matrices resulting from Cholesky decomposition of R and V, respectively (that is, $R = U_R^{T} U_R$ and $V = U_V^{T} U_V$). To find B, instead of inverting UR, we use
$$U_R^{T} B = A^{T}$$
and solve for B efficiently using a forward solve algorithm. The Cholesky decomposition of V can also be done efficiently by accounting for the fact that V is a block diagonal matrix. That is, one only needs to perform a separate Cholesky decomposition of each block of V, which reduces computation significantly. See Appendix, A-I, for a short proof that the covariance matrix of each column of Sᵀ equals the approximation of Σ given in (4).
The idea of covariance approximation is naturally extended to the multivariate case, since, in the end, it boils down to simulating a multivariate normal random variable. Sang et al. (2011) gave details on the multivariate extension of the full scale approximation of covariance matrices.
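The simulation scheme of this subsection can be sketched as below for a univariate field. The code builds the knot-based factor B by a forward solve, takes a Cholesky factor of each block of the remainder matrix, and then draws fields as the sum of a predictive-process part and a block-independent part. The helper names, the lower-triangular convention (the transpose of the upper-triangular factors in the text), and the toy one-dimensional sites and exponential covariance in the usage lines are assumptions for illustration; the sketch also assumes no site coincides with a knot, so that every remainder block is positive definite.

```python
import math
import random

def cholesky_lower(M):
    """Lower-triangular L with L L^T = M (M symmetric positive definite)."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def forward_solve(L, rhs):
    """Solve L B = rhs for B, with L lower triangular and rhs an m x N matrix."""
    m, N = len(L), len(rhs[0])
    B = [[0.0] * N for _ in range(m)]
    for i in range(m):
        for c in range(N):
            B[i][c] = (rhs[i][c] - sum(L[i][k] * B[k][c] for k in range(i))) / L[i][i]
    return B

def full_scale_factors(cov, sites, knots, blocks):
    """Return B (m x N, with B^T B = A R^{-1} A^T) and a Cholesky factor per block of V."""
    m = len(knots)
    R = [[cov(u, v) for v in knots] for u in knots]
    At = [[cov(u, s) for s in sites] for u in knots]      # A^T, m x N
    B = forward_solve(cholesky_lower(R), At)              # solves U_R^T B = A^T
    chols = []
    for block in blocks:                                  # block = list of site indices
        # Remainder covariance Sigma - A R^{-1} A^T restricted to this block.
        V = [[cov(sites[a], sites[b]) - sum(B[k][a] * B[k][b] for k in range(m))
              for b in block] for a in block]
        chols.append((block, cholesky_lower(V)))
    return B, chols

def simulate_fields(B, chols, n_sites, n_sim, rng):
    """Draw n_sim mean-zero fields with covariance A R^{-1} A^T + V."""
    m = len(B)
    fields = []
    for _ in range(n_sim):
        z0 = [rng.gauss(0.0, 1.0) for _ in range(m)]
        y = [sum(z0[k] * B[k][j] for k in range(m)) for j in range(n_sites)]  # PP part
        for block, Lv in chols:                           # block-independent remainder
            z1 = [rng.gauss(0.0, 1.0) for _ in block]
            for i, a in enumerate(block):
                y[a] += sum(Lv[i][k] * z1[k] for k in range(i + 1))
        fields.append(y)
    return fields

# Toy usage on a line with an exponential covariance (hypothetical setup).
cov = lambda x, y: math.exp(-abs(x - y))
sites, knots, blocks = [0.0, 0.5, 1.0, 1.5], [0.25, 1.25], [[0, 1], [2, 3]]
B, chols = full_scale_factors(cov, sites, knots, blocks)
fields = simulate_fields(B, chols, len(sites), 3, random.Random(1))
```

The factors are computed once and reused for every draw, so the cost per simulated field is a few small matrix-vector products rather than a Cholesky decomposition of the full N × N matrix.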
4. APPLICATIONS
We use TRMM satellite radar data as well as MERRA-2 atmospheric state variable data for June 2003 as described in Section 2. We first consider rainfall data over the EP and WP regions during the first two weeks of June (Section 4.1). Understanding rain distributions during the summer months in the tropical Pacific is especially important because of the strength of the Intertropical Convergence Zone (ITCZ), a region of enhanced convection at the intersection of the trade winds, during these months and the overall importance of the tropical Pacific to the onset and evolution of El Niño events. Then we analyze the global data for the first week of June (Section 4.2).
Each atmospheric variable is standardized so that its spatial mean equals zero and its standard deviation equals one. The vertical profiles were calculated starting from 1000 mb (see Figure 2), which is effectively sea level. Locations with missing values at this height were excluded from the analysis. Figure 1 shows the land regions affected during the time period considered. For instance, most of Africa and some parts of Australia show missing data. Future work will use data from higher heights in the profile to better assess this technique over land.
Let Yi be the log transformed intensity process for the ith rain type (i = 1, 2, 3). For the mean structure of the log transformed intensity process, Yi, we write
| (6) |
Here, the 17 covariates are the first three PCs of temperature (ti's), humidity (qi's), zonal wind (ui's), and meridional wind (vi's), together with shears (ls, dp, dds), latent heat flux (lh), and latitude (lat). For the covariance structure of Y = (Y1, Y2, Y3), we use a trivariate version of the Matérn covariance function (Gneiting et al. 2010; Porcu et al. 2016). For model parsimony, we use a common spatial range parameter (β, in spherical distance) and focus on estimating the cross-correlations between the three rain types (ρij, i, j = 1, 2, 3). Indeed, we also fitted the model separately for each rain type and found that the estimates of the spatial range parameter (β) do not vary much across rain types. Note that with a common spatial range parameter, the full cross-covariance matrix reduces to the Kronecker product of a 3 × 3 cross-covariance matrix and a univariate spatial correlation matrix, which is again guaranteed to be positive definite.
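The Kronecker structure noted above can be checked numerically. The sketch below rests on several assumptions that are not from the paper: the spatial correlation is exponential in great-circle distance (the Matérn model with smoothness 1/2), the marginal standard deviations in `sigma` are hypothetical, and the locations are random points in a box resembling the WP region. The cross-correlations are the EP estimates reported in Table 2.

```python
import numpy as np

def great_circle_distance(lat_deg, lon_deg, radius_km=6371.0):
    """Pairwise great-circle distances (km) between points given in degrees."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    xyz = np.column_stack([np.cos(lat) * np.cos(lon),
                           np.cos(lat) * np.sin(lon),
                           np.sin(lat)])
    cos_angle = np.clip(xyz @ xyz.T, -1.0, 1.0)
    dist = radius_km * np.arccos(cos_angle)
    np.fill_diagonal(dist, 0.0)              # remove rounding noise on the diagonal
    return dist

rng = np.random.default_rng(2)
lat = rng.uniform(-15.0, 15.0, size=25)      # hypothetical locations
lon = rng.uniform(130.0, 180.0, size=25)
dist = great_circle_distance(lat, lon)

beta_km = 1200.0                             # common spatial range (illustrative value)
spatial_corr = np.exp(-dist / beta_km)       # exponential correlation, 25 x 25

sigma = np.array([1.0, 0.8, 0.6])            # hypothetical marginal standard deviations
rho = np.array([[1.00, 0.95, -0.11],         # cross-correlations: EP column of Table 2
                [0.95, 1.00, 0.18],
                [-0.11, 0.18, 1.00]])
cross_cov = np.outer(sigma, sigma) * rho     # 3 x 3 cross-covariance matrix

full_cov = np.kron(cross_cov, spatial_corr)  # 75 x 75 covariance of (Y1, Y2, Y3)
min_eigenvalue = np.linalg.eigvalsh(full_cov).min()
```

Because the eigenvalues of a Kronecker product are the pairwise products of the eigenvalues of its factors, `full_cov` is positive definite whenever both factors are.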
4.1. EP vs WP
The EP region covers a longitude range from 180° W to 100.25° W and the WP region covers a longitude range from 130.25° E to 180° E (Figure 1). Both regions cover a latitude range of 15.25° S to 15.25° N. We set s = 10000 for the Monte Carlo approximation of the likelihood function (as in (3)). We tried various values of s, and the results did not change significantly as long as s was reasonably large (e.g., s ≥ 5000). Analytic calculation of the integral in (2) is not possible, and thus the integral term is approximated by a Riemann sum (using 5000 terms). Note that we do not use covariance approximation for this local analysis, as we can afford to use the full covariance matrix for the simulation of Gaussian random fields. We let N equal the total number of grid pixels with rain for the TRMM data in the EP or WP region (5,504 for the EP and 5,878 for the WP).
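How a Monte Carlo average and a Riemann sum combine can be sketched as follows. Equations (2) and (3) are not reproduced in this section, so the code assumes the generic form of an LGCP likelihood: the expectation, over the Gaussian field, of the Poisson likelihood given the field, with the integral of the intensity replaced by a weighted sum over quadrature points. The function name and argument layout are hypothetical, and an additive constant depending only on the area of the domain is dropped.

```python
import numpy as np

def mc_log_likelihood(logint_events, logint_quad, quad_weights):
    """Monte Carlo approximation of an LGCP log-likelihood (up to a constant).

    logint_events: (s, n) simulated log-intensities at the n event locations
    logint_quad:   (s, G) simulated log-intensities at G quadrature points
    quad_weights:  (G,)   areas attached to the quadrature points
    """
    integral = np.exp(logint_quad) @ quad_weights        # (s,) Riemann sums
    log_terms = logint_events.sum(axis=1) - integral     # (s,) conditional Poisson log-likelihoods
    peak = log_terms.max()                               # log-sum-exp for numerical stability
    return peak + np.log(np.mean(np.exp(log_terms - peak)))

# Degenerate check: if every simulated field is the same constant, the Monte
# Carlo average reduces to the log-likelihood of a homogeneous Poisson process.
s, n, G = 5, 3, 10
const = 0.7
weights = np.full(G, 0.2)
value = mc_log_likelihood(np.full((s, n), const), np.full((s, G), const), weights)
expected = n * const - weights.sum() * np.exp(const)
```

Averaging on the log scale through the largest term avoids overflow when the per-simulation log-likelihoods are large in magnitude, which is typical with thousands of events.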
Table 1 shows coefficient estimates for the atmospheric state variables for the log intensity processes. The best predictor for rain type occurrence is humidity, consistent with our physical expectation and previous statistical studies (e.g., Chen, Liu, and Mapes 2017). In particular, the first humidity PC (q1) indicates a moister atmosphere throughout the depth of the troposphere, which is strongly conducive to rain production, especially DC and Str rain types. The second humidity PC (q2) indicates a drier mid-troposphere (e.g., 600–700 mb), which hinders deep convective cloud growth making it a better predictor for SC rain. The next best predictor for rain type occurrence is temperature. The second temperature PC (t2) is warm at low levels and then rapidly cools around 800 mb. This creates an unstable temperature profile that promotes a deep convecting atmosphere. The first temperature PC (t1) has a similar structure to t2 except that it is warmer at low levels than upper levels and does not cool quite as rapidly at 800 mb. The third temperature PC (t3) indicates a strong inversion around 800 mb, which would damp convective cloud growth and explains the negative coefficients in Table 1. These temperature PCs are all good predictors for DC and Str but are weaker predictors for SC rain.
Table 1.
Estimates for coefficients of the linear model in (6). See Section 2.2 for definitions of predictors.
| Predictor | EP Str | EP DC | EP SC | WP Str | WP DC | WP SC | Global Str | Global DC | Global SC |
|---|---|---|---|---|---|---|---|---|---|
| t1 | 0.61 | 0.45 | 0.12 | 0.31 | 0.25 | 0.03 | −1.57 | −1.62 | −0.94 |
| t2 | 0.80 | 0.63 | 0.31 | 0.23 | 0.27 | 0.04 | 0.36 | 0.39 | 0.15 |
| t3 | −0.43 | −0.29 | −0.29 | −0.11 | −0.13 | −0.10 | 0.06 | 0.26 | −0.19 |
| q1 | 1.15 | 1.18 | 0.87 | 0.88 | 0.76 | 0.46 | 2.66 | 2.30 | 1.69 |
| q2 | 0.24 | 0.22 | 0.39 | 0.09 | 0.11 | 0.22 | −0.16 | −0.11 | −0.36 |
| q3 | −0.08 | −0.06 | −0.08 | −0.06 | −0.05 | −0.05 | −0.05 | −0.16 | 0.24 |
| u1 | 0.10 | 0.15 | 0.11 | 0.00 | 0.00 | −0.02 | −0.49 | −0.21 | 0.10 |
| u2 | −0.14 | −0.15 | −0.07 | −0.01 | −0.07 | −0.06 | −0.11 | −0.01 | −0.13 |
| u3 | 0.05 | 0.03 | −0.07 | −0.03 | −0.03 | 0.03 | 0.08 | 0.02 | 0.01 |
| v1 | −0.03 | −0.04 | 0.04 | −0.01 | 0.01 | 0.06 | 0.05 | −0.05 | −0.13 |
| v2 | −0.14 | −0.11 | −0.08 | 0.11 | 0.08 | 0.06 | 0.03 | 0.11 | 0.11 |
| v3 | 0.09 | 0.07 | 0.09 | −0.14 | −0.10 | −0.06 | 0.06 | 0.02 | 0.00 |
| ls | −0.02 | −0.05 | −0.10 | −0.14 | −0.14 | −0.08 | 0.03 | 0.04 | −0.01 |
| dp | −0.07 | 0.00 | 0.01 | −0.01 | −0.01 | −0.08 | −0.07 | −0.12 | −0.05 |
| dds | −0.02 | −0.09 | −0.10 | −0.02 | −0.05 | 0.11 | −0.24 | 0.06 | 0.05 |
| lh | 0.03 | −0.01 | −0.01 | −0.05 | −0.05 | −0.01 | 0.06 | 0.20 | 0.17 |
| lat | −0.04 | −0.09 | −0.12 | 0.09 | 0.08 | 0.11 | −0.16 | −0.02 | 0.26 |
More generally, SC rain tends to have a weaker (or sometimes even opposite) relationship compared to DC and Str rain for most of the predictors in Table 1. This result is physically consistent with the fact that stratiform rain forms from deep convection in the tropics, while shallow convective rain can occur outside of regions of deep convection. Table 2 further highlights these rain type relationships, with cross-correlation values greater than or equal to 0.95 between Str and DC rain but negative or near zero between Str and SC rain.
Table 2.
Estimates for covariance parameters for trivariate Matérn covariance function. Cross-correlation ρ12 is between Str and DC, ρ13 between Str and SC, and ρ23 between DC and SC.
| Parameter | EP | WP | Global |
|---|---|---|---|
| β (km) | 1466 | 1212 | 1130 |
| ρ12 | 0.95 | 0.99 | 0.99 |
| ρ13 | −0.11 | −0.01 | −0.01 |
| ρ23 | 0.18 | −0.01 | −0.01 |
While the predictors related to wind, surface latent heating, and latitude have lower coefficients than temperature and humidity in Table 1, they still provide information about rain type occurrence and its regionality. For example, zonal wind variations are better predictors in the EP, while low-level shear is a better predictor in the WP. Outside of temperature and humidity, latitude is the strongest predictor for SC rain. Table 2 also indicates similar spatial range (β, in spherical distance) between the two regions.
Figures 3 and 4 show map comparisons between the observed rain intensity and the exponentiated estimated mean structure of the log intensity process for each rain type. Although our interest is not in modeling rain intensity directly, it is instructive to check how similar the observed rain intensity is to the estimated intensity of the rain occurrences. In the EP, Str and DC are narrowly confined to the warm ocean regions north and south of the equatorial cold tongue, while SC is more spatially distributed (Figure 3). In the WP, rain occurrence for each type is more evenly distributed across the domain because of the generally warm waters in the western tropical Pacific (Figure 4). Even though we use only the occurrences of rain for each type and do not take into account the actual rain intensity (or rain rate) in our analysis, the overall spatial patterns of the observed rain intensity (left columns of the figures) and the exponentiated estimated mean log intensity (right columns of the figures) are strikingly similar. For both regions and all rain types, the original rain rate data are quite noisy, and the exponentiated estimated mean field for the log intensity process appears much smoother. This is because we display only the estimated mean field, as given in (6). A further model verification example over the WP domain is given in Appendix, A-II.
Figure 3.
Observed rain intensity (mm/hr) (left) and exponentiated estimated mean of log intensity process (right) for the three rain types (row-wise) over the EP region during June 1–14, 2003.
Figure 4.
Same as Figure 3 over the WP region.
4.2. Global analysis
We now perform a multivariate analysis for spatial point patterns for the three rain types over the entire TRMM domain. We consider TRMM PR data for June 1 to June 7 in 2003, as well as corresponding atmospheric state variables (note that PCs are calculated again for the larger domain). Even with only one week of data, the number of grid pixels covered during the period is around 75,000 and the number of grid pixels with rain is around 20,000 (and thus N ≈ 20,000). Therefore, we need to approximate covariance matrices as described in Section 3.3.
Similar to the local analysis in Section 4.1, we express the mean of the log intensity process of the LGCP as a linear combination of all the atmospheric state variables. We also use a parsimonious version of the trivariate Matérn covariance function, as in the EP/WP analysis. We let s = 600 for the Monte Carlo approximation of the likelihood. For the full-scale approximation, we use m = 100 knots and let each block of V be of size 100 × 100. We also tried larger values of s; beyond s = 400 or so, the maximized likelihood values did not change much, and for s > 500 they stabilize nicely (see Appendix, A-III, for details).
The right columns of Tables 1 and 2 show the estimated coefficients for the mean of the log intensity process, as well as the covariance parameters for the log intensity process, over the tropical and subtropical ocean regions. It is interesting to note that, while the estimated coefficients for the EP and WP are similar, the estimated coefficients for the global analysis are somewhat different. However, humidity and temperature remain the best predictors for all rain types, and SC coefficients tend to be smaller than, or of opposite sign to, the Str and DC coefficients. Cross-correlation estimates for the global analysis are similar to those from the local analysis. The likelihood function turns out to be quite flat for the spatial range parameter. Figure 5 reiterates the capability of the statistical analysis to accurately capture not only the occurrence of each rain type, but their rain rates as well.
Figure 5.
Observed rain intensity (mm/hr) (left) and exponentiated estimated mean of log intensity process (right) for the three rain types (row-wise) during June 1–7, 2003.
5. CONCLUDING REMARKS
We demonstrated that LGCP models can be applied to local as well as global spatial point pattern data in a multivariate setting. We applied Monte Carlo approximations to log likelihood functions, and for the global analysis, we exploited covariance approximation methods to ease computational difficulties due to massive data. We were able to tease out scientifically interesting connections between rainfall occurrences and atmospheric state variables as well as cross-correlation between multiple rain type occurrence patterns.
We have shown that profiles of humidity and temperature, and even single-level variables, can predict the occurrence and intensity of rainfall in the tropics separated into deep convective, stratiform and shallow convective components. These three rain types are the building blocks of tropical cloud systems at multiple time and space scales (Mapes, Tulich, Lin, and Zuidema 2006), so the ability to predict rain type characteristics from environmental observations using statistical models supports the feasibility of parameterization of organized convective systems in coarse-resolution climate models. Further, the strong link between deep convective and stratiform rain and their relationship to large-scale environment variables stresses the need to not isolate deep convection from other rain types when representing their occurrence in climate models. The fact that shallow convection often has an opposite signal compared to the deeper rain types suggests that convective cloud parameterizations need to be adapted to produce both shallow and deep convective rain (currently, GCMs only produce an aggregate convective rain category).
We used fairly simple spatial covariance functions with a common spatial correlation length scale (spatial range) as well as common smoothness for the three rain types. This model can be limiting, although much of the nonstationarity in our data is taken care of by the combination of atmospheric state variables. We plan to employ more flexible spatial covariance functions, for instance, those that allow latitude dependence of marginal and cross-covariance structures and/or spatially varying smoothness (e.g., Jun 2014).
Monte Carlo approximation of log likelihood functions requires a large number of simulations of Gaussian random fields, which naturally lends itself to simple parallelization. For this work, we used the 10 processors available to the authors for parallel computing in R. With a much greater number of processors becoming available, we expect much more computationally efficient calculation of the approximate likelihood in the near future.
Another future direction of this work is to consider rainfall occurrences and the actual rain intensity (or rain rate) data jointly. The rain intensity information can be incorporated, in particular, in the covariance modeling of the log-Gaussian intensity function, or rainfall occurrences and rain intensity data can be handled together under a marked point process framework.
It would be natural to extend the proposed method to spatio-temporal point patterns. There are several flexible spatio-temporal covariance functions available in the literature that can be used to model the spatio-temporal stochastic intensity function under an LGCP framework. Indeed, Siino, Adelfio, and Mateu (2018) considered various separable and nonseparable spatio-temporal covariance functions for LGCP models and estimated parameters through the minimum contrast method.
Point process models other than the LGCP models considered in this paper may also be suitable for modeling multivariate rainfall occurrence data. For instance, the Neyman-Scott clustered point process on spheres used in Lawrence et al. (2016) could be extended to multivariate point patterns and then compared with the results from the LGCP models. Møller and Waagepetersen (2017) give a nice review of various spatial point process modeling frameworks. We leave this as one of our future directions of research.
Acknowledgement
The authors acknowledge the support by NSF DMS-1208421, DMS-1613003, AGS-1347808, NIH P42ES027704, NASA PMM grant NNX16AE34G, and the Interdisciplinary Seed Grant in Big Data program from Texas A&M University. TRMM satellite data were provided by the NASA/Goddard Space Flight Center and PPS. MERRA was developed by the Global Modeling and Assimilation Office and supported by the NASA Modeling, Analysis and Prediction Program. Source files for the TRMM PR and MERRA-2 data can be acquired from the Goddard Earth Science Data Information Services Center (GES DISC) (https://disc.gsfc.nasa.gov/). Aaron Funk processed the TRMM PR and MERRA-2 data onto coincident temporal and spatial grids. Junho Yang calculated EOFs used in the analysis. Mikyoung Jun acknowledges helpful discussion with Yongtao Guan on point process models. The authors thank two anonymous reviewers and the Editor for their comments that helped to improve the paper.
Appendices.
A-I. Proposition
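Proposition. For $\tilde{Y}$, the s × N matrix of simulated fields in (5), the covariance matrix of $\mathrm{vec}(\tilde{Y}^{T})$ is an sN × sN block diagonal matrix. Each block is of size N × N and equals W, the approximated version of Σ in (4).
Proof. Note that the elements of $S_0$ and $S_1$ are iid N(0,1). Take any row of $S_0 B$ and denote it by $r_0$ (size 1 × N). It is easy to see that $r_0 = z_0 B$, where $z_0$ is a 1 × m row vector whose elements are iid N(0,1). Therefore,
$$\mathrm{Cov}(r_0^{T}) = B^{T} B = A U_R^{-1} (U_R^{-1})^{T} A^{T} = A R^{-1} A^{T}.$$
Now, take any row of $S_1 U_V$ and denote it by $r_1$ (size 1 × N). Similarly to the above,
$$\mathrm{Cov}(r_1^{T}) = U_V^{T} U_V = V.$$
Since $S_0$ and $S_1$ are independent, the covariance matrix of a column vector of $\tilde{Y}^{T}$ equals
$$A R^{-1} A^{T} + V = W.$$
□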
A-II. Model verification
We performed two separate procedures for model verification, and we include here the results for the WP domain as an example. The first is based on a thinning procedure: each rain location (separately for each rain type) was selected for model fitting in the verification with probability 0.5. In the second, we excluded a rectangular region in the middle of the domain (chosen arbitrarily) from the model fit. Figure A.1 shows maps of the rain locations used in, and excluded from, the verification.
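The two data-splitting schemes just described can be sketched as follows. This is an illustration only: the point locations are synthetic, the rectangle bounds are hypothetical (the paper states only that the region was chosen arbitrarily in the middle of the domain), and the function names are not from the paper.

```python
import numpy as np

def thin_points(points, keep_prob, rng):
    """Independent thinning: keep each point with probability keep_prob."""
    keep = rng.random(len(points)) < keep_prob
    return points[keep], points[~keep]

def split_by_rectangle(points, lon_range, lat_range):
    """Separate points outside a lon/lat rectangle from those inside it."""
    lon, lat = points[:, 0], points[:, 1]
    inside = ((lon >= lon_range[0]) & (lon <= lon_range[1]) &
              (lat >= lat_range[0]) & (lat <= lat_range[1]))
    return points[~inside], points[inside]

rng = np.random.default_rng(3)
# Synthetic (lon, lat) rain locations inside the WP box used in Section 4.1.
points = np.column_stack([rng.uniform(130.25, 180.0, size=2000),
                          rng.uniform(-15.25, 15.25, size=2000)])

# Procedure 1: each location is retained for model fitting with probability 0.5.
fit_thin, held_thin = thin_points(points, 0.5, rng)

# Procedure 2: a rectangle in the middle of the domain is excluded from the fit.
fit_rect, held_rect = split_by_rectangle(points, (150.0, 160.0), (-5.0, 5.0))
```

In the paper the thinning is applied separately to each rain type, which here would amount to calling `thin_points` once per type.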
Figure A.1.
(a)-(c): locations of rain for stratiform (in (a)), deep convective (in (b)), and shallow convective (in (c)). Black dots indicate thinned locations used in the verification and gray dots indicate locations not used in the verification. (d): gray rectangle shows the region excluded from the model fit in the verification procedure for all three rain types.
For both procedures, as Table A.1 shows, the estimated coefficients of the linear models are similar to those from the full domain. Furthermore, the maps of estimated intensity were indistinguishable.
Table A.1.
Estimates for coefficients of the linear model for the WP domain.
| Predictor | Full domain Str | Full domain DC | Full domain SC | Verification 1 Str | Verification 1 DC | Verification 1 SC | Verification 2 Str | Verification 2 DC | Verification 2 SC |
|---|---|---|---|---|---|---|---|---|---|
| t1 | 0.31 | 0.25 | 0.03 | 0.36 | 0.10 | −0.02 | 0.34 | 0.26 | 0.04 |
| t2 | 0.23 | 0.27 | 0.04 | 0.19 | 0.17 | −0.02 | 0.15 | 0.20 | −0.03 |
| t3 | −0.11 | −0.13 | −0.10 | −0.10 | −0.06 | −0.10 | −0.08 | −0.09 | −0.07 |
| q1 | 0.88 | 0.76 | 0.46 | 0.90 | 0.84 | 0.50 | 1.01 | 0.88 | 0.55 |
| q2 | 0.09 | 0.11 | 0.22 | 0.16 | 0.09 | 0.20 | 0.10 | 0.10 | 0.23 |
| q3 | −0.06 | −0.05 | −0.05 | −0.05 | −0.13 | −0.08 | −0.06 | −0.07 | −0.05 |
| u1 | 0.00 | 0.00 | −0.02 | −0.06 | 0.02 | −0.02 | 0.08 | 0.07 | 0.04 |
| u2 | −0.01 | −0.07 | −0.06 | −0.12 | −0.02 | −0.08 | 0.09 | 0.09 | −0.01 |
| u3 | −0.03 | −0.03 | 0.03 | 0.07 | 0.01 | 0.07 | −0.03 | −0.02 | 0.02 |
| v1 | −0.01 | 0.01 | 0.06 | −0.01 | 0.07 | 0.07 | −0.06 | −0.02 | 0.03 |
| v2 | 0.11 | 0.08 | 0.06 | 0.12 | 0.06 | 0.09 | 0.09 | 0.07 | 0.04 |
| v3 | −0.14 | −0.10 | −0.06 | −0.09 | −0.16 | −0.06 | −0.15 | −0.11 | −0.07 |
| ls | −0.14 | −0.14 | −0.08 | −0.17 | −0.16 | −0.11 | −0.14 | −0.15 | −0.07 |
| dp | −0.01 | −0.01 | −0.08 | −0.05 | −0.04 | −0.06 | 0.02 | 0.03 | −0.08 |
| dds | −0.02 | −0.05 | 0.11 | 0.14 | −0.03 | 0.13 | −0.11 | −0.13 | 0.05 |
| lh | −0.05 | −0.05 | −0.01 | 0.00 | −0.04 | −0.03 | −0.10 | −0.09 | −0.04 |
| lat | 0.09 | 0.08 | 0.11 | 0.10 | 0.13 | 0.12 | 0.09 | 0.09 | 0.13 |
A-III. Sensitivity study
We performed an experiment to study the sensitivity to the number of simulations used in the Monte Carlo approximation by varying s for the global analysis. Figure A.2 shows the relative difference (%) between the maximized likelihood value for each of s = 50, 100, 200, 400, 800, 1600 and the maximized likelihood value for s = 50. That is, we calculated 100 × |L2 − L1|/|L1|, with L1 the maximized likelihood value for s = 50 and L2 the maximized likelihood value for each of s = 50, …, 1600. As shown in the figure, the Monte Carlo approximation of the likelihood function of the LGCP appears to be stable and to approximate the likelihood well.
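The relative difference plotted in Figure A.2 is simple to compute. In the sketch below, the maximized likelihood values are made-up placeholders, not results from the paper, and the function name is hypothetical.

```python
import numpy as np

def relative_difference_percent(values, reference):
    """Return 100 * |L2 - L1| / |L1| for each L2 in values, with L1 = reference."""
    values = np.asarray(values, dtype=float)
    return 100.0 * np.abs(values - reference) / abs(reference)

s_values = [50, 100, 200, 400, 800, 1600]
# Placeholder maximized log-likelihood values, one per entry of s_values.
max_loglik = [-1000.0, -1010.0, -1016.0, -1019.0, -1019.5, -1019.6]
rel_diff = relative_difference_percent(max_loglik, reference=max_loglik[0])
```

A curve that flattens as s grows, as in the placeholder values above, is the pattern one would read as the Monte Carlo approximation having stabilized.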
Figure A.2.
Maximized likelihood values for various s. Y-axis shows the relative difference between maximized likelihood value for each s and that for s = 50.
Contributor Information
Mikyoung Jun, Department of Statistics, Texas A&M University.
Courtney Schumacher, Department of Atmospheric Sciences, Texas A&M University.
R. Saravanan, Department of Atmospheric Sciences, Texas A&M University.
REFERENCES
- Ahmed F and Schumacher C (2015), “Convective and stratiform components of the precipitation-moisture relationship,” Geophys. Res. Lett, 42, 10,453–10,462, doi: 10.1002/2015GL066957.
- Ahmed F and Schumacher C (2017), “Geographical differences in the tropical precipitation-moisture relationship and rain intensity onset,” Geophys. Res. Lett, 44, 1114–1122, doi: 10.1002/2016GL071980.
- Apanasovich TV and Genton MG (2010), “Cross-covariance functions for multivariate random fields based on latent dimensions,” Biometrika, 97, 15–30.
- Awaka J, Iguchi T, Kumagai H, and Okamoto K (1997), “Rain type classification algorithm for TRMM precipitation radar,” Geoscience and Remote Sensing, 4, 1633–1635.
- Banerjee S, Gelfand A, Finley A, and Sang H (2008), “Gaussian predictive process models for large spatial data sets,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70(4), 825–848.
- Bretherton C, Peters M, and Back L (2004), “Relationships between water vapor path and precipitation over the tropical oceans,” J. Clim, 17, 1517–1528.
- Chakraborty A, Gelfand AE, Wilson AM, Latimer AM, and Silander JA (2011), “Point pattern modelling for degraded presence-only data over large regions,” Appl. Statist, 60, 757–776.
- Chen B, Liu C, and Mapes B (2017), “Relationships between Large Precipitating Systems and Atmospheric Factors at a Grid Scale,” J. Atmos. Sci, 74, 531–552.
- Cowpertwait P, Isham V, and Onof C (2007), “Point process models of rainfall: developments for fine-scale structure,” Proceedings of the Royal Society A-Mathematical Physical and Engineering Sciences, 463, 2569–2587.
- Cowpertwait PSP (2010), “A spatio-temporal point process model with a continuous distribution of storm types,” Water Resources Research, 46, W12507.
- Cox D and Reid N (2004), “A note on pseudolikelihood constructed from marginal densities,” Biometrika, 91, 729–737.
- Cox DR (1955), “Some statistical methods related with series of events (with discussion),” Journal of the Royal Statistical Society, Series B, 17, 129–164.
- Cressie N and Huang H-C (1999), “Classes of nonseparable, spatio-temporal stationary covariance functions,” Journal of the American Statistical Association, 94, 1330–1340.
- Cuevas-Pacheco F and Møller J (2018), “Log Gaussian Cox processes on the sphere,” Spatial Statistics, 26, 69–82.
- Diggle P (1985), “A kernel method for smoothing point process data,” Applied Statistics, 34, 138–147.
- Diggle P, Rowlingson B, and Su T (2005), “Point process methodology for on-line spatio-temporal disease surveillance,” Environmetrics, 16, 423–434.
- Diggle PJ (2014), Statistical analysis of spatial and spatio-temporal point patterns (Third ed.): CRC Press.
- Diggle PJ, Moraga P, Rowlingson B, and Taylor BM (2013), “Spatial and spatio-temporal log-Gaussian Cox processes: extending the geostatistical paradigm,” Statistical Science, 28, 542–563.
- Flato G et al. (2013), IPCC, chapter 9: The Physical Science Basis, Cambridge Univ. Press.
- Frei C and Schär C (2001), “Detection probability of trends in rare events: Theory and application to heavy precipitation in the Alpine region,” Journal of Climate, 14, 1568–1584.
- Fuentes M (2007), “Approximate likelihood for large irregularly spaced spatial data,” Journal of the American Statistical Association, 102, 321–331.
- Funk A, Schumacher C, and Awaka J (2013), “Analysis of rain classifications over the tropics by Version 7 of the TRMM PR 2A23 algorithm,” J. Met. Soc. Japan, 91, 257–272.
- Gelfand AE, Diggle PJ, Fuentes M, and Guttorp P (eds.) (2010), Handbook of Spatial Statistics, CRC Press.
- Gelfand AE, Schmidt AM, Banerjee S, and Sirmans CF (2004), “Nonstationary Multivariate Process Modeling through Spatially Varying Coregionalization,” Test, 13(2), 263–312.
- Gneiting T (2002), “Nonseparable, stationary covariance functions for space-time data,” Journal of the American Statistical Association, 97(458), 590–600.
- Gneiting T (2013), “Strictly and non-strictly positive definite functions on spheres,” Bernoulli, 19, 1327–1349.
- Gneiting T, Kleiber W, and Schlather M (2010), “Matérn cross-covariance functions for multivariate random fields,” Journal of the American Statistical Association, 105, 1167–1177.
- Guan Y (2006), “A composite likelihood approach in fitting spatial point process models,” Journal of the American Statistical Association, 101, 1502–1512.
- Guan Y (2008), “A goodness-of-fit test for inhomogeneous spatial Poisson processes,” Biometrika, 95, 831–845.
- Guinness J and Fuentes M (2016), “Isotropic Covariance Functions on Spheres: Some Properties and Modeling Considerations,” Journal of Multivariate Analysis, 143, 143–152.
- Hartmann DL, Hendon H, and Houze R Jr. (1984), “Some implications of the mesoscale circulations in tropical cloud clusters for large-scale dynamics and climate,” J. Atmos. Sci, 41, 113–121.
- Heaton MJ, Katzfuss M, Berrett C, and Nychka DW (2014), “Constructing valid spatial processes on the sphere using kernel convolutions,” Environmetrics, 25, 2–15.
- Hou AY, Kakar RK, Neeck S, Azarbarzin AA, Kummerow CD, Kojima M, Oki R, Nakamura K, and Iguchi T (2014), “The Global Precipitation Measurement Mission,” Bulletin of the American Meteorological Society, DOI: 10.1175/BAMS-D-13-00164.1.
- Houze RA Jr. (2004), “Mesoscale convective systems,” Reviews of Geophysics, DOI: 10.1029/2004RG000150.
- Hung M-P, Lin J-L, Wang W, Kim D, Shinoda T, and Weaver SJ (2013), “MJO and convectively coupled equatorial waves simulated by CMIP5 climate models,” J. Climate, 26, 6185–6214.
- Jensen M and Genio AD (2006), “Factors Limiting Convective Cloud-Top Height at the ARM Nauru Island Climate Research Facility,” J. Climate, 19, 2105–2117.
- Jeong J and Jun M (2015), “A class of Matérn-like covariance functions for smooth processes on a sphere,” Spatial Statistics, 11, 1–18.
- Jeong J, Jun M, and Genton M (2017), “Spherical process models for global spatial statistics,” Statistical Science, 32, 501–513.
- Johnson RH, Rickenbach T, Rutledge S, Ciesielski P, and Schubert W (1999), “Trimodal characteristics of tropical convection,” J. Clim, 12, 2397–2418.
- Jun M (2014), “Matérn-based nonstationary cross-covariance models for global processes,” Journal of Multivariate Analysis, 128, 134–146.
- Kaczmarska J, Isham V, and Onof C (2014), “Point process models for fine-resolution rainfall,” Hydrological Sciences Journal, 59, 1972–1991.
- Kummerow C, Barnes W, Kozu T, Shiue J, and Simpson J (1998), “The Tropical Rainfall Measuring Mission (TRMM) Sensor Package,” Journal of Atmospheric and Oceanic Technology.
- Lawrence T, Baddeley A, Milne RK, and Nair G (2016), “Point pattern analysis on a region of a sphere,” STAT, 5, 144–157.
- Li W and Schumacher C (2011), “Tropical thick anvil viewed by the TRMM Precipitation Radar,” J. Climate, 24, 1718–1735.
- Mapes B, Tulich S, Lin J, and Zuidema P (2006), “The mesoscale convection life cycle: Building block or prototype for large-scale tropical waves?” Dynamics of Atmospheres and Oceans, 42, 3–29.
- Møller J, Nielsen M, Porcu E, and Rubak E (2018), “Determinantal point process models on the sphere,” Bernoulli, 24, 1171–1201.
- Møller J, Syversveen AR, and Waagepetersen RP (1998), “Log Gaussian Cox Processes,” Scandinavian Journal of Statistics, 25, 451–482.
- Møller J and Waagepetersen RP (2004), Statistical Inference and Simulation for Spatial Point Processes: Chapman & Hall/CRC.
- Møller J and Waagepetersen RP (2017), “Some recent developments in statistics for spatial point patterns,” Annual Review ofStatistics and Its Application, 4, 317–342. [Google Scholar]
- Molod A, Takacs L, Suarez M, and Bacmeister J (2015), “Development o f t h e GEOS-5 atmospheric general circulation model: evolution from MERRA to MERRA2,” Geosci. Model Dev, 8, 1339–1356. [Google Scholar]
- Peng RD, Schoenberg FP, and Woods JA (2005), “A space-time conditional intensity model for evaluating a wildfire hazard index,” Journal of the American Statistical Association, 100, 26–35. [Google Scholar]
- Porcu E, Bevilacqua M, and Genton M (2016), “Spatio-temporal covariance and crosscovariance functions of the great circle distance on a sphere,” Journal of the American Statistical Association, 111, 888–898. [Google Scholar]
- Robeson SM, Li A, and Huang C (2014), “Point-pattern analysis on the sphere,” Spatial Statistics, 10, 76–86. [Google Scholar]
- Rotunno R, Klemp J, and Weisman M (1988), “A theory for strong, long-lived squall lines,” J. Atmos. Sci, 45, 463–485. [Google Scholar]
- Sang H and Huang JZ (2012), “A full-scale approximation of covariance functions for large spatial data sets,” Journal ofthe Royal Statistical Society, Series B, 74, 111–132. [Google Scholar]
- Sang H, Jun M, and Huang JZ (2011), “Covariance approximation for large multivariate spatial datasets with an application to multiple climate model errors,” Annals ofApplied Statistics, 5, 2519–2548. [Google Scholar]
- Schlather M, Malinowski A, Menck PJ, Oesting M, and Strokorb K (2015), “Analysis, Simulation and Prediction of Multivariate Random Fields with Package Random-Fields,” Journal ofStatistical Software, 63(8), 1–25. [Google Scholar]
- Schlather M, Ribeiro P Jr., and Diggle PJ (2004), “Detecting dependence between marks and locations of marked point processes,” Journal of the Royal Statistical Soce- ity, Series B., 66, 79–93. [Google Scholar]
- Schoenberg FP (2003), “Multidimensional residual analysis of point process models for earthquake occurrences,” Journal ofthe American Statistical Association, 98, 789–795. [Google Scholar]
- Schumacher C and Houze R Jr. (2003), “The TRMM Precipitation Radar’s view of shallow, isolated rain,” J. Appl. Meteor, 42, 1519–1524. [Google Scholar]
- Schumacher C, Houze R Jr., and Kraucunas I (2004), “The tropical dynamical response to latent heating estimates derived from the TRMM Precipitation Radar,” J. Atmos. Sci, 61, 1341–1358. [Google Scholar]
- Seo DJ (1998), “Real-time estimation of rainfall fields using radar rainfall and rain gauge data,” Journal of Hydrology, 208, 37–52.
- Sherwood S, Bony S, and Dufresne J (2014), “Spread in model climate sensitivity traced to atmospheric convective mixing,” Nature, 505, 37–42.
- Shirota S and Gelfand AE (2017), “Space and circular time log Gaussian Cox processes with application to crime event data,” The Annals of Applied Statistics, 11, 481–503.
- Siino M, Adelfio G, and Mateu J (2018), “Joint second-order parameter estimation for spatio-temporal log-Gaussian Cox processes,” Stochastic Environmental Research and Risk Assessment, 32, 3525–3539.
- Simpson D, Illian J, Lindgren F, Sørbye S, and Rue H (2016), “Going off grid: computationally efficient inference for log-Gaussian Cox processes,” Biometrika, 103, 49–70.
- Stein M (2014), “Limitations on low rank approximations for covariance matrices of spatial data,” Spatial Statistics, 8, 1–19.
- Stein ML (2005a), “Nonstationary spatial covariance functions,” Technical Report 21, Center for Integrating Statistical and Environmental Science, The University of Chicago.
- Stein ML (2005b), “Space-time covariance functions,” Journal of the American Statistical Association, 100, 310–321.
- Stein ML, Chi Z, and Welty LJ (2004), “Approximating likelihoods for large spatial datasets,” Journal of the Royal Statistical Society, Series B, 66, 275–296.
- Sun Y and Stein ML (2015), “A stochastic space-time model for intermittent precipitation occurrences,” The Annals of Applied Statistics, 9, 2110–2132.
- Taylor BM, Davies TM, Rowlingson BS, and Diggle PJ (2015), “Bayesian Inference and Data Augmentation Schemes for Spatial, Spatiotemporal and Multivariate Log-Gaussian Cox Processes in R,” Journal of Statistical Software, 63(7), 1–48.
- Waagepetersen R and Guan Y (2009), “Two-step estimation for inhomogeneous spatial point processes,” Journal of the Royal Statistical Society, Series B, 71, 685–702.
- Zammit-Mangion A, Dewar M, Kadirkamanathan V, and Sanguinetti G (2012), “Point process modelling of the Afghan War Diary,” PNAS, 109, 12414–12419.
- Zhu J, Kumar A, Wang W, Hu Z-Z, Huang B, and Balmaseda M (2017), “Importance of Convective Parametrization in ENSO Predictions,” Geophys. Res. Lett., 44, doi: 10.1002/2017GL073669.







