Author manuscript; available in PMC: 2019 Nov 4.
Published in final edited form as: Ann Appl Stat. 2018 Mar 9;12(1):459–489. doi: 10.1214/17-AOAS1104

A MULTI-RESOLUTION MODEL FOR NON-GAUSSIAN RANDOM FIELDS ON A SPHERE WITH APPLICATION TO IONOSPHERIC ELECTROSTATIC POTENTIALS

Minjie Fan *, Debashis Paul *, Thomas C M Lee *, Tomoko Matsuo
PMCID: PMC6827713  NIHMSID: NIHMS1054023  PMID: 31687059

Abstract

Gaussian random fields have been one of the most popular tools for analyzing spatial data. However, many geophysical and environmental processes display non-Gaussian characteristics. In this paper, we propose a new class of spatial models for non-Gaussian random fields on a sphere based on a multi-resolution analysis. Using a special wavelet frame, named spherical needlets, as building blocks, the proposed model is constructed in the form of a sparse random effects model. The spatial localization of needlets, together with carefully chosen random coefficients, ensures that the model is non-Gaussian and isotropic. The model can also be expanded to include a spatially varying variance profile. The special formulation of the model enables us to develop efficient estimation and prediction procedures, in which an adaptive MCMC algorithm is used. We investigate the accuracy of parameter estimation of the proposed model, and compare its predictive performance with that of two Gaussian models by extensive numerical experiments. Practical utility of the proposed model is demonstrated through an application of the methodology to a data set of high-latitude ionospheric electrostatic potentials, generated from the LFM-MIX model of the magnetosphere-ionosphere system.

Keywords: Non-Gaussian random field, multi-resolution analysis, isotropic process on a sphere, MCMC, ionospheric electrostatic potential, LFM-MIX model

1. Introduction.

Gaussian random fields (GRF) have provided a very successful modeling framework for analyzing spatial data. Two key ingredients of the success of GRF modeling are: (i) the behavior of the underlying stochastic process is entirely characterized by the mean and covariance functions, thereby facilitating deep theoretical investigations; and (ii) computations for both estimation and prediction primarily involve matrix algebra. However, there are many geophysical and environmental processes that exhibit a significant degree of non-Gaussianity, such as turbulent fields [Barndorff-Nielsen (1979), Berg et al. (2016)], meteorological variables including relative vorticity of wind and oceanic currents, wind velocity, air temperature and humidity [Perron and Sura (2013)], precipitation [De Oliveira, Kedem and Short (1997), Wallin and Bolin (2015)] and ionospheric electric fields [Cousins and Shepherd (2012)].

1.1. High-latitude ionospheric electric field.

The process that motivates the methodological developments in this paper is the high-latitude ionospheric electric field, originating from solar wind-magnetosphere-ionosphere interactions. Energy and momentum deposition associated with highly variable electric fields leads to global disturbances in the Earth’s partially ionized upper atmosphere. These disturbances alter the drag force on low-Earth-orbit satellites and debris, degrading our ability to track these objects and mitigate potential collisions; they also affect radio signal propagation, hindering the robust performance of modern technological systems including telecommunication, navigation and positioning. Despite the scientific importance and societal relevance of these phenomena, current general circulation models of the upper atmosphere are incapable of adequately reproducing them. One persistent issue is the systematic underestimation of energy and momentum sources resulting from an inadequate representation of the variability of the high-latitude ionospheric electric fields [Codrescu, Fuller-Rowell and Foster (1995), Matsuo, Richmond and Hensel (2003)].

The electric fields exhibit considerable variability across a range of scales in both space and time. Cousins and Shepherd (2012) showed in their analysis of the data obtained from the Super Dual Auroral Radar Network (SuperDARN), an international network of ground-based high-frequency (HF) radars, that the small-scale spatial and temporal variability of the electric fields displays heavier-tailed behavior than a normal distribution. While considerable effort has been devoted to modeling the large-scale variability as a GRF by decomposing the process into spherical harmonics (SH) or empirical orthogonal functions (EOF) [see Cousins, Matsuo and Richmond (2013a) and references therein], little attempt has been made so far to characterize the non-Gaussian small-scale counterpart. Although SH and EOF basis functions represent large-scale features well, they are not effective in capturing small-scale details. Moreover, the small-scale variability can form a significant part of the total variability. Through analyzing the SuperDARN data, Cousins, Matsuo and Richmond (2013b) found that together with the mean, the first three EOFs, which are characterized by global spatial scales and long time scales, can only explain approximately 50% of the observed squared electric field.

Most existing numerical simulations of the upper atmosphere general circulation only account for the large-scale electric fields. Because the amount of Joule heating that results from collisions between neutrals and ions drifting under the effects of the electric and magnetic fields is proportional to the square of the electric field, neglecting the small-scale electric field variability in general circulation models can lead to significant underestimation of energy and momentum inputs into the upper atmosphere. While simple GRF models have been used to parameterize the effects of the large-scale electric field variability on general circulation models [Codrescu et al. (2000), Matsuo and Richmond (2008)], there is no viable stochastic parameterization scheme for generating the non-Gaussian small-scale electric fields in a manner consistent with observations. The focus of this paper is therefore to model the small-scale electric field variability for the purpose of incorporating such a model as an adaptive random field generator in general circulation models to account for missing energy and momentum sources. This is a crucial step toward accurately characterizing the variability of the electric fields by random field modeling, and representing the energy budget of solar wind-magnetosphere-ionosphere interactions in numerical models, for facilitating not only further scientific understanding but also practical applications.

1.2. LFM-MIX model output.

We consider a simulated data set of high-latitude ionospheric electrostatic potentials generated from the LFM-MIX model, a state-of-the-art coupled magnetosphere-ionosphere model [Lyon, Fedder and Mobarry (2004)] capable of running at multiple resolutions [Wiltberger et al. (2017)]. Note that the aforementioned electric fields are electrostatic fields given as the negative gradient of the electrostatic potentials. The domain of the LFM-MIX model relevant to magnetosphere-ionosphere coupling is the high-latitude regions (i.e., latitude ≥ 45° or ≤ −45°) of the Northern and Southern Hemispheres. LFM-MIX global simulation results were compared to real observations and empirical models of high-latitude ionospheric electrodynamics, and good agreement was reported [Zhang et al. (2011), Wiltberger et al. (2017), Kleiber et al. (2016)]. Kleiber et al. (2013) and Heaton et al. (2015) constructed spatio-temporal statistical emulators of the LFM-MIX model, and using the emulators, they estimated the input parameters of the LFM-MIX model through matching the model output with real observations.

Global simulations using the LFM-MIX model especially at higher resolutions require a considerable amount of high-performance computing resources, which by itself provides another motivation for the stochastic parameterization of the small-scale electric field variability for further numerical investigations using upper atmosphere general circulation models. Herein, we focus on the Quad resolution output of the LFM-MIX model in the high-latitude region of the Northern Hemisphere since the model output in the Southern Hemisphere can be analyzed similarly. Through an exploratory analysis of the Quad resolution model output, we found that the large-scale features (i.e., the mean and the first four EOFs) of the electrostatic potentials in the Northern Hemisphere show patterns similar to those derived from real observations in Matsuo, Richmond and Nychka (2002), Cousins, Matsuo and Richmond (2013b) [see Section S.2.1 of the supplementary material Fan et al. (2018)]. Figure 1 shows the small-scale component of the high-latitude ionospheric electrostatic potentials in the Northern Hemisphere after subtracting the estimated large-scale component (see Section 5.1). The non-Gaussianity of the small-scale component is illustrated by the Q–Q plots of the replications over time at different locations. Thus, the small-scale component of the high-latitude ionospheric electrostatic potentials is naturally a non-Gaussian random field on a sphere.

Fig. 1.


Small-scale component of the high-latitude ionospheric electrostatic potentials in the Northern Hemisphere generated from the LFM-MIX model at the Quad resolution. Top left panel: the small-scale component at the first time point; Other panels: the non-Gaussianity of the small-scale component is illustrated by the Q–Q plots of the replications over time at different locations.

1.3. Challenges and contributions.

One popular approach for dealing with non-Gaussian processes is to transform the data by some nonlinear function such that they can be modeled by a GRF; see Cressie (1993), De Oliveira, Kedem and Short (1997), Xu and Genton (2017) for examples. The success of this approach relies heavily on finding a suitable transformation for the data. As the covariance structure of the latent GRF becomes more complicated, the transformation could make the derived non-Gaussian process more difficult to interpret. Among various approaches to modeling non-Gaussian processes directly without any transformation, Palacios and Steel (2006) proposed a class of non-Gaussian spatial models based on scale mixing of a Gaussian process; Røislien and Omre (2006) developed a t-distributed random field model with heavy-tailed marginal probability density functions, which is a generalization of the familiar multivariate t-distribution; and Wallin and Bolin (2015) derived non-Gaussian geostatistical models with a Matérn covariance structure from stochastic partial differential equations driven by non-Gaussian noise.

Modeling random fields on a sphere instead of a Euclidean space poses an additional technical challenge. Processes on a sphere are usually constructed by restricting processes on $\mathbb{R}^3$ to the sphere, but this may cause physically unrealistic distortions, especially for the covariance structure at long distances [Gneiting (2013)]. There have been efforts to develop valid covariance functions on a sphere directly [see Jun and Stein (2008), Guinness and Fuentes (2016) for examples], but a covariance function alone does not suffice to fully define a non-Gaussian process. Stein (2007) and Cressie and Johannesson (2008), among others, used a fixed rank approach in which GRFs on a sphere are approximated by a linear combination of basis functions with normally distributed coefficients. The stochastic properties of the resulting process are determined by the distribution of the coefficients and the choice of the basis functions.

In this paper, we propose a new class of multi-resolution spatial models for non-Gaussian random fields on a sphere, which are constructed to mimic the characteristics of the aforementioned high-latitude ionospheric electrostatic potentials. The construction of the model starts with a sparse random effects model, represented in terms of a special multi-resolution wavelet frame named spherical needlets. The spatial localization of needlets, together with carefully chosen random coefficients in this representation, ensures that the resulting process is non-Gaussian and isotropic. The model is then expanded to include a spatially varying variance profile. Motivated by the specific application, we consider a special case where the variance depends on the latitude only. Analytical properties of needlets and the special formulation of the model enable us to develop efficient estimation and prediction procedures.

We apply the proposed model to the simulated high-latitude ionospheric electrostatic potentials generated from the LFM-MIX model. The results reveal that the amount of Joule heating is significantly increased by taking into account and modeling the small-scale variability of the electrostatic potentials using the proposed model. This indicates the necessity of including the small-scale variability in the calculation of the Joule heating. In comparison, modeling the small-scale variability using the popular Gaussian Matérn model fails to produce significantly more energy than the large-scale component only. In current general circulation models, the Joule heating rate is underestimated, and hence often multiplied by an arbitrary factor such that the latitudinal distribution of the modeled upper atmosphere temperature matches well with observed climatology [Matsuo and Richmond (2008)]. However, such an ad hoc treatment of the energy source has a number of shortcomings. In particular, since the electric fields also drive winds through the ion-drag force, the energy and momentum resulting from the high-latitude electric fields cannot be arbitrarily altered and need to be consistently treated. We also notice that, owing to its non-Gaussianity, the proposed model has a higher chance of producing extremely high energy than its Gaussian version. In fact, the upper atmosphere responds to extreme energy and momentum sources differently from moderately elevated ones. Thus, the proposed model’s ability to produce extreme energy and momentum sources promises to improve the accuracy of numerical models of magnetosphere-ionosphere coupling.

The remainder of the paper is organized as follows. In Section 2, we first give a brief introduction to spherical needlets, and using needlets as building blocks, we then construct a sparse random effects model for non-Gaussian random fields on the unit sphere. The model is also expanded to include a spatially varying variance profile. In Section 3, we describe the computational details of model fitting, prediction and unconditional simulation. Their good performance is demonstrated in Section 4 by extensive numerical experiments. In Section 5, we apply the proposed model to the high-latitude ionospheric electrostatic potentials. Some relevant issues and future directions are discussed in Section 6.

2. Model construction for non-Gaussian random fields.

In this section, we construct a class of spatial models for non-Gaussian random fields on the unit sphere through a sparse random effects model that uses a multi-resolution representation in the form of a spherical needlet frame.

2.1. Spherical needlets.

We first give a brief introduction to spherical needlets and some of their important properties. More details can be found in Marinucci and Peccati (2011), Narcowich, Petrushev and Ward (2006), Fan (2015).

Let $s$ denote a point on the unit sphere $\mathbb{S}^2 = \{x \in \mathbb{R}^3 : \|x\| = 1\}$, and let $(\theta, \phi)$ represent the same point in spherical coordinates, where $\theta$ and $\phi$ are the colatitude and longitude, respectively. Complex-valued spherical harmonics (SH), denoted by $\{Y_{lm}(\theta, \phi) : l = 0, 1, \ldots;\ m = -l, \ldots, l\}$, are the spherical analogue of the Fourier basis on the unit circle. They form an orthonormal basis for the Hilbert space $L^2(\mathbb{S}^2)$, the space of square integrable functions on the unit sphere. The index $l$ determines the frequency level of SH functions.

Needlets are constructed in terms of SH functions based on two key ideas, namely, (a) a discretization of the unit sphere; and (b) a Littlewood–Paley decomposition. The discretization of $\mathbb{S}^2$ is achieved by an exact quadrature formula: for every $l$, there exist a finite subset $\mathcal{X}_l = \{\zeta_{lk}\}_{k=1}^{n_l} \subset \mathbb{S}^2$ (quadrature points) and positive weights $\{\lambda_{lk}\}_{k=1}^{n_l}$ (quadrature weights) such that, for any polynomial $f$ of degree at most $l$,

$$\int_{\mathbb{S}^2} f(s)\,ds = \sum_{k=1}^{n_l} \lambda_{lk} f(\zeta_{lk}). \tag{2.1}$$
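As a small illustration of what an exact quadrature formula means, the sketch below uses the six vertices of a regular octahedron, a classical spherical design that integrates polynomials of degree at most 3 exactly with equal weights. This point set is chosen only for illustration; it is not the symmetric spherical t-design used in the paper, and the names `POINTS`, `WEIGHT` and `quadrature` are hypothetical.

```python
import math

# Vertices of a regular octahedron: a classical spherical 3-design
# (illustrative stand-in, not the point set used in the paper).
POINTS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
WEIGHT = 4 * math.pi / len(POINTS)  # equal weights summing to the sphere's area


def quadrature(f):
    """Weighted sum approximating the surface integral of f over the unit sphere."""
    return sum(WEIGHT * f(x, y, z) for x, y, z in POINTS)


# Exact up to degree 3: the integral of x^2 over the sphere is 4*pi/3.
approx_x2 = quadrature(lambda x, y, z: x * x)
exact_x2 = 4 * math.pi / 3

# No longer exact at degree 4: the integral of x^4 is 4*pi/5.
approx_x4 = quadrature(lambda x, y, z: x ** 4)
exact_x4 = 4 * math.pi / 5

print(approx_x2 - exact_x2, approx_x4 - exact_x4)
```

Higher-degree exactness, as required for needlets at larger frequency levels, needs correspondingly more quadrature points.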

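The Littlewood–Paley decomposition is defined through a function $b$ on $\mathbb{R}^+$ satisfying: (i) $b(\cdot) > 0$ on $(B^{-1}, B)$ for some $B > 1$, and $b(\cdot) = 0$ on $(B^{-1}, B)^c$; (ii) $\sum_{j=0}^{\infty} b^2(y/B^j) = 1$ for all $y \ge 1$; and (iii) $b(\cdot) \in C^M(\mathbb{R}^+)$ for some $M \in \mathbb{N} \cup \{\infty\}$. Based on these specifications, a class of spherical needlets is defined as follows. For $s \in \mathbb{S}^2$,

$$\psi_{jk}(s) = \sqrt{\lambda_{jk}} \sum_{l=\lceil B^{j-1} \rceil}^{\lfloor B^{j+1} \rfloor} b\!\left(\frac{l}{B^j}\right) \sum_{m=-l}^{l} Y_{lm}(\zeta_{jk}) \bar{Y}_{lm}(s) = \sqrt{\lambda_{jk}} \sum_{l=\lceil B^{j-1} \rceil}^{\lfloor B^{j+1} \rfloor} b\!\left(\frac{l}{B^j}\right) \frac{2l+1}{4\pi} P_l(\langle \zeta_{jk}, s \rangle), \tag{2.2}$$

where $j \in \mathbb{N} \cup \{0\}$ encodes the scale or frequency of needlets, (with a slight abuse of notation) $\{\zeta_{jk}\}_{k=1}^{p_j}$ are a set of quadrature points, and $\{\lambda_{jk}\}_{k=1}^{p_j}$ are the corresponding quadrature weights, where $p_j = n_{C_j}$ with $C_j = 2\lfloor B^{j+1} \rfloor$ and $n_l$ is as in (2.1). The point $\zeta_{jk}$ determines the location of the needlet $\psi_{jk}$, for each $k = 1, \ldots, p_j$. In addition, $\langle \cdot, \cdot \rangle$ denotes the dot product on $\mathbb{R}^3$, and $P_l$ represents the $l$th Legendre polynomial. Note that the last equality holds due to the Addition Theorem for SH functions [Atkinson and Han (2012), Theorem 2.9], and it also shows that needlets are real-valued functions.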

Needlets possess several attractive properties. They are localized in both the spatial and frequency domains, with quasi-exponentially increasing concentration around the quadrature point $\zeta_{jk}$ as the frequency level $j$ increases. Together with $Y_{00}$, a constant function on the unit sphere, the collection of needlets also forms a Parseval tight frame, that is, for any $f \in L^2(\mathbb{S}^2)$,

$$\int_{\mathbb{S}^2} |f(s)|^2\,ds = |\langle f, Y_{00} \rangle_{L^2}|^2 + \sum_{j=0}^{\infty} \sum_{k=1}^{p_j} |\langle f, \psi_{jk} \rangle_{L^2}|^2,$$

where $\langle \cdot, \cdot \rangle_{L^2}$ denotes the inner product on $L^2(\mathbb{S}^2)$ and

$$\langle f, \psi_{jk} \rangle_{L^2} = \int_{\mathbb{S}^2} f(s) \psi_{jk}(s)\,ds$$

is called the needlet coefficient of $f$ corresponding to the index pair $(j, k)$, denoted by $\beta_{jk}$. The tight frame property, together with the localization in both the spatial and frequency domains, ensures that needlets can be used to perform a multiresolution analysis of functions in $L^2(\mathbb{S}^2)$.

In our construction, we use the symmetric spherical t-designs on $\mathbb{S}^2$ [Womersley (2015)] as the quadrature points. The number of quadrature points is $n_l = l^2/2 + l/2 + O(1)$, and the quadrature weights are all equal to $4\pi/n_l$. Moreover, the function $b$ is chosen based on the second specification in Marinucci and Peccati (2011, Chapter 10.2.2) with $b(\cdot) \in C^\infty$ [see Section S.1.1 of the supplementary material Fan et al. (2018) for details], and $B$ is specified as 2 according to Narcowich, Petrushev and Ward (2006).
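To make definition (2.2) concrete, the following sketch evaluates a needlet as a function of the cosine of the angle between $s$ and the quadrature point, using the Legendre form of (2.2). The window `b_squared` is built from a standard smooth bump-function recipe and is an assumption: the exact specification used in the paper is given in its supplementary material and may differ. The quadrature weight is passed in as a plain argument, and all function names are hypothetical.

```python
import math

B = 2.0  # bandwidth parameter, as specified in the text


def _bump(t):
    """Smooth bump supported on (-1, 1)."""
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0


def _simpson(f, a, b, n=400):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


_NORM = _simpson(_bump, -1.0, 1.0)


def _smooth_step(u):
    """Increases smoothly from 0 at u = -1 to 1 at u = 1."""
    if u <= -1.0:
        return 0.0
    if u >= 1.0:
        return 1.0
    return _simpson(_bump, -1.0, u) / _NORM


def _low_pass(t):
    """Equals 1 on [0, 1/B] and decreases smoothly to 0 at t = 1."""
    if t <= 1.0 / B:
        return 1.0
    if t >= 1.0:
        return 0.0
    return _smooth_step(1.0 - 2.0 * B / (B - 1.0) * (t - 1.0 / B))


def b_squared(x):
    """Assumed window b^2: positive on (1/B, B), zero elsewhere."""
    return max(_low_pass(x / B) - _low_pass(x), 0.0)


def legendre(l_max, x):
    """Values P_0(x), ..., P_{l_max}(x) via the three-term recurrence."""
    p = [1.0, x]
    for n in range(1, l_max):
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    return p[: l_max + 1]


def needlet(j, cos_angle, weight):
    """psi_jk(s) from (2.2), with cos_angle = <zeta_jk, s>."""
    lo, hi = math.ceil(B ** (j - 1)), math.floor(B ** (j + 1))
    p = legendre(hi, cos_angle)
    return math.sqrt(weight) * sum(
        math.sqrt(b_squared(l / B ** j)) * (2 * l + 1) / (4 * math.pi) * p[l]
        for l in range(lo, hi + 1)
    )


# The needlet is largest at its own quadrature point and small far from it.
print(needlet(3, 1.0, 1.0), needlet(3, 0.0, 1.0))
```

Because the window is a difference of two shifted low-pass functions, the sum of `b_squared(y / B**j)` over $j$ telescopes to 1 for $y \ge 1$, which is condition (ii) above.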

2.2. Sparse random effects model.

In this subsection, we construct a sparse random effects model for scalar random fields on the unit sphere using needlets as building blocks. Let $\{X(s) : s \in \mathbb{S}^2\}$ be a zero-mean scalar random field on the unit sphere, which can be represented in terms of needlets, that is,

$$X(s) = \sum_{j=J_0}^{J} \sum_{k=1}^{p_j} c_{jk} \psi_{jk}(s), \tag{2.3}$$

where $J \ge J_0 \ge 0$ determine the frequency range of the information contained in the process. In general, the representation is not unique because $\{\psi_{jk}\}_{j,k}$ form a tight frame, which is an overcomplete system. Thus, we need to impose certain probabilistic assumptions on the random coefficients $c_{jk}$ such that meaningful inference on model parameters is enabled with pooling of information. Another consideration is to make the process $X$ non-Gaussian. This can be achieved by assuming that the $c_{jk}$ have a non-Gaussian joint probability distribution. In order to incorporate both scale-dependent variations and non-Gaussianity, we make the following key structural assumptions:

  • For each index pair $(j, k)$, there are parameters $\nu_{jk} > 2$ and $\sigma_{jk} > 0$ such that $c_{jk} \sim \sigma_{jk}\, t(\nu_{jk})$, where $t(\nu)$ denotes the t-distribution with $\nu$ degrees of freedom. In addition, the $c_{jk}$ are independent.

The independence assumption on the coefficients cjk’s is motivated by the fact that, for a two-weakly isotropic process satisfying mild regularity conditions, the correlation between pairs of needlet coefficients decays rapidly as the corresponding quadrature points and frequency levels become more separated [Baldi et al. (2009), Marinucci and Peccati (2011)].

The t-distribution assumption and the spatial localization of needlets ensure non-Gaussian stochastic properties of the process X because for any given location s, X(s) is approximately a weighted sum of a small number of cjk’s, which are independent and non-Gaussian. The same does not hold if needlets are replaced by basis functions with global support, such as SH functions, since in that case, X(s) becomes a weighted sum of a large number of cjk’s and the central limit theorem kicks in, resulting in approximately Gaussian behavior. This specification also has a distinct computational advantage, which will be discussed in Section 3.2.

We have the following theorem characterizing the covariance structure of the process X in a special case.

Theorem 1. If we consider the special case where σjk = σj and νjk = νj with νj > 2 for all k = 1, …, pj, the resulting process X is two-weakly isotropic with an oscillating covariance function

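$$C(s,t) = \operatorname{Cov}(X(s), X(t)) = \sum_{j=J_0}^{J} \frac{\nu_j \sigma_j^2}{\nu_j - 2} \sum_{l} b^2\!\left(\frac{l}{B^j}\right) \frac{2l+1}{4\pi} P_l(\langle s, t \rangle), \tag{2.4}$$

where $s, t \in \mathbb{S}^2$. The covariance function decays quasi-exponentially with respect to the great-circle distance between the two locations $s$ and $t$:

$$|C(s,t)| \le \sum_{j=J_0}^{J} \frac{\nu_j \sigma_j^2}{\nu_j - 2} \cdot \frac{c_M B^{2j}}{[1 + B^j \arccos(\langle s, t \rangle)]^M}, \tag{2.5}$$

where $c_M$ is some constant depending on $M$. In particular, when $b(\cdot) \in C^\infty$, (2.5) holds for all $M$.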

The proof is deferred to Appendix A. The isotropy of the process is mainly due to the assumption that σjk and νjk do not vary with respect to k. Extensions to anisotropic models are possible by allowing σjk and νjk to vary spatially, for example, σjk = f(ζjk)σj and νjk = g(ζjk)νj for certain functions f and g. For simplicity, in the rest of the paper, we focus on the special case where σjk = σj and νjk = νj = ν.
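As a quick consistency check on (2.4), not part of the original proof, one can set $s = t$. Since $P_l(\langle s, s \rangle) = P_l(1) = 1$ for every $l$, and a $\sigma_j t(\nu_j)$ variable with $\nu_j > 2$ has variance $\nu_j \sigma_j^2 / (\nu_j - 2)$, the marginal variance of the process is

```latex
\operatorname{Var}(c_{jk}) = \frac{\nu_j \sigma_j^2}{\nu_j - 2},
\qquad
\operatorname{Var}\{X(s)\} = C(s,s)
  = \sum_{j=J_0}^{J} \frac{\nu_j \sigma_j^2}{\nu_j - 2}
    \sum_{l} b^2\!\left(\frac{l}{B^j}\right) \frac{2l+1}{4\pi}.
```

The right-hand side does not depend on $s$, which agrees with the isotropy stated in Theorem 1.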

The derived covariance function (2.4) has some connections with the existing literature. Lindgren, Rue and Lindström (2011) discussed an approach for constructing scalar GRFs on $\mathbb{R}^d$ and $\mathbb{S}^2$ with oscillating covariance functions through stochastic partial differential equations. These covariance functions have an oscillating structure similar to our proposal, but they do not necessarily decay quasi-exponentially with respect to the great-circle distance. Moreover, there is no closed-form expression for the covariance functions when they are defined on $\mathbb{S}^2$. Schoenberg (1942) showed that $\{C(\theta) : 0 \le \theta \le \pi\}$ is a valid isotropic continuous covariance function on $\mathbb{S}^2$ if and only if it has the following representation in terms of Legendre polynomials:

$$C(\theta) = \sum_{l=0}^{\infty} a_l P_l(\cos\theta),$$

where $\theta$ is the great-circle distance between two locations, $a_l \ge 0$ for all $l \ge 0$ and $\sum_{l=0}^{\infty} a_l < \infty$. This general form was used by Terdik et al. (2015) and Guinness and Fuentes (2016), who chose particular $a_l$'s to construct isotropic covariance functions on $\mathbb{S}^2$.

We now consider an observation model based on the above formulation. Suppose that we have observations on X at n different locations, s1,…,sn, and these observations, denoted by Z(si), are corrupted by observational errors. Then

$$Z(s_i) = X(s_i) + e_i, \quad i = 1, \ldots, n, \tag{2.6}$$

where the $e_i$ are the observational errors modeled as i.i.d. $N(0, \tau^2)$. In practice, we can specify the parameter $J$ as some value $J_{\max}$ based on data resolution. The parameter $\sigma_j$ usually decays as the frequency level $j$ increases. For example, we may assume that $\sigma_j = f(j)\sigma$, where $f(j) = B^{-\alpha j/2}$ with $\alpha > 2$ [see Section S.1.2 of the supplementary material Fan et al. (2018) for justification]. The hyperparameter $\alpha$ controls the decay rate of the magnitude of the coefficients $c_{jk}$ from low to high frequency levels. In general, $\sigma_j$ can be estimated without imposing any particular structure.

In Figure 2, we give a few examples of the correlation function of the process $X$ with $\sigma_j = 2^{-\alpha j/2}$ for various $\alpha$ and ranges of $j$. The curve oscillates more dramatically as $\alpha$ increases, and decays more rapidly as $J_0$ becomes larger. The parameter $\nu$ has no impact on the correlation structure, but it measures the degree of non-Gaussianity of the process. In all the cases, the curve decays quasi-exponentially fast, as shown in Theorem 1.

Fig. 2.


(a) Correlation functions for various α with j ranging from 2 to 4; (b) correlation functions for various ranges of j with α = 3.

To demonstrate the effect of the non-Gaussianity assumption for the coefficients $c_{jk}$, we simulate the process $X$ with $c_{jk}$ distributed as $\sigma_j t(\nu)$ and, for comparison, with $c_{jk}$ normally distributed with the same variance. We specify $\sigma_j = 2^{-\alpha j/2}$ with $\alpha = 3$, $J_0 = 2$, $J = 4$ and $\nu = 2.5$. Figure 9 in Section S.3 of the supplementary material [Fan et al. (2018)] shows the projection of the simulations onto frequency levels $j = 3, 4$, that is, $X^{(j)} := \sum_{k=1}^{p_j} c_{jk} \psi_{jk}$. We can see that the non-Gaussian coefficients yield more positive and negative extreme values than the Gaussian ones. The Q–Q plots of 10,000 i.i.d. simulations of the process $X$ at a specific location, displayed in Figure 10 in Section S.3 of the supplementary material [Fan et al. (2018)], confirm its non-Gaussianity when the $c_{jk}$ are t-distributed. Checking the non-Gaussianity at one location is sufficient due to the isotropy of the process.

2.3. Axially symmetric model with a variance profile.

Spatial data on a global scale usually exhibit nonstationary behavior. In this subsection, we therefore expand the sparse random effects model to include a spatially varying variance profile. Let $\{g(s), s \in \mathbb{S}^2\}$ be a function that characterizes the spatially varying variance structure. The function $g$ is modeled through a log-linear basis representation

$$g(s) = \exp(\mathbf{b}^T(s)\eta), \tag{2.7}$$

where $\mathbf{b}(s) = (1, b_1(s), \ldots, b_r(s))^T$ is a vector of basis functions (including the intercept) evaluated at location $s$, and $\eta = (\eta_0, \eta_{-0}^T)^T$ is the corresponding coefficient vector. Through an exploratory analysis of the LFM-MIX model output, we notice that the variance of the high-latitude ionospheric electrostatic potentials shows a strong dependence on the latitude, but only a moderate dependence on the longitude. Motivated by this, for simplicity and interpretability, we ignore the longitudinal dependence, and assume that $g(s)$ is a function of the colatitude $\theta$ only, that is,

$$g(s) \equiv g(\theta) = \exp(\mathbf{b}^T(\theta)\eta). \tag{2.8}$$

Moreover, the variance is assumed to vary smoothly over $\mathbb{S}^2$ and, therefore, the basis functions are specified as cubic B-splines due to their numerical stability, with the first B-spline replaced by the intercept. Then we have the following non-Gaussian model with a variance profile:

$$X(s) = g(\theta) \sum_{j=J_0}^{J} \sum_{k=1}^{p_j} c_{jk} \psi_{jk}(s). \tag{2.9}$$

Since the function g depends on the latitude only, this model is axially symmetric [Jones (1963)]. We refer to the model given by (2.8) and (2.9) as the axially symmetric non-Gaussian needlet model, abbreviated as AXING-need. As a generalization of the model, one may use known functions of spatial covariates instead of b(θ) in the log-linear basis representation (2.8) if they are available.
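The log-linear profile (2.8) can be sketched as follows, with cubic B-splines evaluated by the Cox-de Boor recursion and the first B-spline replaced by an intercept. The knot placement, the coefficient values in `ETA` and all function names are hypothetical choices for illustration; the text does not specify them.

```python
import math


def bspline_basis(x, knots, degree=3):
    """All B-spline basis values at x via the Cox-de Boor recursion."""
    n = len(knots) - degree - 1
    # Degree-0 indicators; the last nonempty interval is closed on the right.
    last = max(i for i in range(len(knots) - 1) if knots[i] < knots[i + 1])
    vals = [
        1.0 if (knots[i] <= x < knots[i + 1]) or (i == last and x == knots[i + 1]) else 0.0
        for i in range(len(knots) - 1)
    ]
    for d in range(1, degree + 1):
        new = []
        for i in range(len(knots) - d - 1):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            left = (x - knots[i]) / left_den * vals[i] if left_den > 0 else 0.0
            right = (knots[i + d + 1] - x) / right_den * vals[i + 1] if right_den > 0 else 0.0
            new.append(left + right)
        vals = new
    return vals[:n]


def variance_profile(theta, eta, knots):
    """g(theta) = exp(b(theta)^T eta), first B-spline replaced by the intercept."""
    basis = bspline_basis(theta, knots)
    design = [1.0] + basis[1:]
    return math.exp(sum(bi * ei for bi, ei in zip(design, eta)))


# Hypothetical setup: colatitudes from 0 to 45 degrees, two interior knots.
THETA_MAX = math.pi / 4
KNOTS = [0.0] * 4 + [THETA_MAX / 3, 2 * THETA_MAX / 3] + [THETA_MAX] * 4
ETA = [0.0, 0.5, -0.3, 0.2, 0.1, -0.4]  # eta_0 first, then eta_{-0}

print([round(variance_profile(t, ETA, KNOTS), 4) for t in (0.0, 0.3, 0.6, THETA_MAX)])
```

The exponential link keeps $g$ strictly positive for any coefficient vector, so no constraint on $\eta$ is needed.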

We now consider the simple observation model (2.6) with X defined by (2.9). Denote the vector of observations by Z = (Z1,…,Zn)T, where Zi = Z(si), and the vector of observational errors by e = (e1,…,en)T. The AXING-need model can be written in matrix-vector form

Z=GAc+e, (2.10)

where G := diag{g(s1),…,g(sn)}, A is the n-by-∑j pj design matrix with the ith row corresponding to the value of needlets at si, and columns ordered according to the index pair (j,k), and c is the coefficient vector consisting of cjk’s.

3. Model fitting and prediction.

In this section, we describe the computational details of model fitting, prediction and unconditional simulation. The latter two are used to test the performance of the proposed AXING-need model, and compare it with Gaussian models. Computing maximum likelihood estimates (MLE) in the AXING-need model is intractable due to the absence of a closed form of the likelihood for the parameters. Evaluation of the likelihood requires high-dimensional integrals over the distribution of the random coefficients $c_{jk}$. Thus, we adopt a Bayesian approach, and implement a Markov Chain Monte Carlo (MCMC) algorithm that samples from the posterior distribution of the parameters. Before going into the details, we impose the following structural restrictions for the ease and stability of the computation: (i) the parameter $\nu$ is fixed and known [Wolfe, Godsill and Ng (2004)]; and (ii) because the parameters $\eta_0$ and $\sigma_{J_0}$ are not separately identifiable, we fix $\eta_0$ at 0 and treat $\sigma_{J_0}$ as a free parameter, so that the latter can be sampled in a Gibbs step. According to our numerical experiments, this is more efficient than sampling $\eta_0$. We denote the vector of the remaining parameters by $\boldsymbol{\theta} = (\sigma_{J_0}^2, \ldots, \sigma_J^2, \tau^2, \eta_{-0}^T)$.

In the following derivation of the algorithm, generic notations are used: [U] denotes the distribution (or density) of a random variable U and [U|V] denotes the conditional distribution (or density) of U given V.

3.1. Prior specification.

We assume that the parameters are a priori independent, that is, $[\boldsymbol{\theta}] = [\sigma_{J_0}^2] \cdots [\sigma_J^2][\tau^2][\eta_{-0}]$. The prior distributions of $\sigma_j^2$ and $\tau^2$ are specified as the noninformative Jeffreys’ priors $[\sigma_j^2] = 1/\sigma_j^2$ and $[\tau^2] = 1/\tau^2$. The prior distribution of $\eta_{-0}$ is assumed to be $N(0, \tau_\eta^2 I_r)$, where the hyperparameter $\tau_\eta^2$ is chosen to be sufficiently large such that the prior distribution is nearly noninformative.

3.2. Adaptive MCMC.

Implementation of an MCMC sampling scheme for the AXING-need model seems challenging due to the dimensionality and distribution of the random coefficients $c_{jk}$. One important observation that significantly reduces the computational burden is that the t-distribution (for the $c_{jk}$) belongs to the class of scale mixtures of Gaussians (SMOG) [Andrews and Mallows (1974), West (1987)]. Thus, the coefficient $c_{jk}$ can be expressed as $\sqrt{V_{jk}}\, G_{jk}$, where $V_{jk} \sim IG(\nu/2, \nu\sigma_j^2/2)$ with $IG(\alpha, \beta)$ denoting an inverse gamma distribution, and $G_{jk} \sim N(0, 1)$ independent of $V_{jk}$. Since the $c_{jk}$ are independent, the same holds for the $V_{jk}$ and the $G_{jk}$. This representation leads to an efficient MCMC algorithm in which a Gibbs sampler is used.
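The scale-mixture representation can be checked numerically with a short sketch: draw $V$ from the inverse gamma distribution, multiply its square root by a standard normal draw, and compare the sample variance with $\nu\sigma^2/(\nu - 2)$, the variance of $\sigma\, t(\nu)$. The values of `nu` and `sigma` below are illustrative assumptions (a larger $\nu$ than in the paper's examples is used so that the sample variance is stable), and $IG(\alpha, \beta)$ is taken to have shape $\alpha$ and scale $\beta$.

```python
import math
import random


def sample_coefficient(nu, sigma, rng):
    """Draw c = sqrt(V) * G with V ~ IG(nu/2, nu*sigma^2/2) and G ~ N(0, 1)."""
    shape, rate = nu / 2.0, nu * sigma ** 2 / 2.0
    # If W ~ Gamma(shape, scale=1/rate), then 1/W ~ IG(shape, rate).
    v = 1.0 / rng.gammavariate(shape, 1.0 / rate)
    return math.sqrt(v) * rng.gauss(0.0, 1.0)


rng = random.Random(0)
nu, sigma = 10.0, 1.5  # illustrative values only
draws = [sample_coefficient(nu, sigma, rng) for _ in range(100_000)]

sample_var = sum(c * c for c in draws) / len(draws)  # mean is zero by symmetry
theory_var = nu * sigma ** 2 / (nu - 2.0)  # variance of sigma * t(nu)
print(sample_var, theory_var)
```

Conditional on $V_{jk}$, each coefficient is Gaussian, which is what gives the Gibbs sampler its closed-form updates.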

Let $V$ denote the vector obtained by stacking the $V_{jk}$, and $\sigma^2$ denote the vector consisting of $\sigma_{J_0}^2, \ldots, \sigma_J^2$. We use a Gibbs sampler to sample from $[c, V, \boldsymbol{\theta} \mid Z]$ so that the full conditional distributions of $c$, $V$, $\sigma^2$ and $\tau^2$ all have closed forms. In particular, the full conditional distribution of $c$ (i.e., $[c \mid Z, V, \boldsymbol{\theta}]$) is multivariate Gaussian. Sampling from it requires $O(p^3)$ operations with $p = \sum_j p_j$, which is computationally intractable for large $p$. Nonetheless, our numerical experiments indicate that the subblocks $[c_j \mid Z, V, \boldsymbol{\theta}]$, $j = J_0, \ldots, J$, are weakly correlated, where $c_j = (c_{j1}, \ldots, c_{jp_j})^T$. This can be attributed to the fact that needlets are spatially localized and bandlimited. Thus, the sampling step for $c$ is achieved through successive draws from the conditional subblocks $[c_j \mid Z, V, \boldsymbol{\theta}, c_{-j}]$, $j = J_0, \ldots, J$, where $c_{-j}$ denotes the vector obtained by dropping $c_j$ from $c$. Partitioning the vector $c$ into subblocks avoids the Cholesky decomposition of a large-scale matrix, while the weak correlation among pairs of the subblocks mitigates the potential problem of slow convergence for the Gibbs sampler. For sufficiently large $j$, we may further speed up the computation by sampling from $[c_{jk} \mid Z, V, \boldsymbol{\theta}, c_{-jk}]$, $k = 1, \ldots, p_j$. The full conditional distribution of $\eta_{-0}$ is not available in closed form. Thus, we sample from $[\eta_{-0} \mid Z, c, V, \sigma^2, \tau^2]$ using an adaptive Metropolis step [Andrieu and Thoms (2008), Algorithm 4], and incorporate it into the Gibbs sampler.

Let Aj denote the columns of A corresponding to j, and Vj = (Vj1,…,Vjpj)T. Then the aforementioned adaptive Metropolis-within-Gibbs sampler is summarized as follows:

Algorithm 1.

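  1. Sample $c_j$ from $[c_j\mid Z,V,\theta,c_{-j}]=N(\hat\mu_j,\hat\Sigma_j)$, where
    $$\hat\Sigma_j=\tau^2\left(A_j^TG^2A_j+\tau^2\,\mathrm{diag}(V_j)^{-1}\right)^{-1},$$
    and
    $$\hat\mu_j=\frac{1}{\tau^2}\hat\Sigma_jA_j^TG\left(Z-GA_{-j}c_{-j}\right),$$
    with $A_{-j}$ denoting the matrix obtained by dropping the columns $A_j$ from $A$.
  2. Sample $V$ from $[V\mid Z,c,\theta]$, where the $V_{jk}\mid Z,c,\theta$ are independent and distributed as
    $$\mathrm{IG}\left(\frac{\nu+1}{2},\frac{c_{jk}^2+\nu\sigma_j^2}{2}\right).$$
  3. Sample $\sigma^2$ from $[\sigma^2\mid Z,c,V,\tau^2,\eta]$, where the $\sigma_j^2\mid Z,c,V,\tau^2,\eta$ are independent and distributed as
    $$\mathrm{G}\left(\frac{\nu p_j}{2},\frac{\nu}{2}\sum_{k=1}^{p_j}\frac{1}{V_{jk}}\right),$$
    where $\mathrm{G}(\alpha,\beta)$ denotes a gamma distribution.
  4. Sample $\tau^2$ from
    $$[\tau^2\mid Z,c,V,\sigma^2,\eta]=\mathrm{IG}\left(\frac{n}{2},\frac{(Z-GAc)^T(Z-GAc)}{2}\right).$$
  5. Sample $\eta_{-0}$ using the adaptive Metropolis step from
    $$[\eta_{-0}\mid Z,c,V,\sigma^2,\tau^2]\propto\exp\left\{-\frac{1}{2\tau^2}(Z-GAc)^T(Z-GAc)\right\}\exp\left\{-\frac{1}{2\tau_\eta^2}\eta_{-0}^T\eta_{-0}\right\}.$$
    The proposal distribution is chosen as
    $$Q(\eta_{-0}^*\mid\eta_{-0})\sim N(\eta_{-0},\gamma\Sigma),$$
    where $\gamma$ is a parameter adaptively tuned with the goal of achieving the optimal acceptance rate [Gelman, Roberts and Gilks (1996)], and $\Sigma$ is adaptively updated to approximate the covariance matrix of the full conditional distribution of $\eta_{-0}$.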
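Steps 1 to 4 can be sketched in a few lines for a toy problem. The sketch below makes several simplifying assumptions that are not in the original: a single needlet level, the single-site variant of step 1 (each coefficient drawn from its univariate Gaussian full conditional), the gamma distribution in step 3 read as shape and rate, and a fixed profile g so that step 5 is skipped. The names `gibbs_sweep` and `inv_gamma` are hypothetical.

```python
import math
import random


def inv_gamma(rng, shape, rate):
    """Draw from IG(shape, rate) as the reciprocal of a Gamma(shape, rate) draw."""
    return 1.0 / rng.gammavariate(shape, 1.0 / rate)  # gammavariate uses a scale


def gibbs_sweep(Z, A, g, c, V, sigma2, tau2, nu, rng):
    """One sweep of steps 1-4 for a single level; c and V are updated in place."""
    n, p = len(Z), len(c)
    GA = [[g[i] * A[i][k] for k in range(p)] for i in range(n)]  # rows of G A
    fit = [sum(GA[i][k] * c[k] for k in range(p)) for i in range(n)]
    # Step 1 (single-site variant): c_k given the rest is univariate Gaussian.
    for k in range(p):
        col = [GA[i][k] for i in range(n)]
        # Partial residual Z - G A_{-k} c_{-k}.
        partial = [Z[i] - fit[i] + col[i] * c[k] for i in range(n)]
        prec = sum(x * x for x in col) / tau2 + 1.0 / V[k]
        mean = sum(x * r for x, r in zip(col, partial)) / tau2 / prec
        new_ck = rng.gauss(mean, math.sqrt(1.0 / prec))
        fit = [fit[i] + col[i] * (new_ck - c[k]) for i in range(n)]
        c[k] = new_ck
    # Step 2: V_k ~ IG((nu + 1)/2, (c_k^2 + nu * sigma^2)/2).
    for k in range(p):
        V[k] = inv_gamma(rng, (nu + 1.0) / 2.0, (c[k] ** 2 + nu * sigma2) / 2.0)
    # Step 3: sigma^2 ~ Gamma(nu * p / 2, rate = (nu / 2) * sum_k 1/V_k).
    rate = nu / 2.0 * sum(1.0 / v for v in V)
    sigma2 = rng.gammavariate(nu * p / 2.0, 1.0 / rate)
    # Step 4: tau^2 ~ IG(n/2, RSS/2) with RSS = ||Z - G A c||^2.
    rss = sum((Z[i] - fit[i]) ** 2 for i in range(n))
    tau2 = inv_gamma(rng, n / 2.0, rss / 2.0)
    return sigma2, tau2
```

A block update of an entire level would replace the inner loop of step 1 by one multivariate Gaussian draw with the mean and covariance given in the algorithm; the single-site form is used here only to keep the sketch free of linear algebra routines.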

Choosing a good initial value of the parameter vector $\theta$ is crucial for successful convergence of the algorithm. We provide a simple but effective method to specify the initial value through fitting the model with $c_{jk}\sim N(0,\nu\sigma_j^2/(\nu-2))$ for all $j$, $k$. The MLE under this misspecified model, which assumes a Gaussian process for the observations, is easily computable, and close to the true value according to our numerical experiments. Thus, it is reasonable to specify the initial value of $\theta$ in terms of the MLE under the misspecified model. By default, the initial values of $c$ and $V$ are specified as a zero vector and an all-ones vector, respectively.

3.3. Spatial prediction.

One main focus of spatial statistics is to predict values at unobserved locations based on observed values. In this subsection, we present a spatial prediction method for the proposed AXING-need model. Let $Z^*=(Z(t_1),\ldots,Z(t_{n_P}))^T$, where $t_1,\ldots,t_{n_P}$ are $n_P$ unobserved locations. Then we have $Z^*=G^*A^*c+e^*$, where $G^*$, $A^*$ and $e^*$ are $G$, $A$ and $e$ evaluated at the unobserved locations, respectively. From a fully Bayesian perspective, the posterior predictive distribution is

$$[Z^*\mid Z=z]=\int_{\theta,c}[Z^*\mid\theta,c,Z=z]\,[\theta,c\mid Z=z]\,d\theta\,dc,\tag{3.1}$$

where $z$ is the actual observed value of $Z$, and

$$Z^*\mid\theta,c,Z=z\sim N(G^*A^*c,\tau^2I_{n_P}).\tag{3.2}$$

The posterior predictive distribution does not have a closed form, but it can be sampled using the MCMC samples from $[\theta,c\mid Z=z]$. Let $\{(\theta^{(l)},c^{(l)}),l=1,\ldots,L\}$ denote the MCMC samples. For each $l$, we draw a sample $Z^{*(l)}$ from $[Z^*\mid\theta^{(l)},c^{(l)},Z=z]$ in (3.2). The empirical distribution of $\{Z^{*(l)},l=1,\ldots,L\}$ approximates the posterior predictive distribution. Thus, the posterior predictive mean, standard deviation, quantiles and intervals can all be computed from these samples. For example, the posterior predictive mean is estimated by $L^{-1}\sum_{l=1}^{L}Z^{*(l)}$, and the $(1-\alpha)100\%$ posterior predictive interval is estimated by the $(100\alpha/2)$th and $(100(1-\alpha/2))$th percentiles of $\{Z^{*(l)},l=1,\ldots,L\}$.
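The sample-based summaries can be sketched as follows for a single prediction location. The sketch assumes that the predictive means (the relevant entry of G*A*c for each MCMC sample) and the values of τ have already been extracted from the chain; the function names and the linear-interpolation rule for percentiles are illustrative choices, not taken from the original.

```python
import math
import random


def empirical_quantile(sorted_values, q):
    """Quantile of sorted data by linear interpolation between order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def predictive_summary(mean_draws, tau_draws, alpha, rng):
    """Posterior predictive mean and (1 - alpha) interval at one location.

    mean_draws[l] is the predictive mean under MCMC sample l and
    tau_draws[l] is the corresponding error standard deviation.
    """
    # One draw of Z* per MCMC sample, as in (3.2).
    z_star = sorted(m + t * rng.gauss(0.0, 1.0) for m, t in zip(mean_draws, tau_draws))
    mean = sum(z_star) / len(z_star)
    lower = empirical_quantile(z_star, alpha / 2.0)
    upper = empirical_quantile(z_star, 1.0 - alpha / 2.0)
    return mean, lower, upper
```

Other percentile conventions would give slightly different interval endpoints for small L; with the thinned chains used here the difference should be minor.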

3.4. Unconditional simulation.

Utility of spatial models is not restricted to spatial prediction. For example, Genton and Kleiber (2015, Section 7.4) emphasized the importance of using spatial models for simulation of random fields. It is also well known that a misspecified spatial model does not necessarily mean significantly inferior predictive performance [Stein (1988)]. Thus, simulation of random fields from a fitted spatial model provides an alternative way to test the performance of the model. Specifically, it is used to check whether the proposed AXING-need model can successfully capture the characteristics of data. The following algorithm describes the details of simulating a random field from a fitted AXING-need model (2.10) at locations $s_1,\ldots,s_{n_S}$.

Algorithm 2.

  1. To incorporate the uncertainty of parameter estimates into the simulation, we randomly select one of the MCMC samples $\{\theta^{(l)},l=1,\ldots,L\}$, denoted by $\hat\theta=(\hat\sigma_{J_0}^2,\ldots,\hat\sigma_J^2,\hat\tau^2,\hat\eta_{-0})$, as an estimate of the parameters. The matrix $G$ is evaluated with $\eta_{-0}=\hat\eta_{-0}$ at the locations $s_1,\ldots,s_{n_S}$, and the design matrix $A$ is also evaluated at these locations.

  2. Generate a coefficient vector $c$, where $c_{jk}\sim\hat\sigma_jt(\nu)$ independently for all $j$, $k$.

  3. Then X = GAc is an unconditional simulation of random fields from the fitted AXING-need model, where the term of observational errors is omitted.
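The three steps can be sketched as below. The data layout is an assumption made for illustration only: each MCMC sample is a dictionary holding the per-level scales under the key "sigma" and the profile g already evaluated at the simulation locations under the key "g"; `level_of_column[k]` gives the level index of column k of A. None of these names come from the original.

```python
import math
import random


def student_t(nu, rng):
    """t(nu) draw as N(0, 1) / sqrt(chi2_nu / nu), with chi2_nu = Gamma(nu/2, scale 2)."""
    return rng.gauss(0.0, 1.0) / math.sqrt(rng.gammavariate(nu / 2.0, 2.0) / nu)


def simulate_field(samples, A, level_of_column, nu, rng):
    """Unconditional simulation X = G A c from one randomly chosen MCMC sample."""
    theta = rng.choice(samples)  # step 1: pick a posterior sample
    p = len(level_of_column)
    # Step 2: independent coefficients c_jk = sigma_j * t(nu).
    c = [theta["sigma"][level_of_column[k]] * student_t(nu, rng) for k in range(p)]
    # Step 3: X = G A c, with G diagonal and stored as the vector theta["g"].
    return [theta["g"][i] * sum(A[i][k] * c[k] for k in range(p)) for i in range(len(A))]
```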

4. Numerical experiments.

4.1. Accuracy of parameter estimation.

In this subsection, we investigate the accuracy of parameter estimation by a Monte Carlo simulation study. The data are generated from the proposed AXING-need model (2.10), and the sampling locations are on a mildly perturbed HEALPix grid [Górski et al. (2005)] with 768 grid points. To reduce the computational burden of the simulation study, we assume that there are only two levels of needlets, that is, $J_0=2$, $J=3$. As a function of co-latitude, $\log g$ is represented as a linear combination of cubic B-splines with one interior knot at $\pi/2$, where the first B-spline is replaced by the intercept. We consider two different settings of the coefficient vector $\eta_{-0}$, and specify $\eta_0=0$, $\nu=4$, $\sigma_2=1.25$, $\sigma_3=0.4419$, $\tau=0.1$.

For each simulation run, the parameters $(\eta_{-0},\sigma_2,\sigma_3,\tau)$ are estimated by the posterior sample means using the adaptive MCMC algorithm in Section 3.2, in which $\tau_\eta=10$. The chain was run for 400,000 iterations, with the first 200,000 samples discarded as a burn-in period, and every 200th sample was collected to compute the posterior sample means. Figure 3 displays the boxplots of the parameter estimates based on 100 simulation runs. We can see that all the parameter estimates have small biases and low variability. Moreover, the pointwise median curve of the function $g$ matches well with the true one. These results demonstrate the effectiveness of the estimation procedure using the adaptive MCMC algorithm.

Fig. 3.

Results of the Monte Carlo simulation study under two different settings of $\eta_{-0}$ in panels (a) and (b). The dashed horizontal line accompanying the boxplots shows the true value of the parameters. Top left block: boxplots of estimated $\eta_{-0}$; Bottom left block: estimated and true curves of $g$ (i.e., the standard deviation profile up to a constant); Top right block: boxplots of estimated $\sigma_2$ and $\sigma_3$; Bottom right block: boxplot of estimated $\tau$.

4.2. Predictive performance comparison.

In this subsection, we compare the predictive performance of the proposed AXING-need model with that of two Gaussian models, which are Gaussian needlet (Gau-need) and Gaussian Matérn (Gau–Matérn) models, the latter being widely used in spatial statistics. The purpose of this comparison is to understand the effects of two factors on spatial predictive performance: Gaussian versus non-Gaussian, and a Matérn covariance structure versus an oscillating one constructed through a needlet representation of the process. In the Gau-need model, the coefficient vector c in (2.10) is assumed to be multivariate Gaussian, that is,

$$c\sim N(0,\Lambda),$$

where $\Lambda$ is a diagonal matrix, and $\mathrm{Var}(c_{jk})=\sigma_j^2$ for all $j$, $k$. The Gau–Matérn model assumes a nonstationary Matérn covariance function with latitude-dependent variance

$$C(s,t)=g(s)g(t)M(\|s-t\|;\kappa,a),$$

where $s,t\in\mathbb{S}^2$, $\|s-t\|$ denotes the chordal distance between the two locations $s$ and $t$, and $g$ is given by (2.8). Besides, $M(r;\kappa,a)$ is the Matérn correlation function

$$M(r;\kappa,a)=\frac{2^{1-\kappa}}{\Gamma(\kappa)}(ar)^{\kappa}K_{\kappa}(ar),$$

where Kκ is the modified Bessel function of the second kind, Γ is the gamma function, κ > 0 is the smoothness parameter, and a > 0 is the spatial scale parameter, whereby 1/a controls the range of correlation.
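As a quick check on the formula, two standard closed forms follow from the half-integer Bessel functions $K_{1/2}(x)=\sqrt{\pi/(2x)}\,e^{-x}$ and $K_{3/2}(x)=\sqrt{\pi/(2x)}\,e^{-x}(1+1/x)$. These special cases are added for illustration and are not part of the original comparison:

$$M(r;1/2,a)=\frac{2^{1/2}}{\Gamma(1/2)}(ar)^{1/2}\sqrt{\frac{\pi}{2ar}}\,e^{-ar}=e^{-ar},\qquad M(r;3/2,a)=\frac{2^{-1/2}}{\Gamma(3/2)}(ar)^{3/2}\sqrt{\frac{\pi}{2ar}}\,e^{-ar}\left(1+\frac{1}{ar}\right)=(1+ar)e^{-ar},$$

using $\Gamma(1/2)=\sqrt{\pi}$ and $\Gamma(3/2)=\sqrt{\pi}/2$. Both expressions are strictly positive for every $r\ge 0$, so a covariance of this form cannot change sign with distance.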

The data are generated from the AXING-need model, where the parameter ν is specified as 2.5,3,4 to illustrate the impact of the degree of non-Gaussianity on prediction accuracy. The specification of the remaining model parameters and sampling locations are the same as the first setting in Section 4.1. For each cross-validation replication, we randomly select 500 locations outside a fixed longitudinal region with width 30° as the training set to estimate the parameters. The remaining locations are held out as the test set to evaluate the predictive performance. For the AXING-need model, the parameters are estimated by the adaptive MCMC algorithm described in Section 3.2, in which τη = 10. The chain was run for 1,000,000 iterations, with the first 500,000 samples discarded as a burn-in period, and every 500th sample is collected as the posterior samples. The posterior predictive mean and quantiles (see Section 3.3) are used for prediction. For the Gaussian models, the parameters are estimated by the maximum likelihood method, and the prediction is performed by the usual kriging. The first panel of Figure 4 shows the training and test set separation for one of the cross-validation replications. Held-out locations in and out of the longitudinal region are used to test the predictive performance in terms of long-range and short-range predictions, respectively. We assess the prediction accuracy using five scoring rules: the mean absolute error (MAE), the mean squared prediction error (MSPE), the continuous ranked probability score (CRPS) [Gneiting and Raftery (2007)], and the quantile scores at the levels of 5% and 95% [Gneiting and Ranjan (2011)], averaged over all the predicted locations. The CRPS for the AXING-need model is computed based on the MCMC samples [see Section S.1.3 of the supplementary material Fan et al. (2018) for details]. 
The quantile score for a quantile forecast $q$ at the level $\alpha\in(0,1)$ is defined as $\mathrm{QS}_\alpha(q,y)=(1\{y<q\}-\alpha)(q-y)$, where $y$ is the observed value. We repeat the cross-validation procedure 20 times, and summarize the prediction accuracy of the models in terms of the scoring rules in the remaining panels of Figure 4. We can see that when $\nu=3,4$, the AXING-need and Gau-need models give similar results, and both are better than the Gau–Matérn model. When $\nu=2.5$, the Gau-need model significantly deviates from the non-Gaussian characteristics of the data, and thus performs worse than the AXING-need model. Again, the Gau–Matérn model has the worst predictive performance among the three. The difference in predictive performance between the Gau-need and Gau–Matérn models becomes more pronounced for the long-range prediction than the short-range one. This can be explained by the following facts: the Gau–Matérn model can reasonably approximate the true covariance structure at short distances, but it fails to capture the oscillating covariance structure at long distances due to its constraint of nonnegative covariance.
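The simpler scoring rules can be written down directly from their definitions. The following is a minimal sketch with hypothetical function names; the CRPS is left out because its sample-based form is described only in the supplementary material.

```python
def quantile_score(q, y, alpha):
    """QS_alpha(q, y) = (1{y < q} - alpha) * (q - y); smaller is better."""
    indicator = 1.0 if y < q else 0.0
    return (indicator - alpha) * (q - y)


def mean_quantile_score(quantile_forecasts, observations, alpha):
    """Quantile score averaged over all predicted locations."""
    scores = [quantile_score(q, y, alpha) for q, y in zip(quantile_forecasts, observations)]
    return sum(scores) / len(scores)


def mean_absolute_error(predictions, observations):
    """MAE averaged over all predicted locations."""
    return sum(abs(p - y) for p, y in zip(predictions, observations)) / len(observations)


def mean_squared_prediction_error(predictions, observations):
    """MSPE averaged over all predicted locations."""
    return sum((p - y) ** 2 for p, y in zip(predictions, observations)) / len(observations)
```

The quantile score is nonnegative and asymmetric: at the 5% level, a forecast that lies above the observation is penalized 19 times more heavily than one that lies below it by the same amount.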

Fig. 4.

First panel: Training and test set separation for one of the cross-validation replications. The sphere has been projected to an ellipse by the Hammer projection. “o”: observed locations; “+”: short-range predicted locations; “*”: long-range predicted locations. Other panels: Boxplots of five scoring rules for the AXING-need, Gau-need and Gau–Matérn models with 20 cross-validation replications. (a)–(b) MAE; (c)–(d) MSPE; (e)–(f) CRPS; (g)–(h) QS 5%; (i)–(j) QS 95%.

Gneiting, Balabdaoui and Raftery (2007) proposed the concept of sharpness to evaluate the predictive performance in probabilistic prediction, which provides a predictive distribution instead of a point prediction for the value of a random field at an unobserved location. Sharpness is measured by the concentration of a predictive distribution around the true value, and the more concentrated the predictive distribution is, the sharper the prediction. We adopt the mean length of predictive intervals (“mean” refers to averaging over all the predicted locations) and the corresponding coverage probability to assess the sharpness of the predictions produced by the models. The posterior predictive interval (see Section 3.3) and the usual symmetric predictive interval are used for the AXING-need model and the Gaussian models, respectively. Table 1 displays the mean lengths of 50% and 90% predictive intervals and the corresponding coverage probabilities, averaged over 20 cross-validation replications. In general, the AXING-need model yields the shortest predictive intervals with satisfactory coverage probabilities.
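The two sharpness summaries reported in Table 1 can be computed as in the following sketch, which assumes that the interval endpoints at each predicted location are already available (from posterior percentiles or from kriging); the function name is hypothetical.

```python
def interval_summary(lowers, uppers, observations):
    """Return (coverage probability, mean interval length) over predicted locations."""
    n = len(observations)
    covered = sum(1 for lo, hi, y in zip(lowers, uppers, observations) if lo <= y <= hi)
    mean_length = sum(hi - lo for lo, hi in zip(lowers, uppers)) / n
    return covered / n, mean_length
```

A model is preferred when its intervals are short while the coverage stays close to the nominal level, so the two numbers should always be read together.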

Table 1.

Coverage probabilities (CP) and mean lengths (mLen) of 50% and 90% predictive intervals, averaged over 20 cross-validation replications

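| ν | Model | Short-range CP (50%) | Short-range mLen (50%) | Short-range CP (90%) | Short-range mLen (90%) | Long-range CP (50%) | Long-range mLen (50%) | Long-range CP (90%) | Long-range mLen (90%) |
|---|---|---|---|---|---|---|---|---|---|
| 2.5 | AXING-need | 49.4% | 0.21 | 88.8% | 0.50 | 53.2% | 0.55 | 91.8% | 1.34 |
| | Gau-need | 48.3% | 0.21 | 88.2% | 0.51 | 34.9% | 0.57 | 79.7% | 1.39 |
| | Gau-Matérn | 52.6% | 0.28 | 90.6% | 0.69 | 43.4% | 1.06 | 81.7% | 2.59 |
| 3 | AXING-need | 49.0% | 0.20 | 90.0% | 0.50 | 50.5% | 0.52 | 88.6% | 1.26 |
| | Gau-need | 48.2% | 0.20 | 89.2% | 0.49 | 44.0% | 0.50 | 84.7% | 1.22 |
| | Gau-Matérn | 51.3% | 0.25 | 90.6% | 0.61 | 44.4% | 0.87 | 93.2% | 2.12 |
| 4 | AXING-need | 50.7% | 0.20 | 89.6% | 0.50 | 63.5% | 0.50 | 93.2% | 1.23 |
| | Gau-need | 50.4% | 0.20 | 89.4% | 0.49 | 59.5% | 0.48 | 92.1% | 1.16 |
| | Gau-Matérn | 52.1% | 0.24 | 90.6% | 0.60 | 54.3% | 0.78 | 93.5% | 1.90 |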

5. Application to high-latitude ionospheric electrostatic potentials.

In this section, we demonstrate the effectiveness of the proposed AXING-need model and the associated statistical methodology by applying them to the LFM-MIX model output: the high-latitude ionospheric electrostatic potentials, which are introduced in Section 1. A detailed description of the LFM-MIX model can be found in Wiltberger et al. (2017). The LFM-MIX model is capable of running at multiple resolutions including Single, Double and Quad. Herein, we use Quad resolution simulations since they contain more small-scale details than the other two. At the Quad resolution, the sampling locations are on a regular 1° × 1° geomagnetic latitude-longitude grid with latitudes ranging from 45° to 90°. The electrostatic potentials are generated every two minutes during the time period from March 20, 2008, to April 16, 2008 (inclusive), and thus there are 20,160 time points in total. We denote the observations by $Z(s_i,t_r)$, $i=1,\ldots,N$, $r=1,\ldots,T$, where $s_i$ and $t_r$ represent location and time, respectively.

5.1. Data preprocessing.

To extract the small-scale component of the electrostatic potentials, we subtract a crude estimate of the large-scale component from the observations. The large-scale component is estimated from the data by the empirical orthogonal function (EOF) method, which was applied to the LFM-MIX model output in Kleiber et al. (2013). Let $Z=(Z(s_i,t_r))_{1\le r\le T,\,1\le i\le N}$ denote the $T\times N$ matrix of observations with each row corresponding to a time point. We center $Z$ by subtracting the vector of column averages from each row. The singular value decomposition (SVD) is then applied to $Z$, that is, $Z=UDV^T$, where $U=(u_{rk})_{1\le r\le T,\,1\le k\le T}$ is a $T\times T$ orthogonal matrix, $D=\mathrm{diag}(d_1,\ldots,d_T)$ with decreasing singular values $d_k$, and $V=(v_{ik})_{1\le i\le N,\,1\le k\le T}$ is an $N\times T$ matrix with orthonormal columns, that is, the EOFs. Element-wise, we can express the observations as $Z(s_i,t_r)=\sum_{k=1}^{T}d_ku_{rk}v_{ik}$. The cumulative sum of the squared singular values indicates that the first four EOFs can explain approximately 95% of the total variability in the data. Section S.2.1 of the supplementary material [Fan et al. (2018)] gives a scientific explanation of the EOFs through comparing them with those derived from real observed data. Thus, the large-scale component can be estimated by $\sum_{k=1}^{K}d_ku_{rk}v_{ik}$ with $K=4$, and the residuals are denoted by $r(s_i,t_r)=Z(s_i,t_r)-\sum_{k=1}^{K}d_ku_{rk}v_{ik}$.
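The choice of K from the squared singular values can be sketched as below. The singular values are assumed to be available already (for example from any SVD routine applied to the centered matrix); the function names and the threshold argument are hypothetical.

```python
def explained_variance(singular_values):
    """Cumulative fraction of total variability explained by the leading EOFs."""
    squares = [d * d for d in singular_values]
    total = sum(squares)
    fractions, running = [], 0.0
    for s in squares:
        running += s
        fractions.append(running / total)
    return fractions


def smallest_k(singular_values, threshold):
    """Smallest K whose leading EOFs explain at least `threshold` of the variability."""
    for k, fraction in enumerate(explained_variance(singular_values), start=1):
        if fraction >= threshold:
            return k
    return len(singular_values)
```

With a threshold near 0.95, this rule would correspond to the choice K = 4 reported for the LFM-MIX output, although the exact criterion used by the authors is not stated beyond the approximate 95% figure.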

The sampling locations of the electrostatic potentials are restricted to the high-latitude region of the Northern Hemisphere, while the AXING-need model is constructed for data on a global scale due to the grids on which spherical needlets are placed. There are several ways of handling this problem. First, we can still fit the same model to the data, which are in the high-latitude region, but the needlets outside the region would not have any observations contained in their effective support, and hence the corresponding coefficients could not be effectively estimated from the data [Chu, Clyde and Liang (2009), Section 3.1]. Second, we may simplify the problem as discussed in Heaton et al. (2015) through transforming the data domain to a disk in $\mathbb{R}^2$ by the polar projection, but our model needs to be modified accordingly (it is still under the same framework as a random effects model), especially using an appropriate set of basis functions constructed on the disk instead of the sphere. Herein, we follow the idea of Weimer (1995) and Ruohoniemi and Baker (1998), in which the data are stretched from the high-latitude region to the entire sphere. This is achieved by multiplying the co-latitude $\theta$ with a stretching factor $\alpha$ to obtain a new co-latitude $\theta'=\alpha\theta$, where in our case $\alpha=180/45=4$. When applied to random fields, one drawback of this approach is that the correlation structure may be distorted by the stretching, especially near the low-latitude boundary of the data. However, the magnitude of the electrostatic potentials in the lower latitude region ($\theta>30°$) is close to zero, and the correlation contours in the higher latitude region ($\theta<30°$) have elongated shapes that are wider in the east-west direction [Cousins, Matsuo and Richmond (2013b)]. These findings suggest that the distortion is not as serious as it seems.
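The stretching itself is a one-line transformation. The sketch below assumes latitudes in degrees and returns the stretched co-latitude in radians; the function name and the default boundary latitude argument are illustrative.

```python
import math


def stretched_colatitude(latitude_deg, boundary_latitude_deg=45.0):
    """Map a latitude in [boundary, 90] degrees to a stretched co-latitude in [0, pi]."""
    colatitude = 90.0 - latitude_deg
    alpha = 180.0 / (90.0 - boundary_latitude_deg)  # equals 4 for a 45-degree boundary
    return math.radians(alpha * colatitude)
```

For instance, the auroral zone near 75° latitude (co-latitude 15°) is mapped to a stretched co-latitude of 60°.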

We notice that there are still certain large-scale features remaining in the residuals, which cannot be directly modeled by the AXING-need model since it only captures small-scale details at needlet frequency levels higher than or equal to J0. Thus, we consider the following augmented model (the term of observational errors is omitted here):

$$r(s_i,t_r)=g(s_i',t_r)\left(\sum_{l=0}^{L}\sum_{m=-l}^{l}a_{lm}(t_r)Y_{lm}(s_i')+\sum_{j=J_0}^{J}\sum_{k=1}^{p_j}c_{jk}(t_r)\psi_{jk}(s_i')\right),$$

where the first term within the parentheses involving SH functions is included to represent the remaining large-scale features, $s_i'$ is the point obtained by stretching $s_i$, and the random coefficients $a_{lm}$ and $c_{jk}$ all depend on the time $t_r$. To ensure that the SH functions and needlets cover the full frequency range with the minimum overlap, we specify $L=3$ and $J_0=2$ given $B=2$. We also specify $J=4$. This indicates that there are in total three levels of needlets in the model, and the finest details it can capture are at the frequency level $l=31$ in terms of SH functions. For simplicity, instead of simultaneously estimating the parameters of the SH and needlet terms, we remove a crude estimate of the SH term from the residuals. First, we further assume that the function $g$ does not vary over time, and the residuals are i.i.d. replications over time. These two assumptions are made primarily to obtain a crude estimate of $g$ by the moment estimator, denoted by $\hat g(s_i')$, based on the replications over time. The residuals are then standardized by dividing by $\hat g(s_i')$. We filter out the SH term through regressing the standardized residuals on SH functions up to $L=3$. Last, we multiply the residuals after regression by $\hat g(s_i')$, and treat them as the small-scale component of the electrostatic potentials. Note that a formal statistical analysis on the augmented model is left for future work.

5.2. Model fitting and prediction results.

We fit the AXING-need, Gau-need and Gau–Matérn models to the small-scale component of the electrostatic potential at the first time point (see the top left panel of Figure 1) since we are mainly interested in modeling the small-scale spatial variability. For future work, we may extend our model to spatio-temporal settings under the same framework with additional temporal structures on the coefficients cjk’s. Without loss of generality, the Earth is treated as a unit sphere. The function g is assumed to be a function of co-latitude and the basis functions bi(θ′)’s (θ′ = 4θ due to the stretching) are chosen as natural cubic B-splines with two interior knots and one boundary knot [see Section S.2.2 of the supplementary material Fan et al. (2018) for details]. To reduce the computational burden, we randomly select 4000 locations from the original grid to fit the models. We choose ν = 3 among 2.5, 3 and 4 since it yields the best predictive performance (in terms of the MAE, MSPE and CRPS) when predicting at the remaining locations. Moreover, too large ν (ν > 4) makes the model too close to Gaussian, and ν has to be larger than 2 to ensure the existence of the first two moments. For the AXING-need model, the parameters are estimated by the adaptive MCMC algorithm in Section 3.2, in which τη = 10. The chain was run for 600,000 iterations, with the first 400,000 samples discarded as a burn-in period, and every 200th sample is collected as the posterior samples. The MCMC diagnostics can be found in Section S.2.3 of the supplementary material [Fan et al. (2018)]. For the Gaussian models, the parameters are estimated by the maximum likelihood method. Table 2 lists the parameter estimates and the corresponding standard errors for the models. The standard errors are estimated by the posterior sample standard deviations and the parametric bootstrap with 200 bootstrap samples for the AXING-need model and the Gaussian models, respectively. 
The parametric bootstrap method was used in spatial settings in Xu and Genton (2017) and Fan et al. (2017).

Table 2.

Parameter estimates for the AXING-need (ν = 3), Gau-need and Gau–Matérn models applied to the small-scale component of the high-latitude ionospheric electrostatic potential at the first time point. The corresponding standard errors are shown in parentheses. Note that $\eta_0$ is fixed at 0 for the AXING-need and Gau-need models.

| Parameter | AXING-need | Gau-need | Gau-Matérn |
|---|---|---|---|
| $\eta_0$ | 0 (−) | 0 (−) | −2.037 (0.16) |
| $\eta_1$ | 0.949 (0.036) | 0.959 (0.011) | 0.659 (0.10) |
| $\eta_2$ | 1.579 (0.086) | 1.619 (0.012) | 1.372 (0.29) |
| $\eta_3$ | −1.209 (0.046) | −1.190 (0.012) | −1.078 (0.070) |
| $\sigma_2$ | 0.148 (0.023) | 0.252 (0.014) | - |
| $\sigma_3$ | 0.0363 (0.0035) | 0.0665 (0.0032) | - |
| $\sigma_4$ | 4.09e-3 (2.62e-4) | 6.47e-3 (2.11e-04) | - |
| $\kappa$ | - | - | 2.857 (0.14) |
| $1/a$ | - | - | 0.102 (0.0072) |
| $\tau$ | 0.0280 (3.71e-04) | 0.0282 (3.18e-04) | 7.76e-03 (1.16e-04) |

Apart from model fitting, we further compare the predictive performance of the models. To reduce the computational burden, we only compare the Gau-need and Gau–Matérn models since the parameter estimation and spatial prediction for the AXING-need model involve more time-consuming MCMC iterations. The scientific implications and benefits of using a non-Gaussian model compared with Gaussian models will be illustrated in Section 5.3. Section S.2.6 of the supplementary material [Fan et al. (2018)] gives the training and test set separation in a cross-validation procedure, specified according to the spatial coverage of the SuperDARN data. We predict on the test set using the parameter estimates in Table 2. The scoring rules MAE, MSPE and CRPS are used to assess the prediction accuracy, averaged over all the predicted locations. We repeat the cross-validation procedure 100 times, and calculate the mean and standard deviation (in parentheses) of the scores, which are displayed in Table 3. There is no significant difference in predictive performance between the two models, with the Gau-need model performing slightly better than the Gau–Matérn model in terms of the MSPE and CRPS. Table 3 also shows the mean lengths of 50% and 90% predictive intervals and the corresponding coverage probabilities, averaged over the 100 cross-validation replications. The Gau-need model yields longer predictive intervals, but they have more satisfactory coverage probabilities.

Table 3.

Predictive performance of the Gau-need and Gau–Matérn models for the small-scale component of the high-latitude ionospheric electrostatic potential at the first time point. All the scores are averaged over 100 cross-validation replications, where the standard deviations are shown in parentheses

| Model | MAE (kV) | MSPE (kV²) | CRPS | CP (50%) | mLen (50%) | CP (90%) | mLen (90%) |
|---|---|---|---|---|---|---|---|
| Gau-need | 0.138 (0.024) | 0.146 (0.074) | 0.102 (0.019) | 68.2% | 0.23 | 92.5% | 0.57 |
| Gau-Matérn | 0.136 (0.031) | 0.158 (0.092) | 0.113 (0.029) | 43.6% | 0.08 | 68.8% | 0.21 |

5.3. Scientific implications.

As mentioned in Section 1, one important scientific goal of modeling the electrostatic potentials is to generate their simulations for estimating high-latitude energy inputs into the upper atmosphere in general circulation models. In Figure 5, we compare the simulations of the fitted AXING-need, Gau-need and Gau–Matérn models, for which the parameter estimates are given in Table 2. These simulations are generated such that their patterns are comparable across the models [see Section S.2.5 of the supplementary material Fan et al. (2018)]. We can see that the magnitude of the simulations of the Gau–Matérn model is much smaller than that of the AXING-need and Gau-need models. The most activity (i.e., the red and blue patches in the online version of this figure) of the simulations of all the models occurs near 75° latitude, which is consistent with the location of the general auroral zone [Hunsucker and Hargreaves (2007), Chapter 6.2.1]. Compared with the Gaussian models, the red and blue patches of the simulations of the AXING-need model have the sharpest and clearest boundaries. This can be attributed to its non-Gaussianity, which makes only a small number of the coefficients cjk’s nonzero and the rest of them almost zero. Additional simulations of the fitted AXING-need model are shown in Section S.2.4 of the supplementary material [Fan et al. (2018)].

Fig. 5.

Comparison of simulations of the AXING-need, Gau-need and Gau–Matérn models.

Finally, we quantify the amount of Joule heating produced by the small-scale component of the electrostatic potentials. The estimation of the Joule heating rates (i.e., the amount of Joule heating per second) based on the electrostatic potentials is given in Appendices B and C. For each fitted model, we compute the overall integrated Joule heating rate of 1000 simulated electrostatic potentials in the high-latitude region of the Northern Hemisphere, and plot the corresponding histogram in Figure 6. The overall integrated Joule heating rates for the AXING-need and Gau-need models are significantly larger than that for the large-scale component only, while the latter is almost the same as the overall integrated Joule heating rates for the Gau–Matérn model. Note that the large-scale component is what has been subtracted from the data in the preprocessing step. Moreover, the overall integrated Joule heating rates for the AXING-need model have a heavier right tail than those for the Gau-need model. Specifically, the 95th and 99th percentiles of the overall integrated Joule heating rates for the AXING-need model are 9.6842 and 11.6769, while those for the Gau-need model are 9.2981 and 9.9659. Consequently, the AXING-need model can produce significantly larger extreme values of the Joule heating rate than the Gau-need model. All of these suggest that the proposed AXING-need model can provide a potentially viable remedy to systematic biases of general circulation models resulting from the underestimation of high-latitude energy inputs.

Fig. 6.

Histograms of the overall integrated Joule heating rates of 1000 electrostatic potentials in the high-latitude region of the Northern Hemisphere simulated from the AXING-need, Gau-need and Gau–Matérn models.

6. Discussion.

We have introduced a new class of multi-resolution spatial models for non-Gaussian random fields on a sphere. They are constructed in the form of a sparse random effects model using spherical needlets as building blocks. The spatial localization of needlets, together with carefully chosen random coefficients, ensure the model to be non-Gaussian and isotropic. We have shown that the proposed model has an oscillating and quasi-exponentially decaying covariance function. The model has also been expanded to include a spatially varying variance profile. The special formulation of the model enables us to develop efficient estimation and prediction procedures under a hierarchical Bayesian framework. We have investigated the accuracy of parameter estimation of the proposed model, and compared its predictive performance with that of two Gaussian models by extensive numerical experiments. The effectiveness of the proposed model is also demonstrated through an application of the methodology to the high-latitude ionospheric electrostatic potentials generated from the LFM-MIX model.

The computational speed of each iteration of the adaptive MCMC algorithm is primarily determined by the sampling step for the coefficient vector $c$, even though we have already partitioned it into subblocks $c_j$ to speed up the computation. Recall that $c_j$ is sampled from $N(\hat\mu_j,\hat\Sigma_j)$. When the number of observations is much smaller than that of needlets, we can reduce the computational burden significantly through applying the Sherman–Morrison–Woodbury formula to $\hat\Sigma_j$. This coincides with the algorithm proposed in Bhattacharya, Chakraborty and Mallick (2016).

Many terrestrial processes arising in geophysical and environmental sciences are vector fields tangential to the surface of the Earth. Along the lines of Fan et al. (2017), extensions to models for tangential vector fields on a sphere are possible through applying spherical differential operators to the proposed model. Then the tangential vector fields are represented in terms of vectorial needlets. The proposed model can also be extended to models for anisotropic random fields, as mentioned in Section 2.2, by allowing $\sigma_{jk}$ and $\nu_{jk}$ to vary spatially, and to dynamic spatiotemporal random fields through imposing a time series model on the coefficients $c_{jk}$ [Cressie, Shi and Kang (2010)].

Acknowledgments.

A part of the work of the first author was done while he was visiting the National Center for Atmospheric Research during the summers of 2014, 2015 and 2016. The authors are grateful to Doug Nychka, the Editor Nicoleta Serban, the Associate Editor and the referees for their valuable comments and suggestions. The data set of the LFM-MIX model output was kindly provided by Mike Wiltberger. The authors also thank Ellen Cousins for her valuable comments and suggestions.

Minjie Fan and Tomoko Matsuo supported in part by NSF Grants AGS-1025089 and PLR-1443703.

Debashis Paul supported in part by NSF Grants DMS-1407530, DMS-1713120 and NIH Grant 1R01EB021707.

Thomas C. M. Lee supported in part by NSF Grants DMS-1512945 and DMS-1513484.

Appendix A: PROOF OF THEOREM 1

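Since the $c_{jk}$ are independent, we obtain

$$\mathrm{Cov}(X(s),X(t))=\sum_{j=J_0}^{J}\sum_{k=1}^{p_j}\mathrm{Var}(c_{jk})\psi_{jk}(s)\psi_{jk}(t)=\sum_{j=J_0}^{J}\frac{\nu_j\sigma_j^2}{\nu_j-2}\sum_{k=1}^{p_j}\psi_{jk}(s)\psi_{jk}(t).$$

Using (2.2), it follows that

$$\begin{aligned}\sum_{k=1}^{p_j}\psi_{jk}(s)\psi_{jk}(t)&=\sum_{l=\lceil B^{j-1}\rceil}^{\lfloor B^{j+1}\rfloor}\sum_{l'=\lceil B^{j-1}\rceil}^{\lfloor B^{j+1}\rfloor}b\left(\frac{l}{B^j}\right)b\left(\frac{l'}{B^j}\right)\left(\frac{2l+1}{4\pi}\right)\left(\frac{2l'+1}{4\pi}\right)\times\sum_{k=1}^{p_j}\lambda_{jk}P_l(\langle\zeta_{jk},s\rangle)P_{l'}(\langle\zeta_{jk},t\rangle)\\&=\sum_{l=\lceil B^{j-1}\rceil}^{\lfloor B^{j+1}\rfloor}\sum_{l'=\lceil B^{j-1}\rceil}^{\lfloor B^{j+1}\rfloor}b\left(\frac{l}{B^j}\right)b\left(\frac{l'}{B^j}\right)\left(\frac{2l+1}{4\pi}\right)\left(\frac{2l'+1}{4\pi}\right)\times\int_{\mathbb{S}^2}P_l(\langle x,s\rangle)P_{l'}(\langle x,t\rangle)\,dx,\end{aligned}$$

where the last step is due to the quadrature formula (2.1). Now, by the fact that $P_l$ is real-valued and the addition theorem for SH functions, we have

$$\begin{aligned}\left(\frac{2l+1}{4\pi}\right)\left(\frac{2l'+1}{4\pi}\right)\int_{\mathbb{S}^2}P_l(\langle x,s\rangle)P_{l'}(\langle x,t\rangle)\,dx&=\int_{\mathbb{S}^2}\sum_{m=-l}^{l}\sum_{m'=-l'}^{l'}\bar Y_{lm}(x)Y_{lm}(s)Y_{l'm'}(x)\bar Y_{l'm'}(t)\,dx\\&=\sum_{m=-l}^{l}\sum_{m'=-l'}^{l'}Y_{lm}(s)\bar Y_{l'm'}(t)\int_{\mathbb{S}^2}Y_{l'm'}(x)\bar Y_{lm}(x)\,dx\\&=\sum_{m=-l}^{l}\sum_{m'=-l'}^{l'}Y_{lm}(s)\bar Y_{l'm'}(t)\delta_{ll'}\delta_{mm'}\\&=\delta_{ll'}\sum_{m=-l}^{l}Y_{lm}(s)\bar Y_{lm}(t)\\&=\delta_{ll'}\left(\frac{2l+1}{4\pi}\right)P_l(\langle s,t\rangle),\end{aligned}$$

where $\delta_{ab}=1$ if $a=b$, and 0 otherwise. Note that the third equality follows from the orthonormality of SH functions, and the last one is due to another use of the addition theorem. Substituting this into the above equation, we obtain

$$\sum_{k=1}^{p_j}\psi_{jk}(s)\psi_{jk}(t)=\sum_{l=\lceil B^{j-1}\rceil}^{\lfloor B^{j+1}\rfloor}b^2\left(\frac{l}{B^j}\right)\left(\frac{2l+1}{4\pi}\right)P_l(\langle s,t\rangle).$$

Thus,

$$\mathrm{Cov}(X(s),X(t))=\sum_{j=J_0}^{J}\frac{\nu_j\sigma_j^2}{\nu_j-2}\sum_{l=\lceil B^{j-1}\rceil}^{\lfloor B^{j+1}\rfloor}b^2\left(\frac{l}{B^j}\right)\left(\frac{2l+1}{4\pi}\right)P_l(\langle s,t\rangle).$$

By the bound used in the proof of Baldi et al. (2009, Lemma 3),

$$\left|\sum_{l=\lceil B^{j-1}\rceil}^{\lfloor B^{j+1}\rfloor}b^2\left(\frac{l}{B^j}\right)\left(\frac{2l+1}{4\pi}\right)P_l(\langle s,t\rangle)\right|\le\frac{c_MB^{2j}}{[1+B^j\arccos(\langle s,t\rangle)]^M},$$

where $c_M$ is some constant depending on $M$. Note that a direct proof of the above inequality can be derived from Narcowich, Petrushev and Ward (2006, Theorem 3.5) or Dai and Xu (2013, Theorem 2.6.7).

Thus,

$$|\mathrm{Cov}(X(s),X(t))|\le\sum_{j=J_0}^{J}\frac{\nu_j\sigma_j^2}{\nu_j-2}\,\frac{c_MB^{2j}}{[1+B^j\arccos(\langle s,t\rangle)]^M}.$$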

Appendix B: DERIVING ELECTRIC FIELD FROM ELECTROSTATIC POTENTIAL

Denote the electric field by E and the electrostatic potential by ΦE. Then

E=ΦE1RΦEθθ^1R1sin θΦEϕϕ^,

where ∇ is the gradient operator defined on 3, θ^ and ϕ^ are two of the spherical unit vectors with θ^ and ϕ^ pointing southward and eastward, respectively, and R ≈ 6.5 × 106m is the radius of the ionosphere. Note that the radial component of the electric field is ignored in this approximation since its magnitude is relatively small compared with that of the tangential component. Recall that the electrostatic potential in the high-latitude region of the Northern Hemisphere can be expressed as

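$$\Phi_E(\theta,\phi)=g(\theta')\sum_{j,k}c_{jk}\psi_{jk}(\theta',\phi)=g(4\theta)\sum_{j,k}c_{jk}\psi_{jk}(4\theta,\phi),$$

where $\theta'=4\theta\in[0,\pi]$ due to the stretching of the data to the entire sphere. Then we have

$$\frac{\partial\Phi_E}{\partial\theta}=4\left(\left.\frac{\partial g}{\partial\theta'}\right|_{\theta'=4\theta}\sum_{j,k}c_{jk}\psi_{jk}(4\theta,\phi)+g(4\theta)\sum_{j,k}c_{jk}\left.\frac{\partial\psi_{jk}}{\partial\theta'}\right|_{\theta'=4\theta}\right),$$

and

$$\frac{1}{\sin\theta}\frac{\partial\Phi_E}{\partial\phi}=\frac{\sin\theta'}{\sin\theta}\,g(4\theta)\sum_{j,k}c_{jk}\frac{1}{\sin\theta'}\frac{\partial\psi_{jk}}{\partial\phi}.$$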

Recall that

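$$\psi_{jk}(\theta',\phi)=\sqrt{\lambda_{jk}}\sum_{l}b\left(\frac{l}{B^j}\right)\frac{2l+1}{4\pi}\,P_l\left(x_{jk}\sin\theta'\cos\phi+y_{jk}\sin\theta'\sin\phi+z_{jk}\cos\theta'\right).$$

Then

$$\frac{\partial\psi_{jk}}{\partial\theta'}=\sqrt{\lambda_{jk}}\left(x_{jk}\cos\theta'\cos\phi+y_{jk}\cos\theta'\sin\phi-z_{jk}\sin\theta'\right)\sum_{l}b\left(\frac{l}{B^j}\right)\frac{2l+1}{4\pi}\left.\frac{dP_l(u)}{du}\right|_{u=u'},$$

and

$$\frac{1}{\sin\theta'}\frac{\partial\psi_{jk}}{\partial\phi}=\sqrt{\lambda_{jk}}\left(-x_{jk}\sin\phi+y_{jk}\cos\phi\right)\sum_{l}b\left(\frac{l}{B^j}\right)\frac{2l+1}{4\pi}\left.\frac{dP_l(u)}{du}\right|_{u=u'},$$

where $u'=x_{jk}\sin\theta'\cos\phi+y_{jk}\sin\theta'\sin\phi+z_{jk}\cos\theta'$. Note that $dP_l(u)/du$ can be efficiently computed by a recursive formula.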
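
The text does not state which recursion is meant. As one possibility, the sketch below combines Bonnet's recurrence for the Legendre polynomials with the standard identity P′_{l+1}(u) = (l+1) P_l(u) + u P′_l(u), which stays well defined at u = ±1. The function name `legendre_with_derivative` is a hypothetical choice for this example.

```python
def legendre_with_derivative(lmax, u):
    """Return (P, dP) with P[l] = P_l(u) and dP[l] = dP_l(u)/du for l = 0..lmax."""
    P = [0.0] * (lmax + 1)
    dP = [0.0] * (lmax + 1)
    P[0] = 1.0  # P_0(u) = 1, so its derivative is 0
    if lmax >= 1:
        P[1] = u
        dP[1] = 1.0
    for l in range(1, lmax):
        # Bonnet: (l + 1) P_{l+1} = (2l + 1) u P_l - l P_{l-1}
        P[l + 1] = ((2 * l + 1) * u * P[l] - l * P[l - 1]) / (l + 1)
        # Derivative identity: P'_{l+1} = (l + 1) P_l + u P'_l
        dP[l + 1] = (l + 1) * P[l] + u * dP[l]
    return P, dP
```

A single pass yields the values and derivatives for every degree up to `lmax`, so the sums over l in the two partial derivatives above can reuse the same arrays at each evaluation point u′.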

APPENDIX C: COMPUTING IONOSPHERIC JOULE HEATING RATE

According to Palmroth et al. (2005), the ionospheric Joule heating rate can be estimated by

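$$\hat{P}_{JH}(\theta,\phi)=\Sigma_P(\theta,\phi)\,|\mathbf{E}(\theta,\phi)|^2\approx\Sigma_P(\theta,\phi)\left(E_\theta(\theta,\phi)^2+E_\phi(\theta,\phi)^2\right),$$

where $\Sigma_P$ is the height-integrated Pedersen conductivity, and $E_\theta$ and $E_\phi$ are the meridional (southward) and zonal (eastward) components, respectively, of the electric field given in Appendix B, that is,

$$E_\theta=-\frac{1}{R}\frac{\partial\Phi_E}{\partial\theta},$$

and

$$E_\phi=-\frac{1}{R}\frac{1}{\sin\theta}\frac{\partial\Phi_E}{\partial\phi}.$$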

The Joule heating rate integrated over a subset S of the ionosphere is defined as

$$P_{IJH}(S)=\int_{S}P_{JH}(\theta,\phi)\,dS,$$

where $dS$ is the area element of the ionosphere. In particular, when $S$ is the high-latitude region of the Northern Hemisphere (co-latitude $\theta\le\pi/4$),

$$P_{IJH}=\int_{\theta\le\pi/4}P_{JH}(\theta,\phi)\,dS.$$
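
The integral above can be approximated numerically. The sketch below uses a midpoint rule on a regular grid in (θ, ϕ), assuming the ionosphere is a sphere of radius R as in Appendix B, so that dS = R² sin θ dθ dϕ. The function name, the callable arguments `sigma_p`, `e_theta` and `e_phi`, and the grid sizes are hypothetical choices for this example. The paper does not specify a quadrature scheme.

```python
import math


def integrated_joule_heating(sigma_p, e_theta, e_phi, R=6.5e6,
                             theta_max=math.pi / 4, n_theta=200, n_phi=400):
    """Midpoint-rule approximation of the Joule heating rate integrated
    over the polar cap with co-latitude at most theta_max.

    sigma_p, e_theta, e_phi : callables of (theta, phi) (assumed inputs)
    R                       : radius of the ionosphere in metres
    """
    d_theta = theta_max / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        ring = 0.0
        for k in range(n_phi):
            phi = (k + 0.5) * d_phi
            # Pointwise rate: Sigma_P * (E_theta^2 + E_phi^2)
            e2 = e_theta(theta, phi) ** 2 + e_phi(theta, phi) ** 2
            ring += sigma_p(theta, phi) * e2
        # Area weight sin(theta) is constant along each ring of co-latitude
        total += ring * math.sin(theta)
    return total * R ** 2 * d_theta * d_phi
```

With a constant integrand equal to 1 and R = 1, the result should be close to the cap area 2π(1 − cos(π/4)), which gives a quick sanity check on the grid resolution.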

Footnotes

SUPPLEMENTARY MATERIAL

Supplement to “A multi-resolution model for non-Gaussian random fields on a sphere with application to ionospheric electrostatic potentials” (DOI: 10.1214/17-AOAS1104SUPP; .pdf). This supplement provides additional figures and details of the numerical experiments and application.

REFERENCES

  1. Andrews DF and Mallows CL (1974). Scale mixtures of normal distributions. J. Roy. Statist. Soc. Ser. B 36 99–102. MR0359122
  2. Andrieu C and Thoms J (2008). A tutorial on adaptive MCMC. Stat. Comput 18 343–373.
  3. Atkinson K and Han W (2012). Spherical Harmonics and Approximations on the Unit Sphere: An Introduction. Lecture Notes in Math. 2044. Springer, Heidelberg. MR2934227
  4. Baldi P, Kerkyacharian G, Marinucci D and Picard D (2009). Asymptotics for spherical needlets. Ann. Statist 37 1150–1171.
  5. Barndorff-Nielsen O (1979). Models for non-Gaussian variation, with applications to turbulence. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci 368 501–520. MR0554672
  6. Berg J, Natarajan A, Mann J and Patton EG (2016). Gaussian vs non-Gaussian turbulence: Impact on wind turbine loads. Wind Energy 19 1975–1989.
  7. Bhattacharya A, Chakraborty A and Mallick BK (2016). Fast sampling with Gaussian scale mixture priors in high-dimensional regression. Biometrika 103 985–991. MR3620452
  8. Chu J-H, Clyde MA and Liang F (2009). Bayesian function estimation using continuous wavelet dictionaries. Statist. Sinica 19 1419–1438.
  9. Codrescu MV, Fuller-Rowell TJ and Foster JC (1995). On the importance of E-field variability for Joule heating in the high-latitude thermosphere. Geophys. Res. Lett 22 2393–2396.
  10. Codrescu MV, Fuller-Rowell TJ, Foster JC, Holt JM and Cariglia SJ (2000). Electric field variability associated with the Millstone Hill electric field model. J. Geophys. Res 105 5265–5273.
  11. Cousins EDP, Matsuo T and Richmond AD (2013a). SuperDARN assimilative mapping. J. Geophys. Res 118 7954–7962.
  12. Cousins EDP, Matsuo T and Richmond AD (2013b). Mesoscale and large-scale variability in high-latitude ionospheric convection: Dominant modes and spatial/temporal coherence. J. Geophys. Res 118 7895–7904.
  13. Cousins EDP and Shepherd SG (2012). Statistical characteristics of small-scale spatial and temporal electric field variability in the high-latitude ionosphere. J. Geophys. Res 117 A03317.
  14. Cressie N (1993). Statistics for Spatial Data. Wiley, New York, NY.
  15. Cressie N and Johannesson G (2008). Fixed rank kriging for very large spatial data sets. J. R. Stat. Soc. Ser. B. Stat. Methodol 70 209–226.
  16. Cressie N, Shi T and Kang EL (2010). Fixed rank filtering for spatio-temporal data. J. Comput. Graph. Statist 19 724–745. With supplementary material available online. MR2732500
  17. Dai F and Xu Y (2013). Approximation Theory and Harmonic Analysis on Spheres and Balls. Springer, New York, NY.
  18. De Oliveira V, Kedem B and Short DA (1997). Bayesian prediction of transformed Gaussian random fields. J. Amer. Statist. Assoc 92 1422–1433.
  19. Fan M (2015). A note on spherical needlets. Preprint. Available at arXiv:1508.05406.
  20. Fan M, Paul D, Lee TCM and Matsuo T (2017). Modeling tangential vector fields on a sphere. J. Amer. Statist. Assoc To appear.
  21. Fan M, Paul D, Lee TCM and Matsuo T (2018). Supplement to “A multi-resolution model for non-Gaussian random fields on a sphere with application to ionospheric electrostatic potentials.” DOI: 10.1214/17-AOAS1104SUPP.
  22. Gelman A, Roberts GO and Gilks WR (1996). Efficient Metropolis jumping rules. Bayesian Stat 5 599–608.
  23. Genton MG and Kleiber W (2015). Cross-covariance functions for multivariate geostatistics. Statist. Sci 30 147–163. MR3353096
  24. Gneiting T (2013). Strictly and non-strictly positive definite functions on spheres. Bernoulli 19 1327–1349.
  25. Gneiting T, Balabdaoui F and Raftery AE (2007). Probabilistic forecasts, calibration and sharpness. J. R. Stat. Soc. Ser. B. Stat. Methodol 69 243–268.
  26. Gneiting T and Raftery AE (2007). Strictly proper scoring rules, prediction, and estimation. J. Amer. Statist. Assoc 102 359–378.
  27. Gneiting T and Ranjan R (2011). Comparing density forecasts using threshold- and quantile-weighted scoring rules. J. Bus. Econom. Statist 29 411–422.
  28. Górski KM, Hivon E, Banday AJ, Wandelt BD, Hansen FK, Reinecke M and Bartelmann M (2005). HEALPix: A framework for high-resolution discretization and fast analysis of data distributed on the sphere. Astrophys. J 622 759.
  29. Guinness J and Fuentes M (2016). Isotropic covariance functions on spheres: Some properties and modeling considerations. J. Multivariate Anal 143 143–152.
  30. Heaton MJ, Kleiber W, Sain SR and Wiltberger M (2015). Emulating and calibrating the multiple-fidelity Lyon–Fedder–Mobarry magnetosphere-ionosphere coupled computer model. J. R. Stat. Soc. Ser. C. Appl. Stat 64 93–113. MR3293920
  31. Hunsucker RD and Hargreaves JK (2007). The High-Latitude Ionosphere and Its Effects on Radio Propagation. Cambridge Univ. Press, Cambridge.
  32. Jones RH (1963). Stochastic processes on a sphere. Ann. Math. Stat 34 213–218. MR0170378
  33. Jun M and Stein ML (2008). Nonstationary covariance models for global data. Ann. Appl. Stat 2 1271–1289.
  34. Kleiber W, Sain SR, Heaton MJ, Wiltberger M, Reese CS and Bingham D (2013). Parameter tuning for a multi-fidelity dynamical model of the magnetosphere. Ann. Appl. Stat 7 1286–1310.
  35. Kleiber W, Hendershott B, Sain SR and Wiltberger M (2016). Feature-based validation of the Lyon–Fedder–Mobarry magnetohydrodynamical model. J. Geophys. Res 121 1192–1200.
  36. Lindgren F, Rue H and Lindström J (2011). An explicit link between Gaussian fields and Gaussian Markov random fields: The stochastic partial differential equation approach. J. R. Stat. Soc. Ser. B. Stat. Methodol 73 423–498.
  37. Lyon JG, Fedder JA and Mobarry CM (2004). The Lyon–Fedder–Mobarry (LFM) global MHD magnetospheric simulation code. J. Atmos. Sol.-Terr. Phys 66 1333–1350.
  38. Marinucci D and Peccati G (2011). Random Fields on the Sphere: Representation, Limit Theorems and Cosmological Applications. Cambridge Univ. Press, Cambridge.
  39. Matsuo T and Richmond AD (2008). Effects of high-latitude ionospheric electric field variability on global thermospheric Joule heating and mechanical energy transfer rate. J. Geophys. Res 113 A07309.
  40. Matsuo T, Richmond AD and Hensel K (2003). High-latitude ionospheric electric field variability and electric potential derived from DE-2 plasma drift measurements: Dependence on IMF and dipole tilt. J. Geophys. Res 108 SIA 1–1–SIA 1–15.
  41. Matsuo T, Richmond AD and Nychka DW (2002). Modes of high-latitude electric field variability derived from DE-2 measurements: Empirical Orthogonal Function (EOF) analysis. Geophys. Res. Lett 29 11–1–11–4.
  42. Narcowich FJ, Petrushev P and Ward JD (2006). Localized tight frames on spheres. SIAM J. Math. Anal 38 574–594.
  43. Palacios MB and Steel MFJ (2006). Non-Gaussian Bayesian geostatistical modeling. J. Amer. Statist. Assoc 101 604–618. MR2281244
  44. Palmroth M, Janhunen P, Pulkkinen TI, Aksnes A, Lu G, Østgaard N, Watermann J, Reeves GD and Germany GA (2005). Assessment of ionospheric Joule heating by GUMICS-4 MHD simulation, AMIE, and satellite-based statistics: Towards a synthesis. Ann. Geophysicae 23 2051–2068.
  45. Perron M and Sura P (2013). Climatology of non-Gaussian atmospheric statistics. J. Climate 26 1063–1083.
  46. Røislien J and Omre H (2006). T-distributed random fields: A parametric model for heavy-tailed well-log data. Math. Geol 38 821–849.
  47. Ruohoniemi JM and Baker KB (1998). Large-scale imaging of high-latitude convection with Super Dual Auroral Radar Network HF radar observations. J. Geophys. Res 103 20797–20811.
  48. Schoenberg IJ (1942). Positive definite functions on spheres. Duke Math. J 9 96–108.
  49. Stein ML (1988). Asymptotically efficient prediction of a random field with a misspecified covariance function. Ann. Statist 16 55–63.
  50. Stein ML (2007). Spatial variation of total column ozone on a global scale. Ann. Appl. Stat 1 191–210. MR2393847
  51. Terdik G (2015). Angular spectra for non-Gaussian isotropic fields. Braz. J. Probab. Stat 29 833–865.
  52. Wallin J and Bolin D (2015). Geostatistical modelling using non-Gaussian Matérn fields. Scand. J. Stat 42 872–890.
  53. Weimer DR (1995). Models of high-latitude electric potentials derived with a least error fit of spherical harmonic coefficients. J. Geophys. Res 100 19595–19607.
  54. West M (1987). On scale mixtures of normal distributions. Biometrika 74 646–648. MR0909372
  55. Wiltberger M, Rigler EJ, Merkin V and Lyon JG (2017). Structure of high latitude currents in magnetosphere-ionosphere models. Space Sci. Rev 206 575–598.
  56. Wolfe PJ, Godsill SJ and Ng W-J (2004). Bayesian variable selection and regularization for time-frequency surface estimation. J. R. Stat. Soc. Ser. B. Stat. Methodol 66 575–589. MR2088291
  57. Womersley RS (2015). Efficient spherical designs with good geometric properties. Available at http://web.maths.unsw.edu.au/~rsw/Sphere/EffSphDes/.
  58. Xu G and Genton MG (2017). Tukey g-and-h random fields. J. Amer. Statist. Assoc 112 1236–1249.
  59. Zhang B, Lotko W, Wiltberger MJ, Brambles OJ and Damiano PA (2011). A statistical study of magnetosphere–ionosphere coupling in the Lyon–Fedder–Mobarry global MHD model. J. Atmos. Sol.-Terr. Phys 73 686–702.
