Skip to main content
NIHPA Author Manuscripts logoLink to NIHPA Author Manuscripts
. Author manuscript; available in PMC: 2021 Mar 1.
Published in final edited form as: Adv Comput Math. 2020 May 4;46(3):42. doi: 10.1007/s10444-020-09791-1

Analytic regularity and stochastic collocation of high-dimensional Newton iterates

Julio E Castrillón-Candás 1, Mark Kon 1
PMCID: PMC7201586  NIHMSID: NIHMS1575300  PMID: 32377059

Abstract

In this paper we introduce concepts from uncertainty quantification (UQ) and numerical analysis for the efficient evaluation of stochastic high dimensional Newton iterates. In particular, we develop complex analytic regularity theory of the solution with respect to the random variables. This justifies the application of sparse grids for the computation of statistical measures. Convergence rates are derived and are shown to be subexponential or algebraic with respect to the number of realizations of random perturbations. Due the accuracy of the method, sparse grids are well suited for computing low probability events with high confidence. We apply our method to the power flow problem. Numerical experiments on the non-trivial 39 bus New England power system model with large stochastic loads are consistent with the theoretical convergence rates. Moreover, compared to the Monte Carlo method our approach is at least 1011 times faster for the same accuracy.

Keywords: Uncertainty Quantification, Newton-Kantorovich Theorem, Sparse Grids, Approximation Theory, Complex Analysis, Power Flow, Non-linear Stochastic Newtworks

1. Introduction

Newton iteration is a powerful method for solving many scientific and engineering problems, naturally arising in the context of power flow problems (Non-linear networks) [8], non-linear Partial Differential Equations [16], among others.

With the advent of massive computational resources, complex mathematical models are widely used for prediction in many scientific and engineering areas such as finance, weather forecasting, seismology, and semiconductor design. Due to the complex nature of these problems uncertainty naturally arises that can affect the reliability of these forecasts. Mathematically rigorous Uncertainty Quantification (UQ) has become an instrumental approach to judge the reliability of such predictions.

Uncertainty quantification is a mathematical approach that allows the charactizations of uncertainty through out the computational model and for a given Quantity of Interest (QoI). Even with present day computational resources, the application of UQ to large and complex non-linear networks such as electric power grids is a formidable undertaking. In particular, due the highly computational challenges. In this paper we seek to develop a UQ process that is mathematically rigorous, computationally efficient and easy to adapted with currently available power flow solvers [21, 32].

One of the most widely used UQ techniques is the Monte Carlo method [11], which is robust and easy to implement. Indeed, a deep analysis or understanding of the underlying stochastic model is not required, making this an attractive approach for the practicing engineer and scientist. However, convergence rates for iterative approximation methods can be very slow. For the application of UQ to the power flow problem, due to the large numbers of generators, loads and transmission lines one potentially faces, the problem can be high dimensional, non-linear, non-Gaussian, and not feasible with current computational resources. An alternative approach is the use of tensor product methods. However, these methods suffer significantly from the curse of dimensionality, thus making them unattractive even for moderate dimensionalities.

If the regularity of a QoI is relatively high with respect to the fundamental random variables, then application of stochastic collocation with Smolyak sparse grids [24, 25, 29] is a good choice. Indeed, this method has become popular in the field of computational applied mathematics and engineering as a surrogate model of stochastic Partial Differential Equations (sPDEs) [9] where the QoI is composed of moderately large numbers of random variables. The method is easy to implement and non-intrusive, i.e., each collocation point corresponds to uncoupled deterministic problems. The stochastic collocation method can be used with non-linear dependence of the QoI on the random variables. However, such grids still suffer from the curse of dimensionality. Alternative adaptive techniques have been developed, including anisotropic sparse grids [24], dimension adaptive quadrature [12] and quasi-optimal sparse grids [7, 26]. Yet these methods are still not feasible for very high dimensional problems and/or low regularity of the QoI. In addition, although quasi-optimal sparse grids lead to exponential convergence rates with respect to numbers of realizations, there is to our knowledge still no systematic way to construct them.

We have particular interest in the application of the Newton iteration to the solution of the power flow equations of electric grids [8]. In practice many of the generators (wind, solar, etc) and loads are stochastic in nature, and thus a traditional deterministic power flow analysis is insufficient. UQ applied to mathematical/statistical modeling of electrical power grids is still in its infancy. A major 2016 National Academy of Sciences report underscores the importance of this new area of research in its potential to contribute to this and the next generation of electric power grids [22]. The incorporation of uncertainty in the grid has gathered interest in the power system community.

In [15] Hockenberry et al. proposed the probabilistic collocation method (PCM) with applications to power systems. Although the results of this work are good, the uncertainty is computed only with respect to a single parameter. In [27] the authors test a polynomial chaos collocation and Galerkin approach to study the uncertainty of power flow on a small 2-bus power system. The results are good for low stochastic dimensions, but it is not clear how this approach will scale for large electric power grids, with higher associated dimensions. Moreover, there is little mathematical theory on the effectiveness of this approach.

More recently, in [17], Tang et al. proposed a dimension-adaptive sparse grid method [12] by using the off-the-shelf Matlab Sparse Grid Toolbox [18, 19]. The numerical results show the feasibility of this approach. However, there are also some weaknesses. The authors did not analyze the regularity of the power flow with respect to the uncertainty, so that the rate of convergence of the sparse grid is not known. Reduced regularity of the stochastic power flow can choke the accuracy. In contrast to Monte Carlo methods, which are robust, sparse grid methods are sensitive to the regularity of the function at hand. Lack of regularity will lead to erroneous results.

This motivates the application of numerical analysis and UQ theory to the Newton iteration for such problems as the electric power grid. Many of the ideas of this paper originate from the numerical solution of stochastic PDEs [9, 24, 25], where UQ is having a large impact. The goals of our paper is to integrate these methods with the theory and practice of power systems, where we believe that UQ will eventually have a strong impact. Furthermore, from the numerical analysis perspective regularity can be determined by complex analytic extensions of the functions of interest [3, 9, 31]. These lead to sharper convergence rates than regularity in terms of derivatives.

In Section 2 the mathematical background for this paper is introduced. In particular, the Newton-Kantorovich Theorem, sparse grids and convergence rates are discussed. Furthermore, complex analysis from the numerical analysis perspective for polynomial approximation is also treated. In Section 3 the complex analytic regularity of the solution of the Newton iteration is developed with respect to the random perturbations. In Section 4 the theory developed in Section 3 justifies the application of sparse grids to the power flow equations. The sparse grids are applied to the power flow equations of the 39 bus, 10 Generator, New England model. Subexponential or algebraic convergence rates are obtained that are consistent with the sparse grid convergence rates. These convergence rates make the sparse grid method suitable for computing stochastic moments to high accuracy. Furthermore, they will also be well suited for computing small event tail probabilities with high accuracy.

2. Mathematical background

In this section we introduce the general notation and mathematical background that will be used in this paper. We provide a summary of the three important topics: i) Newton-Kantorovich Theorem, ii) Stochastic spaces and iii) Sparse grids (approximation theory).

2.1. Newton-Kantorovich Theorem

Consider the Fréchet differentiable operator f : XY that maps a convex open set D of a Banach space X into a Banach space Y . Suppose that we are interested in finding an element xD such that

f(x)=0. (1)

Assuming that a solution of equation (1) exists, a series of successive approximations xvD, where v00, can be built. Consider the space of bounded linear operators L(Y,X) from Y into X. Let J(xv) be the Fréchet derivative of f(xv). Suppose that J(xv)−1L(Y,X) and consider the sequence

xv+1=xvJ(xv)1f(xv).

Assumption 1

We assume that that J satisfies the Lipschitz condition

J(x)J(y)λxy (2)

for some constant λ ≥ 0, and all x, yD. Furthermore, assume that x0D and there exist positive constants ϰ and δ such that

J(x0)1ϰ,
J(x0)1f(x0)δ, (3)
h2ϰλδ1,

U(x0, t*) ⊂ D, with U(x, r) the open ball {y : ‖yx‖ ≤ r} and t*=2h(11h)δ.

If Assumption 1 above is satisfied, then by the Newton-Kantorovich theorem [2], for all v0,

  1. The Newton iterates xv+1 = xvJ(xv)−1f(xv) and J(xv)−1 exist.

  2. xvU(x0, t*) ⊂ D.

  3. x* = limv→∞xv exists, x*U(x0,t*)¯, and f(x*) = 0 uniquely.

2.2. Stochastic spaces

Let be the set of outcomes from the complete probability space (, F, ), where F is a sigma algebra of events and is a probability measure. Define Lq(Ω), q ∈ [1, ∞], as the Banach spaces

Lq(Ω){u:Ω|Ω|u(ω)|qd(ω)<} and
L(Ω){u:Ω|-ess supωΩ|u(ω)|<}.

Let W := [W1,…,WN] be an N-component random vector measurable in (, F, ) that takes values on ΓΓ1××ΓNN, with Γn := [−1, 1]. Let B(Γ) be the Borel σ-algebra. Suppose that a induced measure μW on (Γ, B(Γ)) is defined as μW(A)(W1(A)) for all AB(Γ). Given that the induced measure is absolutely continuous with respect to Lebesgue measure on Γ, then there exists a density function ρ(q) : Γ → [0, +∞) such that

(WA)(W1(A))=Aρ(q)dq,

for any event AB(Γ). Furthermore for any measurable function W[L1(Γ)]N define the expected value as

E[W]=Γqρ(q)dq.

Define also the Banach spaces (for q ≥ 1)

Lρq(Γ){u:Γ|Ω|u(q)|qρ(q)dq<} and
Lρ(Γ){u:Γ|ρ-ess supqΓ|u(q)|<}.

Note that the above ρ-essential supremium is with respect to the measure induced by the density function ρ(·), rather than Lebesgue measure itself.

In general the density ρ(·) will not factorize into independent probability density functions, making higher dimensional manipulations difficult in some cases. In [3] the authors recommend use of an auxiliary probability density function ρ^:Γ+ that factorizes into N independent ones, i.e.,

ρ^(q)=n=1Nρ^(qn),q=(q1,,qN)Γ, (4)

where we will assume that ρ(q)ρ^(q)L(Γ)<. Note that in contrast L(Γ) (without a subscript) is with respect to the supremium in the Lebesgue measure.

2.3. Sparse grids

Our objective is to efficiently approximate a function u:Γ defined on high dimensional domains using global polynomials. The accuracy of the approach will directly depend on the regularity of the function. Let Pp(Γ)L2(Γ) be the span of tensor product polynomials of degree at most p = (p1, …,pN); i.e., Pp(Γ)=n=1NPpn(Γn) with Ppn(Γn)span(qnm,m=0,,pn), n = 1,…,N. For univariate polynomial approximation, we define a sequence of levels i = 0, 1, 2, 3,… corresponding to increasing degrees m(i)0 polynomial approximation for given coordinates. Here for a given approximation scheme, m(·) is a fixed function. In general our multivariate approximation scheme will assume different levels of approximation in for different coordinates n = 1,…,N.

We consider separate univariate Lagrange interpolants in Γ along each dimension n, given as Inm(in):C0(Γn)Pm(in)1(Γn). Specifically, let

Inm(in)(u(qn))jn=1m(in)u(qjni)ln,jn(qn), (5)

where {ln,j}j=1m(in) is a Lagrange basis for the space Ppn(Γn), the set {qjni}jn=1m(in), represents m(in) discrete locations (the interpolation knots) in Γn, the index in ≥ 0 is the level of approximation, and m(in)+ is the number of collocation nodes at level in+ where m(0) = 0, m(1) = 1 and m(in) ≤ m(in +1) if in ≥ 1.

Remark 1 Since m(in) represents the number of interpolation knots for the Lagrange basis, we have pn = m(in) + 1.

One of the most common approaches to constructing Lagrange interpolants in high dimensions is the formation of tensor products of Inm(in) along each dimension n. However, as N increases the dimension of Pp increases as n=1N(pn+1). Thus even for moderate dimensions N the computational cost of a Lagrange approximation becomes prohibitive. However, in the case of sufficient complex analytic regularity of the function u with respect to the random variables defined on Γ, a better choice is the application of Smolyak sparse grids. In the rest of this section the construction of the classical Smolyak sparse grid (see e.g. [6, 29]) is summarized. More details can be found in [4].

Consider the difference operator along the nth dimension given by

Δnm(in)Inm(in)Inm(in1). (6)

Given an integer w ≥ 0, called the approximation level, and a multi-index i=(i1,,iN)0N, let g:0N be a strictly increasing function in each argument and define a sparse grid approximation of function u(q) ∈ C0(Γ), restricted in order by g:

Swm,g[u(q)]=i0N:g(i)wn1N(Δnm(in))(u(q)). (7)

We observe that the sparse grid approximation is constructed from a linear combination of polynomial tensor product interpolations. However, the size of the polynomial space is controlled by g(i) ≤ w in (7).

Define the following Ntuple m(i) := (m(i1),…,m(iN)) and consider the set of polynomial multi-degrees

Λm,g(w)={pN,g(m1(p+1))w},

where 1 is an N dimensional vector of ones, and the associated multivariate polynomial space

Λm,g(w)(Γ)=span{n=1Nqnpn, with pΛm,g(w)}.

For a Banach space V let

C0(Γ;V){u:ΓV is continuous on Γ and maxyΓu(y)V<}.

It can be shown that the approximation formula given by Swm,g is exact in Λm,g(w)(Γ). We state the following proposition that is proved in [4].

Proposition 1

  1. For any uC0(Γ; V ), we have Swm,g[u]Λm,g(w)V.

  2. Moreover, Swm,g[u]=uuΛm,g(w)V.

Remark 2 The tensor product space Λm,g(w)V is more easily understood as the space of polynomials with Banach-valued coefficients. Furthermore, Swm,g[u]Λm,g(w)V is interpreted as a sparse grid approximation of a V -valued continuous function.

A good choice of m and g is given by the Smolyak sparse grid definitions (see [6, 29])

m(in)={1for in=12in1+1for in>1  and  g(i)=n=1N(in1).

Furthermore Λm,g(w){pN:nf(pn)w} where

f(pn)={0,pn=01,pn=1log2(pn),pn2.

Other common choices are shown in Table 1.

Table 1.

Sparse grid approximations formulas for TD and HC.

Approx. space sparse grid: m, g polynomial space: Λ(w)
Total Degree (TD) m(in) = in
g(i) = ∑n(in − 1) ≤ w
{p0N:npnw}
Hyperbolic Cross (HC) m(i) = i
g(i) = ∏n(in) ≤ w + 1
{p0N:Πn(pn+1)w+1}

With this choice of (m, g) and the application of Clenshaw-Curtis abscissas (which are locations of interpolation points given as extrema of Chebyshev polynomials) leads to nested sequences of one dimensional interpolation formulae. In consequence, the sparse grid formed by this choice is highly compressed in comparison to the full tensor product grid. Another good choice includes Gaussian abscissas [23]. For any choice of m(in) > 1 the Clenshaw-Curtis abscissas are given by

qjnin=cos(π(jn1)m(in)1),  jn=1,,m(in).

In Figure 1 an example of Clenshaw Curtis and Gaussian abscissas are shown for w = 5.

Fig. 1.

Fig. 1

Clenshaw-Curtis (left) and Gaussian abscissas (right) for w = 5 levels.

As previously pointed out, the probability density function ρ does not necessarily factorize in higher dimensions. As an alternative we use the auxiliary distribution ρ^, which factorizes as ρ^(q)=n=1Nρ^n(qn) and is close to the original distribution ρ(q). Suppose k is a given global index determined by the set of indices k1,kN as k = k1+p1(k2−1)+p1p2(k3−1)+p1p2p3(k4−1)+…. Given a function u : ΓV , the quadrature scheme Eρ^p[u] that approximates the integral E[u(q)]Γu(q)ρ^(q)dq can now be computed based on the distribution ρ^(q) as

Eρ^p[u]=k=1NPωku(q(k)),ωk=n=1Nωknωkn=Γnln,kn2(qn)ρ^n(qn)dqn,

with q(k)n the locations of the quadrature knots, and Np the number of Gauss quadrature points controlling the accuracy of the quadrature scheme. Recall that ln,kn2(qn) define Lagrange polynomials defined in equation (5). The term E[u(q)] can be approximated as

E[Swm,g[u(q)]]Eρ^p[Swm,g[u(q)]ρρ^],

and similarly the variance var[u(q)] is approximated as

var[u(q)]E[(Swm,g[u(q)])2]E[Swm,g[u(q)]]2***Eρ^p[(Swm,g[u(q)])2ρρ^]Eρ^p[Swm,g[u(q)]ρρ^]2.

Remark 3 The weights ωkn and node locations q(k) are computed from the auxiliary density ρ^. For standard distributions of ρ^ such as uniform and Gaussian, these are already tabulated to full accuracy. Otherwise they must be computed by solving for roots of orthogonal polynomials and using a quadrature scheme. However, the integrals involved are only one dimensional. See [3] (Section 2) for details.

We now develop some rigorous numerical bounds for the accuracy of the sparse grid approximation. Let Cmixk(Γ;) denote the space of functions with continuous mixed derivatives up to degree k:

Cmixk(Γ;)={u:Γ:α1,,αNuα1q1αNqNC0(Γ;),n=1,,N,αnk}

and equipped with the following norm:

uCmixk(Γ;)={u:ΓmaxqΓ|α1,,αNu(q)α1q1αNqN|<}.

Assume that uCmixk(Γ;). In [6] the authors show that it must follow that

uSwm,g[u]L(Γ)C(k,N)uCmixk(Γ)ηk(logη)(k+2)(N1)+1,

where η is the number of knots of the sparse grid Swm,g. However, the coefficient C(k,N) is in general not known [6].

If the function u admits a complex analytic extension, a better approach for deriving error bounds for the polynomial approximation arises from exploitation of analysis in the complex plane. In [25] the authors derive Lρ(Γ) bounds based on analytic extensions of u on a well defined region ΨN with respect to the variables q. These bounds are explicit, and the coefficients can be estimated and depend on the size of the region Ψ.

In [24, 25] error estimates are derived for the isotropic and anisotropic Smolyak sparse grids with the choice of Clenshaw-Curtis and Gaussian abscissas. Let η be the number of collocation knots and Eσ^1,,σ^NΠn=1NEn,σ^nΨN, where

En,σ^n={z with Re z=eδn+eδn2cos(θ),  Im z=eδneδn2sin(θ):θ[0,2π),σ^nδn0}

and σ^n>0 (see the Bernstein ellipse in Figure 2). The authors show that the error uSwm,g[u]Lρ(Γ) exhibits algebraic or sub-exponential convergence with respect to η (see Theorems 3.10, 3.11, 3.18 and 3.19 in [25] for more details) whenever uC0(Γ,) admits an analytic extension on the polyellipse Eσ^1,,σ^N.

Fig. 2.

Fig. 2

Bernstein ellipse along the nth dimension. The ellipse crosses the real axis at eσ^n+eσ^n2 and the imaginary axis at eσ^neσ^n2.

We now recall the definition of Chebyshev polynomials, useful in deriving error estimates on sparse grids. Let Tk:Γ1, k = 0, 1,…, be a kth order Chebyshev polynomial over [−1, 1]. These polynomials are defined recursively as:

T0(y)=1,T1(y)=y,,Tk+1(y)=2yTk(y)Tk1(y),.

The following theorem characterizes approximation of analytic functions using Chebyshev polynomials.

Theorem 1

Let u be analytic and absolutely bounded by M on Elogζ, ζ>1. Then the expansion

u(y)=α0+2k=1αkTk(y),

holds for all yElogζ where

αk=1π11u(y)Tk(y)1y2dy

and, additionally, |αk| ≤ M/ζk. Furthermore if y ∈ [−1,1] then

|u(y)α02k=1mαkTk(y)|2Mζ1ζm.

Proof See Theorem 8.2 in [31] □

We follow the arguments in [3, 23] and using the fact that the interpolation operator Inm(in) is exact on the space Ppn1, i.e. for any vPpn1 we have that Inm(in)(v)=v, it can be shown that if u is continuous on [−1,1] and has an analytic extension on Eσn we have, from Theorem 1,

(IInm(in))uLρ(Γn)(1+Λm(i))minvPm(in)1uvC0(Γ,)****(1+Λm(i))2M(u)eσn1eσnm(in),

where Λm(in) is the Lebesgue constant and is bounded by 2π−1(log(m − 1)+1) (see [3]) and M = M(u) is the maximal value of u on Eσn. Thus, for n = 1,…,N we have

(IInm(i))uLρ(Γn)M(u)C(σ^n)ineσn2in, (8)

where σn=σ^n2>0 and C(σn)2(eσn1). Recalling the definition of Δnm(in) from equation (6), we have that for all n = 1,…,N

Δ(u)m(in)Lρ(Γn)=(Inm(in)Inm(in1))uLρ(Γn)(IInm(in))uLρ(Γn)+(IInm(in1))uLρ(Γn)2M(u)C(σn)ineσn2in1. (9)

By applying equation (9) to Lemma 3.5 in [25], we are now in a position to slightly modify Theorems 3.10 and 3.11 in [25] and restate them into a single theorem given below. However, the following assumptions and definitions are first needed:

  • We set σ^minn=1,,Nσ^n, i.e. for an isotropic sparse grid the general sub-exponential decay will be restricted by the smallest σ^.

  • Let
    M˜(u)=supgEσ^1,,σ^N|u(g)|,
    σ=σ^/2,μ1=σ1+log(2N), and μ2(N)=log(2)N(1+log(2N)),
    a(δ,σ)exp(δσ{1σ log2(2)+1log(2)2σ+2(1+1log(2)π2σ)}),
    C˜2(σ)=1+1log2π2σ,δ*(σ)=e log(2)1C˜2(σ),
    C1(σ,δ,M˜(u))=4M˜(u)C(σ)a(δ,σ)eδσ,
    μ3=σδ*C˜2(σ)1+2 log(2N), and 
    Q(σ,δ*(σ),N,M˜(u))=C1(σ,δ*(σ),M˜(u))exp(σδ*(σ)C˜2(σ))max{1,C1(σ,δ*(σ),M˜(u))}N|1C1(σ,δ*(σ),M˜(u))|.

Theorem 2

Suppose that uC0(Γ;) has an analytic extension on Eσ^1,,σ^N and is absolutely bounded by M˜(u). If w > N / log 2 and a sparse grid with Clenshaw-Curtis abscissas is used, then the following bound is valid:

uSwm,guLρ(Γ)Q(σ,δ*(σ),N,M˜(u))ημ3(σ,δ*(σ),N) exp(Nσ21/Nημ2(N)), (10)

Furthermore, if wN/log 2 then the following algebraic convergence bound holds:

uSwm,guLρ(Γ)C1(σ,δ*(σ),M˜(u))max{1,C1(σ,δ*(σ),M˜(u))}N|1C1(σ,δ*(σ),M˜(u))|ημ1. (11)

Proof This is proved by applying the inequality of equation (9) to the proof of Theorems 3.10 and 3.11 in [25]. □

In many practical cases not all dimensions of Γ are equally important. In these cases the dimensionality of the sparse grid can be significantly reduced by means of anisotropic sparse grids. It is not hard to construct associated anisotropic sparse approximation formulae by having the restriction function g depend on the input random variables qn.

2.4. Analyticity

Throughout this paper we will apply several important complex analysis results. Suppose that f:U is a complex valued function defined on the open set U. Rewrite f into real and complex parts as f(z) := u(z) + iv(z), where z = x + iy. The Cauchy-Riemann equations are then stated as:

ux=vyuy=vx (12)

From the Cauchy-Riemann criterion (Chap1, p3 in [14]) if f satisfies equation (12) and is continuously differentiable on the real axis then f is analytic.

We state Hartog’s theorem (Chap1, p32 in [20]). This theorem allows us to determine analyticity for multivariate functions a series of single dimensional analytic functions.

Theorem 3 (Hartog’s theorem)

Let UN be an open set and f:U. Suppose that for each j = 1,…,N and each fixed z1,…,zj−1,zj+1,…,zN the function ψf(z1,…,zj−1,ψ,zj+1,…,zN) is holomorphic, in the classical one variable sense, on the set U(z1,,zj1,zj+1,,zN){ψ:(z1,,zj1,ψ,zj+1,,zN)U}. Then f is continuous on U.

From Osgood’s lemma continuity of the function f on U implies analyticity (Chap1, p2 in [14]).

3. Analyticity of the Newton iteration

It is profitable here for purposes of clarification to consider the Newton iteration in a general function space context. Specifically, let X and Y (see section 2) be Banach spaces. Consider the following problem: Find xDX such that

f(x,q)=0, (13)

where qΓ and f : D × ΓY . Equation (13) is then solved using the Newton iteration under the conditions of the Newton-Kantorovich Theorem (see section 2).

The convergence rate of the Newton iterates based on a sparse grid approximation as a function of grid size is directly affected by the regularity properties with respect to parameters qΓ. Regularity is characterized in terms of an analytic extension in N of the iterates.

In the sequel we will treat qΓ as a random parameter, which will be suppressed occasionally. Thus for example, below we will write ffq, JJq, MvMv,q, etc., and we can write f(x, q) ≡ fq(x). For any qΓ (which we fix for now) consider the Newton sequence

xv=Mv(xv1)xv1J(xv1)1f(xv1). (14)

where J : XY is the Fréchet derivative of f : DY . Assume that x0D0D and for all v let DvD be the successive images under the map Mv, so that Mv(Dv−1) = Dv. Note at this point the random parameter qΓ is unchanging throughout the iteration; the iterated domains Dv however depend on q.

Suppose that the parameter q is now extended to a complex parameter g with gΨΓ, where ΨN. We can now form a complex extension of the sequence (14) as follows. We will complexify the pair (x, q) into a pair of complex variables (z, g), with z the complexification of x. Assume that for f(x0) ≡ fq(x0) : D0E0 there exists an analytic extension f(x0) ≡ fg(z0) : Θ0Φ0, where Θ0 and Φ0 are contained in a suitable complex Banach spaces, which are the respective complex extensions of X and Y . Given a function f on a real linear domain D, and a function f* on a complex linear domain ΘD, we say that f* is an analytic extension of f if f* is analytic on its domain, and the restriction f*|D = f. When there exists an analytic extension f* we say that f can be analytically extended.

Remark 4 Note that as before we write ffg, JJg, MvMv,g, etc. It is understood from context that the notational equivalence is over the extension of the variables (x, q) into the complex pair (z, g).

Similarly, assume that J(x0) ∈ L(D0,E0) can be extended analytically as J(z0) ∈ L(Θ00). Here L(·,·) is the space of bounded operators between two spaces. Through the above complexifications, equation (12) defines a complexification of the mapping Mv. We now repeat the above iteration using the complexified maps defined here. Thus there exists a series of sets Θ0,…,Θv and Φ0,…,Φv, such that for f(xv) : DvEv and J(xv) ∈ L(Dv,Ev) the analytic extensions f(zv) : ΘvΦv and J(zv) ∈ L(Θvv) are onto. Thus the sequence (14) is extended in N as follows: Let z0 = x0 and for all v and gΨ (which we also fix) form the sequence

zv=Mv(zv1)zv1J(zv1)1f(zv1), (15)

where Mv : Θv−1Θv.

Remark 5 The domain Θ0 contains the initial condition z0. Under certain assumptions and with a judicious choice of Θ0 it can be shown that the sequence in (15) converges in a pointwise sense inside Θ0. This will be explored in detail in section 3.1.

Suppose that zv is an analytic extension of xv on ΨN (see Figure 3). Then the convergence rates of the sparse grid applied to any entry of interest of zv can be characterized. The size of the set Ψ determines the regularity properties of the solution. From the sparse grid discussion in section 2.3 we embed a polyellipse Eσ^1,,σ^NΠn=1NEn,σ^n in Ψ. From Theorem 10 the Lρ(Γ) convergence rate of the sparse grid is sub-exponential (or algebraic) with respect to the number of sparse grid knots η. The decay of the sparse grid is dominated by σ = minn=1,…,N σn. Thus, the larger σ is the faster the convergence rate.

Fig. 3.

Fig. 3

Analytic extension of the domain Γ. Any vector qΓ is extended in Ψ by adding a vector vN i.e. g = q + v.

Remark 6 For finite dimensional spaces X=Y=m, m0, the Fréchet derivative J corresponds to the Jacobian of f. In the rest of the paper it is assumed that Dm and ‖ · ‖ corresponds to the standard Euclidean norm or the standard matrix norm, depending on context. For the case of the power flow equations m will be simply related to the number of nodes of the power system [8]. We will be using the notion of analytic extensions, which can be defined as follows.

We can now prove an important theorem for our purposes. First, denote f(zv) : ΘvΦv as fv, and J(zv) ∈ L(Θvv) as Jv (note that this depends on the generic initial z0Θ0 at which the Jacobian is computed).

Theorem 4

Assume that for all v0:DvX and

  1. fv : DvEv can be analytically extended to fv : ΘvΦv.

  2. There exists a coefficient cv > 0 such that
    σmin([JRvJIvJIvJRv])cv,

    where σmin(·) refers to the minimum singular value, JRvRe Jv and JIvIm Jv.

Then for all v0 there exists an analytic extension of xv on Ψ.

Proof The main strategy for this proof is to use the Cauchy-Riemann equations. This avoids having to explicitly show that the inverse of the complex Jacobian matrix Jv is analytic. The existence of an analytic extension for zv for each separate complex dimension is shown. The Hartog’s theorem and Osgood’s lemma (Chap1, p2 in [14]) is then used to show analyticity with respect to all the complex dimensions.

For fixed v consider the extension xvzv = xv + wv in Θv, where wvm and zvΘv. In complex form zv=zRv+izIv, where zRv=Re zv, and zIv=Im zv. Furthermore, consider the extension of qg = q + v in Ψ, where vN and gΨ. The extension of the iteration (14) on Θv ×Ψ leads to the following block form iteration

[JRvJIvJIvJRv]([zRv+1zIv+1][zRvzIv])=[fRvfIv], (16)

where fRvRe fv and fIvIm fv. From (ii) it follows that equation (16) is well posed and is a valid extension of equation (14) on Θv × Ψ. We now show that zv+1 is an analytic extension on Θv × Ψ.

We focus our attention on the kth variable of zv as zkv and write it in complex form as zkv=s+iw. By differentiating equation (16) with respect to s and w we obtain

s[JRvJIvJIvJRv][[zRv+1zIv+1][zRvzIv]]+[JRvJIvJIvJRv]s[[zRv+1zIv+1][zRvzIv]]=s[fRvfIv]w[JRvJIvJIvJRv][[zRv+1zIv+1][zRvzIv]]+[JRvJIvJIvJRv]w[[zRv+1zIv+1][zRvzIv]]=w[fRvfIv]. (17)

From assumption (ii) we conclude that szRv+1, szIv+1, wzRv+1 and wzIv+1 exist on Θv × Ψ. The following step is to show that the Cauchy-Riemann equations for zv+1 are satisfied on Θv × Ψ.

Let P(zv)szRvwzIv and Q(zv)wzRv+szIv, then from equation (17)

[(sJRvwJIv)(sJIv+wJRv)(sJIv+wJRv)(sJRvwJIv)]([zRv+1zIv+1][zRvzIv])+[JRvJIvJIvJRv]([P(zv+1)Q(zv+1)][P(zv)Q(zv)])=[sfRvwfIvsfIv+wfRv].

Now, (i) implies that JvL(Dv,Ev) can be analytically extended to JvL(Θvv). Since J(zv, g) and fv(zv, g) are analytic on Θv ×Ψ then from the Cauchy-Riemann equations

[(sJRvwJIv)(sJIv+wJRv)(sJIv+wJRv)(sJRv+wJIv)]=0 and [sfRvwfIvsfIv+wfRv]=0.

Since zkv is a linear polynomial of s + iw then P(zv) = Q(zv) = 0 on CN and thus P(zv+1) = Q(zv+1) = 0 on Θv × Ψ. We conclude that zv+1 is analytic for the kth variable for all zvΘv and gΨ. Following a similar argument we can show that for l = 1,…,N the lth variable extension of q has leads to an analytic extension of zv+1 whenever zvΘv and gΨ. We now extend the analyticity of zv+1 on all of Θv × Ψ.

Since zkv+1 is analytic for all k = 1,…,m and the lth variable of q has an analytic extension for all l = 1,…,N whenever zvΘv and gΨ, then from Hartog’s theorem we conclude that zv+1 is continuous on Θv × Ψ. From Osgood’s lemma it follows that zv+1 is analytic on Θv ×Ψ. From an induction argument and using that fact that the composition of analytic functions is analytic then it follows that zv+1 is analytic in Ψ, for all v. □

If the assumptions of Theorem 4 are satisfied then zv is complex analytic in Ψ and it is reasonable to construct a series of sparse grid surrogate models of the entries of the vector xv. Note that in practice we restrict out attention to a subset of the variables of interest of xv. With a slight abuse of notation denote Swm,g[xv(q)] as the sparse grid approximation of the entries of interest of the vector xv.

From Theorem 2 we observe that the accuracy of the sparse grid approximation is a function of i) the size of the polyellipse Eσ^1,,σ^NΨ and ii)

M˜(zv)=supzvΘv,k=1,,m|zkv(g)|.

Remark 7 It is important to note that if the complex sequence (15) does not converge, then the size of the sets Θv can become unbounded. In particular, it is possible that M˜(zv) as v → ∞ even if J(zv)−1L(Φvv) exists for all v0. Thus the sparse grid error bound given by Theorem 2 explodes. A control of the size of the sets Θv are need. To this end we shall use the Newton Kantorovich theorem to show that the sets Θv are bounded and there exists a constant cv > 0 such that σmin([JRvJIvJIvJRv])cv>0.

Our objective now is to analyze under what conditions the complex sequence remains bounded. In particular, for all v0, we ask if it is possible to construct bounded regions Um and ΨN such that zv is contained in U and thus

M˜(zv)supzvUzv.

To help answer this question we first show that the complex sequence (15) is itself a Newton sequence.

Remark 8 We have to clarify what we mean by the Fréchet derivative of the complex function f : ΘvΦv. The algebraic problem of equation (13) can be complexified as follows: Find zΘ0 such that f(z, g) = 0 for all gΨ. This can be re-written in vector form as: Find z = zR + izIΘ0 such that Ref(zR, zI, gR, gI) = 0 and Im f(zR, zI, gR, gI) = 0 for all g = gR +igIΨ. The corresponding Newton iteration is based on

[zRfRvzIfRvzRfIvzIfIv]([zRv+1zIv+1][zRvzIv])=[fRvfIv], (18)

where zRfRv is the Fréchet derivative of fRv with respect to the variables zR and similarly for the rest. We refer to the matrix

Jzv[zRfRvzIfRvzRfIvzIfIv]

as the Fréchet derivative of f : ΘvΦv.

Lemma 1

Suppose assumption i) of Theorem 4 are satisfied. Then the complex analytic extension J(zv) ∈ L(Θv, Φv) of J(xv) ∈ L(Dv,Ev) is equivalent to the Fréchet derivative of fv : ΘvΦv, i.e.

[JRvJIvJIvJRv]=[zRfRvzIfRvzRfIvzIfIv].

Proof We first prove this result for m = 1 dimension. Suppose that f:D, is a Fréchet differentiable function and let f:Ξ be the analytic continuation on the non-empty open set Ξ. The analytic function f:Ξ can be rewritten as f(x, y) = fR(x, y) + ifI(x, y) for all x + iyΞ. Since f is analytic on Ξ, from the the identity theorem [1] (uniqueness of complex analytic extensions) we have that fR and fI are unique in Ξ. Furthermore, since f is analytic the Cauchy-Riemann equations are satisfied. Thus

[xfRxfIxfIxfR]=[xfRyfRxfIyfI] (19)

in Ξ and f:Ξ is Fréchet differentiable. Now, xf(x, y) = xfR(x, y) + i∂xfI(x, y) in Ξ, and from the uniqueness property of the Fréchet derivative all the terms are unique. Recall that xf(x) defined in D is the Fréchet derivative of f:D. Write the analytic extension of xf(x) (defined in D) on Ξ as g(x, y) + ih(x, y), with x + iyΞ. Since xf(x) = xf(x, y) = xfR(x, y) + i∂xfI(x, y) for y = 0 and xD, from the uniqueness of the analytic extension we conclude g = xfR and h = xfI for all x + iyΞ. From equation (19) the conclusion follows.

We can now prove our statement for the general case using a simple extension of the above argument. Since fv : ΘvΦv is complex analytic, from the identity theorem [1] it is the unique extension of fv : DvEv. (Note that the unique extension of the identity theorem applies in multi-variate case, which includes the variables xv and q in the domains Θv and Ψ respectively.) From the Cauchy-Riemann equations the functions fv : ΘvΦv are Fréchet differentiable and unique. Now, with a slight abuse of notation, denote JZRv as the Jacobian of fv : ΘvΦv, with respect to the real variables ZRv only. By using the above one dimensional argument we can show that the analytic extension of each entry of J(xv) matches JZv on the real part of Θv. From the Cauchy-Riemann equations we conclude that J(zv) ∈ L(Θv, Φv) is equivalent to the Fréchet derivative of fv : ΘvΦv. □

From Theorem 1 it follows that the complex sequence (15) is a Newton sequence. We can now apply the Newton-Kantorovich Theorem to study the sequence convergence as v → ∞.

3.1. Regions of Analyticity

The size of a polyellipse embedded in the domain Ψ and the magnitude of zvΘv (for any v0) directly impacts the accuracy of the sparse grid (c.f. Theorem 2). For each v0 the size of the domains Θv and Ψ will be characterized by the magnitude of the minimum singular value

σmin([JRvJIvJIvJRv])cv>0 (20)

for some cv > 0. However, constructing the domain Ψ would require imposing inequality conditions for each Newton iteration. This leads to a highly complex coupled problem that is hard to solve. Moreover, if the complex extension zv grows rapidly with respect to v then the size of the domain Ψ will be most likely severely constrained. In contrast, by applying the Newton-Kantorovich Theorem it is sufficient to impose conditions on the initial Jacobian (v = 0) to construct a region of analyticity for ΨN. Furthermore, the size of the iteration zv will be controlled.

Consider the iteration

αv+1=αvJ(αv,g)1f(αv,g), (21)

where gΨ,

α0[x00],  αv[zRvzIv],  J(αv,g)[JRvJIvJIvJRv], and f(αv,g)[fRvfIv],

for all v0.

Remark 9 From Lemma 1 or, alternatively, the Cauchy-Riemann equations, the matrix J(αv,g) corresponds to the Fréchet derivative of f(αv,g). Thus the sequence (21) is an Newton iteration and the Newton-Kantorovich Theorem can be used to analyze its convergence properties.

Assumption 2

For all qΓ Assumption 1 is satisfied.

Assumption 3

Assume that D˜, where DD˜, is an open convex set in 2m and the following Lipschitz condition is satisfied:

J(x,g)J(y,g)λexy,

for all x, yD˜, gΨ, and λe ≥ 0. Furthermore assume that for all gΨ

J(α0,g)1xe,
J(α0,g)1f(α0,g)δe,
he=2xeλeδe1,

and U(α0,te*)D˜, where te*=2he(11he)δe.

Theorem 5

If Assumption 3 is satisfied then for all gΨ

  1. The Newton iterates αv+1 = αv + J(αv, g)−1f(αv, g) exist and αv  U(α0,te*)D˜.

  2. α* := limv→∞ αv exists, α*U(α0,te*)¯, and f(α*, g) = 0.

  3. J(αv, g)−1 exists for all v and equation (20) is satisfied.

Proof From Theorem 1 and Remark 9 we have that the iterates αv+1 = αv + J(αv, g)−1f(αv, g) are a valid Newton sequence. In particular, J(αv, g) corresponds to the Fréchet derivative of f(αv, g). From Assumption 3 and the Newton-Kantorovich theorem [2] the result follows.

Remark 10 Recall from condition (ii) of Theorem 4 for each v the minimum singular value of the Jacobian matrix J(αv, g) is bounded by a constant cv > 0. The constant cv can be obtained from the proof of the Newton-Kantorovich theorem. See the details of the proof of Theorem 2.2.4 in [2]. In particular, the proof of this theorem shows the existence of a sequence of constants bv such that σmax(J(αv, g)−1) ≤ bv (Equation (2.2.21) on page 44). Note that the sequence b0,…,bv depends on the initial parameters ϰe, δe and λe , i.e. bv(ϰe, δe, λe).

Remark 11 From Assumptions 2 and 3 and from the fact that extended Newton iteration is a valid extension of the sequence (14) then we have that λeλ, ϰeϰ, δeδ, heh. This implies that te*t* and therefore U(x0,t*)U(α0,te*) for all gΨ (See Figure 4). From the Newton-Kantorovich Theorem it follows that that αvU(α0,te*) for all  v0.

Fig. 4.

Fig. 4

Region of convergence U(α0,te*) for the extended Newton iteration.

We can construct a region Ψ such that for all gΨ the extended Newton iteration converges. Let y=[yRyI]=[RegImg], and apply the multivariate Taylor theorem for each k = 1,…,n, l = 1,…,n entry of the Jacobian matrix JR(α0, g). Evaluating g at q + v, we have that

[JR(α0,q+vR,0+vI)]k,l=[JR(x0,q,0)]k,l+Rk,l(x0)][vRvI],

vR := Rev, vI := Im v, and

Rk,l(x0)=[Rk,l1(x0),,Rk,l2m(x0)].

The entries of the remainder term Rk,l(x0) are bounded by

|Rk,lβ(x0)|maxt(0,1)|yβ[JR(x0,[q0]+t[vRvI])]k,l|,

where refers to the derivative of the βth variable of the vector y. Form the matrix

E[0JI(x0,q,vR,vI)JI(x0,q,vR,vI)0]+[Q(x0,q,vR,vI)00Q(x0,q,vR,vI)]

and let

Qk,l(x0,q,vR,vI)=Rk,l(x0)[vRvI]

be the k = 1,…,n, l = 1,…,n entry of the matrix Q. Then

[JR(x0,q)JI(x0,q,vR,vI)JI(x0,q,vR,vI)JR(x0,q)]=J+E=J(I+J1E),

where J[JR(x0,q)00JR(x0,q)].

Theorem 6

Suppose that ϰeϰ and

E(α0,g)<1xxex

whenever gΨ then

J(α0,g)1xe.

Proof First note that

J(α0,g)1(J(I+J1E))1J1(I+J1E)1. (22)

From Lemma 2.2.3 in [13], if J1E2<1 then (I+J1E) is invertible and

(I+J1E)1<11J1E.

Given that J(x0,q)1x (From Assumption 1) whenever qΓ, it follows

J1I+J1E<x1J1E and J1EJ1ExE. (23)

We conclude that if

E(α0,g)<1xxex

whenever gΨ, then from Equations (22) and (23)

J(α0,g)1xe.

Applying the multivariate Taylor’s theorem for each k = 1,…,n, entry of the vector fR(α0, g) where g = q + v, we have

[fR(α0,q+vR,0+vI)]k=[fR(x0,q,0)]k+Sk(x0,q,0)][vRvI],

where vR := Rev, vI := Im v, and

Sk(x0,q,0)=[Sk1(x0,q,0),,Sk2m(x0,q,0)].

The remainder term Sk(x0, q) is bounded by

|Skβ(x0,q,0)|maxt(0,1)|yβ[fR([q0]+t[vRvI])]k|,

where refers to the derivative of the βth variable of y. We can now rewrite the vector f(α0, g) as

f(α0,g)=F(x0,q)+G(x0,q,vR,vI),

where F[fR(x0,q)0], G[P(x0,q,vR,vI)fI(x0,q,vR,vI)] and

Pk(x0,q,vR,vI)=Sk(x0,q,0)[vRvI].

Theorem 7

Suppose that ϰeϰ and δeδ. Then if

G(α0,g)<δexeδx

whenever gΨ, it follows

J(α0,g)1f(α0,g)δe.

Proof For each of the entries k = 1,…,n of the vector P.

J(α0,g)1f(α0,g)=(J+E)1(F+G)(I+J1E)J1(F+G)***(I+J1E)J1F+(I+J1E)J1G***I+J1EJ1F+I+J1EJ1G***I+J1Eδ+I+J1EGx. (24)

Since E2<1xxex<x (from Theorem 6) from Lemma 2.2.3 in [13] it follows that (I+J1E) is invertible and

(I+J1E)1<11J1Exex. (25)

Combining equations (24) and (25) we have

J(α0,g)1f(α0,g)<xex(δ+G(x0,q,vR,vI)x).

The result follows. □

From the values of ϰ, δ, λ and ϰe, δe, λe and Theorems 6 and 7, the region of analyticity Ψ can be constructed. From this region, convergence rates from Theorem 2 for the sparse grid interpolation can be estimated. If we are interested in forming a sparse grid for each of the entries of the vector xvn, then there are potentially n sparse grids. To estimate the convergence rate of the sequence of sparse grids it is sufficient to embed a polydisk Eσ1,,σNΨ for a suitable set of coefficients {σ1 …,σN}. Furthermore since αvU(α0,te*), the maximal coefficient M˜(zv) can be bounded as

M˜(zv)te*+x0l2(2m).

4. Application to power flow

The theory developed in Section 3 can be applied to the computation of the statistics of stochastic power flow. In particular we concentrate on the random perturbations of the generators, loads and admittance uncertainty of the transmission lines. Much of the power system network model presented in this section is based on [8].

Consider a network with m+1 mechanical constant power generators. The electrical power injected into the network at each generator is given by

PGk=l=0mVkVlsin(θkθl+φk,l)|Yk,l|, (26)

where the operands of the summation are the power from bus k transmitted to bus l through a line with admittance Yk,l = Gk,l + iBk,l, phase shift φk,l, and voltage Vk at the buses. These form the algebraic constraints of the power system. The dynamic constraints at generator k are given by

Mkθ¨i+Dkθ˙i+PGk=PMk+PIk(ω)+PLk(ω) (27)

where Mk is the moment of inertia of generator i, Dk is the damping factor, PMk denotes the mechanical power, PLk(ω) is the stochastic load and PIk(ω) is the intermittent stochastic power applied to bus i. Equations (26) and (27) constitute the swing equation model. Since the intermittent power generators and loads are stochastic, the rotor angle θk(ω) and power generation PGk(ω) will be stochastic as well. A simple example of a 3 bus power system is shown in Figure 5. From the steady state response the power flow equations are given by

Pk(x)=l=0mVkVl[Gikcos(θkθl)+Biksin(θkθl)]
Qk(x)=l=0mVkVl[Giksin(θkθl)+Bikcos(θkθl)]

for k = 0,…,m.

Fig. 5.

Fig. 5

2 generators, 3 buses, 1 load simple power system example. This figure is modified from [10]. Bus 1 is the slack bus. Bus 2 contains a stochastic generator. Bus 3 contains the random load. Note that voltages and power flows are in p.u.

It is assumed that at each node the active and reactive power injections (or loads) are given by P1,…,Pm and Q1,…,Qm. The first bus is assumed to be slack bus with known angle θ0 = 0 and fixed voltage V0.

Remark 12 According to power system convention, the numbering of the buses (nodes) starts with 1 instead of 0. To simplify the notation in this section we start from 0. However, for the examples and numerical results we revert to the power system standard.

In this paper we limit our discussion of power flow to the case where the power injections P1,…,Pm and Q1,…,Qm are assumed to be known, but could be stochastic. The unknowns are formed by the angles θ1,…,θm and voltages V1,…,Vm. The power flow equations are solved with a Newton iteration and posed as

θ[θ1θm],V[V1Vm],x[θV],f(x)=[ΔP(x)ΔQ(x)],

where

ΔP(x)[P1(x)P1Pm(x)Pm] and ΔQ(x)[Q1(x)Q1Qm(x)Qm].

The Jacobian matrix is given in block form as

J=[J11J12J21J22],

where J11, J12, J21, J22n×n. For k, l = 1,…,m let θk,l := θkθl and if kl

Jk,l11=VkVlGk,lsin(θk,l)Bk,l(cos(θk,l)),
Jk,l21=VkVlGk,lcos(θk,l)+Bk,l(sin(θk,l)),
Jk,l12=VkGk,lcos(θk,l)+Bk,l(sin(θk,l)),
Jk,l22=VkGk,lsin(θk,l)Bk,l(cos(θk,l)),

otherwise

Jk,k11=Qk(x)Bk,kVk2Jk,k21=Pk(x)Gk,kVk2Jk,k12=Pk(x)Vk+Gk,kVkJk,k22=Qk(x)VkBk,kVk.

It is clear from the structure of f and the Jacobian J that they are analytic everywhere except for Vk = 0, for k = 1,…,m. However, in practice the domain Θ0 is chosen such that the origin is avoided. Otherwise the analyticity assumptions of Theorem 4 are not satisfied.

There are many forms of uncertainty that can be present in the solution of the power flow equations. We concentrate on the following cases:

  • Random loads: The power loads Pk and Qk, k = 1,…,m, will be a function of the random vector qΓ:
    Pk+iQk=Pk0(1+ckqk)+Qk0(1+ck+1qk+1),

    where Pk0 and  Qk0 are the nominal power loads (or generators), qk ∈ [−1, 1] and ck,ck+1.

  • Random admittances: The transmission line admittances Yk,l will be functions of the random vector qΓ. Let A be the set of network index tuples (k, l) such that the admittance is stochastic. Thus for all k,lA let
    Yk,l=Gk,l+iBk,l=Gk,l0(1+ck,l,1qk,l,1)+Bk,l0(1+ck,l,2qk,l,2),
    where Gk,l0 and Bk,l0 are the nominal conductance and susceptance, qk,l,1, qk,l,2 ∈ [−1, 1] and ck,l,1,ck,l,2. Note that with a slight of abuse of notation the vector q consists of all the stochastic random variables {qk,l,1,qk,l,2}(k,l)A.

For sufficiently small coefficients ck, ck+1 with k = 1,…,m and ck,l,1, ck,l,2 for all tuples (k,l)A the assumptions of Theorems 5, 6 and 7 are satisfied for some initial condition x0 and thus we can justify the use of the sparse grids. Due to the extent of a detailed analysis of the size of these coefficients, it is left for a future work emphasizing the details of power systems. However, in Appendix A we present the case for random generators and loads.

We test the sparse grid approximation on the New England 39 Bus, 10 Generator, power system model provided from the Matpower 6.0 steady state simulator [21, 32]. In this model buses 1 – 29 are PQ buses, buses 30, 32–39 are generators and bus 31 is the reference (slack).

The following numerical examples indicate that the conclusions of the analyticity theorems we proved are valid i.e. algebraic and sub-exponential convergence of the stochastic norm with respect to the random variables. This is despite the conservative bounds placed on the analyticity region of Ψ from the Newton-Kantorovich Theorem. We expand on this point in the conclusion section.

Two tests are performed. We randomly perturb either the loads or the admittances of the transmission lines. The mean and variance of the voltage V22 at bus 22 are computed. The mean E[V22] and variance var[V22] are computed with the Clenshaw-Curtis isotropic Sparse Grid Matlab Kit [4, 30] for N = 2, 4, 12 dimensions and up to the w = 7 level. This last level, w = 7 is taken as the “true” solution. The errors are computed up to level w = 4 with respect to this solution. Two tests are performed:

  • Random loads: The loads are considered stochastic and are perturbed by up to ± 50% of their nominal value. For each k = 1,…,N, the so-called kth PQ bus is stochastically perturbed as
    Pk+iQk=Pk0(1+qk2)+Qk0(1+qk2),

    where Pk0 and Qk0 are the nominal power loads, qk ∈ [−1, 1], ρ(qk) has a uniform distribution and the random variables q1,…,qN are independent. Note that although the load random perturbations are independent, the power flows will be dependent on all the random variables q1,…,qN.

    In Figure 6 (a) & (b) the mean and variance convergence error for the stochastic voltage V22 of bus 22 are shown. A surrogate model based on the sparse grid operator is formed as Swm,g[V22] with Clenshaw-Curtis abscissas. Each of the circles corresponds to a sparse grid Swm,g starting with level w = 1 up to level w = 4. The y-axis corresponds to the error of the mean or variance. The x-axis is the number of sparse grid knots needed to form the grid Swm,g. The dimension of the sparse grid is given by N = 2, 4, 12.

    From Figure 6 (a) & (b) we observe that the error decreases faster than polynomially with respect to the number of knots η. As we increase the number w of levels, sub-exponential convergence is achieved. This is much faster than the η12 convergence rate of the Monte Carlo method. For example, for N = 12, then mean is computed approximately 1011 times faster for the same accuracy. This is the difference between 2 hours of computation on a simple 4 core processor and 20 million years with Monte Carlo. However, as the number of dimensions N increases the convergence rate of the sparse grid decreases, as predicted by Theorem 2. Moreover, if the level w is not large enough then the error bound gives algebraic convergence.

  • Random transmission line admittances: The admittances of the network are assumed to be random with
    Yk,l=Gk,l+iBk,l=Gk,l0(1+qk,l,12)+Bk,l0(1+qk,l,22),

    where Gk,l0 and Bk,l0 are the nominal conductance and susceptance. The coefficients qk,l,1, qk,l,2 ∈ [−1,1] have a uniform distribution and are all independent. Figure 6 (c) & (d) indicate sub-exponential convergence of the mean and variance of the voltage V22 at bus 22 for a sufficiently large number of knots. However, as the number of stochastic dimensions N increases to 12, the convergence rate decreases and almost approaches polynomial convergence. From Theorem 1 the sufficient condition w > N/log2 leads to subexponential convergence. For N = 12 we have that w has to be larger than 18 to guarantee sub-exponential convergence. In Figures (c) & (d) the largest level for w is 4.

Fig. 6.

Fig. 6

Sparse grid convergence rates. (a) & (b) Mean and variance error of the voltage V22 of bus 22 given a stochastic load perturbation with dimension N and the number of knots of the sparse grid. (c) & (d) Mean and variance of error of the voltage V22 of bus 22 given a random admittance with dimension N. Notice that for all 4 cases the convergence rates are faster than polynomial, indicating a sub-exponential convergence rate.

5. Conclusions

In this paper we have introduced ideas from UQ and numerical analysis, typically used in the field of stochastic PDEs, and applied them to non-linear stochastic networks. More specifically, these ideas are applied to the Newton iteration. We have developed a regularity analysis of the solution with respect to the random perturbations. Under sufficient conditions based on the Newton-Kantorovich Theorem there exists analytic extensions of the solution of the Newton iteration. These indicate that the application of sparse grids for the computation of the stochastic moments leads to sub-exponential or algebraic convergence. For a moderate number of dimensions the convergence rates are much faster than traditional Monte Carlo approaches (η12). In addition, numerical experiments applied to the power flow problem confirm these subexponential and algebraic convergence rates.

A weakness in the application of the Newton-Kantorovich Theorem is that it constricts the size of the region of analyticity Ψ, thus leading to a conservative convergence rate of the sparse grid. This motivates the application of less restrictive methods such as damped Newton iterates [5]. In addition, if we incorporate the assumption that all the Newton iterations converge for each of the knots of the space grid, then by developing an a posteriori method convergence rates can be further improved.

Future work includes the important application of this method to the security constrained problem [28] from the probabilistic perspective. In other words, given stochastic perturbations of the loads and sources what are the optimal power injections into the grid such that the probability of failure is below a tolerance level. Current approaches rely on simplifications of the stochastic perturbations to deal with the high dimensions. However, this can lead to suboptimal results. The high dimensional stochastic quadrature approach developed in this paper will allow more optimal results.

A Analyticity regions for random generators and loads

We examine the case of random generators and loads to obtain convergence rates of the sparse grid. The task is to synthesize an analyticity region Ψ for the extended Newton iteration to converge. We only check that the conditions of Theorem 7 are satisfied and assume that 6 is satisfied. A full analysis will be done in a future work. Thus, we synthesis the region of analyticity for Ψ by checking that

G(α0,g)<δexeδx,

whenever gΨ. Without loss of generality assume that the first τm buses contain stochastic power generators (or loads) with active power P1(ω) := (q1 + v1,R + iv1,I)c1 + a1,,…,Pτ(ω) := (qτ +vq,R+ivq,I)cτ +aτ and reactive power Q1(ω) := (qτ+1 +vτ+1,R + ivτ+1,I)cτ+1 +aτ+1,…, Qτ(ω) := (qN +vN,R +ivN,I)cN +aN. The random vector qΓ is assumed to be stochastic with joint distribution ρ(q) and for all k = 1,...,N vk := vk,R + ivk,I is the complex extension of each qkΓk. Furthermore, for k = 1,…,N the variables ak and ck+, where ak + ck and ak indicate the maximum and minimum range of the stochastic perturbation. Thus we have

f(α0)[P1(x0)P1(ω)Pτ(x0)Pτ(ω)Pτ+1(x0)Pτ+1Pm(x0)PmQ1(x0)Q1(ω)Qτ(x0)Qτ(ω)Qτ+1(x0)Qτ+1Qm(x0)Qm],
Re f(α0)=[P1(x0)(q1+v1,R)c1+a1Pτ(x0)(qτ+vq,R)cτ+aτPτ+1(x0)Pτ+1Pm(x0)PmQ1(x0)(qτ+1+vq+1,R)cτ+1+aτ+1Qτ(x0)(qN+vN,R)cN+aNQτ+1(x0)Qτ+1Qm(x0)Qm],
Im f(α0)=[v1,Ic1    vτ,Icτ  0    0  |vτ+1,Icτ+1    vN,IcN  0    0]T,

and therefore

PP=[(v1,R+v1,I)c1(vτ,R+vτ,I)cτ00],PQ=[(vτ+1,R+vτ+1,I)cτ+1(vN,R+vN,I)cN00],

P=[PPQQ] and thus G22=k=1N(vk,R+vk,I)2ck2+k=1Nvk,I2ck2. The last equality is due to the integral remainder form of Taylor’s Theorem. Let vk,R = vR/ck and vk,I = vI/ck for k = 1,…,2N, where vR, vI, then for any ϵ > 0

G22=NvR2+2NvI2+2NvRvI***vR2N(1+2ϵ)+vI2N(2+ϵ1/2)***(δexeδx)2.

The last inequality is obtained by using Cauchy’s inequality. Let γeδexeδx, thus the inequality

vR2α2+vI2β21, (28)

where α2γe2N(1+2ϵ) and β2γe2N(2+ϵ1/2), forms an elliptical region Σ (See Figure 7) such that ‖G2γe.

Fig. 7.

Fig. 7

Embedding of Bernstein ellipse Eσ in the domain Φ.

Consider the region Φ := {g = q+vR +ivI |q ∈ [−1,1], (vR, vI) ∈ Σ} where ‖G2γe is satisfied. Suppose that we want to embed a Bernstein ellipse Eσ in Φ. To achieve this consider the foci points of Eσ at −1 and 1. It is not hard to show that eσeσ2eσ+eσ21 for σ > 0. At the foci point ±1 trace the ellipse from Equation (28) and set β=eσeσ2 (See Figure 7).

Choose an ϵ > 0 such that Eσ is embedded in Φ by solving the following equation

αβ=4+ϵ12(1+2ϵ)

leading to

ϵ=(c2+4)12c+24c>0,

where c := (α/β)2 > 0. Pick ϵ > 0 such that c = (α/β)2 = 1. Pick σ > 0 such that eσeσ2=β. This leads to

σ=log(β+(β2+1)12)

and eσ+eσ21β=α. The region bounded by the ellipse Eσ is therefore embedded in Φ.

A polyellipse in CN can now be constructed such that such that ‖G2γe. Recall that vk,R = vR/ck and vk,I = vI/ck for k = 1,…,N and consider the regions Ψk := {g = q + vk,R + ivk,I |q ∈ [−1, 1], (vR, vI) ∈ Σ}. By following the procedure for embedding Eσ in Φ for each k = 1,…,N an ellipse Eϱk can be embedded in Ψk with

ϱklog(βck+(β2ck2+1)12).

The polyellipse εϱ1,,ϱNEϱ1××EϱN is embedded in Ψ1 × ⋯ × ΨN. Thus for any gεϱ1,,ϱN

G(α0,g)γe.

With the region bounded by the polyellipse εϱ1,,ϱN the convergence rate of the sparse grid can be estimated with respect to the magnitude of the coefficients ck, for k = 1,...,N.

Remark 13 From this analysis we observe that the size of the analyticity region of Ψ depends directly on

β=γe2N(2+ϵ1/2),

where ϵ=5+14. The size of the region Ψ decays as square root with respect to the number of stochastic dimensions N, thus reducing the convergence rate of the sparse grid.

Example 1 Consider the 3 Bus simple power system with stochastic load and generator based on Example 10.6 in [8] and Figure 5. Bus 1 is the slack bus with Inline graphic. Bus 2 voltage is fixed as V2 = 1.05 p.u. and contains a stochastic generator PE = q1c1 + a1, where q1 ∈ [−1, 1] with a1 = 0.6661 p.u. and c1+, i.e. the generator is random within the range 0.6661±c1 . Bus 3 contains the random load with PL+iQL = (q2c2+a2)+i(q3c3+a3), where q2,q3 ∈ [−1, 1] with a2 = 2.8653 p.u., a3 = 1.2244 p.u. and c2, c3+, i.e. the load is random within the range 2.8653 ± c2 of the active power and 1.2244 ± c3 for the reactive power. The admittance matrix is to the network is

Y_{\mathrm{bus}} \approx i \begin{bmatrix} -20 & 10 & 10 \\ 10 & -20 & 10 \\ 10 & 10 & -20 \end{bmatrix}.

The vector of unknowns is $x := [\theta_2, \theta_3, V_3]^T$ and $f(x, q) = [P_1(x) - (q_1 c_1 + a_1),\, P_2(x) - (q_2 c_2 + a_2),\, P_3(x) - (q_3 c_3 + a_3)]^T$ for all $q \in \Gamma$, and therefore $f(\alpha, g) = [P_1(\alpha) - ((q_1 + v_{1,R} + iv_{1,I})c_1 + a_1),\, P_2(\alpha) - ((q_2 + v_{2,R} + iv_{2,I})c_2 + a_2),\, P_3(\alpha) - ((q_3 + v_{3,R} + iv_{3,I})c_3 + a_3)]^T$ for all $g \in \Psi$. With $q = 0$ the Newton algorithm converges in 20 iterations to $x^* = [-5.2361 \times 10^{-2}, -1.7445 \times 10^{-1}, 0.9500]^T$ with a $10^{-15}$ tolerance. Furthermore,

J^{-1}(x^*) = \begin{pmatrix} 0.065427 & 0.033847 & 0.0013531 \\ 0.033847 & 0.07112 & 0.011273 \\ 0.001284 & 0.010697 & 0.065493 \end{pmatrix} \quad \text{and} \quad \varkappa = \|J^{-1}(x^*)\|_2 = 0.1043.

Since there are no direct stochastic components ($q_1$, $q_2$, $q_3$) in the Jacobian matrix, we have

J(x) = \begin{bmatrix} 10.5\left(\cos\theta_2 + V_3\cos(\theta_2 - \theta_3)\right) & -10.5\,V_3\cos(\theta_2 - \theta_3) & 10.5\sin(\theta_2 - \theta_3) \\ -10.5\,V_3\cos(\theta_3 - \theta_2) & 10\,V_3\cos\theta_3 + 10.5\,V_3\cos(\theta_3 - \theta_2) & 10.5\sin\theta_3 + 10.5\sin(\theta_3 - \theta_2) \\ -10.5\,V_3\sin(\theta_3 - \theta_2) & 10.5\,V_3\left(\sin\theta_3 + \sin(\theta_3 - \theta_2)\right) & -\left(10\cos\theta_3 + 10.5\cos(\theta_3 - \theta_2) - 39.96\,V_3^2\right) \end{bmatrix}.
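As a cross-check of the reported values, the sketch below evaluates the Jacobian above (as transcribed here) at $x^* = [-5.2361\times 10^{-2}, -1.7445\times 10^{-1}, 0.9500]^T$ and computes $\|J^{-1}(x^*)\|_2$, which should be close to the value $\varkappa = 0.1043$ quoted above. This is only a verification sketch, not the power flow solver used in the paper.

```python
import numpy as np

def jacobian(x):
    """Jacobian of the 3-bus mismatch equations, transcribed from the matrix above."""
    th2, th3, V3 = x
    return np.array([
        [10.5 * (np.cos(th2) + V3 * np.cos(th2 - th3)),
         -10.5 * V3 * np.cos(th2 - th3),
         10.5 * np.sin(th2 - th3)],
        [-10.5 * V3 * np.cos(th3 - th2),
         10.0 * V3 * np.cos(th3) + 10.5 * V3 * np.cos(th3 - th2),
         10.5 * np.sin(th3) + 10.5 * np.sin(th3 - th2)],
        [-10.5 * V3 * np.sin(th3 - th2),
         10.5 * V3 * (np.sin(th3) + np.sin(th3 - th2)),
         -(10.0 * np.cos(th3) + 10.5 * np.cos(th3 - th2) - 39.96 * V3**2)],
    ])

x_star = np.array([-5.2361e-2, -1.7445e-1, 0.9500])
kappa = np.linalg.norm(np.linalg.inv(jacobian(x_star)), 2)  # spectral norm of J^{-1}(x*)
print(kappa)   # expected to be approximately 0.1043
```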

From the mean value theorem we have that

\lambda = \sum_{k,l=1}^{m} \|J_{k,l}(x)\|_{L^\infty(D \times \Gamma)} < \infty,

where $D$ is a bounded set and $J_{k,l}(x)$ is the entry in the $k$th row and $l$th column of the Jacobian matrix $J(x)$.

With the initial condition $x_0 = x^*(0)$, for small enough coefficients $c_1$, $c_2$ and $c_3$ we have that:

  1. $\|J^{-1}(x_0)\| \le \varkappa = 0.1043$.

  2. Furthermore, since $f$ is continuous and $f(x_0, 0) = 0$, then $\|J^{-1}(x_0) f(x_0, q)\| \le \delta < \infty$, where $\delta$ can be made arbitrarily small by taking $c_1$, $c_2$, $c_3$ sufficiently small.

  3. It follows that $h < 1$ for all $x \in \overline{B(x^*(0), t^*)} \subset D$, where $t^* = \frac{2}{h}\left(1 - \sqrt{1 - h}\right)\delta$.

  4. From the Newton–Kantorovich theorem the iteration converges for all $q \in \Gamma$ and

\|V_3(q)\|_{L^\infty(\Gamma)} \le \sup_{x \in \overline{B(x_0, t^*)}} |x[3]| \le t^* + \|x^*(0)\|_{l^2(\mathbb{R}^m)}.

Now, pick $\delta_e > 0$ such that $\delta < \delta_e$, and also pick $\varkappa_e = \varkappa$, $\lambda_e = \lambda$ such that $h < h_e \le 1$. From the random load analysis we have $\gamma_e := \delta_e \varkappa_e^{-1} - \delta \varkappa^{-1}$ and

\|G(\alpha_0, g)\|_2 \le \gamma_e

for all $g \in \mathcal{E}_{\varrho_1, \varrho_2, \varrho_3}$, where

\beta = \left(\frac{1 + \sqrt{5}}{6(2 + \sqrt{5})}\right)^{1/2} \gamma_e

and for k = 1,…,3

\varrho_k := \log\left(\frac{\beta}{c_k} + \left(\frac{\beta^2}{c_k^2} + 1\right)^{1/2}\right).
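For concreteness, $\beta$ and the $\varrho_k$ of this example are straightforward to evaluate once $\gamma_e$ and the perturbation magnitudes $c_k$ are fixed; the numerical values below are placeholders, since the paper leaves $\gamma_e$ and $c_1, c_2, c_3$ as free parameters.

```python
import numpy as np

gamma_e = 0.05                          # placeholder for delta_e * kappa_e^{-1} - delta * kappa^{-1}
c = np.array([0.05, 0.05, 0.05])        # placeholder magnitudes c_1, c_2, c_3

beta = np.sqrt((1.0 + np.sqrt(5.0)) / (6.0 * (2.0 + np.sqrt(5.0)))) * gamma_e
rho = np.log(beta / c + np.sqrt((beta / c)**2 + 1.0))
print(beta, rho)                        # parameters rho_1, rho_2, rho_3 of the polyellipse
```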

Assuming that the hypotheses of Theorem 6 are satisfied, it follows from Theorem 7 that whenever $g \in \Psi$,

\|J(x_0)^{-1} f(\alpha_0, g)\| \le \delta_e,

the Newton iteration converges and its limit is holomorphic in $\mathcal{E}_{\varrho_1, \varrho_2, \varrho_3} \subset \Psi^3$.

Acknowledgments

This material is based upon work supported by the National Science Foundation under Grant No. 1736392. Research reported in this technical report was supported in part by the National Institute of General Medical Sciences (NIGMS) of the National Institutes of Health under award number 1R01GM131409–01.

References

  • 1. Ablowitz MJ, Fokas AS (2003) Complex Variables: Introduction and Applications, 2nd edn. Cambridge University Press, Cambridge, UK
  • 2. Argyros IK (2008) Convergence and Applications of Newton-type Iterations. Springer
  • 3. Babuska I, Nobile F, Tempone R (2010) A stochastic collocation method for elliptic partial differential equations with random input data. SIAM Review 52(2):317–355, DOI 10.1137/100786356
  • 4. Bäck J, Nobile F, Tamellini L, Tempone R (2011) Stochastic spectral Galerkin and collocation methods for PDEs with random coefficients: A numerical comparison. In: Hesthaven JS, Rønquist EM (eds) Spectral and High Order Methods for Partial Differential Equations, Lecture Notes in Computational Science and Engineering, vol 76, Springer, Berlin Heidelberg, pp 43–62
  • 5. Bank RE, Rose DJ (1982) Analysis of a multilevel iterative method for nonlinear finite element equations. Mathematics of Computation 39(160):453–465
  • 6. Barthelmann V, Novak E, Ritter K (2000) High dimensional polynomial interpolation on sparse grids. Advances in Computational Mathematics 12:273–288
  • 7. Beck J, Nobile F, Tamellini L, Tempone R (2014) Convergence of quasi-optimal stochastic Galerkin methods for a class of PDEs with random coefficients. Computers & Mathematics with Applications 67(4):732–751, DOI 10.1016/j.camwa.2013.03.004
  • 8. Bergen AR, Vittal V (2000) Power Systems Analysis, 2nd edn. Pearson/Prentice Hall
  • 9. Castrillón-Candás JE, Nobile F, Tempone R (2016) Analytic regularity and collocation approximation for PDEs with random domain deformations. Computers and Mathematics with Applications 71(6):1173–1197
  • 10. Fiandrino C (2013) How can I do a power electric system in circuitikz? tex.stackexchange.com/questions/145197/how-can-i-do-a-power-electric-system-in-circuitikz
  • 11. Fishman GS (1996) Monte Carlo: Concepts, Algorithms, and Applications. Springer Series in Operations Research, Springer, New York, URL http://opac.inria.fr/record=b1079070
  • 12. Gerstner T, Griebel M (2003) Dimension-adaptive tensor-product quadrature. Computing 71(1):65–87
  • 13. Golub GH, Van Loan CF (1996) Matrix Computations, 3rd edn. Johns Hopkins University Press, Baltimore, MD, USA
  • 14. Gunning R, Rossi H (1965) Analytic Functions of Several Complex Variables. American Mathematical Society
  • 15. Hockenberry JR, Lesieutre BC (2004) Evaluation of uncertainty in dynamic simulations of power system models: The probabilistic collocation method. IEEE Transactions on Power Systems 19(3)
  • 16. Holst M (1994) The Poisson-Boltzmann Equation: Analysis and Multilevel Numerical Solution, 1st edn. Applied Mathematics and CRPC, California Institute of Technology
  • 17. Tang J, Ni F, Ponci F, Monti A (2015) Dimension-adaptive sparse grid interpolation for uncertainty quantification in modern power systems: Probabilistic power flow. IEEE Transactions on Power Systems 19
  • 18. Klimke A (2007) Sparse Grid Interpolation Toolbox – User's Guide. Tech. Rep. IANS report 2007/017, University of Stuttgart
  • 19. Klimke A, Wohlmuth B (2005) Algorithm 847: spinterp: Piecewise multilinear hierarchical sparse grid interpolation in MATLAB. ACM Transactions on Mathematical Software 31(4)
  • 20. Krantz SG (1992) Function Theory of Several Complex Variables. AMS Chelsea Publishing, Providence, Rhode Island
  • 21. Murillo-Sánchez CE, Zimmerman RD, Anderson CL, Thomas RJ (2013) Secure planning and operations of systems with stochastic sources, energy storage, and active demand. IEEE Transactions on Smart Grid 4(4):2220–2229
  • 22. National Academies of Sciences, Engineering, and Medicine (2016) Analytic Research Foundations for the Next-Generation Electric Grid. The National Academies Press, Washington, DC, DOI 10.17226/21919
  • 23. Nobile F, Tempone R (2009) Analysis and implementation issues for the numerical approximation of parabolic equations with random coefficients. International Journal for Numerical Methods in Engineering 80(6–7):979–1006, DOI 10.1002/nme.2656
  • 24. Nobile F, Tempone R, Webster C (2008) An anisotropic sparse grid stochastic collocation method for partial differential equations with random input data. SIAM Journal on Numerical Analysis 46(5):2411–2442
  • 25. Nobile F, Tempone R, Webster C (2008) A sparse grid stochastic collocation method for partial differential equations with random input data. SIAM Journal on Numerical Analysis 46(5):2309–2345
  • 26. Nobile F, Tamellini L, Tempone R (2016) Convergence of quasi-optimal sparse-grid approximation of Hilbert-space-valued functions: application to random elliptic PDEs. Numerische Mathematik 134(2):343–388, DOI 10.1007/s00211-015-0773-y
  • 27. Prempraneerach P, Hover F, Triantafyllou M, Karniadakis G (2010) Uncertainty quantification in simulations of power systems: Multi-element polynomial chaos methods. Reliability Engineering & System Safety 95(6):632–646, DOI 10.1016/j.ress.2010.01.012
  • 28. Roald L, Oldewurtel F, Krause T, Andersson G (2013) Analytical reformulation of security constrained optimal power flow with probabilistic constraints. In: 2013 IEEE Grenoble Conference, pp 1–6, DOI 10.1109/PTC.2013.6652224
  • 29. Smolyak S (1963) Quadrature and interpolation formulas for tensor products of certain classes of functions. Soviet Mathematics, Doklady 4:240–243
  • 30. Tamellini L, Nobile F (2009–2015) Sparse Grids Matlab Kit. http://csqi.epfl.ch/page-107231-en.html
  • 31. Trefethen LN (2012) Approximation Theory and Approximation Practice (Other Titles in Applied Mathematics). Society for Industrial and Applied Mathematics, Philadelphia, PA, USA
  • 32. Zimmerman RD, Murillo-Sánchez CE, Thomas RJ (2011) MATPOWER: Steady-state operations, planning, and analysis tools for power systems research and education. IEEE Transactions on Power Systems 26(1):12–19
