Kernel‐based active subspaces with application to computational fluid dynamics parametric problems using the discontinuous Galerkin method

Francesco Romor; Marco Tezzele; Andrea Lario; Gianluigi Rozza

doi:10.1002/nme.7099

. 2022 Sep 6;123(23):6000–6027. doi: 10.1002/nme.7099


PMCID: PMC9825883 PMID: 36632376

Abstract

Nonlinear extensions to the active subspaces method have brought remarkable results for dimension reduction in the parameter space and response surface design. We further develop a kernel‐based nonlinear method. In particular, we introduce it in a broader mathematical framework that contemplates also the reduction in parameter space of multivariate objective functions. The implementation is thoroughly discussed and tested on more challenging benchmarks than the ones already present in the literature, for which dimension reduction with active subspaces produces already good results. Finally, we show a whole pipeline for the design of response surfaces with the new methodology in the context of a parametric computational fluid dynamics application solved with the discontinuous Galerkin method.

Keywords: active subspaces, dimension reduction, discontinuous Galerkin, kernel methods, ridge approximation

1. INTRODUCTION

Nowadays, in many industrial settings, the simulation of complex systems requires a huge amount of computational power. Problems involving high‐fidelity simulations are usually large‐scale; moreover, the number of solutions required increases with the number of parameters. In this context, we mention optimization tasks, inverse problems, optimal control problems, and uncertainty quantification: they all suffer from the curse of dimensionality, that is, in this case, the computational time grows exponentially with the dimension of the input parameter space. Data‐driven reduced order methods (ROMs) ¹,²,³ have been developed to deal with such costly outer loop applications for parametric PDEs, but the limitation for high‐dimensional parameter spaces remains.

One approach to alleviate the curse of dimensionality is to identify and exploit some notion of low‐dimensional structure of the model or objective function that maps the inputs to the outputs of interest. A possible linear input coordinate transformation technique is the sliced inverse regression (SIR) ⁴ approach and its extensions. ⁵,⁶,⁷ Sharing some characteristics with SIR, there is the active subspaces (AS) property* ⁸,⁹,¹⁰,¹¹ which, in recent years, has emerged as a powerful linear data‐driven technique to construct ridge approximations using gradients of the model function. AS has been successfully applied to quantify uncertainty in the numerical simulation of the HyShot II scramjet, ¹² and for sensitivity analysis of an integrated hydrologic model. ¹³ Reduction in parameter space has been coupled with model order reduction techniques ¹⁴,¹⁵,¹⁶ to enable more complex numerical studies without increasing the computational load. We mention the use of AS in cardiovascular applications with POD‐Galerkin, ¹⁷ in nonlinear structural analysis, ¹⁸ in nautical and naval engineering, ¹⁹,²⁰,²¹,²² coupled with POD with interpolation for structural and computational fluid dynamics (CFD) analysis, ²³,²⁴ and with dynamic mode decomposition in Reference ²⁵. Applications in automotive engineering within a multi‐fidelity setting can be found in Reference ²⁶; for turbomachinery, see Reference ²⁷, while for results in chemistry, see References 28 and 29. Advances in efficient global design optimization with surrogate modeling are presented in References 30 and 31 and applied to the shape design of the $N + 2$ Supersonic Passenger Jet. Applications to enhance optimization methods have been developed in References 32, 33, 34, 35. AS has also been successfully used to reduce the memory consumption of highly parameterized systems such as artificial neural networks. ³⁶,³⁷

Possible extensions and variants of the active subspaces property are the local active subspace method, ³⁸ the active manifold method ³⁹ which reduces the problem to the analysis of a 1D manifold by traversing the level sets of the model function at the expense of high online costs, the shared active subspace method, ⁴⁰ the active subspaces property for multivariate functions, ¹¹ and more recently an extension of AS to dynamical systems. ⁴¹ Another method is nonlinear level set learning (NLL) ⁴² which exploits RevNets to reduce the input parameter space with a nonlinear transformation.

The search for low‐dimensional structures is also investigated in machine learning with manifold learning algorithms. In this context, the active subspaces methodology can be seen as a supervised dimension reduction technique along with kernel principal component analysis (KPCA) ⁴³ and supervised kernel principal component analysis (SKPCA). ⁴⁴ Other methods in the context of kernel‐based ROMs are presented in References 45, 46, 47. In Reference ⁴⁸, a nonlinear extension of the active subspaces property based on random Fourier features ⁴⁹,⁵⁰ is introduced and compared with machine learning manifold learning algorithms for the construction of Gaussian process regressions (GPR). ⁵¹

From the preliminary work, ⁴⁸ in the context of supervised dimension reduction algorithms in machine learning, we develop the kernel‐based active subspaces (KAS) method. The novelties of our contribution are the following:

Regarding the AS theoretical background, we provide an upper bound of the ridge approximation error (2) for vector‐valued objective functions and for a wide collection of probability distributions (see Assumption 3).
We extend kernel‐based AS to vector‐valued model functions and develop a detailed algorithmic procedure for the optimization of the feature map. We also test different spectral measures (see Equation (20) for the definition), differently from Reference ⁴⁸ where only the Gaussian measure is employed.
The application to several test problems of increasing complexity. In particular, we mainly test KAS on problems where the active subspace is not present or the behavior is not linear, differently from Reference ⁴⁸, where the comparison is made with KPCA and its variants on datasets with linear trends in the reduced parameter space, apart from the hyperparaboloid test case that we have also included among our toy problems.
The KAS method is finally applied to a computational fluid dynamics problem and compared with the standard AS technique. We study the evolution of fluid flow past a NACA 0012 airfoil in a duct composed by an initialization channel and a chamber. The motion is modeled with the unsteady incompressible Navier–Stokes equations, and discretized with the discontinuous Galerkin (DG) method. ⁵² Physical and geometrical parameters are introduced and sensitivity analysis of the lift and drag coefficients with respect to these parameters is provided.

The work is divided as follows: In Section 2, we briefly present the active subspaces property of a model function with a focus on the construction of Gaussian process response surfaces. Then, Section 3 illustrates the novel method called kernel‐based active subspaces for both scalar and vector‐valued model functions. Several tests to compare AS and KAS are provided in Section 4 where we start from scalar functions with radial symmetry, we analyze an epidemiology model and a vector‐valued output generated from a stochastic elliptic PDE. A parametric CFD test case for the study of the flow past a NACA airfoil using the DG method is presented in Section 5. Finally, we outline some perspectives and future studies in Section 6.

2. ACTIVE SUBSPACES FOR PARAMETER SPACE REDUCTION

The active subspaces (AS) approach, proposed in Reference ⁸ and developed in Reference ⁹, is a technique for dimension reduction in parameter space. In brief, active subspaces are defined as the leading eigenspaces of the second moment matrix of the model function's gradient (for scalar model functions) and constitute a global sensitivity index. ¹¹ In the context of ridge approximation, the choice of the active subspace corresponds to the minimizer of an upper bound of the mean square error obtained through Poincaré‐type inequalities. ¹¹ After performing dimension reduction in the parameter space through AS, the method can be applied to reduce the computational costs of different parameter studies such as inverse problems, optimization tasks, and numerical integration. In this work, we focus on the construction of response surfaces with Gaussian process regression.

Definition 1

(Hypothesis on input and output spaces) The quantities related to the input space are:

$m \in ℕ$ the dimension of the input space.

$(Ω, ℱ, P)$ the probability space.

$X : (Ω, ℱ, P) \to ℝ^{m}$ , the absolutely continuous random vector representing the parameters.

$ρ : ℝ^{m} \to ℝ$ , the probability density of $X$ with support $𝒳 \subset ℝ^{m}$ .

The quantities related to the output are:

$d \in ℕ$ the dimension of the output space.

$V = (ℝ^{d}, R_{V})$ the Euclidean space with metric $R_{V} \in ℳ (d \times d)$ ^† and norm
$‖ x ‖_{R_{V}}^{2} = x^{T} R_{V} x .$

$f : 𝒳 \subset ℝ^{m} \to V$ , the quantity/function of interest, also called objective function in optimization tasks.

Let $ℬ (ℝ^{m})$ be the Borel $σ$‐algebra of $ℝ^{m}$. We will consider the Hilbert space $L^{2} (ℝ^{m}, ℬ (ℝ^{m}), ρ; V)$ of the measurable functions $f : (ℝ^{m}, ℬ (ℝ^{m}), ρ) \to (ℝ^{d}, R_{V})$ such that

$‖ f ‖_{L^{2}}^{2} := \int_{𝒳} ‖ f (x) ‖_{R_{V}}^{2} \, d ρ (x) < \infty,$

and the Sobolev space $H^{1} (ℝ^{m}, ℬ (ℝ^{m}), ρ; V)$ of measurable functions $f : (ℝ^{m}, ℬ (ℝ^{m}), ρ) \to (ℝ^{d}, R_{V})$ such that

$‖ f ‖_{H^{1}}^{2} := ‖ f ‖_{L^{2}}^{2} + ‖ \nabla f ‖_{L^{2}}^{2} = ‖ f ‖_{L^{2}}^{2} + | f |_{H^{1}}^{2} < \infty,$ (1)

where $\nabla f$ is the weak derivative of $f$, and $‖ \nabla f ‖_{L^{2}} =: | f |_{H^{1}}$.

We briefly recall how dimension reduction in parameter space is achieved in the construction of response surfaces. The first step involves the approximation of the model function with ridge approximation. We will follow References 11 and 53 for a review of the method.

The ridge approximation problem can be stated in the following way:

Definition 2

(Ridge approximation) Let $ℬ (ℝ^{m})$ be the Borel $σ$‐algebra of $ℝ^{m}$. Given $r \in ℕ$, $r ≪ m$, and a tolerance $ϵ \geq 0$, find the profile $h : (ℝ^{m}, ℬ (ℝ^{m}), ρ) \to V$ and the rank‐$r$ projection $P_{r} : ℝ^{m} \to ℝ^{m}$ such that

$𝔼_{P} [‖ f (X) - h (P_{r} X) ‖_{R_{V}}^{2}] \leq ϵ^{2} .$ (2)

In particular, we are interested in the minimization problem

\underset{P_{r} \in ℳ (m \times m)}{arg min} 𝔼_{P} [‖ f (X) - \tilde{h} (P_{r} X) ‖_{R_{V}}^{2}],

(3)

where $\tilde{h} = 𝔼_{ρ} [f | σ (P_{r})]$ is the conditional expectation of $f$ under the distribution $ρ$ given the $σ$‐algebra $σ (P_{r})$. The range of the projector $P_{r}$, $ℝ^{r} \sim Im (P_{r}) \subset ℝ^{m}$, is the reduced parameter space. The kernel of the projector $P_{r}$, $ℝ^{m - r} \sim \ker (P_{r}) \subset ℝ^{m}$, is the inactive subspace. The existence of $\tilde{h}$ is guaranteed by the Doob–Dynkin lemma. ⁵⁴ The function $\tilde{h}$ is proven to be the optimal profile for each fixed $P_{r}$, as a consequence of the definition of the conditional expectation of a random variable with respect to a $σ$‐algebra.

Dimension reduction is effective if the inequality (2) is satisfied for a specific tolerance. The choice of $r$ is certainly of central importance. The dimension of the reduced parameter space can be chosen a priori for a specific parameter study (e.g., $r$‐dimensional regression), it can be chosen in order to satisfy the inequality (2), or it can be determined to guarantee a good accuracy of the numerical method used to evaluate it. ⁵⁵ (Corollary 3.10)

Dividing the left‐hand side of the inequality (2) by $𝔼_{ρ} [‖ f (X) - 𝔼_{ρ} [f (X)] ‖_{R_{V}}^{2}]$ and taking the square root, we obtain the relative root mean square error (RRMSE). Since it is a normalized quantity, we will use it to make comparisons between different models

RRMSE = \sqrt{\frac{𝔼_{P} [‖ f (X) - h (P_{r} X) ‖_{R_{V}}^{2}]}{𝔼_{P} [‖ f (X) - 𝔼_{P} [f (X)] ‖_{R_{V}}^{2}]}} .

(4)

We remark that $P_{r}$ is not unique: if $\tilde{h}$ is the optimal profile, then $P_{r}$ can be chosen arbitrarily from the set $\{Q_{r} : ℝ^{m} \to ℝ^{m} \mid \ker Q_{r} = \ker P_{r}\}$, see Proposition 2.2 in Reference ¹¹.

The following lemma is the key ingredient in the proof of the existence of an active subspace. It is inherently linked to probability Poincaré inequalities of the kind

\int_{𝒳} ‖ h (x) ‖_{L^{2}}^{2} d ρ (x) \leq C_{P} (𝒳, ρ) \int_{𝒳} ‖ \nabla h (x) ‖_{L^{2}}^{2} d ρ (x),

(5)

for zero‐mean functions in the Sobolev space $h \in H^{1} (𝒳)$, where $C_{P} (𝒳, ρ)$ is the Poincaré constant depending on the domain $𝒳$ and on the probability density function (p.d.f.) $ρ$. We need to make the following assumption to prove the next lemma and the next theorem.

Assumption 3

The probability density function $ρ : 𝒳 \to ℝ$ belongs to one of the following classes:

$𝒳$ is convex and bounded, $\exists δ, D > 0 : 0 < δ \leq ‖ ρ (x) ‖_{L^{\infty}} \leq D < \infty \forall x \in 𝒳$ .

$ρ (x) \sim \exp (- V (x))$ where $V : ℝ^{m} \to (- \infty, \infty], V \in 𝒞^{2}$ is $α$ ‐uniformly convex
$\begin{align} u^{T} Hess (V (x)) u \geq α ‖ u ‖_{2}^{2}, \forall x, u \in ℝ^{m}, \end{align}$ (6)
where $Hess (V (x))$ is the Hessian of $V (x)$ .

$ρ (x) \sim \exp (- V (x))$ where $V$ is a convex function. In this case, we require also $f$ Lipschitz continuous.

In particular, the uniform distribution belongs to the first class, the multivariate Gaussian distribution $𝒩 (m, \Sigma)$ to the second with $α = 1 / σ_{\max} (\Sigma)$, and the exponential and Laplace distributions to the third. A complete analysis of the various cases is done in Reference ⁵³.

Lemma 1

Let $(Ω, ℱ, P)$ be a probability space, $X : (Ω, ℱ, P) \to ℝ^{m}$ an absolutely continuous random vector with probability density function $ρ$ belonging to one of the classes from Assumption 3 . Then the following inequality is satisfied

$\begin{align} 𝔼_{ρ} [{(h - 𝔼_{ρ} [h | σ (P_{r})])}^{2} | σ (P_{r})] \leq C_{P} (P_{r}, ρ) 𝔼_{ρ} [‖ (I - P_{r}^{T}) \nabla h ‖_{2}^{2} | σ (P_{r})] \end{align}$ (7)

for all scalar functions $h \in H^{1} (𝒳)$ and for all $r$ ‐rank orthogonal projectors, $P_{r}$ , where $C_{P} (P_{r}, ρ)$ is the Poincaré constant depending on $P_{r}$ and on the p.d.f. $ρ$ .

A summary of the values of the Poincaré constant in relationship with the choice of the probability density function $ρ$ is reported in Reference ⁵³.

In the next theorem, the projection $P_{r}$ will depend on the output function $f$ , so also the Poincaré constant $C_{P} (P_{r}, ρ)$ will depend in fact on $f$ .

We introduce the following notation for the matrix that substitutes the uncentered covariance matrix of the gradient $\nabla f$ in the case of the application of AS to scalar model functions ⁵⁵

H = \int_{𝒳} {(D_{x} f (x))}^{T} R_{V} (ρ) (D_{x} f (x)) d ρ (x),

where $D_{x} f (x) \in ℳ (d \times m)$ is the Jacobian matrix of $f$ . The matrix $R_{V} (ρ)$ depends on the class which $ρ$ belongs to, see the Appendix.

Theorem 1

(Existence of an active subspace) Under Hypothesis 1, let $f \in H^{1} (ℝ^{m}, ℬ (ℝ^{m}), ρ; V)$ and let the p.d.f. $ρ$ satisfy Lemma 1 and Assumption 3. Then the solution ${\tilde{P}}_{r}$ of the ridge approximation problem 2 is the orthogonal projector onto the eigenspace associated with the $r$ largest eigenvalues of $H$

$H v_{i} = λ_{i} v_{i} \quad \forall i \in \{1, \dots, m\}, \qquad {\tilde{P}}_{r} = \sum_{j = 1}^{r} v_{j} \otimes v_{j},$

with $r \in ℕ$ chosen such that

$𝔼_{ρ} [‖ f - \tilde{h} ‖_{R_{V}}^{2}] \leq C (C_{P}, τ) {(\sum_{i = r + 1}^{m} λ_{i})}^{\frac{1}{1 + τ}} \leq ϵ^{2},$ (8)

with $C (C_{P}, τ)$ a constant depending on $τ > 0$ related to the choice of $ρ$ and on the Poincaré constant from lemma 1 , and $\tilde{h} = 𝔼_{ρ} [f | σ (P_{r})]$ is the conditional expectation of $f$ given the $σ$ ‐algebra generated by the random variable $P_{r} \circ X$ .

This theorem summarizes the results from Propositions 2.5 and 2.6 of Reference ¹¹, and from Lemmas 3.1, 4.2–4.4, and Theorem 4.5 of Reference ⁵³. The proof is expanded in the Appendix.

The eigenspace $\operatorname{span} \{v_{1}, \dots, v_{r}\} \subset ℝ^{m}$ is the active subspace and the remaining eigenvectors generate the inactive subspace $\operatorname{span} \{v_{r + 1}, \dots, v_{m}\} \subset ℝ^{m}$. The condition $f \in L^{2} (ℝ^{m}, ℬ (ℝ^{m}), ρ; V)$ is necessary for $f$ to satisfy the error bound (2).

For the explicit procedure to compute the active subspace given its dimension $r$ , see Algorithm 1: from $W_{1}$ and $W_{2}$ we define the approximations of the projector $P_{r}$ with ${\hat{P}}_{r} = W_{1} W_{1}^{T}$ .

Algorithm 1. Active subspace computation.
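The computation of the active subspace from gradient samples can be sketched as follows. This is a minimal illustration, assuming a scalar model function (so $R_{V} = 1$) and a plain Monte Carlo estimate of $H$; the function name `active_subspace` and the toy ridge function are hypothetical and are not taken from the original algorithm listing.

```python
import numpy as np

def active_subspace(gradients, r):
    """Estimate an r-dimensional active subspace from gradient samples.

    gradients: array of shape (M, m); row i is the gradient of f at x_i.
    Returns the eigenvalue estimates (m,), W1 (m, r) and W2 (m, m - r).
    """
    M, m = gradients.shape
    # The right singular vectors of the scaled gradient matrix are the
    # eigenvectors of the Monte Carlo estimate of H = E[grad f grad f^T].
    _, s, vt = np.linalg.svd(gradients / np.sqrt(M), full_matrices=True)
    eigvals = np.zeros(m)
    eigvals[: s.size] = s**2
    W = vt.T
    return eigvals, W[:, :r], W[:, r:]

# Toy ridge function f(x) = sin(a . x), whose gradient is cos(a . x) a.
rng = np.random.default_rng(0)
m = 5
a = rng.normal(size=m)
a /= np.linalg.norm(a)
X = rng.uniform(-1.0, 1.0, size=(200, m))
grads = np.cos(X @ a)[:, None] * a[None, :]

eigvals, W1, W2 = active_subspace(grads, r=1)
P_hat = W1 @ W1.T  # approximation of the projector P_r
```

For this toy function all gradients are parallel to `a`, so the single active direction returned in `W1` coincides with `a` up to sign and the remaining eigenvalue estimates vanish up to round-off.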


2.1. Response surfaces

The term response surface refers to the general procedure of finding the values of a model function $f$ for new inputs without directly computing it but exploiting regression or interpolation from a training set ${x_{i}, f (x_{i})}$ . The procedure for constructing a Gaussian process response is reported in Algorithm 2, while in Algorithm 3, we show how to exploit it to predict the model function at new input parameters.

Algorithm 2. Response surface construction with Gaussian process regression over the active subspace.


Algorithm 3. Prediction phase using the Gaussian process response surface over the active subspace.
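The training and prediction phases can be illustrated with a small NumPy sketch. It is an assumption-laden simplification: the squared exponential kernel hyperparameters are fixed by hand instead of being tuned by maximizing the log-likelihood, only the posterior mean is returned, and the helper names `fit_response_surface` and `predict_response_surface` are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, length_scale, variance):
    """Squared exponential kernel between the rows of A (n, r) and B (k, r)."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def fit_response_surface(X, f_vals, W1, length_scale=0.5, variance=1.0, noise=1e-5):
    """Train a GP (fixed hyperparameters) on the reduced inputs W1^T x."""
    Y = X @ W1                                    # reduced training inputs (M, r)
    K = rbf_kernel(Y, Y, length_scale, variance) + noise * np.eye(len(Y))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, f_vals))
    return {"Y": Y, "alpha": alpha, "W1": W1, "ls": length_scale, "var": variance}

def predict_response_surface(model, X_new):
    """Posterior mean of the GP at new full-dimensional inputs."""
    Y_new = X_new @ model["W1"]                   # project before predicting
    K_star = rbf_kernel(Y_new, model["Y"], model["ls"], model["var"])
    return K_star @ model["alpha"]

# Toy ridge function with a known one-dimensional active subspace.
rng = np.random.default_rng(0)
m = 5
a = rng.normal(size=m)
a /= np.linalg.norm(a)
W1 = a[:, None]
f = lambda X: np.sin(X @ a)

X_train = rng.uniform(-1.0, 1.0, size=(200, m))
X_test = rng.uniform(-1.0, 1.0, size=(100, m))
model = fit_response_surface(X_train, f(X_train), W1)
pred = predict_response_surface(model, X_test)
mean_abs_err = np.mean(np.abs(pred - f(X_test)))
```

Because the toy function depends on the inputs only through $a \cdot x$, a one-dimensional regression on the projected inputs is enough to reproduce it on the test set.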


Directly applying the simple Monte Carlo method with $N$ samples, we get a reduced approximation of $f$ as

({\tilde{h}}_{ϵ} \circ P_{r}) (X) = 𝔼_{ρ} [f | σ (P_{r})] \approx \frac{1}{N} \sum_{i = 1}^{N} f ({\hat{P}}_{r} X + (I_{m} - {\hat{P}}_{r}) Y_{i}) = : ĥ_{ϵ, N} ({\hat{P}}_{r} X),

(9)

where we have made explicit the dependence of the optimal profile ${\tilde{h}}_{ϵ}$ on $ϵ$ , $Y_{1}, \dots, Y_{N}$ are independent and identically distributed samples of $Y \sim ρ$ , and ${\hat{P}}_{r}$ is an approximation of $P_{r}$ obtained with the simple Monte Carlo method from $H$ , see Algorithm 1. An intermediate approximation error is obtained employing the Poincaré inequality and the central limit theorem for the Monte Carlo approximation
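The Monte Carlo estimate of the optimal profile can be sketched as below. This is only an illustration under stated assumptions: the inputs are standard Gaussian (so that active and inactive coordinates are independent and the average over resampled inactive components is a fair estimate of the conditional expectation), and the names `mc_profile` and `sample_rho` are hypothetical.

```python
import numpy as np

def mc_profile(f, x, W1, sample_rho, N, rng):
    """Monte Carlo estimate of the profile at P_r x.

    Averages f(P x + (I - P) Y_i) over N i.i.d. samples Y_i drawn from rho,
    where P = W1 W1^T is the (approximate) projector on the active subspace.
    """
    P = W1 @ W1.T
    I = np.eye(P.shape[0])
    Y = sample_rho(N, rng)                   # (N, m) samples from rho
    points = x @ P.T + Y @ (I - P).T         # row i is P x + (I - P) Y_i
    return np.array([f(p) for p in points]).mean()

rng = np.random.default_rng(0)
m = 4
a = np.array([1.0, 0.0, 0.0, 0.0])           # active direction
b = np.array([0.0, 1.0, 0.0, 0.0])           # an inactive direction that still matters
f = lambda p: (p @ a) ** 2 + (p @ b) ** 2
W1 = a[:, None]
x = np.array([0.7, -1.2, 0.3, 2.0])
sample_rho = lambda N, rng: rng.normal(size=(N, m))

estimate = mc_profile(f, x, W1, sample_rho, N=20000, rng=rng)
expected = 0.7**2 + 1.0                       # t^2 + E[(b . Y)^2] with t = a . x
```

The estimate depends on `x` only through its active coordinate $a \cdot x = 0.7$; the inactive coordinates of `x` are replaced by fresh samples and averaged out.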

$𝔼_{P} [{(f (X) - ĥ_{ϵ, N} ({\hat{P}}_{r} X))}^{2}] \leq C_{1} {(1 + N^{- 1 / 2})}^{2} (λ_{r + 1} + \dots + λ_{m}),$ (10)

where $C_{1}$ is a constant, and $λ_{r + 1}, \dots, λ_{m}$ are the eigenvalues of $H$ associated with the inactive subspace. ⁵⁵ (Theorem 4.4)

In practice, $ĥ_{ϵ, N} ({\hat{P}}_{r} X)$ is approximated with a regression or an interpolation such that a response surface $R$ satisfying $𝔼_{ρ} [{(ĥ_{ϵ, N} ({\hat{P}}_{r} X) - R ({\hat{P}}_{r} X))}^{2}] \leq C_{2} δ$ is built, where $C_{2}$ is a constant, and $δ$ depends on the chosen method. An estimate for the successive approximations

$f (X) \approx {\tilde{h}}_{ϵ} (P_{r} X) \approx ĥ_{ϵ, N} ({\hat{P}}_{r} X) \approx R_{ϵ, N, δ} ({\hat{P}}_{r} X),$ (11)

is given by

$𝔼_{P} [{(f (X) - R ({\hat{P}}_{r} X))}^{2}] \leq C_{1} {(1 + N^{- 1 / 2})}^{2} {(τ {(λ_{1} + \dots + λ_{r})}^{1 / 2} + {(λ_{r + 1} + \dots + λ_{m})}^{1 / 2})}^{2} + C_{2} δ,$

where $dist (Im (P_{r}), Im ({\hat{P}}_{r})) \leq τ$, and $λ_{i}$ are the eigenvalues of $H$. ⁵⁵ (Theorem 4.8)

In our numerical simulations, we will build the response surface $R$ with Gaussian process regression (GPR). ⁵¹

3. KERNEL‐BASED ACTIVE SUBSPACES EXTENSION

Keeping the notation of Section 2, $X : (Ω, ℱ, P) \to ℝ^{m}$ is the absolutely continuous random vector representing the $m$‐dimensional inputs with density $ρ : 𝒳 \subset ℝ^{m} \to ℝ$, and $f : 𝒳 \subset ℝ^{m} \to (V, R_{V})$ is the model function that we assume to be continuously differentiable and Lipschitz continuous.

One drawback of sufficient dimension reduction with AS applied to ridge approximation is that, if a clear linear trend is missing, projecting the inputs as $P_{r} X$ causes a loss of accuracy in the approximation of the model $f$ that may not be compensated even by the choice of the optimal profile $\tilde{h} \circ P_{r} = 𝔼_{ρ} [f | σ (P_{r})]$. To overcome this, nonlinear dimension reduction to a one‐dimensional parameter space can be achieved by discovering a curve in the space of parameters that cuts the level sets of $f$ transversely; this variation is presented in Reference ³⁹ as the active manifold. Another approach consists in finding a diffeomorphism $ϕ$ that reshapes the level sets, so that applying AS dimension reduction to the new model function $\tilde{f}$, defined by $\tilde{f} \circ ϕ = f$, is more profitable.

Unfortunately, constructing the active manifold or finding the right diffeomorphism $ϕ$ can be complicated. If we give up the backward map and weaken the bond of the method with the model, we can consider an immersion $ϕ$ from the space of parameters $𝒳$ to an infinite‐dimensional Hilbert space $ℍ$, obtaining the factorization $f = \tilde{f} \circ ϕ$ with $\tilde{f} : ϕ (𝒳) \subset ℍ \to V$.

This is a common procedure in machine learning in order to increase the number of features. ⁵¹ Then AS is applied to the new model function $\tilde{f} : ϕ (𝒳) \subset ℍ \to V$ with parameter space $ϕ (𝒳) \subset ℍ$ . A response surface can be built with Algorithm 2 remembering to replace every occurrence of the inputs $x$ with their images $ϕ (x)$ . A synthetic scheme of the procedure is represented in Figure 1.

Figure 1. Illustration of the construction of a one‐dimensional response surface with kernel‐based active subspaces and Gaussian process regression.

In practice, we consider a discretization of the infinite‐dimensional Hilbert space $ℝ^{D} ≃ ℍ$ with $D > m$. Dimension reduction with AS results in the choice of an $r$‐rank projection in the much broader set of $r$‐rank projections in $ℍ$.

Since for AS only the samples of the Jacobian matrix of the model function are employed, we can ignore the definition of the new map $\tilde{f} : ϕ (𝒳) \subset ℍ \to (V, R_{V})$ and focus only on the computation of the Jacobian matrix of $\tilde{f}$ with respect to the new input variable $z : = ϕ (x)$ . The uncentered covariance matrix becomes

\begin{align} H & = \int_{ϕ (𝒳)} [{(D_{z} \tilde{f})}^{T} (z)] R_{V} [(D_{z} \tilde{f}) (z)] \, d μ (z) \\ & = \int_{𝒳} [{(D_{z} \tilde{f})}^{T} (ϕ (x))] R_{V} [(D_{z} \tilde{f}) (ϕ (x))] \, d ℒ_{X} (x), \end{align}

where $μ : = ϕ_{#} (ℒ_{X})$ is the pushforward probability measure of $ℒ_{X}$ (the law of probability of $X$ ) with respect to the map $ϕ$ . Simple Monte Carlo can be applied sampling from the distribution $ρ$ in the input space $𝒳$

\begin{align} H & = \int_{𝒳} [{(D_{z} \tilde{f})}^{T} (ϕ (x))] R_{V} [(D_{z} \tilde{f}) (ϕ (x))] \, d ℒ_{X} (x) \\ & \approx \frac{1}{M} \sum_{i = 1}^{M} [{(D_{z} \tilde{f})}^{T} (ϕ (x_{i}))] R_{V} [(D_{z} \tilde{f}) (ϕ (x_{i}))] . \end{align}

The gradients of $\tilde{f}$ with respect to the new input variable $z$ are computed from the known values $D_{x} f$ with the chain rule.

The chain rule can be applied to the composition of functions $\tilde{f} \circ ϕ : ℝ^{m} \to ℍ \to V$ if $\tilde{f}$ is defined in an open set $U \supset ϕ (𝒳)$. If $ϕ$ is nonsingular and also injective, the new input space is an $m$‐dimensional submanifold of $ℍ$. If $ϕ$ is also smooth, there exists a smooth extension of $\tilde{f} : ϕ (𝒳) \subset ℍ \to V$ onto the whole domain $ℍ$, see Proposition 1.36 from Reference ⁵⁶.

If the Hilbert space $ℍ$ has finite dimension $ℍ \sim ℝ^{D}$ this procedure leaves us with an underdetermined linear system to solve for $D_{z} \tilde{f}$

\begin{align} D_{z} \tilde{f} (ϕ (x)) D ϕ (x) = D_{x} f (x), \\ D_{z} \tilde{f} (ϕ (x)) = D_{x} f (x) {(D ϕ (x))}^{†}, \end{align}

(12)

where $^{†}$ stands for the right Moore–Penrose inverse of the matrix $D ϕ (x)$ with rank $r$, that is

{(D ϕ (x))}^{†} = V \Sigma^{†} U^{T},

with the usual notation for the singular value decomposition (SVD) of $D ϕ (x)$

D ϕ (x) = U \Sigma V^{T},

(13)

and $\Sigma^{†} \in ℳ (r \times r)$ equal to the diagonal matrix with the inverse of the singular values as diagonal elements. As anticipated, if $f$ is smooth enough and $ϕ$ is an embedding, so that $D ϕ$ has full rank, the previous system has a unique solution. The most crucial part is the evaluation of the gradients $D_{x} f (x)$ from the input–output pairs when they are not available analytically or from the adjoint method applied to PDE models. Different approaches are present in the literature, such as local polynomial regression and Gaussian process regression on the whole domain to approximate the gradients; both are available in the ATHENA package. ⁵⁷ For an estimate of the ridge approximation error due to inaccurate gradients, see Reference ⁹.
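The pseudo-inverse step of Equation (12) can be checked numerically with a short sketch. The Jacobians below are filled with random numbers purely for illustration (they are hypothetical values, not derived from an actual model or feature map), and a full column rank $D ϕ (x)$ is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, D = 2, 3, 10                      # output, input, and feature dimensions

Dphi = rng.normal(size=(D, m))          # stands in for the Jacobian of phi at x
Dxf = rng.normal(size=(d, m))           # stands in for the Jacobian of f at x

# Minimum-norm solution of  Dzf @ Dphi = Dxf  via the Moore-Penrose inverse.
Dzf = Dxf @ np.linalg.pinv(Dphi)        # shape (d, D)

# The same pseudo-inverse assembled explicitly from the thin SVD of Dphi.
U, s, Vt = np.linalg.svd(Dphi, full_matrices=False)
pinv_svd = Vt.T @ np.diag(1.0 / s) @ U.T
```

With `Dphi` of full column rank, `np.linalg.pinv(Dphi) @ Dphi` is the $m \times m$ identity, so the recovered `Dzf` reproduces `Dxf` exactly when multiplied by `Dphi`.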

Finally, we remark that in the AS method we approximate the random variable $X$ as

P_{r} X = v_{1} (v_{1} \cdot X) + \dots + v_{r} (v_{r} \cdot X),

(14)

with $\{v_{i}\} \subset ℝ^{m}$ the active eigenvectors, whereas with KAS the reduced input space is contained in $ℍ$

P_{r} X = v_{1} (v_{1} \cdot ϕ (X)) + \dots + v_{r} (v_{r} \cdot ϕ (X)),

(15)

with $\{v_{i}\} \subset ℍ$ the active eigenvectors of KAS. In this case, the model is enriched by the nonlinear feature map $ϕ$.

3.1. Choice of the feature map

The choice for the map $ϕ$ is linked to the theory of reproducing kernel Hilbert spaces (RKHS), ⁵⁸ and it is defined as

\begin{align} z = ϕ (x) & = \sqrt{\frac{2}{D}} σ_{f} \cos (W x + b), \end{align}

(16)

\begin{align} \cos (W x + b) & : = {(\cos (W [1, :] \cdot x + b_{1}), \dots, \cos (W [D, :] \cdot x + b_{D}))}^{T}, \end{align}

(17)

where $σ_{f}$ is a hyperparameter corresponding to the empirical variance of the model, $W \in ℳ (D \times m)$ is the projection matrix whose rows are sampled from a probability distribution $μ$ on $ℝ^{m}$, and $b \in ℝ^{D}$ is a bias term whose components are sampled independently and uniformly in the interval $[0, 2 π]$. We remark that its Jacobian can be computed analytically as

\frac{\partial z^{j}}{\partial x^{i}} = - \sqrt{\frac{2}{D}} σ_{f} \sin \left( \sum_{k = 1}^{m} W_{j k} x_{k} + b_{j} \right) W_{j i},

(18)

for all $i \in \{1, \dots, m\}$, and for all $j \in \{1, \dots, D\}$.
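A minimal sketch of the feature map and of its analytic Jacobian follows, with a finite-difference check. It assumes that the cosine in Equation (16) is applied componentwise to $W x + b$ with the single normalization $\sqrt{2 / D}\, σ_{f}$, and that the Jacobian is stored with entry `[j, i]` equal to $\partial z^{j} / \partial x^{i}$; the function names are hypothetical.

```python
import numpy as np

def feature_map(x, W, b, sigma_f=1.0):
    """Random Fourier feature map z = sqrt(2/D) * sigma_f * cos(W x + b)."""
    D = W.shape[0]
    return np.sqrt(2.0 / D) * sigma_f * np.cos(W @ x + b)

def feature_map_jacobian(x, W, b, sigma_f=1.0):
    """Analytic Jacobian of the feature map, shape (D, m)."""
    D = W.shape[0]
    # Row j is -sqrt(2/D) * sigma_f * sin(W[j] . x + b[j]) * W[j].
    return -np.sqrt(2.0 / D) * sigma_f * np.sin(W @ x + b)[:, None] * W

rng = np.random.default_rng(0)
m, D = 3, 8
W = rng.normal(size=(D, m))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)
x = rng.normal(size=m)

J = feature_map_jacobian(x, W, b)

# Central finite differences, one input coordinate at a time.
eps = 1e-6
J_fd = np.column_stack([
    (feature_map(x + eps * e, W, b) - feature_map(x - eps * e, W, b)) / (2.0 * eps)
    for e in np.eye(m)
])
```

The agreement between `J` and `J_fd` is a cheap sanity check worth running before the Jacobian is fed to the pseudo-inverse step.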

We remark that, in order to guarantee the correctness of the procedure for evaluating the gradients, we have to prove that the feature map is injective and nonsingular. In general, however, the feature map (16) cannot be injective due to the periodicity of the cosine, but at least it is almost surely nonsingular if the dimension of the feature space is high enough.

The feature map (16) is not the only effective immersion that provides a kernel‐based extension of the active subspaces. For example an alternative is the following composition of a linear map with a sigmoid

ϕ (x) = \frac{C}{1 + α e^{- W x}},

where $C$ is a constant, $α$ is a hyperparameter to be tuned, and $W \in ℳ (D \times m)$ is, as before, a matrix whose rows are sampled from a probability distribution on $ℝ^{m}$.

Other choices involve the use of deep neural networks to learn the profile $h$ and the projection function $P_{r}$ of the ridge approximation problem. ⁵⁹

The tuning of the hyperparameters of the spectral measure consists in a global optimization problem where the dimension of the domain can vary between 1 and the dimension of the input space $m$. The objective function to optimize is the relative root mean square error (RRMSE)

RRMSE (Y_{test}, T_{test}) = \sqrt{\frac{\sum_{i = 1}^{N} {(t_{i} - y_{i})}^{2}}{\sum_{i = 1}^{N} {(y_{i} - \overline{y})}^{2}}},

(19)

where $T_{test} = {(t_{i})}_{i \in {1, \dots, N}}$ are the predictions obtained from the response surface built with KAS and associated to the test set, $Y_{test} = {(y_{i})}_{i \in {1, \dots, N}}$ are the targets associated to the test set, and $\overline{y}$ is the mean value of the targets. We implemented a logarithmic grid‐search, see Algorithm 5, making use of the SciPy library. ⁶⁰ Another choice could be Bayesian stochastic optimization implemented in the open‐source library GPyOpt. ⁶¹
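As an illustration, the RRMSE score can be computed as in the sketch below. The squared prediction errors are normalized by the spread of the targets about their mean, in analogy with Equation (4); the helper name `rrmse` is hypothetical.

```python
import numpy as np

def rrmse(predictions, targets):
    """Relative root mean square error of predictions against targets."""
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    num = np.sum((predictions - targets) ** 2)
    den = np.sum((targets - targets.mean(axis=0)) ** 2)
    return np.sqrt(num / den)

targets = np.array([1.0, 2.0, 3.0, 4.0])
perfect = rrmse(targets, targets)                       # exact predictions
baseline = rrmse(np.full(4, targets.mean()), targets)   # always predict the mean
```

A score of 0 corresponds to exact predictions, while a score of 1 is what the constant predictor equal to the mean of the targets achieves, which gives a natural reference when comparing AS and KAS response surfaces.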

Algorithm 5. Tuning the feature map with logarithmic grid‐search.


The tuning of the hyperparameters of the chosen spectral measure is the most computationally expensive part of the procedure. We report the computational complexity of the algorithms introduced, to better understand the additional cost implied by the implementation of response surface design with KAS. Let us assume that the number of random Fourier features $D$, the number of input, output, and gradient samples $M$, and the dimension of the parameter space $m$ are ordered as $D > M > m$, as is usually the case, and that the quantity of interest $f$ is a scalar function. The cost of computing an active subspace is $O (M m^{2})$, that is, the cost of the SVD of the gradients matrix $d Y$ used to get the active and inactive eigenvectors in Algorithm 1. The cost of training a response surface with Gaussian process regression in Algorithm 2 depends on the cost of the minimization of the log‐likelihood: each evaluation of the log‐likelihood involves the computation of the determinant and the inverse of the regularized Gram matrix $K (θ) + σ I_{M}$, that is $O (M^{3})$. Finally, the cost of evaluating the kernel‐based active subspace is associated with the SVD of $d \tilde{Y}$, that is $O (D M^{2})$ in Algorithm 4, and with the solution of the underdetermined linear system to obtain the gradients $d \tilde{Y}$, that is $M$ times $O (D m^{2})$, since it is related to the evaluation of the pseudo‐inverse of $D ϕ$.
So, the computational complexity for the response surface design with AS and GPR is $O (n_{GPR} M^{3})$, while for the response surface design with KAS and GPR it is $O (n_{grid‐search} n (D \frac{M^{2}}{n^{2}} + \frac{M}{n} D m^{2} + n_{GPR} \frac{M^{3}}{n^{3}}))$, where $n_{GPR}$ is the maximum number of steps of the optimization algorithm used to minimize the log‐likelihood, $n_{grid‐search}$ is the number of hyperparameter instances $γ \in G$ to try in Algorithm 5, and $n$ is the number of batches in the $n$‐fold cross validation procedure. In particular, for each grid search hyperparameter the main cost is associated with the GPR training, since $n_{GPR}$ usually satisfies $D n < n_{GPR} M$ when the optimizer chosen is L‐BFGS‐B from SciPy, ⁶⁰ accounting also for the number of restarts of the optimizer: in the numerical tests we performed, the number of restarts of the GPR training is problem‐dependent but always less than 10. In general, the number $n_{grid‐search}$ depends on the chosen application, and the multiplicative factor between the computational complexity of the response surface design procedure with KAS and with AS is lower than $3 n_{grid‐search} n$.

Algorithm 4. Kernel‐based active subspace computation.
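The steps described in this section (feature map, chain rule with the pseudo-inverse, SVD of the transformed gradients) can be combined in a short sketch for a scalar model function. This is an assumed reading of the text rather than a transcription of the algorithm listing: the function name `kernel_active_subspace`, the Gaussian sampling of `W`, and the toy radial function are all hypothetical choices.

```python
import numpy as np

def kernel_active_subspace(X, grads, W, b, r, sigma_f=1.0):
    """Sketch of a kernel-based active subspace for scalar f.

    X: inputs (M, m); grads: gradients of f at the inputs (M, m);
    W: projection matrix (D, m); b: bias (D,); r: reduced dimension.
    Returns eigenvalue estimates, W1 (D, r), and the reduced inputs (M, r).
    """
    M, m = X.shape
    D = W.shape[0]
    scale = np.sqrt(2.0 / D) * sigma_f
    grads_z = np.empty((M, D))
    for i in range(M):
        # Jacobian of the feature map at x_i, shape (D, m).
        Dphi = -scale * np.sin(W @ X[i] + b)[:, None] * W
        # Chain rule: gradient with respect to z through the pseudo-inverse.
        grads_z[i] = grads[i] @ np.linalg.pinv(Dphi)
    # Active directions in feature space from the SVD of the new gradients.
    _, s, vt = np.linalg.svd(grads_z / np.sqrt(M), full_matrices=False)
    W1 = vt[:r].T
    Z = scale * np.cos(X @ W.T + b)          # feature map applied to all inputs
    return s**2, W1, Z @ W1

# Toy radial function f(x) = |x|^2 / 2, whose gradient is x itself.
rng = np.random.default_rng(0)
M, m, D, r = 100, 2, 50, 1
X = rng.uniform(-1.0, 1.0, size=(M, m))
grads = X.copy()
W = rng.normal(size=(D, m))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

eigvals, W1, reduced = kernel_active_subspace(X, grads, W, b, r)
```

The reduced inputs `reduced` play the role that $W_{1}^{T} x$ plays in the linear AS case and can be passed to the same response surface construction.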


3.2. Random Fourier features

The motivation behind the choice for this map from Equation (16) comes from the theory on RKHS. The infinite‐dimensional Hilbert space $(ℍ, ⟨ \cdot, \cdot ⟩)$ is assumed to be a RKHS with real shift‐invariant kernel $k : 𝒳 \times 𝒳 \to ℝ$ with $k (0) = 1$ and feature map $ϕ$ .

In order to get a discrete approximation of $ϕ : 𝒳 \subset ℝ^{m} \to ℍ$, random Fourier features are employed. ⁴⁹,⁵⁰ Bochner's theorem ⁶² guarantees the existence of a spectral probability measure $μ$ such that

k (x, y) = \int_{ℝ^{m}} e^{i ω \cdot (x - y)} d μ (ω) .

(20)

From this identity, we can get a discrete approximation of the scalar product $⟨ \cdot, \cdot ⟩$ with Monte Carlo method, exploiting the fact that the kernel is real

\begin{align} ⟨ ϕ (x), ϕ (y) ⟩ & = k (x, y) \approx \frac{1}{D} \sum_{i = 1}^{D} \cos (ω_{i} \cdot x + b_{i}) \cos (ω_{i} \cdot y + b_{i}) = z^{T} z, \end{align}

(21)

\begin{align} z & = \frac{1}{\sqrt{D}} (\cos (ω_{1} \cdot x + b_{1}), \dots, \cos (ω_{D} \cdot x + b_{D})), \end{align}

(22)

and from this relation we obtain the approximation $ϕ \approx z$ . The sampled vectors $\{ω_{i}\}_{i = 1, \dots, D}$ are called random Fourier features. The scalars $\{b_{i}\}_{i = 1, \dots, D}$ are bias terms, introduced because the approximation drops some trigonometric terms from the following initial expression, which is the real part of the Monte Carlo estimate of Equation (20)

\frac{1}{D} \sum_{i = 1}^{D} (\cos (ω_{i} \cdot x) \cos (ω_{i} \cdot y) + \sin (ω_{i} \cdot x) \sin (ω_{i} \cdot y)) .

Random Fourier features are frequently used to approximate kernels. We consider only spectral probability measures which have a probability density, usually named spectral density. In the approximation of the kernel with random Fourier features, under some regularity conditions on the kernel, an explicit probabilistic bound depending on the dimension of the feature space $D$ can be proved. ⁶² This technique is used to scale up kernel principal component analysis ⁶³ ^, ⁶⁴ and supervised kernel principal component analysis, ⁴⁴ but in the case of kernel‐based AS the resulting overdetermined linear system employed to compute the Jacobian matrix of the new model function increases in dimension instead.

The most famous kernel is the squared exponential kernel also called Radial Basis Function kernel (RBF)

k_{RBF} (x, y) = \exp (- \frac{‖ x - y ‖^{2}}{2 l^{2}}),

(23)

where $l$ is the characteristic length‐scale. The spectral density is Gaussian $𝒩 (0, 1 / (4 π^{2} l^{2}))$ :

S (ω) = {(2 π l^{2})}^{m / 2} \exp (- 2 π^{2} l^{2} ‖ ω ‖^{2}) .

(24)
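As an illustration of Equations (21)–(23), the following minimal sketch compares the RBF kernel with its random Fourier feature approximation. It rests on a few assumptions that are not taken from the text: the frequencies are drawn from $𝒩 (0, I / l^{2})$ , which is the spectral measure of the RBF kernel under the convention of Equation (20); the biases are uniform on $[0, 2 π]$ ; and the features are scaled by $\sqrt{2 / D}$ , the common choice that makes the inner product an unbiased estimate of the kernel and that differs from the $1 / \sqrt{D}$ of Equation (22) only by a constant factor. The function names are hypothetical.

```python
import numpy as np

def rbf_kernel(x, y, length_scale):
    """Squared exponential kernel of Equation (23)."""
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * length_scale ** 2))

def sample_rff(m, D, length_scale, rng):
    """Sample D frequencies and biases for an RBF kernel on R^m.

    Assumption: with k(x, y) = E[exp(i w.(x - y))] as in Equation (20),
    the frequencies follow N(0, I / l^2).
    """
    W = rng.normal(0.0, 1.0 / length_scale, size=(D, m))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    return W, b

def feature_map(x, W, b):
    """z(x) = sqrt(2 / D) cos(W x + b), so that z(x).z(y) estimates k(x, y)."""
    D = W.shape[0]
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

rng = np.random.default_rng(0)
m, D, length_scale = 8, 5000, 1.5
W, b = sample_rff(m, D, length_scale, rng)
x, y = rng.uniform(-1, 1, m), rng.uniform(-1, 1, m)

k_exact = rbf_kernel(x, y, length_scale)
k_approx = feature_map(x, W, b) @ feature_map(y, W, b)
print(f"exact: {k_exact:.4f}, RFF approximation: {k_approx:.4f}")
```

The Monte Carlo error decays like $1 / \sqrt{D}$ , which is consistent with the suggestion made later in the text to use on the order of a thousand features.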

By Bochner's theorem, every probability distribution that admits a probability density function corresponds to a stationary positive definite kernel. So, having in mind the definition of the feature map $ϕ$ from Equation (16), we can choose any probability distribution for sampling the random projection matrix $W \in ℳ (D \times m)$ without focusing on the corresponding kernel, since it is not needed by the numerical procedure.

After the choice of the spectral measure the corresponding hyperparameters have to be tuned. This is linked to the choice of the hypothesis models in machine learning and it is usually carried out for the hyperparameters of the employed kernel. From the choice of the kernel and the corresponding hyperparameters, some regularity properties of the model are implicitly assumed. ⁵¹

4. BENCHMARK TEST PROBLEMS

In this section, we are going to present some benchmarks to prove the potential gain of KAS over standard linear AS, for both scalar and vectorial model functions. In particular, we test KAS on radial symmetric functions, with 2‐dimensional and 8‐dimensional parameter spaces, on the approximation of the reproduction number $R_{0}$ of the SEIR model, and finally on a vectorial output function that is the solution of a Poisson problem.

One‐dimensional response surfaces are built following the algorithm described in Section 2.1. The tuning of the hyperparameters of the feature map is carried out with a logarithmic grid‐search and 5‐fold cross validation for the Ebola test case, while for the other cases we employed Bayesian stochastic optimization implemented in Reference ⁶⁵ with 3‐fold cross validation. The score function chosen is the relative root mean square error (RRMSE). The spectral measure for each test case is chosen by brute force among the Laplace, Gaussian, Beta, and multivariate Gaussian distributions. The number of Fourier features is not established based on a criterion, but we have seen experimentally that above a certain threshold the number of features is high enough to at least reproduce the accuracy of the AS method. Since the final accuracy of the response surface is most sensitive to the tuning of the hyperparameters of the spectral measures, we suggest choosing an affordable number of features between 1000 and 2000 and focusing on the tuning of said hyperparameters instead. We remark that the number of samples employed is problem dependent: some heuristics to determine it can be found in Reference ⁹, but the crucial point is that the novel KAS method does not need training samples beyond the ones used for the AS method. Moreover, the CPU time for the hyperparameter tuning procedure is usually negligible with respect to the time required to obtain input‐output pairs from the numerical simulation of PDE models: in our applications the tuning takes in the order of minutes (usually around 10–15 min for most test cases), while the simulations take in the order of days for the CFD application of Section 5 and in the order of hours for the stochastic elliptic partial differential equation of Section 4.3. We also remark that the tuning Algorithm 5, the GPR training restarts, and the choice of the spectral measure can be easily parallelized.
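The tuning loop described above can be sketched as a logarithmic grid search scored by cross-validated RRMSE. This is only an illustration under stated assumptions, not Algorithm 5: the regressor `fit_predict` is a hypothetical stand-in (ridge regression on random Fourier features with a Gaussian spectral measure $𝒩 (0, γ I)$ ) for the KAS plus GPR pipeline, and the RRMSE is assumed to be the root of the squared error normalized by the squared deviation of the targets from their mean, since Equation (19) lies outside this section.

```python
import numpy as np

def rrmse(y_true, y_pred):
    """Relative root mean square error (assumed form, see the lead-in)."""
    return np.sqrt(np.sum((y_true - y_pred) ** 2)
                   / np.sum((y_true - np.mean(y_true)) ** 2))

def kfold_indices(n_samples, n_folds, rng):
    """Yield (train, test) index arrays for a shuffled k-fold split."""
    folds = np.array_split(rng.permutation(n_samples), n_folds)
    for i in range(n_folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train, folds[i]

def fit_predict(x_train, y_train, x_test, gamma, D=200, ridge=1e-3, seed=0):
    """Hypothetical stand-in model: ridge regression on random Fourier
    features whose frequencies are drawn from N(0, gamma I)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, np.sqrt(gamma), size=(D, x_train.shape[1]))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    phi = lambda x: np.sqrt(2.0 / D) * np.cos(x @ W.T + b)
    Z = phi(x_train)
    coef = np.linalg.solve(Z.T @ Z + ridge * np.eye(D), Z.T @ y_train)
    return phi(x_test) @ coef

def grid_search(x, y, gammas, n_folds=5, seed=0):
    """Return the gamma with the lowest mean cross-validated RRMSE."""
    scores = {}
    for gamma in gammas:
        rng = np.random.default_rng(seed)  # same folds for every gamma
        fold_scores = [rrmse(y[te], fit_predict(x[tr], y[tr], x[te], gamma))
                       for tr, te in kfold_indices(len(y), n_folds, rng)]
        scores[gamma] = float(np.mean(fold_scores))
    best = min(scores, key=scores.get)
    return best, scores

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=(300, 2))
y = 0.5 * np.sum(x ** 2, axis=1)
gammas = np.logspace(-2, 2, 5)  # logarithmic grid of hyperparameters
best_gamma, scores = grid_search(x, y, gammas)
print(best_gamma, scores[best_gamma])
```

Each pair of hyperparameter and fold is independent of the others, which is why the loop lends itself to the parallelization mentioned above.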

For the radial symmetric and Ebola test cases, the inputs are sampled from a uniform distribution with problem dependent ranges. For the stochastic elliptic partial differential case, the inputs are the coefficients of a Karhunen–Loève expansion and are sampled from a normal distribution. All the computations regarding AS and KAS are done using the open source Python package called ATHENA. ⁵⁷

4.1. Radial symmetric functions

Radial symmetric functions represent a class of model functions for which AS is not able to unveil any low dimensional behavior. In fact, for these functions any rotation of the parameter space produces the same model representation. Instead, kernel‐based AS is able to overcome this problem thanks to the mapping onto the feature space.

We present two benchmarks: an 8‐dimensional hyperparaboloid defined as

f : {[- 1, 1]}^{8} \subset ℝ^{8} \to ℝ, f (x) = \frac{1}{2} ‖ x ‖^{2},

(25)

and the surface of revolution in $ℝ^{3}$ with generatrix $g (x) = \sin (x^{2})$

f : {[- 3, 3]}^{2} \subset ℝ^{2} \to ℝ, f (x) = g (‖ x ‖) = \sin (‖ x ‖^{2}) .

(26)

The gradients are computed analytically.
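For reference, the two benchmark functions of Equations (25) and (26) and their analytical gradients can be written in a few lines. The sketch below uses hypothetical function names and adds a central finite difference check, which is not part of the procedure described in the text and serves only to validate the formulas.

```python
import numpy as np

def hyperparaboloid(x):
    """f(x) = 0.5 * ||x||^2, Equation (25)."""
    return 0.5 * np.sum(x ** 2, axis=-1)

def grad_hyperparaboloid(x):
    """grad f(x) = x."""
    return np.asarray(x, dtype=float).copy()

def sine_surface(x):
    """f(x) = sin(||x||^2), Equation (26)."""
    return np.sin(np.sum(x ** 2, axis=-1))

def grad_sine_surface(x):
    """grad f(x) = 2 x cos(||x||^2), by the chain rule."""
    r2 = np.sum(x ** 2, axis=-1, keepdims=True)
    return 2.0 * x * np.cos(r2)

def finite_difference_grad(f, x, h=1e-6):
    """Central finite differences, used only as a check."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

rng = np.random.default_rng(0)
x8 = rng.uniform(-1.0, 1.0, size=8)
x2 = rng.uniform(-3.0, 3.0, size=2)
err8 = np.max(np.abs(grad_hyperparaboloid(x8)
                     - finite_difference_grad(hyperparaboloid, x8)))
err2 = np.max(np.abs(grad_sine_surface(x2)
                     - finite_difference_grad(sine_surface, x2)))
print(err8, err2)
```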

For the hyperparaboloid we use $N_{s} = 500$ independent, uniformly distributed training samples in ${[- 1, 1]}^{8}$ , while for the sine case the training samples are $N_{s} = 800$ in ${[- 3, 3]}^{2}$ . In both cases, the test samples are 500. The feature space has dimension 1000 for both the first and the second case. The spectral distribution chosen is the multivariate normal with hyperparameter a uniform variance $λ I_{d}$ , and a product of Laplace distributions with $γ$ and $b$ as hyperparameters, respectively. The tuning is carried out with 3‐fold cross validation. The results are summarized in Table 1.

TABLE 1.

Performance results for AS and KAS methods

Case	Dim	$N_{s}$	Spectral distribution	Feature space dim	RRMSE AS	RRMSE KAS
Hyperparaboloid	8	500	$𝒩 (0, λ I_{d})$	1000	$0.98 \pm 0.03$	$0.23 \pm 0.02$
Sine	2	800	Laplace $(γ, b)$	1000	$1.011 \pm 0.01$	$0.31 \pm 0.06$
Ebola	8	800	Beta $(α, β)$	1000	$0.46 \pm 0.31$	$0.31 \pm 0.03$
SPDE (31)	10	1000	$𝒩 (0, Σ)$	1500	$0.611 \pm 0.001$	$0.515 \pm 0.013$

Note: For each case, we report the parameter space dimension, the number of samples $N_{s}$ used for the training, the chosen distribution, the dimension of the feature space, and the RRMSE mean and standard deviation for AS and KAS. The best results are given in bold.

Looking at the eigenvalues of the uncentered covariance matrix of the gradients $\tilde{H}$ for the hyperparaboloid case in Figure 2, we can clearly see how the decay for AS is almost absent, while using KAS the decay after the first eigenvalue is pronounced, suggesting the presence of a kernel‐based active subspace of dimension 1.
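The absence of decay for AS can be reproduced with a few lines. The sketch below builds a Monte Carlo estimate of the uncentered covariance matrix of the gradients for the hyperparaboloid of Equation (25), whose gradient is $\nabla f (x) = x$ . For inputs uniform on ${[- 1, 1]}^{8}$ the exact matrix is $\frac{1}{3} I$ , so all eigenvalues coincide and no spectral gap can appear. The sample size mirrors the $N_{s} = 500$ used in the text; the random seed and variable names are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, m = 500, 8

# Uniform samples in [-1, 1]^8; for f(x) = 0.5 ||x||^2 the gradient is x itself.
x = rng.uniform(-1.0, 1.0, size=(n_samples, m))
grads = x

# Monte Carlo estimate of the uncentered covariance E[grad f grad f^T].
H = grads.T @ grads / n_samples

# Eigenvalues in decreasing order: all close to 1/3, so there is no gap.
eigvals = np.linalg.eigvalsh(H)[::-1]
print(eigvals)
```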

NME-7099-FIG-0002-b — Eigenvalues of the covariance matrix $\tilde{H} \in ℝ^{8 \times 8}$ applied to the hyperparaboloid case for the AS procedure on the left, and the first 10 eigenvalues of the covariance matrix $\tilde{H} \in ℝ^{1000 \times 1000}$ for the KAS procedure applied to the same case on the right

The one‐dimensional sufficient summary plots, which are $f (x)$ against $W_{1}^{T} x$ —in the AS case—or against $W_{1}^{T} ϕ (x)$ —in the KAS case, are shown in Figures 3 and 4, respectively. On the left panels, we present the Gaussian process response surfaces obtained from the active subspaces reduction, while on the right panels the ones obtained with the kernel‐based AS extension. As we can see, AS fails to properly reduce the parameter spaces, since there are no preferred directions over which the model functions vary the most. The KAS approach, on the contrary, is able to unveil the corresponding generatrices. This results in a reduction of the RRMSE by a factor of at least 3 (see Table 1).

NME-7099-FIG-0003-c — Comparison between the sufficiency summary plots obtained from the application of AS and KAS methods for the hyperparaboloid model function with domain ${[- 1, 1]}^{8}$ , defined in Equation (25). The left plot refers to AS, the right plot to KAS. With the blue solid line, we depict the posterior mean of the GP, with the shadow area the 68% confidence intervals, and with the blue dots the testing points.

NME-7099-FIG-0004-c — Comparison between the sufficiency summary plots obtained from the application of AS and KAS methods for the surface of revolution model function with domain ${[- 3, 3]}^{2}$ , defined in Equation (26). The left plot refers to AS, the right plot to KAS. With the blue solid line, we depict the posterior mean of the GP, with the shadow area the 68% confidence intervals, and with the blue dots the testing points.

4.2. SEIR model for the spread of Ebola

In most engineering applications, the output of interest presents a monotonic behavior with respect to the parameters. This means that, for example, the increment in the inputs produces a proportional response in the outputs. Rarely, the model function has a radial symmetry, and in such cases the parameter space can be divided in subdomains, which are analyzed separately. In this section, we are going to present a test case where there is no radial symmetry, showing that, even in this case the kernel‐based AS presents better performance with respect to AS.

For the Ebola test case, ^‡ the output of interest is the basic reproduction number $R_{0}$ of the SEIR model, described in Reference ⁶⁶, which reads

R_{0} = \frac{β_{1} + \frac{β_{2} ρ_{1} γ_{1}}{ω} + \frac{β_{3}}{γ_{2}} ψ}{γ_{1} + ψ},

(27)

with parameters distributed uniformly in $Ω \subset ℝ^{8}$ . The parameter space $Ω$ is a hypercube defined by the lower and upper bounds summarized in Table 2.

TABLE 2.

Parameter ranges for the Ebola model

β_{1}

β_{2}

β_{3}

ρ_{1}

γ_{1}

γ_{2}

ω

ψ

Lower bound

0.1

0.05

0.41

0.0276

0.081

0.25

0.0833

Upper bound

0.4

0.2

0.1702

0.21

0.5

0.7

Note: Data taken from Reference ⁶⁶.
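Equation (27) translates directly into code. In the minimal sketch below the function name is hypothetical and the parameter values are arbitrary numbers chosen only to exercise the formula; they are not meant to reproduce a specific row of Table 2.

```python
def basic_reproduction_number(beta1, beta2, beta3, rho1,
                              gamma1, gamma2, omega, psi):
    """Basic reproduction number R_0 of the SEIR model, Equation (27)."""
    numerator = beta1 + beta2 * rho1 * gamma1 / omega + beta3 / gamma2 * psi
    return numerator / (gamma1 + psi)

# Hypothetical parameter values, used only to exercise the formula.
r0 = basic_reproduction_number(beta1=0.2, beta2=0.1, beta3=0.1, rho1=0.5,
                               gamma1=0.1, gamma2=0.1, omega=0.3, psi=0.4)
print(f"R_0 = {r0:.4f}")
```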

We can compare the two one‐dimensional response surfaces obtained with Gaussian process regression. The training samples are $N_{s} = 800$ , and we use 1000 features. As spectral measure we use again the multivariate Gaussian distribution $𝒩 (0, Σ)$ with hyperparameters the elements of the diagonal of the covariance matrix. The tuning is carried out with 5‐fold cross validation. Even in this case, the KAS approach results in a smaller RRMSE with respect to the use of AS (around 60% less), as reported in Table 1. In Figure 5, we report the comparison of the two approaches over an active subspace of dimension 1.

NME-7099-FIG-0005-c — Comparison between the sufficiency summary plots obtained from the application of AS and KAS methods for the $R_{0}$ model function with domain $Ω$ , defined in Equation (27). The left plot refers to AS, the right plot to KAS. With the blue solid line, we depict the posterior mean of the GP, with the shadow area the 68% confidence intervals, and with the blue dots the testing points.

4.3. Elliptic partial differential equation with random coefficients

In our last benchmark, we apply the kernel‐based AS to a vectorial model function, that is the solution of a Poisson problem with heterogeneous diffusion coefficient. We refer to Reference ¹¹ for an application, on the same problem, of the AS approach.

We consider the following stochastic Poisson problem on the square $x = (x, y) \in Ω : = {[0, 1]}^{2}$ :

\begin{cases} - \nabla \cdot (κ \nabla u) = 1, & x \in Ω, \\ u = 0, & x \in \partial Ω_{top} \cup \partial Ω_{bottom}, \\ u = 10 y (1 - y), & x \in \partial Ω_{left}, \\ n \cdot \nabla u = 0, & x \in \partial Ω_{right}, \end{cases}

(28)

with homogeneous Neumann boundary condition on the right side of the domain, that is $\partial Ω_{right}$ , nonhomogeneous Dirichlet boundary conditions on the left side of the domain, that is $\partial Ω_{left}$ , and homogeneous Dirichlet boundary conditions on the remaining part of $\partial Ω$ . The diffusion coefficient $κ : (Ω, 𝒜, P) \times Ω \to ℝ$ , where $𝒜$ is a $σ$ ‐algebra, is such that $\log (κ)$ is a Gaussian random field, with covariance function $C (x, y)$ defined by

C (x, y) = \exp (- \frac{‖ x - y ‖^{2}}{β^{2}}), \forall x, y \in Ω,

(29)

where $β = 0.03$ is the correlation length. This random field is approximated with the truncated Karhunen–Loève decomposition

κ (s, x) \approx \exp (\sum_{i = 1}^{m} X_{i} (s) γ_{i} ψ_{i} (x)), \forall (s, x) \in Ω \times Ω,

(30)

where ${(X_{i})}_{i \in \{1, \dots, m\}}$ are independent standard normal distributed random variables, and ${(γ_{i}, ψ_{i})}_{i \in \{1, \dots, m\}}$ are the eigenpairs of the Karhunen–Loève decomposition of the zero‐mean random field $\log (κ)$ .
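A discrete analogue of Equations (29) and (30) can be sketched as follows. This is an illustration under several assumptions, not the finite element procedure used for the benchmark: the covariance is evaluated on a hypothetical uniform grid of points, the eigenpairs come from a plain matrix eigenproblem without quadrature weights, the modes are scaled by the square root of the eigenvalues (the usual Karhunen–Loève normalization, which may or may not coincide with the $γ_{i}$ of Equation (30)), and the correlation length is taken larger than $β = 0.03$ so that a coarse grid can resolve it.

```python
import numpy as np

def truncated_kl(points, beta, m):
    """Leading m eigenpairs of C(x, y) = exp(-||x - y||^2 / beta^2)
    evaluated on a set of points (plain matrix eigenproblem)."""
    diff = points[:, None, :] - points[None, :, :]
    C = np.exp(-np.sum(diff ** 2, axis=-1) / beta ** 2)
    eigvals, eigvecs = np.linalg.eigh(C)          # ascending order
    idx = np.argsort(eigvals)[::-1][:m]           # keep the m largest
    return np.clip(eigvals[idx], 0.0, None), eigvecs[:, idx]

def sample_log_normal_field(eigvals, eigvecs, rng):
    """kappa = exp(sum_i X_i sqrt(lambda_i) psi_i) with X_i ~ N(0, 1)."""
    X = rng.standard_normal(eigvals.size)
    return np.exp(eigvecs @ (np.sqrt(eigvals) * X)), X

# Hypothetical 20 x 20 grid on the unit square.
g = np.linspace(0.0, 1.0, 20)
points = np.array([(a, b) for a in g for b in g])

eigvals, eigvecs = truncated_kl(points, beta=0.3, m=10)
kappa, X = sample_log_normal_field(eigvals, eigvecs, np.random.default_rng(0))
print(kappa.shape, kappa.min(), kappa.max())
```

The vector `X` plays the role of the input parameters of the model function: each draw of `X` defines one diffusion coefficient and hence one simulation.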

In our simulation, the domain $Ω$ is discretized with a triangular unstructured mesh $𝒯$ with 3194 triangles. The parameter space has dimension $m = 10$ . The simulations are carried out with the finite element method (FEM) with polynomial order one, and for each simulation the parameters ${(X_{i})}_{i = 1, \dots, m}$ are sampled from a standard normal distribution. The solution $u$ is evaluated at $d = 1668$ degrees of freedom, thus $(V, R_{V}) \approx (ℝ^{d}, S + M)$ where the metric $R_{V}$ is approximated with the sum of the stiffness matrix $S \in ℝ^{d \times d}$ and the mass matrix $M \in ℝ^{d \times d}$ . This sum is a discretization of the norm of the Sobolev space $H^{1} (Ω)$ . The number of features used in the KAS procedure is $D = 1500$ , the number of different independent simulations is $M = 1000$ .

Three outputs of interest are considered. The first target function $f : ℝ^{m} \to ℝ$ is the mean value of the solution at the right boundary $\partial Ω_{right}$ , which reads

f (X) = \frac{1}{| \partial Ω_{right} |} \int_{\partial Ω_{right}} u (s) d s,

(31)

and it is used to tune the feature map minimizing the RRMSE of the Gaussian process regression, as described in Algorithm 5. A summary of the results for the first output is reported in Table 1. The plots of the regression are reported in Figure 6. Even in this case both from a qualitative and a quantitative point of view, the kernel‐based approach achieves the best results.

NME-7099-FIG-0006-c — Comparison between the sufficiency summary plots obtained from the application of AS and KAS methods for the stochastic PDE model, defined in Equations (28) and (31). The left plot refers to AS, the right plot to KAS. With the blue solid line, we depict the posterior mean of the GP, with the shadow area the 68% confidence intervals, and with the blue dots the testing points.

The second output we consider is the solution function

f : ℝ^{m} \to (V, R_{V}) \approx (ℝ^{d}, S), f (X) = u \in ℝ^{d} .

(32)

This output can be employed as a surrogate model to predict the solution $u$ given the parameters $X$ that define the diffusion coefficient instead of carrying out the numerical simulation. The surrogate model should be constructed over the span of the modes identified by the chosen reduction strategy, after projecting the data. AS and KAS modes differ, but they can detect some common regions of interest, as shown in Table 3.

TABLE 3.

First 3 modes using Karhunen–Loève (K‐L) decomposition, AS, and KAS, for the outputs defined in Equations (31)–(33)

Case	Mode 1	Mode 2	Mode 3
K‐L
AS (31)
KAS (31)
AS (32)
KAS (32)
AS (33)
KAS (33)

The third output is the evaluation of the solution at a specific degree of freedom with index $î$ , that is

f : ℝ^{m} \to ℝ, f (X) = u_{î} \in ℝ,

(33)

in this case the dimension of the input space is $m = 100$ . Since we use a Lagrangian basis in the finite element formulation and the polynomial order is 1, the node of the mesh associated to the chosen degree of freedom has coordinates $[0.27, 0.427] \in Ω$ . Qualitatively we can see from Table 3 that the AS modes locate features in the domain which are relatively more regular with respect to the KAS modes. To obtain this result, we increased the dimension of the input space, otherwise not even the AS modes could locate properly the position in the domain $Ω$ of the degree of freedom.

In the second and third case the diffusion coefficient is given by

κ (x) = \exp (\sum_{i = 1}^{D} v_{j} [i] {\tilde{ψ}}_{i} (x)), \forall x \in Ω,

(34)

where $v_{j} \in ℝ^{D}$ , $j \in \{1, \dots, D\}$ , is the $j$ th active eigenvector from the KAS procedure and the functions $\tilde{Ψ} : = ({\tilde{ψ}}_{1}, \dots, {\tilde{ψ}}_{D})$ are defined by

\tilde{Ψ} = ϕ (Ψ),

(35)

where $ϕ$ is the feature map defined in Equation (16) with the projection matrix $W$ and bias $b$ , and $Ψ : = (γ_{1} ψ_{1}, \dots, γ_{m} ψ_{m})$ .

The gradients of the three outputs of interest considered are evaluated with the adjoint method.

5. A CFD PARAMETRIC APPLICATION OF KAS SOLVED WITH THE DG METHOD

We want to test the kernel‐based extension of the active subspaces in a CFD context. The lift and drag coefficients of a NACA 0012 airfoil are considered as model functions. Numerical simulations are carried out with different input parameters for quantities that describe the geometry and the physical conditions of the problem. The evolution of the model is protracted until a periodic regime is reached. Once the simulation data have been collected, sensitivity analysis is performed searching for an active subspace and response surfaces with GPR are then built from the application of AS and KAS techniques.

The fluid motion is modeled through the unsteady incompressible Navier–Stokes equations approximated through the Chorin–Temam operator‐splitting method implemented in HopeFOAM. ⁶⁷ HopeFOAM is an extension of OpenFOAM, ⁶⁸ ^, ⁶⁹ an open source software for the solution of complex fluid flow problems, to variable higher order element methods, and it adopts a DG method based on the formulation proposed by Hesthaven and Warburton. ⁵²

The DG method is a high‐order method, which has appealing features such as the low artificial viscosity and a convergence rate which is optimal also on unstructured grids, commonly used in industrial frameworks. In addition to this, DG is naturally suited for the solution of problems described by conservative governing equations (Navier–Stokes equations, Maxwell's equations, and so on) and for parallel computing. All these properties are due to the fact that, differently from formulations based on standard finite elements, no continuity is imposed on the cell boundaries and neighboring elements only exchange a common flux. The major drawback of DG is its high computational cost with respect to continuous Galerkin methods, due to the need to evaluate fluxes during each time step and to the presence of extra degrees of freedom along the element edges.

Nowadays, efforts are aimed at applying the DG in problems which involve deformable domains ⁷⁰ and at improving the computational efficiency of the DG adopting techniques based on hybridization methods, matrix‐free implementations, and massive parallelization. ⁷¹ ^, ⁷²

5.1. Domain and mesh description

The domain $Ω$ of the fluid dynamic simulation is a two‐dimensional duct with a sudden area expansion and a NACA 0012 airfoil is placed in the largest section. The inflow $\partial Ω_{I}$ is placed at the beginning of the narrowest part of the duct, and here the fluid velocity is set constant along all the inlet boundary. The outlet is placed on the right‐hand side and it is denoted with $\partial Ω_{O}$ . We refer with $\partial Ω_{W} : = \partial Ω ∖ {\partial Ω_{O} \cup \partial Ω_{I}}$ to the boundaries of the airfoil and to the walls of the duct, where no slip boundary conditions are applied. The horizontal lengths of the sections of the channels are 0.6 and 1.35 m, respectively. The vertical length of the duct after the area expansion is 0.4 m, while the width of the first one depends on two distinct parameters. The airfoil has a chord‐length equal to 0.1 m but its position with respect to the duct and its angle of attack are described by geometric parameters. Further details about the geometric parameterization of the geometry are provided in the following section. A proper triangulation is designed with the aid of the gmsh ⁷³ tool and the domain is discretized with 4445 unstructured elements.

The evaluation of dimensionless quantities, commonly used for characterizing the fluid flow field, requires the definition of some reference quantities. For the problem at hand, we consider the equivalent diameter of the channel at the inlet as the reference length scale, while the reference velocity is the one imposed at the inlet.

5.2. Parameter space description

We chose seven heterogeneous parameters for the model: two physical, and five geometrical, which describe the width of the channel and the position of the airfoil. In Table 4, the ranges for the geometrical and physical parameters of the simulation are reported. $U$ is the first component of the initial velocity, $ν$ is the kinematic viscosity, $x_{0}$ and $y_{0}$ are the horizontal and vertical components of the translation of the airfoil with respect to its reference position (see Figure 7), $α$ is the angle of the counterclockwise rotation, whose center is located right in the middle of the airfoil, and $y^{+}$ and $y^{-}$ are the magnitudes of the vertical displacements of the upper and lower side of the initial section of the duct from a prescribed position.

TABLE 4.

Parameter ranges for the NACA problem

ν

U

x_{0}

y_{0}

α

y^{+}

y^{-}

Lower bound

0.00036

0.5

-0.099

-0.035

-0.02

-0.02

Upper bound

0.00060

0.099

0.035

0.0698

0.02

NME-7099-FIG-0007-c — Domain configuration for minimum and maximum values of some geometric parameters. The maximum angle of attack $α$ , the ranges for the horizontal translation $x_{0}$ , the ranges for the vertical translation $y_{0}$ , and the minimum opening of the channel which depends on the parameters $y^{+}$ and $y^{-}$ are represented in Table 4.

In Figure 7, the different configurations of the domain for the minimum and maximum values of the parameters $α$ , $x_{0}$ , $y_{0}$ , and the minimum opening of the channel are reported.

We have considered only the counterclockwise rotation of the airfoil for symmetry reasons. The Reynolds number ranges from 400 to 2000, still within the laminar flow regime.

5.3. Governing equations

The CFD problem is modeled through the incompressible Navier–Stokes equations, and the open source solver HopeFOAM ⁶⁷ has been employed for solving this set of equations. ⁵²

Let $Ω \subset ℝ^{2}$ be the two‐dimensional domain introduced in Section 5.1, and let us consider the incompressible Navier–Stokes equations. Omitting the dependence on $(x, t) \in Ω \times ℝ^{+}$ in the first two equations for the sake of compactness, the governing equations are

\begin{align} \begin{cases} \partial_{t} u + (u \cdot \nabla) u = - \nabla p + ν Δ u, & x \in Ω, \\ \nabla \cdot u = 0, & x \in Ω, \\ u (x, 0) = u_{0}, p (x, 0) = 0, & x \in Ω, \\ u (x, t) = u_{0}, n \cdot \nabla p (x, t) = 0, & x \in \partial Ω_{I}, \\ u (x, t) = 0, n \cdot \nabla p (x, t) = 0, & x \in \partial Ω_{W}, \\ n \cdot \nabla u (x, t) = 0, p (x, t) = 1, & x \in \partial Ω_{O}, \end{cases} \end{align}

(36)

where $p$ is the scalar pressure field, $u = (u, v)$ is the velocity field, $ν$ is the viscosity constant and $u_{0}$ is the initial velocity. In conservative form, the previous equations can be rewritten as

\begin{cases} \partial_{t} u + \nabla \cdot ℱ = - \nabla p + ν Δ u, \\ \nabla \cdot u = 0, \end{cases}

(37)

with the flux $ℱ$ given by

ℱ = [F_{1}, F_{2}] = \begin{bmatrix} u^{2} & u v \\ u v & v^{2} \end{bmatrix} .

(38)

From now on, in order to have a more compact notation, the advection term is written as $𝒩 (u) = \nabla \cdot ℱ (u)$ .
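The equivalence between the convective form of Equation (36) and the conservative form of Equation (37) follows from the incompressibility constraint. A short component‐wise derivation with the flux of Equation (38), where $u$ and $v$ denote the two velocity components, reads:

```latex
\begin{align}
\nabla \cdot ℱ (u)
  &= \begin{bmatrix} \partial_{x} (u^{2}) + \partial_{y} (u v) \\ \partial_{x} (u v) + \partial_{y} (v^{2}) \end{bmatrix} \\
  &= \begin{bmatrix} u \partial_{x} u + v \partial_{y} u \\ u \partial_{x} v + v \partial_{y} v \end{bmatrix}
   + (\partial_{x} u + \partial_{y} v) \begin{bmatrix} u \\ v \end{bmatrix} \\
  &= (u \cdot \nabla) u + (\nabla \cdot u) u = (u \cdot \nabla) u ,
\end{align}
```

where the last equality uses $\nabla \cdot u = 0$ .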

For each timestep, the procedure is broken into three stages according to the algorithm proposed by Chorin and adapted for a DG framework by Hesthaven and Warburton: ⁵² the solution of the advection dominated conservation law component, the pressure correction weak divergence‐free velocity projection, and the viscosity update. The nonlinear advection term is treated explicitly in time through a second order Adams–Bashforth method, ⁷⁴ while the diffusion term is treated implicitly. The Chorin algorithm is reported in Algorithm 6.
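The explicit and implicit treatment of the two terms can be illustrated on a much simpler problem. The sketch below is not the Chorin–Temam DG scheme of Algorithm 6: it integrates the one‐dimensional viscous Burgers equation on a periodic domain with a Fourier pseudo‐spectral discretization, using second order Adams–Bashforth for the advection term and, as an assumption made here for simplicity, backward Euler for the diffusion term. All names and parameter values are hypothetical.

```python
import numpy as np

def imex_ab2_burgers(u0, nu, dt, n_steps, length=2.0 * np.pi):
    """Integrate u_t + (u^2 / 2)_x = nu * u_xx on a periodic domain.

    Advection: explicit second-order Adams-Bashforth.
    Diffusion: implicit backward Euler, solved exactly in Fourier space.
    """
    n = u0.size
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)  # angular wavenumbers

    def advection(u):
        # N(u) = d/dx (u^2 / 2), evaluated pseudo-spectrally
        return np.real(np.fft.ifft(1j * k * np.fft.fft(0.5 * u ** 2)))

    u = u0.copy()
    n_old = advection(u)  # makes the first step a forward Euler step
    for _ in range(n_steps):
        n_new = advection(u)
        rhs = u - dt * (1.5 * n_new - 0.5 * n_old)      # explicit part
        u = np.real(np.fft.ifft(np.fft.fft(rhs) / (1.0 + nu * dt * k ** 2)))
        n_old = n_new
    return u

x = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
u_init = np.sin(x)
u_final = imex_ab2_burgers(u_init, nu=0.1, dt=1e-3, n_steps=500)
print(np.linalg.norm(u_init), np.linalg.norm(u_final))
```

Treating the diffusion implicitly removes the parabolic restriction on the time step, so that only the advective CFL condition has to be enforced.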

Algorithm 6. Chorin algorithm.

In order to recover the DG formulation, the equations introduced by the Chorin method are projected onto the solution space by introducing a proper set of test functions and then the variables are approximated over each element as a linear combination of local shape functions. The DG method does not impose the continuity of the solution between neighboring elements and therefore it requires the adoption of methods for the evaluation of the flux exchange between neighboring elements. In the present work, the convective fluxes are treated according to the Lax–Friedrichs scheme, while the viscous ones are solved through the interior penalty method. ⁷⁵ ^, ⁷⁶

The aerodynamic quantities we are interested in are the lift and drag coefficients in the incompressible case computed from the quantities $u$ , $p$ , $ν$ , $A_{ref}$ , and $u_{0}$ with a contour integral along the airfoil $Γ$ as

f = \oint_{Γ} p n - ν (\nabla u + \nabla u^{T}) n d s .

(39)

The vector $n$ is the outward normal along the airfoil surface. The circulation in $Γ$ is affected by both the pressure and stress distributions around the airfoil. The projection of the force along the horizontal and vertical directions gives the drag and lift coefficients, respectively

C_{D} = \frac{f \cdot e_{1}}{\frac{1}{2} | u_{0} |^{2} A_{ref}},

(40)

C_{L} = \frac{f \cdot e_{2}}{\frac{1}{2} | u_{0} |^{2} A_{ref}},

(41)

where the reference area $A_{ref}$ is the chord of the airfoil times a length of 1 m. For the aerodynamic analysis of the fluid flow past an airfoil, see Reference ⁷⁷.
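Equations (39)–(41) can be illustrated with a small numerical sketch. It is restricted to the pressure contribution of Equation (39) (the viscous term is omitted), it approximates the contour integral with the midpoint rule on a closed polygon with counterclockwise vertices, and it uses a unit square with a linear pressure field as a hypothetical test geometry instead of the airfoil. The function names are not taken from HopeFOAM.

```python
import numpy as np

def pressure_force(vertices, pressure):
    """Midpoint-rule approximation of the contour integral of p n ds
    on a closed polygon with counter-clockwise vertices."""
    nxt = np.roll(vertices, -1, axis=0)
    tangents = nxt - vertices                       # edge vectors
    # outward normal times edge length for a counter-clockwise polygon
    normals_ds = np.column_stack([tangents[:, 1], -tangents[:, 0]])
    midpoints = 0.5 * (vertices + nxt)
    p_mid = np.array([pressure(pt) for pt in midpoints])
    return np.sum(p_mid[:, None] * normals_ds, axis=0)

def force_coefficients(force, u0_norm, a_ref):
    """Drag and lift coefficients following Equations (40) and (41)."""
    scale = 0.5 * u0_norm ** 2 * a_ref
    return force[0] / scale, force[1] / scale

# Hypothetical check on the unit square with p(x, y) = y: by the gradient
# theorem the contour integral of p n equals (0, area) = (0, 1).
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
force = pressure_force(square, lambda pt: pt[1])
c_d, c_l = force_coefficients(force, u0_norm=1.0, a_ref=0.1)
print(force, c_d, c_l)
```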

5.4. Numerical results

In this section, a brief review of the procedure and some details about the numerical method and the computational domain are presented along with the results obtained. For the DG discretization, the polynomial order chosen is 3. The total number of degrees of freedom is 133,350. Small variations on the mesh are present in each of the 285 simulations due to the different configurations of the domain. Each simulation is carried out until a periodic behavior is reached and for this reason the final times range between 3.5 and 5 s, depending on the specific configuration. The integration time intervals are variable and they are updated at the end of each step in order to satisfy the CFL condition. The seven physical and geometrical parameters of the simulation are sampled uniformly from the intervals in Table 4. In total, we consider a dataset of 285 samples.

With the purpose of qualitatively visualizing the results, four different simulations are reported in Figure 8 for the magnitude of the velocity field and the scalar pressure field, respectively, both evaluated at the last time instant. These simulations were chosen from the 285 collected in order to show significant differences in the evolution of the fluid flow. In Table 5, the corresponding parameters are reported. Depending on the position of the airfoil and the other physical parameters, different fluid flow patterns can be qualitatively observed.

NME-7099-FIG-0008-c — Magnitude of the velocity fields (on the left) and pressure fields (on the right) evaluated at the last time instant of four different simulations. The corresponding parameters are reported in Table 5.

TABLE 5.

Parameters associated to the simulations plotted in Figure 8

$ν$	$U$	$x_{0}$	$y_{0}$	$α$	$y^{+}$	$y^{-}$
0.000405	1.99	-0.096	-0.00207	0.00282	0.00784	0.0188
0.000541	0.763	-0.084	0.00279	0.0260	-0.0108	0.0195
0.000406	0.533	-0.0503	-0.0327	0.0604	-0.0193	0.0068
0.000430	1.11	-0.0897	-0.0279	0.0278	-0.00624	0.0197

The lift ( $C_{L}$ ) and drag ( $C_{D}$ ) coefficients are evaluated when stationary or periodic regimes are reached, starting from the values of pressure and viscous stresses evaluated on the nodes close to the airfoil. After this, sensitivity analysis is carried out. First the AS method is applied. The gradients necessary for the application of the AS method are obtained from the Gaussian process regression of the model functions $C_{L}$ and $C_{D}$ on the whole parameters' domain. The eigenvalues of the uncentered covariance matrix for the lift and drag coefficients suggest the presence of a one‐dimensional active subspace in both cases.

The plots of the first active eigenvector components are useful as sensitivity measures, see Figure 9. The greater the absolute value of a component is, the greater is its influence on the model function. We observe that the lift coefficient is influenced mainly by the vertical position of the airfoil and the angle of attack, while the drag coefficient depends mainly on the initial velocity, and secondarily on the viscosity and on the angle of attack.

NME-7099-FIG-0009-c — Components of the first active eigenvector for the lift coefficient (on the left), and for the drag coefficient (on the right). Values near 0 suggest little sensitivity for the target function.

As one could expect from physical considerations, the angle of attack affects both drag and lift coefficients, while the viscosity, which governs the wall stresses, is relevant for the evaluation of the $C_{D}$ . The vertical position of the airfoil with respect to the symmetric axis of the section of the duct after the area expansion also greatly affects both coefficients, and this is mainly due to the fact that the fluid flow conditions change drastically between the core, where the speed is higher, and the one close to the wall of the duct, where the speed tends to zero. On the other hand, the horizontal translation has almost no impact on the results, given the regularity of the fluid flow along the $x$ ‐axis for the considered range of $x_{0}$ . Moreover, the nonsymmetric behavior of the upper and lower parameters which determine the opening of the channel is due to the nonsymmetric choice of the range considered for the angle of attack.

The KAS method was applied with 1500 features. In order to compare the AS and KAS methods 5‐fold cross validation was implemented. The score of cross validation is the RRMSE defined in Equation (19).

The GPR for the two methods are shown in Figure 10 for the lift coefficient, and in Figure 11 for the drag coefficient. They were obtained as a single step of 5‐fold cross validation with one fifth of the 285 samples used as test set. The spectral distribution of the feature map is the Gaussian distribution for the lift, and the Beta for the drag, respectively. The RRMSE mean and standard deviation from 5‐fold cross validation, are reported for different active dimensions in Table 6. The feature map from Equation (16) was adopted. The hyperparameters of the spectral distributions were tuned with logarithmic grid‐search with 5‐fold cross validation as described in Algorithm 5.

NME-7099-FIG-0010-c — Comparison between the sufficiency summary plots obtained from the application of AS and KAS methods for the lift coefficient $C_{L}$ defined in Equation (41). The left plot refers to AS, the right plot to KAS. The blue solid line depicts the posterior mean of the GP, the shaded area the 68% confidence intervals, and the blue dots the testing points.

NME-7099-FIG-0011-c — Comparison between the sufficiency summary plots obtained from the application of AS and KAS methods for the drag coefficient $C_{D}$ defined in Equation (40). The left plot refers to AS, the right plot to KAS. The blue solid line depicts the posterior mean of the GP, the shaded area the 68% confidence intervals, and the blue dots the testing points.

TABLE 6. Summary of the results for AS and KAS procedures

| Method | Dim | Feature space dim | Lift spectral distribution | RRMSE lift | Drag spectral distribution | RRMSE drag |
| --- | --- | --- | --- | --- | --- | --- |
| AS | 1 | – | – | $0.37 \pm 0.09$ | – | $0.268 \pm 0.032$ |
| KAS | 1 | 1500 | $𝒩 (0, λ I_{d})$ | $0.344 \pm 0.048$ | $Beta (α, β)$ | $0.218 \pm 0.045$ |
| AS | 2 | – | – | $0.384 \pm 0.073$ | – | $0.183 \pm 0.027$ |
| KAS | 2 | 1500 | $𝒩 (0, λ I_{d})$ | $0.328 \pm 0.071$ | $Beta (α, β)$ | $0.17 \pm 0.02$ |

Note: The best results are given in bold.

Regarding the drag coefficient, the relative gain using the KAS method reaches 19.2% on average when employing the Beta spectral measure for the definition of the feature map. For the lift coefficient, the relative gain of the one‐dimensional response surface built with GPR from the KAS method is 7% on average. This lower gain could be due to the higher noise in the evaluation of $C_{L}$. In this case, the relative gain grows to 14.6% when the dimension of the response surface increases to 2. A slight reduction of the AS RRMSE for the drag coefficient is observed when increasing the dimension of the response surface.
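
As a small worked check of the lift figures, the snippet below computes the relative gain from the mean RRMSE values of Table 6. The formula (difference between the AS and KAS scores divided by the AS score) is an assumption, since the relative gain is not defined explicitly in this section, and the function name `relative_gain` is hypothetical. With this definition, the lift values 0.37 and 0.344 (dimension 1) and 0.384 and 0.328 (dimension 2) give the percentages quoted above.

```python
def relative_gain(rrmse_as, rrmse_kas):
    """Relative reduction of the RRMSE when moving from AS to KAS (assumed definition)."""
    return (rrmse_as - rrmse_kas) / rrmse_as

# Mean RRMSE values for the lift coefficient, taken from Table 6.
lift_dim1 = relative_gain(0.370, 0.344)
lift_dim2 = relative_gain(0.384, 0.328)

print(f"lift, dimension 1: {100 * lift_dim1:.1f}%")  # 7.0%
print(f"lift, dimension 2: {100 * lift_dim2:.1f}%")  # 14.6%
```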

6. CONCLUSIONS AND PERSPECTIVES

In this work, we presented a new nonlinear extension of the active subspaces property, the kernel‐based active subspaces (KAS). The method exploits random Fourier features to find active subspaces in high‐dimensional feature spaces. We tested the new method on five benchmarks of increasing complexity, and we provided pseudo‐codes for every aspect of the proposed kernel extension. The tested model functions range from scalar to vector‐valued. We also provided a CFD application discretized by the DG method. We compared the kernel‐based active subspaces to the standard linear active subspaces, and in all cases we observed an increase in the accuracy of the Gaussian response surfaces built over the reduced parameter spaces. The most interesting results concern the possibility of applying the KAS method when an active subspace does not exist. This was shown for radially symmetric model functions.

Future developments will involve the study of more efficient procedures for tuning the hyperparameters of the spectral distribution. Further advances could come from finding an effective back‐mapping from the targets to the actual parameters in the full original space. This could promote the implementation of optimization algorithms or other parameter studies enhanced by the kernel‐based active subspaces extension.

CONFLICT OF INTEREST

The authors declare no potential conflict of interest.

ACKNOWLEDGMENT

This work was partially supported by an industrial Ph.D. grant sponsored by Fincantieri S.p.A. (IRONTH Project), by MIUR (Italian Ministry for University and Research) through FARE‐X‐AROMA‐CFD project, and partially funded by European Union Funding for Research and Innovation—Horizon 2020 Program—in the framework of European Research Council Executive Agency: H2020 ERC CoG 2015 AROMA‐CFD project 681447 “Advanced Reduced Order Methods with Applications in Computational Fluid Dynamics” P.I. Professor Gianluigi Rozza. Open Access Funding provided by Scuola Internazionale Superiore di Studi Avanzati within the CRUI‐CARE Agreement.

PROOF DETAILS

In this section, we provide an expanded version of the proof of Theorem 1.

The proof is remodeled from References 11 and 53, and it is developed in five steps:

1.
Since $R_{V} \in ℳ (d, d)$ is symmetric positive definite, there exists a basis of eigenvectors ${(w_{i})}_{i \in \{1, \dots, d\}}$ and a corresponding set of positive eigenvalues ${(β_{i})}_{i \in \{1, \dots, d\}}$ such that
$R_{V} = \sum_{i = 1}^{d} β_{i} w_{i} \otimes w_{i} .$ (A1)

2.
Let us define the pointwise ridge approximation error $e (X) = f (X) - h (P_{r} (X))$, so that
$\begin{align} ‖ f - h \circ P_{r} ‖_{L^{2} (ℝ^{m}, ℬ (ℝ^{m}), ρ; V)}^{2} = 𝔼_{P} [‖ f (X) - h (P_{r} (X)) ‖_{R_{V}}^{2}] = 𝔼_{P} [‖ e (X) ‖_{R_{V}}^{2}] . \end{align}$ (A2)
Then we can decompose the error analysis for each component employing the spectral decomposition (A1)
$\begin{align} 𝔼_{P} [‖ e (X) ‖_{R_{V}}^{2}] & = 𝔼_{P} [tr ((R_{V} e (X)) \otimes e (X))] \\ & = \sum_{i = 1}^{d} β_{i} 𝔼_{P} [tr (((w_{i} \otimes w_{i}) e (X)) \otimes e (X))] \\ & = \sum_{i = 1}^{d} β_{i} 𝔼_{P} [(w_{i} \cdot e (X)) tr (w_{i} \otimes e (X))] \\ & = \sum_{i = 1}^{d} β_{i} 𝔼_{P} [{(w_{i} \cdot e (X))}^{2}], \end{align}$ (A3)
so we can define $e_{i} (X) = w_{i} \cdot e (X) = f_{i} (X) - h_{i} (P_{r} (X)), \forall i \in \{1, \dots, d\}$, and treat each component separately.

3.
The next step involves the application of Lemma 1 to the scalar functions $f_{i} (X) - h_{i} (P_{r} (X)), \forall i \in \{1, \dots, d\}$:
$\begin{align} 𝔼_{P} [{(f_{i} (X) - h_{i} (P_{r} (X)))}^{2}] & = 𝔼_{P} [𝔼_{P} [{(f_{i} (X) - h_{i} (P_{r} (X)))}^{2} | σ (P_{r})]] \\ & \leq 𝔼_{P} [C_{p} (ρ, P_{r} (X)) 𝔼_{P} [‖ (I - P_{r}^{T}) \nabla f_{i} (X) ‖_{2}^{2} | σ (P_{r})]] \\ & \leq 𝔼_{P} {[C_{p} (ρ, P_{r} (X))]}^{\frac{1}{p}} 𝔼_{P} {[‖ (I - P_{r}^{T}) \nabla f_{i} (X) ‖_{2}^{2}]}^{\frac{1}{q}}, \end{align}$ (A4)
where we used the Hölder inequality with indices $(p, q) = (\infty, 1)$ when $ρ$ belongs to the first and second classes of Assumption 3, and $(p, q) = (\frac{τ + 1}{τ}, 1 + τ)$ when $ρ$ belongs to the third class.

Then we can bound $𝔼_{P} {[C_{p} (ρ, P_{r} (X))]}^{\frac{1}{p}}$ with a constant $C (C_{p} (ρ, P_{r} (X)))$, which depends on the class of $ρ$ (see Lemmas 3.1, 4.2–4.4 and Theorem 4.5 of Reference 53), as follows
$\begin{align} & 𝔼_{P} {[C_{p} (ρ, P_{r} (X))]}^{\frac{1}{p}} 𝔼_{P} {[‖ (I - P_{r}^{T}) \nabla f_{i} (X) ‖_{2}^{2}]}^{\frac{1}{q}} \\ & \leq C (C_{p} (ρ, P_{r} (X))) tr (𝔼_{P} {[(I - P_{r}^{T}) \nabla f_{i} (X) {(\nabla f_{i} (X))}^{T} (I - P_{r})]}^{\frac{1}{q}}) \\ & = C (C_{p} (ρ, P_{r} (X))) tr {((I - P_{r}^{T}) 𝔼_{P} [\nabla f_{i} (X) {(\nabla f_{i} (X))}^{T}] (I - P_{r}))}^{\frac{1}{q}} . \end{align}$ (A5)

4.
The spectral decomposition (A1) is employed again and the covariance matrix $H$ is introduced in the last equation
$\begin{align} & 𝔼_{P} [‖ e (X) ‖_{R_{V}}^{2}] \\ & \leq \sum_{i = 1}^{d} β_{i} C (C_{p} (ρ, P_{r} (X))) tr {((I - P_{r}^{T}) 𝔼_{P} [({(\nabla f (X))}^{T} w_{i}) \otimes ({(\nabla f (X))}^{T} w_{i})] (I - P_{r}))}^{\frac{1}{q}} \\ & = C (C_{p} (ρ, P_{r} (X))) tr {((I - P_{r}^{T}) 𝔼_{P} [{(\nabla f (X))}^{T} (\sum_{i = 1}^{d} β_{i}^{q} w_{i} \otimes w_{i}) \nabla f (X)] (I - P_{r}))}^{\frac{1}{q}} \\ & = C (C_{p} (ρ, P_{r} (X))) tr {((I - P_{r}^{T}) 𝔼_{P} [{(\nabla f (X))}^{T} R_{V} (ρ) \nabla f (X)] (I - P_{r}))}^{\frac{1}{q}} \\ & = C (C_{p} (ρ, P_{r} (X))) tr {((I - P_{r}^{T}) H (I - P_{r}))}^{\frac{1}{q}}, \end{align}$ (A6)
where $R_{V} (ρ)$ is the original metric matrix if $ρ$ belongs to the first or second class of Assumption 3 and is equal to
$\sum_{i = 1}^{d} β_{i}^{1 + τ} w_{i} \otimes w_{i},$ (A7)
if $ρ$ belongs to the third class.

5.
Finally, the bound in the statement of the theorem is recovered by solving the following minimization problem with classical model reduction arguments employing the SVD
${\tilde{P}}_{r} = \underset{P_{r} \in 𝒪 (m, m)}{arg min} tr ((I - P_{r}^{T}) H (I - P_{r})) .$ (A8)
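
Two ingredients of the proof lend themselves to a quick numerical sanity check: the splitting of the $R_{V}$‐weighted squared norm along the eigenvectors of $R_{V}$ used in (A3), and the fact that the trace in (A8) is minimized by projecting onto the leading eigenvectors of $H$. The Python sketch below illustrates both on random matrices. It is not part of the proof and rests on assumptions: $R_{V}$ and $H$ are random stand‐ins, the dimensions `d`, `m`, `r` are arbitrary, the search in (A8) is restricted to rank‐$r$ orthogonal projectors, and the helper names `residual` and `random_projector` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, r = 3, 6, 2  # output dimension, input dimension, reduced dimension (arbitrary)

# Random symmetric positive definite metric R_V and its spectral decomposition (A1).
A = rng.standard_normal((d, d))
R_V = A @ A.T + d * np.eye(d)
beta, W = np.linalg.eigh(R_V)  # columns of W are the eigenvectors w_i

# (A3): the R_V-weighted squared norm splits into a weighted sum over the w_i.
e = rng.standard_normal(d)
lhs = e @ R_V @ e
rhs = np.sum(beta * (W.T @ e) ** 2)

# Random symmetric positive semidefinite stand-in for the matrix H.
B = rng.standard_normal((m, m))
H = B @ B.T
eigvals, eigvecs = np.linalg.eigh(H)  # eigenvalues in ascending order

def residual(P, H):
    """Objective of (A8): tr((I - P^T) H (I - P))."""
    I = np.eye(H.shape[0])
    return np.trace((I - P.T) @ H @ (I - P))

def random_projector(m, r, rng):
    """Orthogonal projector onto a random r-dimensional subspace of R^m."""
    Q, _ = np.linalg.qr(rng.standard_normal((m, r)))
    return Q @ Q.T

# Projector onto the r leading eigenvectors of H.
U_r = eigvecs[:, -r:]
best = residual(U_r @ U_r.T, H)

# No random rank-r projector should do better than the leading eigenspace.
others = [residual(random_projector(m, r, rng), H) for _ in range(200)]

print(np.isclose(lhs, rhs))
print(np.isclose(best, eigvals[: m - r].sum()))
print(min(others) >= best)
```

The minimum of the objective equals the sum of the $m - r$ smallest eigenvalues of $H$, which is the quantity that appears in bounds of this type.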

Romor F, Tezzele M, Lario A, Rozza G. Kernel‐based active subspaces with application to computational fluid dynamics parametric problems using the discontinuous Galerkin method. Int J Numer Methods Eng. 2022;123(23):6000–6027. doi: 10.1002/nme.7099

Funding information H2020 European Research Council

ENDNOTES

Some authors refer to the active subspaces method; we prefer to employ the term active subspaces property, as suggested by Constantine.

^† In this work, $ℳ (m \times n)$ denotes the set of real matrices with $m$ rows and $n$ columns.

^‡ The dataset was taken from https://github.com/paulcon/as-data-sets.

DATA AVAILABILITY STATEMENT

The data that support the findings of this study are available from the corresponding author upon reasonable request.

REFERENCES

1. Brunton SL, Kutz JN. Data‐Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control. Cambridge University Press; 2019.
2. Rozza G, Malik MH, Demo N, et al. Advances in reduced order methods for parametric industrial problems in computational fluid dynamics. In: Owen R, Borst R, Reese J, Chris P, eds. ECCOMAS ECFD 7 ‐ Proceedings of 6th European Conference on Computational Mechanics (ECCM 6) and 7th European Conference on Computational Fluid Dynamics (ECFD 7). IEEE; 2018:59‐76.
3. Salmoiraghi F, Ballarin F, Corsi G, Mola A, Tezzele M, Rozza G. Advances in geometrical parametrization and reduced order models and methods for computational fluid dynamics problems in applied sciences and engineering: overview and perspectives. ECCOMAS Congress 2016 ‐ Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering; Vol. 1, 2016:1013‐1031. doi: 10.7712/100016.1867.8680
4. Li KC. Sliced inverse regression for dimension reduction. J Am Stat Assoc. 1991;86(414):316‐327.
5. Cook RD, Ni L. Sufficient dimension reduction via inverse regression: a minimum discrepancy approach. J Am Stat Assoc. 2005;100(470):410‐428.
6. Li L. Sparse sufficient dimension reduction. Biometrika. 2007;94(3):603‐613.
7. Wu Q, Mukherjee S, Liang F. Localized sliced inverse regression. Advances in Neural Information Processing Systems. Curran Associates; 2009:1785‐1792.
8. Russi TM. Uncertainty Quantification with Experimental Data and Complex System Models. PhD thesis. UC Berkeley, 2010.
9. Constantine PG. Active Subspaces: Emerging Ideas for Dimension Reduction in Parameter Studies. Volume 2 of SIAM Spotlights. SIAM; 2015.
10. Constantine PG, Diaz P. Global sensitivity metrics from active subspaces. Reliab Eng Syst Saf. 2017;162:1‐13.
11. Zahm O, Constantine PG, Prieur C, Marzouk YM. Gradient‐based dimension reduction of multivariate vector‐valued functions. SIAM J Sci Comput. 2020;42(1):A534‐A558. doi: 10.1137/18M1221837
12. Constantine PG, Emory M, Larsson J, Iaccarino G. Exploiting active subspaces to quantify uncertainty in the numerical simulation of the HyShot II scramjet. J Comput Phys. 2015;302:1‐20.
13. Jefferson JL, Gilbert JM, Constantine PG, Maxwell RM. Active subspaces for sensitivity analysis and dimension reduction of an integrated hydrologic model. Comput Geosci. 2015;83:127‐138.
14. Hesthaven JS, Rozza G, Stamm B. Certified Reduced Basis Methods for Parametrized Partial Differential Equations. Springer; 2016.
15. Quarteroni A, Rozza G. Reduced Order Methods for Modeling and Computational Reduction. Springer; 2014:9.
16. Rozza G, Hess M, Stabile G, Tezzele M, Ballarin F. Basic ideas and tools for projection‐based model reduction of parametric partial differential equations. Model Order Reduction. Vol 2. De Gruyter; 2020:1‐47.
17. Tezzele M, Ballarin F, Rozza G. Combined parameter and model reduction of cardiovascular problems by means of active subspaces and POD‐Galerkin methods. Mathematical and Numerical Modeling of the Cardiovascular System and Applications. Vol 16. Springer International Publishing; 2018:185‐207.
18. Guo M, Hesthaven JS. Reduced order modeling for nonlinear structural analysis using Gaussian process regression. Comput Methods Appl Mech Eng. 2018;341:807‐826.
19. Tezzele M, Salmoiraghi F, Mola A, Rozza G. Dimension reduction in heterogeneous parametric spaces with application to naval engineering shape design problems. Adv Model Simul Eng Sci. 2018;5(1):25. doi: 10.1186/s40323-018-0118-3
20. Tezzele M, Demo N, Mola A, Rozza G. An integrated data‐driven computational pipeline with model order reduction for industrial and applied mathematics. In: Günther M, Schilders W, eds. Novel Mathematics Inspired by Industrial Challenges. Mathematics in Industry. Vol 38. Springer International Publishing; 2022.
21. Tezzele M, Demo N, Rozza G. Shape optimization through proper orthogonal decomposition with interpolation and dynamic mode decomposition enhanced by active subspaces. Proceedings of MARINE 2019: VIII International Conference on Computational Methods in Marine Engineering; 2019:122‐133.
22. Tezzele M, Demo N, Gadalla M, Mola A, Rozza G. Model order reduction by means of active subspaces and dynamic mode decomposition for parametric hull shape design hydrodynamics. Proceedings of the Technology and Science for the Ships of the Future: Proceedings of NAV 2018: 19th International Conference on Ship & Maritime Research; 2018:569‐576; IOS Press.
23. Demo N, Tezzele M, Rozza G. A non‐intrusive approach for reconstruction of POD modal coefficients through active subspaces. Comptes Rendus Mécanique de l'Académie des Sciences, DataBEST 2019 Special Issue. 2019;347(11):873‐881. doi: 10.1016/j.crme.2019.11.012
24. Tezzele M, Fabris L, Sidari M, Sicchiero M, Rozza G. A multi‐fidelity approach coupling parameter space reduction and non‐intrusive POD with application to structural optimization of passenger ship hulls. arXiv preprint arXiv:2206.01243 Submitted; 2022.
25. Tezzele M, Demo N, Stabile G, Mola A, Rozza G. Enhancing CFD predictions in shape design problems by model and parameter space reduction. Adv Model Simul Eng Sci. 2020;7(40). doi: 10.1186/s40323-020-00177-y
26. Romor F, Tezzele M, Mrosek M, Othmer C, Rozza G. Multi‐fidelity data fusion through parameter space reduction with applications to automotive engineering. arXiv preprint arXiv:2110.14396 Submitted; 2021.
27. Seshadri P, Shahpar S, Constantine P, Parks G, Adams M. Turbomachinery active subspace performance maps. J Turbomach. 2018;140(4):041003. doi: 10.1115/1.4038839
28. Ji W, Ren Z, Marzouk Y, Law CK. Quantifying kinetic uncertainty in turbulent combustion simulations using active subspaces. Proc Combust Inst. 2019;37(2):2175‐2182. doi: 10.1016/j.proci.2018.06.206
29. Vohra M, Alexanderian A, Guy H, Mahadevan S. Active subspace‐based dimension reduction for chemical kinetics applications with epistemic uncertainty. Combust Flame. 2019;204:152‐161. doi: 10.1016/j.combustflame.2019.03.006
30. Lukaczyk TW, Constantine P, Palacios F, Alonso JJ. Active subspaces for shape optimization. Proceedings of the 10th AIAA multidisciplinary design optimization conference; 2014:1171.
31. Lukaczyk TW. Surrogate Modeling and Active Subspaces for Efficient Optimization of Supersonic Aircraft. PhD thesis. Stanford University, 2015.
32. Tripathy R, Bilionis I, Gonzalez M. Gaussian processes with built‐in dimensionality reduction: applications to high‐dimensional uncertainty propagation. J Comput Phys. 2016;321:191‐223. doi: 10.1016/j.jcp.2016.05.039
33. Ghoreishi SF, Friedman S, Allaire DL. Adaptive dimensionality reduction for fast sequential optimization with Gaussian processes. J Mech Des. 2019;141(7):071404.
34. Demo N, Tezzele M, Rozza G. A supervised learning approach involving active subspaces for an efficient genetic algorithm in high‐dimensional optimization problems. SIAM J Sci Comput. 2021;43(3):B831‐B853. doi: 10.1137/20M1345219
35. Demo N, Tezzele M, Mola A, Rozza G. Hull shape design optimization with parameter space and model reductions, and self‐learning mesh morphing. J Marine Sci Eng. 2021;9(2):185. doi: 10.3390/jmse9020185
36. Cui C, Zhang K, Daulbaev T, Gusak J, Oseledets I, Zhang Z. Active subspace of neural networks: structural analysis and universal attacks. SIAM J Math Data Sci. 2020;2(4):1096‐1122. doi: 10.1137/19M1296070
37. Meneghetti L, Demo N, Rozza G. A dimensionality reduction approach for convolutional neural networks. arXiv preprint arXiv:2110.09163 Submitted; 2021.
38. Romor F, Tezzele M, Rozza G. A local approach to parameter space reduction for regression and classification tasks. arXiv preprint arXiv:2107.10867 Submitted; 2021.
39. Bridges RA, Gruber AD, Felder C, Verma ME, Hoff C. Active manifolds: a non‐linear analogue to active subspaces. Proceedings of International Conference on Machine Learning; 2019.
40. Ji W, Wang J, Zahm O, et al. Shared low‐dimensional subspaces for propagating kinetic uncertainty to multiple outputs. Combust Flame. 2018;190:146‐157.
41. Aguiar IP. Dynamic Active Subspaces: A Data‐Driven Approach to Computing Time‐Dependent Active Subspaces in Dynamical Systems. Master's thesis. University of Colorado Boulder; 2018.
42. Zhang G, Zhang J, Hinkle J. Learning nonlinear level sets for dimensionality reduction in function approximation. Advances in Neural Information Processing Systems. Curran Associates; 2019:13199‐13208.
43. Sriperumbudur B, Sterge N. Approximate kernel PCA using random features: computational vs. statistical trade‐off. arXiv preprint arXiv:1706.06296, 2017.
44. Barshan E, Ghodsi A, Azimifar Z, Jahromi MZ. Supervised principal component analysis: visualization, classification and regression on subspaces and submanifolds. Pattern Recogn. 2011;44(7):1357‐1371.
45. Héas P, Herzet C, Combes B. Generalized kernel‐based dynamic mode decomposition. Proceedings of the ICASSP 2020‐2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2020:3877‐3881; IEEE.
46. Kevrekidis I, Rowley C, Williams M. A kernel‐based method for data‐driven Koopman spectral analysis. J Comput Dyn. 2015;2(2):247‐265.
47. Mika S, Schölkopf B, Smola AJ, Müller KR, Scholz M, Rätsch G. Kernel PCA and de‐noising in feature spaces. Advances in Neural Information Processing Systems. MIT Press; 1999:536‐542.
48. Palaci‐Olgun M. Gaussian Process Modeling and Supervised Dimensionality Reduction Algorithms via Stiefel Manifold Learning. Master's thesis. University of Toronto Institute of Aerospace Studies. 2018.
49. Rahimi A, Recht B. Random features for large‐scale kernel machines. Advances in Neural Information Processing Systems. Curran Associates; 2008:1177‐1184.
50. Li Z, Ton JF, Oglic D, Sejdinovic D. Towards a unified analysis of random Fourier features. Proceedings of Machine Learning Research. Vol 97. ICML; 2019:3905‐3914.
51. Williams CK, Rasmussen CE. Gaussian Processes for Machine Learning. Adaptive Computation and Machine Learning series. MIT Press; 2006.
52. Hesthaven JS, Warburton T. Nodal Discontinuous Galerkin Methods: Algorithms, Analysis, and Applications. Springer Science & Business Media; 2007.
53. Parente MT, Wallin J, Wohlmuth B. Generalized bounds for active subspaces. Electron J Stat. 2020;14(1):917‐943.
54. Bobrowski A. Functional Analysis for Probability and Stochastic Processes: An Introduction. Cambridge University Press; 2005.
55. Constantine PG, Dow E, Wang Q. Active subspace methods in theory and practice: applications to kriging surfaces. SIAM J Sci Comput. 2014;36(4):A1500‐A1524.
56. Warner FW. Foundations of Differentiable Manifolds and Lie Groups. 4th ed. Springer Science & Business Media; 1983.
57. Romor F, Tezzele M, Rozza G. ATHENA: advanced techniques for high dimensional parameter spaces to enhance numerical analysis. Software Impacts. 2021;10:100133. doi: 10.1016/j.simpa.2021.100133
58. Berlinet A, Thomas‐Agnan C. Reproducing Kernel Hilbert Spaces in Probability and Statistics. Springer Science & Business Media; 2011.
59. Tripathy R, Bilionis I. Deep active subspaces: a scalable method for high‐dimensional uncertainty propagation. Proceedings of the ASME 2019 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. American Society of Mechanical Engineers Digital Collection; 2019.
60. Virtanen P, Gommers R, Oliphant TE, et al. SciPy 1.0: fundamental algorithms for scientific computing in python. Nat Methods. 2020;17:261‐272. doi: 10.1038/s41592-019-0686-2
61. GPyOpt: a Bayesian optimization framework in python; 2016. http://github.com/SheffieldML/GPyOpt
62. Mohri M, Rostamizadeh A, Talwalkar A. Foundations of Machine Learning. MIT Press; 2018.
63. Schölkopf B, Smola AJ, Bach F. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press; 2002.
64. Schölkopf B, Smola A, Müller KR. Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput. 1998;10(5):1299‐1319.
65. GPy. GPy: a Gaussian process framework in Python; 2012. http://github.com/SheffieldML/GPy
66. Diaz P, Constantine P, Kalmbach K, Jones E, Pankavich S. A modified SEIR model for the spread of Ebola in Western Africa and metrics for resource allocation. Appl Math Comput. 2018;324:141‐155.
67. HopeFOAM extension of OpenFOAM; 2017. https://github.com/HopeFOAM/HopeFOAM
68. https://openfoam.org/
69. Weller HG, Tabor G, Jasak H, Fureby C. A tensorial approach to computational continuum mechanics using object‐oriented techniques. Comput Phys. 1998;12(6):620‐631. doi: 10.1063/1.168744
70. Zahr MJ, Persson PO. An adjoint method for a high‐order discretization of deforming domain conservation laws for optimization of flow problems. J Comput Phys. 2016;326:516‐543.
71. Nguyen NC, Peraire J, Cockburn B. An implicit high‐order hybridizable discontinuous Galerkin method for linear convection–diffusion equations. J Comput Phys. 2009;228(9):3232‐3254.
72. Pazner W, Persson PO. Stage‐parallel fully implicit Runge–Kutta solvers for discontinuous Galerkin fluid simulations. J Comput Phys. 2017;335:700‐717. doi: 10.1016/j.jcp.2017.01.050
73. Gmsh. A three‐dimensional finite element mesh generator with built‐in pre‐ and post‐processing facilities. http://gmsh.info/
74. Gazdag J. Time‐differencing schemes and transform methods. J Comput Phys. 1976;20(2):196‐207.
75. Arnold DN. An interior penalty finite element method with discontinuous elements. SIAM J Numer Anal. 1982;19(4):742‐760. doi: 10.1137/0719052
76. Shahbazi K. An explicit expression for the penalty parameter of the interior penalty method. J Comput Phys. 2005;205(2):401‐407. doi: 10.1016/j.jcp.2004.11.017
77. Kundu PK, Cohen IM, Dowling DR. Fluid Mechanics. 5th ed. Academic Press; 2012.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

[nme7099-bib-0001] 1. Brunton SL, Kutz JN. Data‐Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control. Cambridge University Press; 2019. [Google Scholar]

[nme7099-bib-0002] 2. Rozza G, Malik MH, Demo N, et al. Advances in reduced order methods for parametric industrial problems in computational fluid dynamics. In: Owen R, Borst R, Reese J, Chris P, eds. ECCOMAS ECFD 7 ‐ Proceedings of 6th European Conference on Computational Mechanics (ECCM 6) and 7th European Conference on Computational Fluid Dynamics (ECFD 7). IEEE; 2018:59‐76. [Google Scholar]

[nme7099-bib-0003] 3. Salmoiraghi F, Ballarin F, Corsi G, Mola A, Tezzele M, Rozza G. Advances in geometrical parametrization and reduced order models and methods for computational fluid dynamics problems in applied sciences and engineering: overview and perspectives. ECCOMAS Congress 2016 ‐ Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering; Vol. 1, 2016:1013‐1031. doi: 10.7712/100016.1867.8680 [DOI]

[nme7099-bib-0004] 4. Li KC. Sliced inverse regression for dimension reduction. J Am Stat Assoc. 1991;86(414):316‐327. [Google Scholar]

[nme7099-bib-0005] 5. Cook RD, Ni L. Sufficient dimension reduction via inverse regression: a minimum discrepancy approach. J Am Stat Assoc. 2005;100(470):410‐428. [Google Scholar]

[nme7099-bib-0006] 6. Li L. Sparse sufficient dimension reduction. Biometrika. 2007;94(3):603‐613. [Google Scholar]

[nme7099-bib-0007] 7. Wu Q, Mukherjee S, Liang F. Localized sliced inverse regression. Advances in Neural Information Processing Systems. Curran Associates; 2009:1785‐1792. [Google Scholar]

[nme7099-bib-0008] 8. Russi TM. Uncertainty Quantification with Experimental Data and Complex System Models. PhD thesis. UC Berkeley, 2010.

[nme7099-bib-0009] 9. Constantine PG. Active Subspaces: Emerging Ideas for Dimension Reduction in Parameter Studies. Volume 2 of SIAM Spotlights. SIAM; 2015. [Google Scholar]

[nme7099-bib-0010] 10. Constantine PG, Diaz P. Global sensitivity metrics from active subspaces. Reliab Eng Syst Saf. 2017;162:1‐13. [Google Scholar]

[nme7099-bib-0011] 11. Zahm O, Constantine PG, Prieur C, Marzouk YM. Gradient‐based dimension reduction of multivariate vector‐valued functions. SIAM J Sci Comput. 2020;42(1):A534‐A558. doi: 10.1137/18M1221837 [DOI] [Google Scholar]

[nme7099-bib-0012] 12. Constantine PG, Emory M, Larsson J, Iaccarino G. Exploiting active subspaces to quantify uncertainty in the numerical simulation of the HyShot II scramjet. J Comput Phys. 2015;302:1‐20. [Google Scholar]

[nme7099-bib-0013] 13. Jefferson JL, Gilbert JM, Constantine PG, Maxwell RM. Active subspaces for sensitivity analysis and dimension reduction of an integrated hydrologic model. Comput Geosci. 2015;83:127‐138. [Google Scholar]

[nme7099-bib-0014] 14. Hesthaven JS, Rozza G, Stamm B. Certified Reduced Basis Methods for Parametrized Partial Differential Equations. Springer; 2016. [Google Scholar]

[nme7099-bib-0015] 15. Quarteroni A, Rozza G. Reduced Order Methods for Modeling and Computational Reduction. Springer; 2014:9. [Google Scholar]

[nme7099-bib-0016] 16. Rozza G, Hess M, Stabile G, Tezzele M, Ballarin F. Basic ideas and tools for projection‐based model reduction of parametric partial differential equations. Model Order Reduction. Vol 2. De Gruyter; 2020:1‐47. [Google Scholar]

[nme7099-bib-0017] 17. Tezzele M, Ballarin F, Rozza G. Combined parameter and model reduction of cardiovascular problems by means of active subspaces and POD‐Galerkin methods. Mathematical and Numerical Modeling of the Cardiovascular System and Applications. Vol 16. Springer International Publishing; 2018:185‐207. [Google Scholar]

[nme7099-bib-0018] 18. Guo M, Hesthaven JS. Reduced order modeling for nonlinear structural analysis using Gaussian process regression. Comput Methods Appl Mech Eng. 2018;341:807‐826. [Google Scholar]

[nme7099-bib-0019] 19. Tezzele M, Salmoiraghi F, Mola A, Rozza G. Dimension reduction in heterogeneous parametric spaces with application to naval engineering shape design problems. Adv Model Simul Eng Sci. 2018;5(1):25. doi: 10.1186/s40323-018-0118-3 [DOI] [PMC free article] [PubMed] [Google Scholar]

[nme7099-bib-0020] 20. Tezzele M, Demo N, Mola A, Rozza G. An integrated data‐driven computational pipeline with model order reduction for industrial and applied mathematics. In: Günther M, Schilders W, eds. Novel Mathematics Inspired by Industrial Challenges. Mathematics in Industry. Vol 38. Springer International Publishing; 2022. [Google Scholar]

[nme7099-bib-0021] 21. Tezzele M, Demo N, Rozza G. Shape optimization through proper orthogonal decomposition with interpolation and dynamic mode decomposition enhanced by active subspaces. Proceedings of MARINE 2019: VIII International Conference on Computational Methods in Marine Engineering; 2019:122‐133.

[nme7099-bib-0022] 22. Tezzele M, Demo N, Gadalla M, Mola A, Rozza G. Model order reduction by means of active subspaces and dynamic mode decomposition for parametric hull shape design hydrodynamics. Proceedings of the Technology and Science for the Ships of the Future: Proceedings of NAV 2018: 19th International Conference on Ship & Maritime Research; 2018:569‐576; IOS Press.

[nme7099-bib-0023] 23. Demo N, Tezzele M, Rozza G. A non‐intrusive approach for reconstruction of POD modal coefficients through active subspaces. Comptes Rendus Mécanique de l'Académie des Sciences, DataBEST 2019 Special Issue. 2019;347(11):873‐881. doi: 10.1016/j.crme.2019.11.012 [DOI] [Google Scholar]

[nme7099-bib-0024] 24. Tezzele M, Fabris L, Sidari M, Sicchiero M, Rozza G. A multi‐fidelity approach coupling parameter space reduction and non‐intrusive POD with application to structural optimization of passenger ship hulls. arXiv preprint arXiv:2206.01243 Submitted; 2022. [DOI] [PMC free article] [PubMed]

[nme7099-bib-0025] 25. Tezzele M, Demo N, Stabile G, Mola A, Rozza G. Enhancing CFD predictions in shape design problems by model and parameter space reduction. Adv Model Simul Eng Sci. 2020;7(40). doi: 10.1186/s40323-020-00177-y [DOI] [Google Scholar]

26. Romor F, Tezzele M, Mrosek M, Othmer C, Rozza G. Multi‐fidelity data fusion through parameter space reduction with applications to automotive engineering. arXiv preprint arXiv:2110.14396. Submitted; 2021.

27. Seshadri P, Shahpar S, Constantine P, Parks G, Adams M. Turbomachinery active subspace performance maps. J Turbomach. 2018;140(4):041003. doi: 10.1115/1.4038839

28. Ji W, Ren Z, Marzouk Y, Law CK. Quantifying kinetic uncertainty in turbulent combustion simulations using active subspaces. Proc Combust Inst. 2019;37(2):2175‐2182. doi: 10.1016/j.proci.2018.06.206

29. Vohra M, Alexanderian A, Guy H, Mahadevan S. Active subspace‐based dimension reduction for chemical kinetics applications with epistemic uncertainty. Combust Flame. 2019;204:152‐161. doi: 10.1016/j.combustflame.2019.03.006

30. Lukaczyk TW, Constantine P, Palacios F, Alonso JJ. Active subspaces for shape optimization. Proceedings of the 10th AIAA multidisciplinary design optimization conference; 2014:1171.

31. Lukaczyk TW. Surrogate Modeling and Active Subspaces for Efficient Optimization of Supersonic Aircraft. PhD thesis. Stanford University; 2015.

32. Tripathy R, Bilionis I, Gonzalez M. Gaussian processes with built‐in dimensionality reduction: applications to high‐dimensional uncertainty propagation. J Comput Phys. 2016;321:191‐223. doi: 10.1016/j.jcp.2016.05.039

33. Ghoreishi SF, Friedman S, Allaire DL. Adaptive dimensionality reduction for fast sequential optimization with Gaussian processes. J Mech Des. 2019;141(7):071404.

34. Demo N, Tezzele M, Rozza G. A supervised learning approach involving active subspaces for an efficient genetic algorithm in high‐dimensional optimization problems. SIAM J Sci Comput. 2021;43(3):B831‐B853. doi: 10.1137/20M1345219

35. Demo N, Tezzele M, Mola A, Rozza G. Hull shape design optimization with parameter space and model reductions, and self‐learning mesh morphing. J Marine Sci Eng. 2021;9(2):185. doi: 10.3390/jmse9020185

36. Cui C, Zhang K, Daulbaev T, Gusak J, Oseledets I, Zhang Z. Active subspace of neural networks: structural analysis and universal attacks. SIAM J Math Data Sci. 2020;2(4):1096‐1122. doi: 10.1137/19M1296070

37. Meneghetti L, Demo N, Rozza G. A dimensionality reduction approach for convolutional neural networks. arXiv preprint arXiv:2110.09163. Submitted; 2021.

38. Romor F, Tezzele M, Rozza G. A local approach to parameter space reduction for regression and classification tasks. arXiv preprint arXiv:2107.10867. Submitted; 2021.

39. Bridges RA, Gruber AD, Felder C, Verma ME, Hoff C. Active manifolds: a non‐linear analogue to active subspaces. Proceedings of International Conference on Machine Learning; 2019.

40. Ji W, Wang J, Zahm O, et al. Shared low‐dimensional subspaces for propagating kinetic uncertainty to multiple outputs. Combust Flame. 2018;190:146‐157.

41. Aguiar IP. Dynamic Active Subspaces: A Data‐Driven Approach to Computing Time‐Dependent Active Subspaces in Dynamical Systems. Master's thesis. University of Colorado Boulder; 2018.

42. Zhang G, Zhang J, Hinkle J. Learning nonlinear level sets for dimensionality reduction in function approximation. Advances in Neural Information Processing Systems. Curran Associates; 2019:13199‐13208.

43. Sriperumbudur B, Sterge N. Approximate kernel PCA using random features: computational vs. statistical trade‐off. arXiv preprint arXiv:1706.06296; 2017.

44. Barshan E, Ghodsi A, Azimifar Z, Jahromi MZ. Supervised principal component analysis: visualization, classification and regression on subspaces and submanifolds. Pattern Recogn. 2011;44(7):1357‐1371.

45. Héas P, Herzet C, Combes B. Generalized kernel‐based dynamic mode decomposition. Proceedings of the ICASSP 2020‐2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2020:3877‐3881; IEEE.

[nme7099-bib-0046] 46. Kevrekidis I, Rowley C, Williams M. A kernel‐based method for data‐driven Koopman spectral analysis. J Comput Dyn. 2015;2(2):247‐265. [Google Scholar]

[nme7099-bib-0047] 47. Mika S, Schölkopf B, Smola AJ, Müller KR, Scholz M, Rätsch G. Kernel PCA and de‐noising in feature spaces. Advances in Neural Information Processing Systems. MIT Press; 1999:536‐542. [Google Scholar]

[nme7099-bib-0048] 48. Palaci‐Olgun M. Gaussian Process Modeling and Supervised Dimensionality Reduction Algorithms via Stiefel Manifold Learning. Master's thesis. University of Toronto Institute of Aerospace Studies. 2018.

[nme7099-bib-0049] 49. Rahimi A, Recht B. Random features for large‐scale kernel machines. Advances in Neural Information Processing Systems. Curran Associates; 2008:1177‐1184. [Google Scholar]

[nme7099-bib-0050] 50. Li Z, Ton JF, Oglic D, Sejdinovic D. Towards a unified analysis of random Fourier features. Proceedings of Machine Learning Research. Vol 97. ICML; 2019:3905‐3914. [Google Scholar]

[nme7099-bib-0051] 51. Williams CK, Rasmussen CE. Gaussian Processes for Machine Learning. Adaptive Computation and Machine Learning series. MIT Press; 2006. [Google Scholar]

[nme7099-bib-0052] 52. Hesthaven JS, Warburton T. Nodal Discontinuous Galerkin Methods: Algorithms, Analysis, and Applications. Springer Science & Business Media; 2007. [Google Scholar]

[nme7099-bib-0053] 53. Parente MT, Wallin J, Wohlmuth B. Generalized bounds for active subspaces. Electron J Stat. 2020;14(1):917‐943. [Google Scholar]

[nme7099-bib-0054] 54. Bobrowski A. Functional Analysis for Probability and Stochastic Processes: An Introduction. Cambridge University Press; 2005. [Google Scholar]

[nme7099-bib-0055] 55. Constantine PG, Dow E, Wang Q. Active subspace methods in theory and practice: applications to kriging surfaces. SIAM J Sci Comput. 2014;36(4):A1500‐A1524. [Google Scholar]

56. Warner FW. Foundations of Differentiable Manifolds and Lie Groups. 4th ed. Springer Science & Business Media; 1983.

57. Romor F, Tezzele M, Rozza G. ATHENA: advanced techniques for high dimensional parameter spaces to enhance numerical analysis. Software Impacts. 2021;10:100133. doi: 10.1016/j.simpa.2021.100133

58. Berlinet A, Thomas‐Agnan C. Reproducing Kernel Hilbert Spaces in Probability and Statistics. Springer Science & Business Media; 2011.

59. Tripathy R, Bilionis I. Deep active subspaces: a scalable method for high‐dimensional uncertainty propagation. Proceedings of the ASME 2019 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. American Society of Mechanical Engineers Digital Collection; 2019.

60. Virtanen P, Gommers R, Oliphant TE, et al. SciPy 1.0: fundamental algorithms for scientific computing in python. Nat Methods. 2020;17:261‐272. doi: 10.1038/s41592-019-0686-2

61. GPyOpt: a Bayesian optimization framework in python; 2016. http://github.com/SheffieldML/GPyOpt

62. Mohri M, Rostamizadeh A, Talwalkar A. Foundations of Machine Learning. MIT Press; 2018.

63. Schölkopf B, Smola AJ, Bach F. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press; 2002.

64. Schölkopf B, Smola A, Müller KR. Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput. 1998;10(5):1299‐1319.

65. GPy. GPy: a Gaussian process framework in Python; 2012. http://github.com/SheffieldML/GPy

66. Diaz P, Constantine P, Kalmbach K, Jones E, Pankavich S. A modified SEIR model for the spread of Ebola in Western Africa and metrics for resource allocation. Appl Math Comput. 2018;324:141‐155.

67. HopeFOAM extension of OpenFOAM; 2017. https://github.com/HopeFOAM/HopeFOAM

68. https://openfoam.org/

69. Weller HG, Tabor G, Jasak H, Fureby C. A tensorial approach to computational continuum mechanics using object‐oriented techniques. Comput Phys. 1998;12(6):620‐631. doi: 10.1063/1.168744

70. Zahr MJ, Persson PO. An adjoint method for a high‐order discretization of deforming domain conservation laws for optimization of flow problems. J Comput Phys. 2016;326:516‐543.

71. Nguyen NC, Peraire J, Cockburn B. An implicit high‐order hybridizable discontinuous Galerkin method for linear convection–diffusion equations. J Comput Phys. 2009;228(9):3232‐3254.

72. Pazner W, Persson PO. Stage‐parallel fully implicit Runge–Kutta solvers for discontinuous Galerkin fluid simulations. J Comput Phys. 2017;335:700‐717. doi: 10.1016/j.jcp.2017.01.050

73. Gmsh. A three‐dimensional finite element mesh generator with built‐in pre‐ and post‐processing facilities. http://gmsh.info/

74. Gazdag J. Time‐differencing schemes and transform methods. J Comput Phys. 1976;20(2):196‐207.

75. Arnold DN. An interior penalty finite element method with discontinuous elements. SIAM J Numer Anal. 1982;19(4):742‐760. doi: 10.1137/0719052

76. Shahbazi K. An explicit expression for the penalty parameter of the interior penalty method. J Comput Phys. 2005;205(2):401‐407. doi: 10.1016/j.jcp.2004.11.017

77. Kundu PK, Cohen IM, Dowling DR. Fluid Mechanics. 5th ed. Academic Press; 2012.
