Abstract
We propose an iterative method to evaluate the feedback control kernel of a chaotic system directly from the system’s attractor. Such kernels are currently computed using standard linear optimal control theory, known as linear quadratic regulator theory. This is however applicable only to linear systems, which are obtained by linearizing the system governing equations around a target state. In the present paper, we employ the preconditioned multiple shooting shadowing (PMSS) algorithm to compute the kernel directly from the nonlinear dynamics, thereby bypassing the linear approximation. Using the adjoint version of the PMSS algorithm, we show that we can compute the kernel at any point of the domain in a single computation. The algorithm replaces the standard adjoint equation (that is ill-conditioned for chaotic systems) with a well-conditioned adjoint, producing reliable sensitivities which are used to evaluate the feedback matrix elements. We apply the idea to the Kuramoto–Sivashinsky equation. We compare the computed kernel with that produced by the standard linear quadratic regulator algorithm and note similarities and differences. Both kernels are stabilizing, have compact support and similar shape. We explain the shape using two-point spatial correlations that capture the streaky structure of the solution of the uncontrolled system.
Keywords: feedback control, chaos, shadowing, Kuramoto–Sivashinsky equation
1. Introduction
Control of chaotic systems is a very active field of research. The objective is to find the actuation (control input) that will make the system meet a desired objective. Common examples, from the area of turbulence, include drag reduction, transition delay, suppression of recirculation zones on the suction side of aerofoils, etc. The control problem is typically formulated mathematically as an optimization problem (usually the minimization of a functional) and is solved using standard techniques from the calculus of variations [1]. The solution method is iterative and relies on forward (in time) integration of the governing equations followed by backward (reverse in time) integration of the adjoint equations (see [2] for an application to control of transitional flow). The adjoint equations are a set of linearized equations around the current trajectory of the system in phase space. They are obtained when applying the optimality conditions to the augmented functional and using integration by parts (refer to [3] for more details). From the adjoint variables, the sensitivity of the functional to the control input(s) is obtained, which is used to update the latter in the next iteration.
For linear time-invariant systems with a quadratic control objective, application of the aforementioned calculus of variations technique results in a set of three equations that can be reduced to just one Riccati equation for a matrix P using the sweep method [4]. This matrix relates linearly the adjoint and state variables of the system. From P, the feedback matrix K, which relates the optimal input to the states of the system, can be easily obtained. This is a very well-developed theory, known as linear quadratic regulator (LQR) optimal control theory [5]. It has been extended to handle noisy measurements and unknown system dynamics, modelled either as white Gaussian noise or as maximally malevolent; these are known as the LQG and H∞ (or robust) approaches, respectively [5]. In the area of flow control, all approaches have been applied successfully to suppress disturbances described by the linearized Navier–Stokes equations in a variety of flow settings [6–9]. Application of linear controllers is also effective for systems with energy-conserving nonlinearities, for example, channel flows. For such cases, in the energy equation, the integral of the nonlinear terms over the domain vanishes. The underlying physical meaning is that the role of the nonlinearity is to transfer energy from larger to smaller scales (a process known as the energy cascade) but not to generate energy. The nonlinear terms are therefore passive, and this has motivated the application of passivity-based optimal linear control to suppress turbulence in channel flows at low Reynolds number, Reτ = 100 (see [10,11]).
The wider application of model-based, linear feedback control strategies to fluid problems is however restricted by the very high system dimension, N, which makes the solution of the Riccati equation to extract P intractable (the cost scales as O(N3)). This problem arises for complex geometries that do not possess one or more homogeneous directions. In such cases, the standard practice is to derive a reduced-order model for the uncontrolled flow, synthesize the controller based on this model, and then apply it to the full system (see [9] for application of this approach to suppress linear three-dimensional disturbances in a boundary layer flow). This approach however is not optimal, as the reduced-order model captures the dynamics of the uncontrolled flow (open loop) and not that of the controlled flow (closed loop).
A few attempts have been made to bypass the solution of the Riccati equation, and therefore the need for a reduced-order model. These approaches compute directly the elements of the feedback matrix K using iterative methods that rely on the integration of the governing and adjoint equations in a forward/backward loop, exactly as described in the beginning of this section. They scale to large N and hence are suitable for systems arising from the discretization of the Navier–Stokes equations in complex domains. In [12,13], the authors have proposed a method that can provide the elements of the matrix row-by-row. It is based on the repeated iterative computation of the adjoint of a forward problem, the latter defined for the direct-adjoint vector pair associated with the LQR problem. The method was extended to the estimation problem in [14] and applied successfully to a two-dimensional boundary layer, while in [15] the method was extended to robust (H∞) control. In [16,17], all matrix elements are computed directly. The cost function of the LQR problem is written in terms of K and the latter is computed iteratively using forward/backward marching of the governing and adjoint equations.
All the above approaches have been applied to linear, time-invariant systems. In the present paper, we use instead the nonlinear governing equations as constraints to compute K directly. We extract this matrix from the system attractor and do not apply any model reduction (an area which is far less well developed for nonlinear systems compared to linear ones). However, the application of the iterative approach to extract K is not straightforward. The reason is that for chaotic systems (which are always nonlinear), during the backward integration step, the adjoint variables grow exponentially at a rate e^{λmax t}, where λmax is the maximum Lyapunov exponent of the system. A different methodology is therefore required to compute the adjoint variables that quantify the sensitivity of the cost function to the elements of K.
Different approaches have been proposed to deal with this problem [18,19]. The least-squares shadowing (LSS) method [18] is the most promising among them. The method finds a nearby trajectory u′(τ) that shadows the reference trajectory uref(t) of the chaotic system. The difference |u′(τ) − uref(t)| remains bounded in time, so accurate sensitivities can be computed. The original LSS algorithm is computationally expensive for large systems, and the multiple shooting shadowing (MSS) [20] and the non-intrusive [21] variants are both faster and less memory demanding. A preconditioner was recently proposed by the authors to accelerate the convergence rate of the MSS variant [22]. When the sensitivity of a given objective to a large number of control variables is required, as in our case, adjoint methods offer the most efficient approach [1], since the cost of computation is independent of the number of variables. Adjoint versions of LSS and its variants have also been proposed in the literature [18,20,23].
In this paper, we employ the preconditioned multiple shooting shadowing (PMSS) method to compute the adjoint variables and from them, in a single computation, all the elements of the optimal feedback gain matrix. We apply this idea to the Kuramoto–Sivashinsky equation (KSE) with Dirichlet boundary conditions, that result in an ergodic system [24]. There is also an important additional benefit. For this type of boundary conditions, the nonlinearity of the system is energy conserving, a property which makes standard methods, like LQR, effective. Thus, we can compare the control kernels produced by the shadowing and LQR approaches. We note that very few attempts have been made to solve an optimization problem using LSS before [25,26], and none have considered the computation of the feedback matrix.
The feedback control of the KSE (with periodic boundary conditions) has been considered in the past. In [27,28], the authors discretized the equation by applying a Galerkin projection to the eigenmodes of the linear operator and then pole placement to stabilize its dynamics. The (linearly) unstable modes were separated from the stable ones and a controller was synthesized for the former. The number of actuators was based on the number of unstable modes. In [29,30], the optimal control problem considered the cost of actuation and the iterative algorithm also found the optimal actuator locations that minimize the distance between the solution and a desired state (which can be a travelling wave) over a given time period T. Recently, ‘model-free’ approaches, like deep reinforcement learning, were proposed [31]. This is a data-driven method that finds the optimal control laws using only local measurements.
The structure of this paper is as follows: §2 summarizes the control problem and the two algorithms. In §3, we apply both control strategies to the KSE and compare the control kernels, while in §4 we summarize the performance of the shadowing algorithm. We conclude in §5.
2. Formulation of the control problem
We consider a general dynamical system governed by a set of ordinary differential equations (ODEs) in general form
$$\frac{du}{dt} = f(u), \qquad u(0) = u_0, \tag{2.1}$$
where u is the state vector of length N. If necessary, we assume that an appropriate spatial discretization method (e.g. finite difference, finite volume) has been applied to convert a system of partial differential equations to the ODE set (2.1). In this case, u represents the vector of discrete values of the corresponding continuous variables.
The above set describes mathematically the evolution of the uncontrolled system. For control purposes, we seek an actuation s(t) that will modify the behaviour of the system to meet a desired objective. Assuming M actuators (in which case s(t) becomes a vector of length M), the controlled system takes the form
$$\frac{du}{dt} = f(u) + Bs(t), \tag{2.2}$$
where B is the input matrix (size N × M) that determines the spatial distribution of the actuation. Moreover, we assume that the actuation s(t) takes the linear feedback control form
$$s(t) = -K\left(u - u_{\mathrm{targ}}\right), \tag{2.3}$$
where K is an M × N matrix, and utarg is a desired (target) state of the system. The controlled system then becomes
$$\frac{du}{dt} = f(u) - BK\left(u - u_{\mathrm{targ}}\right), \tag{2.4}$$
where for future reference we have defined h(u, K) = f(u) − BKu. The objective of this paper is to compare the matrix K obtained using the standard linear optimal control theory (also known as LQR) and the one obtained from PMSS. We briefly describe both approaches below.
(a). Control using linear quadratic regulator
The LQR theory is developed for linear time-invariant systems. We first linearize the governing ODE set around a stationary target state, utarg, i.e. we substitute u = utarg + u′ into (2.1), and keeping only the linear terms, we arrive at
$$\frac{du'}{dt} = Au', \tag{2.5}$$
where A = (∂f/∂u)|utarg is the Jacobian matrix evaluated at the target state. Actuation is then applied to the linearized system, i.e.
$$\frac{du'}{dt} = Au' + Bs(t). \tag{2.6}$$
LQR finds the optimum s(t) which minimizes the following objective:
$$J = \int_0^{\infty}\left(u'^{\top}Qu' + s^{\top}Rs\right)dt, \tag{2.7}$$
subject to (2.6), where Q and R are weighting matrices (size N × N and M × M, respectively). Both are symmetric, Q ≥ 0 and R > 0 (i.e. positive semi-definite and definite, respectively). The solution to the optimization problem results in an optimum actuation s(t) which is related to the state u(t) via equation (2.3) and the feedback matrix K is obtained from
$$K = R^{-1}B^{\top}P, \tag{2.8}$$
where matrix P (size N × N) is the solution to the algebraic Riccati equation:
$$A^{\top}P + PA - PBR^{-1}B^{\top}P + Q = 0. \tag{2.9}$$
The cost of the solution of this equation scales as O(N3). In the present work, we set B = I and the weighting matrices Q = I and R = cI, where c is a positive constant. The optimal feedback controller is then given by
$$K = \frac{1}{c}P. \tag{2.10}$$
The derivation of equations (2.8) and (2.9) can be found in standard control textbooks [4,32]. The optimum feedback controller (2.8) is then applied to the full nonlinear system (2.4). We note that for a linear time-invariant system, the linear relationship (2.3) between actuation and state is exact, and is a direct outcome of the solution of the optimization problem. In the present work, we set the target state utarg = 0, i.e. we want the actuation to bring the system to rest.
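As a concrete illustration of (2.8) and (2.9), the sketch below computes an LQR gain for a small toy system with SciPy's Riccati solver. The 2-state matrix `A` is an arbitrary unstable example chosen for illustration only, not related to any system in this paper.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Toy 2-state unstable linear system du'/dt = A u' + B s (illustration only)
A = np.array([[0.0, 1.0],
              [-2.0, 0.5]])       # eigenvalues have positive real part
B = np.eye(2)                     # B = I, as assumed in the text
Q = np.eye(2)                     # state weighting
c = 1.0
R = c * np.eye(2)                 # control weighting R = cI

# Solve the algebraic Riccati equation A^T P + P A - P B R^{-1} B^T P + Q = 0
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)   # feedback matrix K = R^{-1} B^T P

# The closed-loop matrix A - B K is Hurwitz: the feedback is stabilizing
eigs = np.linalg.eigvals(A - B @ K)
assert np.all(eigs.real < 0)
```

With B = I, Q = I and R = cI, the gain reduces to K = P/c, in line with the simplification adopted above.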
(b). Control using preconditioned multiple shooting shadowing
We seek the control matrix K that solves the nonlinear control problem
$$\min_{K}\ \bar{J}(u, K) \tag{2.11a}$$
$$\text{subject to}\quad \frac{du}{dt} = h(u, K) = f(u) - BKu, \tag{2.11b}$$
where in analogy with (2.7) we set
$$\bar{J}(u, K) = \left\langle u^{\top}u + \alpha\, s^{\top}s \right\rangle, \qquad s = -Ku. \tag{2.12}$$
Note that in (2.12), the full state u appears, not the perturbation around the target, u′, as in (2.5). Since however we have selected utarg = 0, both control problems aim to bring the system to rest. We employ ⟨·⟩ to indicate averaging over an infinitely long period. For averaging over finite time T, we use
$$\bar{J}_T(u, K) = \frac{1}{T}\int_0^{T}\left(u^{\top}u + \alpha\, s^{\top}s\right)dt \tag{2.13}$$
instead.
The optimization problem (2.11) can be solved iteratively using the sensitivities dJ̄T/dK, coupled with an updating method, for example, gradient descent,
$$K^{(i+1)} = K^{(i)} - a^{(i)}\left.\frac{d\bar{J}_T}{dK}\right|_{K = K^{(i)}}, \tag{2.14}$$
where i is the iteration number and a(i) is a step size. We explain how to evaluate the sensitivities dJ̄T/dKi,j for all elements of the matrix K (in total N² values) with a single computation using the adjoint version of PMSS in the following subsection. The steps of the iterative algorithm are summarized below.
Control algorithm:
Inputs: T, u0, K(0), ϵ. Output: K (converged control matrix)
(i) Set i = 0.
(ii) Integrate (2.11b) with initial condition u(0) = u0 in the interval [0, T] to obtain u(0)(t). Compute J̄T(K(0)).
(iii) Call the PMSS solver to obtain dJ̄T/dK for the trajectory u(i)(t).
(iv) Compute a(i) using an appropriate algorithm (for example, backtracking or Brent's method [33]).
(v) Update the control parameters: K(i+1) = K(i) − a(i) dJ̄T/dK.
(vi) Integrate and store u(i+1)(t). Compute J̄T(K(i+1)). (Note: this step is included within step (iv) if backtracking is employed.)
(vii) If |J̄T(K(i+1)) − J̄T(K(i))| < ϵ, break and output K = K(i+1). Otherwise, set i = i + 1 and return to step (iii).
Step (iv) of the algorithm typically requires multiple evaluations of (2.11b) to compute a(i). In this paper, we use backtracking, which guarantees monotonic reduction of J̄T, and is considerably less expensive than other extremum-finding methods (such as Brent's method). Backtracking requires the computation of u(t) and J̄T, so these are obtained in step (iv) (making step (vi) redundant). The next subsection describes in detail the computation of the sensitivities (step (iii)).
The above algorithm requires J̄ to be ergodic (i.e. independent of the initial condition), as will be explained in §2b(i). It must also be noted that the cost function J(u, K), and therefore J̄(u, K), are generally non-convex with respect to the elements of K [34]. Therefore, in general, convergence to the global minimum is not guaranteed.
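The loop of steps (i)–(vii) can be sketched on a toy, non-chaotic scalar system; since the system is not chaotic, an ordinary central finite difference stands in for the PMSS sensitivity of step (iii). All parameter values here are illustrative.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Sketch of steps (i)-(vii) on a toy one-state system du/dt = u - u^3 + s,
# with scalar feedback s = -k*u. A central finite difference replaces the
# PMSS adjoint solver; every name and value below is illustrative.
T, u0, alpha = 20.0, 1.0, 0.1

def J_bar(k):
    """Finite-time average of u^2 + alpha*s^2, cf. (2.13)."""
    sol = solve_ivp(lambda t, u: u - u**3 - k*u, (0.0, T), [u0],
                    dense_output=True, rtol=1e-8, atol=1e-10)
    ts = np.linspace(0.0, T, 2001)
    u = sol.sol(ts)[0]
    return np.mean(u**2 + alpha*(k*u)**2)

k = 0.0                                   # step (i): K^(0) = 0
J_old = J_bar(k)                          # step (ii)
for i in range(30):
    eps = 1e-5                            # step (iii): finite-difference stand-in
    grad = (J_bar(k + eps) - J_bar(k - eps)) / (2*eps)
    a = 1.0                               # step (iv): backtracking line search
    while J_bar(k - a*grad) > J_old - 1e-4*a*grad**2 and a > 1e-8:
        a *= 0.5
    k = k - a*grad                        # step (v): gradient-descent update
    J_new = J_bar(k)                      # step (vi)
    if abs(J_new - J_old) < 1e-6:         # step (vii): tolerance epsilon
        break
    J_old = J_new

assert J_bar(k) < J_bar(0.0)              # the converged gain reduces the cost
```

The backtracking (Armijo) condition halves the step until the cost decreases, mirroring the monotonic-reduction property noted above.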
(i). Computation of sensitivities using an adjoint algorithm
We are seeking the derivative of one objective, J̄ (in practice J̄T), with respect to multiple inputs (the elements of matrix K). For such cases, adjoint algorithms offer an ideal solution, because their cost is independent of the number of matrix elements.
We start by explaining how to compute the derivative with respect to a single parameter, say β, evaluated at β = β0, i.e. we seek dJ̄/dβ under the nonlinear constraint
$$\frac{du}{dt} = f(u; \beta), \qquad u(0) = u_0. \tag{2.15}$$
The linear tangent equation
$$\frac{dv}{dt} = J(t)\,v + \frac{\partial f}{\partial \beta}, \tag{2.16}$$
where J = ∂f/∂u is the Jacobian, governs the evolution of v(t) = du(t)/dβ with initial condition v(0) = du(0)/dβ. This equation is obtained by perturbing β around the reference value β = β0 and linearizing (2.15) around the β0-trajectory in phase space. The variable v(t) can then be used to compute the sensitivity dJ̄/dβ by evaluating an integral obtained after application of the chain rule to (2.11a). The solution v(t) of the tangent equation is however unbounded for chaotic systems (it grows exponentially in time due to the divergence of nearby trajectories). LSS [18] finds a solution to (2.16) that always remains bounded in time, which means that the perturbed trajectory u′(τ, β0 + δβ) shadows (i.e. remains close to) the reference trajectory uref(t, β0). This is achieved by relaxing the initial condition v(0), and minimizing the norm ‖v(t)‖, where v(t) is now defined as v(t) = lim δβ→0 (u′(τ, β0 + δβ) − uref(t, β0))/δβ. For ergodic systems, the sensitivity does not depend on the initial conditions, so this relaxation does not affect the result. MSS [20] is a variant of LSS that minimizes ‖v‖ at P + 1 discrete checkpoints in time, which define P time segments [tp−1, tp], p = 1…P, of size ΔT = T/P (equal for all segments). Below we provide only a summary of MSS. For a full explanation and derivation details, the reader is referred to [20]. The minimization problem can be written as
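The unbounded growth of the tangent solution can be demonstrated on the Lorenz system (a standard chaotic example, used here purely for illustration and not considered in this paper), taking β to be the parameter ρ: the norm of v(t) grows roughly as e^{λmax t}, with λmax ≈ 0.9 for the classical parameter values.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Integrate the Lorenz system together with its tangent equation (2.16),
# taking beta = rho, so df/d(beta) = (0, x, 0). v(0) = 0.
sigma, rho, beta = 10.0, 28.0, 8.0/3.0

def rhs(t, z):
    x, y, zz, vx, vy, vz = z
    f = [sigma*(y - x), x*(rho - zz) - y, x*y - beta*zz]
    # Jacobian-vector product of the Lorenz Jacobian, plus the forcing x
    Jv = [sigma*(vy - vx),
          vx*(rho - zz) - vy - x*vz + x,
          y*vx + x*vy - beta*vz]
    return f + Jv

z0 = [1.0, 1.0, 20.0, 0.0, 0.0, 0.0]
sol = solve_ivp(rhs, (0.0, 30.0), z0, rtol=1e-9, atol=1e-12,
                dense_output=True)
v_mid = np.linalg.norm(sol.sol(10.0)[3:])   # ||v|| at t = 10
v_end = np.linalg.norm(sol.sol(30.0)[3:])   # ||v|| at t = 30, vastly larger
```

The ratio v_end/v_mid grows by many orders of magnitude over 20 time units, which is exactly why the naive tangent (and adjoint) computation fails for chaotic systems.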
$$\min_{\bar{v}}\ \tfrac{1}{2}\,\bar{v}^{\top}\bar{v}, \qquad \bar{v} = \left[v(t_0)^{\top}, \ldots, v(t_P)^{\top}\right]^{\top} \tag{2.17a}$$
$$\text{subject to}\quad A\bar{v} = b. \tag{2.17b}$$
Equation (2.17b) imposes the continuity of v between consecutive time segments. The size of matrix A is NP × N(P + 1), while v̄ and b are vectors of length N(P + 1) and NP, respectively. Matrix A and vector b can be computed from the state transition matrix Φ(t, tp−1) of (2.16), which satisfies
$$\frac{d\Phi(t, t_{p-1})}{dt} = J(t)\,\Phi(t, t_{p-1}), \qquad \Phi(t_{p-1}, t_{p-1}) = I. \tag{2.18}$$
The solution to the minimization problem (2.17) is found by introducing a set of discrete adjoint variables w and deriving the Karush–Kuhn–Tucker (KKT) system
$$\begin{bmatrix} I & -A^{\top} \\ A & 0 \end{bmatrix}\begin{bmatrix} \bar{v} \\ w \end{bmatrix} = \begin{bmatrix} 0 \\ b \end{bmatrix}. \tag{2.19}$$
From the computational point of view, it is more convenient to solve a linear system that involves the Schur complement matrix, S = AAT with size NP × NP,
$$S\,w = AA^{\top}w = b, \qquad \bar{v} = A^{\top}w. \tag{2.20}$$
Matrix S is block tri-diagonal and symmetric positive definite, i.e. all eigenvalues are real and positive. Equation (2.20) is solved iteratively using a linear 'matrix-free' solver, i.e. S is not stored; only the matrix–vector products Sw are required. The GMRES solver is used in this paper. Once w is computed, it is substituted into (2.19) to obtain v̄, and dJ̄/dβ can be evaluated.
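A minimal sketch of the matrix-free strategy: only the actions of A and Aᵀ are supplied, so S = AAᵀ is never formed explicitly. A small random matrix stands in for the MSS constraint matrix here; the dimensions are illustrative stand-ins for NP and N(P + 1).

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

# Matrix-free solve of S w = b with S = A A^T: only matvec products are used.
rng = np.random.default_rng(0)
NP, NP1 = 30, 40                       # stand-ins for NP and N(P+1)
A = rng.standard_normal((NP, NP1))     # stand-in for the MSS matrix A
b = rng.standard_normal(NP)

def S_matvec(w):
    return A @ (A.T @ w)               # S w = A (A^T w); S itself never stored

S = LinearOperator((NP, NP), matvec=S_matvec)
w, info = gmres(S, b)                  # GMRES, as in the paper
v_bar = A.T @ w                        # recover the primal solution from w
```

For a large system the two matvecs would be replaced by one forward sweep of the state transition matrices and one backward sweep, which is what makes the approach scalable.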
The above analysis is not useful when seeking the derivative of J̄ with respect to multiple control parameters, because (2.20) would need to be solved with a different right-hand side for each parameter. The adjoint version of (2.19)
$$\begin{bmatrix} I & -A^{\top} \\ A & 0 \end{bmatrix}\begin{bmatrix} \hat{v} \\ \hat{w} \end{bmatrix} = \begin{bmatrix} \bar{g} \\ 0 \end{bmatrix} \tag{2.21}$$
can be used to compute the derivative of J̄T with respect to all control parameters (in our case, the elements of K) with a single solution of (2.21). In the above equation, v̂ and ŵ are the discrete adjoint variables of v and w, respectively. The right-hand-side vector ḡ is defined as
| 2.22 |
where Jp = J(tp) and hp = h(tp). The Schur complement of (2.21) is
$$S\,\hat{w} = AA^{\top}\hat{w} = -A\bar{g}, \tag{2.23}$$
which can be solved iteratively for ŵ. Notice that the standard KKT system (2.19) and its Schur complement (2.20) have matrices identical to those of the adjoint counterparts (2.21) and (2.23), but different right-hand sides. Within each time segment, the continuous adjoint variables v̂(t) can be found by integrating
$$\frac{d\hat{v}}{dt} = -\left(\frac{\partial h}{\partial u}\right)^{\top}\hat{v} - \frac{1}{T}\left(\frac{\partial J}{\partial u}\right)^{\top} \tag{2.24}$$
backwards in time in all segments with terminal conditions
| 2.25 |
where ŵp is obtained from the solution of (2.23) and 𝒫p is a projection operator evaluated at tp:
$$\mathcal{P}_p = I - \frac{h_p h_p^{\top}}{h_p^{\top}h_p}. \tag{2.26}$$
Finally, using v̂(t), the sensitivity of J̄T to any element Ki,j can be found by evaluating the integral
$$\frac{d\bar{J}_T}{dK_{i,j}} = \int_0^{T}\left(\hat{v}^{\top}\frac{\partial h}{\partial K_{i,j}} + \frac{1}{T}\frac{\partial J}{\partial K_{i,j}}\right)dt, \tag{2.27}$$
where in the second term of the right-hand side, J is given by (2.12). Since K in this paper is of size N × N, (2.27) is evaluated algebraically N² times to compute the full sensitivity matrix.
The most time-consuming part of the algorithm is the solution of the linear system (2.23). It is well known that systems with matrices that have tightly clustered eigenvalue spectra converge fast when solved iteratively [35]. For MSS, the eigenvalues of S, μ(S), are usually widely spread between very small and very large values, leading to very slow convergence. A useful measure for estimating the convergence rate is the matrix condition number, κ = μmax(S)/μmin(S). For fast convergence, κ should be as close as possible to unity. In order to reduce κ, we apply a block-diagonal preconditioner (refer to [22] for details) to system (2.23) that annihilates the l fastest growing singular modes, corresponding to the l largest Lyapunov exponents in each segment. The superscript q refers to the number of Lanczos bidiagonalization iterations [36]. The MATLAB function 'svds', based on this algorithm, is used to compute the singular vectors that are employed to construct the preconditioner. To account for very small eigenvalues, 0 < μ(S) ≪ 1, Tikhonov regularization is also used to shift the entire eigenvalue spectrum by γ. The preconditioned and regularized system takes the form
$$\mathcal{M}\left(S + \gamma I\right)\hat{w} = -\mathcal{M}A\bar{g}, \tag{2.28}$$
where $\mathcal{M}$ denotes the block-diagonal preconditioner of [22].
It was shown in [22] that this system has very good convergence properties, and therefore we solve (2.28) instead of (2.23). Note the enormous advantage of the adjoint approach: the system (2.28) must be solved only once in order to compute the sensitivity of J̄T with respect to every element of matrix K, in total N × N values. Summarizing, we have the following algorithm:
PMSS solver:
Inputs: T, P, l, q, γ. Output: dJ̄T/dK
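The deflation-plus-regularization idea behind (2.28) can be illustrated on a toy symmetric positive definite system: the few very large eigenvalues (the analogue of the fast-growing modes) are removed using the leading singular vectors returned by an 'svds'-style solver, and γ shifts the spectrum away from zero. The construction below is a simplified stand-in for the block-diagonal preconditioner of [22], not its actual form, and all sizes are illustrative.

```python
import numpy as np
from scipy.sparse.linalg import svds, LinearOperator, cg

# Toy SPD system S w = b with a handful of huge eigenvalues on top of a
# well-behaved bulk; condition number ~1e8 before preconditioning.
rng = np.random.default_rng(3)
n, l, gamma = 300, 5, 1e-3
Qm, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigs = np.concatenate([np.logspace(8, 4, l), 1.0 + rng.random(n - l)])
S = (Qm * eigs) @ Qm.T                      # SPD stand-in for the Schur complement
b = rng.standard_normal(n)

U, sv, _ = svds(S, k=l)                     # l largest singular triplets (Lanczos)
def M_matvec(r):
    # M = I - U U^T + U diag(1/(sv + gamma)) U^T: identity on the bulk,
    # inverse of the shifted spectrum on the deflated directions
    c = U.T @ r
    return r - U @ c + U @ (c / (sv + gamma))

M = LinearOperator((n, n), matvec=M_matvec)
w, info = cg(S + gamma*np.eye(n), b, M=M)   # Tikhonov shift + deflation
assert info == 0                            # converges in a handful of iterations
```

After preconditioning, the effective spectrum is clustered in roughly [1, 2], so the Krylov solver converges in a few iterations instead of thousands.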
3. Control of the Kuramoto–Sivashinsky equation
In this section, we apply the two control algorithms outlined above to the standard KSE:
$$\frac{\partial u}{\partial t} = -u\frac{\partial u}{\partial x} - \frac{\partial^2 u}{\partial x^2} - \frac{\partial^4 u}{\partial x^4}, \qquad 0 \le x \le L, \qquad u = \frac{\partial u}{\partial x} = 0 \ \text{at}\ x = 0, L. \tag{3.1}$$
We chose Dirichlet and Neumann boundary conditions (instead of the standard periodic conditions) so that the system is ergodic [24]. For all simulations, L = 128, which is large enough to ensure chaotic behaviour. The spatial derivatives are discretized using N + 2 equally spaced nodes, using the second-order finite difference approximations as in [24]. N nodes are inside the domain and 2 are located on the boundaries. Two values of node spacing δx = L/(N + 1) are considered, 1 and 0.5. The variable-step Runge–Kutta method (ode45 in MATLAB) was used for the temporal integration. The initial condition at t = 0 is obtained from a precursor integration in −1000 ≤ t ≤ 0; this ensures that the trajectory has reached the chaotic attractor at t = 0.
Figure 1 shows the typical streaky behaviour exhibited by the KSE in the x–t plane. It is clear that there is a characteristic average streak spacing, lstr, with wavenumber k = 2π/lstr. As will be shown later, lstr plays an important role in the analysis of the control kernels. If the boundary conditions are periodic, the spectral energy peaks at k. This characteristic value is close to the wavenumber that maximizes the growth rate of the KSE linearized around the rest state. For periodic boundary conditions, this corresponds to a streak spacing that can be expressed analytically as lstr = 2π/kmax, with kmax = 1/√2 the wavenumber of maximum linear growth [37], which is equal to 2√2π ≈ 8.9. As will be seen later for system (3.1), lstr ≈ 8.5, therefore the choice of the boundary conditions does not affect much the streak spacing.
Figure 1.

Contour plot of a typical solution u(x, t) of (3.1). (Online version in colour.)
The time-average and root mean square (RMS) of u as a function of x, ⟨u⟩ and urms respectively, are shown in figure 2. The angled brackets ⟨·⟩ denote time-averaging, while the overbar represents averaging over multiple initial conditions u0 = u(x, 0). The Dirichlet boundary conditions result in sharp gradients at both ends of the domain, especially for urms. In the middle of the domain, the variation of both the mean and the RMS is smooth.
Figure 2.
Time average and RMS of u(x, t). and were obtained for trajectories with length T = 2000 and averaged over 150 random initial conditions in [0, 1]. (a) Time average of u(x, t), (b) temporal RMS of u(x, t).
The discretized KSE equation is written as
$$\frac{du}{dt} = f(u) - BKu, \tag{3.2}$$
where f(u) is the nonlinear vector arising from the finite-difference discretization of the right-hand side of (3.1), u is the state vector of length N, and K is the N × N feedback control matrix. For shadowing control, we seek the optimal values of K that minimize the following objective:
$$\bar{J}_T = \frac{1}{T}\int_0^{T}\left(u^{\top}u + \alpha\,s^{\top}s\right)dt, \tag{3.3}$$
subject to the nonlinear constraint (3.2). The objective and constraint have the same form as (2.11). The first term on the right-hand side of (3.3) represents the space–time average kinetic energy of the system, while the second term represents the cost of the control effort, which is regulated by the parameter α. We use N equally spaced actuators (equivalent to setting B = I in equation (3.2)), and we search for the optimum values of K that drive u(t) to zero, i.e. that bring the system to rest.
In the following, we compare this matrix K with the one obtained using the LQR algorithm. For the latter, the linear version of the KSE (3.1), i.e.
$$\frac{\partial u'}{\partial t} = -\frac{\partial^2 u'}{\partial x^2} - \frac{\partial^4 u'}{\partial x^4}, \tag{3.4}$$
with the same boundary conditions is discretized to obtain the linearized matrix Al, which is required to compute the LQR feedback matrix (2.10). We then apply both matrices to the full nonlinear discretized KSE (3.2).
(a). Comparison of preconditioned multiple shooting shadowing control and linear quadratic regulator control kernels
Since we have selected B = I, the actuation s(t) = −Ku(t) can be written explicitly as si(t) = −Σj Ki,j uj(t), where the index i corresponds to the control location, xc = i × δx. Written in this form, the physical meaning of Ki,j becomes clear; it represents the weight of the jth velocity in the actuation at the ith point. Summing all j contributions results in si(t). For more general cases, the input matrix B provides an appropriate spatial weighting and the control signal is given by Bs; see equation (2.2).
The total derivative of J̄T with respect to element Ki,j is found using (2.27). For the objective function (3.3) considered here, (2.27) translates into
$$\frac{d\bar{J}_T}{dK_{i,j}} = \int_0^{T}\left(-\hat{v}_i u_j + \frac{2\alpha}{T}\left(Ku\right)_i u_j\right)dt. \tag{3.5}$$
Figure 3 shows the absolute values of the elements of the sensitivity matrix dJ̄T/dK in log scale, for different trajectory lengths T. For T = 50 (figure 3a), the matrix does not seem to have a clear structure. For T = 200 (figure 3b), the matrix begins to acquire a diagonally dominant structure. However, a significant number of elements in the top right and bottom left quadrants are of the same order of magnitude, O(10−1), as the central diagonals (shown in yellow). Time horizons T = 50 and T = 200 are clearly not long enough for sensitivity convergence. However, on increasing the trajectory length to T = 500 (figure 3c) and T = 800 (figure 3d), the sensitivities start to converge and become independent of T. It is now clear that the matrix has a diagonally dominant structure. Note that convergence is first attained around the main diagonal and slowly propagates further away.
Figure 3.
Colour maps of the absolute values of sensitivities in log scale for different time-averaging lengths T. They are obtained from the attractor of the uncontrolled system (i.e. the first iteration with K(0) = 0). (a) T = 50, (b) T = 200, (c) T = 500, (d) T = 800. (Online version in colour.)
Using the computed sensitivities for T = 800 and a step size a(0) = 10, we compute the elements Ki,j from (2.14) and plot the colour map in linear scale in figure 4. Large positive and negative values are found around the main diagonal, while further away, i.e. when |i − j| is large, Ki,j decays to 0. This has a clear physical meaning; the value of the actuation at point i is determined mainly by the nearby neighbours, while the contribution of points located further away becomes progressively smaller. The control kernel therefore has compact support. This has been demonstrated for the kernel obtained when LQR is applied to the linearized Navier–Stokes equations in a channel flow (see fig. 6 of [38]). The present analysis shows that the same property holds if we compute K directly from the nonlinear attractor, at least for the case examined. More research is needed, however, to determine if controllers with compact support can be computed for general nonlinear systems using the current algorithm.
Figure 4.

Colour map of matrix K obtained using PMSS control with T = 800. (Online version in colour.)
Averaging along the mth diagonal of K (the main diagonal corresponds to m = 0), we obtain the mean weight K̄(ξ), which depends only on ξ = mδx = x − xc. It is clear that K̄(ξ) represents the average value of the weights against the distance ξ from the actuation point. We plot the distribution of K̄(ξ) for two grid spacings δx in figure 5a with solid lines. We superimpose also the weights from LQR obtained with control weight c = 1 (refer to equation (2.10)) with dashed lines. We notice that the weights obtained from LQR and PMSS have very similar distributions. As expected, they are both localized around ξ = 0 and decay to 0 further away. Also, they both depend on the discretization.
Figure 5.
Distribution of feedback matrix weights (a) and kernels (b,c) obtained by averaging along the diagonals of K and plotting against ξ = x − xc. (a) Full domain, (b) mean convolution kernel in the region −12 ≤ ξ ≤ 12, (c) mean convolution kernel normalized by . (Online version in colour.)
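The diagonal averaging used to obtain K̄(ξ) can be sketched as follows; the banded matrix below is a synthetic stand-in for the computed feedback matrix, used only to exercise the averaging.

```python
import numpy as np

# Average a feedback matrix K along its diagonals to obtain the mean weight
# profile Kbar(xi), with xi = m*dx for diagonal offset m.
def diagonal_average(K, dx):
    n = K.shape[0]
    offsets = np.arange(-(n - 1), n)                 # m = -(n-1), ..., n-1
    kbar = np.array([np.mean(np.diagonal(K, m)) for m in offsets])
    xi = offsets * dx                                # xi = m * dx = x - xc
    return xi, kbar

# Synthetic banded stand-in: weights decaying away from the main diagonal
K = np.exp(-np.abs(np.subtract.outer(np.arange(50), np.arange(50))))
xi, kbar = diagonal_average(K, 0.5)
assert np.argmax(kbar) == 49          # peak sits on the main diagonal (xi = 0)
```

The main diagonal (m = 0) lands at index n − 1 of the returned profile, i.e. at ξ = 0, which is where a localized kernel peaks.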
In order to further analyse the results, we compute the control kernel, which is independent of δx. To this end, we write the actuation at location xc as the convolution integral
$$s(x_c, t) = \int \mathcal{K}(\xi; x_c)\, u(x_c + \xi, t)\, d\xi, \tag{3.6}$$
where 𝒦(ξ; xc) is the convolution (or control) kernel at xc. It can be computed from the matrix elements as 𝒦(ξ; xc) = Ki,j/δx, where xc = i × δx and x = j × δx. In figure 5b, we plot both diagonally averaged kernels and, in order to facilitate the comparison, we zoom in the region ξ ∈ [ − 12, 12]. The LQR kernels (dashed lines) collapse perfectly for the two discretizations, as they should. The PMSS kernels however (solid lines), although close, do not collapse to a single curve. Perhaps we should not expect them to collapse, as they are found directly from the nonlinear attractor under the assumption of the feedback control law (2.3). The shapes produced by the two control methods are similar, but the LQR kernel decays to 0 faster than the PMSS kernel. The latter has more pronounced peaks and oscillates around 0.
To eliminate the effect of the step size a(0) and the control cost parameter c that affect the absolute values of , in figure 5c we plot the kernel distribution normalized with the value at ξ = 0. This normalization reveals a very interesting feature; the two kernels are almost identical in the region −2.5 ≤ ξ ≤ 2.5, but deviate elsewhere.
What determines the shape of the kernels and why do they have this distribution? In order to obtain more insight, we consider the two-point spatial correlation function at a given location xc
$$\rho(\xi; x_c) = \frac{\left\langle u'(x_c, t)\, u'(x_c + \xi, t)\right\rangle}{u_{\mathrm{rms}}(x_c)\, u_{\mathrm{rms}}(x_c + \xi)}, \tag{3.7}$$
where u′ = u − 〈u〉 is the fluctuation about the time-average 〈u〉. A small correlation ρ(ξ;xc) ≈ 0 indicates that a perturbation at xc + ξ (for example, due to actuation) would not be ‘felt’ at xc. It is therefore expected that this function will be related to the control kernel.
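The correlation (3.7) can be sketched as below, assuming normalization by the RMS values at the two points (so that ρ = 1 at ξ = 0); a synthetic travelling-wave field with streak spacing 8.5 stands in for the KSE solution.

```python
import numpy as np

# Two-point correlation rho(xi; xc) from a space-time field u[t, x].
def two_point_corr(u, ic, m):
    """Correlation between x_c (column ic) and x_c + xi (column ic + m)."""
    up = u - u.mean(axis=0)                     # fluctuation about the time mean
    num = np.mean(up[:, ic] * up[:, ic + m])
    den = up[:, ic].std() * up[:, ic + m].std()
    return num / den

# Synthetic streaky field: travelling wave of wavelength 8.5 plus noise
t = np.linspace(0.0, 200.0, 2000)[:, None]
x = np.arange(0.0, 64.0, 0.5)[None, :]
u = (np.sin(2*np.pi*(x - 0.3*t)/8.5)
     + 0.1*np.random.default_rng(2).standard_normal((2000, 128)))

rho0 = two_point_corr(u, 60, 0)     # self-correlation: exactly 1
rho_neg = two_point_corr(u, 60, 8)  # xi = 4, half a streak spacing: negative
```

At a separation of half the streak spacing the two points sit on streaks of opposite sign, so the correlation is strongly negative, mirroring the trough of ρ̄ at ξ ≈ ±4.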
We plot ρ(ξ;xc) against ξ at different locations xc along the domain in figure 6a. The correlations collapse very well for ξ ∈ [ − 4, 4], but start to deviate as ξ becomes larger. Note also the slight loss of symmetry around ξ = 0 for points xc close to the boundaries of the domain. We then average ρ(ξ;xc) over xc in the region 40 < xc < 90 and plot the distribution of the averaged correlation ρ̄(ξ) in figure 6b. Symmetry around ξ = 0 has now been restored. Distinct positive and negative peaks can be identified at ξ ≈ ±4, ± 8.5, ± 13.5, etc. Moreover, ρ̄(ξ) decays to zero for ξ < −20 and ξ > 20.
Figure 6.
Two-point spatial correlations. (a) Correlations ρ(ξ;xc) against ξ for different xc, (b) averaged correlation in the region xc ∈ [40, 90]. (Online version in colour.)
We can physically explain these results by reference to figure 1. The peaks at ξ = ±4 indicate the average distance between positive and negative streaks that are located next to each other. The fluctuations around the average 〈u〉 have opposite signs and therefore ρ̄(±4) < 0. The lower peaks at ξ = ±8.5 indicate a weaker correlation between two positive or two negative streaks. The even lower negative peaks at ξ = ±13.5 can be explained similarly. The results indicate that the correlation is strong over a distance of approximately ξ = 8.5, i.e. over the average distance between streaks of the same sign.
In figure 7, we plot together the averaged correlation and the normalized LQR and PMSS control kernels. Plotted in this way, all three distributions have remarkable similarities, but also some differences. The two control kernels and the correlation overlap in −2.5 < ξ < 2.5. At ξ = ±4, the troughs are more clearly pronounced for the PMSS kernel and are closer to those of the averaged correlation. On the other hand, the LQR kernel decays very quickly outside the region ξ ∈ [−4, 4]. The narrow support of the LQR kernel indicates that the actuation acts to annihilate the positive/negative streaky combinations. The PMSS kernel, by contrast, has wider support and opposes larger positive/negative/positive streaky combinations.
Figure 7.
Mean two-point correlation superimposed on the normalized LQR and PMSS control kernels. (Online version in colour.)
(b). Response of the controlled system
Next we turn our attention to the response of the system (3.2) to PMSS and LQR actuations. We consider the instantaneous, spatially averaged kinetic energy,
J(t) = (1/L) ∫₀ᴸ ½ u(x, t)² dx,   (3.8)
and explore how fast J(t) is reduced. For PMSS, we apply control with different kernel sizes, i.e. we select a ξmax and compute the actuation using only the elements of K in the region −ξmax ≤ ξ ≤ ξmax around each control point xc. We set the step size a(0) = 6 on step 5 of the control algorithm. To ensure a fair comparison, we choose (by trial and error) a value c = 1.1 in equation (2.10) so that the average of the main diagonal of K is the same for PMSS and LQR.
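The truncation of K to a band −ξmax ≤ ξ ≤ ξmax and the energy (3.8) can be sketched as follows (a NumPy illustration; the grid spacing, the periodic distance measure and the function names are our assumptions, not the paper's code):

```python
import numpy as np

def truncate_kernel(K, xi_max, dx=1.0):
    """Keep only the elements of the feedback matrix K whose periodic
    separation |xi| between control point and actuation point is at most
    xi_max; all other elements are set to zero."""
    n = K.shape[0]
    i = np.arange(n)
    sep = np.abs(i[:, None] - i[None, :])
    sep = np.minimum(sep, n - sep) * dx   # shortest distance on a periodic grid
    return np.where(sep <= xi_max, K, 0.0)

def energy(u):
    """Spatially averaged kinetic energy J(t) of equation (3.8) on a
    uniform grid: (1/L) * int 0.5*u^2 dx reduces to 0.5*mean(u^2)."""
    return 0.5 * np.mean(u**2)
```

The actuation at each time step is then s = −truncate_kernel(K, xi_max) @ u, so varying ξmax changes only which kernel elements participate, not the algorithm itself.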
The energy J(t) is plotted against time in figure 8. For a(0) = 6, all controllers bring the system to rest, and so no additional iterations are required. It is remarkable that even a single PMSS control iteration, which relies only on information from the uncontrolled system, is so effective. When ξmax = 2, the normalized LQR and PMSS matrix kernels are very similar (as shown in figure 7), and hence J(t) drops at the same rate for both controllers. A contour plot of the controlled solution u(x, t) in figure 9 shows how effective the actuation is. The streaks found in the uncontrolled case (figure 1) are rapidly annihilated upon application of the control, and |u(x, t)| is brought to 0 within 5–10 time units.
Figure 8.
Instantaneous kinetic energy J(t) of the actuated system using PMSS and LQR. The uncontrolled case is shown in black and the LQR in blue. For the PMSS control matrix K, we use only the elements that fall inside the indicated range of ξ. Decreasing ξmax leads to faster stabilization. (Online version in colour.)
Figure 9.
Absolute values of the controlled solution |u(x, t)| in log scale (with ξmax = 2). (Online version in colour.)
Increasing ξmax to 20 reduces the rate at which the system is brought to rest. Using the full matrix, i.e. setting ξmax = 127, still reduces the energy by 4 orders of magnitude (from J(0) = 1.68 to J(T) = 4.8 × 10−4). However, the response of the system is slow when J(t) falls below 10−2, i.e. this is a long-term effect.
Indeed, as can be seen from figure 8, and especially from the inset that zooms in to small values of t, the rate of descent is initially the same for PMSS and LQR. Most importantly, for the PMSS controller this holds for all values of ξmax, even the largest. The curves start to deviate approximately for t > 1. In figure 10, we plot the spatial distribution of the time-average and RMS of u(x, t) with and without control (the results are with ξmax = 127). Note the effectiveness in suppressing both variables and bringing them close to 0 across the whole domain.
Figure 10.
Time average and RMS of u(x, t). The controller K(1) uses the full matrix, i.e. with −127 ≤ ξ ≤ 127. The statistics are computed by time-averaging between t = 500 and t = 800 (= T). (a) Time average of u(x, t), (b) RMS of u(x, t). (Online version in colour.)
In order to shed more light onto the behaviour shown in figure 8, we consider the matrix Al − K(1), where Al is the discrete form of the right-hand side of the linearized equation (3.4), and compute its eigenvalues. Figure 11 shows the eigenvalues, μ, for two values ξmax = 2, 127. Note that for the uncontrolled case all μ(Al) are real, because the matrix Al is symmetric (it contains only second- and fourth-order derivatives that are discretized with central differences). However, some μ(Al − K(1)) have a small imaginary part due to the lack of strict symmetry of K(1). This is likely due to the finite length of the trajectory T and the slow convergence of the off-diagonal elements. Some eigenvalues of Al have positive real parts, indicating linear instability. When the controller is introduced, all eigenvalues move to the left half-plane, i.e. become stable. However, for ξmax = 127, some eigenvalues are close to the imaginary axis, explaining the slower rate of descent of J(t).
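The stability test described above can be reproduced in a few lines. The sketch below builds a periodic central-difference discretization of the linearized KSE about u = 0 (so Al contains only the second- and fourth-derivative terms, consistent with the symmetry argument in the text) and checks that subtracting a simple diagonal feedback K = kI shifts all eigenvalues to the left; the grid size and the value of k are illustrative, and the paper's K(1) is of course not diagonal:

```python
import numpy as np

def kse_linear_matrix(n, L):
    """Periodic central-difference discretization of the KSE linearized
    about u = 0: du/dt = -u_xx - u_xxxx. The advection term u*u_x drops
    out at u = 0, so the matrix is symmetric with real eigenvalues."""
    dx = L / n
    d2 = np.zeros((n, n))
    d4 = np.zeros((n, n))
    for i in range(n):
        d2[i, i] = -2.0
        d2[i, (i - 1) % n] = d2[i, (i + 1) % n] = 1.0
        d4[i, i] = 6.0
        d4[i, (i - 1) % n] = d4[i, (i + 1) % n] = -4.0
        d4[i, (i - 2) % n] = d4[i, (i + 2) % n] = 1.0
    return -d2 / dx**2 - d4 / dx**4

Al = kse_linear_matrix(128, 128.0)
mu0 = np.linalg.eigvals(Al).real.max()                      # unstable: > 0
mu1 = np.linalg.eigvals(Al - 1.0 * np.eye(128)).real.max()  # shifted left by 1
```

For a diagonal K the shift is exact (μ(Al − kI) = μ(Al) − k); a banded K(1) moves the eigenvalues in a less transparent way, which is why the eigenvalue computation of figure 11 is needed.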
Figure 11.
Eigenvalues of the controlled and uncontrolled matrices Al − K(1) and Al, plotted in the complex plane (for T = 800 and a(0) = 6). Only values with Re(μ) > −4 are shown. (a) K(1): −2 ≤ ξ ≤ 2, (b) K(1): −127 ≤ ξ ≤ 127. (Online version in colour.)
4. Preconditioned multiple shooting shadowing control algorithm performance
We now look at the performance of the PMSS control algorithm. We set as inputs T = 50, K(0) = 0 and ϵ = 1 × 10−2, while restricting K to the diagonals in the region −2 ≤ ξ ≤ 2. A number of performance measures are given in table 1. It is clear that a single iteration is sufficient to drive the kinetic energy evaluated at t = T, J(T), to zero. Since one iteration renders the controlled system linearly stable, i.e. Re(μmax(Al − K(1))) < 0, a second iteration is unnecessary.
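The outer loop of the control algorithm can be summarized by the following sketch. Here `sensitivity(K)` stands in for the adjoint PMSS solve of step 3 (the expensive part) and `objective(K)` for the evaluation of the time-averaged energy; the stopping test on the change of the objective and the quadratic toy problem used to exercise it are illustrative assumptions, not the paper's exact pseudocode:

```python
import numpy as np

def control_iteration(K, sensitivity, objective, a0, eps, max_iter=10):
    """Gradient descent on the feedback matrix: K <- K - a0 * dJbar/dK.
    `sensitivity(K)` returns the sensitivity matrix (from the adjoint PMSS
    solve in the paper); `objective(K)` returns the time-averaged energy.
    Stops when the objective changes by less than eps."""
    J_prev = objective(K)
    for _ in range(max_iter):
        K = K - a0 * sensitivity(K)
        J = objective(K)
        if abs(J_prev - J) < eps:
            break
        J_prev = J
    return K

# Illustrative quadratic test problem: Jbar(K) = ||K - K_star||^2 / 2,
# whose gradient is simply K - K_star.
K_star = np.array([[2.0, 0.0], [0.0, 3.0]])
obj = lambda K: 0.5 * np.sum((K - K_star) ** 2)
grad = lambda K: K - K_star
K = control_iteration(np.zeros((2, 2)), grad, obj, a0=0.5, eps=1e-12, max_iter=100)
```

In the toy problem the iteration converges to K_star; in the paper, as table 1 shows, a single descent step with the PMSS sensitivity already yields a stabilizing K(1).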
Table 1.
Some key PMSS control algorithm performance measures. The algorithm inputs used are T = 50, K(0) = 0 and ϵ = 1 × 10−2. K is restricted to −2 ≤ ξ ≤ 2.
| iteration no. | J̄ | J(T) | ‖∂J̄/∂K‖2 | a(i) | no. GMRES iterations |
|---|---|---|---|---|---|
| i = 0 | 1.62 | 1.37 | 3.7 × 10−1 | 10 | 34 |
| i = 1 | 1.4 × 10−2 | ≈1 × 10−10 | 1 × 10−2 | 10 | 2 |
| i = 2 | 1.4 × 10−2 | ≈1 × 10−10 | — | — | — |
The time-average of J over 0 → T, denoted J̄, is reduced by 2 orders of magnitude between the 0th iteration (uncontrolled flow) and the controlled flow after one iteration. The actual value of J̄ depends on T and accounts for the transient shown in figure 8. The variation of J̄ against T (when the algorithm is run separately for each T) is shown in figure 12. J̄ drops with rate ∼T−1, because the transient period occupies a smaller and smaller fraction of T as T → ∞. The 2-norm of the sensitivity matrix, ‖∂J̄/∂K‖2 = σmax(∂J̄/∂K), where σmax denotes the maximum singular value, drops from 3.7 × 10−1 to 1 × 10−2 between the 0th and the 1st iteration. Again, this depends on T.
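The ∼T−1 rate is easy to verify on a model transient: if J(t) decays exponentially from its uncontrolled level, the running average (1/T)∫₀ᵀ J dt tends to a constant divided by T. A small numerical check (with illustrative values of the initial energy J0 and the decay time τ, not fitted to the paper's data):

```python
import numpy as np

def time_averaged_objective(T, J0=1.6, tau=1.0, n=200000):
    """Jbar(T) = (1/T) * integral_0^T J(t) dt for a model transient
    J(t) = J0 * exp(-t/tau). The integral tends to the constant J0*tau
    as T grows, so Jbar decays like 1/T once the transient is over."""
    t = np.linspace(0.0, T, n)
    f = J0 * np.exp(-t / tau)
    dt = t[1] - t[0]
    integral = dt * (f.sum() - 0.5 * (f[0] + f[-1]))  # trapezoidal rule
    return integral / T

# Increasing T by a factor of 16 (50 -> 800) should reduce Jbar by
# roughly the same factor, mirroring the trend of figure 12.
ratio = time_averaged_objective(800.0) / time_averaged_objective(50.0)
```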
Figure 12.
Space–time averaged objective J̄ obtained by running the PMSS algorithm for different T; it drops with rate ∼T−1. (Online version in colour.)
Table 2 shows the PMSS solver parameters (step 3 of the control algorithm). Initially, the uncontrolled trajectory (K(0) = 0) has 15 positive Lyapunov exponents (N+LE = 15). We therefore apply a preconditioner with l = 15 on the first iteration (to annihilate these fastest growing modes). The condition number κ is reduced by 4 orders of magnitude and the convergence of GMRES is achieved in 34 iterations. On the second iteration all unstable modes have been annihilated (N+LE = 0), and therefore no preconditioning is necessary (the condition number κ(S) = 5 is already very small).
Table 2.
PMSS solver parameters. N+LE refers to the number of positive Lyapunov exponents of the trajectory on step 3 of the control algorithm. κ(S) and κ(H) are the condition numbers of the unpreconditioned and preconditioned MSS matrices, respectively.
| iteration no. | N+LE | ΔT | l | q | γ | κ(S) | κ(H) | no. GMRES iterations |
|---|---|---|---|---|---|---|---|---|
| i = 0 | 15 | 5 | 15 | 1 | 0.1 | 1.3 × 105 | 24 | 34 |
| i = 1 | 0 | 5 | — | — | 0.1 | 5 | — | 2 |
For the PMSS control algorithm to be applicable to large systems, it must be able to compute accurate sensitivities as efficiently as possible. Step 3 of the PMSS solver, i.e. the solution of the linear system (2.28), dominates the computational cost, making preconditioning necessary for scalability. In figure 13, we plot the convergence rate, quantified in terms of the relative GMRES residuals ‖q − Hx(m)‖2/‖q‖2 against the GMRES iteration number m, where q and H are, respectively, the right-hand side vector and the matrix of (2.28), and x(m) is the solution estimate at the mth iteration, for different trajectory lengths T. It is clear that the dependence of the convergence rate on T is very weak. The residuals drop 5 orders of magnitude between 25 and 37 iterations, even though T varies by more than an order of magnitude, from T = 50 to T = 800, increasing the number of unknowns by a factor of 16.
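The residual metric of figure 13 is the standard relative GMRES residual, which can be tracked with SciPy's GMRES through its callback. In the sketch below, H is a small, well-conditioned stand-in matrix, purely illustrative; in the actual PMSS solver, H of (2.28) is available only through matrix-vector products with the time steppers and would be wrapped in a LinearOperator:

```python
import numpy as np
from scipy.sparse.linalg import gmres

# Stand-in system: a well-conditioned 200x200 matrix close to the identity
# and a random right-hand side (NOT the MSS matrix of the paper).
rng = np.random.default_rng(0)
n = 200
H = np.eye(n) + 0.1 * rng.standard_normal((n, n)) / np.sqrt(n)
q = rng.standard_normal(n)

residuals = []  # relative residuals ||q - H x_m|| / ||q||, one per iteration
x, info = gmres(H, q, restart=50,
                callback=residuals.append, callback_type='pr_norm')
rel_res = np.linalg.norm(q - H @ x) / np.linalg.norm(q)
```

Plotting `residuals` against the iteration index reproduces the kind of convergence curve shown in figure 13.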
Figure 13.
GMRES residuals for step 3 of the PMSS solver. The preconditioner parameters used are ΔT = 10, l = 15, q = 1 and γ = 0.1. The residual drops by approximately five orders of magnitude by the final iteration for all T-values. (Online version in colour.)
5. Conclusion
We have proposed an algorithm that couples shadowing adjoint sensitivity analysis with gradient descent to compute the feedback control matrix K for a chaotic system. The sensitivities were used as search directions to find the matrix elements that minimize an objective function, subject to the full nonlinear system constraints. Most importantly, due to the adopted adjoint formulation, the computational cost is independent of the number of the matrix elements. We applied this approach to the KSE, and managed to stabilize it around the rest position, u(t) = 0. It was shown that for suitably chosen parameters, a single iteration of the algorithm was sufficient to compute a stabilizing feedback matrix K.
We compared the control kernels obtained with this algorithm and the standard LQR approach and noted similarities and differences. Both kernels had compact support and similar shape, which was related to the streaky structure of the solution of the uncontrolled KSE. They were almost identical for short separations from the actuation point, but the LQR kernel decayed faster to 0. This difference is most likely due to the nonlinear terms that are ignored in LQR. All kernels computed with the PMSS algorithm were stabilizing.
From a computational point of view, the cost of LQR scales with O(N3), which poses severe restrictions for application to large-scale systems. On the other hand, the PMSS algorithm uses only time steppers and a preconditioner to make the convergence almost independent of N and T. The cost of the preconditioner depends however on the number of positive Lyapunov exponents. While the LQR has shown faster stabilization of the instantaneous energy for the case examined (figure 8), the performance of PMSS was almost as good for restricted kernels.
We close this section with some thoughts on possible extensions of this work. In practical applications, only noisy measurements at some locations, say yj(t), are available. This is known as output feedback control; in this case, the actuation must be expressed as si(t) = G(yj). Assuming G to be a linear function of present and past values of the measurements (to account for memory effects), we can write si(t) = K0yj(t) + K1yj(t − δt) + ⋯ + Knyj(t − nδt), where the matrices K0, K1, …, Kn can be obtained with the same method presented in this paper (instead of neural networks as, for example, in [31]). The value of n is problem dependent, and can be estimated from the time autocorrelation of the uncontrolled solution.
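Such an output-feedback law with memory is a finite-impulse-response filter acting on the measurement history. A minimal sketch (the class name, the deque-based history and the example gains are hypothetical, chosen only to illustrate the structure s(t) = K0 y(t) + K1 y(t − δt) + ⋯):

```python
import numpy as np
from collections import deque

class OutputFeedback:
    """Output feedback with memory: s(t) = K0 y(t) + K1 y(t - dt) + ...
    Ks = [K0, ..., Kn] are the gain matrices; in the paper's proposal they
    would be computed with the same PMSS sensitivity approach."""
    def __init__(self, Ks):
        self.Ks = Ks
        self.history = deque(maxlen=len(Ks))  # most recent measurement first

    def actuation(self, y):
        self.history.appendleft(np.asarray(y, dtype=float))
        return sum(K @ yk for K, yk in zip(self.Ks, self.history))

# Example: one actuator, two sensors, memory of one past sample.
fb = OutputFeedback([np.array([[1.0, 0.0]]), np.array([[0.0, -1.0]])])
s1 = fb.actuation([2.0, 3.0])  # only K0 acts: s = 2.0
s2 = fb.actuation([1.0, 5.0])  # K0*[1,5] + K1*[2,3] = 1.0 - 3.0 = -2.0
```

Before the history fills up, only the available past samples contribute, which is the natural start-up behaviour of such a filter.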
The proposed approach is also scalable to two- and three-dimensional systems and requires only time steppers of the full nonlinear as well as the linearized forward and adjoint equations, as mentioned above. It would be instructive to apply the method to other systems, for example the suppression of vortex shedding around a cylinder at transitional Reynolds numbers, the recirculation zone on the suction side of an aerofoil, etc.
Acknowledgements
The authors would like to thank the anonymous reviewers for their useful comments.
Footnotes
Data accessibility
The MATLAB code used to generate the results in this paper can be accessed from https://drive.google.com/file/d/1lVmx2wqZWi4PTktCef8oN0ERvWvy1oaA/view?usp=sharing.
Authors' contributions
G.P.: Conceptualization, validation, review and editing of the paper, supervision. K.S.: Conceptualization, software development, validation, original draft preparation, review and paper editing. All authors approved the final version and agree to be accountable for all aspects of the work.
Competing interests
We declare we have no competing interests.
Funding
The first author has received a PhD scholarship from Al-Alfi Foundation.
References
- 1. Gunzburger MD. 2002. Perspectives in flow control and optimization. Philadelphia, PA: Society for Industrial and Applied Mathematics.
- 2. Xiao D, Papadakis G. 2017. Nonlinear optimal control of bypass transition in a boundary layer flow. Phys. Fluids 29, 054103. (doi:10.1063/1.4983354)
- 3. Luchini P, Bottaro A. 2014. Adjoint equations in stability analysis. Annu. Rev. Fluid Mech. 46, 493–517. (doi:10.1146/annurev-fluid-010313-141253)
- 4. Bryson A, Ho YC, Siouris G. 1975. Applied optimal control. New York, NY: Hemisphere.
- 5. Green M, Limebeer D. 1995. Linear robust control. Englewood Cliffs, NJ: Prentice Hall.
- 6. McKernan J, Papadakis G, Whidborne JF. 2009. Linear and non-linear simulations of feedback control in plane Poiseuille flow. Int. J. Numer. Meth. Fluids 59, 907–925. (doi:10.1002/fld.1851)
- 7. Barbagallo A, Sipp D, Schmid P. 2009. Closed-loop control of an open cavity flow using reduced-order models. J. Fluid Mech. 641, 1–50. (doi:10.1017/S0022112009991418)
- 8. Bagheri S, Brandt L, Henningson D. 2009. Input–output analysis, model reduction and control of the flat plate boundary layer. J. Fluid Mech. 620, 263–298. (doi:10.1017/S0022112008004394)
- 9. Semeraro O, Bagheri S, Brandt L, Henningson DS. 2011. Feedback control of three-dimensional optimal disturbances using reduced-order models. J. Fluid Mech. 677, 63–102. (doi:10.1017/S0022112011000620)
- 10. Sharma AS, Morrison JF, McKeon BJ, Limebeer DJN, Koberg WH, Sherwin SJ. 2011. Relaminarisation of Reτ = 100 channel flow with globally stabilising linear feedback control. Phys. Fluids 23, 125105. (doi:10.1063/1.3662449)
- 11. Heins PH, Jones BL, Sharma AS. 2016. Passivity-based output-feedback control of turbulent channel flow. Automatica 69, 348–355. (doi:10.1016/j.automatica.2016.03.007)
- 12. Pralits J, Luchini P. 2010. Riccati-less optimal control of bluff-body wakes. In Proc. 7th IUTAM Symp. on Laminar-Turbulent Transition (eds P Schlatter, D Henningson), Stockholm, Sweden. Springer.
- 13. Bewley T, Luchini P, Pralits J. 2016. Methods for solution of large optimal control problems that bypass open-loop model reduction. Meccanica 51, 2997–3014. (doi:10.1007/s11012-016-0547-3)
- 14. Semeraro O, Pralits J, Rowley C, Henningson D. 2013. Riccati-less approach for optimal control and estimation: an application to two-dimensional boundary layers. J. Fluid Mech. 731, 394–417. (doi:10.1017/jfm.2013.352)
- 15. Semeraro O, Pralits JO. 2018. Full-order optimal compensators for flow control: the multiple inputs case. Theor. Comput. Fluid Dyn. 32, 285–305. (doi:10.1007/s00162-018-0454-4)
- 16. Mårtensson K, Rantzer A. 2012. A scalable method for continuous-time distributed control synthesis. In 2012 American Control Conf. (ACC), Montreal, Canada, 27–29 June 2012, pp. 6308–6313. (doi:10.1109/ACC.2012.6314762)
- 17. Mårtensson K. 2012. Gradient methods for large-scale and distributed linear quadratic control. PhD thesis, Department of Automatic Control, Lund Institute of Technology, Lund University.
- 18. Wang Q, Hu R, Blonigan P. 2014. Least squares shadowing sensitivity analysis of chaotic limit cycle oscillations. J. Comput. Phys. 267, 210–224. (doi:10.1016/j.jcp.2014.03.002)
- 19. Lasagna D. 2017. Sensitivity analysis of chaotic systems using unstable periodic orbits. SIAM J. Appl. Dyn. Syst. 17, 547–580. (doi:10.1137/17M114354X)
- 20. Blonigan PJ, Wang Q. 2018. Multiple shooting shadowing for sensitivity analysis of chaotic dynamical systems. J. Comput. Phys. 354, 447–475. (doi:10.1016/j.jcp.2017.10.032)
- 21. Ni A, Wang Q. 2017. Sensitivity analysis on chaotic dynamical systems by non-intrusive least squares shadowing (NILSS). J. Comput. Phys. 347, 56–77. (doi:10.1016/j.jcp.2017.06.033)
- 22. Shawki K, Papadakis G. 2019. A preconditioned multiple shooting shadowing algorithm for the sensitivity analysis of chaotic systems. J. Comput. Phys. 398, 108861. (doi:10.1016/j.jcp.2019.108861)
- 23. Blonigan PJ. 2017. Adjoint sensitivity analysis of chaotic dynamical systems with non-intrusive least squares shadowing. J. Comput. Phys. 348, 803–826. (doi:10.1016/j.jcp.2017.08.002)
- 24. Blonigan PJ, Wang Q. 2014. Least squares shadowing sensitivity analysis of a modified Kuramoto–Sivashinsky equation. Chaos Solitons Fractals 64, 16–25. (doi:10.1016/j.chaos.2014.03.005)
- 25. Günther S, Gauger NR, Wang Q. 2017. A framework for simultaneous aerodynamic design optimization in the presence of chaos. J. Comput. Phys. 328, 387–398. (doi:10.1016/j.jcp.2016.10.043)
- 26. Cagliari LV, Mishra S, Hicken JE. 2019. Plant and controller optimization in the context of chaotic dynamical systems. In AIAA Scitech 2019 Forum, 0167, pp. 1–21.
- 27. Armaou A, Christofides PD. 2000. Feedback control of the Kuramoto–Sivashinsky equation. Physica D 137, 49–61. (doi:10.1016/S0167-2789(99)00175-X)
- 28. Christofides PD, Armaou A. 2000. Global stabilization of the Kuramoto–Sivashinsky equation via distributed output feedback control. Syst. Control Lett. 39, 283–294. (doi:10.1016/S0167-6911(99)00108-5)
- 29. Gomes SN, Pradas M, Kalliadasis S, Papageorgiou DT, Pavliotis GA. 2015. Controlling spatiotemporal chaos in active dissipative-dispersive nonlinear systems. Phys. Rev. E 92, 022912. (doi:10.1103/PhysRevE.92.022912)
- 30. Gomes SN, Papageorgiou DT, Pavliotis GA. 2017. Stabilizing non-trivial solutions of the generalized Kuramoto–Sivashinsky equation using feedback and optimal control. IMA J. Appl. Math. 82, 158–194. (doi:10.1093/imamat/hxw011)
- 31. Bucci MA, Semeraro O, Allauzen A, Wisniewski G, Cordier L, Mathelin L. 2019. Control of chaotic systems by deep reinforcement learning. Proc. R. Soc. A 475, 20190351. (doi:10.1098/rspa.2019.0351)
- 32. Astrom KJ. 2006. Introduction to stochastic control theory. New York, NY: Dover.
- 33. Brent RP. 1973. Algorithms for minimization without derivatives. Englewood Cliffs, NJ: Prentice-Hall.
- 34. Peres PLD, Geromel JC. 1994. An alternate numerical solution to the linear quadratic problem. IEEE Trans. Autom. Control 39, 198–202. (doi:10.1109/9.273368)
- 35. Saad Y. 2003. Iterative methods for sparse linear systems, 2nd edn. Philadelphia, PA: Society for Industrial and Applied Mathematics.
- 36. Baglama J, Reichel L. 2005. Augmented implicitly restarted Lanczos bidiagonalization methods. SIAM J. Sci. Comput. 27, 19–42. (doi:10.1137/04060593X)
- 37. Holmes P, Lumley JL, Berkooz G, Rowley CW. 2012. Turbulence, coherent structures, dynamical systems and symmetry, 2nd edn. Cambridge, UK: Cambridge University Press.
- 38. Hogberg M, Bewley TR, Henningson DS. 2003. Linear feedback control and estimation of transition in plane channel flow. J. Fluid Mech. 481, 149–175. (doi:10.1017/S0022112003003823)