Abstract
Traditionally, before the optimal power flow problem considering uncertainty (OPF-U) is solved, the predicted values of uncertain parameters such as wind power are derived from data using statistical approaches or machine learning. Based on the predicted parameters, the solution to the OPF-U problem is then obtained with a prescriptive analytics technique such as robust optimization (RO). However, it remains unclear how the prediction error in predictive analytics affects the solution of the OPF-U problem in prescriptive analytics. We propose an adjustable framework combining machine learning and RO for the OPF-U problem. The k-nearest neighbor (KNN) method is applied to obtain k samples around the predicted value from sufficient historical data. The optimization results of a minimum volume ellipsoid set containing the k samples are then used to construct the KMV set. A robust fluctuation region with an adjustable budget level is obtained from the KMV set by a two-term exponential formula and can be embedded into a two-stage RO model. Computational experiments on test cases with different uncertainty scales show that the robustness and adjustability of the proposed fluctuation region are better than those of the state-of-the-art box and ellipsoidal sets, and that the solution of the proposed two-stage RO model is more economical than that of the state-of-the-art RO model. The out-of-sample simulation also demonstrates that, compared with separating predictive and prescriptive analytics, the proposed adjustable Predictive&Prescriptive method reduces the computational burden as the scale of the system increases.
Keywords: Machine learning, Two-stage robust optimization, Uncertain fluctuation region, Optimal power flow
1. Introduction
1.1. Background
The strategy for reaching the carbon peak and carbon neutrality is to reduce carbon emissions while increasing the share of renewable energy [1]. Energy dispatch in power networks will be considerably affected by the uncertainty of growing renewable energy output, so research has turned to optimization problems with uncertainty. Analytics for such problems falls into two main categories: prediction and prescription. Prediction uses predictive techniques to describe what the uncertainty will be. Prescription uses optimization techniques to determine how to act in order to achieve feasible results. Most studies predict the uncertainty first and then optimize the solution in light of the prediction. In the prediction process, Zhang et al. proposed a short-term photovoltaic power prediction method combining k-means and an improved support vector machine, which yields photovoltaic power predictions with high accuracy [2]. In the prescription process, Zhu et al. assumed that the wind power prediction error follows an independent normal distribution, solved the probabilistic power flow, and analyzed the consumption capacity of the distribution network based on the predicted renewable power [3]. In such studies, statistics or machine learning is used to analyze large amounts of data and obtain the uncertain parameters for the next stage, and optimization methods are then used to obtain decisions based on the predicted values. However, separating prediction and optimization may result in a decision that is not optimal, even though forecasting techniques are constantly improving and ultra-short-term interval prediction is becoming more and more trustworthy [4]. Ben-Tal et al. indicated that a small prediction error can significantly worsen the objective function value [5]. In particular, Dragoon et al. illustrated the effect of the wind power prediction error on incremental reserve requirements and imbalance costs [6]. Therefore, it is worth studying how to integrate the prediction and prescription processes effectively and how to choose appropriate predictive and prescriptive methods to reduce the influence of prediction error on the feasible results of uncertainty problems.
Predictive methods have transitioned from model-driven to data-driven, from machine learning algorithms to deep learning algorithms [7], and from the original regression models to deep neural networks. In the future, large amounts of data will be deeply integrated to improve prediction accuracy. Because research on predictive methods is already well developed, this paper focuses on the optimization of uncertainty problems.
When analyzing uncertainty problems, it is necessary to consider the influence of the uncertain parameters. According to when the uncertainty is analyzed, optimization methods for uncertainty problems can be divided into post-analysis [8] and pre-analysis [9] methods. Among them, stochastic optimization (SO) and robust optimization (RO) are popular. SO needs a distribution model of the uncertain parameters in advance, uses the distribution to generate a large number of samples, and solves the problem by replacing the uncertain parameters with expected values [10]. However, as the number of uncertain parameters grows, the number of SO scenarios increases exponentially, and a large number of scenarios makes the optimization problem harder to solve. In comparison, RO uses uncertainty sets to describe the uncertain parameters [11]. The choice of the uncertainty region is critical, since an overly large region makes the solution more conservative and consumes more computational resources. Therefore, it is important to integrate prediction with prescription. The solution's performance is also affected by the selection of prediction and optimization techniques.
1.2. Literature review
Among predictive methods, forecasting techniques based on the probability distribution function of renewable energy outputs [12], neural networks [13], and nonparametric regressions [14] are currently common. Nonparametric regressions do not assume a probability distribution function, which makes modeling easy. Thus, a nonparametric regression forecasting technique, k-nearest neighbor (KNN), is applied to form a fluctuation region that includes the possible predicted values.
Among prescriptive methods, RO is a popular optimization approach under uncertainty. RO generally utilizes an ambiguity set to obtain solutions based on worst-case analysis [15]. However, in an RO model, the optimal solution may be overly conservative when the set is poorly chosen. In comparison, the solution of an RO model with an ellipsoid set is less conservative than with box sets [16] and polyhedral sets [17] and preserves the correlation of uncertainties [18]. To improve the standard ellipsoid set, an ellipsoidal Newton's iteration method was proposed in Ref. [19] to determine an uncertainty boundary with satisfactory accuracy and efficiency, but the boundary for large uncertainty remains unclear. Alternatively, Kuryatnikova et al. added correlations to the ellipsoidal uncertainty set to narrow its boundaries, although the approach is limited to small cases [20]. Therefore, existing research on ellipsoidal uncertainty sets still does not adequately cover large fluctuations or application to large cases.
Nowadays, the common way to combine the predictive and prescriptive methods is to conduct the prediction process and then the prescription process (PtP). In PtP, a large amount of observational data is analyzed, and predicted values satisfying the accuracy requirements are generated based on traditional statistical indicators such as the mean absolute error. The predicted value is then used as input to make optimal decisions for uncertainty problems. Generally, observed variables greatly influence the decision results, but they are not widely integrated into the modeling of the prescription process. On the other hand, artificial intelligence methods largely focus on supervised learning and build models that map auxiliary variables to predicted values without solving for optimal decisions under uncertainty. Among the studies that improve PtP, the core of the Smart Predictive-then-Prescriptive (S_PtP) framework is to measure the error between the predicted value and the actual value with a loss function, with emphasis on predicting the objective function coefficients of linear programming (LP) or mixed integer linear programming problems. Both methods still analyze prediction and prescription independently, which usually ignores the abundant observational data and leads to inadequate decisions. To improve the way the predictive method is combined with the prescriptive method, Bertsimas et al. proposed a Predictive&Prescriptive framework [21]. The framework integrates the prediction and prescription processes, makes full use of the observed data, and obtains optimal results. It draws on artificial intelligence, operations research, and management to solve operation and management problems, and its core lies in using data to describe the optimal decision of the problem. However, this framework has rarely been applied to uncertainty problems in power systems.
Considering two advantages of the framework combining machine learning and robust optimization, namely that it approximates the true solution given an infinite number of samples and that it is tractable, this paper proposes a Predictive&Prescriptive method to solve the optimal power flow problem considering uncertainty (OPF-U) in a two-stage RO model. Compared with the state-of-the-art literature, the proposed method effectively combines predictive and prescriptive analyses when analyzing the OPF-U problem. The proposed KMV set covers the possible conditions of the prediction process better than the state-of-the-art uncertainty sets.
1.3. Contribution and paper organization
In the two-stage RO model, we construct a minimum volume ellipsoidal set with KNN, called the KMV set, to effectively combine the predictive and prescriptive methods. To improve the predictive method, the prediction error is expressed from big data without complex modeling. To improve the prescriptive method, the robust fluctuation region derived from the KMV set depends on an adjustable budget level obtained by a two-term exponential formula. Moreover, the proposed fluctuation region is adjustable and flexible and outperforms the compared uncertainty sets. In detail, the contributions are as follows.
1) The framework combining machine learning and optimization techniques considers prediction error and decreases the inadequacy of the predictive technique with a large volume of data.
2) Containing all samples in k-nearest neighbors, the robust fluctuation region derived from the KMV set is determined by an optimal adjustive argument and is more robust than the box set and two kinds of ellipsoid sets. The optimal adjustive argument is calculated by a two-term exponential formula, and the coefficient of the exponential formula can be optimized with the off-the-shelf solvers.
3) The interval of the proposed fluctuation region increases linearly with an adjustive argument, which is easier to adjust with budget level than the ellipsoid sets. As the adjustive argument increases, the fluctuation region becomes more extensive, but the compactness is still better than the box set. These advantages can help to select a suitable fluctuation region in engineering practice.
The remainder of this paper is organized as follows. Section 2 introduces the integration method for solving OPF-U, formulates the KMV set, proposes an exponential formula to determine the optimal adjustive argument representing the budget level of the uncertainty set, and models the OPF-U problem. Section 3 introduces the solution methodology for the KMV set and the two-stage RO problem. Section 4 presents simulation results and the superiority of the uncertain fluctuation region in the KMV set. Finally, the conclusions are given in Section 5.
2. Problem formulation
2.1. Two-stage optimal power flow framework
OPF is a fundamental problem in power system analysis, with applications such as market clearing, network optimization, voltage control, and generation dispatch [22]. However, the prediction error and shorter-time-scale fluctuations of uncertainty threaten the real-time generation-load balance and incur additional transaction costs in the electricity grid, especially in the electricity market. In this paper, a method combining predictive and prescriptive analytics is proposed for solving OPF-U so that the optimization is less affected by the uncertainty prediction error. In the Predictive&Prescriptive analytics process, KNN is used to obtain valuable samples close to the predicted value from vast historical data without complex modeling. These samples are used to form the KMV uncertainty set. The robust fluctuation region is then obtained from the KMV set by a mathematical formula. The robust performance of the fluctuation region is verified by comparison with other uncertainty sets. Finally, the proposed fluctuation region is embedded in the OPF-U formulation as the uncertainty set to obtain optimal solutions and reduce the conservativeness of the solution.
2.2. The KMV set
Traditionally, KNN can obtain the predicted value by averaging the closest k training data (all samples in KNN) as in Eq. (1).
| (1) |
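As a minimal sketch of the averaging in Eq. (1), the following Python function returns the mean target of the k historical samples closest to a query, together with the neighbors themselves. The function name, the scalar feature, the absolute-difference distance, and the data are illustrative assumptions rather than the implementation used in the experiments.

```python
def knn_predict(history, query, k):
    """Average the targets of the k historical samples closest to `query`.

    history: list of (feature, target) pairs with scalar features (assumed).
    Returns the averaged prediction and the k neighbouring targets.
    """
    ranked = sorted(history, key=lambda pair: abs(pair[0] - query))
    neighbours = [target for _, target in ranked[:k]]
    prediction = sum(neighbours) / k
    return prediction, neighbours


# Hypothetical normalized history: (feature, wind power) pairs.
history = [(0.10, 0.30), (0.20, 0.40), (0.25, 0.50), (0.60, 0.90), (0.90, 0.20)]
prediction, samples = knn_predict(history, query=0.22, k=3)
print(prediction, samples)
```

The returned neighbors, not only their average, are what the subsequent ellipsoid construction operates on.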
Therefore, we apply a minimum volume ellipsoid set [23] to enclose all samples in KNN while considering the prediction error. The minimum volume ellipsoid set is as follows:
| (2) |
where the vector contains samples in KNN, and are optimal solutions of minimum volume ellipsoid set optimization model (MVE optimization model):
| (3) |
The MVE optimization model can minimize the volume of the ellipsoid set. Once and are obtained, and are calculated:
| (4) |
| (5) |
Then and from MVE optimization model are used in Eq. (6) to construct the proposed uncertainty set, namely KMV set:
| (6) |
is a vector full of adjustive argument W, where . The lower boundary of the uncertainty region is the average of , and the upper boundary is the average of , which increases as W increases. is a vector full of the diagonal elements of .
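For orientation, a widely used convex formulation of the minimum volume enclosing ellipsoid is sketched below. It is the standard textbook form and is only assumed to correspond to Eqs. (2)–(5); the symbols $A$, $b$, $u_i$, $c$, and $Q$ are introduced here for illustration and are not necessarily the notation of those equations.

```latex
\begin{aligned}
\mathcal{E} &= \{\, u \in \mathbb{R}^n : \lVert A u + b \rVert_2 \le 1 \,\}, \qquad A \succ 0, \\
(A^\star, b^\star) &= \arg\min_{A \succ 0,\; b} \; \log\det A^{-1}
\quad \text{s.t.} \quad \lVert A u_i + b \rVert_2 \le 1, \;\; i = 1,\dots,k, \\
c &= -(A^\star)^{-1} b^\star, \qquad Q = (A^\star)^{-2}, \qquad
\mathcal{E} = \{\, u : (u - c)^{\top} Q^{-1} (u - c) \le 1 \,\}.
\end{aligned}
```

Here $u_1,\dots,u_k$ stand for the samples in KNN, $c$ is the center of the ellipsoid, and $Q$ is its shape matrix; minimizing $\log\det A^{-1}$ minimizes the volume because the volume is proportional to $\det A^{-1}$.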
2.3. Robust fluctuation region selection
The appropriate fluctuation region can be found in the uncertainty set by selecting the appropriate adjustive argument W. The selection of the optimal argument follows Lemma 1. The detailed derivation of Lemma 1 can be found in Appendix A.1.
Lemma 1
The suitable W is calculated by a two-term exponential formula when k is available with KNN:
| (7) |
where the a-d coefficients are restricted by the MVE optimization model's optimization results. The following coefficient optimization model can be used to find the optimal W to form a robust fluctuation region that includes all samples in KNN.
| (8) |
| (9) |
| (9a) |
| (9b) |
| (9c) |
| (9d) |
| (9e) |
where and are the j-th element of and . represents the maximum of all samples in KNN. Eqs. (9a)–(9d) are constraints on the upper and lower boundaries of the a-d coefficient. Eq. (9e) guarantees the uncertainty region to enclose all samples in KNN.
The coefficient optimization model can be transformed into a bilinear model, and the bilinear terms can be replaced by linear terms by applying duality theory [24]. The coefficient optimization model can then be solved using off-the-shelf MIP solvers, e.g., Gurobi. The transformation process is shown in Appendix A.2. The solutions (a-d coefficients) of the coefficient optimization model are the key to calculating W. Finally, once W is calculated, the robust fluctuation region can be obtained from Eq. (6).
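A two-term exponential in k with four coefficients is conventionally written as follows. This generic form is an assumption made for illustration and may differ in detail from Eq. (7).

```latex
W(k) = a\, e^{b k} + c\, e^{d k}
```

Under this reading, the coefficient optimization model selects $a$, $b$, $c$, and $d$ within their bounds so that the region produced by $W(k)$ still encloses every sample in KNN.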
2.4. OPF-U formulation
The OPF-U problem can be solved in two stages. The first stage optimizes the generation of units before the uncertain generation is realized, and the second stage optimizes the adjustment of generators after a specific realization of uncertainty is revealed. Given the intrinsic difficulty posed by the nonconvex OPF constraints in both stages, a convex relaxation method, second-order cone programming (SOCP), is used to solve the two-stage OPF problem approximately in polynomial time [25].
1) The first-stage optimization model
a) Objective function
The first stage's objective function is the fuel cost of all generators, as shown below.
| (10) |
b) Constraints
The constraints mainly comprise the power balance constraints (11), voltage magnitude limits (12), generator output bounds (13)–(14), and transmission line power flow limits (15).
| (11) |
| (12) |
| (13) |
| (14) |
| (15) |
where c and s are required variables satisfying the following relation constraints:
| (16) |
2) The second-stage optimization model
a) Objective function
The second-stage aims to minimize the adjustment of generators. The re-dispatch cost and penalty for prediction error are added to the objective function:
| (17) |
| (18) |
| (19) |
b) Constraints
The network constraints (12)–(15) must still hold.
Generation-load balance constraints:
| (20) |
Generation constraints:
| (21) |
| (22) |
Eq. (21) constrains the output of unit i in generator limits after considering re-dispatching . is wind power generation and is restricted by Eq. (22).
3. Solution methodology
In this paper, the OPF-U problem in the two-stage RO model is a min-max-min problem. The first stage minimizes the operation cost under the predicted value and yields the first-stage solution. The second stage minimizes the adjustment cost under the worst case of uncertainty and yields the second-stage solution. As shown in Fig. 1, we use a cluster analysis method [26] to obtain various uncertainty scenarios. A KMV set is then constructed with KNN and the MVE optimization model under one scenario. Different values of W yield different uncertain fluctuation regions from the KMV ellipsoid set. The robust fluctuation region used in the second stage, formed with machine learning and optimization techniques, is finalized by a mathematical formula. Finally, the two-stage RO model with the fluctuation region is constructed and solved to obtain the results of OPF-U.
Fig. 1.
1) The proposed fluctuation region
When the predicted value in the first-stage is available, the root mean square error (RMSE) with cross-validation is applied to obtain a heuristically optimal number k of the nearest neighbors. When RMSE is the lowest, k samples around the predicted value are available. Then the optimization results from MVE optimization model and an adjustive argument W are integrated to form the proposed fluctuation region. The budget level of fluctuation region depends on the selection of W. Then, the optimal W is obtained with Eq. (7), in which coefficients are obtained from the coefficient optimization model with an off-the-shelf solver. Finally, the robust fluctuation region is formed by Eq. (6). Other fluctuation regions also can be acquired by setting different W.
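The selection of k by cross-validated RMSE described above can be sketched as follows. Leave-one-out cross-validation, scalar features, and the synthetic data are assumptions made for illustration; the text only states that RMSE with cross-validation is used.

```python
import math


def knn_mean(train, query, k):
    """Mean target of the k training samples nearest to `query`."""
    ranked = sorted(train, key=lambda pair: abs(pair[0] - query))
    return sum(target for _, target in ranked[:k]) / k


def loo_rmse(data, k):
    """Leave-one-out cross-validated RMSE of the k-nearest-neighbour average."""
    squared_error = 0.0
    for i, (feature, target) in enumerate(data):
        train = data[:i] + data[i + 1:]  # hold out sample i
        squared_error += (knn_mean(train, feature, k) - target) ** 2
    return math.sqrt(squared_error / len(data))


def select_k(data, candidates):
    """Return the candidate k with the lowest cross-validated RMSE."""
    return min(candidates, key=lambda k: loo_rmse(data, k))


# Hypothetical data lying on a straight line.
data = [(x, float(x)) for x in range(10)]
best_k = select_k(data, candidates=[1, 2, 3, 4])
print(best_k, loo_rmse(data, best_k))
```

Once k is fixed this way, the k nearest samples feed the MVE optimization model.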
2) Two-stage RO model for OPF-U problem
The proposed fluctuation region gained from the KMV set is embedded into the two-stage RO model as an uncertainty set. The two-stage OPF-U problem can be expressed as follows:
| (23) |
| (24) |
| (25) |
| (26) |
The first-stage decision variable and the second-stage decision variable . Eq. (24) denotes all the first-stage constraints with continuous variables, Eqs. (11)–(16). Eq. (25) denotes the second-stage constraints with continuous variables, Eq. (21). Eq. (26) contains the constraints with continuous variables of both stages, Eqs. (20) and (22).
The two-stage model can be decomposed by Column-and-Constraint Generation (C&CG) algorithm [27]. The C&CG algorithm decomposes the two-stage robust problem as the master problem and subproblem. The subproblem can be converted to a single-level problem with Karush–Kuhn–Tucker conditions, in which the internal constraints are linear, and uncertainty is under a separate level [28]. The detailed C&CG algorithm can be found in Appendix B.
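The min-max-min structure that C&CG decomposes can be written generically as below. The symbols $x$, $y$, $u$, $f$, $g$, $\mathcal{X}$, $\mathcal{Y}$, and $\mathcal{U}(W)$ are placeholders chosen for this sketch and are not taken from Eqs. (23)–(26).

```latex
\min_{x \in \mathcal{X}} \; f(x) \;+\; \max_{u \in \mathcal{U}(W)} \; \min_{y \in \mathcal{Y}(x, u)} \; g(y)
```

Here $x$ stands for the first-stage dispatch, $u$ for the uncertain wind power restricted to the fluctuation region $\mathcal{U}(W)$, and $y$ for the second-stage adjustment. The master problem optimizes $x$ against the scenarios of $u$ found so far, and the subproblem searches for a new worst-case $u$ for the current $x$.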
The application flowchart of the proposed method is shown in Fig. 2. Day-ahead and real-time optimization results are obtained from the left and right processes, respectively. When the method is applied in real time, the relationship between the predicted and actual values should be checked. If the actual wind power generation is larger than the predicted wind power generation, the operator can set W larger than the W obtained from the coefficient optimization model. Solving the two-stage RO model with the real-time W then yields the real-time solution.
Fig. 2.
The flowchart application of the proposed method.
4. Simulation results
We carry out the case studies using JuliaPro [29] 1.5.4 on a standard personal computer with an Intel Core i7-10700F CPU running at 2.90 GHz and 16 GB of RAM. NearestNeighbors.jl is used to obtain the k samples around the predicted value, and JuMP is used to model the OPF-U problem. The MVE optimization model is solved with MosekTools.jl, and the two-stage RO model is solved with Gurobi [30]. To save space, the source code of our method is made available at https://github.com/zlq178/ZLQ.git.
4.1. Case set up
To illustrate that the fluctuation region is suitable for different levels of uncertainty fluctuation, we simulate three cases. Case 1 includes 3000 random samples with a standard deviation of 0.2, indicating a normal fluctuation. Case 2 includes 3000 random samples with a standard deviation of 0.6, indicating a high fluctuation. In the samples of Case 1 and Case 2, the predicted output of the wind turbines is 64 MW, and the electricity load is 75 MW. Other detailed parameters of the two-stage RO model are taken from Ref. [31]. In Case 3, we perform an out-of-sample analysis with 3041 historical wind data points from Guangzhou and illustrate the superiority of the Predictive&Prescriptive analytics method on the IEEE 14-, 30-, 118-, and 1047-bus systems [32]. To further assess the proposed uncertainty set in terms of usability, conservativeness, and robustness, Case 1 and Case 2 compare it with state-of-the-art uncertainty sets on the fluctuation region, the total operation cost, the average adjustment cost (AAC), and the percentage of the 3000 samples in which the solution under an uncertainty set does not violate the transmission line limit, denoted as “R”. The state-of-the-art uncertainty sets, named Box set, Ellipsoid set1, and Ellipsoid set2, are introduced in Appendix C. In the comparison, invalid regions that do not cover all samples in KNN are highlighted in italics, and the optimal fluctuation region of each uncertainty set is highlighted in bold. Finally, Case 3 compares the Predictive&Prescriptive analytics method with the traditional separation method, which treats predictive and prescriptive analytics separately, in terms of operation cost and computation time, and analyzes the contribution of different fluctuation regions to engineering practice.
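The two robustness indicators defined above, AAC and “R”, can be computed from per-sample results as in the following sketch. The function names, the single limit shared by all lines, and the data are hypothetical and only illustrate the definitions.

```python
def average_adjustment_cost(adjustment_costs):
    """AAC: mean second-stage adjustment cost over all test samples."""
    return sum(adjustment_costs) / len(adjustment_costs)


def safe_percentage(line_flows_per_sample, line_limit):
    """R: percentage of samples in which no line flow exceeds the limit."""
    safe = sum(
        1
        for flows in line_flows_per_sample
        if all(abs(flow) <= line_limit for flow in flows)
    )
    return 100.0 * safe / len(line_flows_per_sample)


# Hypothetical results: four samples, two lines, limit of 1.0 (assumed per unit).
flows = [[0.5, -0.9], [1.2, 0.3], [0.7, 0.8], [-1.0, 0.1]]
costs = [1.0, 2.5, 0.5, 0.0]
aac = average_adjustment_cost(costs)
r = safe_percentage(flows, line_limit=1.0)
print(aac, r)
```

In this toy data the second sample overloads the first line, so three of the four samples count as safe.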
We normalize the wind power data of two wind turbines and then use K-means++ with the Elkan algorithm (EK-means++) to cluster the wind power scenarios. The two-dimensional scatter plot shows the identified wind power scenarios. Clustering the normalized wind power data makes it possible to identify the possible scenarios. As shown in Fig. 3, the three clusters obtained with the cluster analysis method indicate three scenarios in the wind data of Case 1. Scenario 1 represents the situation in which the generation of one wind turbine is large, scenario 2 the situation in which the generation of both wind turbines is generally small, and scenario 3 the situation in which the generation of both wind turbines is generally large. The three cluster centers are (0.64, 0.47), (0.46, 0.67), and (0.39, 0.44), and the performance of the proposed uncertainty set is evaluated in scenario 2, whose cluster center is (0.46, 0.67).
Fig. 3.
Clustering result.
4.2. Evaluation of the performances under case 1
As shown in Fig. 4, the RMSE indicates that the optimal number k is 5 in Case 1.
Fig. 4.
RMSE following different k in case 1.
The samples in the k-nearest neighbors are: [0.63834,0.63819,0.66186,0.63622,0.64444]. The optimal W is 0.5.
1) Comparison of fluctuation region
In Fig. 5, the regional interval (upper-lower boundary length) of two ellipsoid sets changes irregularly as W increases, which is not convenient for selecting suitable lower and upper budgets. The interval regions of the Box set and KMV set increase linearly with W. Based on such an advantage, we can set W to determine the budget of uncertainty fluctuations quickly.
Fig. 5.
The lower and upper budgets of different uncertainty sets.
Table 1 shows the uncertain fluctuation regions for different W. When W = 0.2 and W = 0.5, Ellipsoid set2 is overly tight and does not contain all samples in the k-nearest neighbors. When W = 0.2, 0.2, 0.4, and 0.5, the fluctuation regions of the Box set, Ellipsoid set1, Ellipsoid set2, and KMV set, respectively, are appropriately tight and contain all samples in the k-nearest neighbors. However, when W = 0.5, the upper boundary of the KMV set's fluctuation region is closer to the samples in the k-nearest neighbors than that of the Box set.
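The KMV columns of Table 1 can be checked directly against the KNN samples of Case 1. The short script below, written for illustration only, uses the tabulated bounds to compute the interval width per unit W and the smallest tabulated W whose region encloses all five samples; the helper name `covers` is hypothetical.

```python
# KMV bounds from Table 1 (Case 1), keyed by the adjustive argument W.
kmv_bounds = {
    0.0: (0.442, 0.442), 0.2: (0.442, 0.531), 0.4: (0.442, 0.619),
    0.5: (0.442, 0.663), 0.6: (0.442, 0.707), 0.8: (0.442, 0.796),
    1.0: (0.442, 0.884),
}
# Samples in the k-nearest neighbours reported for Case 1.
knn_samples = [0.63834, 0.63819, 0.66186, 0.63622, 0.64444]


def covers(bounds, samples):
    """True if the interval [lower, upper] encloses every sample."""
    lower, upper = bounds
    return lower <= min(samples) and max(samples) <= upper


# Interval width divided by W; a near-constant ratio indicates linear growth.
slopes = [(hi - lo) / w for w, (lo, hi) in kmv_bounds.items() if w > 0]
smallest_valid_w = min(w for w, b in kmv_bounds.items() if covers(b, knn_samples))
print(slopes, smallest_valid_w)
```

The ratios stay close to 0.44 for every tabulated W, and the smallest tabulated W that covers all samples is 0.5, in line with the optimal W reported for Case 1.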
Table 1.
| Adjustive argument | Box set lower | Box set upper | Ellipsoid set1 lower | Ellipsoid set1 upper | Ellipsoid set2 lower | Ellipsoid set2 upper | KMV set lower | KMV set upper |
|---|---|---|---|---|---|---|---|---|
| W = 0 | 0.640 | 0.640 | 0.640 | 0.640 | 0.640 | 0.640 | 0.442 | 0.442 |
| W = 0.2 | 0.602 | 0.678 | 0.588 | 0.692 | 0.621 | 0.659 | 0.442 | 0.531 |
| W = 0.4 | 0.563 | 0.717 | 0.584 | 0.696 | 0.602 | 0.678 | 0.442 | 0.619 |
| W = 0.5 | 0.544 | 0.736 | 0.405 | 0.875 | 0.639 | 0.641 | 0.442 | 0.663 |
| W = 0.6 | 0.525 | 0.755 | 0.551 | 0.729 | 0.561 | 0.719 | 0.442 | 0.707 |
| W = 0.8 | 0.486 | 0.794 | 0.345 | 0.955 | 0.541 | 0.739 | 0.442 | 0.796 |
| W = 1 | 0.448 | 0.832 | 0.547 | 0.733 | 0.393 | 0.887 | 0.442 | 0.884 |
A phased conclusion can be drawn that the KMV set can possess a more regular and tight fluctuation region, improving the efficiency of selecting the uncertainty fluctuations’ budget and usability, respectively.
2) Comparison of total operation cost
Table 2 shows the total operation cost under different uncertainty sets with different W, which reveals the conservativeness of the solution. Among the regions that contain all samples in the k-nearest neighbors, the cost under the KMV set with W = 0.5 is the lowest. The total operation cost under the state-of-the-art uncertainty sets is lowest when W = 0.2, 0.2, and 0.4, respectively, and in these regions the wind power generation is covered by the samples in the k-nearest neighbors. In most of the uncertainty regions that contain the samples in the k-nearest neighbors, the two-stage OPF-U with the KMV set is more economical than with the other sets. When each set takes its promising region, the total operation costs under the Box set, Ellipsoid set1, Ellipsoid set2, and KMV set are 6.750, 6.806, 6.748, and 6.702, respectively, so the cost under the proposed set (W = 0.5) is 1.53% lower than the worst cost among the compared sets (6.806). Furthermore, the computation time for an optimal solution ranges from 1.5 s to 1.63 s for all uncertainty sets.
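The 1.53% figure follows from the four promising-region costs quoted above. The snippet below is a small arithmetic check; the dictionary and variable names are chosen for illustration.

```python
# Total operation cost of each set in its promising region (Case 1).
promising_cost = {
    "Box set": 6.750,
    "Ellipsoid set1": 6.806,
    "Ellipsoid set2": 6.748,
    "KMV set": 6.702,
}

# Worst (highest) cost among the compared state-of-the-art sets.
worst_cost = max(v for name, v in promising_cost.items() if name != "KMV set")
# Saving of the KMV set relative to that worst cost, in percent.
saving_pct = 100 * (worst_cost - promising_cost["KMV set"]) / worst_cost
print(round(saving_pct, 2))
```

The saving is measured relative to the worst compared cost, which here belongs to Ellipsoid set1.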
Table 2.
| Adjustive argument | Box set | Ellipsoid set1 | Ellipsoid set2 | KMV set |
|---|---|---|---|---|
| W = 0 | 6.350 | 6.350 | 6.350 | 27.011 |
| W = 0.2 | 6.750 | 6.806 | 6.676 | 17.924 |
| W = 0.4 | 7.370 | 6.821 | 6.748 | 8.77 |
| W = 0.5 | 8.346 | 13.838 | 6.529 | 6.702 |
| W = 0.6 | 8.932 | 7.847 | 7.460 | 6.970 |
| W = 0.8 | 9.994 | 14.665 | 8.470 | 10.087 |
| W = 1 | 12.240 | 8.211 | 14.272 | 14.287 |
3) Comparison of AAC and “R”
We continue to use the promising region of each uncertainty set, i.e., the region that achieves the lowest total operation cost while including all samples in the k-nearest neighbors, to analyze robustness. The AAC under the KMV set in Table 3 is the lowest. Because the solution under the KMV set is more robust than under the other sets, the further adjustment cost after the uncertainty is revealed is lower. In Case 1, the “R” of the deterministic OPF is 92.6%, which is lower than that of the proposed two-stage OPF-U because the deterministic OPF ignores the margin for uncertainty. Moreover, the “R” under the state-of-the-art uncertainty sets is less than 100%, meaning that the solutions of the state-of-the-art models violate the transmission line limit in some samples. The KMV set attains the highest “R” because it offers significant advantages in guaranteeing line safety. Therefore, the KMV set with W = 0.5 can guarantee the robustness of the two-stage RO model and reduce adjustment costs.
Table 3.
AAC with different W in Case 1.
| Uncertainty set | AAC | R |
|---|---|---|
| Box set | 1.273 | 93.6% |
| Ellipsoid set1 | 1.742 | 93.2% |
| Ellipsoid set2 | 1.273 | 95.00% |
| KMV set | 0.771 | 100.00% |
4.3. Evaluation of the performances under case 2
This experiment is based on the simulation data with higher fluctuation. As shown in Fig. 6, the RMSE indicates that the optimal number k is 9 in Case 2.
Fig. 6.
RMSE following different k in case 2.
The samples in the k-nearest neighbors are: [0.64002, 0.63817, 0.63794, 0.64369, 0.64528, 0.63138, 0.66887, 0.64939, 0.62576]. The optimal W is 0.4.
1) Comparison of fluctuation region
Table 4 shows the fluctuation regions of uncertainty for different W. When W = 0.2, 0.4, 0.4, and 0.4, the fluctuation regions of the Box set, Ellipsoid set1, Ellipsoid set2, and KMV set, respectively, are appropriately tight and contain all samples in the k-nearest neighbors. However, the upper boundary of Ellipsoid set2 is smaller than the maximum of the samples in the k-nearest neighbors when W = 0.2 and 0.6, so its fluctuation regions fail for these values. The fluctuation region of Ellipsoid set1 is not as robust as the proposed one when W = 0.2 and 0.5. When W ≥ 0.4, all listed fluctuation regions of the KMV set remain valid and contain all samples in the k-nearest neighbors with strong robustness.
Table 4.
| Adjustive argument | Box set lower | Box set upper | Ellipsoid set1 lower | Ellipsoid set1 upper | Ellipsoid set2 lower | Ellipsoid set2 upper | KMV set lower | KMV set upper |
|---|---|---|---|---|---|---|---|---|
| W = 0 | 0.64 | 0.64 | 0.64 | 0.64 | 0.64 | 0.64 | 0.4800 | 0.4800 |
| W = 0.2 | 0.5632 | 0.7168 | 0.6130 | 0.6670 | 0.6239 | 0.6561 | 0.4800 | 0.5768 |
| W = 0.4 | 0.4864 | 0.7936 | 0.5855 | 0.6945 | 0.6004 | 0.6796 | 0.4800 | 0.6720 |
| W = 0.5 | 0.4480 | 0.8320 | 0.6399 | 0.64 | 0.5850 | 0.6950 | 0.4800 | 0.7210 |
| W = 0.6 | 0.4096 | 0.8704 | 0.4335 | 0.8465 | 0.6399 | 0.6400 | 0.4800 | 0.7691 |
| W = 0.8 | 0.3328 | 0.9472 | 0.5627 | 0.7173 | 0.5791 | 0.7009 | 0.4800 | 0.8652 |
| W = 1 | 0.2560 | 1.0240 | 0.5656 | 0.7144 | 0.5655 | 0.7144 | 0.4800 | 0.9613 |
2) Comparison of total operation cost
When W = 0.4, the cost under the KMV set is the lowest among the regions that contain all samples in the k-nearest neighbors, and the wind power generation is covered by these samples. Table 5 shows that the conservativeness of the solutions of the RO models with different uncertainty sets is ordered as Box set > Ellipsoid set1 > Ellipsoid set2 > KMV set. When each set takes its promising region, the cost under the Box set's region (7.370) is 10.30% higher than the cost under the KMV set's region (6.682). The CPU time for obtaining the optimal solutions ranges from 1.5 s to 1.6 s for all uncertainty sets, with no significant difference.
Table 5.
| Adjustive argument | Box set | Ellipsoid set1 | Ellipsoid set2 | KMV set |
|---|---|---|---|---|
| W = 0 | 10.370 | 10.370 | 10.370 | 23.069 |
| W = 0.2 | 7.370 | 6.706 | 6.664 | 13.188 |
| W = 0.4 | 9.994 | 6.814 | 6.755 | 6.682 |
| W = 0.5 | 12.240 | 6.519 | 6.816 | 7.482 |
| W = 0.6 | 13.671 | 12.783 | 9.298 | 9.241 |
| W = 0.8 | 14.432 | 7.389 | 6.839 | 13.583 |
| W = 1 | 16.675 | 7.507 | 7.275 | 16.196 |
3) Comparison of AAC and “R”
In Table 6, the “R” under the KMV set is 100%, while the values under the compared uncertainty sets are around 87.6%, which illustrates that the solution under the KMV set does not violate the limit of any transmission line. Meanwhile, the AAC under the KMV set is the lowest among all sets. Therefore, the KMV set with W = 0.4 can guarantee the robustness of the two-stage RO model and reduce the adjustment cost.
Table 6.
AAC with different W in Case 2.
| Uncertainty set | AAC | R |
|---|---|---|
| Box set | 2.573 | 87.4% |
| Ellipsoid set1 | 1.327 | 87.8% |
| Ellipsoid set2 | 1.072 | 87.6% |
| KMV set | 0.905 | 100.00% |
4.4. Analysis of optimization results with KMV set
The above comparison demonstrates that the fluctuation region derived from the KMV set is a good overall choice of uncertainty set for the two-stage RO model. In this section, we use it to demonstrate the advantage of the Predictive&Prescriptive analytics method by comparing it with the separation method. We also analyze the fluctuation regions under different W and their flexible applications. The experiments on the separation method are performed under the 7.98% mean absolute percentage error of the recent short-term prediction method in [33]. The optimization results of both methods are summarized in Table 7.
Table 7.
Optimization results in Case 3 with both methods.
| Adjustive argument | IEEE14 cost | IEEE14 time | IEEE30 cost | IEEE30 time | IEEE118 cost | IEEE118 time | IEEE1047 cost | IEEE1047 time |
|---|---|---|---|---|---|---|---|---|
| Separation | 7.370 | 1.70 | 16.518 | 1.79 | 50.611 | 12.77 | 75.3912 | 32.35 |
| W = 0.2 | 7.329 | 1.66 | 16.554 | 1.79 | 50.336 | 10.13 | 81.9205 | 31.66 |
| W = 0.35 | 7.324 | 1.67 | 16.489 | 1.75 | 50.219 | 9.23 | 73.901 | 30.33 |
| W = 0.4 | 7.365 | 1.71 | 16.516 | 1.81 | 50.573 | 10.24 | 75.123 | 32.64 |
| W = 0.6 | 7.542 | 1.74 | 16.626 | 1.83 | 52.076 | 10.26 | 81.2147 | 33.11 |
| W = 0.8 | 7.750 | 1.74 | 16.749 | 1.84 | 52.56 | 11.31 | 81.2147 | 33.12 |
| W = 1 | 7.958 | 1.74 | 16.837 | 1.84 | 61.10 | 11.31 | 81.2147 | 33.12 |
Table 7 shows that the robust fluctuation region is obtained when W = 0.35 and that the optimal adjustive argument is not affected by the system topology. As the system scale increases, the total operation cost of the separation method remains higher than that of the proposed method with the optimal adjustive argument.
The computational burden grows with the scale of the system when predictive and prescriptive analytics are separated. On the IEEE 14- and 30-bus systems, there is almost no difference in computation time between the separation method and the proposed method. On the largest system (IEEE1047), however, the computation time of the separation method is 6.66% longer than that of the proposed method with W = 0.35. Therefore, combining predictive and prescriptive analytics helps to limit the effect of prediction error on operation cost and to reduce the computational burden for large-scale systems.
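The percentages quoted in this discussion can be checked against Table 7. The following minimal sketch recomputes the relative gap between the separation method and the proposed method with W = 0.35. It assumes, based on the surrounding discussion, that the first value reported for each system is the total operation cost and the second is the computation time; the dictionary layout and the helper name `relative_gap` are illustrative only.

```python
# Values copied from Table 7 as (total operation cost, computation time).
# ASSUMPTION: the first number of each pair is cost and the second is time.
separation = {
    "IEEE14": (7.370, 1.70),
    "IEEE30": (16.518, 1.79),
    "IEEE118": (50.611, 12.77),
    "IEEE1047": (75.3912, 32.35),
}
proposed_w035 = {
    "IEEE14": (7.324, 1.67),
    "IEEE30": (16.489, 1.75),
    "IEEE118": (50.219, 9.23),
    "IEEE1047": (73.901, 30.33),
}


def relative_gap(reference, candidate):
    """Relative excess of `reference` over `candidate` (0.05 means 5% more)."""
    return (reference - candidate) / candidate


for system, (cost_sep, time_sep) in separation.items():
    cost_pro, time_pro = proposed_w035[system]
    print(
        f"{system:9s} cost gap = {relative_gap(cost_sep, cost_pro):.2%}, "
        f"time gap = {relative_gap(time_sep, time_pro):.2%}"
    )
```

Under these assumptions, the time gap on IEEE1047 evaluates to about 6.66%, and the separation method has the higher cost on all four systems.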
Table 8 shows the out-of-sample performance of the proposed fluctuation region. The width of the region (the distance between the upper and lower boundaries) grows as W increases. Taking the IEEE14-bus system as an example, when W = 0.35 the fluctuation region contains all samples in the k-nearest neighbors, and the most economical feasible solution is obtained.
Table 8.
Different fluctuation regions with different W.
| System | Adjustive parameter | u_lower | u_upper | Deviation rate |
|---|---|---|---|---|
| IEEE14 | Separation | 1.194 | 1.678 | 7.980% |
| | W = 0.2 | 1.194 | 1.433 | −7.800% |
| | W = 0.35 | 1.194 | 1.611 | 3.73% |
| | W = 0.4 | 1.194 | 1.671 | 7.567% |
| | W = 0.5 | 1.194 | 1.791 | 15.251% |
| | W = 0.6 | 1.194 | 1.910 | 22.934% |
| | W = 0.8 | 1.194 | 2.149 | 38.300% |
| | W = 1 | 1.194 | 2.388 | 53.668% |
When W is smaller than 0.35, the upper boundaries are smaller than the predicted value. When W is greater than 0.35, the upper boundaries are greater than the predicted value, and the fluctuation regions have a large budget. As the adjustive argument increases, the prediction error penalty increases rapidly: on the IEEE14-bus system, the total cost with W = 1 is 8.66% higher than with the suitable fluctuation region at W = 0.35. When the real wind power generation is much larger than the predicted value and the energy storage capacity is abundant, W can be set greater than 0.35 to form a wide fluctuation region. Large wind power generation can be absorbed under such a region, and the optimal solution is not conservative, although part of the final profit is sacrificed. When other generation, such as hydropower, is larger than predicted, W can be set smaller than 0.35 to form a tight fluctuation region. Proper wind power curtailment can then guarantee the generation-load balance and the safe operation of the system.
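The pattern in Table 8 can be reproduced numerically. The sketch below is not taken from Eq. (6). It assumes that the upper boundary scales as u_lower · (1 + W) and that the deviation rate is measured relative to a predicted value of about 1.554; both assumptions are inferred here from the tabulated numbers, and the function names are illustrative.

```python
# ASSUMPTIONS (inferred from Table 8, not quoted from Eq. (6)):
#   u_upper        = u_lower * (1 + W)
#   deviation rate = (u_upper - predicted) / predicted
U_LOWER = 1.194    # lower boundary reported in Table 8
PREDICTED = 1.554  # hypothetical predicted value, back-computed from the table


def upper_boundary(w, u_lower=U_LOWER):
    """Upper boundary of the fluctuation region for adjustive argument w."""
    return u_lower * (1.0 + w)


def deviation_rate(w, predicted=PREDICTED, u_lower=U_LOWER):
    """Relative distance of the upper boundary from the predicted value."""
    return (upper_boundary(w, u_lower) - predicted) / predicted


for w in (0.2, 0.35, 0.4, 0.5, 0.6, 0.8, 1.0):
    print(
        f"W = {w:<4} u_upper = {upper_boundary(w):.3f} "
        f"deviation = {deviation_rate(w):+.3%}"
    )
```

Because the assumed upper boundary is linear in W, the region width grows in equal steps as W grows, which matches the widening trend described for Table 8.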
5. Conclusion
This paper proposes a Predictive&Prescriptive method that combines predictive and prescriptive analytics, and explores a two-stage RO model for OPF-U within a framework combining machine learning and optimization techniques. The KNN provides samples that represent the possible realizations of the uncertainty. The proposed uncertainty set in the RO model contains all samples in the k-nearest neighbors, and it is the most compact among the compared sets. The robust fluctuation region can be obtained by a two-term exponential formula, and the other regions are linear in W. Although Ellipsoid set2 is tighter than the KMV set for a specific W, its fluctuation region is too narrow to cover other possibilities. Simulation results also show that the interval regions of the ellipsoid sets do not vary regularly with the adjustive argument, which makes it difficult to quickly select a proper adjustive argument for the fluctuation region of uncertainty. Although the interval regions of the Box set and the KMV set both widen linearly with increasing W, the Box set is not tight enough. Therefore, the KMV set, with its regular region, is superior. Finally, the KMV set is used to solve the two-stage OPF-U on a large volume of real historical data. When the optimal adjustive argument W = 0.35 is applied, the fluctuation region is robust and the solution is not conservative. Compared with the prevailing methods that separate predictive and prescriptive analytics, the proposed method can accelerate the solution of the optimization model and reduce unnecessary cost when considering uncertainty.
Furthermore, we conclude that the superiority of the uncertain fluctuation region selected by Lemma 1 is not affected by sharp violations or a large amount of uncertainty. The KMV set with a high-performance uncertainty region is the most robust among the compared sets. The optimal result is adjustable with W and is not conservative in the two-stage RO model. Future work can focus on applying advanced machine learning and optimization techniques and on considering multi-time-scale dispatch for stability control. The predictive method can predict uncertainty from a large number of predictive features and can be effectively combined with the prescriptive method.
Author contribution statement
Liqin Zheng: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper.
Xiaoqing Bai: Conceived and designed the experiments; Analyzed and interpreted the data; Wrote the paper.
Xiaoqing Shi: Performed the experiments; Analyzed and interpreted the data; Wrote the paper.
Yunyi Li: Performed the experiments; Contributed reagents, materials, analysis tools or data.
Dongmei Xie: Analyzed and interpreted the data.
Chun Wei: Contributed reagents, materials, analysis tools or data.
Data availability
Data will be made available on request.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgement
This work was supported by the National Natural Science Foundation of China under Grant 51967001 and Innovation Project of Guangxi Graduate Education Grant YCSW2022119.
Appendix.
A Uncertain fluctuation region
A.1 Proof of Lemma 1
The upper boundary increases linearly with W in Eq. (6), and the KNN approach obtains k samples around the predicted value. Based on these two characteristics, W can be determined by k. In our framework combining machine learning and optimization techniques, the selection of k affects the optimal choice of W. When k is small, there are few samples in KNN, and W should be large to include the possible uncertainty values. As k increases, the samples around the predicted value cover more and more possibilities, and W can be set smaller while still ensuring that the maximum sample value is included in the uncertainty region. When k becomes large, the decrease of W slows down so that all samples remain included. Therefore, W varies inversely with k and follows an exponential law.
Different k and different samples in KNN lead to different optimal results of the MVE optimization model, so the proposed uncertainty set is indirectly affected by the selection of k and the samples in KNN. The optimization results also affect the robustness of the proposed uncertainty set. Therefore, the robust fluctuation region of the proposed uncertainty set should be unique and can be calculated with an exponential formula that is adjusted by the machine learning and optimization results. The coefficients of the exponential formula should be constrained, and the uncertainty region should be tight while containing all samples in KNN.
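The qualitative behaviour argued in this proof (W decreasing in k, with the decrease slowing for large k) can be illustrated with a generic two-term exponential. The sketch below is only an illustration: the form W(k) = a·exp(b·k) + c·exp(d·k) is the generic two-term exponential, and the coefficient values are hypothetical placeholders chosen with a, c > 0 and b, d < 0. They are not the coefficients produced by the coefficient optimization model of this appendix.

```python
import math


def w_of_k(k, a=0.6, b=-0.05, c=0.3, d=-0.002):
    """Two-term exponential W(k) = a*exp(b*k) + c*exp(d*k).

    ASSUMPTION: a, b, c, d are illustrative placeholders, not fitted values.
    With a, c > 0 and b, d < 0 the curve decreases in k and flattens out.
    """
    return a * math.exp(b * k) + c * math.exp(d * k)


for k in (10, 50, 90, 130):
    print(f"k = {k:3d}  W = {w_of_k(k):.4f}")
```

With these sign choices, each additional block of neighbors lowers W by less than the previous block did, which is the slowing decrease described in the proof.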
A.2 The transformation process of the coefficient optimization model
First, we define:
| (28) |
| (29) |
Then take the logarithm on both sides:
| (30) |
| (31) |
We multiply both sides of the inequality constraint by k, and substitute Eq. (30) into the constraint. A new inequality constraint is available:
| (32) |
Then take exponents on both sides of Eq. (31):
| (33) |
The constraint is handled with the same multiplication, substitution, and exponentiation process, which gives a new inequality constraint:
| (34) |
Finally, Eqs. (27) and (28) are substituted into the final inequality constraint, and the original model can be written as:
| (35) |
| (36) |
B Column-and-Constraint Generation Algorithm
The Column-and-Constraint Generation algorithm for the two-stage OPF-U problem is given in Algorithm 1.
| Algorithm 1: Column-and-Constraint Generation |
|---|
| Step 1: Initialization. Set the upper and lower boundaries: UB = +∞, LB = −∞, S = ∅, m = 0. |
| Step 2: Solve the master problem, Eqs. (37)–(39), (41), and (42). Derive the optimal solution and update LB. |
| Step 3: Solve the subproblem, Eqs. (43), (45), (46), and (48)–(50). Update UB. |
| Step 4: Compute the gap as the absolute value of UB − LB. |
| Step 5: If the gap is within the convergence tolerance, return the optimal solution and terminate. |
| Step 6: Otherwise, set m = m + 1 and go to Step 2. |
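The loop structure of Algorithm 1 can be sketched on a toy problem. The example below is a minimal illustration and not the paper's master problem or subproblem: the cost coefficients, demand, decision grid, and the finite set of wind realizations are all invented, and both problems are solved by plain enumeration instead of an optimization solver. Only the alternation of master problem, subproblem, bound updates, and scenario generation follows the algorithm.

```python
# Toy column-and-constraint generation (C&CG) loop.
# ASSUMPTION: all data below are invented for illustration only.
C1, C_SHORT, C_SURPLUS, DEMAND = 1.0, 3.0, 0.5, 1.5
X_GRID = [i / 100 for i in range(201)]  # candidate first-stage outputs
U_SET = [0.2, 0.4, 0.6]                 # finite set of wind realizations


def recourse_cost(x, u):
    """Second-stage cost: re-dispatch for shortfall, penalty for surplus."""
    shortfall = max(0.0, DEMAND - x - u)
    surplus = max(0.0, x + u - DEMAND)
    return C_SHORT * shortfall + C_SURPLUS * surplus


def solve_master(scenarios):
    """Minimize first-stage cost plus worst recourse over known scenarios."""
    def objective(x):
        eta = max((recourse_cost(x, u) for u in scenarios), default=0.0)
        return C1 * x + eta
    x_best = min(X_GRID, key=objective)
    return x_best, objective(x_best)


def solve_subproblem(x):
    """Find the worst-case realization for a fixed first-stage decision."""
    u_worst = max(U_SET, key=lambda u: recourse_cost(x, u))
    return u_worst, recourse_cost(x, u_worst)


def ccg(tol=1e-6, max_iter=20):
    ub, lb = float("inf"), float("-inf")      # Step 1: initialization
    scenarios, x_incumbent = [], None
    for _ in range(max_iter):
        x, lb = solve_master(scenarios)       # Step 2: master gives LB
        u_worst, q_worst = solve_subproblem(x)  # Step 3: subproblem
        if C1 * x + q_worst < ub:             # keep the best feasible cost
            ub, x_incumbent = C1 * x + q_worst, x
        if abs(ub - lb) <= tol:               # Steps 4-5: gap test
            break
        if u_worst not in scenarios:          # Step 6: add scenario, repeat
            scenarios.append(u_worst)
    return x_incumbent, lb, ub, scenarios


x_opt, lower, upper, used = ccg()
print(f"x = {x_opt:.2f}, LB = {lower:.4f}, UB = {upper:.4f}, scenarios = {used}")
```

In this toy setting the loop stops after only part of the realizations have been added to the master problem, which is the practical appeal of generating scenarios on demand.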
C Construction of the state-of-the-art uncertainty sets
The box set has a simple structure and can quickly handle classic problems such as linear optimization, quadratic cone optimization, and semi-definite optimization. However, the robustness of the box set is worse than that of the ellipsoid set. Therefore, we show that the fluctuation region of the KMV set performs best by comparing it with a box set and two ellipsoid sets when applied to a two-stage RO model.
1) Box Set [34]:
| (51) |
where , The uncertainty belongs to .
2) Ellipsoid Set1: the uncertain argument u belongs to the region [12]:
| (52) |
where is a covariance matrix.
| (53) |
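To make the difference between the two families of sets concrete, the following sketch tests membership in a box and in a covariance-based ellipsoid. It uses generic textbook forms as an assumption (component-wise bounds for the box, and a quadratic form with the inverse covariance matrix and a radius for the ellipsoid) and is not a transcription of Eqs. (51)–(53). All numerical values and function names are hypothetical, and the ellipsoid test is restricted to two dimensions to keep the example free of external libraries.

```python
def in_box(u, lower, upper):
    """Box membership: every component lies between its bounds."""
    return all(lo <= x <= hi for x, lo, hi in zip(u, lower, upper))


def in_ellipsoid_2d(u, center, cov, radius):
    """Ellipsoid membership: (u - c)^T cov^{-1} (u - c) <= radius^2.

    ASSUMPTION: generic covariance-based ellipsoid in two dimensions.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx, dy = u[0] - center[0], u[1] - center[1]
    # Quadratic form using the closed-form 2x2 inverse [[d, -b], [-c, a]] / det.
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return q <= radius ** 2


# Hypothetical data: two wind farms with predicted outputs of 0.64 each.
center = (0.64, 0.64)
cov = ((0.01, 0.0), (0.0, 0.01))
lower, upper = (0.44, 0.44), (0.84, 0.84)

corner = (0.84, 0.84)  # extreme point where both outputs are at their maximum
print(in_box(corner, lower, upper))               # corner of the box
print(in_ellipsoid_2d(corner, center, cov, 2.0))  # same point, ellipsoid test
```

In this hypothetical setting the box admits the corner where both outputs are simultaneously extreme, while the ellipsoid with the same axis lengths excludes it, which is one common reason a box set leads to more conservative solutions.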
Abbreviations
- OPF-U: optimal power flow considering uncertainty
- SO: stochastic optimization
- RO: robust optimization
- DRO: distributionally robust optimization
- KNN: k-nearest neighbor
- KMV: KNN + minimum volume
- SCUC: Security Constrained Unit Commitment
- ED: Economic Dispatch
- EM: Electricity Market
Parameters
Unit vector
The lower boundary of uncertainty
The upper boundary of uncertainty
The predicted value
Maximum prediction error
Samples close to
The indices of the k closest points of to
Cost coefficients of i-th unit
Total electrical demand
- ,
The number of the system bus and units
The sets of units and transmission lines
The load demand of bus i
Bus admittance matrix
The angle of (i, j) element of
The lower and upper boundary of the active power of the i-th unit
The lower and upper boundary of reactive power of the i-th unit
The lower boundary of the voltage amplitude of bus i
The upper boundary of the voltage amplitude of bus i
The upper boundary of line transmission active power
The unit price of the re-dispatching unit i
The predicted output of wind
The unit penalty of prediction error
The coefficients of constraints
Variables
Adjustive argument
The first-stage decision variables
The electricity output of active power of unit i
The electricity output of reactive power of unit i
The voltage amplitude of bus i
Active and reactive powers at bus i
The regeneration of the re-dispatching unit i
The number close to the predicted value
The index of iteration
The introduced auxiliary variable to represent the subproblem
New recourse variables added into a master problem at the h-th iteration
The optimal values of uncertainty obtained from the subproblem at the h-th iteration
Dual variables
The optimal decision values in the first-stage
The optimal values of the 2nd stage at the h+1-th iteration
Uncertainty variables and sets
The positive semi-definite matrix variable
Vector variables
- U
The box set
The ellipsoid set
The minimum volume ellipsoid set
The KMV set
References
- 1. Xu G., Dong H., Xu Z., Bhattarai N. China can reach carbon neutrality before 2050 by improving economic development quality. Energy. 2022;243 doi: 10.1016/j.energy.2021.123087.
- 2. Zhang Y., Li G., Li X. Short-term forecasting method for regional photovoltaic power based on typical representative power stations and improved SVM. Electric Power Automation Equipment. 2021;41(11):205–210. doi: 10.16081/j.epae.202108017.
- 3. Zhu J., Huang Y., Ma L., Li H., Yuan Y. Evaluation of distributed power consumption capacity of distribution network based on uncertain optimal power flow. Autom. Electr. Power Syst. 2022;46(14):46–54. https://kns.cnki.net/kcms/detail/32.1180.TP.20220315.1531.004.html
- 4. Tu Q., Miao S., Lin Y., Zhang D., Yao F., Han J. Ultra-short-term interval forecasting method for regional wind farms based on dynamic R-vine copula model. High Voltage Eng. 2022;48(2):456–470. doi: 10.13336/j.1003-6520.hve.20201711.
- 5. Ben-Tal A., Ghaoui L., Nemirovski A. Princeton University Press; Princeton NJ USA: 2009. Robust Optimization; p. 28.
- 6. Dragoon K., Milligan P. 2003. Assessing Wind Integration Costs with Dispatch Models: A Case Study of PacifiCorp. Windpower.
- 7. Miao C., Li H., Wang X., et al. Data-driven and deep learning-based ultra-short-term wind power prediction. Autom. Electr. Power Syst. 2021;45(14):22–29. doi: 10.7500/AEPS20201127004.
- 8. Ma R., Qin J. Probabilistic continuous hybrid flow method for electricity-gas coupling system integrated with DFIG wind farm and is load margin analysis. Electric Power Automation Equipment. 2019;39(8):38–46. doi: 10.16081/j.epae.201908044.
- 9. Tian Y., Wang K., Li G., Ge W., Luo H. Dynamic stochastic optimal power flow based on second-order cone programming considering wind power correlation. Autom. Electr. Power Syst. 2018;42(5):41–47. https://kns.cnki.net/kcms/detail/32.1180.TP.20180124.1623.028.html
- 10. Wei Z., Pei L., Chen S., Zhao J., Fu Q. Review on optimal operation and safety analysis of AC/DC hybrid distribution network with high proportion of renewable energy. Electric Power Automation Equipment. 2021;41(9):85–94. doi: 10.16081/j.epae.202109039.
- 11. Jin X., Wu Q., Jia H., Hatziargyriou N.D. Optimal integration of building heating loads in integrated heating/electricity community energy systems: a bi-level MPC approach. IEEE Trans. Sustain. Energy. 2021;12(3):1741–1754. doi: 10.1109/TSTE.2021.3064325.
- 12. Seo S., Oh S., Kwak H. Wind turbine power curve modeling using maximum likelihood estimation method. Renew Energ. 2019;136:1164–1169. doi: 10.1016/j.renene.2018.09.087.
- 13. Hua W., Jiang J., Sun H., Tonello A., Qadrdan M., Wu J. Data-driven prosumer-centric energy scheduling using convolutional neural networks. Appl. Energy. 2022;308 doi: 10.1016/j.apenergy.2021.118361.
- 14. Golestaneh F., Pinson P., Gooi H.B. Very short-term nonparametric probabilistic forecasting of renewable energy generation-with application to solar energy. IEEE Trans. Power Syst. 2016;31(5):1–14. doi: 10.1109/TPWRS.2015.2502423.
- 15. Jin X., Wu Q., Jia H., Hatziargyriou N. Optimal integration of building heating loads in integrated heating/electricity community energy systems: a Bi-level mpc approach. IEEE Trans. Sustain. Energy. 2021;12(3):1741–1754. doi: 10.1109/TSTE.2021.3064325.
- 16. Jiang S., Peng G., Bogle D., Zheng Z. Two-stage robust optimization approach for flexible oxygen distribution under uncertainty in integrated iron and steel plants. Appl. Energy. 2022;306 doi: 10.1016/j.apenergy.2021.118022.
- 17. Vatani B., Chowdhury B., Dehghan S. A critical review of robust self-scheduling for generation companies under electricity price uncertainty. Int. J. Electr. Power Energy Syst. 2018;97:428–439. doi: 10.1016/j.ijepes.2017.10.035.
- 18. Wu W., Wang K., Li G., Ge Y. Modeling ellipsoidal uncertainty set considering conditional correlation of wind power generation. Proceedings of the CSEE. 2017;37(9):2500–2506. doi: 10.13334/j.0258-8013.pcsee.160389.
- 19. Qiu Z., Jiang N. An ellipsoidal Newton's iteration method of nonlinear structural systems with uncertain-but-bounded parameters. Comput. Methods Appl. Mech. Eng. 2021;373 doi: 10.1016/j.cma.2020.113501.
- 20. Kuryatnikova O., Ghaddar B., Molzahn D. 2021. Adjustable Robust Two-Stage Polynomial Optimization with Application to AC Optimal Power Flow. arXiv preprint.
- 21. Bertsimas D., Kallus N. From predictive to prescriptive analytics. Manage Sci. 2020;66(3):1025–1044. doi: 10.1287/mnsc.2018.3253.
- 22. Cain M., O’neill R., Castillo A. Federal Energy Regulatory Commission; 2012. History of Optimal Power Flow and Formulations; pp. 1–36.
- 23. Ohmori S. A predictive prescription using minimum volume k-nearest neighbor enclosing ellipsoid and robust optimization. Mathematics. 2021;9(2):119. doi: 10.3390/math9020119.
- 24. Sadat S. Optimal bidding strategy for a strategic power producer using mixed integer programming. scholarcommons.usf.edu. 2017
- 25. Lorca A., Sun X. The adaptive robust multi-period alternating current optimal power flow problem. IEEE Trans. Power Syst. 2018;33(2):1993–2003. doi: 10.1109/TPWRS.2017.2743348.
- 26. Zheng L., Li Y., Wei C., Bai X. A data-driven method for operation pattern analysis of the integrated energy microgrid. Energy Conv Manag: X. 2021;11 doi: 10.1016/j.ecmx.2021.100092.
- 27. Ji Y., Xu Q., Zhao J., Yang Y., Sun L. Day-ahead and intra-day optimization for energy and reserve scheduling under wind uncertainty and generation outages. Electr Power Syst Res. 2021;195 doi: 10.1016/j.epsr.2021.107133.
- 28. Zeng B., Zhao L. Solving two-stage robust optimization problems using a column-and-constraint generation method. Oper. Res. Lett. 2013;41(5):457–461. doi: 10.1016/j.orl.2013.05.003.
- 29. Julia Computing, Inc. JuliaPro documentation. 2022. [Online]. Available: https://juliacomputing.com/docs/
- 30. Gurobi Optimization, Inc. Gurobi Optimizer Reference Manual. 2020. [Online]. Available: http://www.gurobi.com/
- 31. Zhu J., Liu Y., Xu L., Jiang Z., Ma C. Robust day-ahead economic dispatch of microgrid with combined heat and power system considering wind power accommodation. Autom. Electr. Power Syst. 2019;43(4):40–48. doi: 10.7500/AEPS20180214007.
- 32. Zimmerman R., Murillo-Sánchez C., Thomas R. Matpower: steady state operations, planning and analysis tools for power systems research and education. IEEE Trans. Power Syst. 2011;26(1):12–19. doi: 10.1109/TPWRS.2010.2051168.
- 33. Wang S., Li B., Li G., Yao B., Wu J. Short-term wind power prediction based on multidimensional data cleaning and feature reconfiguration. Appl. Energy. 2021;292 doi: 10.1016/j.apenergy.2021.116851.
- 34. Jiang S., Peng G., Bogle I., Zheng Z. Two-stage robust optimization approach for flexible oxygen distribution under uncertainty in integrated iron and steel plants. Appl. Energy. 2022;306 doi: 10.1016/j.apenergy.2021.118022.