Abstract
Quantile estimates of extreme wind speed are needed for many applications and can be obtained using regional frequency analysis (RFA) and extreme value theory. These calculations are crucial for the codification of wind speed in structural design. The data were taken from the NASA official website at a height of 10 m and are measured in meters per second (m/s). RFA of annual maximum wind speed (AMWS) using L-moments is performed on data from sixteen (16) sites in Pakistan’s Khyber Pakhtunkhwa province. No site is found to be discordant. Ward’s method is used to group the 16 sites into two homogeneous regions, and the heterogeneity test confirms that both clusters are homogeneous. The most appropriate probability distribution for calculating regional quantiles is chosen from the Generalized Normal (GNO), Generalized Logistic (GLO), Pearson Type-3 (P3), Generalized Pareto (GPA), and Generalized Extreme Value (GEV) distributions. According to the L-moment ratio diagram and Z statistics, GEV for Cluster-Ι and GLO for Cluster-ΙΙ are the best choices among the candidates. The robustness of the fit for both clusters is measured using relative bias (RB) and relative root mean square error (RRMSE). Overall, the GEV distribution fits Cluster-Ι and the GLO distribution fits Cluster-ΙΙ. Using the site mean and median as index parameters, at-site quantiles can also be obtained from the regional quantiles. The study’s quantile estimates can be employed in codified structural designs with policy consequences.
Keywords: Linear-moments, Monte Carlo simulation, Quantile estimates, Wind speed
Subject terms: Natural hazards, Climate change
Introduction
Wind speed, a fundamental atmospheric phenomenon driven by air movement from high to low-pressure areas due to temperature differentials, plays a crucial role in various sectors such as weather forecasting, aviation, and wind energy generation. The rising global demand for clean energy sources has propelled the growth of wind energy, making it a prominent contributor to renewable energy portfolios worldwide1–3. Coastal and hilly regions are particularly promising for wind power generation due to favorable wind conditions3. The rising environmental challenges associated with fossil fuel consumption have prompted a shift towards cleaner energy alternatives to mitigate climate change impacts4. Wind, solar, and geothermal energy have emerged as sustainable options for energy generation, characterized by their renewability and minimal environmental footprint5–7.
Pakistan’s potential for wind energy is substantial, with estimates suggesting significant capacity for wind power generation across the country8,9. However, an accurate assessment of wind speed variability is crucial for effective wind energy planning and infrastructure development. Probabilistic modeling of wind speed data is essential for predicting energy output and evaluating extreme events10–12.
The study focuses on conducting a regional frequency analysis (RFA) of extreme wind events in the Khyber Pakhtunkhwa province of Pakistan using linear moments (L-moments) methods. By providing localized insights into extreme wind events, the research aims to inform risk assessment, disaster preparedness, and infrastructure reliability efforts. Additionally, the study seeks to identify homogeneous regions, determine probability distributions, and estimate quantiles for different return periods to mitigate losses and optimize wind energy utilization13,14. The study thereby addresses a notable research gap: while previous research has explored wind energy potential and extreme weather phenomena in various regions, none has applied RFA techniques to the Khyber Pakhtunkhwa province. Narrowing the scope to this region allows the results to be tailored to its unique characteristics.
Furthermore, this research introduces methodological advancements by employing linear moments methods for RFA, offering a novel approach to analyzing extreme wind data and estimating probability distributions. This methodological aspect contributes to the broader field of extreme weather analysis, presenting a refined technique for assessing and quantifying wind hazards with improved accuracy and reliability. The study’s findings also carry direct policy implications, particularly in areas such as energy planning, safety regulations, and infrastructure development. By identifying homogeneous regions, determining probability distributions, and estimating quantiles for different return periods, the research provides suggestions for policymakers to mitigate losses and optimize wind energy utilization strategies tailored to the specific needs and challenges of the Khyber Pakhtunkhwa Province. The objectives of the study include ensuring data assumptions are met, screening data for RFA, identifying homogeneous regions, determining probability distributions, estimating quantiles, and proposing mitigation strategies for extreme events. Section 2 will explore the methodology used, Sect. 3 will describe the study area’s characteristics, Sect. 4 will present the results and discussion, and the final section will conclude with policy implications and future directions.
Methodology
In the methodology section, we have employed a detailed approach to analyzing the annual maximum wind speed (AMWS) series and conducting regional frequency analysis (RFA). The step-by-step methodology is explained in the following subsections.
The initial examination of the annual maximum wind speed (AMWS) series
Before performing RFA, we check the basic assumptions of stationarity, independence, and homogeneity. These assumptions are common to the RFA of other maximum events, such as maximum floods, rainfall, and droughts. In this study, Spearman’s rank-order correlation test is used for stationarity, the Wald-Wolfowitz test for independence, and the Mann-Whitney U (MWU) test for homogeneity.
Spearman’s rank-order correlation test for checking the trend
The Spearman’s rank correlation test is used to assess the presence of a monotonic trend in the wind speed data. This test is selected because it does not assume normality and is effective at detecting trends even in non-linear data. By using Spearman’s test, the authors aim to determine if there is a consistent directional change in wind speed over time, which is crucial for understanding long-term patterns and making accurate predictions. Named after the British psychologist Charles Spearman15, this coefficient indicates whether the trend is positive, negative, or non-existent. The null and alternative hypotheses of the Spearman’s rank correlation test are as follows:
H0: There is no trend in the series. H1: There is a trend in the series.
The significance level is 0.05, and the test statistic is
$$r_s = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} \tag{1}$$
where $r_s$ is Spearman’s rank correlation coefficient, $d_i$ is the difference between the two ranks of the $i$th observation, and $n$ is the number of observations.
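As an illustration, the trend check can be sketched in Python. This is a minimal sketch that assumes the usual form of Spearman's coefficient, computed from the squared rank differences between time order and observed value. The function names and the t-type statistic are assumptions made for illustration and are not taken from the study.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman_trend(series):
    """Return (r_s, t) for a series listed in time order.

    r_s uses the rank-difference formula (exact only without ties).
    t = r_s * sqrt((n - 2) / (1 - r_s^2)) is one common test statistic
    (an assumption here), compared with Student's t on n - 2 d.o.f.
    """
    n = len(series)
    time_ranks = range(1, n + 1)
    value_ranks = average_ranks(series)
    d2 = sum((a - b) ** 2 for a, b in zip(time_ranks, value_ranks))
    rs = 1 - 6 * d2 / (n * (n * n - 1))
    if abs(rs) >= 1:
        return rs, math.copysign(math.inf, rs)
    return rs, rs * math.sqrt((n - 2) / (1 - rs * rs))
```

A value of `r_s` near zero, with a correspondingly small `t`, is consistent with the null hypothesis of no monotonic trend.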
The Wald-Wolfowitz (WW) test for checking independence
The Wald-Wolfowitz test is employed to assess the independence of observations in the wind speed data. This non-parametric method is chosen for its suitability in evaluating the randomness and lack of correlation between successive wind speed measurements. By applying the Wald-Wolfowitz test, the authors aim to confirm that wind speed observations at a specific site are independent, ensuring the data’s reliability for frequency analysis. Independence implies that the occurrence of a particular wind speed observation does not influence the occurrence of any other wind speed observation at the same site. This assumption of independence is regularly verified in hydrological studies, which often include analyses of yearly means, totals, maxima, minima, monthly, seasonal, and other time interval data, such as non-annual maximum data samples and partial duration series. The non-parametric Wald-Wolfowitz test, first introduced by Wald and Wolfowitz16, is widely used to test the independence of observations in a recorded series and to identify potential trends in the data. Let
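$x_1, x_2, \ldots, x_N$ represent the observed values of the variable under investigation. The statistic $R$ recommended by Rao and Hamed is
$$R = \sum_{i=1}^{N-1} x_i x_{i+1} + x_1 x_N \tag{2}$$
The $R$ statistic approximately follows a normal distribution with the mean and variance shown below.
$$\bar{R} = \frac{s_1^2 - s_2}{N - 1} \tag{3}$$
$$\mathrm{Var}(R) = \frac{s_2^2 - s_4}{N - 1} - \bar{R}^2 + \frac{s_1^4 - 4 s_1^2 s_2 + 4 s_1 s_3 + s_2^2 - 2 s_4}{(N - 1)(N - 2)} \tag{4}$$
where $s_r = N m'_r$ and $m'_r$ is the $r$th sample moment about the origin. The test statistic of the WW test is given as
$$z = \frac{R - \bar{R}}{\sqrt{\mathrm{Var}(R)}} \tag{5}$$
where $z$ is used to test the data set for independence at the 5% level of significance.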
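The independence check can be sketched as follows. The sketch assumes the circular serial-product form of the Wald-Wolfowitz statistic with power sums of the observations, as commonly given in the hydrological literature. The two-sided normal p-value is an assumption and may differ from the convention behind the tabulated p-values.

```python
import math

def wald_wolfowitz(x):
    """Return (z, two-sided p) for the Wald-Wolfowitz independence test."""
    n = len(x)
    # Serial products, closed into a circle with the x_1 * x_N term.
    r = sum(x[i] * x[i + 1] for i in range(n - 1)) + x[0] * x[-1]
    # Power sums s_k = sum(x^k), i.e. N times the k-th moment about the origin.
    s1, s2, s3, s4 = (sum(v ** k for v in x) for k in (1, 2, 3, 4))
    mean_r = (s1 ** 2 - s2) / (n - 1)
    var_r = ((s2 ** 2 - s4) / (n - 1) - mean_r ** 2
             + (s1 ** 4 - 4 * s1 ** 2 * s2 + 4 * s1 * s3 + s2 ** 2 - 2 * s4)
             / ((n - 1) * (n - 2)))
    z = (r - mean_r) / math.sqrt(var_r)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided, standard normal
    return z, p
```

Under this convention, independence is not rejected at the 5% level when `abs(z)` is below 1.96.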
Mann-Whitney U (MWU) test for homogeneity
The Mann-Whitney U (MWU) test, a non-parametric test devised by Mann and Whitney17, is employed to test the null hypothesis that two samples come from the same population. This test is particularly useful when the data does not adhere to the normality assumption, making it a robust alternative to the t-test. In the context of wind speed data analysis, the MWU test is frequently utilized to assess the homogeneity of samples, especially in the frequency analysis of extreme occurrences. By using the MWU test, the authors aim to determine whether there are significant differences between two sets of wind speed data, thereby ensuring that the data can be reliably used for further statistical analysis. This approach enhances the reliability of conclusions drawn from the study, particularly in understanding extreme wind speed events. Let’s have two independent samples of sizes
$p$ and $q$, with total sample size $N = p + q$, which are almost similar in length and satisfy $p \le q$. The combined sample is ranked in ascending order. The MWU test is based on the statistic $U$, which is the minimum of the variables $V$ and $W$ defined below.
$$V = R - \frac{p(p+1)}{2} \tag{6}$$
$$W = pq - V \tag{7}$$
$$U = \min(V, W) \tag{8}$$
where $p$ and $q$ are the two sample sizes and $R$ is the sum of the ranks of the first sample (of size $p$) in the combined series of size $N$. $V$ denotes the number of times an element from the first sample is ranked after an element from the second sample. Similarly, $W$ denotes the number of times an element from the second sample is ranked after an element from the first sample. When $N > 20$ and $p, q > 3$, under the null hypothesis, the $U$ test statistic can be regarded as normally distributed. The $z$ statistic is written as follows.
$$z = \frac{U - \bar{U}}{\sqrt{\mathrm{Var}(U)}} \tag{9}$$
$$\bar{U} = \frac{pq}{2} \tag{10}$$
$$\mathrm{Var}(U) = \frac{pq(N+1)}{12} \tag{11}$$
The formula for the variance of $U$ should be modified as follows in the presence of tied ranks.
$$\mathrm{Var}(U) = \frac{pq}{N(N-1)}\left[\frac{N^3 - N}{12} - \sum_{j=1}^{k}\frac{t_j^3 - t_j}{12}\right] \tag{12}$$
where $t_j$ is the number of observations that share rank $j$ and $k$ is the number of ranks that are tied.
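A short sketch of the homogeneity check is given below. It assumes the rank-sum construction of U described above and the normal approximation without the tie correction. The function names, and the idea of splitting a record into two halves in the usage note, are illustrative assumptions.

```python
import math

def average_ranks(values):
    """1-based ascending ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def mann_whitney(first, second):
    """Return (U, z) using the normal approximation (no tie correction)."""
    p, q = len(first), len(second)
    n = p + q
    ranks = average_ranks(list(first) + list(second))
    r = sum(ranks[:p])            # rank sum of the first sample
    v = r - p * (p + 1) / 2
    w = p * q - v
    u = min(v, w)
    mean_u = p * q / 2
    var_u = p * q * (n + 1) / 12
    return u, (u - mean_u) / math.sqrt(var_u)
```

One possible use is to split a site's AMWS record into an earlier and a later half and pass the two halves as `first` and `second`. Because U is the smaller of V and W, the resulting z is never positive.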
Linear moments
The use of L-moments for parameter estimation in probability distributions is well justified due to their robustness and reliability, particularly with small sample sizes and non-normal data distributions. L-moments offer significant advantages over traditional moment-based methods by being less sensitive to outliers and providing more accurate parameter estimates even with limited data. By employing L-moments, the aim is to enhance the precision of parameter estimation and improve the validity of the frequency analysis results. In this study, L-moments were utilized to estimate the parameters of probability distributions, a method commonly employed in the frequency analysis of extreme wind speeds. Compared to the method of moments and the maximum likelihood approach, L-moments are more reliable because they are less affected by outliers and better suited for smaller sample sizes. L-moments are defined as the expectations of linear combinations of order statistics, making them a robust choice for its analysis18–22. They may be used to explain any random variable with a mean. Let
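$X_1, X_2, \ldots, X_n$ represent a random sample of size $n$ with cumulative distribution function $F(x)$ and quantile function $x(F)$. Let $X_{1:n} \le X_{2:n} \le \cdots \le X_{n:n}$ be the order statistics of the random sample. Hosking23 defined the $r$th population L-moment of the random variable $X$ as follows:
$$\lambda_r = \frac{1}{r}\sum_{k=0}^{r-1}(-1)^k\binom{r-1}{k}E\left[X_{r-k:r}\right], \quad r = 1, 2, \ldots \tag{13}$$
Here $\lambda_r$ is a linear function of the expected order statistics. The first four L-moments are
$$\lambda_1 = E[X] \tag{14}$$
$$\lambda_2 = \frac{1}{2}E\left[X_{2:2} - X_{1:2}\right] \tag{15}$$
$$\lambda_3 = \frac{1}{3}E\left[X_{3:3} - 2X_{2:3} + X_{1:3}\right] \tag{16}$$
$$\lambda_4 = \frac{1}{4}E\left[X_{4:4} - 3X_{3:4} + 3X_{2:4} - X_{1:4}\right] \tag{17}$$
The L-moment ratios are determined as follows
$$\tau = \frac{\lambda_2}{\lambda_1} \tag{18}$$
$$\tau_3 = \frac{\lambda_3}{\lambda_2} \tag{19}$$
$$\tau_4 = \frac{\lambda_4}{\lambda_2} \tag{20}$$
In the preceding equations, $\lambda_1$ is the measure of location, $\tau$ is the L-coefficient of variation (L-cv), and $\tau_3$ and $\tau_4$ are the L-skewness and L-kurtosis, respectively.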
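For sample data, the L-moments and their ratios are usually estimated through unbiased probability weighted moments. The sketch below assumes that standard estimator; the function name is hypothetical, and the estimator is not confirmed as the one used by the authors.

```python
def sample_lmoments(data):
    """Return (l1, t, t3, t4): sample mean, L-CV, L-skewness, L-kurtosis.

    Uses unbiased probability weighted moments b0..b3 and requires
    at least four observations.
    """
    x = sorted(data)
    n = len(x)
    if n < 4:
        raise ValueError("need at least 4 observations")
    b = [0.0, 0.0, 0.0, 0.0]
    for j, v in enumerate(x):          # j is the 0-based ascending rank
        w = 1.0
        for r in range(4):
            b[r] += w * v
            if r < 3:
                w *= (j - r) / (n - 1 - r)
    b0, b1, b2, b3 = (s / n for s in b)
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    return l1, l2 / l1, l3 / l2, l4 / l2
```

Applied to one site's AMWS series, the four returned values correspond to the l1, t, t3, and t4 statistics reported per site.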
Application of the L-moments based on regional frequency analysis
Hosking and Wallis24 proposed the following four steps for regional frequency analysis based on L-moment theory, which are applied here to extreme wind:
Screening of the data.
Recognition of Homogeneous Regions.
Selections of the best-fit distribution.
Quantile estimation.
Screening of the data
In L-moment-based RFA, data screening ensures that the data are clean and suitable for analysis. For screening the data set, Hosking and Wallis24 introduced a discordancy measure denoted by $D_i$. The discordancy measure $D_i$ is used to identify sites that are grossly discordant with the other sites in the group. $D_i$ is defined as follows.
$$D_i = \frac{1}{3} N (u_i - \bar{u})^{T} A^{-1} (u_i - \bar{u}) \tag{21}$$
$$A = \sum_{i=1}^{N} (u_i - \bar{u})(u_i - \bar{u})^{T} \tag{22}$$
Here $A$ is the matrix of sums of squares and cross products, $u_i = [t^{(i)}, t_3^{(i)}, t_4^{(i)}]^{T}$ is the vector of sample L-moment ratios (LMR) for site $i$, $\bar{u} = N^{-1}\sum_{i=1}^{N} u_i$, and $N$ is the total number of sites. Hosking and Wallis24 provided critical values of the discordancy statistic that depend on the number of sites in the group; a site is regarded as discordant when its $D_i$ exceeds the relevant critical value.
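The discordancy computation can be sketched without external libraries. The sketch assumes the usual quadratic-form definition of the measure, with the deviation of each site's (t, t3, t4) vector from the group mean weighted by the inverse of the sums-of-squares and cross-products matrix. The helper names are hypothetical.

```python
def solve3(a, b):
    """Solve the 3x3 linear system a y = b by Gauss-Jordan elimination."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(3):
        piv = max(range(c, 3), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(3):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [m[i][3] / m[i][i] for i in range(3)]

def discordancy(u):
    """Return the discordancy measure for each site.

    u is a list of [t, t3, t4] triples, one per site.
    """
    n = len(u)
    mean = [sum(row[k] for row in u) / n for k in range(3)]
    dev = [[row[k] - mean[k] for k in range(3)] for row in u]
    a = [[sum(d[i] * d[j] for d in dev) for j in range(3)] for i in range(3)]
    return [n / 3 * sum(di * yi for di, yi in zip(d, solve3(a, d)))
            for d in dev]

# (t, t3, t4) for the first six sites listed in Table 2.
sites = [[0.099, 0.053, 0.167], [0.083, 0.161, 0.171],
         [0.102, 0.106, 0.082], [0.093, 0.130, 0.128],
         [0.090, 0.028, 0.030], [0.080, 0.007, 0.121]]
d_values = discordancy(sites)
```

A useful sanity check is that the measures always sum to the number of sites. Because the measure depends on which sites are grouped together, values from this six-site subset will differ from those computed on a full cluster.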
Recognition of homogeneous regions
The development of homogeneous regions is the most important phase in RFA. A region is said to be homogeneous if all of its sites share some common characteristics. There is substantial literature on various grouping strategies, such as geographical convenience, subjective partitioning, objective partitioning, and cluster analysis. According to Hosking and Wallis24, regions should be formed from site characteristics, such as geographical ones (elevation, latitude, and longitude), rather than from the observed data series of interest. Furthermore, cluster analysis is more reliable than other techniques for developing homogeneous regions25–28. Therefore, in this study, Ward’s method for hierarchical clustering has been used to divide the 16 sites into subgroups.
Ward’s method
Ward’s method29 is used for hierarchical clustering because it focuses on minimizing the variance within clusters. The approach groups data points by minimizing the increase in total within-cluster variance at each merging step. It relies on the standardized Euclidean distance, so that the clustering process is statistically sound and the resulting clusters are both meaningful and distinct. By using Ward’s method, the authors aim to achieve a reliable and interpretable clustering of the wind speed sites, thereby enhancing the accuracy and validity of the frequency analysis results.
$$d_{ij}^2 = (x_i - x_j)^{T} W (x_i - x_j) \tag{23}$$
where $x_i$ and $x_j$ are the physiographic coordinates of sites $i$ and $j$, respectively, and $W$ is a diagonal weighting matrix. Because the variables are expressed in different units, each coordinate enters the distance as a squared difference that is inversely weighted by its sample variance, which avoids scale effects. The within-cluster sum of squares (GSS) is defined as the sum of the squared distances between all objects in the cluster and their center of gravity25. It can be expressed as
$$GSS_r = \sum_{i=1}^{n_r} d^2(x_i, \bar{x}_r) \tag{24}$$
where $n_r$ and $\bar{x}_r$ are the size and centroid of cluster $r$, respectively.
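The quantities behind Ward's criterion can be illustrated with a small sketch. It assumes that the coordinates are standardized by their sample standard deviations and that merging two clusters raises the within-cluster sum of squares by the well-known Ward cost. The function names and the four-site subset are chosen only for illustration.

```python
import statistics

def standardize(points):
    """Divide each coordinate by its sample standard deviation."""
    sds = [statistics.stdev(col) for col in zip(*points)]
    return [[v / s for v, s in zip(p, sds)] for p in points]

def centroid(cluster):
    return [sum(col) / len(cluster) for col in zip(*cluster)]

def gss(cluster):
    """Within-cluster sum of squared distances to the centroid."""
    c = centroid(cluster)
    return sum(sum((v - m) ** 2 for v, m in zip(p, c)) for p in cluster)

def ward_cost(a, b):
    """Increase in total GSS caused by merging clusters a and b."""
    ca, cb = centroid(a), centroid(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * d2

# (latitude, longitude) of Abbottabad, Peshawar, Tank, D.I. Khan (Table 2).
pts = standardize([[34.11, 73.15], [34.02, 71.56],
                   [31.55, 70.52], [31.49, 70.56]])
north, south = pts[:2], pts[2:]
merge_cost = ward_cost(north, south)
```

At each step of the hierarchy, Ward's method merges the pair of clusters with the smallest such cost, which is why sites that lie close together in standardized coordinates are grouped first.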
Heterogeneity test
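Hosking and Wallis24 discuss a test statistic, denoted by $H$, to measure the heterogeneity of a region. The $H$ test assesses whether the sites in a region form a homogeneous or a heterogeneous group.
$$H = \frac{V - \mu_V}{\sigma_V} \tag{25}$$
where $V$ is the weighted standard deviation of the sample L-CV values,
$$V = \left[\frac{\sum_{i=1}^{N} n_i \left(t^{(i)} - t^{R}\right)^2}{\sum_{i=1}^{N} n_i}\right]^{1/2} \tag{26}$$
and $t^{R}$ is the regional average L-CV,
$$t^{R} = \frac{\sum_{i=1}^{N} n_i t^{(i)}}{\sum_{i=1}^{N} n_i} \tag{27}$$
Here $n_i$ is the record length at site $i$, and $\mu_V$ and $\sigma_V$ are the mean and standard deviation of the simulated values of $V$. According to the criteria, if $H < 1$, the region is homogeneous; if $1 \le H < 2$, the region can be considered homogeneous but not perfectly homogeneous; and if $H \ge 2$, the region is considered heterogeneous. The Kappa distribution is commonly used for the simulations because of its flexibility. It has four parameters ($\xi$, $\alpha$, $k$, $h$): a location parameter, a scale parameter, and two shape parameters. Its density, distribution, and quantile functions are given below.
$$f(x) = \alpha^{-1}\left[1 - \frac{k(x - \xi)}{\alpha}\right]^{1/k - 1}\left[F(x)\right]^{1 - h} \tag{28}$$
$$F(x) = \left\{1 - h\left[1 - \frac{k(x - \xi)}{\alpha}\right]^{1/k}\right\}^{1/h} \tag{29}$$
$$x(F) = \xi + \frac{\alpha}{k}\left[1 - \left(\frac{1 - F^{h}}{h}\right)^{k}\right] \tag{30}$$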
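The pieces of the heterogeneity test can be sketched as below. The sketch assumes the four-parameter kappa quantile function with both shape parameters non-zero, a record-length-weighted standard deviation of the site L-CVs, and a set of simulated V values that would in practice come from kappa-generated regions. All function names are hypothetical.

```python
import math
import statistics

def kappa_quantile(f, xi, alpha, k, h):
    """Kappa quantile function for 0 < f < 1, assuming k != 0 and h != 0."""
    return xi + alpha / k * (1 - ((1 - f ** h) / h) ** k)

def v_statistic(n, t):
    """Weighted standard deviation of site L-CVs t with record lengths n."""
    total = sum(n)
    t_regional = sum(ni * ti for ni, ti in zip(n, t)) / total
    return math.sqrt(sum(ni * (ti - t_regional) ** 2
                         for ni, ti in zip(n, t)) / total)

def h_measure(v_observed, v_simulated):
    """Standardize the observed V against the simulated V values."""
    return ((v_observed - statistics.mean(v_simulated))
            / statistics.stdev(v_simulated))

def classify(h):
    if h < 1:
        return "homogeneous"
    if h < 2:
        return "possibly heterogeneous"
    return "heterogeneous"
```

In a full simulation, random uniform values would be passed through `kappa_quantile` to generate homogeneous synthetic regions, `v_statistic` would be computed for each, and the resulting list supplied to `h_measure`.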
Selections of the best fit distribution
Once the regions have passed the homogeneity test, the next step is to specify an appropriate statistical model by choosing a suitable regional frequency distribution.
Peel30 suggested an L-moment (LM) ratio diagram to select a suitable probability distribution in regional frequency analysis of homogeneous regions. Vogel and Fennessey31 determined that, for extreme events in hydrology, LM ratio diagrams should be employed instead of product-moment ratio diagrams. Hosking23 found that LM ratio diagrams can distinguish between candidate distributions and explain regional data.
The AMWS series within the regions are analyzed based on the fitted regional distribution, using a goodness-of-fit criterion defined in terms of L-moments: the L-moment ratio diagram and the Z statistic24. In this criterion, the average L-moments of the regional data are compared with the L-moments of the fitted distribution. The main goal is to choose the optimal distribution for the observed data among the candidate distributions mentioned above. The quality of the fit depends on how well the L-skewness and L-kurtosis of the fitted distribution match the regional average L-skewness and L-kurtosis.
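The goodness-of-fit measure used to select the distribution is computed as follows.
$$Z^{DIST} = \frac{\tau_4^{DIST} - t_4^{R} + B_4}{\sigma_4} \tag{31}$$
where
$$B_4 = \frac{1}{N_{sim}}\sum_{m=1}^{N_{sim}}\left(t_4^{(m)} - t_4^{R}\right) \tag{32}$$
$$\sigma_4 = \left[\frac{1}{N_{sim} - 1}\left\{\sum_{m=1}^{N_{sim}}\left(t_4^{(m)} - t_4^{R}\right)^2 - N_{sim}B_4^2\right\}\right]^{1/2} \tag{33}$$
- $\tau_4^{DIST}$ = L-kurtosis of the fitted distribution
- $t_4^{R}$ = regional average L-kurtosis, and $t_4^{(m)}$ = regional average L-kurtosis of the $m$th simulated region
- $B_4$ = regional bias
- $\sigma_4$ = regional standard deviation
- $N_{sim}$ = number of regional data sets simulated from the Kappa distribution
The fit is considered good if $|Z^{DIST}|$ is small, that is, sufficiently close to zero. In terms of hypothesis testing, if $|Z^{DIST}| \le 1.64$, the candidate distribution is accepted at the 90% confidence level. If more than one candidate probability distribution meets this criterion, the distribution with the lowest $|Z^{DIST}|$ value is chosen as the best-suited probability distribution.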
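The goodness-of-fit measure can be sketched in a few lines. The sketch assumes the bias-corrected form in which the L-kurtosis of the fitted distribution is compared with the regional average L-kurtosis and scaled by the standard deviation of the simulated regional L-kurtosis values. The function name and the numbers in the usage note are illustrative only.

```python
import math

def z_dist(tau4_dist, t4_regional, t4_simulated):
    """Goodness-of-fit Z for one candidate distribution.

    tau4_dist    : L-kurtosis of the fitted candidate distribution
    t4_regional  : regional average L-kurtosis of the observed data
    t4_simulated : regional average L-kurtosis of each simulated region
    """
    n_sim = len(t4_simulated)
    b4 = sum(t - t4_regional for t in t4_simulated) / n_sim
    ss = sum((t - t4_regional) ** 2 for t in t4_simulated)
    sigma4 = math.sqrt((ss - n_sim * b4 ** 2) / (n_sim - 1))
    return (tau4_dist - t4_regional + b4) / sigma4

def acceptable(z, critical=1.64):
    return abs(z) <= critical
```

For example, with simulated values 0.10, 0.12, and 0.14 around a regional average of 0.12, a candidate whose L-kurtosis is 0.15 gives a Z of about 1.5 and would be accepted under the 1.64 threshold.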
Quantile estimation
The final phase of RFA is to estimate the parameters of the chosen frequency distribution and assess its robustness in giving valid quantile estimates for all sites in the homogeneous region. Hosking and Wallis24 suggested that the regional L-moment algorithm remains convenient even when some fundamental assumptions of the index flood procedure are not fulfilled. Regional quantile estimates are calculated for various non-exceedance probabilities using a simulation process. Furthermore, by scaling the regional quantile $\hat{q}(F)$ corresponding to non-exceedance probability $F$ with an estimate of the scaling factor $\hat{\mu}_i$, the quantile estimates of each site can be obtained as follows:
$$\hat{Q}_i(F) = \hat{\mu}_i\,\hat{q}(F) \tag{34}$$
where $\hat{Q}_i(F)$ is the estimate of the at-site quantile, $\hat{\mu}_i$ is the mean of site $i$, and $\hat{q}(F)$ is the regional quantile function. Other scaling factors, such as the median, can also be used. We used the Monte Carlo simulation technique32 with 10,000 simulations to test the robustness of the specified regional frequency distribution. With this technique, we calculated the errors between the simulated quantiles and the computed regional quantile estimates. These differences are then used to calculate RB and RRMSE for various non-exceedance probabilities, which in turn are used to examine the robustness of the best-fit distributions. The mathematical forms of RB and RRMSE are given below.
$$RB = \frac{1}{n}\sum_{m=1}^{n}\frac{Q_m^{s} - Q^{c}}{Q^{c}} \tag{35}$$
$$RRMSE = \sqrt{\frac{1}{n}\sum_{m=1}^{n}\left(\frac{Q_m^{s} - Q^{c}}{Q^{c}}\right)^2} \tag{36}$$
Here $n$ is the sample size, and $Q_m^{s}$ and $Q^{c}$ are the simulated and computed regional quantiles, respectively.
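The index-type scaling and the two accuracy measures can be sketched as follows. The sketch assumes that RB and RRMSE are the mean and root mean square of the relative errors between simulated and computed quantiles, expressed as fractions rather than percentages. The growth factor used in the example is a made-up number, not a result of the study.

```python
import math

def at_site_quantile(site_index, regional_quantile):
    """Scale a dimensionless regional quantile by the site mean (or median)."""
    return site_index * regional_quantile

def rb_rrmse(simulated, computed):
    """Relative bias and relative RMSE of simulated quantiles, as fractions."""
    rel = [(s - computed) / computed for s in simulated]
    rb = sum(rel) / len(rel)
    rrmse = math.sqrt(sum(r * r for r in rel) / len(rel))
    return rb, rrmse

# Site mean of 8.363 m/s with a hypothetical regional growth factor of 1.2.
q_site = at_site_quantile(8.363, 1.2)
```

Small absolute RB together with small RRMSE across the non-exceedance probabilities of interest indicates that the chosen regional distribution yields stable quantile estimates.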
On the basis of the standard error of the at-site quantile estimates under the best-fit distribution, Hosking and Wallis24 proposed the following equation to check robustness.
![]() |
37 |
This can be further written like this
![]() |
38 |
In addition, the sample median and the sample variance of the median can be used instead of the sample mean and the sample variance of the mean. In that case, the relationship becomes
![]() |
39 |
Where the
represent the sample median and
is the sample variance of median.
Study area and data
Khyber Pakhtunkhwa (KPK; 30°-35° N, 67°-72° E) is one of Pakistan’s five provinces. It is located on the Iranian plateau and the Eurasian plate and has an area of 74,521 km². Geographically, it is separated into two zones: one from the Hindu Kush to the northern section of Peshawar, and the other from Peshawar to the southern half of the Derajat basin. The KPK climate ranges from severely cold (in places like Chitral) to very hot (in places like Dera Ismail Khan)33. Based on the availability of the required data sets, only sixteen (16) places in KPK (Cherat, Chitral, D.I. Khan, Tank, Bannu, Upper Dir, Drosh, Kakul (Abbottabad), Parachinar, Saidu Sharif, Kalam, Malam Jabba, Mir Khani, Peshawar, Lower Dir, Kohistan) were selected for this study. All site names and characteristics are shown in Table 1 and Fig. 1. The wind energy potential was calculated using monthly NASA wind speed data from 1983 to 202034. NASA’s satellite-derived data are used to assess the world’s energy resources35.
Table 1.
The results of basic assumptions for 16 sites.
| Name of the site | Spearman test statistic | Spearman P-value | Wald-Wolfowitz test statistic | Wald-Wolfowitz P-value | Mann-Whitney U test statistic | Mann-Whitney U P-value |
|---|---|---|---|---|---|---|
| Abbottabad | 0.569 | 0.285 | -1.135 | 0.128 | -0.975 | 0.165 |
| Bannu | -0.806 | 0.210 | -0.857 | 0.196 | -0.767 | 0.221 |
| Cherat | -0.861 | 0.195 | 0.353 | 0.362 | -1.597 | 0.055 |
| Chitral | -0.717 | 0.237 | 0.791 | 0.214 | -1.389 | 0.082 |
| D.I. Khan | 0.274 | 0.392 | 0.837 | 0.201 | -0.353 | 0.362 |
| Drosh | -0.837 | 0.201 | 0.814 | 0.208 | -1.638 | 0.051 |
| Kalam | -0.277 | 0.391 | 0.081 | 0.468 | -0.306 | 0.380 |
| Kohistan | -1.017 | 0.154 | 0.940 | 0.174 | -1.016 | 0.155 |
| Lower Dir | -1.305 | 0.096 | 0.376 | 0.353 | -1.472 | 0.070 |
| Malam Jabba | 0.598 | 0.275 | 1.558 | 0.060 | -0.353 | 0.363 |
| Mir Khani | 0.720 | 0.236 | 0.968 | 0.166 | -0.726 | 0.234 |
| Parachinar | -0.372 | 0.355 | 0.097 | 0.461 | -0.228 | 0.410 |
| Peshawar | 0.059 | 0.477 | -1.029 | 0.152 | -0.643 | 0.260 |
| Saidu Sharif | -0.658 | 0.255 | -0.409 | 0.341 | -1.390 | 0.341 |
| Tank | -0.492 | 0.311 | 0.405 | 0.343 | -0.311 | 0.378 |
| Upper Dir | -0.416 | 0.339 | 0.402 | 0.344 | -1.141 | 0.127 |
Fig. 1.
Geographical locations of the 16 stations of KPK, Pakistan. (The software ArcGIS 10.7 was used to create the map.).
The selection of specific locations within Khyber Pakhtunkhwa (KPK) for calculating wind energy potential was driven by a combination of geographical diversity, data availability, strategic importance, and feasibility for implementation. The chosen sites represent a broad geographical spread across KPK, covering different climatic zones and topographical features, which is crucial for understanding the varying wind energy potential across the province. For instance, areas like Chitral, known for its severe cold, and Dera Ismail Khan, which experiences extreme heat, were included to provide a comprehensive overview of the region’s wind energy potential. Although satellite data is freely available, the selected locations have reliable historical wind speed data from NASA spanning from 1983 to 2020, ensuring a robust analysis and accurate assessment of wind energy potential. Some locations were chosen due to their strategic significance for energy projects, being close to existing infrastructure and population centers, such as Peshawar and D.I. Khan, where energy demand is high. Additionally, the selected sites often have previous studies or benchmark data available, aiding in comparative analysis and validating the current study’s findings. Finally, the feasibility for actual wind energy project implementation, considering factors like land availability, environmental impact, and socio-economic benefits, was also a key criterion. This comprehensive approach ensures that the study addresses both scientific and practical aspects of renewable energy development in KPK.
Results and discussion
Basic assumption
Prior to performing the RFA of AMWS, we investigated three main assumptions of RFA: independence, homogeneity, and stationarity. The term “independence” refers to the notion that no single observation in a data series affects subsequent observations. In practice, the degree of dependency between successive portions of a series varies with the interval between them and is commonly small between yearly maximum values, but the degree of dependence between consecutive daily values is typically substantial. The term “homogeneity” means that all observations within a data series originate from the same population. When the variety in severe events such as floods, snowmelt, rainfall, wind speed, and drought is large, it becomes hard to identify non-homogeneity. Stationarity implies that the AMWS series is invariant in time, excluding random variations. Trends, leaps, and cycles describe non-stationarity. While trends may be attributed to periodic changes in climatic circumstances, cycles can be linked to long-term climate oscillations. Jumps occur most often in flood series caused by a sudden change in the river system, such as the structure of a dam.
The required assumptions should be fulfilled by the data of annual maximum wind speed. Therefore, time series graphs and various non-parametric tests are applied to justify these assumptions.
The Wald-Wolfowitz test is used to verify the independence assumption for the AMWS series. The results are given in detail in Table 1. The Wald-Wolfowitz test statistic values are small, and the p-value is greater than 0.05 for each site. According to this test, we conclude that the AMWS data of the different sites are independent.
We used the Mann-Whitney U (MWU) test to check the assumption of homogeneity in the AMWS data. The p-value is greater than the significance level of 0.05 for each site, so we fail to reject the null hypothesis of the MWU test (the sample comes from a homogeneous population) and conclude that the AMWS data are homogeneous. The details of the results are given in Table 1.
We used Spearman’s rank-order correlation test to check the stationarity of AMWS. The test statistic values for each site are small, and the p-value is larger than the level of significance (p > 0.05). Therefore, based on the results given in Table 1, we conclude that the AMWS data of each site fulfill the stationarity assumption.
Time series plots
Stationarity is one of the basic assumptions when dealing with hydrological and meteorological time series. Plots of the data in time order give a good visual indication of stationarity. The time series plots in Fig. 2 show no systematic increasing or declining trend in the data series of the sixteen sites, indicating that the observations at all sites fluctuate randomly and that the time series are stationary.
Fig. 2.
Time series plot of annual maximum wind speeds of 16 gauging stations of Khyber Pakhtunkhwa, Pakistan.
Screening of the data using discordancy measure
Data screening to detect discordant sites is the initial stage in regional frequency analysis. We analyzed two clusters, the first with 12 sites and the second with four, and calculated the discordancy measure for each site. As shown in Table 2, the computed values for all sites are less than the critical value of 3.
Table 2.
Summary statistics based on L-moments for Cluster-Ι various wind sites.
| Stations | n | Latitude (North) | Longitude (East) | Elevation (meter) | l1 | t | t3 | t4 | Di |
|---|---|---|---|---|---|---|---|---|---|
| Abbottabad | 30 | 34.11 | 73.15 | 1418.53 | 8.363 | 0.099 | 0.053 | 0.167 | 1.22 |
| Cherat | 30 | 33.49 | 71.33 | 632.28 | 9.576 | 0.083 | 0.161 | 0.171 | 1.24 |
| Chitral | 30 | 35.51 | 71.50 | 3392.71 | 8.048 | 0.102 | 0.106 | 0.082 | 0.75 |
| Drosh | 30 | 35.34 | 70.47 | 3174.26 | 7.390 | 0.093 | 0.130 | 0.128 | 0.14 |
| Kalam | 30 | 35.5 | 72.59 | 3782.04 | 9.346 | 0.090 | 0.028 | 0.030 | 1.04 |
| Kohistan | 30 | 35.06 | 73 | 2969.68 | 8.918 | 0.080 | 0.007 | 0.121 | 1.08 |
| Lower Dir | 30 | 34.5 | 70.49 | 2061.64 | 8.314 | 0.096 | 0.148 | 0.116 | 0.41 |
| Malam Jabba | 30 | 34.45 | 72.44 | 706.05 | 7.298 | 0.093 | 0.082 | 0.127 | 0.08 |
| Mir Khani | 30 | 35.30 | 74.42 | 3462.82 | 10.038 | 0.095 | 0.179 | 0.067 | 1.91 |
| Peshawar | 30 | 34.02 | 71.56 | 713.79 | 7.757 | 0.083 | 0.099 | 0.094 | 0.23 |
| Saidu Sharif | 30 | 34.44 | 72.21 | 706.05 | 7.675 | 0.094 | 0.083 | 0.068 | 0.48 |
| Upper Dir | 30 | 35.12 | 70.51 | 3061.05 | 7.526 | 0.087 | 0.044 | 0.119 | 0.32 |
| Bannu | 30 | 33 | 70.06 | 1337.87 | 12.481 | 0.101 | 0.177 | 0.241 | 2.11 |
| Tank | 30 | 31.55 | 70.52 | 256.26 | 10.600 | 0.085 | 0.148 | 0.191 | 1.04 |
| Parachinar | 30 | 33.52 | 70.05 | 1727.58 | 12.036 | 0.108 | 0.071 | 0.143 | 1.66 |
| D.I. Khan | 30 | 31.49 | 70.56 | 294.41 | 10.503 | 0.068 | 0.017 | 0.119 | 2.28 |
Note: first sample l-moment (l1), L-cv (t), L-skewness (t3), L-kurtosis (t4), and discordancy measure (Di).
In Table 2, $n$ denotes the record length, which is 30 at all sites; $l_1$ stands for the sample mean, $t$ for the sample L-CV, $t_3$ for the sample L-skewness, and $t_4$ for the sample L-kurtosis. The sample means in Table 2 range from 7.390667 to 12.48133, whereas the sample L-CV ranges from 0.066783 to 0.107685. The L-skewness ranges from 0.006789 to 0.179212. All the values of $D_i$ are less than 3. Therefore, the information given in Table 2 shows that no site is discordant, and the analysis is carried out on all 16 sites.
To form small homogeneous regions, cluster analysis with the Ward algorithm is used to split the data into several subgroups such that the sites belonging to a cluster have related climatic or geographical features29. Longitude and latitude are used in the cluster analysis to divide the 16 sites into subgroups. This method identified two homogeneous regions in this study. The results are shown in the dendrogram given in Fig. 3.
Fig. 3.
Dendrogram that represents the division of 16 gauging sites into sub-groups obtained using the Wards clustering method.
Regions and heterogeneity measure
Combining the sub-groups (from left to right) given in Fig. 3, Cluster-Ι consists of 12 sites: Abbottabad, Cherat, Chitral, Drosh, Kalam, Kohistan, Lower Dir, Malam Jabba, Mir Khani, Peshawar, Saidu Sharif, and Upper Dir. Similarly, Cluster-ΙΙ consists of 4 sites: Bannu, Parachinar, Tank, and D.I. Khan. The heterogeneity values of Clusters Ι and ΙΙ are shown in Table 3. The results in Table 3 show that both Clusters Ι and ΙΙ are “acceptably homogeneous.”
Table 3.
Homogeneity measures of both clusters.
| Cluster | Number of sites | H1 | H2 | H3 | Homogeneity |
|---|---|---|---|---|---|
| Cluster-Ι | 12 | -1.76 | -1.26 | -1.77 | Homogeneous |
| Cluster -ΙΙ | 04 | 0.90 | -0.36 | -0.69 | Homogeneous |
Table 4.
Goodness of fit test for homogeneous clusters.
| Clusters | GLO | GEV | GNO | P3 | GPA |
|---|---|---|---|---|---|
| Cluster-Ι | 3.69 a | 1.01* | 1.26 | | 4.36 a |
| Cluster-ΙΙ | | 1.22 | 1.14** | 1.27 | 3.74 a |
* indicates the best distribution.
** indicates the second-best distribution.
a indicates that the calculated value exceeds the critical value of 1.64.
Hosking and Wallis24 distinguish three ranges of the heterogeneity measure $H$. If $H < 1$, the region is acceptably homogeneous. If $1 \le H < 2$, the region is possibly heterogeneous. If, on the other hand, $H \ge 2$, the region is definitely heterogeneous. In Table 3, none of the $H$ values of either cluster exceeds 2 (all are in fact below 1), which meets the criterion for a homogeneous region.
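The classification rule can be illustrated with a short Python sketch. It assumes the thresholds as they are usually stated for the Hosking and Wallis heterogeneity measure (H < 1, 1 ≤ H < 2, H ≥ 2) and assumes that the three numeric columns of Table 3 are H1, H2, and H3; the function name `classify_heterogeneity` is hypothetical and not part of any package used in the study.

```python
def classify_heterogeneity(h: float) -> str:
    """Map a heterogeneity measure H to a Hosking-Wallis style category."""
    if h < 1.0:
        return "acceptably homogeneous"
    if h < 2.0:
        return "possibly heterogeneous"
    return "definitely heterogeneous"


# Values copied from Table 3; the labels H1, H2, H3 are an assumption.
table3 = {
    "Cluster-I": (-1.76, -1.26, -1.77),
    "Cluster-II": (0.90, -0.36, -0.69),
}

# Classify every reported measure for every cluster.
labels = {
    cluster: [classify_heterogeneity(h) for h in values]
    for cluster, values in table3.items()
}

for cluster, categories in labels.items():
    print(cluster, categories)
```

Every value in Table 3 is below 1, so each measure falls in the first category for both clusters.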
Selections of best fit distribution
The third stage of RFA is fitting candidate distributions and selecting the best-fitting one. Hosking and Wallis24 used standard three-parameter distributions, namely the Generalized Pareto (GPA), Generalized Logistic (GLO), Generalized Extreme Value (GEV), Generalized Normal (GNO), and Pearson Type 3 (P3) distributions. This stage has two goals. The first is the selection of the best distribution. The second is the estimation of quantiles for each region at several return periods. Hosking and Wallis provide two tools for identifying the best distribution: the Z goodness-of-fit statistic and the L-moment ratio diagram.
The selection of the best-fit distribution for each cluster is based on the L-moment ratio diagram and the Z statistic. A distribution is considered acceptable if $|Z^{\mathrm{DIST}}| \le 1.64$ at the 5% level of significance. More than one distribution may satisfy this criterion, in which case the distribution whose Z value is closest to zero is taken as the best fit.
Table 4 summarizes the Z statistics and the best distributions for both homogeneous clusters. For Cluster-I, the Z values of GEV and P3 are the smallest and both are below the critical value of 1.64, so both distributions are acceptable, and the one closer to zero is preferred. Similarly, for Cluster-II, the Z values of GLO and GNO are the smallest and both are below 1.64, so both distributions are acceptable.
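As a minimal sketch of this selection rule, the Python snippet below keeps the distributions whose absolute Z value does not exceed 1.64 and picks the one closest to zero. It uses the Cluster-I values that are legible in Table 4; P3 is left out because its Z value is not shown in the table. The helper `select_by_z` is a hypothetical name introduced only for illustration.

```python
CRITICAL_Z = 1.64  # acceptance threshold quoted in the text


def select_by_z(z_values: dict, critical: float = CRITICAL_Z):
    """Return (accepted distributions ordered by |Z|, best distribution)."""
    accepted = {d: z for d, z in z_values.items() if abs(z) <= critical}
    if not accepted:
        return [], None
    ordered = sorted(accepted, key=lambda d: abs(accepted[d]))
    return ordered, ordered[0]


# Cluster-I Z statistics from Table 4 (P3 omitted: value not shown).
cluster1_z = {"GLO": 3.69, "GEV": 1.01, "GNO": 1.26, "GPA": 4.36}

accepted, best = select_by_z(cluster1_z)
print("accepted:", accepted)
print("best:", best)
```

With these inputs GLO and GPA are rejected, and GEV is selected ahead of GNO because its Z value is nearer to zero.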
L-moments ratio diagram
L-moment ratio diagrams (scatter plots) display the theoretical L-moment ratios of commonly used distributions and provide guidance for selecting an appropriate distribution for the study area based on the regional average values of L-skewness and L-kurtosis. Although it is a subjective method, it is a very popular tool for selecting candidate distributions at the outset. Another advantage of the L-moment ratio diagram is that the moment ratios of multiple distributions can be displayed on the same graph.
The L-moment ratio diagram for the two clusters is shown in Fig. 4. For Cluster-I, the regional average L-skewness and L-kurtosis lie closest to the GEV curve; similarly, for Cluster-II, they lie closest to the GLO curve. The diagram therefore indicates that GEV for Cluster-I and GLO for Cluster-II are the best-fitting distributions.
Fig. 4.
L-moments ratio diagram for both regions.
Constructions of growth curves and accuracy measures for best fit distributions
To evaluate which of the two candidate distributions in each cluster is the most accurate, we performed a Monte Carlo simulation following Meshgi and Khalili32. The RB and RRMSE of the estimated quantiles were used to examine the robustness of the regional distributions.
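The two accuracy measures can be sketched as follows. The snippet assumes the common definitions, RB as the mean of the relative errors and RRMSE as the square root of the mean squared relative error, both expressed as fractions; the exact form used in the study's equations may differ (for example, by a factor of 100). The simulated estimates and the "true" quantile are made-up illustrative numbers, not results from the paper.

```python
import math


def relative_bias(estimates, true_value):
    """Mean of the relative errors (Q_hat - Q) / Q over all simulations."""
    rel_errors = [(q - true_value) / true_value for q in estimates]
    return sum(rel_errors) / len(rel_errors)


def relative_rmse(estimates, true_value):
    """Square root of the mean squared relative error."""
    sq_errors = [((q - true_value) / true_value) ** 2 for q in estimates]
    return math.sqrt(sum(sq_errors) / len(sq_errors))


# Hypothetical quantile estimates from four simulated regions.
simulated = [9.0, 11.0, 10.5, 9.5]
true_q = 10.0  # hypothetical true quantile

rb = relative_bias(simulated, true_q)
rrmse = relative_rmse(simulated, true_q)
print(f"RB = {rb:.4f}, RRMSE = {rrmse:.4f}")
```

An RB near zero indicates little systematic over- or under-estimation, while RRMSE also reflects the spread of the estimates, which is why the two measures can rank distributions differently.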
For Cluster-I, Table 5 shows the RB and RRMSE simulation results for the GEV and P3 distributions for return periods up to 100 years. The RB values of GEV are smaller in absolute value than those of P3 at all return periods except 2 years, so GEV is the more robust distribution in terms of RB. The RRMSE of GEV is lower than that of P3 for return periods of 5 and 10 years, but higher for return periods of 2, 20, 50, and 100 years. Overall, Table 5 shows that the GEV distribution outperforms the P3 distribution, although the RRMSE gives P3 a slight advantage over GEV at longer return periods. Hong and Ye36 found that the GEV distribution is linked to lower values of unrealistic upper-bound quantiles and that it best fits the normalized AMWS.
Table 5.
Accuracy measure for best-fit distributions for Cluster-Ι and Cluster-II.
| Cluster | Distribution | Measure | Q2 | Q5 | Q10 | Q20 | Q50 | Q100 |
|---|---|---|---|---|---|---|---|---|
| Cluster-I | GEV | RB | 0.0003 | 0.0001 | -0.0001 | -0.0002 | -0.0000 | 0.0002 |
| | | RRMSE | 0.0062 | 0.0117 | 0.0186 | 0.0261 | 0.0363 | 0.0446 |
| | P3 | RB | -0.0000 | 0.0004 | 0.0008 | 0.0011 | 0.0015 | 0.0018 |
| | | RRMSE | 0.0057 | 0.0129 | 0.0192 | 0.0249 | 0.0319 | 0.0368 |
| Cluster-II | GLO | RB | 0.0008 | 0.00002 | -0.0002 | -0.0002 | 0.0003 | 0.0013 |
| | | RRMSE | 0.008 | 0.0172 | 0.0283 | 0.0393 | 0.0547 | 0.0674 |
| | GNO | RB | 0.0003 | 0.0001 | 0.0004 | 0.00097 | 0.002 | 0.003 |
| | | RRMSE | 0.009 | 0.0214 | 0.0343 | 0.0461 | 0.0607 | 0.0714 |
Similarly, for Cluster-II, Table 5 shows the RB and RRMSE simulation results for the GLO and GNO distributions for return periods up to 100 years. The RB values of the GLO distribution are smaller in absolute value than those of the GNO distribution at all return periods except 2 years, so GLO is the more robust distribution in terms of RB. The RRMSE of the GLO distribution is also lower than that of the GNO distribution at every return period considered (2, 5, 10, 20, 50, and 100 years). Overall, Table 5 shows that the GLO distribution outperforms the GNO distribution on both measures, so GLO is the robust distribution for Cluster-II. Modarres20 performed an RFA on wind speed data and likewise found that the GLO distribution was the best fit for the region.
Table 6.
Regional quantile estimation for best-fit distributions of both clusters.
| Cluster | Distribution | $\xi$ | $\alpha$ | $k$ | Q2 | Q5 | Q10 | Q20 | Q50 | Q100 |
|---|---|---|---|---|---|---|---|---|---|---|
| I | GEV | 0.9318 | 0.1456 | 0.1227 | 0.9840 | 1.1313 | 1.2182 | 1.2944 | 1.3834 | 1.4438 |
| II | GLO | 0.984 | 0.088 | -0.103 | 0.9847 | 1.1167 | 1.2031 | 1.2895 | 1.4090 | 1.5056 |
Table 7.
At-site quantile estimates for the best-fit distributions of Cluster-I and Cluster-II, using the mean as index parameter.
| Cluster and best-fit distribution | Site | Q2 | Q5 | Q10 | Q20 | Q50 | Q100 |
|---|---|---|---|---|---|---|---|
| Cluster-I (GEV) | Upper Dir | 7.4203 | 8.4219 | 9.070 | 9.7141 | 10.596 | 11.304 |
| | Drosh | 7.2871 | 8.2708 | 8.9079 | 9.5398 | 10.406 | 11.101 |
| | Chitral | 7.9353 | 9.0065 | 9.7002 | 10.388 | 11.332 | 12.088 |
| | Lower Dir | 8.1976 | 9.3041 | 10.020 | 10.731 | 11.706 | 12.488 |
| | Kalam | 9.2154 | 10.459 | 11.265 | 12.064 | 13.160 | 14.039 |
| | Kohistan | 8.7928 | 9.9797 | 10.748 | 11.510 | 12.5569 | 13.395 |
| | Mir Khani | 9.8977 | 11.233 | 12.099 | 12.957 | 14.1349 | 15.078 |
| | Saidu Sharif | 7.5678 | 8.5894 | 9.2510 | 9.907 | 10.8076 | 11.529 |
| | Malam Jabba | 7.1958 | 8.1671 | 8.7962 | 9.4202 | 10.2763 | 10.962 |
| | RMC Peshawar | 7.6484 | 8.6808 | 9.3495 | 10.012 | 10.9226 | 11.651 |
| | Abbottabad | 8.2462 | 9.3594 | 10.080 | 10.795 | 11.7764 | 12.562 |
| | Cherat | 9.4419 | 10.716 | 11.541 | 12.360 | 13.4839 | 14.384 |
| Cluster-II (GLO) | Parachinar | 11.8518 | 13.440 | 14.480 | 15.520 | 16.958 | 18.121 |
| | Bannu | 12.2903 | 13.937 | 15.016 | 16.094 | 17.586 | 18.791 |
| | D.I. Khan | 10.3419 | 11.728 | 12.635 | 13.543 | 14.798 | 15.812 |
| | Tank | 10.4381 | 11.837 | 12.753 | 13.669 | 14.935 | 15.959 |
Regional quantiles estimations for different return periods
After selecting the best-fit distributions, the next stage in regional frequency analysis is to estimate the quantiles for each return period. The return period $T$ is the average interval between recurrences of an event of a given magnitude, such as a flood, drought, streamflow, rainfall, or earthquake. It is related to the exceedance probability $P$ by $T = 1/P$. The probability of exceedance is the chance that the event is exceeded in any given year, that is, $P = 1/T$. For example, for a return period of 20 years, $P = 1/20 = 0.05$ is the probability of exceedance, whereas $F = 1 - P = 0.95$ is the probability of non-exceedance.
After selecting the most suitable regional distribution, we estimate the regional quantiles and parameters of the two clusters. Table 6 shows the best-fit distribution of both clusters and regional quantiles.
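The regional quantiles in Table 6 can be approximately reproduced from the tabulated parameters. The sketch below assumes that the three parameter columns of Table 6 are the location, scale, and shape parameters (written here as `xi`, `alpha`, `k`) and that the GEV and GLO quantile functions follow the Hosking and Wallis parameterization; both points are assumptions, since the text does not state them explicitly. Because the tabulated parameters are rounded, small differences from the tabulated quantiles are to be expected.

```python
import math


def non_exceedance(T: float) -> float:
    """Non-exceedance probability F = 1 - 1/T for return period T (years)."""
    return 1.0 - 1.0 / T


def gev_quantile(F: float, xi: float, alpha: float, k: float) -> float:
    """GEV quantile in the Hosking-Wallis form (assumes k != 0)."""
    return xi + alpha * (1.0 - (-math.log(F)) ** k) / k


def glo_quantile(F: float, xi: float, alpha: float, k: float) -> float:
    """Generalized logistic quantile in the Hosking-Wallis form (k != 0)."""
    return xi + alpha * (1.0 - ((1.0 - F) / F) ** k) / k


RETURN_PERIODS = (2, 5, 10, 20, 50, 100)

# Parameters copied from Table 6 (column meaning is an assumption).
gev_params = {"xi": 0.9318, "alpha": 0.1456, "k": 0.1227}  # Cluster-I
glo_params = {"xi": 0.984, "alpha": 0.088, "k": -0.103}    # Cluster-II

gev_growth = {T: gev_quantile(non_exceedance(T), **gev_params)
              for T in RETURN_PERIODS}
glo_growth = {T: glo_quantile(non_exceedance(T), **glo_params)
              for T in RETURN_PERIODS}

for T in RETURN_PERIODS:
    print(f"T={T:>3}  GEV={gev_growth[T]:.4f}  GLO={glo_growth[T]:.4f}")
```

These quantiles are dimensionless growth factors: they describe the regional distribution after each site's data have been scaled by its index value.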
At-site quantile estimation using the mean as index parameter
For a fitted regional frequency distribution, the at-site quantile is obtained by multiplying the regional quantile by the sample mean of the site. By definition, the at-site quantile estimate based on the mean is

$$\hat{Q}_i(F) = \hat{\mu}_i\,\hat{q}(F), \qquad (40)$$

where $\hat{Q}_i(F)$ is the at-site quantile estimate for site $i$, $\hat{\mu}_i$ is the sample mean of site $i$, and $\hat{q}(F)$ is the quantile function of the fitted regional frequency distribution (RFD).
The at-site quantiles for Cluster-I and Cluster-II are estimated using the sample mean, and the results are shown in Table 7. For each cluster, the at-site quantile estimates are based on its best-fit distribution. For Cluster-I the best-fit distribution is GEV, and the estimates for return periods up to 100 years are given in Table 7. A quantile estimate can be calculated for each site in Cluster-I for any given return period by multiplying the regional quantile estimate by the mean of that site. Consider, for example, the site Upper Dir, which has a mean annual maximum wind speed of 7.525667 m/s. The regional 50-year quantile (non-exceedance probability 0.980) is 1.3834 (Table 6), and Table 7 reports the corresponding at-site estimate as 10.596 m/s, which is the extreme wind speed expected to be exceeded on average once in 50 years. All other sites, and Cluster-II, can be interpreted similarly.
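The scaling step itself is a single multiplication per return period, as the following Python sketch shows. The regional growth factors are the Cluster-I GEV values from Table 6, whereas the site mean of 8.0 m/s is a hypothetical value chosen only for illustration; `at_site_quantiles` is likewise a hypothetical helper name.

```python
# Regional GEV growth factors for Cluster-I, copied from Table 6.
REGIONAL_GEV = {2: 0.9840, 5: 1.1313, 10: 1.2182,
                20: 1.2944, 50: 1.3834, 100: 1.4438}


def at_site_quantiles(site_mean: float, regional_quantiles: dict) -> dict:
    """Scale each regional quantile by the site's index value (its mean)."""
    return {T: site_mean * q for T, q in regional_quantiles.items()}


hypothetical_mean = 8.0  # m/s, illustrative only, not a site from the study
site_q = at_site_quantiles(hypothetical_mean, REGIONAL_GEV)

for T, q in site_q.items():
    print(f"{T:>3}-year at-site quantile: {q:.4f} m/s")
```

Because the 2-year growth factor is slightly below 1, the 2-year at-site quantile falls just under the site mean, while the quantiles for 5 years and longer exceed it.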
Table 8.
Standard Errors of At-site Quantile Estimate using the mean as an index parameter for both clusters.
| Cluster and best-fit distribution | Site | Q2 | Q5 | Q10 | Q20 | Q50 | Q100 |
|---|---|---|---|---|---|---|---|
| Cluster-I (GEV) | Upper Dir | 0.6266 | 0.8470 | 1.0569 | 1.2438 | 1.4639 | 1.6172 |
| | Drosh | 0.6213 | 0.8376 | 1.0433 | 1.2267 | 1.4426 | 1.5931 |
| | Chitral | 0.6837 | 0.9192 | 1.1427 | 1.3421 | 1.5771 | 1.7409 |
| | Lower Dir | 0.7017 | 0.9450 | 1.1762 | 1.3824 | 1.6252 | 1.7945 |
| | Kalam | 0.7796 | 1.0533 | 1.3139 | 1.5460 | 1.8192 | 2.0096 |
| | Kohistan | 0.7361 | 0.9975 | 1.2466 | 1.4684 | 1.7293 | 1.9110 |
| | Mir Khani | 0.8458 | 1.1396 | 1.4188 | 1.6679 | 1.9611 | 2.1655 |
| | Saidu Sharif | 0.6447 | 0.8694 | 1.0831 | 1.2735 | 1.4978 | 1.6541 |
| | Malam Jabba | 0.6143 | 0.8279 | 1.0310 | 1.2120 | 1.4253 | 1.5738 |
| | RMC Peshawar | 0.6434 | 0.8707 | 1.0871 | 1.2799 | 1.5068 | 1.6648 |
| | Abbottabad | 0.7091 | 0.9539 | 1.1862 | 1.3935 | 1.6377 | 1.8079 |
| | Cherat | 0.7971 | 1.0776 | 1.3446 | 1.5825 | 1.8625 | 2.0576 |
| Cluster-II (GLO) | Parachinar | 1.1506 | 1.6427 | 2.0832 | 2.4431 | 2.872 | 3.1842 |
| | Bannu | 1.1907 | 1.7011 | 2.1581 | 2.5314 | 2.9770 | 3.299 |
| | D.I. Khan | 0.9655 | 1.3989 | 1.7863 | 2.1011 | 2.475 | 2.7464 |
| | Tank | 0.9564 | 1.3743 | 1.7482 | 2.0529 | 2.4162 | 2.6791 |
In a recent study, Ali et al.37 analyzed wind data for the Peshawar and Abbottabad sites and reported that these areas fall in regions with low wind speeds. Table 7 shows that, for return periods of 5 years and longer, the estimated extreme wind quantiles at every site exceed the corresponding site mean and increase with the return period. This rising trend in extreme wind quantiles in the KPK region gives useful information to engineers and policymakers working on wind energy.
The standard errors of the estimated at-site quantile
Hosking and Wallis24 proposed a simulation algorithm for assessing the accuracy of regional estimates, usually summarized by “Abs. Bias”, “Bias” and “RMSE”. The same simulation output can also be used to obtain the standard error of the estimated quantile at each site in the region.
For all sites, we used Eq. (36) to compute the standard errors of the at-site quantile estimates. The at-site quantile estimates for both clusters are calculated using the sample mean as the index parameter, with GEV as the best-fit regional frequency distribution for Cluster-I and GLO for Cluster-II. Table 8 shows the results for both clusters. These standard errors indicate the reliability of the estimated quantiles and provide a useful basis for comparison with future studies.
Summary and conclusions
This study investigated the RFA of AMWS at 16 stations in Khyber Pakhtunkhwa, Pakistan. The initial screening of the AMWS was carried out using the time series plot, the Spearman test, the Mann-Whitney U test, and the Wald-Wolfowitz test. All 16 stations passed the initial screening and were used for the RFA of AMWS. In the first step of the RFA, the discordancy measure was applied, and none of the sites was found to be discordant, so all 16 stations were retained. Ward's hierarchical clustering technique grouped the sixteen stations into two homogeneous clusters. According to the Z statistic and the L-moment ratio diagram, the GEV and GLO distributions were the best fit among the candidate distributions for Clusters I and II, respectively.
The simulation-based accuracy measures RB and RRMSE were used to identify a robust regional distribution. Based on these measures, the GEV distribution for Cluster-I and the GLO distribution for Cluster-II are found to be the most suitable choices for regional AMWS analysis in this study, in agreement with the Z statistics and the L-moment ratio diagram.
At-site wind quantiles are estimated using the mean as an index value, and for each site, the quantiles show a rising trend. Furthermore, the standard error of each quantile for small to large return periods is small and shows that the estimated quantiles are efficient and reliable for practical use.
Future research should expand the geographic scope of the study to include more regions within Pakistan and neighboring countries and conduct temporal analyses to investigate changes in extreme wind speed patterns over time, considering potential climate change impacts. Comparing L-moments with other estimation methods, incorporating additional meteorological variables and using higher-resolution data could enhance accuracy. Implementing pilot projects and long-term projections using regional climate models will help future-proof infrastructure. Interdisciplinary collaboration, assessing the economic impact of extreme wind events, and increasing public awareness are crucial for integrating findings into building codes and disaster management plans, ensuring practical applicability, and enhancing community resilience.
Acknowledgements
This research project was funded by (i) Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2024R358), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia, and (ii) the Deanship of Scientific Research and Graduate Studies at King Khalid University, through a Large Research Groups Project under grant number R.G.P.2/338/45.
Author contributions
Ishfaq Ahmad and Muhammad Shafeeq ul Rehman Khan analyzed the data and revised the manuscript. Muhammad Salman and Ehtshaam Ul Haq wrote the manuscript. Ibrahim Mufrah Almanjahie and Fatimah Alshahrani performed all the mathematical work. Muhammad Fawad prepared all the figures for the study.
Data availability
The datasets used during the current study are available from the corresponding author on reasonable request.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Ma, X. Adaptive extremum control and wind turbine control (Doctoral dissertation, Technical University of Denmark) (1997).
- 2. Sarrias, R., Fernández, L. M., García, C. A. & Jurado, F. Energy storage systems for wind power application. Renew. Energy Power Qual. J. 1 (08), 1117–1122 (2010).
- 3. Hussain, M. A., Abbas, S., Ansari, M. R. K., Zaffar, A. & Jan, B. Wind Speed Analysis of some Coastal Areas near Karachi (Pakistan Academy of Sciences, 2012).
- 4. Fawad, M., Yan, T., Chen, L., Huang, K. & Singh, V. P. Multiparameter probability distributions for at-site frequency analysis of annual maximum wind speed with L-moments for parameter estimation. Energy. 181, 724–737 (2019).
- 5. Dai, K., Bergot, A., Liang, C., Xiang, W. N. & Huang, Z. Environmental issues associated with wind energy–A review. Renew. Energy. 75, 911–921 (2015).
- 6. Huang, J. & McElroy, M. B. A 32-year perspective on the origin of wind energy in a warming climate. Renew. Energy. 77, 482–492 (2015).
- 7. Darbandi, S., Aalami, M. T. & Asadi, H. Comparison of four distributions for frequency analysis of wind speed. Environ. Nat. Resour. Res. 2 (1), 96 (2012).
- 8. Aized, T. Design and analysis of wind pump for wind conditions in Pakistan. Adv. Mech. Eng. 11 (9), (2019).
- 9. Hulio, Z. H., Jiang, W. & Rehman, S. Techno-Economic assessment of wind power potential of Hawke’s Bay using Weibull parameter: a review. Energy Strategy Reviews. 26, 100375 (2019).
- 10. Huang, M., Li, Q., Xu, H., Lou, W. & Lin, N. Non-stationary statistical modeling of extreme wind speed series with exposure correction. Wind Struct. 26 (3), 129–146 (2018).
- 11. Li, Q., Zhang, J., Wang, R., Liu, J. & Li, P. Typhoon-resistant performance Assessment of Coastal Rural Residential Keel Brick walls reinforced with high ductility concrete. J. Mar. Sci. Eng. 10 (11), 1766 (2022).
- 12. Hosking, J. R. M. & Wallis, J. R. Some statistics useful in regional frequency analysis. Water Resour. Res. 29 (2), 271–281 (1993).
- 13. Heo, S. et al. Non-gaussian multivariate statistical monitoring of spatio-temporal wind speed frequencies to improve wind power quality in South Korea. J. Environ. Manage. 318, 115516 (2022).
- 14. Elavarasan, R. M. et al. A comprehensive review on renewable energy development, challenges, and policies of leading Indian states with an international perspective. IEEE Access. 8, 74432–74457 (2020).
- 15. Thomson, G. Charles Spearman, 1863–1945, 5(15) (1947).
- 16. Wald, A. & Wolfowitz, J. An exact test for randomness in the non-parametric case based on serial correlation. Ann. Math. Stat. 14 (4), 378–388 (1943).
- 17. Mann, H. B. & Whitney, D. R. On a test of whether one of two random variables is stochastically larger than the other. Ann. Math. Stat. 50–60 (1947).
- 18. Fawad, M., Ahmad, I., Nadeem, F. A., Yan, T. & Abbas, A. Estimation of wind speed using regional frequency analysis based on linear-moments. Int. J. Climatol. 38 (12), 4431–4444 (2018).
- 19. Goel, N. K., Burn, D. H., Pandey, M. D. & An, Y. Wind quantile estimation using a pooled frequency analysis approach. J. Wind Eng. Ind. Aerodyn. 92 (6), 509–528 (2004).
- 20. Modarres, R. Regional maximum wind speed frequency analysis for the arid and semi-arid regions of Iran. J. Arid Environ. 72 (7), 1329–1342 (2008).
- 21. Yu, I., Kim, J. & Jeong, S. Development of probability wind speed map based on frequency analysis. Spat. Inform. Res. 24, 577–587 (2016).
- 22. Alam, J., Muzzammil, M. & Khan, M. K. Regional flood frequency analysis: comparison of L-moment and conventional approaches for an Indian catchment. ISH J. Hydraulic Eng. 22 (3), 247–253 (2016).
- 23. Hosking, J. R. L-moments: analysis and estimation of distributions using linear combinations of order statistics. J. Royal Stat. Soc. Ser. B: Stat. Methodol. 52 (1), 105–124 (1990).
- 24. Hosking, J. R. M. & Wallis, J. R. Regional frequency analysis (1997).
- 25. Ouarda, T. B. et al. Intercomparison of regional flood frequency estimation methods at ungauged sites for a Mexican case study. J. Hydrol. 348 (1–2), 40–58 (2008).
- 26. Rao, A. R. & Srinivas, V. V. Regionalization of Watersheds: An Approach Based on Cluster Analysis, Vol. 58 (Springer Science & Business Media, 2008).
- 27. Arellano-Lara, F. & Escalante-Sandoval, C. A. Multivariate delineation of rainfall homogeneous regions for estimating quantiles of maximum daily rainfall: a case study of northwestern Mexico. Atmósfera. 27 (1), 47–60 (2014).
- 28. Rasheed, A., Egodawatta, P., Goonetilleke, A. & McGree, J. A novel approach for delineation of homogeneous rainfall regions for water sensitive urban design–a case study in Southeast Queensland. Water. 11 (3), 570 (2019).
- 29. Ward, J. H. Jr. Hierarchical grouping to optimize an objective function. J. Am. Stat. Assoc. 58 (301), 236–244 (1963).
- 30. Peel, M. C., Wang, Q. J., Vogel, R. M. & McMahon, T. A. The utility of L-moment ratio diagrams for selecting a regional probability distribution. Hydrol. Sci. J. 46 (1), 147–155 (2001).
- 31. Vogel, R. M. & Fennessey, N. M. L moment diagrams should replace product moment diagrams. Water Resour. Res. 29 (6), 1745–1752 (1993).
- 32. Meshgi, A. & Khalili, D. Comprehensive evaluation of regional flood frequency analysis by L-and LH-moments. I. A re-visit to regional homogeneity. Stoch. Env. Res. Risk Assess. 23, 119–135 (2009).
- 33. Rafiq, L. & Tajbar, S. Exploring wind energy potential in KPK-Pakistan by using multi criteria approach. Eskişehir Tech. Univ. J. Sci. Technol. A-Applied Sci. Eng. 20 (2), 171–178 (2019).
- 34. Kassem, Y., Camur, H., Abdalla, M. A. H. A., Erdem, B. D. & Al-ani, A. M. R. Evaluation of wind energy potential for different regions in Lebanon based on NASA wind speed database. In IOP Conference Series: Earth and Environmental Science, Vol. 926, No. 1, p. 012093 (IOP Publishing, 2021).
- 35. Deshmane, M. K. S., Yadav, M. A. A., Ingawale, M. S. M. & Kamble, M. A. S. Wind data estimation of Kolhapur district using improved hybrid optimization by genetic algorithms (iHOGA) and NASA Prediction of Worldwide Energy resources (NASA Power). Int. Res. J. Eng. Technol. (IRJET). 7 (3), 2530–2538 (2020).
- 36. Hong, H. P. & Ye, W. Estimating extreme wind speed based on regional frequency analysis. Struct. Saf. 47, 67–77 (2014).
- 37. Ali, B. et al. A comparative study to analyze wind potential of different wind corridors. Energy Rep. 9, 1157–1170 (2023).