Scientific Reports. 2025 Aug 28;15:31758. doi: 10.1038/s41598-025-12810-0

A systematic approach to modeling monthly maximum temperature and total rainfall in Kenya

Kevin Otieno 1, Linda Chaba 1,2, Collins Odhiambo 1,3, Bernard Omolo 1,4
PMCID: PMC12394451  PMID: 40877330

Abstract

Goodness of fit (GOF) tests for selecting probability distributions of climatic variables are pervasive in the statistical literature. However, approaches that combine multiple tests remain underutilized despite evidence supporting their improved precision. Increasingly erratic climatic conditions pose severe threats to economic stability, necessitating robust statistical methods for climate modeling. To address this need, this study evaluates probability distributions for climatic variables using a comprehensive approach that combines multiple tests. A scoring system ranked each distribution’s performance across tests, with a composite score indicating the best fit. To assess robustness, a sensitivity analysis on the best-performing distribution examined the influence of partitioning the data into different segments (block sizes). The results show the generalized extreme value (GEV) distribution consistently outperforming the other candidate distributions for both temperature and rainfall data across multiple metrics. Extended block sizes capture long-term climatic patterns but introduce greater uncertainty due to fewer data points, while shorter block sizes tend to overfit. Intermediate block sizes provide a balance, producing reliable parameter estimates and stable return levels. These findings underscore the importance of selecting suitable block sizes and confirm the robustness of the GEV distribution for climate modeling. The study contributes to improved methodologies for risk assessment and climate adaptation strategies, particularly in regions such as Kenya.

Keywords: Goodness of fit tests, Probability distributions, Temperature, Rainfall, Block size

Subject terms: Hydrology, Statistics, Climate and Earth system modelling

Introduction

Kenya’s increasing exposure to the effects of climate variability is a pressing issue, especially as erratic rainfall and rising temperatures significantly affect its key sectors. Agriculture, a backbone of Kenya’s economy1,2, is particularly vulnerable, as unpredictable weather disrupts planting and harvesting cycles, reduces yields, and exacerbates food insecurity. Infrastructure, too, faces challenges, with extreme weather events such as floods and droughts causing damage to roads, bridges, and other critical systems. The cumulative effect of these climate-induced challenges undermines the country’s overall economic stability, highlighting the urgent need for robust mitigation and adaptation strategies.

The effects of climate variability are particularly evident in regions like Marsabit, where prolonged droughts and heavy rainfall lead to severe consequences. Droughts reduce water availability, hinder crop growth, and limit pastures, leading to crop failures and livestock losses, exacerbating food insecurity3–5. In contrast, intense rainfall causes soil erosion, farmland flooding, and infrastructure damage, imposing significant financial burdens on the government for repairs and diverting resources from development projects.

These recurring events underscore the urgent need for sustainable strategies, such as climate-resilient agricultural practices, improved water management systems, and robust infrastructure design. Investments in early warning systems and community-based adaptation measures are also critical to mitigating the impacts on vulnerable populations.

A deeper understanding of climatic variables, such as rainfall and temperature, can be achieved through probability distributions, which provide valuable tools to analyze climate patterns6. Globally, researchers have identified region- and time-dependent distributions for these variables, with models such as GEV, Gamma, log-normal, and Weibull frequently recommended for climatic data. Notable studies include those by Sharma and Singh7, Dzupire et al.8, Athulya and James9, Ozonur et al.10, Ximenes et al.11, Hussain et al.12, Singirankabo and Iyamuremye13 and Agbonaye and Izinyon14. For example, Ximenes et al.11 found Gamma and Weibull to be optimal for monthly precipitation in Northeast Brazil, while Douka and Karacostas15 identified GEV and log-normal as suitable for extreme precipitation in Thessaloniki, Greece. The differences in the selected distributions between these two studies11,15 can be attributed to their different geographical locations; Greece is located between Inline graphic and northeast Brazil is Inline graphic. The two studies also used different periods and temporal resolutions: Greece’s data comprised monthly precipitation records from 1988 to 2017, whereas the study on Northeast Brazil used hourly rainfall data from 1947 to 2003. These studies, together with the summary in Table 1, demonstrate the importance of selecting appropriate probability distributions for accurate climate modeling.

Table 1.

Literature results of probability density functions (PDF) fitted to rainfall data.

Author (ref.) | Region | Data | PDF assessed | Best PDF
7 | Pantnagar, India | Annual maximum daily rainfall | Normal, LN, Gamma, Weibull, Pearson, GEV | LN, Gamma
14 | South Eastern Nigeria | Annual maximum daily rainfall | EV1, GEV, GPA, LN, Pearson type III and log Pearson type III | GEV
16 | Jericho, Ibadan, Nigeria | Daily rainfall | Exponential, Gamma, Normal and Poisson | Exponential
17 | Japan | Annual maximum hourly rainfall | Normal, LN, Gumbel, G2, P3 and LP3 | LP3
18 | Bangladesh | Annual maximum daily rainfall | Normal, N2, N3, N4 and N5 | Normal
19 | Narok Town, Kenya | Annual maximum daily rainfall | Normal, LN, Weibull, Gamma, Gumbel, Exponential and Pareto | GEV, Gumbel, Gamma
20 | Colombia | Annual maximum hourly rainfall | Gumbel, LP3, P3, Normal, GEV | GEV
6 | Qatar | Annual maximum daily rainfall | Normal, LN, Gamma, Gumbel, Log-logistic, GEV, Pearson, LP3, Beta, Weibull and General Pareto | GEV
21 | Wilayah Persekutuan, Malaysia | Annual maximum hourly rainfall | Exponential, Gamma, Weibull and Mixed Exponential | Mixed Exponential

LN - Lognormal, GPA - Generalized Pareto distribution, EV1 - Extreme value type I distribution, G2 - Gamma 2, GEV - Generalized Extreme Value, P3 - Pearson type 3, LP3 - Log-Pearson type 3, N2 - Mixtures of two normal, N3 - Mixtures of three normal, N4 - Mixtures of four normal, N5 - Mixtures of five normal

Extensive research has also been conducted to identify the best-fitting probability distributions for temperature data. Key studies include those by Athulya and James9, Dzupire et al.8, Hasan22, Hossain23, Hussain et al.12 and Ozonur et al.10. These studies have explored various distributions, including the normal, log-normal, Gamma, and Weibull distributions. For instance, Hussain et al.12 identified the Generalized Pareto (GP), Extreme Value (EV), and GEV models as suitable for modeling temperature data. Similarly, Hasan22 employed ten continuous distributions, including the exponential, Gamma, Log-Gamma, Beta, normal, log-normal, Erlang, power function, Rayleigh, and Weibull distributions, with the Beta distribution emerging as the best fit for the temperature data.

This study aims to identify the most appropriate probability distributions for modeling monthly maximum temperatures and total monthly rainfall in Kenya. The analysis is based on a comprehensive data set covering the last 73 years, capturing the impacts of recent climatic changes. By incorporating these extensive and up-to-date data, the study ensures that the models account for evolving climate patterns. For instance, accurate descriptions of climatic data provide a better understanding of the probability distributions of maximum temperatures and total rainfall, which helps capture the frequency and intensity of climatic events, such as heat waves and heavy downpours. These models also enhance predictive capabilities by leveraging historical trends and recent shifts, improving forecasting accuracy and facilitating better preparation for future climatic scenarios. Additionally, by identifying the underlying distributions, the study supports data-driven decision-making, providing a critical foundation for risk assessment and resource allocation in agriculture, water management, and disaster response sectors.

The study makes a significant contribution to modeling climatic events through three key focus areas. First, it provides a comprehensive theoretical framework for understanding and applying statistical distributions in hydrology and climate studies. The framework offers precise definitions of commonly used distributions, facilitating their identification and application to various climatic datasets. It also includes robust parameter estimation methodologies that ensure accurate modeling of climatic variables. Furthermore, the study outlines strategies for selecting extreme values tailored to specific extreme value distributions, enabling the precise focus on significant climatic events.

Second, the research emphasizes the application of GOF tests to identify the most suitable probability distributions for climatic data. Detailed discussions on the implementation of GOF tests enhance the accuracy and reliability of the models. This methodological rigor improves the alignment of models with observed data and bolsters their credibility for practical applications in risk assessment and decision-making.

Lastly, we emphasized the significance of temporal pattern analysis through block size selection, a crucial factor in statistical modeling that directly impacts the capture of temporal patterns in climatic data. We conducted a sensitivity analysis to assess the impact of varying block sizes on the GEV distribution. This analysis combined graphical methods, GOF tests, return level estimates for various periods, and confidence intervals. By examining the effect of block size on model performance and extremal forecast, this section provides valuable insights into the stability and reliability of the GEV distribution across varying temporal resolutions.

The paper is structured as follows. The “Methods” section provides a detailed description of the data, the procedure for selecting candidate probability distributions, parameter estimation methods, and the implementation of GOF tests, including the combined approach of multiple GOF tests. The “Results and discussion” section presents summary statistics, results from the selection of candidate distributions, findings from the GOF tests, and insights from the sensitivity analysis. Finally, the “Conclusion” section summarizes the key findings and their implications for climate modeling and risk assessment.

Methods

Data

The monthly maximum temperature (Tmax) and total precipitation (Prep) data for Kenya, covering the period 1950–2022, were sourced from the World Bank Climate Change Knowledge Portal24. The precipitation data (Prep), measured in millimeters, represents the total accumulation of monthly rainfall. This provides a comprehensive measure of rainfall intensity and distribution across different months. The temperature data (Tmax), recorded in degrees Celsius, captures the highest daily maximum temperature observed each month, offering valuable insights into extreme temperature events.

Selection of candidate probability distributions

A review of existing literature identified probability distributions commonly applied in hydrological studies: exponential, Gamma, Weibull, log-normal, logistic, Gumbel, GPD, and GEV (refs. 7–12, 14, 16–18, 20). Similarly, for temperature data, these distributions, in addition to a normal distribution, were identified as suitable candidates, supported by findings from ref. 22 and other related studies. Table 2 describes each probability distribution function. These distributions were selected due to their suitability in modeling skewed, heavy-tailed, or extreme data characteristics commonly found in climatic datasets. The Cullen and Frey graph25 was used to preliminarily assess the shape characteristics of the data, guiding the selection of appropriate distributions for further analysis.

Table 2.

Description of various probability distribution functions.

Distribution | Parameters
Normal | standard deviation; mean
Lognormal | scale; shape
Weibull | shape; scale
GEV | location; shape; scale
Exponential | rate
Gamma | shape; scale
Logistic | location; scale (s)
Gumbel | location; scale; shape
Uniform | lower bound (a); upper bound (b)
GPD | location; scale; shape

The probability density functions, ranges, and Greek parameter symbols of this table were inline graphics and are not reproduced here.
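
For reference, the two extreme value families listed in Table 2 are usually written in the textbook form below. This is a sketch, not the paper's own notation: the symbols $\mu$ (location), $\sigma$ (scale), $\xi$ (shape), $u$ (threshold), and $p$ (exceedance probability per block) are assumptions standing in for the symbols lost from the table, and the sign convention for the shape parameter may differ from the one the authors used.

```latex
% GEV distribution function for block maxima (\sigma > 0, \xi \neq 0)
G(x) = \exp\left\{ -\left[ 1 + \xi\,\frac{x-\mu}{\sigma} \right]^{-1/\xi} \right\},
\qquad 1 + \xi\,\frac{x-\mu}{\sigma} > 0

% Gumbel limit as \xi \to 0
G(x) = \exp\left\{ -\exp\left[ -\frac{x-\mu}{\sigma} \right] \right\}

% GPD distribution function for excesses y = x - u > 0 over a threshold u
H(y) = 1 - \left( 1 + \frac{\xi\, y}{\sigma} \right)^{-1/\xi},
\qquad 1 + \frac{\xi\, y}{\sigma} > 0

% Return level exceeded on average once every 1/p blocks (\xi \neq 0)
z_p = \mu - \frac{\sigma}{\xi}\left[ 1 - \left\{ -\ln(1-p) \right\}^{-\xi} \right]
```

Under this convention a negative shape parameter implies a bounded upper tail, and a positive one implies a heavy upper tail.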

Parameter estimation

In statistical modeling, parameter estimation is essential due to the typically unknown nature of most model parameters. Commonly employed methods include the Method of Moments, L-moments, Maximum Likelihood Estimation (MLE), and LH-moments, as noted in studies by Al Mamoon and Rahman6 and Haddad and Rahman26. In this paper, we employ the MLE method for parameter estimation across the analyzed distributions, as it is one of the most widely applied and robust methods. MLE is favored for its consistency and efficiency, particularly in large samples, as it maximizes the likelihood of the observed data and often yields more reliable results compared to other methods such as Moments, L-moments, and LH-moments, particularly in terms of asymptotic properties. Research, including foundational studies by Fisher27, Zong28 and Naghettini29, has demonstrated that MLE’s variance and bias are comparatively low, thereby enhancing its suitability across a broad range of distributions. These qualities render MLE exceptionally reliable for environmental datasets, including temperature and rainfall measurements, where precision and robustness are critical.
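
As a minimal illustration of what MLE does, the sketch below fits a log-normal distribution, for which the estimates have a closed form. It is not the authors' code: the function names and the sample values are hypothetical, and distributions such as the GEV have no closed-form MLE and need numerical optimization instead.

```python
import math


def lognormal_mle(data):
    """Closed-form MLE of the log-normal parameters (mean and SD on the log scale)."""
    logs = [math.log(x) for x in data]
    n = len(logs)
    mu = sum(logs) / n
    # The MLE of the variance divides by n, not by n - 1.
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
    return mu, sigma


def lognormal_loglik(data, mu, sigma):
    """Log-likelihood of independent log-normal observations."""
    const = math.log(sigma * math.sqrt(2 * math.pi))
    return sum(
        -math.log(x) - const - (math.log(x) - mu) ** 2 / (2 * sigma ** 2)
        for x in data
    )


# Hypothetical monthly rainfall totals in mm (illustrative only, not the study data).
rain = [31.0, 54.5, 77.2, 22.4, 140.3, 63.1, 48.7, 95.2]
mu_hat, sigma_hat = lognormal_mle(rain)
best = lognormal_loglik(rain, mu_hat, sigma_hat)

# Moving either parameter away from the MLE lowers the log-likelihood.
print(best > lognormal_loglik(rain, mu_hat + 0.1, sigma_hat))
print(best > lognormal_loglik(rain, mu_hat, sigma_hat * 1.1))
```

The maximized log-likelihood returned here is the quantity that information criteria such as AIC and BIC are built on.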

Goodness of fit tests

The suitability of each probability distribution was assessed using a suite of GOF tests, including the Kolmogorov-Smirnov (KS), Anderson-Darling (AD), Cramer-von Mises (CvM), and Chi-Square tests. These tests evaluate the alignment between theoretical and empirical distributions, with the KS test focusing on overall distributional fit15,30, AD and CvM emphasizing tail behavior15,26,31–33, and Chi-Square examining frequency alignment19. Additional evaluation was performed using the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) to balance model complexity and fit10,12,22,26, along with Root Mean Square Error (RMSE) to quantify predictive accuracy14.
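
Two of these measures are simple enough to sketch directly. The code below computes the KS statistic against a fitted CDF and the AIC and BIC from a maximized log-likelihood; the function names are hypothetical and p-values are deliberately left out. As a consistency check, it recovers the GEV temperature BIC reported in the results from the reported AIC, assuming k = 3 parameters and n = 876 observations.

```python
import math


def ks_statistic(data, cdf):
    """Largest gap between the empirical CDF and a fitted CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def aic(loglik, k):
    """Akaike Information Criterion for k fitted parameters."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian Information Criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * loglik


# Log-likelihood implied by the reported GEV temperature AIC of 2898.30 (k = 3).
implied_loglik = (2 * 3 - 2898.30) / 2
print(round(bic(implied_loglik, 3, 876), 2))  # 2912.63, the BIC reported for the GEV
```

Because the BIC penalty grows with ln(n), it penalizes the three-parameter GEV more heavily than the two-parameter candidates than the AIC does.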

Comprehensive scoring methodology

The literature indicates that no single GOF test reliably distinguishes between empirical and theoretical distributions34. Numerous studies have shown that the best-fit probability distribution can vary significantly between different regions, even for the same variable32. In response to these challenges, we adopt a comprehensive scoring methodology, as outlined in previous studies14,17,22,35. This method employs an integrated scoring approach that incorporates multiple GOF tests, information criteria, and graphical analyses to ensure a robust selection of the optimal probability distribution model. Each distribution model is subjected to several GOF tests, with a scoring system applied whereby the best-performing model in each test receives rank 1. To enhance the rigor of the selection process, each model’s rank is determined independently for each GOF test and then summed across all tests to produce a composite score, so that the lowest total indicates the best overall fit. For graphical assessments, rankings are informed by visual inspection of density plots and quantile-quantile (Q–Q) plots, providing additional insight into the best-fitting model.
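
The rank aggregation can be sketched as follows. This is an illustrative reading of the scoring scheme rather than the authors' implementation: the function names are hypothetical, ties within a metric are assumed not to occur, and only three of the rainfall metrics reported later in the results are used.

```python
def rank_values(values, lower_is_better=True):
    """Rank distributions on one metric; rank 1 is the best performer."""
    ordered = sorted(values, key=values.get, reverse=not lower_is_better)
    return {name: position for position, name in enumerate(ordered, start=1)}


def composite_scores(metrics):
    """Sum the per-metric ranks; the lowest total is the best overall fit."""
    totals = {}
    for values, lower_is_better in metrics.values():
        for name, rank in rank_values(values, lower_is_better).items():
            totals[name] = totals.get(name, 0) + rank
    return totals


# Rainfall KS statistic, AD statistic, and RMSE (lower is better for all three).
rainfall_metrics = {
    "KS": ({"LogNormal": 0.0491, "Exponential": 0.2020, "Gamma": 0.0594,
            "Weibull": 0.0765, "GEV": 0.0315, "GPD": 0.6694}, True),
    "AD": ({"LogNormal": 2.2209, "Exponential": 58.9464, "Gamma": 3.4029,
            "Weibull": 7.6860, "GEV": 1.1799, "GPD": 341.7810}, True),
    "RMSE": ({"LogNormal": 69.72, "Exponential": 81.22, "Gamma": 62.96,
              "Weibull": 63.38, "GEV": 58.86, "GPD": 77.84}, True),
}

totals = composite_scores(rainfall_metrics)
for name in sorted(totals, key=totals.get):
    print(name, totals[name])
```

With only these three metrics the GEV already has the lowest total, while the log-normal and Gamma tie; adding the remaining criteria, as the full scheme does, is what separates them.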

Results and discussion

This section presents the statistical results of the analysis. The data are assumed to be independent and identically distributed (iid). To verify adherence to these assumptions, we tested for stationarity using the Augmented Dickey-Fuller (ADF) test, randomness using the Wald-Wolfowitz runs test, and independence using the Ljung-Box test. All tests were performed at Inline graphic significance level. The results indicated that the data were stationary and random but exhibited autocorrelation; therefore, the data were aggregated into blocks (block analysis).
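
Of the three checks, the runs test is the easiest to sketch without extra libraries. The version below dichotomizes the series about its median and uses the normal approximation; whether the authors dichotomized about the median is an assumption, the function name is hypothetical, and the code expects values on both sides of the median.

```python
import math
import statistics


def runs_test(series):
    """Wald-Wolfowitz runs test about the median (normal approximation)."""
    med = statistics.median(series)
    # Values equal to the median are dropped, a common convention.
    signs = [x > med for x in series if x != med]
    n1 = sum(signs)
    n2 = len(signs) - n1
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    expected = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    z = (runs - expected) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return runs, z, p_value


# A strictly alternating toy series has far more runs than chance would give.
runs, z, p_value = runs_test([1, 2] * 5)
print(runs, round(z, 3), round(p_value, 4))
```

A small p-value rejects randomness; too few runs (negative z) points to clustering, too many (positive z) to systematic alternation.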

Summary statistics

Table 3 shows the descriptive statistics for the monthly maximum temperature and total monthly rainfall for Kenya.

Table 3.

Summary statistics for the monthly maximum temperature (°C) and total monthly rainfall (mm).

Variable Maximum temperature Total rainfall
Observations 876 876
Average 26.23 63.97
Standard deviation (SD) 1.27 42.72
Minimum 23.16 2.46
q25 25.32 35.90
Median 26.23 50.90
q75 27.15 81.88
Maximum 29.97 280.32
Skewness 0.12 1.46
Kurtosis 2.43 5.43

The maximum temperature (Tmax) for 876 observations has an average of 26.23 °C with low variability (standard deviation = 1.27 °C) and a range from 23.16 °C to 29.97 °C. The interquartile range (25.32 °C to 27.15 °C) highlights a concentration around the median (26.23 °C), with a near-symmetrical distribution (skewness = 0.12) and a slightly flatter-than-normal shape (kurtosis = 2.43, below the value of 3 for a normal distribution). The findings resonate with previous studies1,2, which indicate that while temperature variability at the national level tends to be low due to data aggregation, an increase in temperature has been observed in most regions across the country.

In contrast, total rainfall (Prep) exhibits much higher variability, with a mean of 63.97 mm and a standard deviation of 42.72 mm, ranging from 2.46 mm to 280.32 mm. This wide range reflects the variability and extreme nature of rainfall. The quartiles (q25 = 35.90, q75 = 81.88) and a median of 50.90, which lies below the mean, indicate a right-skewed distribution (skewness = 1.46), while a kurtosis of 5.43, above the value of 3 for a normal distribution, points to heavy tails and hence extreme events. The findings also align with earlier evidence1,2.
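
The skewness and kurtosis discussed above can be computed from central moments as in the sketch below. The function name is hypothetical, and the plain moment estimators (dividing by n) are an assumption, since the paper does not state which estimator was used. Kurtosis here is the non-excess version, for which a normal distribution gives 3.

```python
def sample_skewness_kurtosis(data):
    """Moment-based skewness and (non-excess) kurtosis of a sample."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# A symmetric toy sample: zero skewness and kurtosis well below 3.
skew, kurt = sample_skewness_kurtosis([1, 2, 3, 4, 5])
print(skew, kurt)
```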

Choice of candidate distributions

For the temperature data in Fig. 1a, the Cullen and Frey graph shows that the distribution approximates the normal region with a slight platykurtic shape, identifying the normal, uniform, log-normal, Gamma, Weibull, and logistic distributions as potential candidates. Studies, such as12, have shown that extreme value distributions are suitable for modeling temperature data; therefore, these distributions were also considered potential candidates. In the rainfall data in Fig. 1b, the distribution exhibits positive skewness and high kurtosis, suggesting alignment with distributions such as log-normal, Gamma, Weibull, and exponential. Given the presence of extreme values, models that account for extreme behavior, specifically the GPD and GEV distributions, were also included in the analysis.

Fig. 1.

Cullen and Frey plots for assessing the best-fit distribution of (a) maximum temperature (°C) and (b) total rainfall (mm).

Model fitting was conducted using MLE for parameter estimation. For extreme value distributions, the Block Maxima (BM) and Peak Over Threshold (POT) approaches were used to obtain the block maxima and the threshold exceedances required to fit the GEV and GPD distributions, respectively. The BM approach is widely used in extreme value analysis to capture maximum events within defined time intervals, such as annual maxima, and it is commonly applied to environmental and climate data30,36,37. For the POT method, which models excesses over a specified threshold, the Mean Residual Life (MRL) plot was generated as shown in Fig. 2, and visual inspection was used to determine an appropriate threshold for each variable13,37. The blue curve in Fig. 2 represents the observed mean excess values, the red lines denote the upper and lower confidence limits, and the threshold defines the level above which observations are treated as extreme events38. In Fig. 2a, a threshold in the range of 50 to 150 mm is suitable for rainfall, as it provides a stable mean excess with narrower confidence intervals. This indicates that values above this threshold exhibit behavior suitable for modeling with a GPD. For temperature, the MRL plot in Fig. 2b did not suggest a clear threshold, hence the initial guess of a threshold around Inline graphic, where the confidence intervals remain relatively narrow, indicating reliable estimates. However, after approximately 28 °C, the confidence intervals begin to widen slightly, indicating increased uncertainty in the mean excess values at higher thresholds. The GPD parameters were estimated based on observations exceeding this threshold.
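
The quantity plotted in an MRL plot is the mean excess over each candidate threshold. A minimal sketch, with hypothetical function names and made-up rainfall values, is given below; confidence bands are omitted.

```python
def mean_excess(data, threshold):
    """Mean of (x - threshold) over observations above the threshold.

    Returns (mean excess, number of exceedances); the mean is None when
    nothing exceeds the threshold.
    """
    excesses = [x - threshold for x in data if x > threshold]
    if not excesses:
        return None, 0
    return sum(excesses) / len(excesses), len(excesses)


def mean_residual_life(data, thresholds):
    """Mean excess and exceedance count for each candidate threshold."""
    return [(u,) + mean_excess(data, u) for u in thresholds]


# Hypothetical monthly rainfall totals in mm (illustrative only).
rain = [12.0, 35.9, 50.9, 64.0, 81.9, 110.5, 150.2, 205.7, 280.3]
for u, excess, count in mean_residual_life(rain, [50, 100, 150, 200]):
    print(u, excess, count)
```

A threshold is usually sought in a range where the mean excess is roughly linear in the threshold while enough exceedances remain; the shrinking exceedance count is what widens the confidence limits at high thresholds.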

Fig. 2.

Mean residual life (MRL) plots for evaluating threshold selection in (a) total rainfall (mm) and (b) maximum temperature (°C).

Graphical assessments and GOF tests results

Graphical assessments

Density and Q–Q plots were generated to compare the observed data with several fitted theoretical distributions. For temperature data, the density plot in Fig. 3 shows that the GEV, Gamma, and log-normal distributions provide the best fit, capturing both the central peak and tail behavior. The normal, Weibull, and logistic distributions also perform reasonably well but exhibit slight deviations in the tails. In contrast, the uniform distribution shows significant discrepancies, particularly in the extremes, suggesting its unsuitability for modeling extreme temperature events. The Q–Q plots in Fig. 4 reveal that most distributions demonstrate deviations in the tails, with the GEV and normal distributions showing the closest adherence to the theoretical quantiles. Among the fitted distributions, the GEV, normal, log-normal, and Gamma distributions provide the best fit in that order, followed by the logistic and Weibull distributions, which exhibit moderate deviations. In contrast, the GPD and uniform distributions exhibit a substantial lack of fit, particularly at the lower and upper tails. This visual approach to identifying the best-fitting distribution is inherently subjective and, therefore, cannot be relied upon solely. To enhance robustness, these results were complemented with findings from other GOF tests to improve the reliability of distribution selection.

Fig. 3.

Density plots of observed and simulated maximum temperature (°C) data to assess the performance of probability distributions.

Fig. 4.

Quantile–Quantile (Q–Q) plots for comparing the fit of eight probability distributions to maximum temperature (°C).

Similarly, for the rainfall data in Fig. 5, the GEV, Gamma, and log-normal distributions show the closest alignment with the observed data, effectively capturing the shape and spread of the distribution. The Weibull distribution provides a moderate fit, performing well in the central range but diverging in the tails. In contrast, the exponential and GPD distributions exhibit substantial deviations, failing to represent the empirical distribution accurately, especially at the extremes. The Q–Q plots in Fig. 6 reinforce these findings, with the GEV and Gamma distributions displaying the best adherence to the theoretical quantile line, followed by the log-normal and Weibull distributions. The exponential and GPD distributions exhibit the weakest performance. These results are consistent with previous studies, such as21, which identified the GEV distribution as the most appropriate model for extreme rainfall events.

Fig. 5.

Density plots of observed and simulated total rainfall (mm) data to assess the performance of probability distributions.

Fig. 6.

Q–Q plots to compare the fit of six probability distributions for total rainfall (mm) data.

GOF tests

The GOF analysis in Table 4 (a) identifies the GEV distribution as the most suitable model for the maximum temperature data. The GEV distribution achieves the lowest KS (0.0297), AD (0.8890), and CvM (0.1335) statistics, accompanied by the highest p-values (0.4206, 0.4211, and 0.4442), indicating close agreement with the observed data. It also produces the lowest Chi-square statistic (3.5969, p = 0.9637) and the lowest AIC (2898.30) and BIC (2912.63) among the distributions fitted to the full sample; only the GPD, which is fitted to threshold exceedances and is therefore not directly comparable, has lower values. Its RMSE (1.5694), however, is higher than that of the normal, log-normal, Gamma, logistic, and Weibull fits. Other distributions, such as the normal, log-normal, and Gamma, provide moderate fits, with non-significant GOF statistics and lower RMSE values but higher AIC and BIC values than the GEV. Conversely, the Weibull, Uniform, Logistic, and GPD distributions exhibit poor performance, with high test statistics, mostly low p-values, and significant deviations from the observed data. The Uniform and GPD distributions show extreme misalignment, as evidenced by infinite AD statistics, high Chi-square values, and elevated RMSE scores, confirming their unsuitability for modeling maximum temperature data.

Table 4.

Goodness of fit test results for temperature and rainfall distributions.

Distribution KS-statistic KS-p AD-statistic AD-p CvM-statistic CvM-p Chi-square-statistic Chi-square-p AIC BIC RMSE
(a) Temperature distributions
   Normal 0.0401 0.1202 1.4212 0.1965 0.2031 0.2617 16.3101 0.0911 2910.44 2919.99 1.3321
   LogNormal 0.0342 0.2559 1.2911 0.2353 0.1898 0.2882 16.5991 0.0837 2907.32 2916.87 1.3335
   Gamma 0.0362 0.2006 1.2962 0.2336 0.1883 0.2913 16.2235 0.0934 2907.81 2917.36 1.3324
   Weibull 0.0626 0.0021 7.7696 0.0001 1.0687 0.0017 51.381 0.0000 3012.31 3021.86 1.4798
   Uniform 0.0457 0.0517 Inf 0.0000 12.1632 0.0000 389.4772 0.0000 2954.14 2963.69 1.819
   Logistic 0.1993 0.0000 2.745 0.037 0.3695 0.0871 36.0075 0.0001 3365.02 3374.57 1.3876
   GEV 0.0297 0.4206 0.889 0.4211 0.1335 0.4442 3.5969 0.9637 2898.30 2912.63 1.5694
   GPD 1.000 0.000 Inf 0.000 237.66 0.000 Inf 0.0000 1965.84 1974.98 25.0933
(b) Rainfall distributions
   LogNormal 0.0491 0.0292 2.2209 0.0697 0.3347 0.1083 12.34 0.2628 8731.72 8741.27 69.72
   Exponential 0.2020 0.0000 58.9464 0.0000 10.6643 0.0000 198.86 0.0001 9039.54 9044.31 81.22
   Gamma 0.0594 0.0041 3.4029 0.0172 0.6878 0.0136 36.40 0.0000 8717.89 8727.44 62.96
   Weibull 0.0765 0.0001 7.6860 0.0002 1.3937 0.0003 70.78 0.0000 8762.53 8772.08 63.38
   GEV 0.0315 0.3487 1.1799 0.2753 0.1891 0.2897 17.13 0.0716 8713.87 8728.20 58.86
   GPD 0.6694 0.0000 341.7810 0.0000 68.2021 0.0000 273.02 0.0000 4286.20 4294.42 77.84

For the rainfall data in Table 4 (b), the GEV distribution also emerges as the most robust model, as reflected in the highest p-values for the KS (0.3487), AD (0.2753), and CvM (0.2897) tests, indicating minimal deviation from the observed data. Furthermore, the GEV achieves among the lowest AIC (8713.87) and BIC (8728.20) values, highlighting its parsimony and suitability for modeling rainfall patterns. Its superior predictive accuracy is evident from the lowest RMSE value (58.86), reinforcing its reliability. For the Chi-square test, the log-normal distribution has the lowest statistic (12.34), indicating a better fit on that criterion, with the GEV second (17.13). Yuan et al.17 reported a similar finding when they used Chi-square tests to evaluate the best fit for the frequency analysis of the annual maximum hourly precipitation. In contrast, the GPD and exponential distributions perform poorly, with p-values at or near zero, high Chi-square statistics, and elevated RMSE values, indicating substantial deviation and limited applicability for modeling rainfall data.

A comprehensive scoring method was used to further evaluate the best-fitting distributions, with findings presented in Table 5. Analysis for temperature distributions in Table 5 (a) revealed that the GEV consistently outperformed the others, as observed in39, achieving the best overall rank with the lowest total score of 17. This was supported by its superior performance in key tests, including the KS, AD, and CvM tests. The Gamma and log-normal distributions ranked second and third, respectively, demonstrating moderate fits across multiple metrics. However, distributions like Weibull, Uniform, Logistic, and GPD performed poorly, accumulating higher total scores and displaying suboptimal results in density plots and Q–Q plots.

Table 5.

Goodness of fit rankings for temperature and rainfall distributions.

Distribution KS test AD test CVM test CHI test AIC BIC RMSE Density plot QQ plot Total score Overall rank
(a) Temperature distributions
   Normal 4 4 4 3 5 5 1 4 2 32 4
   Log-normal 2 2 3 4 3 3 3 3 3 26 3
   Gamma 3 3 2 2 4 4 2 1 4 25 2
   Weibull 6 6 6 6 7 7 5 5 6 54 5
   Uniform 5 8 7 7 6 6 7 6 8 60 8
   Logistic 7 5 5 5 8 8 4 8 5 55 7
   GEV 1 1 1 1 2 2 6 2 1 17 1
   GPD 8 7 8 8 1 1 8 7 7 55 6
(b) Rainfall distributions
   Log-normal 2 2 2 1 4 4 4 2 3 24 3
   Exponential 5 5 5 5 6 6 6 5 5 48 6
   Gamma 3 3 3 3 3 2 2 1 2 22 2
   Weibull 4 4 4 4 5 5 3 3 4 36 4
   GEV 1 1 1 2 2 3 1 4 1 16 1
   GPD 6 6 6 6 1 1 5 6 6 43 5

For rainfall distributions, the ranking analysis in Table 5 (b) confirms that the GEV distribution is again the top performer, ranking first with a total score of 16. These findings are supported by Agbonaye and Izinyon14, Al Mamoon and Rahman6, Alam et al.18, Coronado-Hernández et al.36, Fadhilah et al.21, Ghosh et al.40, Ng et al.35 and Yuan et al.17. Its strength was evident across most GOF tests, where it outperformed or closely matched the best-performing distributions in each category. The Gamma distribution ranked second, showcasing a strong overall fit with balanced performance across metrics. Log-normal followed in third place, excelling in certain tests but lagging in others, such as AIC and BIC. In contrast, the Weibull distribution demonstrated a weaker fit, the GPD ranked last on most individual tests despite the lowest AIC and BIC, and the exponential distribution ranked last overall with a total score of 48.

Sensitivity analysis

To evaluate the robustness of the GEV distribution’s fit, a sensitivity analysis was performed using various block sizes designed to capture diverse temporal patterns and extremes. A block is one of a series of independent groups of observations of a particular length38. According to Coles38, block sizes are often selected to capture a specific period. In this work, the block sizes included annual, seasonal, monthly, 5-year, 10-year, 12-month moving averages, 6-month intervals, and 4-month intervals. Annual blocks, where maximum values were extracted per year, followed the methodologies outlined in38,41. Seasonal blocks were based on quarterly aggregations, as indicated by41 and42. Monthly blocks were used to capture monthly maxima, as discussed in42 and43. For longer-term patterns, multi-year blocks of 5-year and 10-year intervals were established, consistent with approaches adopted in studies such as44. A 12-month moving average window assessed rolling maxima, highlighting shifts in trends. Event-based blocks focused on the most extreme events by isolating total rainfall above the 95th percentile following the techniques used in45. For intermediate seasonality, semi-annual blocks divided each year into January–June and July–December intervals, consistent with approaches used by42,43,46. Furthermore, a regional seasonal classification for Kenya was used to account for local climatic variations, with blocks corresponding to the “Hot and Dry”, “Long Rainy”, “Cool”, and “Short Rainy” seasons, building on the framework proposed by47. For each block length, maximum values were extracted, and the GEV parameters were estimated and presented in Table 6.
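
The extraction of block maxima for fixed-length blocks can be sketched as below. The function name is hypothetical, and dropping an incomplete trailing block is an assumption, since the handling of partial blocks is not stated. Applied to a series of 876 monthly values, it shows how quickly the sample available for fitting shrinks as the block grows.

```python
def block_maxima(series, block_size):
    """Maximum of each complete, non-overlapping block of `block_size` values."""
    n_blocks = len(series) // block_size  # an incomplete final block is dropped
    return [
        max(series[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]


# Placeholder monthly series of the same length as the study period (876 months).
monthly = list(range(876))
for label, months in [("annual", 12), ("semi-annual", 6), ("5-year", 60), ("10-year", 120)]:
    print(label, len(block_maxima(monthly, months)))
```

Annual blocks leave 73 maxima, whereas 10-year blocks leave only 7 complete blocks, which is why the long-block parameter estimates carry much larger standard errors.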

Table 6.

ML estimates and significance of the GEV location, scale, and shape parameters for maximum temperature and total rainfall across block sizes.

| Model | Parameter | Temperature estimate | Temperature std. error | Temperature p value | Rainfall estimate | Rainfall std. error | Rainfall p value |
|---|---|---|---|---|---|---|---|
| Annual | Location | 27.818 | 0.1015 | Inline graphic | 128.1784 | 4.8262 | Inline graphic |
| Annual | Scale | 0.805 | 0.0690 | Inline graphic | 35.9544 | 3.5314 | Inline graphic |
| Annual | Shape | −0.325 | 0.0537 | Inline graphic | −0.0563 | 0.0974 | 0.5633 |
| Quarterly | Location | 26.437 | 0.0739 | Inline graphic | 68.5266 | 2.4114 | Inline graphic |
| Quarterly | Scale | 1.129 | 0.0534 | Inline graphic | 34.8286 | 1.8871 | Inline graphic |
| Quarterly | Shape | −0.250 | 0.0424 | Inline graphic | 0.1231 | 0.0593 | Inline graphic |
| Monthly | Location | 25.764 | 0.0462 | Inline graphic | 42.9879 | 1.0338 | Inline graphic |
| Monthly | Scale | 1.236 | 0.0330 | Inline graphic | 26.7123 | 0.8179 | Inline graphic |
| Monthly | Shape | −0.248 | 0.0221 | Inline graphic | 0.1837 | 0.0295 | Inline graphic |
| 5-Year | Location | 28.404 | 0.1846 | Inline graphic | 182.8691 | 10.2237 | Inline graphic |
| 5-Year | Scale | 0.622 | 0.1298 | Inline graphic | 33.6435 | 7.3215 | Inline graphic |
| 5-Year | Shape | −0.255 | 0.1806 | 0.1580 | −0.1164 | 0.2184 | 0.5942 |
| 10-Year | Location | 28.649 | 0.2487 | Inline graphic | 217.9374 | 14.7674 | Inline graphic |
| 10-Year | Scale | 0.538 | 0.1914 | Inline graphic | 34.3022 | 11.3699 | Inline graphic |
| 10-Year | Shape | −0.143 | 0.4396 | 0.7458 | −0.4127 | 0.3434 | 0.2295 |
| 12-Month moving average | Location | 25.780 | 0.0460 | Inline graphic | 42.8250 | 1.0363 | Inline graphic |
| 12-Month moving average | Scale | 1.220 | 0.0330 | Inline graphic | 26.6061 | 0.8195 | Inline graphic |
| 12-Month moving average | Shape | −0.242 | 0.0230 | Inline graphic | 0.1827 | 0.0297 | Inline graphic |
| Event-based | Location | 28.641 | 0.0382 | Inline graphic | 170.5155 | 2.5342 | Inline graphic |
| Event-based | Scale | 0.204 | 0.0316 | Inline graphic | 13.9905 | 2.2711 | Inline graphic |
| Event-based | Shape | 0.245 | 0.1757 | 0.1627 | 0.3974 | 0.1753 | Inline graphic |
| Semi-annual | Location | 26.976 | 0.0927 | Inline graphic | 100.9257 | 3.5479 | Inline graphic |
| Semi-annual | Scale | 1.009 | 0.0669 | Inline graphic | 36.9574 | 2.6486 | Inline graphic |
| Semi-annual | Shape | −0.267 | 0.0564 | Inline graphic | 0.0049 | 0.0749 | 0.9481 |
| Seasons | Location | 26.487 | 0.0825 | Inline graphic | 62.9546 | 2.1866 | Inline graphic |
| Seasons | Scale | 1.308 | 0.0595 | Inline graphic | 32.2959 | 1.7431 | Inline graphic |
| Seasons | Shape | −0.350 | 0.0293 | Inline graphic | 0.1850 | 0.0543 | Inline graphic |

Significant values are in bold.

For both rainfall and temperature data, the parameter estimates differ notably between block sizes, particularly in the shape parameter, which defines tail behavior. For rainfall, the annual, 5-year, and 10-year blocks exhibited negative but non-significant shape parameters, pointing to the Weibull class of distributions, as reported in ref. 30, while also signaling uncertainty in tail estimates for these broader temporal aggregations. In contrast, the monthly, quarterly, event-based, and seasonal blocks yielded significant positive shape parameters, reflecting the heavy-tailed Fréchet class of distributions with well-defined extremal patterns. This is in agreement with Moccia et al.33, although Onwuegbuche et al.48 and Singirankabo et al.37 found the Gumbel distribution to be optimal. The location and scale parameters were significant across all block sizes, indicating reliable estimation of central tendency and variability. The event-based block for rainfall, with the highest shape estimate (0.3974), suggested a heavier tail and a higher propensity for extreme rainfall events compared to other blocks. For temperature data, the location and scale parameters were also significant across all blocks, confirming stable estimates of central tendency and variability. However, the shape parameter was not significant for the 5-year, 10-year, and event-based models, indicating uncertainty in tail estimates, which is likely due to the limited number of data points or the irregular occurrence of extreme events. In contrast, the quarterly, monthly, and seasonal models produced significant shape parameters for both variables, suggesting that they provide more robust and reliable tail estimates for predicting rare and extreme values of temperature and rainfall.
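As a rough check on the significance discussion above, the sketch below computes a two-sided Wald p value from an estimate and its standard error and labels the GEV class by the sign of the shape estimate. It assumes that the p values in Table 6 come from a normal-approximation Wald test of the null hypothesis that the parameter is zero; the source does not state this, although the numeric p values in the table are consistent with it. The function names are hypothetical.

```python
import math


def wald_p_value(estimate: float, std_error: float) -> float:
    """Two-sided p value for H0: parameter = 0 under a normal approximation."""
    z = abs(estimate / std_error)
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) for the standard normal CDF Phi.
    return math.erfc(z / math.sqrt(2.0))


def gev_class(shape: float) -> str:
    """Label the GEV sub-family by the sign of the shape estimate alone."""
    if shape > 0:
        return "Frechet (heavy upper tail)"
    if shape < 0:
        return "Weibull (bounded upper tail)"
    return "Gumbel (light tail)"


# Shape estimates and standard errors copied from Table 6.
shape_rows = {
    "rainfall, annual": (-0.0563, 0.0974),
    "rainfall, 10-year": (-0.4127, 0.3434),
    "temperature, 5-year": (-0.255, 0.1806),
    "temperature, 10-year": (-0.143, 0.4396),
}

for label, (estimate, std_error) in shape_rows.items():
    p = wald_p_value(estimate, std_error)
    print(f"{label}: p = {p:.4f}, class by sign = {gev_class(estimate)}")
```

A label based on sign alone says nothing about significance: when the p value is large, the estimate cannot be distinguished from zero, which corresponds to the Gumbel case.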

The model diagnostic tests in Table 7 indicate that the 10-year and 5-year blocks provide the best fit overall, with the lowest AIC and BIC values for rainfall (AIC = 74.406 and 146.985, respectively) and among the lowest for temperature, where only the event-based block has a lower AIC (16.764, versus 18.446 for the 10-year block), indicating strong model parsimony and minimal information loss. These longer blocks effectively capture long-term extreme trends but rely on few data points (n = 7 and 14, respectively), which increases uncertainty in parameter estimates through larger variances, as demonstrated by ref. 46. This finding aligns with refs. 38 and 41, which emphasize the effectiveness of larger blocks in capturing long-term climatic trends by averaging out short-term fluctuations, thereby focusing on extreme patterns. Event-based and annual blocks also perform well for rainfall, with low AIC and BIC values, reflecting their stability in representing extreme events with adequate data, as supported by ref. 42. In contrast, higher-frequency blocks, such as the monthly and 12-month moving average models, exhibit much higher AIC and BIC values for both rainfall and temperature, suggesting potential overfitting and inefficiency in capturing extreme patterns, a limitation also noted by ref. 43. Mid-range blocks, including quarterly, semi-annual, and seasonal, achieve moderate AIC and BIC values for both datasets, offering a balanced approach that captures seasonal variability while maintaining sufficient stability for reliable parameter estimation. This perspective is supported by refs. 15, 42 and 46, which highlight the value of intermediate temporal scales in balancing the trade-offs between long-term trend analysis and sufficient data representation.

Table 7.

Model performance metrics for maximum temperature (°C) and total rainfall (mm) across different blocks.

| Model | Block | Temperature log-likelihood | Temperature AIC | Temperature BIC | Temperature n | Rainfall log-likelihood | Rainfall AIC | Rainfall BIC | Rainfall n |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Annual | −84.807 | 175.613 | 182.485 | 73 | −374.640 | 755.281 | 762.152 | 73 |
| 2 | Quarterly | −456.207 | 918.413 | 929.444 | 292 | −1519.203 | 3044.407 | 3055.437 | 292 |
| 3 | Monthly | −1446.151 | 2898.303 | 2912.629 | 876 | −4353.937 | 8713.874 | 8728.200 | 876 |
| 4 | 5-Year | −13.415 | 32.830 | 34.747 | 14 | −70.493 | 146.985 | 148.902 | 14 |
| 5 | 10-Year | −6.223 | 18.446 | 18.284 | 7 | −34.203 | 74.406 | 74.244 | 7 |
| 6 | 12-Month moving average | −1420.455 | 2846.911 | 2861.199 | 865 | −4295.304 | 8596.607 | 8610.895 | 865 |
| 7 | Event-based | −5.382 | 16.764 | 21.977 | 42 | −195.366 | 396.732 | 402.085 | 44 |
| 8 | Semi-annual | −210.050 | 426.100 | 435.051 | 146 | −758.480 | 1522.961 | 1531.911 | 146 |
| 9 | Seasons | −480.508 | 967.016 | 978.046 | 292 | −1507.363 | 3020.726 | 3031.756 | 292 |
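The AIC and BIC columns of Table 7 can be related to the log-likelihood and n columns through the standard definitions AIC = 2k − 2·logL and BIC = k·ln(n) − 2·logL. The sketch below assumes k = 3 free parameters (location, scale, shape) for every GEV fit, which is consistent with the tabulated values but is not stated explicitly in the text. The names `aic`, `bic`, and `table7_temperature` are illustrative.

```python
import math

K_GEV = 3  # assumed number of free GEV parameters: location, scale, shape


def aic(loglik: float, k: int = K_GEV) -> float:
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * k - 2 * loglik


def bic(loglik: float, n: int, k: int = K_GEV) -> float:
    """Bayesian information criterion: k * ln(n) - 2 * log-likelihood."""
    return k * math.log(n) - 2 * loglik


# (log-likelihood, n) pairs for maximum temperature, copied from Table 7.
table7_temperature = {
    "Annual": (-84.807, 73),
    "5-Year": (-13.415, 14),
    "10-Year": (-6.223, 7),
    "Event-based": (-5.382, 42),
}

for block, (loglik, n) in table7_temperature.items():
    print(f"{block}: AIC = {aic(loglik):.3f}, BIC = {bic(loglik, n):.3f}, n = {n}")
```

Because the log-likelihood is a sum over the n block maxima, its magnitude, and with it AIC and BIC, tends to grow with n. The values for different block sizes are therefore best read alongside the n column rather than in isolation.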

In addition, we computed return levels for different return periods to determine how the various models estimate extremes. The return level represents the magnitude of an event expected to be equaled or exceeded, on average, once within a specified return period38,48. The results in Fig. 7 reveal distinct patterns across models when estimating temperature and rainfall extremes at various return periods. For temperature (Fig. 7a), the 10-year and 5-year models consistently produce the highest return levels, maintaining stability across increasing return periods, as observed in ref. 48, indicating their robustness in estimating extreme values over longer intervals. In contrast, models with finer resolutions, such as the monthly and 12-month moving average models, yield lower return levels with modest increases over time, suggesting a limited capacity to capture rare extremes. The quarterly and semi-annual models show moderate return levels, providing a balanced estimation that captures both seasonal variability and long-term trends. For rainfall (Fig. 7b), a similar pattern emerges, with the 10-year, 5-year, and seasonal models achieving the highest and most stable return levels, while finer models such as the monthly and 12-month moving average models display lower return levels and less pronounced growth across return periods. The event-based model exhibits high initial return levels but plateaus at longer return periods, indicating potential limitations in capturing prolonged extremes. Overall, the 10-year, 5-year, and seasonal models appear to be the most consistent for temperature and rainfall extremes.
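For reference, a return level can be obtained from fitted GEV parameters with the standard quantile formula z_T = μ − (σ/ξ)[1 − y^(−ξ)], where y = −ln(1 − 1/T), with the Gumbel limit z_T = μ − σ·ln(y) when ξ = 0. The sketch below applies it to the annual rainfall estimates from Table 6. The function name and the chosen return periods are illustrative, and the return period T is counted in blocks (years for annual blocks); this is not necessarily how the curves in Fig. 7 were produced.

```python
import math


def gev_return_level(loc: float, scale: float, shape: float, return_period: float) -> float:
    """Level exceeded on average once every `return_period` blocks under a GEV fit."""
    if return_period <= 1:
        raise ValueError("return_period must be greater than 1 block")
    y = -math.log(1.0 - 1.0 / return_period)
    if abs(shape) < 1e-12:
        # Gumbel limit of the GEV quantile function (shape -> 0).
        return loc - scale * math.log(y)
    return loc - (scale / shape) * (1.0 - y ** (-shape))


# Annual-block rainfall estimates from Table 6 (location, scale, shape).
LOC, SCALE, SHAPE = 128.1784, 35.9544, -0.0563

annual_rainfall_levels = {
    period: gev_return_level(LOC, SCALE, SHAPE, period)
    for period in (2, 10, 50, 100)
}

for period, level in annual_rainfall_levels.items():
    print(f"{period}-year return level: {level:.1f} mm")
```

With a negative shape estimate the levels rise with T but stay below the finite upper endpoint μ − σ/ξ, which is one way a return level curve can flatten at long return periods.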

Fig. 7.

Return level plots for different block sizes for (a) maximum temperature (°C) and (b) total rainfall (mm).

Finally, we used density plots to check how each model captures the distribution of maximum temperature and total rainfall. For temperature (Fig. 8a), the 10-year, 5-year, and event-based models display the most concentrated curves, suggesting a narrower range with more pronounced extremes. Models with higher temporal resolutions, such as the monthly and 12-month moving average models, exhibit wider density curves, indicating a broader distribution that captures more frequent fluctuations but is less focused on extremes. The quarterly and semi-annual models fall between these extremes, striking a balance between stability and variability. For rainfall (Fig. 8b), a similar pattern emerges: the 10-year and 5-year models show steeper, more concentrated curves, indicating that they effectively capture rare, high-magnitude events. In contrast, finer-resolution models, such as the monthly and 12-month moving average models, have flatter curves, capturing a wider range of data with less emphasis on extremes.
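The shape of such density curves follows directly from the fitted parameters. The sketch below evaluates the GEV density for two temperature fits from Table 6, illustrating why a smaller scale estimate gives a narrower, taller curve. The function names are hypothetical, the code assumes a shape parameter greater than −1, and it is not the plotting procedure used for Fig. 8.

```python
import math


def gev_pdf(x: float, loc: float, scale: float, shape: float) -> float:
    """GEV density at x (assumes shape > -1); zero outside the support."""
    z = (x - loc) / scale
    if abs(shape) < 1e-12:
        log_t = -z  # Gumbel limit
    else:
        base = 1.0 + shape * z
        if base <= 0.0:
            return 0.0  # x lies outside the support
        log_t = -math.log(base) / shape
    if log_t > 700.0:
        return 0.0  # t is astronomically large, so exp(-t) underflows to zero
    t = math.exp(log_t)
    return math.exp((shape + 1.0) * log_t - t) / scale


def upper_endpoint(loc: float, scale: float, shape: float) -> float:
    """Finite upper bound of the support when shape < 0, otherwise infinity."""
    return loc - scale / shape if shape < 0 else math.inf


# (location, scale, shape) for maximum temperature, copied from Table 6.
temperature_fits = {
    "Monthly": (25.764, 1.236, -0.248),
    "10-Year": (28.649, 0.538, -0.143),
}

for block, (loc, scale, shape) in temperature_fits.items():
    print(
        f"{block}: density at location = {gev_pdf(loc, loc, scale, shape):.3f}, "
        f"upper endpoint = {upper_endpoint(loc, scale, shape):.2f} °C"
    )
```

At x = μ the density equals e^(−1)/σ regardless of the shape, so the 10-year fit (σ = 0.538) is more than twice as tall at its location as the monthly fit (σ = 1.236).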

Fig. 8.

Density plots for different block sizes for (a) maximum temperature (°C) and (b) total rainfall (mm).

Conclusion

In this study, we have assessed various probability distributions for modeling maximum temperature and total rainfall data using a systematic and comprehensive approach that combines several GOF tests and graphical tools. In addition, we have identified the optimal block size for the GEV distribution using return levels across different periods, as well as log-likelihood, AIC, and BIC. Insights from GOF tests highlighted that the GEV, Gamma, and log-normal distributions were well-suited for both maximum temperature and total rainfall datasets, as they consistently aligned with empirical data. On the other hand, distributions such as uniform, Weibull, and logistic showed a poor fit across multiple metrics, underscoring their limitations in capturing the complexities of climatic variables. The GEV distribution emerged as the optimal model for rainfall and temperature data, consistently outperforming others in key metrics such as the AIC, BIC, and RMSE. It also demonstrated superior performance in GOF tests, including the KS, AD, and CVM tests. This strong performance affirms the robustness of the GEV distribution in modeling climatic extremes and its capacity to provide reliable insights into long-term trends.

Block size analysis revealed the effectiveness of longer temporal aggregations, such as 10-year and 5-year blocks, which produced stable and high return levels across return periods, effectively capturing long-term extreme trends. However, these longer blocks increased uncertainty in parameter estimates due to fewer data points. In contrast, intermediate blocks, such as quarterly and seasonal, struck a balance by capturing seasonal variations while maintaining stability and reliable parameter estimates with moderate AIC and BIC values. High-frequency blocks, such as monthly and 12-month moving averages, although rich in data, exhibited higher AIC and BIC values, suggesting potential overfitting and inefficiency in representing extreme values.

The results of this study are important for Kenya and the wider East African region, where the adopted methodology can also be applied. The comprehensive GOF testing supports more reliable modeling and forecasting of temperature and rainfall, which is crucial for risk assessment and the development of climate adaptation strategies. With this knowledge, predictions of and preparations for catastrophic events, such as floods, droughts, or rising temperatures, can be improved. With better forecasts, policymakers and the government can improve infrastructure for water catchment systems and strengthen agricultural activities through proper planning and disaster preparedness.

However, a key limitation of this study is its focus on individual probability distributions for temperature and rainfall without explicitly addressing the interdependence between these variables. Since temperature and rainfall are inherently related, accurate risk assessments and effective climate adaptation strategies require consideration of their associations. Extensive research has been conducted on the dependence between temperature and rainfall; therefore, future studies should prioritize exploring dependence structures within a multivariate framework using the fitted probability distributions identified in this study. Advanced approaches such as copula models or joint distribution analyses could provide deeper insights into the interactions between these variables, particularly under extreme climatic conditions. Such efforts would significantly enhance the reliability of climate models and their applicability to integrated risk assessment frameworks.

To build on this work, future research should focus on applying this methodology at finer spatial scales using real datasets from various regions in Kenya. Conducting probability distribution analyses at regional levels, incorporating block size analysis, and integrating data from multiple weather stations could yield region-specific insights into seasonal rainfall patterns, further informing targeted climate adaptation strategies. From a policy perspective, the results underscore the need for data-driven strategies that take into account both the individual and the joint variability of climatic variables. Policymakers should leverage these insights to design robust adaptation measures tailored to Kenya’s specific climate challenges, such as better agricultural planning, improved water resource management, and more resilient infrastructure.

Acknowledgements

The authors acknowledge with gratitude the support from Strathmore Institute of Mathematical Sciences, Strathmore University and the DAAD [ST32 - PKZ: 91789473] in the production of this manuscript.

Author contributions

K.O., B.O. and L.C. conceived the project. K.O. performed the analysis and drafted the manuscript with substantial contributions from B.O., L.C., and C.O. All authors have read and approved the final version of the manuscript.

Data availability

The data that support the findings of this study are accessible to registered users (free registration) on the World Bank, Climate Change Knowledge Portal (https://climateknowledgeportal.worldbank.org/).

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. GOK. Kenya Climate Smart Agriculture Strategy, 2017–2026 (Ministry of Agriculture, Livestock and Fisheries, 2017).
  • 2. Jalango, D. et al. Climate smart agriculture investment plan for Kenya. In Accelerating Impacts of CGIAR Climate Research for Africa (AICCRA) (2022).
  • 3. Nyika, J. M. Climate change situation in Kenya and measures towards adaptive management in the water sector. In Research Anthology on Environmental and Societal Impacts of Climate Change, 1857–1872 (IGI Global, 2022).
  • 4. Ngure, M. W., Wandiga, S. O., Olago, D. O. & Oriaso, S. O. Climate change stressors affecting household food security among Kimandi–Wanyaga smallholder farmers in Murang’a County, Kenya. Open Agric. 6, 587–608 (2021).
  • 5. Mkonda, M. Y. & He, X. Are rainfall and temperature really changing? Farmer’s perceptions, meteorological data, and policy implications in the Tanzanian semi-arid zone. Sustainability 9, 1412 (2017).
  • 6. Al Mamoon, A. & Rahman, A. Selection of the best fit probability distribution in rainfall frequency analysis for Qatar. Nat. Hazards 86, 281–296 (2017).
  • 7. Sharma, M. A. & Singh, J. B. Use of probability distribution in rainfall analysis. N. Y. Sci. J. 3, 40–49 (2010).
  • 8. Dzupire, N. C., Ngare, P. & Odongo, L. A copula based bi-variate model for temperature and rainfall processes. Sci. Afr. 8, e00365 (2020).
  • 9. Athulya, P. & James, K. Best fit probability distributions for monthly radiosonde weather data. Int. J. Adv. Manag. Technol. Eng. Sci. 7, 24–31 (2017).
  • 10. Ozonur, D., Pobocikova, I. & de Souza, A. Statistical analysis of monthly rainfall in central west Brazil using probability distributions. Model. Earth Syst. Environ. 7, 1979–1989 (2021).
  • 11. Ximenes, P. S. M. P., Silva, A. S. A., Ashkar, F. & Stosic, T. Best-fit probability distribution models for monthly rainfall of northeastern Brazil. Water Sci. Technol. 84, 1541–1556 (2021).
  • 12. Hussain, B. et al. Interdependence between temperature and precipitation: Modeling using copula method toward climate protection. Model. Earth Syst. Environ. 8, 2753–2766 (2022).
  • 13. Singirankabo, E. & Iyamuremye, E. Modelling extreme rainfall events in Kigali city using generalized Pareto distribution. Meteorol. Appl. 29, e2076 (2022).
  • 14. Agbonaye, A. & Izinyon, O. Best-fit probability distribution model for rainfall frequency analysis of three cities in south eastern Nigeria. Niger. J. Environ. Sci. Technol. (NIJEST) 1, 34–42 (2017).
  • 15. Douka, M. & Karacostas, T. Statistical analyses of extreme rainfall events in Thessaloniki, Greece. Atmos. Res. 208, 60–77 (2018).
  • 16. Oseni, B. A. & Ayoola, F. J. Fitting the statistical distribution for daily rainfall in Ibadan, based on chi-square and Kolmogorov–Smirnov goodness-of-fit tests. West Afr. J. Ind. Acad. Res. 7, 93–100 (2013).
  • 17. Yuan, J., Emura, K., Farnham, C. & Alam, M. A. Frequency analysis of annual maximum hourly precipitation and determination of best fit probability distribution for regions in Japan. Urban Clim. 24, 276–286 (2018).
  • 18. Alam, M. A., Farnham, C. & Emura, K. Best-fit probability models for maximum monthly rainfall in Bangladesh using Gaussian mixture distributions. Geosciences 8, 138 (2018).
  • 19. Houessou-Dossou, E. A. Y., Mwangi Gathenya, J., Njuguna, M. & Abiero Gariy, Z. Flood frequency analysis using participatory GIS and rainfall data for two stations in Narok town, Kenya. Hydrology 6, 90 (2019).
  • 20. Coronado-Hernández, Ó. E., Merlano-Sabalza, E., Díaz-Vergara, Z. & Coronado-Hernández, J. R. Selection of hydrological probability distributions for extreme rainfall events in the regions of Colombia. Water 12, 1397 (2020).
  • 21. Fadhilah, Y. et al. Fitting the best-fit distribution for the hourly rainfall amount in the Wilayah Persekutuan. Jurnal Teknologi 46, 49–58 (2007).
  • 22. Hasan, R. H. R. Estimating the best-fitted probability distribution for monthly maximum temperature at the Sylhet station in Bangladesh. J. Math. Stat. Stud. 2, 60–67 (2021).
  • 23. Hossain, M. Fitting the probability distribution of monthly maximum temperature of some selected stations from the northern part of Bangladesh. Int. J. Ecol. Econ. Stat. 39, 80–91 (2018).
  • 24. World Bank. Climate change knowledge portal (2024). Accessed 16 Sept 2023.
  • 25. Cullen, A. C. & Frey, H. C. Probabilistic techniques in exposure assessment (1999).
  • 26. Haddad, K. & Rahman, A. Selection of the best fit flood frequency distribution and parameter estimation procedure: A case study for Tasmania in Australia. Stoch. Environ. Res. Risk Assess. 25, 415–428 (2011).
  • 27. Fisher, R. A. On the mathematical foundations of theoretical statistics. In Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, vol. 222, 309–368 (1922).
  • 28. Zong, Z. Information-Theoretic Methods for Estimating of Complicated Probability Distributions Vol. 207 (Elsevier, 2006).
  • 29. Naghettini, M. Fundamentals of Statistical Hydrology (Springer, 2017).
  • 30. Chikobvu, D. & Chifurira, R. Modelling of extreme minimum rainfall using generalised extreme value distribution for Zimbabwe. S. Afr. J. Sci. 111, 01–08 (2015).
  • 31. Sukrutha, A., Dyuthi, S. R. & Desai, S. Probability distribution for monthly precipitation data in India. arXiv preprint arXiv:1708.03144 (2017).
  • 32. Lima, A. O. et al. Extreme rainfall events over Rio de Janeiro state, Brazil: Characterization using probability distribution functions and clustering analysis. Atmos. Res. 247, 105221 (2021).
  • 33. Moccia, B., Mineo, C., Ridolfi, E., Russo, F. & Napolitano, F. Probability distributions of daily rainfall extremes in Lazio and Sicily, Italy, and design rainfall inferences. J. Hydrol. Reg. Stud. 33, 100771 (2021).
  • 34. Razali, N. M. et al. Power comparisons of Shapiro–Wilk, Kolmogorov–Smirnov, Lilliefors and Anderson–Darling tests. J. Stat. Model. Anal. 2, 21–33 (2011).
  • 35. Ng, J. et al. Investigation of the best fit probability distribution for annual maximum rainfall in Kelantan river basin. In IOP Conference Series: Earth and Environmental Science, vol. 476, 012118 (IOP Publishing, 2020).
  • 36. Coronado-Hernández, Ó. E., Merlano-Sabalza, E., Díaz-Vergara, Z. & Coronado-Hernández, J. R. Selection of hydrological probability distributions for extreme rainfall events in the regions of Colombia. Water 12, 1397 (2020).
  • 37. Singirankabo, E., Iyamuremye, E., Habineza, A. & Nelson, Y. Statistical modelling of maximum temperature in Rwanda using extreme value analysis. Open J. Math. Sci. 7, 180–195 (2023).
  • 38. Coles, S. Basics of statistical modeling. In An Introduction to Statistical Modeling of Extreme Values 18–44 (2001).
  • 39. Ng, J. et al. Statistical modelling of extreme temperature in peninsular Malaysia. In IOP Conference Series: Earth and Environmental Science, vol. 1022, 012072 (IOP Publishing, 2022).
  • 40. Ghosh, S., Roy, M. K. & Biswas, S. C. Determination of the best fit probability distribution for monthly rainfall data in Bangladesh. Am. J. Math. Stat. 6, 170–174 (2016).
  • 41. Villarini, G., Smith, J. A., Serinaldi, F. & Ntelekos, A. A. Analyses of seasonal and annual maximum daily discharge records for central Europe. J. Hydrol. 399, 299–312 (2011).
  • 42. Hasan, H., Radi, N. A. & Kassim, S. Modeling of extreme temperature using generalized extreme value (GEV) distribution: A case study of Penang. Proc. World Congr. Eng. 1, 181–186 (2012).
  • 43. Ender, M. & Ma, T. Extreme value modeling of precipitation in case studies for China. Int. J. Sci. Innov. Math. Res. (IJSIMR) 2, 23–36 (2014).
  • 44. Fowler, H. & Kilsby, C. A regional frequency analysis of United Kingdom extreme rainfall from 1961 to 2000. Int. J. Climatol. J. R. Meteorol. Soc. 23, 1313–1334 (2003).
  • 45. Gilleland, E., Ribatet, M. & Stephenson, A. G. A software review for extreme value analysis. Extremes 16, 103–119 (2013).
  • 46. Özari, Ç., Eren, Ö. & Saygin, H. A new methodology for the block maxima approach in selecting the optimal block size. Tehnički vjesnik 26, 1292–1296 (2019).
  • 47. Musyoka, M. M. Spatial–Temporal Characteristics of Rainfall Events in Kenya. Ph.D. thesis, University of Nairobi (2020).
  • 48. Onwuegbuche, F. C. et al. Application of extreme value theory in predicting climate change induced extreme rainfall in Kenya. Int. J. Stat. Probab. 8, 85–94 (2019).

Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
