Scientific Reports
. 2025 Nov 25;15:41927. doi: 10.1038/s41598-025-25841-4

Cost analysis of enhanced software reliability model with Weibull testing effort function using simulated annealing

Komee, Bhoopendra Pachauri
PMCID: PMC12647680  PMID: 41290814

Abstract

In continuation of efforts to improve software reliability assessment, this paper proposes an enhanced Software Reliability Growth Model (SRGM) incorporating a Weibull testing effort function. The model integrates multiple testing coverage functions and a change point in an imperfect debugging environment to reflect shifts in testing dynamics. To validate the model, a comparative study has been conducted using two real-world datasets, with mean squared error, prediction ratio risk, and predictive power as the comparison criteria. The model has been coupled with a cost-based software release framework that accounts for post-release risk and strategy change expenses. The Simulated Annealing optimization technique has been employed to minimize the total cost while satisfying reliability constraints. The results demonstrate that the proposed study not only enhances reliability assessment but also facilitates cost-effective release decision-making compared to the existing literature. The proposed work may be extended in the future by considering multiple change points.

Keywords: Testing effort function, Software reliability, Testing coverage, Change-point, Simulated annealing

Subject terms: Engineering, Mathematics and computing

Introduction

Ensuring software reliability is a fundamental objective, as the rapid expansion of software applications across various domains has amplified the demand for reliable, fault-free software systems. “Reliability is the probability of success or probability that the system will perform its intended function under specified design limit”1. To ensure software reliability, many Software Reliability Models (SRMs) have been proposed to predict fault/failure behaviour. Various factors, such as testing time, testing effort, testing coverage, fault reduction factor, fault removal efficiency, detection rate, and change points, affect the quality of SRMs.

Broadly, there are two types of SRMs: deterministic and probabilistic. Probabilistic software reliability models can be classified into different groups such as Markov structure, time-series, Non-homogeneous Poisson Process (NHPP), reliability growth, error seeding, failure rate, and curve fitting1. Among these, models based on the NHPP have been widely used. These models describe the time-dependent nature of software fault detection and support diverse extensions, and they have been developed in both perfect and imperfect debugging environments. Models developed in a perfect debugging environment assume that detected faults are removed immediately and that no new faults are introduced during the fault removal process.

The first NHPP-based model was proposed by Goel and Okumoto. In this model, the failure intensity was the product of the constant hazard rate of an individual fault and the expected number of faults remaining in the software2. Later, various SRGMs were developed incorporating factors such as the Testing Effort Function (TEF), testing coverage, and change points in perfect and imperfect debugging environments. Among these, the TEF has been used to optimize the allocation of testing resources; it refers to the resources (time, cost, personnel, etc.) required to conduct the testing process. Pradhan et al. proposed a model which incorporated the generalized inflection S-shaped function as a TEF in a perfect debugging environment3. Dhaka et al. discussed an SRGM using the generalized extended inverse Weibull function as the TEF4. Integration of the Weibull testing effort function into fault detection and removal has been done by Pradhan et al.5. Kapur et al. formulated an NHPP-based SRGM that accounts for testing effort in reliability growth6. A growth model using the exponentiated additive Weibull distribution as a TEF was introduced by Dhaka et al.7.

A TEF is an important factor for software reliability enhancement, but on its own it may not ensure effective fault detection. Other factors also influence software reliability. One of them is testing coverage, which is essential for assessing the effectiveness of testing; it optimizes resource allocation by directing testing effort to failure-sensitive regions. Rani et al. developed an SRGM which incorporated two types of testing coverage functions, delayed S-shaped and inflection S-shaped8. Song et al. proposed a testing coverage based SRGM by considering the software operating environment9. A reliability model integrating the Weibull TEF with three testing coverage functions, namely exponential, delayed S-shaped, and logistic, has been studied by Aggarwal et al.10. Kumar et al. developed SRGMs by embedding a Weibull-based testing effort into exponential, logistic, and S-shaped coverage frameworks; to evaluate the robustness of these models, they applied genetic algorithm–driven sensitivity analysis11. Iqbal et al. discussed a model incorporating a sigmoid testing effort function with three different coverage functions12.

In real software testing, the fault detection rate often does not remain constant. There can be sudden shifts due to altered testing strategies, code stabilization, or phase transitions. Incorporating a change-point allows models to more accurately represent the software reliability. The concept of change-point has been described in detail by Chen et al.13. In NHPP-based SRGM, the change point refers to the point in time during the testing phase where the rate at which faults are detected or removed changes. Pradhan et al. introduced a model incorporating change point, testing effort and error generation that impacts software performance14. A software reliability model considering change-point during the complete lifecycle of a software has been studied by Shrivastava and Kapur15. Chatterjee and Shukla presented the integration of S-shaped testing coverage with change point16. Aggarwal et al. suggested the integration of change point with three testing coverages and testing effort in a perfect debugging environment17. The concept of single change point has been extended to multiple change points by Ke and Huang18.

Recently, a few authors have proposed NHPP based SRGMs in imperfect debugging environment using possible combinations of testing coverage, testing effort and change point. In imperfect debugging new faults may be introduced during the fault removal process due to the changes in the source code. Samal et al. developed a model integrating generalized logistic testing effort with change point in an imperfect debugging environment19. A model embedding error generation and fault removal efficiency along with inflection S-shaped testing coverage in imperfect debugging environment has been designed by Li and Pham20. Pradhan et al. proposed a two-phase growth model incorporating testing effort in imperfect debugging21. Chatterjee and Shukla extended their previous model in imperfect debugging environment using three different coverage functions namely exponential, Weibull, and S-shaped22. Bibyan et al. have also discussed the above said testing coverage functions in their proposed model23. Incorporation of sigmoid testing effort with three testing coverages in an imperfect debugging environment has been studied by Nazir et al.24. Nageswari et al. have proposed a software hazard rate model incorporating multiple change points in an imperfect debugging environment25. Behera and Agarwal have constructed a growth model integrating change points, and generalized logistic testing effort in imperfect debugging26. Pradhan et al. introduced a SRGM that incorporated fault dependency, multi-release, and change-point concepts27.

The above modeling aspects capture the dynamics of software fault detection and fault removal. Their practical relevance is best understood when extended to cost models that quantify the economic implications. In some of the above discussed articles cost modeling has been incorporated along with reliability modeling5,7,8,11,15,17,18,21,26,27. This integration highlights the economic and technical perspectives. However, it is notable that none of the works have presented cost models in isolation, as they are always developed in conjunction with reliability considerations.

Existing literature includes reliability models under both perfect and imperfect debugging assumptions; however, work focusing on imperfect debugging is relatively limited. Several SRGMs have integrated testing effort, coverage, and change point under perfect debugging. In contrast, models under imperfect debugging typically address only partial pairings, such as testing effort with coverage, testing effort with change point, or coverage with change point, and a comprehensive model combining all four aspects remained unexplored17. Table 1 summarizes these prior pairwise combinations, highlighting the research gap in consolidating all elements under a cohesive framework.

Table 1.

Comparison between prior and current research.

Authors Testing effort function Testing coverage Change point Debugging environment
Pradhan et al.14 Yes No No Perfect
Kumar et al.11 Yes Yes No Perfect
Chatterjee and Shukla16 No Yes Yes Perfect
Dhaka et al.4 Yes No No Imperfect
Pradhan et al.21 Yes No Yes Imperfect
Behera and Aggarwal26 Yes No Yes Imperfect
Aggarwal et al.17 Yes Yes Yes Perfect
Proposed model Yes Yes Yes Imperfect

Numerous TEFs, such as Gompertz, Rayleigh, logistic, exponential, and Weibull, have been employed in the literature to improve the performance of software reliability models. A comparative study of these TEFs has been summarized in Table 2, highlighting their distinct features and modeling suitability. Motivated by Aggarwal et al.17, in this paper the Weibull distribution has been employed as the TEF in an imperfect debugging environment, and three coverage functions (exponential, delayed S-shaped, and logistic) have been utilized to fit diverse failure datasets. A change-point mechanism has also been employed to model abrupt transitions in fault occurrence intensity during the software testing process. In short, all four aspects discussed above have been combined in this study. Furthermore, a new cost model has been introduced, which incorporates a new factor, the strategy change cost: the expense involved in switching from one testing/debugging approach to another during the software lifecycle. A detailed cost-based release analysis has been conducted using Simulated Annealing (SA) to optimize testing resources under different reliability constraints, along with a sensitivity analysis.

Table 2.

Comparative study of testing effort functions (TEFs).

Function Expression Shape/behaviour Advantages Limitations Remarks
Exponential W(t) = W0·(1 − e^(−λt)) Monotonic, concave; constant hazard Simple, only one parameter Cannot capture ramp-up or S-shape; unrealistic in many projects Special case of Weibull (k = 1)
Rayleigh W(t) = W0·(1 − e^(−λt²)) Symmetric rise and fall; single peak Clear and intuitive understanding Rigid shape; peak location fixed; less flexible Special case of Weibull (k = 2)
Logistic W(t) = W0/(1 + β·e^(−λt)) Sigmoid; symmetric around inflection Captures learning curves; empirically accurate in some datasets Symmetry assumption; more parameters; less general than Weibull Useful alternative when a symmetric S-curve is evident
Gompertz W(t) = W0·a^(b^t), 0 < a, b < 1 (or equivalent) Asymmetric S-curve; slow start, rapid mid-growth, saturation Good for delayed testing then surge Less flexible than Weibull; not widely used; harder interpretation Sometimes fits skewed data better
Weibull W(t) = W0·(1 − e^(−λt^k)) Flexible: concave (k < 1), exponential (k = 1), Rayleigh (k = 2), S-shaped (k > 1) General form; captures varied testing behaviours; includes exponential & Rayleigh as special cases With very high k, can produce an unrealistic sharp peak Most preferred due to flexibility and empirical accuracy

The remainder of this paper has been organized as follows: Sect. 2 presents the related work and foundational concepts. Section 3 outlines the proposed model, including the assumptions and mathematical formulation. In Sect. 4, the experimental setup, datasets, parameter estimation, results, and the performance of the model using key reliability metrics have been discussed. Section 5 provides a detailed cost analysis, evaluating the economic impact of the proposed release strategy. Finally, Sect. 6 concludes the paper and highlights directions for future work.

Theoretical framework

In this section, basic information on the Weibull distribution and simulated annealing has been provided.

Weibull distribution

The Weibull distribution captures dynamic changes in failure behaviour over time. Its ability to represent increasing, constant, or decreasing rates makes it well suited to reflecting real-world variations in testing intensity. The cumulative distribution function of the two-parameter Weibull TEF is:

W(t) = W0·(1 − e^(−λ·t^k))  (1)

Here, λ is the scale parameter, k is the shape parameter, and W0 is the total testing effort. The shape parameter k governs the curve’s steepness: for k < 1, the rise is initially rapid (indicating a decreasing failure rate), k = 1 yields a simple exponential approach, and k > 1 produces a slower start followed by an accelerating failure rate over time. This flexibility enables the model to reflect varying testing effort intensities and allows the effort curve to adapt more sensitively17. The role of different TEFs in reliability modeling has been illustrated through the comparative study presented in Table 2.

Simulated annealing

Simulated annealing (SA) is a probabilistic technique for approximating the global optimum of a function. The name comes from annealing in metallurgy, where a material is heated and then cooled in a controlled manner, affecting both its temperature and its thermodynamic free energy. SA is used to approximate global optima in large search spaces; even in the presence of many local optima, it can locate the global optimum, and it is useful for computational optimization problems where exact algorithms fail, usually achieving a good approximation to the global minimum.

SA was initially developed to solve single-objective combinatorial problems. These days, it is applied to both single- and multi-objective optimization problems. Its application is not restricted to the optimization of non-linear objective functions; it can also be used for many other purposes, such as pattern recognition and object classification28.

The SA based algorithm for single objective optimization is illustrated below29.

[Figure: pseudocode of the SA-based algorithm for single-objective optimization29.]
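The algorithm in the figure follows the standard SA template: propose a random neighbour, always accept improvements, accept worse moves with probability exp(−Δ/T), and cool the temperature. A minimal single-objective sketch (step size, cooling rate, and the test function are illustrative assumptions, not the paper's configuration):

```python
import math
import random

def simulated_annealing(f, x0, step=1.0, t0=1.0, cooling=0.95, iters=2000, seed=42):
    """Minimize a 1-D function f with simulated annealing.

    A worse neighbour is accepted with probability exp(-delta / T),
    which lets the search escape local minima while T is still high.
    """
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best, fbest = x, fx
    temp = t0
    for _ in range(iters):
        cand = x + rng.uniform(-step, step)        # random neighbour of the current point
        fc = f(cand)
        delta = fc - fx
        if delta < 0 or rng.random() < math.exp(-delta / max(temp, 1e-12)):
            x, fx = cand, fc                       # accept the move
            if fx < fbest:
                best, fbest = x, fx                # track the best solution seen so far
        temp *= cooling                            # geometric cooling schedule
    return best, fbest

# Multimodal test function with its global minimum at x = 0.
best_x, best_f = simulated_annealing(lambda x: x * x + 10 * math.sin(x) ** 2, x0=8.0)
```

The geometric cooling schedule is one common choice; the paper's MATLAB runs use an iteration cap and function tolerance as stopping criteria instead of a fixed iteration count.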

Proposed model

An overview of the proposed model, with a graphical representation in Fig. 1 and the notations used, has been given below. At the end of the section, a step-by-step algorithm of the process has also been provided.

Fig. 1. Graphical representation of the proposed model.

Notations:

Symbol Meaning
m(t) Mean Value Function (expected cumulative faults detected by time t)
a Total initial faults in the software
α Fault introduction rate during the debugging process
c(W(t)) Testing coverage function as a function of applied effort W(t)
W(t) Testing effort applied up to time t
ϕ Coverage-effort coefficient linking effort to coverage
b(t) Fault detection rate, possibly changing at change-point τ
b1 Fault detection rate before change point
b2 Fault detection rate after change point
β Scale parameter for logistic coverage function
τ Change-point in fault detection rate
W0 Total testing effort
λ Scale parameter of Weibull TEF
k Shape parameter of Weibull TEF
C(t) Total cost function
R(W) Software reliability function
R0 Pre-specified reliability
C1(t, W(t)) Testing effort cost
C2(t, W(t)) Fault detection and fixing cost
C3(t, W(t)) Risk of undetected faults, representing potential post-release failures
C4(τ) Additional cost due to changes in testing strategy
m1 MVF of proposed model case 1 before change point
m2 MVF of proposed model case 1 after change point
c1 Cost of fixing a fault before the change point
c2 Cost of fixing a fault after the change point
c3 Cost per undetected fault post-release
n1 Cost per unit of testing effort
n2 Cost coefficient for strategy change

In this section, a SRGM incorporating testing effort, testing coverages, and change point in an imperfect debugging environment has been introduced. This enhanced model addresses the dynamic shifts in fault detection rates during the testing phase. The analysis undertaken in this study relies on the following assumptions11,17,30.

  1. The fault detection in SRGM follows the NHPP.

  2. The rate of fault detection and removal may fluctuate at any point during the testing phase.

  3. Failures in the software system occur randomly due to the residual faults.

  4. The number of faults detected is directly proportional to the number of faults that remain undetected.

  5. The debugging process is imperfect, and new faults may be introduced during fault removal.

  6. The extent of testing coverage evolves with the amount of testing effort applied.

  7. The testing coverage can be expressed in terms of the fault detection rate as c′(t)/(1 − c(t)).

  8. The proportion of code covered during testing influences the number of faults detected.

Based on the assumptions above, the rate of change of the mean value function is represented by:

dm(t)/dt = ϕ·[(dc(W(t))/dt)/(1 − c(W(t)))]·[a(t) − m(t)]  (2)

where:

a(t) = a + α·m(t)  (3)
w(t) = dW(t)/dt = W0·λ·k·t^(k−1)·e^(−λ·t^k)  (4)
W(t) = W0·(1 − e^(−λ·t^k))  (5)

Here, (dc(W(t))/dt)/(1 − c(W(t))) represents the fault detection rate with respect to the testing coverage, the constant ϕ denotes the rate of change of coverage with respect to testing effort, c(W(t)) is the testing coverage as a function of the applied effort, W(t) is the testing effort function, a is the initial number of faults, and α is the fault introduction rate during the debugging process. λ, k, and W0 are the scale parameter, shape parameter, and total testing effort of the Weibull TEF, respectively.

By solving the differential Eqs. (2), (3), (4), and (5) together with the initial condition m(0) = 0, the expression for the mean value function (MVF) is given in Eq. (6).

m(t) = [a/(1 − α)]·{1 − [1 − c(W(t))]^(ϕ(1−α))}  (6)

where, m(t) is the expected number of faults detected by the time t.

The fault detection rate b(t) experiences a structural shift at a specific time point, denoted as the change point. When it changes at a specific time τ, b(t) has been defined as follows:

b(t) = b1 for 0 ≤ t ≤ τ;  b(t) = b2 for t > τ  (7)

where b1 and b2 are the fault detection rates before and after the change point τ.

Furthermore, to capture the fault detection behaviour within each phase, the coverage functions defined in Eqs. (8), (10), and (12) have been adopted17. The cumulative number of faults detected over time, shown in Eqs. (9), (11), and (13), has been calculated using Eqs. (6) and (7) along with the coverage functions, while satisfying the continuity condition at the change point.

Case 1

c(W(t)) = 1 − e^(−b(t)·W(t))  (8)

where, c(t) is the coverage function.

m(t) = [a/(1 − α)]·[1 − e^(−ϕ(1−α)·b1·W(t))] for t ≤ τ,
m(t) = [a/(1 − α)]·[1 − e^(−ϕ(1−α)·(b1·W(τ) + b2·(W(t) − W(τ))))] for t > τ  (9)

Case 2

c(W(t)) = 1 − (1 + b(t)·W(t))·e^(−b(t)·W(t))  (10)
[Eq. (11): corresponding MVF for Case 2; the full expression has been given in the Appendix.]

Case 3

c(W(t)) = [1 − e^(−b(t)·W(t))]/[1 + β·e^(−b(t)·W(t))]  (12)
[Eq. (13): corresponding MVF for Case 3; the full expression has been given in the Appendix.]

where, β is the scale parameter of the logistic distribution function.

The full solution has been documented in the Appendix for further consultation.
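For orientation, the three coverage shapes can be sketched as functions of the accumulated effort W. The forms below are the standard exponential, delayed S-shaped, and logistic coverage curves; treat the exact parameterizations in Eqs. (8), (10), and (12) as authoritative:

```python
import math

def coverage_exponential(w, b):
    """Case 1: exponential coverage, c = 1 - exp(-b*W); concave growth in effort."""
    return 1.0 - math.exp(-b * w)

def coverage_delayed_s(w, b):
    """Case 2: delayed S-shaped coverage, c = 1 - (1 + b*W) * exp(-b*W)."""
    return 1.0 - (1.0 + b * w) * math.exp(-b * w)

def coverage_logistic(w, b, beta):
    """Case 3: logistic coverage, c = (1 - exp(-b*W)) / (1 + beta * exp(-b*W))."""
    e = math.exp(-b * w)
    return (1.0 - e) / (1.0 + beta * e)
```

All three start at zero coverage for zero effort and approach full coverage as effort accumulates; the delayed S-shaped and logistic variants lag the exponential curve early on, reflecting a slower coverage ramp-up.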

Algorithmic description of the model has been given below:

Step 1: Start.

Step 2: Conduct a comprehensive literature review to understand existing work in the field.

Step 3: Identify research gaps from the literature.

Step 4: Address the identified gaps by:

  • Changing the debugging environment assumption from perfect to imperfect.

  • Incorporating a new factor, strategy change cost, into the cost model.

Step 5: Formulate the assumptions underlying the proposed model.

Step 6: Develop the mathematical formulation of the model based on these assumptions.

Step 7: Perform parameter estimation using real datasets.

Step 8: Compare the results with existing literature to validate improvements and contributions.

Step 9: Define the complete cost model integrating fault detection, debugging, and strategy change costs.

Step 10: Conduct sensitivity analysis and cost analysis using SA to assess the model’s robustness.

Step 11: Derive concluding remarks based on the analysis.

Step 12: Document supporting references.

Step 13: End.

To validate the proposed models, a series of statistical evaluations has been done. The results and comparative analysis have been discussed in detail in the next section.

Result analysis

The effectiveness of the proposed models has been evaluated using two real datasets. Dataset 1 (DS-1) originated from software testing data collected from Tandem Computers32, covering a testing duration of 20 weeks with a total of 100 detected faults. Dataset 2 (DS-2) has been derived from a ground-based radar system project reported by Brooks and Motley (1980), encompassing 35 months of testing and a total of 1301 detected faults. A comparative study has been done between the proposed models and existing SRGMs using goodness-of-fit criteria such as mean squared error (MSE), prediction ratio risk (PRR), predictive power (PP), and r-squared (R2). The change point τ has been identified by detecting abrupt variation in the frequency of fault detections. A dataset may contain multiple potential change points, each reflecting a noticeable shift in the fault detection rate, such as weeks 9, 12, and 17 for DS-1 and months 12, 17, 20, and 23 for DS-2. From these candidates, the change occurring at the 12th week for DS-1 and the 12th month for DS-2 has been selected in this study.
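The goodness-of-fit criteria can be computed directly from observed and predicted cumulative fault counts. The sketch below uses the commonly adopted definitions of MSE, PRR, and PP from the SRGM literature, which are assumed to match the paper's usage:

```python
def mse(actual, predicted):
    """Mean squared error between observed and predicted cumulative faults."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def prr(actual, predicted):
    """Prediction ratio risk: squared deviations weighted by the predicted values."""
    return sum(((p - a) / p) ** 2 for a, p in zip(actual, predicted))

def pp(actual, predicted):
    """Predictive power: squared deviations weighted by the observed values."""
    return sum(((p - a) / a) ** 2 for a, p in zip(actual, predicted))

# Smaller values of all three criteria indicate a better fit (toy data, not DS-1/DS-2).
observed = [10.0, 20.0, 30.0]
fitted = [11.0, 19.0, 30.0]
scores = (mse(observed, fitted), prr(observed, fitted), pp(observed, fitted))
```

PRR penalizes overestimation more heavily (the deviation is normalized by the prediction), while PP penalizes underestimation, which is why the two criteria are reported side by side.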

The parameter values for the Weibull testing effort function have been estimated using DS-1 and DS-2, and summarized in Table 3. These values have been obtained through a nonlinear least square fitting approach, which has been utilized to accurately determine the shape and scale parameters.

Table 3.

Parameter values of Weibull testing effort function for DS 1 and DS 2.

W0 λ k
DS 1 118.5636 0.0987 1.1097
DS 2 1125.3671 0.0556 2.0006
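As an illustration of this fitting step, the sketch below recovers Weibull TEF parameters from synthetic effort data, with a crude grid search standing in for the paper's nonlinear least-squares routine; the data series and grids are illustrative, not DS-1 or DS-2.

```python
import math

def weibull_effort(t, w0, lam, k):
    # Eq. (1): cumulative Weibull testing effort.
    return w0 * (1.0 - math.exp(-lam * t ** k))

def sse(params, data):
    # Sum of squared errors between observed effort and the model curve.
    w0, lam, k = params
    return sum((w - weibull_effort(t, w0, lam, k)) ** 2 for t, w in data)

# Synthetic 20-week effort series generated from known parameters
# (illustrative; the paper fits the real DS-1 and DS-2 effort data instead).
true_params = (120.0, 0.1, 1.1)
data = [(float(t), weibull_effort(float(t), *true_params)) for t in range(1, 21)]

# Exhaustive search over a small grid, standing in for nonlinear least squares.
grid = [(w0, lam, k)
        for w0 in (100.0, 110.0, 120.0, 130.0)
        for lam in (0.05, 0.1, 0.15)
        for k in (0.9, 1.0, 1.1, 1.2)]
best_params = min(grid, key=lambda p: sse(p, data))
```

In practice a proper nonlinear least-squares solver (e.g. Levenberg–Marquardt) replaces the grid; the grid merely makes the objective being minimized explicit.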

The parameter estimation for the proposed models has been carried out using a non-linear least squares fitting method. The detailed outcomes for DS-1 and DS-2 have been shown in Tables 4 and 5, respectively. On comparison, it has been observed that the predicted fault counts are approximately equal to the actual fault counts, i.e., 100 defects for DS-1 and 1301 defects for DS-2, indicating a strong alignment between the modeled and actual reliability behaviour. With the incorporation of the change point, this proximity improved further in both datasets.

Table 4.

Estimation results of parameter of proposed models for DS 1.

Model a b1 b2 ϕ α β
Without change point Aggarwal et al.17 122 0.008 - 0.017 - -
Proposed Model Case 1 97 0.4868 - 0.0258 0.6336 -
Aggarwal et al.17 135 0.071 - 0.002 - -
Proposed Model Case 2 99 0.9 - 0.0125 0.8725 -
Aggarwal et al.17 131 0.071 - 0.002 - 75.601
Proposed Model Case 3 110 0.4547 - 0.0252 0.5 0.5106
With change point Aggarwal et al.17 132 0.031 0.032 0.005 - -
Proposed Model Case 1 110 0.1419 0.1403 0.0667 0.9 -
Aggarwal et al.17 134 0.049 0.048 0.003 - -
Proposed Model Case 2 99 0.04 0.052 0.2932 0.8034 -
Aggarwal et al.17 135 0.005 0.04 0.037 - 85.798
Proposed Model Case 3 89 0.0351 0.0756 0.3745 0.3796 20.4931

Table 5.

Estimation results of parameter of proposed models for DS 2.

Model a b1 b2 ϕ α β
Without change point Aggarwal et al.17 1661 0.003 - - 0.308 -
Proposed Model Case 1 1661 0.019796 - 0.037944 0.9998 -
Aggarwal et al.17 1693 0.335 - - 0.002 -
Proposed Model Case 2 1423 0.677456 - 0.001251 0.9 -
Aggarwal et al.17 1402 0.002 - - 0.669 1.288
Proposed Model Case 3 1609 0.001681 - 0.5 0.87 0.2508
With change point Aggarwal et al.17 1664 0.267 0.266 - 0.003 -
Proposed Model Case 1 1635 0.14042 0.13956 0.005327 0.93 -
Aggarwal et al.17 1605 0.061 0.060 - 0.015 -
Proposed Model Case 2 1620 0.043566 0.05 0.017131 0.8147 -
Aggarwal et al.17 1649 0.134 0.132 - 0.006 33.249
Proposed Model Case 3 1409 0.6692 0.6666 0.0013 0.94 3.001

Using the estimated parameters along with the change point τ = 12 weeks, the goodness-of-fit criteria values for the proposed models have been shown in Tables 6 and 7: the results without change point in Table 6 and with change point in Table 7. The MSE values without change point for the three cases are 20.0911, 13.2775, and 23.9756, and with change point 11.2702, 6.5225, and 3.7941, respectively. The PRR values without change point are 0.4269, 1.8931, and 0.5771, and with change point 0.4795, 0.0457, and 0.0574, respectively. The PP values without change point are 0.3005, 0.4436, and 0.3314, and with change point 0.2375, 0.0437, and 0.0762, respectively. These results indicate that the models with change point outperformed both the models without change point and the existing literature for DS-1, underscoring the effectiveness of combining the Weibull TEF, testing coverage, and change point in an imperfect debugging environment. For a better understanding of the results, the comparative plots of the cumulative number of software defects over time (in weeks) have been shown in Figs. 2 and 3, indicating that the proposed models provide a closer alignment with the actual data points.

Table 6.

Analysis of DS-1 (without change point).

Model MSE PRR PP R 2
Kumar and Aggarwal31 23.58422 9.489643 0.848339775 0.987
Aggarwal et al. Model 117 68.49883 6.895268 0.93923 0.976
Aggarwal et al. Model 217 18.70733 7.724101 0.80279 0.978
Aggarwal et al. Model 317 24.8695 10.17587 0.878601 0.976
Proposed Model Case 1 20.0911 0.4269 0.3005 0.9753
Proposed Model Case 2 13.2775 1.8931 0.4436 0.9837
Proposed Model Case 3 23.9756 0.5771 0.3314 0.9705

Table 7.

Analysis of DS-1 (with change point).

Model MSE PRR PP R 2
Kumar and Aggarwal31 17.79812 0.71271163 0.3138299 0.991
Aggarwal et al. Model 117 14.23955 10.17587 0.412895 0.977
Aggarwal et al. Model 217 7.581766 0.480246 0.222864 0.992
Aggarwal et al. Model 317 8.021372 0.091883 0.148177 0.994
Proposed Model Case 1 11.2702 0.4795 0.2375 0.9861
Proposed Model Case 2 6.5225 0.0457 0.0437 0.992
Proposed Model Case 3 3.7941 0.0574 0.0762 0.9953

Fig. 2. Actual vs. predicted cumulative defects without change point for DS-1.

Fig. 3. Actual vs. predicted cumulative defects with change point for DS-1.

Similarly, the calculated values of the statistical performance metrics for the change point τ = 12 months, along with the values of the compared models for DS-2, have been shown in Tables 8 and 9. The MSE values without change point for the three cases are 2831.6, 2180.9, and 1734.8, and with change point 1217.9, 947.3111, and 1180.802, respectively. The PRR values without change point are 1.8738, 9.6342, and 3.3358, and with change point 1.0375, 5.4894, and 3.2223, respectively. The PP values without change point are 0.9690, 1.5077, and 1.0672, and with change point 0.6031, 1.3621, and 1.0627, respectively. These results indicate that the models with change point outperformed both the models without change point and the existing literature for DS-2 as well. A visual illustration of the comparative plots of the cumulative number of software defects over time (in months) has been shown in Figs. 4 and 5.

Table 8.

Analysis of DS-2 (without change point).

Model MSE PRR PP R 2
Aggarwal et al. Model 117 6420.484 2.098579 1.291244 0.994
Aggarwal et al. Model 217 6534.017 66.06595 2.708052 0.944
Aggarwal et al. Model 317 6075.331 18.87345 2.720831 0.997
Proposed Model Case 1 2831.6 1.8738 0.9690 0.9867
Proposed Model Case 2 2180.9 9.6342 1.5077 0.9898
Proposed Model Case 3 1734.8 3.3358 1.0672 0.9918

Table 9.

Analysis of DS-2 (with change point).

Model MSE PRR PP R 2
Aggarwal et al. Model 117 1235.882 1.050907 0.804936 0.994
Aggarwal et al. Model 217 1073.432 15.74234 1.36346 0.995
Aggarwal et al. Model 317 1256.672 10.11175 2.56348 0.995
Proposed Model Case 1 1217.9 1.0375 0.6031 0.9943
Proposed Model Case 2 947.3111 5.4894 1.3621 0.9956
Proposed Model Case 3 1180.802 3.2223 1.0627 0.9944

Fig. 4. Actual vs. predicted cumulative defects without change point for DS-2.

Fig. 5. Actual vs. predicted cumulative defects with change point for DS-2.

The results clearly demonstrated that the proposed model achieved more effective performance by exhibiting lower values of MSE, PRR, and PP across both the with and without change point scenarios. Furthermore, the outcomes for DS-2 reinforced the trend observed in DS-1, highlighting the superior performance of models incorporating change-point mechanisms over both the models without change point and the literature. The graphical and tabular comparisons underscored the significant improvements of the proposed model. With these improvements established, it becomes crucial to investigate their economic viability, which has been addressed in the following cost analysis.

Cost analysis using simulated annealing

In this section, a cost requirement-based software release policy has been discussed. It focuses on determining the optimal point to conclude testing by balancing reliability against total expenditure, considering factors such as testing cost, fault-fixing cost, post-release risk cost, and strategy change cost.

The total cost function C(t) integrates the various expenses associated with software testing and fault management. Building upon this, the total cost and software reliability have been considered as the evaluation criteria1. The optimal release time has been determined by minimizing C(t) subject to attaining a desired reliability level. Hence, the optimization problem has been written as17:

Minimize C(t) = C1(t, W(t)) + C2(t, W(t)) + C3(t, W(t)) + C4(τ)
Subject to R(W(t)) ≥ R0

where R(W(t)) is the software reliability function and R0 is the pre-specified reliability.

m2 = MVF of proposed model case 1 after the change point.

  • C1(t, W(t)): Testing effort cost, modeled using the Weibull function.

C1(t, W(t)) = n1·W(t)

n1: cost per unit of testing effort.

  • C2(t, W(t)): Fault detection and fixing cost, incorporating imperfect debugging.

C2(t, W(t)) = c1·m1(τ) + c2·[m2(t) − m1(τ)]

c1: cost of fixing a fault before the change point.

c2: cost of fixing a fault after the change point.

m1 = MVF of proposed model case 1 before the change point.

  • C3(t, W(t)): Risk of undetected faults, representing potential post-release failures.

C3(t, W(t)) = c3·[a/(1 − α) − m2(t)]

c3: cost per undetected fault post-release.

  • C4(τ): Additional cost due to changes in testing strategy or intensity.

[Equation for C4(τ): strategy change cost with coefficient n2.]

n2: cost coefficient for strategy change.
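Putting the components together, a sketch of the total cost evaluation is given below. The residual-fault term and the strategy-change term are simplified stand-ins for the paper's exact C3 and C4 expressions, and the mean value functions are illustrative toys; only the Weibull effort parameters are the DS-1 values from Table 3.

```python
import math

def weibull_effort(t, w0=118.5636, lam=0.0987, k=1.1097):
    """Weibull TEF with the DS-1 parameter values from Table 3."""
    return w0 * (1.0 - math.exp(-lam * t ** k))

def total_cost(t, tau, m1, m2, a_total, n1, c1, c2, c3, c4):
    """Assemble C(t) = C1 + C2 + C3 + C4 from the component costs.

    m1/m2 are the mean value functions before/after the change point tau.
    The residual-fault term and the flat c4 strategy-change term are
    simplified stand-ins for the paper's exact C3 and C4 expressions.
    """
    cost_effort = n1 * weibull_effort(t)                 # C1: testing effort cost
    cost_fixing = c1 * m1(tau) + c2 * (m2(t) - m1(tau))  # C2: fault fixing cost
    cost_risk = c3 * max(a_total - m2(t), 0.0)           # C3: post-release risk (assumed form)
    return cost_effort + cost_fixing + cost_risk + c4    # C4: strategy change (assumed flat)

# Toy mean value functions saturating at 110 expected faults (illustrative only).
m1 = lambda t: 110.0 * (1.0 - math.exp(-0.10 * t))
m2 = lambda t: 110.0 * (1.0 - math.exp(-0.12 * t))
cost = total_cost(t=23.29, tau=12.0, m1=m1, m2=m2, a_total=110.0,
                  n1=10.0, c1=10.0, c2=50.0, c3=70.0, c4=15.0)
```

The structure mirrors the trade-off discussed in the text: C1 and C2 grow with testing time, while C3 shrinks as more faults are removed before release.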

A preliminary cost analysis has been conducted for three different reliability thresholds (R = 0.80, R = 0.85, and R = 0.90) by varying the cost-related parameters n1, n2, c1, c2, and c3 one at a time. The ranges for the cost parameters have been chosen based on practical feasibility and typical values in software projects. While these parameters influenced the total cost, they had no significant impact on the optimal release time, which varied only with the reliability threshold. Specifically, for R = 0.80 the total cost ranged from 10177.15 to 26126.41 units, and for R = 0.90 from 11046.56 to 28805.12 units, with the optimal release time changing accordingly with the threshold. Due to space limitations, only the detailed results for R = 0.85 have been presented in this paper.

The sensitivity analysis of parameters, presented in Table 10, showed how each parameter influenced the total cost. Altering one parameter at a time, while keeping the others fixed, resulted in a corresponding change in the cost. From the table, it is evident that varying the parameter n1 produced the largest change in cost, while changes in n2 had the least impact; the cost is therefore most sensitive to n1 among all the considered parameters. Notably, the optimal release time remained constant at 23.29 for all variations, suggesting that the release time is primarily determined by the reliability threshold rather than the cost parameters.

Table 10.

Impact of varying cost parameters on total cost and release time (R = 0.85).

Varied parameter    n1    c1    c2    c3    n2    Cost        Time
n1 varies and c1, c2, c3, n2 are fixed
 n1                  5    10    50    70    15    10600.85    23.29
                    10    10    50    70    15    19017.29    23.29
                    15    10    50    70    15    27433.87    23.29
c1 varies and n1, c2, c3, n2 are fixed
 c1                 10     5    50    70    15    18596.72    23.29
                    10    10    50    70    15    19017.29    23.29
                    10    20    50    70    15    19858.45    23.29
c2 varies and n1, c1, c3, n2 are fixed
 c2                 10    10    40    70    15    18782.93    23.29
                    10    10    50    70    15    19017.29    23.29
                    10    10    60    70    15    19251.79    23.29
c3 varies and n1, c1, c2, n2 are fixed
 c3                 10    10    50    50    15    18992.93    23.29
                    10    10    50    60    15    19017.29    23.29
                    10    10    50    70    15    19041.68    23.29
n2 varies and n1, c1, c2, c3 are fixed
 n2                 10    10    50    70    10    19017.13    23.29
                    10    10    50    70    15    19017.29    23.29
                    10    10    50    70    20    19017.44    23.29
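The one-at-a-time variation underlying Table 10 can be sketched as follows. The `cost` function here is a stand-in linear surrogate with weights chosen only for illustration; the actual C(t) depends on the fitted reliability model.

```python
# One-at-a-time (OAT) sensitivity sketch: vary each cost parameter over a
# small range while holding the others at their baseline values.
baseline = {"n1": 10, "c1": 10, "c2": 50, "c3": 70, "n2": 15}
ranges = {"n1": [5, 10, 15], "c1": [5, 10, 20], "c2": [40, 50, 60],
          "c3": [50, 60, 70], "n2": [10, 15, 20]}

def cost(p):
    # Stand-in surrogate for C(t); weights are illustrative assumptions.
    return 800 * p["n1"] + 40 * p["c1"] + 20 * p["c2"] + 2 * p["c3"] + 0.03 * p["n2"]

rows = []
for name, values in ranges.items():
    for v in values:
        p = dict(baseline, **{name: v})   # perturb one parameter only
        rows.append((name, v, cost(p)))

for name, v, c in rows:
    print(f"{name}={v:<4} cost={c:.2f}")
```

Comparing the cost spread per parameter reproduces the qualitative finding of Table 10: the spread for n1 dominates, while that for n2 is negligible.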

The above analysis provided insights into how individual cost parameters affect the total cost at a fixed reliability level, but it did not ensure a globally optimal solution across all parameters simultaneously. Therefore, to explore a more comprehensive optimization strategy, SA has been employed33. The SA algorithm has been configured in MATLAB with a maximum of 10,000 iterations and a function tolerance of 10⁻⁴. It has been used to minimize the total cost function while satisfying the reliability constraint at three levels: R = 0.80, R = 0.85, and R = 0.90. The optimization has been performed over six decision variables: t, n1, c1, c2, c3, and n2. The steps of the SA algorithm are given below.

Step 1: Define the optimization problem and initialize its parameters.

Step 2: Precompute constant values.

Step 3: Initialize the solution vector [t, n1​, c1​, c2, c3, n2].

Step 4: Set SA parameters (number of iterations, function tolerance).

Step 5: Evaluate the cost function.

Step 6: If the stopping criterion (minimal change in cost) is satisfied, stop; otherwise repeat Step 5.
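The steps above can be sketched as a minimal simulated-annealing loop. The objective below is a stand-in convex cost with its minimum placed near t = 23.29, and the neighbourhood move, cooling schedule, and bounds are illustrative choices rather than the paper's MATLAB configuration.

```python
import math, random

def objective(x):
    # Stand-in for the total-cost function; minimum near x = 23.29.
    return 19000 + 50 * (x - 23.29) ** 2

def simulated_annealing(f, x0, lo, hi, temp=100.0, cooling=0.995,
                        max_iter=10_000, tol=1e-4):
    random.seed(0)
    x, fx = x0, f(x0)
    best_x, best_f = x, fx
    for _ in range(max_iter):
        # Step 5: propose a neighbour (clamped to bounds) and evaluate.
        cand = min(hi, max(lo, x + random.gauss(0, 1)))
        fc = f(cand)
        # Accept downhill moves always; uphill with Boltzmann probability.
        if fc < fx or random.random() < math.exp(-(fc - fx) / temp):
            x, fx = cand, fc
        if fx < best_f:
            best_x, best_f = x, fx
        temp *= cooling
        if temp < tol:   # Step 6: stop once the system is effectively frozen
            break
    return best_x, best_f

x_opt, f_opt = simulated_annealing(objective, x0=15.0, lo=0.0, hi=40.0)
```

In the paper's setting the state vector is six-dimensional, [t, n1, c1, c2, c3, n2], and the reliability constraint must additionally be checked at each candidate point; the acceptance rule itself is unchanged.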

The optimal values of these parameters, along with the corresponding total cost for each reliability threshold, have been presented in Table 11.

Table 11.

Simulated annealing optimization results.

Reliability   Release time (in weeks)   n1   c1   c2   c3   n2   Total cost
0.80          22.48                     10    5   60   68   15   17949.74
0.85          23.29                     10   17   48   66   14   19549.42
0.90          24.14                     10   11   43   64   19   19830.48

The results showed that as reliability requirements increase, both the release time and total cost rise. While n1 remains constant, other parameters like n2, c1, c2, and c3 adjust to meet the reliability targets. This highlighted the cost-reliability trade-off and the importance of parameter tuning for optimal software release decisions.

To further illustrate this trade-off in a practical setting, a numerical example has been presented below.

Numerical Example:

Consider a software project where testing is performed in two phases. In the first phase (0 ≤ t ≤ τ), manual testing has been conducted with τ = 10 days, a fault detection rate of b₁ = 2 faults/day, and a unit testing cost of n₁ = 100. Faults detected before the change point have been fixed at a cost of c₁ = 500 per fault. In the second phase (t > τ), automated tests are added, increasing the detection rate to b₂ = 8 faults/day and the fault-fixing cost to c₂ = 800 per fault. The risk of undetected faults post-release is accounted for by c₃ = 1000 per fault, and the strategy change cost has been modeled with n₂ = 200. Using a Weibull-based testing effort function W(t), the total cost has been computed as C(t) ≈ 76,190, while the resulting reliability R(W) = 0.855 satisfies the pre-specified requirement R₀ = 0.8. This example demonstrates how the strategy change point τ corresponds to a practical shift in testing approach and how all cost components can be quantified, allowing project managers to determine the optimal release time while achieving the desired reliability.
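The component structure of this example can be tabulated with a back-of-envelope calculation. Note the simplifying assumptions: detection is treated as linear in each phase, effort is one unit per day (rather than the Weibull W(t)), the release day T, the fault content a, and the linear strategy-change term are all assumed values, so the total below differs from the paper's C(t) ≈ 76,190.

```python
# Two-phase cost breakdown under simplifying assumptions.
tau, T = 10.0, 15.0          # change point and an assumed release day
b1, b2 = 2.0, 8.0            # detection rates (faults/day) per phase
n1, c1, c2, c3, n2 = 100, 500, 800, 1000, 200
a = 100.0                    # assumed total fault content

detected_before = b1 * tau               # faults found in the manual phase
detected_after = b2 * (T - tau)          # faults found after automation
undetected = max(0.0, a - detected_before - detected_after)

C1 = n1 * T                              # effort cost (unit effort per day)
C2 = c1 * detected_before + c2 * detected_after
C3 = c3 * undetected                     # post-release risk
C4 = n2 * (T - tau)                      # assumed strategy-change form
total = C1 + C2 + C3 + C4
print(f"C1={C1:.0f} C2={C2:.0f} C3={C3:.0f} C4={C4:.0f} total={total:.0f}")
```

Even in this simplified form, the breakdown makes the trade-off visible: delaying release raises C1 and C2 but shrinks the post-release risk term C3.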

Conclusion

This study presented an enhanced SRGM incorporating a Weibull testing effort function, a change point, and various testing coverage functions in an imperfect debugging environment. The models have been thoroughly evaluated using two real-world datasets and compared against the existing literature. The results consistently showed improved predictive performance of the proposed models across key goodness-of-fit metrics such as MSE, PRR, R2, and PP, both with and without a change point. Notably, the models with a change point outperformed those without one, as well as the existing literature, in all evaluated metrics across both datasets, reinforcing the value of incorporating dynamic fault detection for enhanced reliability modeling. These improvements demonstrated the model’s effectiveness in capturing realistic fault detection dynamics under evolving testing strategies.

In addition to reliability modeling, a detailed cost-based release analysis has been conducted using Simulated Annealing to optimize testing resources under different reliability constraints. The results revealed that increased reliability requirements lead to higher testing costs and extended release times. This unified approach not only improved model accuracy but also provided actionable insights for cost-effective release planning.

While the proposed framework demonstrated strong performance, future work can extend this model by incorporating environmental and contextual factors, or multiple change points with piecewise defined parameters. Additionally, testing on more diverse datasets and integrating machine learning for parameter tuning can further enhance the model’s applicability.

Appendix 1

From differential equation to MVF

graphic file with name d33e3174.gif 17

where,

graphic file with name d33e3179.gif 18
graphic file with name d33e3183.gif 19
graphic file with name d33e3187.gif 20
graphic file with name d33e3191.gif 21

This gives the following differential equation

graphic file with name d33e3197.gif 22

Changing the independent variable from t to c

graphic file with name d33e3203.gif 23

Equating Eqs. (6) and (7) results in

graphic file with name d33e3209.gif 24

which is a first order ordinary differential equation whose integrating factor is Inline graphic. On integration it gives

graphic file with name d33e3218.gif 25

where q is the integration constant. Applying the initial condition Inline graphic, which implies no faults at zero coverage, i.e. Inline graphic, gives

graphic file with name d33e3232.gif 26

The coverage functions employed in this research paper are as follows:

Case 1:

graphic file with name d33e3240.gif 27

Case 2:

graphic file with name d33e3246.gif 28

Case 3:

graphic file with name d33e3252.gif 29

Incorporating change point τ

graphic file with name d33e3258.gif 30

The uncovered fraction at the change-point has been computed from the coverage function c(W(τ)). For Inline graphic, additional testing covers part of the remaining uncovered code.

Total uncovered fraction is phase-dependent:

graphic file with name d33e3271.gif 31

Case-specific adjustments have been done (e.g., scaling factors in S-shaped and logistic coverage) to ensure continuity of the MVF.

Case 1:

graphic file with name d33e3279.gif 32

Case 2:

graphic file with name d33e3285.gif 33

Case 3:

graphic file with name d33e3291.gif 34

Author contributions

Komee: Writing-editing, Writing-original draft, Visualization, Validation, Methodology, Formal analysis, Conceptualization. Bhoopendra Pachauri: Writing-editing, Supervision.

Funding

Open access funding provided by Manipal University Jaipur.

Data availability

Data may be obtained on request from the corresponding author.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Pham, H. System Software Reliability (Springer, 2007).
  • 2. Goel, A. L. & Okumoto, K. Time dependent error-detection rate model for software reliability and other performance measures. IEEE Trans. Reliab. 28(3), 206–211 (1979).
  • 3. Pradhan, V., Dhar, J. & Kumar, V. Testing-effort based NHPP software reliability growth model with change-point approach. J. Inf. Sci. Eng. 38(2), 343–355 (2022).
  • 4. Dhaka, R., Pachauri, B. & Jain, A. A software reliability growth model for open-source software using sine cosine algorithm. Int. J. Inf. Technol. 16(8), 5173–5181 (2024).
  • 5. Pradhan, S. K., Kumar, A. & Kumar, V. Modeling reliability-driven software release strategy considering testing effort with fault detection and correction processes: a control theoretic approach. Int. J. Reliab. Qual. Saf. Eng. 32(2), 240002 (2025).
  • 6. Kapur, P. K., Gupta, A., Shatnawi, O. & Yadavalli, V. S. Testing effort control using flexible software reliability growth model with change point. Int. J. Perform. Eng. 2(3), 245–263 (2006).
  • 7. Dhaka, R., Pachauri, B. & Jain, A. Parameter estimation of an SRGM using teaching learning based optimization. Int. J. Inf. Technol. 15(6), 2941–2950 (2023).
  • 8. Rani, S., Agarwal, P., Jain, M. & Solanki, R. A software reliability growth model considering testing coverage subject to field environment. Int. J. Math. Oper. Res. 18(2), 145–153 (2021).
  • 9. Song, K. Y., Chang, I. H. & Pham, H. A testing coverage model based on NHPP software reliability considering the software operating environment and the sensitivity analysis. Mathematics 7(5), 450 (2019).
  • 10. Aggarwal, A. G., Kumar, S. & Gupta, R. Multi-release software reliability assessment: testing coverage-based approach. Int. J. Math. Oper. Res. 24(4), 583–594 (2023).
  • 11. Kumar, S., Aggarwal, A. G. & Gupta, R. Modeling the role of testing coverage in the software reliability assessment. Int. J. Math. Eng. Manag. Sci. 8(3), 504–513 (2023).
  • 12. Iqbal, J., Nazir, R. & Rasool, T. NHPP-based testing coverage model with fault removal efficiency and error generation. Int. J. Reliab. Qual. Saf. Eng. 2450046, 16 (2024).
  • 13. Chen, J. & Gupta, A. K. Parametric Statistical Change Point Analysis, Vol. 192 (Springer, 2000).
  • 14. Pradhan, V., Kumar, A. & Dhar, J. Enhanced growth model of software reliability with generalized inflection S-shaped testing-effort function. J. Interdiscip. Math. 25(1), 137–153 (2022).
  • 15. Shrivastava, A. K. & Kapur, P. K. Change-points-based software scheduling. Qual. Reliab. Eng. Int. 37(8), 3282–3296 (2021).
  • 16. Chatterjee, S. & Shukla, A. Effect of test coverage and change point on software reliability growth based on time variable fault detection probability. J. Softw. 11(1), 110–117 (2016).
  • 17. Aggarwal, A., Kumar, S. & Gupta, R. Testing coverage based NHPP software reliability growth modeling with testing effort and change-point. Int. J. Syst. Assur. Eng. Manage. 15(11), 5157–5166 (2024).
  • 18. Ke, S. Z. & Huang, C. Y. Software reliability prediction and management: A multiple change-point model approach. Qual. Reliab. Eng. Int. 36(5), 1678–1707 (2020).
  • 19. Samal, U., Kushwaha, S. & Kumar, A. A testing-effort based SRGM incorporating imperfect debugging and change point. Reliab. Theory Appl. 18(1), 86–93 (2023).
  • 20. Li, Q. & Pham, H. A testing-coverage software reliability model considering fault removal efficiency and error generation. PLoS One 12(7), e0181524 (2017).
  • 21. Pradhan, S. K., Kumar, A. & Kumar, V. An optimal resource allocation model considering two-phase software reliability growth model with testing effort and imperfect debugging. Reliab. Theory Appl. 16(2), 241–255 (2021).
  • 22. Chatterjee, S. & Shukla, A. A unified approach of testing coverage-based software reliability growth modelling with fault detection probability, imperfect debugging, and change point. J. Softw. Evol. Process 31(3), e2150 (2019).
  • 23. Bibyan, R., Anand, S., Aggarwal, A. G. & Kaur, G. Multi-release software model based on testing coverage incorporating random effect (SDE). MethodsX 1, 102076 (2023).
  • 24. Nazir, R., Iqbal, J., Masoodi, F. S. & Shrivastava, A. K. Developing an innovative imperfect debugging software reliability growth model with enhanced testing coverage strategies. Int. J. Reliab. Qual. Saf. Eng. 31(5), 2450017 (2024).
  • 25. Nageswari, N., Mahapatra, A. & Mahapatra, G. S. Predictive framework of software reliability analysis under multiple change points and imperfect debugging. Softw. Qual. J. 33(2), 1–18 (2025).
  • 26. Behera, A. K. & Agarwal, P. A software reliability prediction and management incorporating change points based on testing effort. Reliab. Theory Appl. 19(2), 91–100 (2024).
  • 27. Pradhan, S. K., Kumar, A. & Kumar, V. Multi release software reliability modelling incorporating fault generation in detection process and fault dependency with change point in correction process. Sci. Rep. 15(1), 23145 (2025).
  • 28. Suman, B. & Kumar, P. A survey of simulated annealing as a tool for single and multiobjective optimization. J. Oper. Res. Soc. 57(10), 1143–1160 (2006).
  • 29. Yang, H. Balance of mixed flow assembly line based on industrial engineering mathematics and simulated annealing improved algorithm. Results Eng. 22, 102071 (2024).
  • 30. Khurshid, S., Shrivastava, A. K. & Iqbal, J. Effort based software reliability model with fault reduction factor, change point and imperfect debugging. Int. J. Inf. Technol. 13(1), 331–340 (2021).
  • 31. Kumar, S. & Aggarwal, A. G. Integrating testing coverage, effort and change point in a software reliability growth model: a comprehensive analysis. Reliab. Theory Appl. 4(76), 692–700 (2023).
  • 32. Wood, A. Predicting software reliability. Computer 29(11), 69–77 (1996).
  • 33. Zomaya, A. Y. & Kazman, R. Simulated annealing techniques. In Algorithms and Theory of Computation Handbook: General Concepts and Techniques 33–35 (2010).
