Abstract
In continuation of efforts to improve software reliability assessment, this paper proposes an enhanced Software Reliability Growth Model (SRGM) incorporating a Weibull testing effort function. The model integrates multiple testing coverage functions and a change point in an imperfect debugging environment to reflect shifts in testing dynamics. To validate the model, a comparative study has been conducted on two real-world datasets using mean squared error, prediction ratio risk, and predictive power as evaluation criteria. The model has been coupled with a cost-based software release framework that accounts for post-release risk and strategy change expenses. The Simulated Annealing optimization technique has been employed to minimize the total cost while satisfying reliability constraints. The results demonstrate that the proposed study not only enhances reliability assessment but also facilitates cost-effective release decision-making compared to the existing literature. The work may be extended in the future by considering multiple change points.
Keywords: Testing effort function, Software reliability, Testing coverage, Change-point, Simulated annealing
Subject terms: Engineering, Mathematics and computing
Introduction
Ensuring software reliability is a fundamental objective, as the rapid expansion of software applications across various domains has amplified the demand for reliable, fault-free software systems. “Reliability is the probability of success or probability that the system will perform its intended function under specified design limit”1. To ensure software reliability, many Software Reliability Models (SRMs) have been proposed to predict fault/failure behaviour. Various factors, such as testing time, testing effort, testing coverage, fault reduction factor, fault removal efficiency, detection rate, and change points, affect the quality of SRMs.
Broadly, there are two types of SRMs: deterministic and probabilistic. Probabilistic software reliability models can be classified into groups such as Markov structure, time-series, Non-homogeneous Poisson Process (NHPP), reliability growth, error seeding, failure rate, and curve fitting1. Among these, models based on the NHPP have been widely used. These models describe the time-dependent nature of software fault detection and support diverse extensions. They have been developed in both perfect and imperfect debugging environments. Models developed in a perfect debugging environment assume that detected faults are removed immediately and that no new faults are introduced during the fault removal process.
The first NHPP-based model was proposed by Goel and Okumoto, in which the failure intensity is the product of the constant hazard rate of an individual fault and the expected number of faults remaining in the software2. Later, various SRGMs were developed incorporating factors such as the Testing Effort Function (TEF), testing coverage, and change points in perfect and imperfect debugging environments. Among these, the TEF has been used to optimize the allocation of testing resources; it refers to the resources (time, cost, personnel, etc.) required to conduct the testing process. Pradhan et al. proposed a model which incorporated a generalized inflection S-shaped TEF in a perfect debugging environment3. Dhaka et al. discussed an SRGM using the generalized extended inverse Weibull as the TEF4. Integration of the Weibull testing effort function into fault detection and removal has been done by Pradhan et al.5. Kapur et al. formulated an NHPP-based SRGM that accounts for testing effort in reliability growth6. A growth model using the exponentiated additive Weibull distribution as a TEF has been introduced by Dhaka et al.7.
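The Goel-Okumoto description above translates directly into code; a minimal sketch (the parameter values used below are illustrative, not taken from any dataset):

```python
import math

def go_mvf(a, b, t):
    """Goel-Okumoto mean value function: m(t) = a * (1 - exp(-b * t)),
    the expected cumulative number of faults detected by time t."""
    return a * (1.0 - math.exp(-b * t))

def go_intensity(a, b, t):
    """Failure intensity: the constant per-fault hazard rate b times the
    expected number of faults remaining, a - m(t)."""
    return b * (a - go_mvf(a, b, t))
```

At t = 0 the intensity equals a·b (all faults remain), and it decays toward zero as the expected residual fault count shrinks.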
A TEF is an important factor for software reliability enhancement, but on its own it may not guarantee effective fault detection. Other factors also influence software reliability. One of them is testing coverage, which is essential for assessing the effectiveness of testing and optimizes resource allocation by directing testing effort to failure-sensitive regions. Rani et al. developed an SRGM incorporating two types of testing coverage functions, delayed S-shaped and inflection S-shaped8. Song et al. proposed a testing-coverage-based SRGM considering the software operating environment9. A reliability model integrating the Weibull TEF with three testing coverage functions, namely exponential, delayed S-shaped, and logistic, has been studied by Aggarwal et al.10. Kumar et al. developed SRGMs by embedding a Weibull-based testing effort into exponential, logistic, and S-shaped coverage frameworks, applying genetic algorithm-driven sensitivity analysis to evaluate their robustness11. Iqbal et al. discussed a model incorporating a sigmoid testing effort function with three different coverage functions12.
In real software testing, the fault detection rate often does not remain constant. There can be sudden shifts due to altered testing strategies, code stabilization, or phase transitions. Incorporating a change-point allows models to more accurately represent the software reliability. The concept of change-point has been described in detail by Chen et al.13. In NHPP-based SRGM, the change point refers to the point in time during the testing phase where the rate at which faults are detected or removed changes. Pradhan et al. introduced a model incorporating change point, testing effort and error generation that impacts software performance14. A software reliability model considering change-point during the complete lifecycle of a software has been studied by Shrivastava and Kapur15. Chatterjee and Shukla presented the integration of S-shaped testing coverage with change point16. Aggarwal et al. suggested the integration of change point with three testing coverages and testing effort in a perfect debugging environment17. The concept of single change point has been extended to multiple change points by Ke and Huang18.
Recently, a few authors have proposed NHPP based SRGMs in imperfect debugging environment using possible combinations of testing coverage, testing effort and change point. In imperfect debugging new faults may be introduced during the fault removal process due to the changes in the source code. Samal et al. developed a model integrating generalized logistic testing effort with change point in an imperfect debugging environment19. A model embedding error generation and fault removal efficiency along with inflection S-shaped testing coverage in imperfect debugging environment has been designed by Li and Pham20. Pradhan et al. proposed a two-phase growth model incorporating testing effort in imperfect debugging21. Chatterjee and Shukla extended their previous model in imperfect debugging environment using three different coverage functions namely exponential, Weibull, and S-shaped22. Bibyan et al. have also discussed the above said testing coverage functions in their proposed model23. Incorporation of sigmoid testing effort with three testing coverages in an imperfect debugging environment has been studied by Nazir et al.24. Nageswari et al. have proposed a software hazard rate model incorporating multiple change points in an imperfect debugging environment25. Behera and Agarwal have constructed a growth model integrating change points, and generalized logistic testing effort in imperfect debugging26. Pradhan et al. introduced a SRGM that incorporated fault dependency, multi-release, and change-point concepts27.
The above modeling aspects capture the dynamics of software fault detection and fault removal. Their practical relevance is best understood when extended to cost models that quantify the economic implications. In some of the above discussed articles cost modeling has been incorporated along with reliability modeling5,7,8,11,15,17,18,21,26,27. This integration highlights the economic and technical perspectives. However, it is notable that none of the works have presented cost models in isolation, as they are always developed in conjunction with reliability considerations.
The existing literature includes reliability models under both perfect and imperfect debugging assumptions, although work focusing on imperfect debugging is relatively limited. Several SRGMs have integrated testing effort, coverage, and change-point under perfect debugging. In contrast, models under imperfect debugging typically address only partial pairings, such as testing effort with coverage, testing effort with change-point, or coverage with change-point; a comprehensive model combining all four aspects has remained unexplored17. Table 1 summarizes these prior pairwise combinations, highlighting the research gap in consolidating all elements within a cohesive framework.
Table 1.
Comparison between prior and current research.
| Authors | Testing effort function | Testing coverage | Change point | Debugging environment |
|---|---|---|---|---|
| Pradhan et al.14 | Yes | No | No | Perfect |
| Kumar et al.11 | Yes | Yes | No | Perfect |
| Chatterjee and Shukla16 | No | Yes | Yes | Perfect |
| Dhaka et al.4 | Yes | No | No | Imperfect |
| Pradhan et al.21 | Yes | No | Yes | Imperfect |
| Behera and Aggarwal26 | Yes | No | Yes | Imperfect |
| Aggarwal et al.17 | Yes | Yes | Yes | Perfect |
| Proposed model | Yes | Yes | Yes | Imperfect |
Numerous TEFs, such as Gompertz, Rayleigh, logistic, exponential, and Weibull, have been employed in the literature to improve the performance of software reliability models. A comparative study of these TEFs has been summarized in Table 2, highlighting their distinct features and modeling suitability. Motivated by Aggarwal et al.17, in this paper the Weibull distribution has been employed as the TEF in an imperfect debugging environment, and three coverage functions (exponential, delayed S-shaped, and logistic) have been utilized to fit diverse failure datasets. A change-point mechanism has also been employed to model abrupt transitions in fault occurrence intensity during the software testing process. In short, all four aspects discussed above have been combined in this study. Furthermore, a new cost model has been introduced that incorporates a new factor, the strategy change cost: the expense involved in switching from one testing/debugging approach to another during the software lifecycle. A detailed cost-based release analysis has been conducted using Simulated Annealing (SA) to optimize testing resources under different reliability constraints, along with a sensitivity analysis.
Table 2.
Comparative study of testing effort functions (TEFs).
| Function | Expression | Shape/behaviour | Advantages | Limitations | Remarks |
|---|---|---|---|---|---|
| Exponential | W(t) = W0 (1 − e^(−λt)) | Monotonic, concave; constant hazard | Simple; only one parameter | Cannot capture ramp-up or S-shape; unrealistic in many projects | Special case of Weibull (k = 1) |
| Rayleigh | W(t) = W0 (1 − e^(−λt²)) | Symmetric rise and fall; single peak | Clear and intuitive interpretation | Rigid shape; peak location fixed; less flexible | Special case of Weibull (k = 2) |
| Logistic | W(t) = W0 / (1 + β e^(−λt)) | Sigmoid; symmetric around inflection | Captures learning curves; empirically accurate in some datasets | Symmetry assumption; more parameters; less general than Weibull | Useful alternative when a symmetric S-curve is evident |
| Gompertz | W(t) = W0 e^(−β e^(−λt)) (or equivalent) | Asymmetric S-curve; slow start, rapid mid-growth, saturation | Good for delayed testing followed by a surge | Less flexible than Weibull; not widely used; harder to interpret | Sometimes fits skewed data better |
| Weibull | W(t) = W0 (1 − e^(−λt^k)) | Flexible: concave (k < 1), exponential (k = 1), Rayleigh (k = 2), S-shaped (k > 1) | General form; captures varied testing behaviours; includes exponential and Rayleigh as special cases | Very high k can produce an unrealistically sharp peak | Most preferred due to flexibility and empirical accuracy |
The remainder of this paper has been organized as follows: Sect. 2 presents the related work and foundational concepts. Section 3 outlines the proposed model, including its assumptions and mathematical formulation. Section 4 discusses the experimental setup, datasets, parameter estimation, results, and the performance of the model using key reliability metrics. Section 5 provides a detailed cost analysis, evaluating the economic impact of the proposed release strategy. Finally, Sect. 6 concludes the paper and highlights directions for future work.
Theoretical framework
In this section, the basic information of Weibull distribution and simulated annealing has been provided.
Weibull distribution
The Weibull distribution captures dynamic changes in failure behaviour over time. Its ability to represent increasing, constant, or decreasing rates makes it well suited to reflecting real-world variations in testing intensity. The cumulative distribution form of the two-parameter Weibull TEF is:

W(t) = W0 (1 − e^(−λt^k))  (1)

Here, λ is the scale parameter, k is the shape parameter, and W0 is the total testing effort. The shape parameter k governs the curve's steepness: for k < 1 the rise is initially rapid (indicating a decreasing rate), k = 1 yields a simple exponential approach, and k > 1 produces a slower start followed by acceleration over time. This flexibility enables the model to reflect varying testing effort intensities and allows the effort curve to adapt more sensitively17. The role of different TEFs in reliability modeling has been illustrated through the comparative study presented in Table 2.
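A minimal sketch of this TEF, using the DS-1 estimates from Table 3 for illustration:

```python
import math

def weibull_effort(t, w0, lam, k):
    """Cumulative Weibull testing effort: W(t) = w0 * (1 - exp(-lam * t**k)).
    w0 is the total testing effort, lam the scale and k the shape parameter."""
    return w0 * (1.0 - math.exp(-lam * t ** k))

# Shape behaviour: k < 1 gives a rapid early rise, k = 1 a plain
# exponential approach, k > 1 a slow start followed by acceleration.
# DS-1 estimates from Table 3: w0 = 118.5636, lam = 0.0987, k = 1.1097.
w_week10 = weibull_effort(10, 118.5636, 0.0987, 1.1097)
```

W(t) starts at zero, increases monotonically, and saturates at W0 as t grows.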
Simulated annealing
Simulated annealing (SA) is a probabilistic technique for approximating the global optimum of a function. The name comes from annealing in metallurgy, where a material is heated and then cooled in a controlled manner, affecting both its temperature and its thermodynamic free energy. SA approximates global optimization in a large search space and, even in the presence of many local optima, can locate the global optimum. It is applicable to computational optimization problems for which exact algorithms fail, usually yielding an approximate solution close to the global minimum.

SA originated as a method for single-objective combinatorial problems. It has since been applied to both single- and multi-objective optimization, and its use is not restricted to optimizing non-linear objective functions; it has also been employed for pattern recognition, object classification, and other tasks28.
The SA based algorithm for single objective optimization is illustrated below29.
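A minimal single-objective SA routine of this kind can be sketched as follows (the neighbourhood, cooling schedule, and iteration budget are illustrative choices rather than the exact published listing):

```python
import math
import random

def simulated_annealing(f, x0, step=0.5, t0=1.0, cooling=0.95,
                        iters=2000, seed=0):
    """Minimise f over the reals starting from x0.

    Each iteration proposes a neighbour of the current point; improving
    moves are always accepted, worsening moves are accepted with
    probability exp(-delta / temperature), and the temperature is
    lowered geometrically so late iterations behave like hill climbing.
    """
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best, fbest = x, fx
    temp = t0
    for _ in range(iters):
        cand = x + rng.uniform(-step, step)
        fc = f(cand)
        delta = fc - fx
        if delta <= 0 or rng.random() < math.exp(-delta / max(temp, 1e-12)):
            x, fx = cand, fc
            if fx < fbest:
                best, fbest = x, fx
        temp *= cooling
    return best, fbest
```

For example, minimising f(x) = (x − 3)² from x0 = 0 returns a point close to 3; the acceptance of occasional uphill moves early on is what lets SA escape local optima in harder landscapes.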
Proposed model
An overview of the proposed model, with its graphical representation in Fig. 1 and the notations used, has been given below. A step-by-step algorithm of the process is provided at the end of the section.
Fig. 1.
Graphical representation of the proposed model.
Notations:
| Symbol | Meaning |
|---|---|
| m(t) | Mean value function (expected cumulative faults detected by time t) |
| a | Total initial faults in the software |
| α | Fault introduction rate during the debugging process |
| c(W(t)) | Testing coverage function as a function of applied effort W(t) |
| W(t) | Testing effort applied up to time t |
| ϕ | Coverage-effort coefficient linking effort to coverage |
| b(t) | Fault detection rate, possibly changing at change-point τ |
| b1 | Fault detection rate before the change point |
| b2 | Fault detection rate after the change point |
| β | Scale parameter of the logistic coverage function |
| τ | Change-point in the fault detection rate |
| W0 | Total testing effort |
| λ | Scale parameter of the Weibull TEF |
| k | Shape parameter of the Weibull TEF |
| C(t) | Total cost function |
| R(W) | Software reliability function |
| R0 | Pre-specified reliability level |
| C1(t, τ) | Testing effort cost |
| C2(t, τ) | Fault detection and fixing cost |
| C3(t, τ) | Risk of undetected faults, representing potential post-release failures |
| C4(τ) | Additional cost due to changes in testing strategy |
| m1 | MVF of proposed model Case 1 before the change point |
| m2 | MVF of proposed model Case 1 after the change point |
| c1 | Cost of fixing a fault before the change point |
| c2 | Cost of fixing a fault after the change point |
| c3 | Cost per undetected fault post-release |
| n1 | Cost per unit of testing effort |
| n2 | Cost coefficient for strategy change |
In this section, a SRGM incorporating testing effort, testing coverages, and change point in an imperfect debugging environment has been introduced. This enhanced model addresses the dynamic shifts in fault detection rates during the testing phase. The analysis undertaken in this study relies on the following assumptions11,17,30.
The fault detection in SRGM follows the NHPP.
The rate of fault detection and removal may fluctuate at any point during the testing phase.
Failures in the software system occur randomly due to the residual faults.
The number of faults detected is directly proportional to the number of faults that remain undetected.
The debugging process is imperfect, and new faults may be introduced during fault removal.
The extent of testing coverage evolves with the amount of testing effort applied.
The testing coverage can be expressed in terms of the fault detection rate as b(t) = c′(W(t)) / [1 − c(W(t))]. The proportion of code covered during testing influences the number of faults detected.
Based on the assumptions above, the rate of change of the mean value function is represented by:

dm(t)/dt = b(t) [a(t) − m(t)]  (2)

where:

b(t) = c′(W(t)) / [1 − c(W(t))]  (3)

a(t) = a + α m(t)  (4)

W(t) = W0 (1 − e^(−λt^k))  (5)

Here, b(t) represents the fault detection rate with respect to the testing coverage, the rate of change of coverage with respect to testing effort is governed by the constant ϕ, c(W(t)) is the testing coverage as a function of applied effort, W(t) is the testing effort function, a is the initial number of faults observed, and α is the fault introduction rate during the debugging process. λ, k, and W0 are the scale parameter, shape parameter, and total testing effort of the Weibull TEF, respectively.
By combining the differential Eqs. (2), (3), (4), and (5) along with the initial condition m(0) = 0, the expression for the mean value function (MVF) has been given in Eq. (6):

m(t) = [a / (1 − α)] { 1 − [1 − c(W(t))]^(1−α) }  (6)

where m(t) is the expected number of faults detected by time t.
The fault detection rate b(t) experiences a structural shift at a specific time point, denoted as the change point τ. When the rate changes at τ, b(t) has been defined as follows:

b(t) = b1 for 0 ≤ t ≤ τ, and b(t) = b2 for t > τ  (7)

where b1 and b2 are the fault detection rates before and after the change point τ.
Furthermore, to capture the fault detection behaviour within each phase, the coverage functions defined in Eqs. (8), (10), and (12) have been adopted17. The cumulative number of faults detected over time, as shown in Eqs. (9), (11), and (13), has been calculated using Eqs. (6) and (7) together with the coverage functions, while satisfying the continuity condition m1(τ) = m2(τ) at the change point.
Case 1

c(W(t)) = 1 − e^(−ϕW(t))  (8)

where c(W(t)) is the coverage function.

m(t) = [a / (1 − α)] { 1 − e^(−ϕ(1−α)W(t)) }  (9)

Case 2

c(W(t)) = 1 − (1 + ϕW(t)) e^(−ϕW(t))  (10)

m(t) = [a / (1 − α)] { 1 − [(1 + ϕW(t)) e^(−ϕW(t))]^(1−α) }  (11)

Case 3

c(W(t)) = (1 − e^(−ϕW(t))) / (1 + β e^(−ϕW(t)))  (12)

m(t) = [a / (1 − α)] { 1 − [(1 + β) e^(−ϕW(t)) / (1 + β e^(−ϕW(t)))]^(1−α) }  (13)

where β is the scale parameter of the logistic coverage function.
The full solution has been documented in the Appendix for further consultation.
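To illustrate how such an MVF is evaluated in practice, the sketch below assumes the exponential-coverage closed form m(t) = [a/(1 − α)][1 − e^(−ϕ(1−α)W(t))] for Case 1; this particular expression is an assumed reconstruction, and the exact forms are those documented in the Appendix:

```python
import math

def weibull_effort(t, w0, lam, k):
    """Weibull TEF: W(t) = w0 * (1 - exp(-lam * t**k))."""
    return w0 * (1.0 - math.exp(-lam * t ** k))

def mvf_case1(t, a, alpha, phi, w0, lam, k):
    """Case 1 (exponential coverage) mean value function under imperfect
    debugging. NOTE: the closed form below is an assumed reconstruction,
    m(t) = a / (1 - alpha) * (1 - exp(-phi * (1 - alpha) * W(t))),
    not copied verbatim from the paper's appendix."""
    w = weibull_effort(t, w0, lam, k)
    return a / (1.0 - alpha) * (1.0 - math.exp(-phi * (1.0 - alpha) * w))
```

Under this form m(0) = 0, m(t) is increasing, and it saturates at a/(1 − α), the total fault content inflated by imperfect debugging.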
Algorithmic description of the model has been given below:
Step 1: Start.
Step 2: Conduct a comprehensive literature review to understand existing work in the field.
Step 3: Identify research gaps from the literature.
Step 4: Address the identified gaps by:
Changing the debugging environment assumption from perfect to imperfect.
Incorporating a new factor, the strategy change cost, into the cost model.
Step 5: Formulate the assumptions underlying the proposed model.
Step 6: Develop the mathematical formulation of the model based on these assumptions.
Step 7: Perform parameter estimation using real datasets.
Step 8: Compare the results with existing literature to validate improvements and contributions.
Step 9: Define the complete cost model integrating fault detection, debugging, and strategy change costs.
Step 10: Conduct sensitivity analysis and cost analysis using SA to assess the model’s robustness.
Step 11: Derive concluding remarks based on the analysis.
Step 12: Document supporting references.
Step 13: End.
To validate the proposed models, a series of statistical evaluations has been done. The results and comparative analysis have been discussed in detail in the next section.
Result analysis
The effectiveness of the proposed models has been evaluated using two different real datasets. Dataset 1 (DS-1) originates from software testing data collected from Tandem Computers32, covering a testing duration of 20 weeks with a total of 100 detected faults. Dataset 2 (DS-2) has been derived from a ground-based radar system project reported by Brooks and Motley (1980), encompassing 35 months of testing and a total of 1301 detected faults. A comparative study has been carried out between the proposed models and existing SRGMs using goodness-of-fit criteria such as mean squared error (MSE), prediction ratio risk (PRR), predictive power (PP), and R-squared (R2). The change point τ has been identified by detecting abrupt variation in the frequency of fault detections. A dataset may contain multiple potential change points, each reflecting a noticeable shift in the fault detection rate, such as weeks 9, 12, and 17 for DS-1 and months 12, 17, 20, and 23 for DS-2. From these candidates, the change occurring at the 12th week for DS-1 and the 12th month for DS-2 has been selected in this study.
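The change-point identification described above can also be approximated programmatically; the least-squares scan below is an illustrative alternative to visual inspection of abrupt variation, not the procedure used in the paper:

```python
def change_point_scan(counts):
    """Return the index that best splits per-period fault counts into two
    constant-mean segments (minimum total within-segment squared error).
    counts[i] is the number of faults detected in period i."""
    def sse(seg):
        if not seg:
            return 0.0
        mu = sum(seg) / len(seg)
        return sum((x - mu) ** 2 for x in seg)

    best_idx, best_cost = None, float("inf")
    for i in range(1, len(counts)):
        cost = sse(counts[:i]) + sse(counts[i:])
        if cost < best_cost:
            best_idx, best_cost = i, cost
    return best_idx
```

Applied to weekly fault counts, the returned index marks the period at which the detection rate shifts most sharply; ties among several candidate splits correspond to the multiple potential change points noted above.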
The parameter values for the Weibull testing effort function have been estimated using DS-1 and DS-2, and summarized in Table 3. These values have been obtained through a nonlinear least square fitting approach, which has been utilized to accurately determine the shape and scale parameters.
Table 3.
Parameter values of Weibull testing effort function for DS 1 and DS 2.
| Dataset | W0 | λ | k |
|---|---|---|---|
| DS 1 | 118.5636 | 0.0987 | 1.1097 |
| DS 2 | 1125.3671 | 0.0556 | 2.0006 |
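Nonlinear least-squares fitting of the Weibull TEF is typically done with standard optimization software; as an illustrative sketch, when W0 is held fixed the remaining two parameters can even be obtained by linearising the model and applying ordinary least squares (a shortcut under stated assumptions, not the paper's exact procedure):

```python
import math

def fit_weibull_tef(times, effort, w0):
    """Estimate (lam, k) of W(t) = w0 * (1 - exp(-lam * t**k)) by
    linearising: ln(-ln(1 - W/w0)) = ln(lam) + k * ln(t),
    then solving the resulting simple linear regression.
    Assumes w0 exceeds every observed cumulative effort value."""
    xs, ys = [], []
    for t, w in zip(times, effort):
        if t > 0 and 0 < w < w0:
            xs.append(math.log(t))
            ys.append(math.log(-math.log(1.0 - w / w0)))
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    k = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    lam = math.exp(my - k * mx)
    return lam, k
```

On noise-free synthetic data the linearisation recovers the generating parameters exactly; on real effort data it gives a useful starting point for a full nonlinear least-squares refinement.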
Parameter estimation for the proposed models has been carried out using a nonlinear least-squares fitting method. The detailed outcomes for DS-1 and DS-2 have been shown in Tables 4 and 5 respectively. On comparison, the predicted fault totals are approximately equal to the actual totals, i.e., 100 defects for DS-1 and 1301 defects for DS-2, indicating a strong alignment between modeled and actual reliability behaviour. With the incorporation of the change point, this proximity improves further for both datasets.
Table 4.
Estimation results of parameter of proposed models for DS 1.
| | Model | a | b1 | b2 | ϕ | α | β |
|---|---|---|---|---|---|---|---|
| Without change point | Aggarwal et al.17 | 122 | 0.008 | - | 0.017 | - | - |
| | Proposed Model Case 1 | 97 | 0.4868 | - | 0.0258 | 0.6336 | - |
| | Aggarwal et al.17 | 135 | 0.071 | - | 0.002 | - | - |
| | Proposed Model Case 2 | 99 | 0.9 | - | 0.0125 | 0.8725 | - |
| | Aggarwal et al.17 | 131 | 0.071 | - | 0.002 | - | 75.601 |
| | Proposed Model Case 3 | 110 | 0.4547 | - | 0.0252 | 0.5 | 0.5106 |
| With change point | Aggarwal et al.17 | 132 | 0.031 | 0.032 | 0.005 | - | - |
| | Proposed Model Case 1 | 110 | 0.1419 | 0.1403 | 0.0667 | 0.9 | - |
| | Aggarwal et al.17 | 134 | 0.049 | 0.048 | 0.003 | - | - |
| | Proposed Model Case 2 | 99 | 0.04 | 0.052 | 0.2932 | 0.8034 | - |
| | Aggarwal et al.17 | 135 | 0.005 | 0.04 | 0.037 | - | 85.798 |
| | Proposed Model Case 3 | 89 | 0.0351 | 0.0756 | 0.3745 | 0.3796 | 20.4931 |
Table 5.
Estimation results of parameter of proposed models for DS 2.
| | Model | a | b1 | b2 | ϕ | α | β |
|---|---|---|---|---|---|---|---|
| Without change point | Aggarwal et al.17 | 1661 | 0.003 | - | - | 0.308 | - |
| | Proposed Model Case 1 | 1661 | 0.019796 | - | 0.037944 | 0.9998 | - |
| | Aggarwal et al.17 | 1693 | 0.335 | - | - | 0.002 | - |
| | Proposed Model Case 2 | 1423 | 0.677456 | - | 0.001251 | 0.9 | - |
| | Aggarwal et al.17 | 1402 | 0.002 | - | - | 0.669 | 1.288 |
| | Proposed Model Case 3 | 1609 | 0.001681 | - | 0.5 | 0.87 | 0.2508 |
| With change point | Aggarwal et al.17 | 1664 | 0.267 | 0.266 | - | 0.003 | - |
| | Proposed Model Case 1 | 1635 | 0.14042 | 0.13956 | 0.005327 | 0.93 | - |
| | Aggarwal et al.17 | 1605 | 0.061 | 0.060 | - | 0.015 | - |
| | Proposed Model Case 2 | 1620 | 0.043566 | 0.05 | 0.017131 | 0.8147 | - |
| | Aggarwal et al.17 | 1649 | 0.134 | 0.132 | - | 0.006 | 33.249 |
| | Proposed Model Case 3 | 1409 | 0.6692 | 0.6666 | 0.0013 | 0.94 | 3.001 |
Using the estimated parameters along with the change point τ = 12 weeks, the goodness-of-fit criteria values for the proposed models have been shown in Tables 6 and 7; the results without a change point appear in Table 6 and with a change point in Table 7. The MSE values without change point for the three cases are 20.0911, 13.2775, and 23.9756, and with change point 11.2702, 6.5225, and 3.7941 respectively. The PRR values without change point are 0.4269, 1.8931, and 0.5771, and with change point 0.4795, 0.0457, and 0.0574 respectively. The PP values without change point are 0.3005, 0.4436, and 0.3314, and with change point 0.2375, 0.0437, and 0.0762 respectively. These results indicate that the models with change point outperform both the models without change point and the existing literature for DS-1, underscoring the effectiveness of the Weibull TEF, testing coverage, and change point in an imperfect debugging environment. For better understanding, the comparative plots of the cumulative number of software defects over time (in weeks) have been shown in Figs. 2 and 3, indicating that the proposed models provide a closer alignment with the actual data points.
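The goodness-of-fit criteria follow their standard definitions in the SRGM literature; the sketch below uses the commonly cited forms (stated here as an assumption, since the paper does not spell them out):

```python
def gof_metrics(actual, predicted):
    """Standard SRGM goodness-of-fit criteria.
    MSE: mean squared error; PRR: prediction ratio risk, squared relative
    error against the prediction; PP: predictive power, squared relative
    error against the observation; R2: coefficient of determination."""
    n = len(actual)
    mse = sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n
    prr = sum(((p - a) / p) ** 2 for a, p in zip(actual, predicted))
    pp = sum(((p - a) / a) ** 2 for a, p in zip(actual, predicted))
    mean_a = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return {"MSE": mse, "PRR": prr, "PP": pp, "R2": r2}
```

Lower MSE, PRR, and PP and an R2 closer to 1 indicate a better fit, which is the comparison criterion applied in Tables 6, 7, 8, and 9.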
Table 6.
Analysis of DS-1 (without change point).
| Model | MSE | PRR | PP | R2 |
|---|---|---|---|---|
| Kumar and Aggarwal31 | 23.58422 | 9.489643 | 0.848339775 | 0.987 |
| Aggarwal et al. Model 117 | 68.49883 | 6.895268 | 0.93923 | 0.976 |
| Aggarwal et al. Model 217 | 18.70733 | 7.724101 | 0.80279 | 0.978 |
| Aggarwal et al. Model 317 | 24.8695 | 10.17587 | 0.878601 | 0.976 |
| Proposed Model Case 1 | 20.0911 | 0.4269 | 0.3005 | 0.9753 |
| Proposed Model Case 2 | 13.2775 | 1.8931 | 0.4436 | 0.9837 |
| Proposed Model Case 3 | 23.9756 | 0.5771 | 0.3314 | 0.9705 |
Table 7.
Analysis of DS-1 (with change point).
| Model | MSE | PRR | PP | R2 |
|---|---|---|---|---|
| Kumar and Aggarwal31 | 17.79812 | 0.71271163 | 0.3138299 | 0.991 |
| Aggarwal et al. Model 117 | 14.23955 | 10.17587 | 0.412895 | 0.977 |
| Aggarwal et al. Model 217 | 7.581766 | 0.480246 | 0.222864 | 0.992 |
| Aggarwal et al. Model 317 | 8.021372 | 0.091883 | 0.148177 | 0.994 |
| Proposed Model Case 1 | 11.2702 | 0.4795 | 0.2375 | 0.9861 |
| Proposed Model Case 2 | 6.5225 | 0.0457 | 0.0437 | 0.992 |
| Proposed Model Case 3 | 3.7941 | 0.0574 | 0.0762 | 0.9953 |
Fig. 2.
Actual vs. predicted cumulative defects without change point for DS-1.
Fig. 3.
Actual vs. predicted cumulative defects with change point for DS-1.
Similarly, the calculated values of the statistical performance metrics for the change point τ = 12 months, along with the values of the compared models for DS-2, have been shown in Tables 8 and 9. The MSE values without change point for the three cases are 2831.6, 2180.9, and 1734.8, and with change point 1217.9, 947.3111, and 1180.802 respectively. The PRR values without change point are 1.8738, 9.6342, and 3.3358, and with change point 1.0375, 5.4894, and 3.2223 respectively. The PP values without change point are 0.9690, 1.5077, and 1.0672, and with change point 0.6031, 1.3621, and 1.0627 respectively. These results indicate that the models with change point outperform the models without change point and the existing literature for DS-2 as well. A visual illustration of the comparative plots of the cumulative number of software defects over time (in months) has been shown in Figs. 4 and 5.
Table 8.
Analysis of DS-2 (without change point).
| Model | MSE | PRR | PP | R2 |
|---|---|---|---|---|
| Aggarwal et al. Model 117 | 6420.484 | 2.098579 | 1.291244 | 0.994 |
| Aggarwal et al. Model 217 | 6534.017 | 66.06595 | 2.708052 | 0.944 |
| Aggarwal et al. Model 317 | 6075.331 | 18.87345 | 2.720831 | 0.997 |
| Proposed Model Case 1 | 2831.6 | 1.8738 | 0.9690 | 0.9867 |
| Proposed Model Case 2 | 2180.9 | 9.6342 | 1.5077 | 0.9898 |
| Proposed Model Case 3 | 1734.8 | 3.3358 | 1.0672 | 0.9918 |
Table 9.
Analysis of DS-2 (with change point).
| Model | MSE | PRR | PP | R2 |
|---|---|---|---|---|
| Aggarwal et al. Model 117 | 1235.882 | 1.050907 | 0.804936 | 0.994 |
| Aggarwal et al. Model 217 | 1073.432 | 15.74234 | 1.36346 | 0.995 |
| Aggarwal et al. Model 317 | 1256.672 | 10.11175 | 2.56348 | 0.995 |
| Proposed Model Case 1 | 1217.9 | 1.0375 | 0.6031 | 0.9943 |
| Proposed Model Case 2 | 947.3111 | 5.4894 | 1.3621 | 0.9956 |
| Proposed Model Case 3 | 1180.802 | 3.2223 | 1.0627 | 0.9944 |
Fig. 4.
Actual vs. predicted cumulative defects without change point for DS-2.
Fig. 5.
Actual vs. predicted cumulative defects with change point for DS-2.
The results clearly demonstrated that the proposed models achieved more effective performance, exhibiting lower values of MSE, PRR, and PP in both the with- and without-change-point scenarios. Furthermore, the outcomes for DS-2 reinforced the trend observed for DS-1, highlighting the superior performance of models incorporating change-point mechanisms over both their without-change-point counterparts and the literature. The graphical and tabular comparisons underscored the significant improvements of the proposed model. With these improvements established, it becomes crucial to investigate their economic viability, which has been addressed in the following cost analysis.
Cost analysis using simulated annealing
In this section, a cost-requirement-based software release policy has been discussed. It focuses on determining the optimal point to conclude testing by balancing reliability against total expenditure, considering factors such as testing cost, fault-fixing cost, post-release risk cost, and strategy change cost.
The total cost function C(t) has integrated various expenses associated with software testing and fault management. Building upon this, the total cost and software reliability have been considered as the evaluation criteria1. The optimal release time has been determined by minimizing C(t) subject to attaining a desired reliability level. Hence, the optimization problem has been written as17:
Minimize C(t) = C1(t, τ) + C2(t, τ) + C3(t, τ) + C4(τ)

subject to R(W) ≥ R0

where C(t) is the total cost at testing time t, R(W) is the software reliability function, and R0 is the pre-specified reliability level. The individual cost components are as follows:

C1(t, τ): testing effort cost, modeled using the Weibull function, where n1 is the cost per unit of testing effort.

C2(t, τ): fault detection and fixing cost, incorporating imperfect debugging, where c1 and c2 are the costs of fixing a fault before and after the change point, and m1 and m2 are the MVFs of proposed model Case 1 before and after the change point.

C3(t, τ): risk cost of undetected faults, representing potential post-release failures, where c3 is the cost per undetected fault post-release.

C4(τ): additional cost due to changes in testing strategy or intensity, where n2 is the cost coefficient for the strategy change.
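An illustrative composition of the four components can be sketched as follows; the functional forms used here (effort cost linear in W(t), fixing cost split at τ, risk cost proportional to the expected residual faults, and a generic strategy-change term) are assumptions for demonstration, not the paper's exact expressions:

```python
def total_cost(t, tau, n1, n2, c1, c2, c3,
               effort, mvf1, mvf2, a_total, strategy_delta):
    """Illustrative total cost C(t) = C1 + C2 + C3 + C4 (assumed forms).

    effort(t):      cumulative testing effort W(t)
    mvf1 / mvf2:    MVFs before / after the change point tau
    a_total:        expected total fault content (assumed known)
    strategy_delta: magnitude of the strategy change at tau (assumed)
    """
    c_effort = n1 * effort(t)                            # C1: effort cost
    c_fix = c1 * mvf1(tau) + c2 * (mvf2(t) - mvf1(tau))  # C2: fixing cost
    c_risk = c3 * max(a_total - mvf2(t), 0.0)            # C3: post-release risk
    c_switch = n2 * strategy_delta                       # C4: strategy change
    return c_effort + c_fix + c_risk + c_switch
```

With all unit costs set to 1, linear effort W(t) = t, MVFs m1(t) = m2(t) = 2t, a total fault content of 100, t = 10, and τ = 5, the components are C1 = 10, C2 = 20, C3 = 80, and C4 = 1.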
A preliminary cost analysis has been conducted for three different reliability thresholds (R = 0.80, R = 0.85, and R = 0.90) by varying the cost-related parameters n1, n2, c1, c2, and c3 one at a time. The ranges for the cost parameters have been chosen based on practical feasibility and typical values in software projects. While these parameters influenced the total cost, they had no significant impact on the optimal release time, which varied only with the reliability threshold. Specifically, for R = 0.80, the total cost ranged from 10177.15 to 26126.41 units; for R = 0.90, it ranged from 11046.56 to 28805.12 units, with corresponding changes in optimal release time. Due to space limitations, only the detailed results for R = 0.85 have been presented in this paper.
The sensitivity analysis of the parameters, presented in Table 10, showed how each parameter influenced the total cost. Altering one parameter at a time, while keeping the others fixed, resulted in a corresponding change in the cost. From the table, it is evident that varying n1 produced the largest change in cost, while changes in n2 had the least impact; hence, the cost is most sensitive to n1 among all the considered parameters. Notably, the optimal release time remained constant at 23.29 for all variations, suggesting that the release time is primarily determined by the reliability threshold rather than the cost parameters.
Table 10.
Impact of varying cost parameters on total cost and release time (R = 0.85).
| Varied parameter | n1 | c1 | c2 | c3 | n2 | Cost | Time |
|---|---|---|---|---|---|---|---|
| *n1 varies; c1, c2, c3, n2 fixed* | | | | | | | |
| n1 | 5 | 10 | 50 | 70 | 15 | 10600.85 | 23.29 |
| | 10 | 10 | 50 | 70 | 15 | 19017.29 | 23.29 |
| | 15 | 10 | 50 | 70 | 15 | 27433.87 | 23.29 |
| *c1 varies; n1, c2, c3, n2 fixed* | | | | | | | |
| c1 | 10 | 5 | 50 | 70 | 15 | 18596.72 | 23.29 |
| | 10 | 10 | 50 | 70 | 15 | 19017.29 | 23.29 |
| | 10 | 20 | 50 | 70 | 15 | 19858.45 | 23.29 |
| *c2 varies; n1, c1, c3, n2 fixed* | | | | | | | |
| c2 | 10 | 10 | 40 | 70 | 15 | 18782.93 | 23.29 |
| | 10 | 10 | 50 | 70 | 15 | 19017.29 | 23.29 |
| | 10 | 10 | 60 | 70 | 15 | 19251.79 | 23.29 |
| *c3 varies; n1, c1, c2, n2 fixed* | | | | | | | |
| c3 | 10 | 10 | 50 | 50 | 15 | 18992.93 | 23.29 |
| | 10 | 10 | 50 | 60 | 15 | 19017.29 | 23.29 |
| | 10 | 10 | 50 | 70 | 15 | 19041.68 | 23.29 |
| *n2 varies; n1, c1, c2, c3 fixed* | | | | | | | |
| n2 | 10 | 10 | 50 | 70 | 10 | 19017.13 | 23.29 |
| | 10 | 10 | 50 | 70 | 15 | 19017.29 | 23.29 |
| | 10 | 10 | 50 | 70 | 20 | 19017.44 | 23.29 |
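The one-at-a-time variation procedure behind Table 10 can be sketched as follows. The cost function here is a hypothetical linear stand-in for C(t) at the fixed release time, with coefficients chosen only so that n1 dominates and n2 barely matters, mirroring the pattern observed in the table.

```python
# One-at-a-time (OAT) sensitivity: vary each cost parameter over a small grid
# while holding the others at their baseline values.
def oat_sensitivity(cost_fn, baseline, grids):
    """Return {param: [(value, cost), ...]}, varying one parameter at a time."""
    results = {}
    for param, values in grids.items():
        rows = []
        for v in values:
            trial = dict(baseline)   # keep all other parameters fixed
            trial[param] = v
            rows.append((v, cost_fn(**trial)))
        results[param] = rows
    return results

# Hypothetical stand-in cost, not the paper's actual C(t).
def demo_cost(n1, c1, c2, c3, n2):
    return 1683.0 * n1 + 42.0 * c1 + 23.4 * c2 + 2.4 * c3 + 0.03 * n2

baseline = dict(n1=10, c1=10, c2=50, c3=70, n2=15)
grids = {"n1": [5, 10, 15], "c1": [5, 10, 20], "n2": [10, 15, 20]}
table = oat_sensitivity(demo_cost, baseline, grids)
```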
The above analysis provided insights into how individual cost parameters affect the total cost at a fixed reliability level, but it did not ensure a globally optimal solution across all parameters simultaneously. Therefore, to explore a more comprehensive optimization strategy, SA has been employed33. The SA algorithm has been configured in MATLAB with a maximum of 10,000 iterations and a function tolerance of 10⁻⁴. It has been used to minimize the total cost function while satisfying the reliability constraint at three levels: R = 0.80, R = 0.85, and R = 0.90. The optimization has been performed over six decision variables: t, n1, c1, c2, c3, and n2. The steps of the SA algorithm are given below.
Step 1: Define the optimization problem and initialize its parameters.
Step 2: Precompute constant values.
Step 3: Initialize the solution vector [t, n1, c1, c2, c3, n2].
Step 4: Set the SA parameters (number of iterations, function tolerance).
Step 5: Evaluate the cost function for the current solution.
Step 6: If the stopping criterion (minimal change in cost) is satisfied, terminate; otherwise, generate a new candidate solution and repeat Step 5.
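The steps above can be sketched as a minimal simulated-annealing loop. The cost and reliability functions below are toy stand-ins, not the paper's C(t) and R; enforcing R ≥ R0 through a large penalty term is an assumption, since the paper's MATLAB configuration is not reproduced in full.

```python
import math
import random

def simulated_annealing(cost, reliability, r0, x0, bounds,
                        iters=10_000, temp0=100.0, patience=2000, seed=0):
    rng = random.Random(seed)

    def penalized(x):
        # Step 5: evaluate the cost, penalizing any reliability shortfall.
        shortfall = max(0.0, r0 - reliability(x))
        return cost(x) + 1e6 * shortfall

    x = list(x0)                                   # Step 3: initial solution
    fx = penalized(x)
    best, fbest = list(x), fx
    stalled = 0
    for k in range(1, iters + 1):                  # Step 4: iteration budget
        temp = temp0 / k                           # simple cooling schedule
        cand = [min(hi, max(lo, xi + rng.gauss(0.0, 0.1 * (hi - lo))))
                for xi, (lo, hi) in zip(x, bounds)]
        fc = penalized(cand)
        # Accept improvements always; accept worse moves with Boltzmann probability.
        if fc < fx or rng.random() < math.exp(-(fc - fx) / max(temp, 1e-12)):
            x, fx = cand, fc
        if fx < fbest:
            best, fbest = list(x), fx
            stalled = 0
        else:
            stalled += 1
        if stalled > patience:                     # Step 6: stopping criterion
            break
    return best, fbest

# Toy single-variable instance: cost grows with release time t while
# reliability saturates, so R0 = 0.85 forces a minimum feasible t.
best, fbest = simulated_annealing(
    cost=lambda x: 100.0 * x[0],
    reliability=lambda x: 1.0 - math.exp(-0.2 * x[0]),
    r0=0.85, x0=[25.0], bounds=[(0.0, 50.0)])
```

In the toy instance the optimizer is driven toward the smallest t that still meets the reliability constraint, which is exactly the cost-reliability trade-off discussed below.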
The optimal values of these parameters, along with the corresponding total cost for each reliability threshold, have been presented in Table 11.
Table 11.
Simulated annealing optimization results.
| Reliability | Release time (in weeks) | n1 | c1 | c2 | c3 | n2 | Total cost |
|---|---|---|---|---|---|---|---|
| 0.80 | 22.48 | 10 | 5 | 60 | 68 | 15 | 17949.74 |
| 0.85 | 23.29 | 10 | 17 | 48 | 66 | 14 | 19549.42 |
| 0.90 | 24.14 | 10 | 11 | 43 | 64 | 19 | 19830.48 |
The results showed that as the reliability requirement increases, both the release time and the total cost rise. While n1 remained constant, the other parameters (n2, c1, c2, and c3) adjusted to meet the reliability targets. This highlights the cost-reliability trade-off and the importance of parameter tuning for optimal software release decisions.
To further illustrate this trade-off in a practical setting, a numerical example has been presented below.
Numerical Example:
Consider a software project where testing is performed in two phases. In the first phase (0 ≤ t ≤ τ), manual testing has been conducted with τ = 10 days, a fault detection rate of b₁ = 2 faults/day, and a unit testing cost of n₁ = 100. Faults detected before the change point have been fixed at a cost of c₁ = 500 per fault. In the second phase (t > τ), automated tests are added, increasing the detection rate to b₂ = 8 faults/day and the fault-fixing cost to c₂ = 800 per fault. The risk of undetected faults post-release is accounted for by c₃ = 1000 per fault, and the strategy change cost has been modeled with coefficient n₂ = 200. Using a Weibull-based testing effort function W(t), the total cost has been computed as C(t) ≈ 76,190, and the resulting reliability R(W) = 0.855 satisfies the pre-specified requirement of R₀ = 0.8. This example demonstrates how the strategy change point τ corresponds to a practical shift in testing approach and how all cost components can be quantified, allowing project managers to determine the optimal release time while achieving the desired reliability.
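A partial check of the example's fault-fixing component can be done with the stated constant detection rates. The release time T below is a hypothetical choice for illustration only; the example's total C(t) ≈ 76,190 additionally includes the testing-effort, post-release-risk, and strategy-change components, which depend on the Weibull effort function W(t) and are not reproduced here.

```python
tau, T = 10, 15        # change point (days) and a hypothetical release time
b1, b2 = 2, 8          # fault detection rates before/after the change point
c1, c2 = 500, 800      # per-fault fixing costs before/after the change point

faults_phase1 = b1 * tau          # 2 faults/day * 10 days = 20 faults
faults_phase2 = b2 * (T - tau)    # 8 faults/day * 5 days  = 40 faults
fix_cost = c1 * faults_phase1 + c2 * faults_phase2   # 10,000 + 32,000 = 42,000
```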
Conclusion
This study presented an enhanced SRGM incorporating a Weibull testing effort function, a change point, and various testing coverage functions in an imperfect debugging environment. The models have been thoroughly evaluated using two real-world datasets and compared against the existing literature. The results consistently showed improved predictive performance of the proposed models across key goodness-of-fit metrics such as MSE, PRR, R2, and PP, both with and without a change point. Notably, the models with a change point outperformed those without one, as well as the existing literature, in all the evaluated metrics across both datasets, reinforcing the value of incorporating dynamic fault detection for enhanced reliability modeling. These improvements demonstrated the model's effectiveness in capturing realistic fault detection dynamics under evolving testing strategies.
In addition to reliability modeling, a detailed cost-based release analysis has been conducted using Simulated Annealing to optimize testing resources under different reliability constraints. The results revealed that increased reliability requirements lead to higher testing costs and extended release times. This unified approach not only improved model accuracy but also provided actionable insights for cost-effective release planning.
While the proposed framework demonstrated strong performance, future work can extend this model by incorporating environmental and contextual factors, or multiple change points with piecewise defined parameters. Additionally, testing on more diverse datasets and integrating machine learning for parameter tuning can further enhance the model’s applicability.
Appendix 1
From differential equation to MVF
![]() |
17 |
where,
![]() |
18 |
![]() |
19 |
![]() |
20 |
![]() |
21 |
This gives the following differential equation:
![]() |
22 |
Changing the independent variable from t to c
![]() |
23 |
Equating Eqs. (6) and (7) results in
![]() |
24 |
which is a first-order ordinary differential equation. Multiplying by its integrating factor and integrating gives
![]() |
25 |
where q is the integration constant. Applying the initial condition m = 0 at c = 0, i.e., no faults at zero coverage, gives
![]() |
26 |
Coverage functions that have been employed in this research paper are as follows:
Case 1:
![]() |
27 |
Case 2:
![]() |
28 |
Case 3:
![]() |
29 |
Incorporating change point τ
![]() |
30 |
The uncovered fraction at the change point has been computed from the coverage function c(W(τ)). For t > τ, additional testing covers part of the remaining uncovered code.
The total uncovered fraction is phase-dependent:
![]() |
31 |
Case-specific adjustments have been done (e.g., scaling factors in S-shaped and logistic coverage) to ensure continuity of the MVF.
Case 1:
![]() |
32 |
Case 2:
![]() |
33 |
Case 3:
![]() |
34 |
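The three coverage families used in Cases 1–3 can be sketched as functions of cumulative testing effort W. The paper's exact parametric forms are given by Eqs. (27)–(29) above; the standard exponential, delayed S-shaped, and logistic forms below are illustrative assumptions that share the expected behavior: zero coverage at W = 0 and saturation toward full coverage as W grows.

```python
import math

def coverage_exponential(w, b):
    """Exponential coverage: c(W) = 1 - exp(-b*W)."""
    return 1.0 - math.exp(-b * w)

def coverage_s_shaped(w, b):
    """Delayed S-shaped coverage: c(W) = 1 - (1 + b*W) * exp(-b*W)."""
    return 1.0 - (1.0 + b * w) * math.exp(-b * w)

def coverage_logistic(w, b, A):
    """Logistic coverage: c(W) = (1 - exp(-b*W)) / (1 + A * exp(-b*W))."""
    return (1.0 - math.exp(-b * w)) / (1.0 + A * math.exp(-b * w))
```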
Author contributions
Komee: Writing-editing, Writing-original draft, Visualization, Validation, Methodology, Formal analysis, Conceptualization. Bhoopendra Pachauri: Writing-editing, Supervision.
Funding
Open access funding provided by Manipal University Jaipur.
Data availability
Data may be obtained on request from the corresponding author.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Pham, H. System Software Reliability (Springer, 2007).
- 2. Goel, A. L. & Okumoto, K. Time dependent error-detection rate model for software reliability and other performance measures. IEEE Trans. Reliab. 28(3), 206–211 (1979).
- 3. Pradhan, V., Dhar, J. & Kumar, V. Testing-effort based NHPP software reliability growth model with change-point approach. J. Inf. Sci. Eng. 38(2), 343–355 (2022).
- 4. Dhaka, R., Pachauri, B. & Jain, A. A software reliability growth model for open-source software using sine cosine algorithm. Int. J. Inf. Technol. 16(8), 5173–5181 (2024).
- 5. Pradhan, S. K., Kumar, A. & Kumar, V. Modeling reliability-driven software release strategy considering testing effort with fault detection and correction processes: a control theoretic approach. Int. J. Reliab. Qual. Saf. Eng. 32(2), 240002 (2025).
- 6. Kapur, P. K., Gupta, A., Shatnawi, O. & Yadavalli, V. S. Testing effort control using flexible software reliability growth model with change point. Int. J. Perform. Eng. 2(3), 245–263 (2006).
- 7. Dhaka, R., Pachauri, B. & Jain, A. Parameter estimation of an SRGM using teaching learning based optimization. Int. J. Inf. Technol. 15(6), 2941–2950 (2023).
- 8. Rani, S., Agarwal, P., Jain, M. & Solanki, R. A software reliability growth model considering testing coverage subject to field environment. Int. J. Math. Oper. Res. 18(2), 145–153 (2021).
- 9. Song, K. Y., Chang, I. H. & Pham, H. A testing coverage model based on NHPP software reliability considering the software operating environment and the sensitivity analysis. Mathematics 7(5), 450 (2019).
- 10. Aggarwal, A. G., Kumar, S. & Gupta, R. Multi-release software reliability assessment: testing coverage-based approach. Int. J. Math. Oper. Res. 24(4), 583–594 (2023).
- 11. Kumar, S., Aggarwal, A. G. & Gupta, R. Modeling the role of testing coverage in the software reliability assessment. Int. J. Math. Eng. Manag. Sci. 8(3), 504–513 (2023).
- 12. Iqbal, J., Nazir, R. & Rasool, T. NHPP-based testing coverage model with fault removal efficiency and error generation. Int. J. Reliab. Qual. Saf. Eng. 2450046, 16 (2024).
- 13. Chen, J. & Gupta, A. K. Parametric Statistical Change Point Analysis, vol. 192 (Springer, 2000).
- 14. Pradhan, V., Kumar, A. & Dhar, J. Enhanced growth model of software reliability with generalized inflection S-shaped testing-effort function. J. Interdiscip. Math. 25(1), 137–153 (2022).
- 15. Shrivastava, A. K. & Kapur, P. K. Change-points-based software scheduling. Qual. Reliab. Eng. Int. 37(8), 3282–3296 (2021).
- 16. Chatterjee, S. & Shukla, A. Effect of test coverage and change point on software reliability growth based on time variable fault detection probability. J. Softw. 11(1), 110–117 (2016).
- 17. Aggarwal, A., Kumar, S. & Gupta, R. Testing coverage based NHPP software reliability growth modeling with testing effort and change-point. Int. J. Syst. Assur. Eng. Manage. 15(11), 5157–5166 (2024).
- 18. Ke, S. Z. & Huang, C. Y. Software reliability prediction and management: a multiple change-point model approach. Qual. Reliab. Eng. Int. 36(5), 1678–1707 (2020).
- 19. Samal, U., Kushwaha, S. & Kumar, A. A testing-effort based SRGM incorporating imperfect debugging and change point. Reliab. Theory Appl. 18(1), 86–93 (2023).
- 20. Li, Q. & Pham, H. A testing-coverage software reliability model considering fault removal efficiency and error generation. PLoS One 12(7), e0181524 (2017).
- 21. Pradhan, S. K., Kumar, A. & Kumar, V. An optimal resource allocation model considering two-phase software reliability growth model with testing effort and imperfect debugging. Reliab. Theory Appl. 16(2), 241–255 (2021).
- 22. Chatterjee, S. & Shukla, A. A unified approach of testing coverage-based software reliability growth modelling with fault detection probability, imperfect debugging, and change point. J. Softw. Evol. Process. 31(3), e2150 (2019).
- 23. Bibyan, R., Anand, S., Aggarwal, A. G. & Kaur, G. Multi-release software model based on testing coverage incorporating random effect (SDE). MethodsX 1, 102076 (2023).
- 24. Nazir, R., Iqbal, J., Masoodi, F. S. & Shrivastava, A. K. Developing an innovative imperfect debugging software reliability growth model with enhanced testing coverage strategies. Int. J. Reliab. Qual. Saf. Eng. 31(5), 2450017 (2024).
- 25. Nageswari, N., Mahapatra, A. & Mahapatra, G. S. Predictive framework of software reliability analysis under multiple change points and imperfect debugging. Softw. Qual. J. 33(2), 1–18 (2025).
- 26. Behera, A. K. & Agarwal, P. A software reliability prediction and management incorporating change points based on testing effort. Reliab. Theory Appl. 19(2), 91–100 (2024).
- 27. Pradhan, S. K., Kumar, A. & Kumar, V. Multi release software reliability modelling incorporating fault generation in detection process and fault dependency with change point in correction process. Sci. Rep. 15(1), 23145 (2025).
- 28. Suman, B. & Kumar, P. A survey of simulated annealing as a tool for single and multiobjective optimization. J. Oper. Res. Soc. 57(10), 1143–1160 (2006).
- 29. Yang, H. Balance of mixed flow assembly line based on industrial engineering mathematics and simulated annealing improved algorithm. Results Eng. 22, 102071 (2024).
- 30. Khurshid, S., Shrivastava, A. K. & Iqbal, J. Effort based software reliability model with fault reduction factor, change point and imperfect debugging. Int. J. Inf. Technol. 13(1), 331–340 (2021).
- 31. Kumar, S. & Aggarwal, A. G. Integrating testing coverage, effort and change point in a software reliability growth model: a comprehensive analysis. Reliab. Theory Appl. 4(76), 692–700 (2023).
- 32. Wood, A. Predicting software reliability. Computer 29(11), 69–77 (1996).
- 33. Zomaya, A. Y. & Kazman, R. Simulated annealing techniques. In Algorithms and Theory of Computation Handbook: General Concepts and Techniques 33–35 (2010).