Abstract
Health care audits are crucial in managing government insurance programs, which are estimated to lose billions of dollars every year. Statistical methods such as sampling have long been used to handle their size and complexity. Sampling from health care claims data can benefit from multi-stage approaches, especially when the evaluation of the tradeoffs between precision and cost is important. The use of decision models could help health care auditors and policy makers make the best use of these sampling outputs. This paper proposes an integrated multi-stage sampling and decision-making framework that enables auditors to address the tradeoffs between audit costs and expected overpayment recovery. We illustrate the framework and discuss insights utilizing a variety of overpayment scenarios for payment populations including U.S. Medicare Part B claims payment data.
Keywords: Health care audits, multi-stage sampling, health care fraud, decision analysis, information theory
1. Introduction
Increasing health care costs create financial concerns for policy makers in many countries. For instance, in the USA, annual health care expenditures reached $3.6 trillion, or $11,172 per person, in 2018, which accounted for 17.7% of the nation's Gross Domestic Product [7]. Up to 10% of this spending is estimated to be lost as overpayments in the form of fraud, waste and abuse [5]. These health care overpayments range from upcoding to sophisticated kickback networks; see Ekin [14] for an overview of examples. In addition to the adverse direct cost implications for both the government and taxpayers, these overpayments could also result in diminished quality of care and indirect negative health impacts on patients. In order to control and oversee health care spending, most health care insurance programs utilize audits. Traditionally, domain experts would manually investigate the billings and make informed decisions about whether they are overpaid. Hence, health care audits are costly and time-consuming. The size and complexity of the systems deter health care officials from comprehensive auditing. Therefore, statistical sampling methods have become an integral part of health care audits.
Statistical sampling corresponds to choosing representative subsets of claims of interest to estimate the population parameters. Once health care audit sampling output is retrieved, current practice mostly involves ad-hoc decision making. Although a variety of sampling methods have been proposed, the statistical output has not been incorporated into decision models in either practice or the literature. This could decrease the efficiency of analytical methods used within health care fraud assessment. Health care audit decisions are complex and multi-faceted. They can involve consideration of audit costs, expected recovery amounts, the accuracy of the extrapolations (overpayment estimation), and potential regret costs. In particular, the audit costs include the investigation time spent by the experts and the physical resources. The recovery amounts correspond to the recovery due to audits as well as the projected recovery from the population. Even a small improvement in these audit sampling decisions could be crucial in a multi-billion dollar domain that impacts most, if not all, of the population [13].
1.1. Statistical sampling for health care audits
The general objective of sampling is to make inference about overpayment through variables such as the proportion of overpaid claims and the overpayment amount. In the USA, current governmental sampling guidelines [3] recommend using the lower limit of a one-sided confidence interval for the total overpayments as the recovery (recoupment) amount from the audited provider. Using the lower bound protects the provider by preventing, with high confidence, the recovery of an amount greater than the actual value of erroneous payments. Such overpayment estimation can be challenging for skewed populations; see [12,16,27] for alternative approaches. In practice, Rat-Stats [30] is among the statistical software used by the Office of Inspector General, Office of Audit Services, to help with statistical auditing. Woodard [37] presents the health care audit sampling practices of U.S. governmental programs.
Statistical sampling methods include, but are not limited to, simple random sampling, stratified sampling, and multi-stage sampling. Among these, simple random sampling is the most popular due to its ease of use and communication. Stratified sampling can be preferred when the auditor has information and/or interest about population subgroups. It separates the population into mutually exclusive, homogeneous segments (strata) by a stratification variable, and then draws samples from each segment (stratum). The weighted stratum overpayment estimates then yield a total overpayment estimate. Given a fixed pre-determined sample size, stratified sampling can result in a smaller margin of error than simple random sampling [8]. Having said that, stratified sampling designs need to be constructed carefully, which includes the choice of the number of strata, the stratum boundaries, and the sample size of each stratum. There are a number of strategies that can be used to determine the stratum boundaries [20]. These include, but are not limited to, Neyman allocation [29], the cumulative square root rule [9], and the Lavallée–Hidiroglou stratification approach [24]. For instance, Neyman allocation considers the variance of the estimates within all strata.
The complexity of health care claims data can benefit from multi-stage approaches, especially when the evaluation of the tradeoff between precision and cost is important. Rat-Stats allows the iterative use of sampling methods. Ignatova and Edwards [21] propose a multi-stage framework in which the total sample size is determined by evaluating the number of overpaid claims in a probe sample. Musal and Ekin [28] introduce an iterative stratified sampling method that uses Lindley's entropy measure to evaluate the expected information of prospective samples. They use the 10th percentile of the posterior distribution of the total number of overpaid claims in the population to compute the recovery estimate. Their information-theoretic approach is shown to decrease estimation errors for heterogeneous medical claims data. However, their method is based on the estimation of a percentile of a distribution, and they employ it only for a known initial sample size.
1.2. Decision models for healthcare audits
Decision models have been used extensively in health care services and administration. Successful implementations include, but are not limited to, supporting management decisions in healthcare organizations, evaluating health care providers, and helping physicians identify effective treatments [17]. However, the use of decision models to help subject domain experts remains an understudied step in overall health care fraud assessment procedures [15]. The method of Iyengar et al. [22] is among the limited decision model-related work in health care fraud assessment. They propose a rule generation method for identifying and ranking candidate audit targets from a database of prescription drug claims. The modeling of the tradeoff between costs and recovery amounts in health care audits is discussed by Ekin [13], albeit without any numerical analysis. The need for measures to determine the effectiveness of analytical methods has also been highlighted by the U.S. Government Accountability Office [18].
Health care fraud detection and the respective audit activities could benefit from decision models for better utilization of scarce resources. Such health care resource allocation methods have been applied in other domains of healthcare [1,38]. In the general fraud detection literature, a number of decision-theoretic approaches have been considered within the financial and auto insurance industries. The evaluation of the tradeoff between the costs of false positives and false negatives is crucial, and cost-based metrics such as ROC analysis are utilized [31]. Dionne et al. [11] propose a scoring-based auditing approach for auto insurance fraud cases. Ulvila and Gaffney [36] integrate ROC and cost analysis to develop an expected cost metric for evaluating computer intrusion detection systems. Sahin et al. [32] propose a cost-sensitive decision tree approach for credit card fraud detection. Torgo and Lopes [35] address the prioritization of investigation leads for efficient allocation of audit resources. Their utility-based fraud detection model uses the likelihood of fraud, inspection costs, and expected payoff.
1.3. Motivation, contribution and overview
This paper aims to fill the gap in the health care audits literature and practice by proposing an integrated decision-making framework that utilizes the output of a multi-stage sampling framework. This will facilitate health care audits and help policy makers. The proposed decision models consider initial and additional audit costs and the expected recovery amount, as well as a budget constraint. This framework enables auditors to analyze the resource allocation tradeoffs between audit costs and expected overpayment recovery while considering the impact of statistical learning within a predetermined budget.
To the best of our knowledge, other than a few simple rule-based frameworks, no decision models are utilized in the practice of health care audit sampling. In addition, none of the existing work in the literature has studied the impact of learning within a data-driven sampling decision model. This paper uses entropy-based information quantification to determine the next stratum to sample from within a multi-stage sampling framework. It differs from the sampling method of [28] by utilizing Lindley's entropy for a set of decision alternatives and by retrieving the overpayment recovery estimate with a different estimation method, based on the confidence interval approach using the standard deviation and the mean of sample overpayments. This application provides a semi-automated decision-making alternative within health care fraud assessment systems that can help auditors utilize scarce audit resources efficiently. The proposed models are general in that they could be used within other audit sampling settings.
The paper is organized as follows. The following section presents the proposed multi-stage sampling method and the decision models. Section 3 presents the payment and overpayment data, while Section 4 illustrates the use of the integrated decision making and statistical framework with an analysis. The paper concludes with an overview and a discussion of future research directions. A brief Appendix presents complementary statistics of the data.
2. Methods
2.1. Multi-stage sampling framework
The information-theoretic multi-stage stratified sampling method used in this paper utilizes Neyman allocation for the initial allocation and Lindley's entropy measure for the additional allocation. Neyman allocation is based on constructing homogeneous strata by the use of the mean and the variance of each stratum. Initial allocations are done so that the within-stratum variances across all strata are minimized. However, Neyman allocation does not consider the whole information content of the prospective samples. In order to quantify the available information, entropy has long been utilized in a number of different disciplines [33]. In this paper, we use Lindley's entropy interpretation [25]. In particular, entropy corresponds to the uncertainty summary about a probability distribution and is defined as the expected value of the log probability distribution [34]. We compute the information content of the current and prospective samples to determine the stratum that provides the highest expected information gain. This framework is based on quantifying the uncertainty of the number of overpaid samples, which is summarized by the derived Beta-binomial distribution. In order to satisfy the governmental guidelines, the lower limit of a one-sided 90% confidence interval of the estimated total overpayment amount is computed as the recovery amount. This framework enables us to compute the expected recovery amount estimates for all the decision alternatives of sample allocations, to be used as inputs for the proposed decision models.
In the following, the details of the utilized sampling method are presented. First, we introduce the notation. $N_h$ is the number of claims in stratum $h$, which adds up to the total size of the payment population as in $N = \sum_{h=1}^{L} N_h$. Similarly, $K_h$ and $\rho_h$ correspond to the number and proportion of overpaid claims in stratum $h$. The payment and overpayment amounts of claims in stratum $h$ are represented by the vectors $\mathbf{y}_h$ and $\mathbf{z}_h$, respectively. Once a sample of size $n$ is retrieved and allocated, $y_{hi}$ and $z_{hi}$ denote the sample payment and overpayment amounts of claim $i$ in stratum $h$, while $k_h$ refers to the number of overpaid claims in stratum $h$ with size $n_h$.
Next, we present how the initial and additional allocation of claims and the respective payment values are done. As part of the initial allocation, $n$ claims are allocated to strata based on the mean and standard deviation of payments, $\bar{y}_h$ and $\sigma_{y_h}$, for each stratum, $h = 1, \ldots, L$. The initial sample size for stratum $h$ is written as

$$n_h = n \, \frac{N_h \sigma_{y_h}}{\sum_{h'=1}^{L} N_{h'} \sigma_{y_{h'}}}, \tag{1}$$
where $n = \sum_{h=1}^{L} n_h$ is the total initial sample size. For the allocation of additional samples, Lindley's entropy is used. In general, for stratum $h$, we would like to obtain $EI_h$, the expected information gain of observing the number of overpaid claims, $k^*$, in a pre-determined additional sample of size $n^*$, given the number of overpaid claims in stratum $h$, $k_h$. In doing so, we will utilize the distribution of $k^*$ after already having observed $n$ claims with $k$ overpayments. For clarity, the stratum notation $h$ is suppressed from the following equations:

$$EI = \sum_{k^*} p(k^* \mid k) \int p(\rho \mid k^*, k) \log \frac{p(\rho \mid k^*, k)}{p(\rho \mid k)} \, d\rho.$$

Since $p(\rho \mid k^*, k)\, p(k^* \mid k) = p(\rho, k^* \mid k)$, this can be rewritten as

$$EI = \sum_{k^*} \int p(\rho, k^* \mid k) \log p(\rho \mid k^*, k) \, d\rho - \int p(\rho \mid k) \log p(\rho \mid k) \, d\rho. \tag{2}$$
The probability distribution of $k$ among $n$ claims is Binomial with parameters $(n, \rho)$: $p(k \mid \rho) = \binom{n}{k} \rho^k (1-\rho)^{n-k}$. We define the prior distribution of $\rho$ as a Beta distribution with parameter vector $(\alpha, \beta)$; conditioning on $(\alpha, \beta)$, $n$, and $n^*$ is suppressed for notational clarity. The parameters $\alpha$ and $\beta$ are updated after observing data: $\alpha = a + k$ and $\beta = b + n - k$, respectively. The hyper-parameter values $a$ and $b$ are set to 0.01 and 0.01, which corresponds to a weakly informative distribution. These are determined such that there exists a non-zero probability of a claim being legitimate or fraudulent while having high sensitivity to the observed data. The posterior distribution of $\rho$ is therefore a Beta distribution due to conjugacy. Similarly, the posterior predictive distribution of $k^*$ is retrieved as a Beta-binomial distribution:

$$p(k^* \mid k) = \binom{n^*}{k^*} \frac{B(\alpha + k^*, \; \beta + n^* - k^*)}{B(\alpha, \beta)}, \tag{3}$$

where $B(\cdot, \cdot)$ denotes the beta function.
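These quantities can be computed directly. The following is a minimal sketch in R of Equations (2) and (3): it evaluates the Beta-binomial predictive probabilities and the expected information gain of an additional sample of size $n^*$ for one stratum. The function names are illustrative rather than taken from the original implementation, and the Beta entropy is evaluated with its closed form.

```r
# log pmf of k_star overpaid claims among n_star prospective claims (Eq. 3),
# with alpha = a + k and beta = b + n - k from the observed sample
lbetabinom <- function(k_star, n_star, alpha, beta) {
  lchoose(n_star, k_star) +
    lbeta(alpha + k_star, beta + n_star - k_star) - lbeta(alpha, beta)
}

# E[log p(rho)] for a Beta(alpha, beta) density, i.e. its negative entropy,
# using the closed-form expression with digamma functions
neg_entropy_beta <- function(alpha, beta) {
  -(lbeta(alpha, beta) - (alpha - 1) * digamma(alpha) -
      (beta - 1) * digamma(beta) + (alpha + beta - 2) * digamma(alpha + beta))
}

# Expected information gain (Eq. 2) of sampling n_star additional claims
# from a stratum with k overpaid claims among n observed; a = b = 0.01
expected_info_gain <- function(k, n, n_star, a = 0.01, b = 0.01) {
  alpha <- a + k
  beta  <- b + n - k
  k_star <- 0:n_star
  w <- exp(lbetabinom(k_star, n_star, alpha, beta))     # p(k* | k)
  post <- vapply(k_star, function(ks)
    neg_entropy_beta(alpha + ks, beta + n_star - ks), numeric(1))
  sum(w * post) - neg_entropy_beta(alpha, beta)
}
```

For example, a stratum in which all 9 sampled claims were overpaid yields a comparatively small expected information gain, consistent with the behaviour discussed in Section 4 where such strata receive no further samples.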
These distributions are used within the following sampling framework to retrieve the overpayment recovery estimate as a function of decision alternatives of the initial and additional allocation sample sizes. The steps of the utilized sampling framework are:
- For each stratum, use the initial Neyman allocation scheme and retrieve $k_h$ overpayments among the $n_h$ sampled claims.
- Compute the expected information gain, $EI_h$, of an additional sample from stratum $h$, with outcome $k^*_h$, by using Equation (2) for each stratum.
- Determine $h^* = \arg\max_h EI_h$, the stratum with the highest expected information gain, and sample from that stratum. These two steps are repeated until the additional sample of size $n^*$ is allocated.
- The expected total overpayment amount, $\hat{T}$, is computed given the sample mean overpayment for each stratum, $\bar{z}_h = \sum_{i=1}^{n_h} z_{hi} / n_h$, $h = 1, \ldots, L$: $\hat{T} = \sum_{h=1}^{L} N_h \bar{z}_h$.
- The pooled standard deviation estimate for the total overpayment amount is computed as $\widehat{sd}(\hat{T}) = \sqrt{\sum_{h=1}^{L} N_h^2 s_h^2 / n_h}$. The overpayment standard deviation estimate of each stratum $h$, $s_h$, is obtained from the sample mean of the overpayment proportion, $\hat{\rho}_h = k_h / n_h$, its sample variance, $s^2_{\hat{\rho}_h}$, and the payment standard deviation, $s_{y_h}$.
- Finally, the overpayment recovery estimate, $R(n, n^*)$, is computed as

$$R(n, n^*) = \hat{T} - z_{0.90} \, \widehat{sd}(\hat{T}), \tag{4}$$

where $z_{0.90}$ is the 90th percentile of the standard normal distribution.
As a result, we retrieve the total overpayment recovery estimate, $R(n, n^*)$, as a function of $n$ and $n^*$, to be used within the proposed decision models.
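As a small illustration of the last two steps, the following R sketch computes the recovery estimate of Equation (4) from stratum-level summaries. The stratum overpayment standard deviations are assumed to be already computed, and the sample means and standard deviations in the example call are hypothetical (the stratum sizes and sample sizes are taken from Tables 2 and 4).

```r
# Recovery estimate (Eq. 4): expected total overpayment minus the
# z_{0.90} multiple of its pooled standard deviation estimate
recovery_estimate <- function(N_h, n_h, zbar_h, s_h, conf = 0.90) {
  T_hat  <- sum(N_h * zbar_h)                # expected total overpayment
  sd_hat <- sqrt(sum(N_h^2 * s_h^2 / n_h))   # pooled standard deviation
  T_hat - qnorm(conf) * sd_hat               # lower one-sided 90% limit
}

# hypothetical five-stratum example: population counts, sample sizes,
# sample means and standard deviations of overpayment per claim
recovery_estimate(N_h    = c(527, 1351, 1970, 2329, 2101),
                  n_h    = c(8, 14, 34, 25, 9),
                  zbar_h = c(7390, 6200, 4400, 2300, 0),
                  s_h    = c(500, 2100, 2600, 2900, 0))
```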
2.2. Integrated decision models
We present two audit sampling decision models to be used depending on the state of the audit. The first model is proposed for cases where the auditor has only determined the provider of interest, and has access to the related payment data. This model enables auditors to consider the tradeoffs between audit costs and expected recovery while deciding how to allocate the sampling resources among the initial and potential additional investigations within the budget. The second model solves for the decision problem where the objective is to find the optimal additional sample size for a given initial sample.
Model 1 is written as

$$\max_{(n, n^*) \in A} \; E\big[(n + n^*)\,\bar{z} + r\,R(n, n^*) - (c_1 n + c_2 n^*)\big] \quad \text{subject to} \quad c_1 n + c_2 n^* \leq B. \tag{5}$$
The objective of this simple optimization model is to maximize the expected recovery gain net of the total audit cost within a given budget. The objective function can be loosely referred to as the expected net gain. The decision variables are the initial and additional numbers of claims to sample, $n$ and $n^*$.
Total audit costs correspond to the initial and additional sample resource allocation. The additional samples generally cost more than the initial samples, taking into account the extra investigation setup required after the initial allocation. Hence, the unit audit cost for initial investigations, $c_1$, is assumed to be less than the unit audit cost for additional investigations, $c_2$; that is, $c_1 < c_2$. The total audit cost, $c_1 n + c_2 n^*$, cannot exceed the total audit budget, $B$.
The recovery consists of both the recovery due to audits and the expected recovery from the population. The sample overpayment value is generally requested in full from the investigated provider as a result of the audits. The average recovery due to audits is estimated as the average sample overpayment, $\bar{z}$, so the demanded refund from the provider under investigation is $(n + n^*)\,\bar{z}$. The expected recovery, $R(n, n^*)$, is computed according to the governmental guidelines as part of the sampling framework. It is a function of the decision alternatives $(n, n^*)$ from the set $A$. In practice, the government is only able to recover a certain percentage of the inferred population overpayments. Thus, the recovery gain from the population is discounted by a recoupment percentage, $r$, so the gain due to recovery can be written as $(n + n^*)\,\bar{z} + r\,R(n, n^*)$.
In this model, the objective function consists of $E[(n + n^*)\,\bar{z} + r\,R(n, n^*) - (c_1 n + c_2 n^*)]$. The expectation accounts for the randomness in the drawn samples and the resulting sample overpayment values; $\bar{z}$ and $R(n, n^*)$ are the random variables in our model and differ across replications. While solving the model, we use Monte Carlo simulation with a reasonably large number of replications to estimate the expectation.
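A minimal R sketch of this Monte Carlo evaluation over the decision grid is given below. It assumes a helper run_sampling(n, n_star), not shown here, that replays one replication of the Section 2.1 framework and returns the sample mean overpayment zbar and the recovery estimate R; the grid bounds are illustrative.

```r
# Expected net gain of a decision (n, n_star), estimated by simulation;
# decisions violating the budget constraint are excluded via -Inf
net_gain <- function(n, n_star, reps = 5000,
                     c1 = 1000, c2 = 3500, r = 0.1, B = 1e6) {
  cost <- c1 * n + c2 * n_star
  if (cost > B) return(-Inf)                # budget constraint of Model 1
  g <- replicate(reps, {
    s <- run_sampling(n, n_star)            # one sampling replication
    (n + n_star) * s$zbar + r * s$R - cost  # audit refund + discounted
  })                                        #   population recovery - cost
  mean(g)
}

# Model 1: search over all decision alternatives (illustrative grid)
grid <- expand.grid(n = 15:60, n_star = 0:60)
grid$gain <- mapply(net_gain, grid$n, grid$n_star)
grid[which.max(grid$gain), ]                # optimal (n, n_star)
```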
Model 2 can be considered a special case of Model 1. It is based on maximizing the expected utility given an already allocated initial sample size, $n$. It focuses on the tradeoffs between the cost and the expected recovery involved in choosing the additional sample size. The functional details are the same as in Model 1 and are therefore excluded from the discussion for brevity.
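In the sketch above, Model 2 amounts to restricting the grid search to a fixed initial sample size; for instance, with the initial sample size of 45 used in the demonstration of Section 4:

```r
# Model 2: n is fixed, only the additional sample size is optimized
n_fixed <- 45
gains <- sapply(0:45, function(ns) net_gain(n_fixed, ns))
best_n_star <- which.max(gains) - 1   # offset: grid starts at n_star = 0
```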
3. Health care claims data
This paper utilizes two payment populations to demonstrate the versatility of the framework. The first payment population is compiled by Musal and Ekin [28], using the publicly available data from 2008 CMS Outpatient Procedures [4]. It includes the procedure codes with frequent overpaid billings in investigations such as the billings for ‘J9041’ (Injection of Bortezomib 0.1 mg). This data set consists of 8278 claims with payment values. The actual allocation of payments to L = 5 strata is done using the R package GA4Stratification [23] so that each stratum consists of increasing dollar amounts of payments. Table 1 presents the descriptive statistics of payment values.
Table 1.
Descriptive statistics of real world payment population.
Stratum | Mean | Std. dev. | Median | $N_h$
---|---|---|---|---
h = 1 | 69.40 | 46.92 | 40.00 | 3949 |
h = 2 | 915.69 | 180.87 | 70.00 | 1402 |
h = 3 | 2335.37 | 244.61 | 2500.00 | 588 |
h = 4 | 3076.60 | 206.76 | 3000.00 | 1675 |
h = 5 | 4012.65 | 246.71 | 4100.00 | 664 |
Overall | 1298.47 | 1441.22 | 600.00 | N = 8278 |
We simulate another payment population, which has left skewness; see Figure 1 for the density plot. Left skewness is often seen in health care claims payment data where providers may upcode the system. Our left skewed population has the same population size as the real life data. It has lower variance and corresponds to relatively higher payment values compared to the real world payment population. Table 2 presents the descriptive statistics of the left skewed payment population.
Figure 1.
The density plot of the left skewed payment population.
Table 2.
Descriptive statistics of left skewed payment population.
Stratum | Mean | Std. dev. | Median | $N_h$
---|---|---|---|---
h = 1 | 7390.03 | 474.59 | 7523.10 | 527 |
h = 2 | 8274.94 | 195.62 | 8296.77 | 1351 |
h = 3 | 8844.99 | 145.28 | 8849.42 | 1970 |
h = 4 | 9286.63 | 114.38 | 9289.63 | 2329 |
h = 5 | 9679.78 | 127.71 | 9665.91 | 2101 |
Overall | 8995.53 | 672.08 | 9125.18 | N = 8278 |
3.1. Overpayment data
While using the proposed framework in an actual audit, the decision makers would have access to the audit results, including overpayment values. However, such audit results and overpayment data are not publicly available due to privacy concerns. In addition, the proposed general decision-making framework should be tested for different overpayment patterns in order to demonstrate its versatility. Therefore, we resort to simulation and construct a number of overpayment scenarios, retrieving the overpayment amounts of the populations for each stratum.
We assume that all claims are either fully legitimate or fully overpaid, similarly to the existing literature, i.e. [12]. For each scenario, the overpayment data is generated by using the overpayment proportion parameter $\rho_{hi}$ for stratum $h$ and scenario $i$. In order to represent various settings, we utilize the seven scenarios presented in Table 3.
Table 3.
Overpayment proportions for each stratum and scenario.
Scenario | h = 1 | h = 2 | h = 3 | h = 4 | h = 5
---|---|---|---|---|---
i = 1 | 0.00 | 0.25 | 0.50 | 0.75 | 1.00 | |
i = 2 | 1.00 | 0.75 | 0.50 | 0.25 | 0.00 | |
i = 3 | 0.05 | 0.25 | 0.50 | 0.75 | 0.95 | |
i = 4 | 0.95 | 0.75 | 0.50 | 0.25 | 0.05 | |
i = 5 | 0.50 | 0.50 | 0.50 | 0.50 | 0.50 | |
i = 6 | 0.25 | 0.25 | 0.25 | 0.25 | 0.25 | |
i = 7 | 0.75 | 0.75 | 0.75 | 0.75 | 0.75 |
The first two scenarios represent differing proportions of overpaid claims in each stratum. For instance, in the second overpayment scenario, the first and fifth strata have $\rho_{1,2} = 1.00$ and $\rho_{5,2} = 0.00$. This means all claims in the first stratum are overpaid, indicating that low payment claims are fraudulent in that particular scenario, whereas all high payment claims (which correspond to the fifth stratum) are fully legitimate. On the other hand, scenarios 5, 6 and 7 have the same overpayment proportions throughout all strata.
Using this choice of parameters, we run a simulation of size 5000, and record the averages of the overpayment data for each payment population and overpayment scenario. This enables us to consider different settings and demonstrate the versatility of the proposed models. The average descriptive statistics per claim of the total overpayment data for the real world and simulated payment populations are presented in the Appendix.
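A minimal sketch of this generation step in R is shown below: each claim in stratum $h$ is overpaid in full with probability $\rho_{hi}$ and fully legitimate otherwise. The list payments_by_stratum of stratum payment vectors is an assumed input.

```r
set.seed(1)
rho <- c(1.00, 0.75, 0.50, 0.25, 0.00)   # scenario i = 2 from Table 3
overpayments <- lapply(seq_along(payments_by_stratum), function(h) {
  y <- payments_by_stratum[[h]]
  # overpaid claims carry their full payment as overpayment, others zero
  y * rbinom(length(y), size = 1, prob = rho[h])
})
```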
4. Application
This section mainly demonstrates the application using the simulated left skewed payment population. First, we go through the steps of the integrated statistical sampling and decision making process to illustrate the learning aspect. We discuss expected net gains based on decisions involving an auditing budget. Within our framework, a decision involves an initial and an additional sample size whose overall auditing costs are charged against the budget. The alternatives for the initial sample size decision should be large enough for the proper implementation of the Central Limit Theorem and stratified sampling. It is difficult to recommend a general region for the decision search space of the sample size, as it depends on the characteristics of the population; these bounds are generally determined by the auditors. One practical issue to consider is the search increment of $n^*$ at each iteration. In this particular paper, we consider increments of 1, which corresponds to investigating the impact of sampling one new claim for each decision alternative of the initial sample size. This provides the most general and flexible application of the proposed decision framework. In practice, it might be preferred to sample a fixed number of additional claims, say 5, all at once for convenience. This can be accommodated by the proposed iterative decision-making framework. Next, we present evidence of the statistical validity of the results. Finally, we conduct a sensitivity analysis with respect to varying parameters, and discuss the changes in outcomes for various overpayment scenarios and payment populations.
First, we present the step by step demonstration of the multi-stage sampling algorithm for a particular alternative of the initial sample size using a randomly chosen replication. For demonstration, we choose the initial sample size as 45, and increase additional sample size with increments of 1 up to 90. Table 4 presents the output of the iterative allocation of the additional samples for some iterations as well as the initial and final allocation among strata.
Table 4.
One replication of the multi-stage sampling framework evolution for left skewed payment population and overpayment scenario 2.
Iteration | h = 1 | h = 2 | h = 3 | h = 4 | h = 5 |
---|---|---|---|---|---|
Initial allocation | 8 | 9 | 10 | 9 | 9 |
t = 1, n = 46 | 0 | 0 | 0 | 1 | 0 |
t = 2, n = 47 | 0 | 0 | 0 | 1 | 0 |
t = 3, n = 48 | 0 | 1 | 0 | 0 | 0 |
t = 4, n = 49 | 0 | 1 | 0 | 0 | 0 |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
t = 42, n = 87 | 0 | 1 | 0 | 0 | 0 |
t = 43, n = 88 | 0 | 0 | 0 | 1 | 0 |
t = 44, n = 89 | 0 | 0 | 1 | 0 | 0 |
t = 45, n = 90 | 0 | 0 | 1 | 0 | 0 |
Final allocation | 8 | 14 | 34 | 25 | 9 |
Initially, at t = 0, Neyman allocation provides the initial allocation of samples to the 5 strata. The remaining allocations are done with respect to the expected amount of information to be gained from the next sampling iteration. Eight claims are allocated to the first stratum at t = 0. In overpayment scenario 2, the probability of overpayment in the first stratum is 1. Therefore, the algorithm concludes that there is no more information to be gained by sampling from that stratum, and it is not allocated any further samples during the remaining iterations. Similarly, the fifth stratum has no overpayments in the population, hence no more samples are allocated to it. On the other hand, a larger number of samples is allocated to strata 3 and 4 since there is more information to be gained by sampling from these strata. This indicates learning and results in a decreased standard error for these strata in the final recovery dollar amount.
The proposed statistical method facilitates iterative learning, which differentiates it from other multi-stage sampling frameworks such as Neyman allocation. Table 5 provides a comparison of sampling resource allocation between the Neyman allocation and the utilized entropy-based method. The impact of learning within the ultimate sampling allocation could be recognized from this output. While the entropy-based method does not allocate any more samples to the first and fifth strata after the very initial sample allocation, the Neyman allocation keeps allocating many samples to those strata due to variance.
Table 5.
The sample allocation of Neyman allocation vs information theoretic sampling for left skewed payment and overpayment scenario 2.
Algorithm | h = 1 | h = 2 | h = 3 | h = 4 | h = 5 | Total |
---|---|---|---|---|---|---|
Information theoretic | 8 | 14 | 34 | 25 | 9 | 90 |
Neyman allocation | 20 | 17 | 16 | 17 | 20 | 90 |
The outcomes of the 5000 replications of the sampling method are used as inputs for the decision models. For the decision models, we choose the unit sampling costs as $c_1$ = $1000 and $c_2$ = $3500 for demonstration purposes. The sampling costs include, but are not limited to, the investigation setup and the salary and expenses of the auditor. The recoupment percentage, $r$, is assumed to be 0.1, based on the information in the Annual Report to Congress on the Medicare and Medicaid integrity programs for fiscal years 2013 and 2014 [6]. In particular, 490 billion dollars were spent on Medicare in 2014. Assuming a 10% overpayment rate, out of 49 billion dollars in overpayments, only 4.765 billion dollars were recovered. This approximately corresponds to a recovery rate of 10 cents for each overpaid dollar. The budget, $B$, is set as $1,000,000. All parameters can be modified depending on the nature of the medical audit.
First, we utilize the proposed Model 1 to determine the tradeoffs between all decision alternatives when the analyst does not have any prior allocation information. For illustration, we define the discrete set of decision alternatives, $A$, over ranges of $n$ and $n^*$ with increments of 1. For the left-skewed payment population and overpayment scenario 2, the optimal decision $(n, n^*)$ is the alternative that maximizes the expected net gain over $A$. This provides the auditor with the best resource allocation within the budget when he/she does not have any information with regard to the initial sample size. On the other hand, Model 2 can be utilized for any given initial sample size. For instance, in the following, we share the optimal solutions of Model 2 given four alternatives of initial sample sizes for the left-skewed payment population and overpayment scenario 2. In particular, the initial sample sizes of 15, 30, 45 and 60 are already allocated to the 5 strata. For each pre-determined initial sample size and the cost assumptions involving the initial and additional samples, we obtain the additional sample size for which the net gain is maximized.
Table 6 shows the total sample allocation decisions that provide the optimal net gain for each of the selected decision alternatives of initial sample sizes. These are 29 (15 + 14), 57 (30 + 27), 60 (45 + 15) and 70 (60 + 10), respectively. A higher number of additional samples does not necessarily result in a higher net gain. For instance, in the case of an initial sample size of 60, the net gain becomes the highest when 10 additional samples are allocated; beyond that, the cost of additional sampling units is not justified by the additional expected recovery provided by the learning from samples. It should be noted that in the unlikely case of negligible sampling costs, sampling more could be preferred, since that would decrease the standard error and improve estimation, and the recovery amount would be closer to the actual population overpayment. Figure 2 further illustrates the proposed model, where every circle represents the average net gain for one of the decision alternatives of additional samples. The trends are smoothed with Loess curves to help recognize the concave behaviour in some cases.
Table 6.
Optimal additional sample allocation decisions for select decision alternatives of initial sample sizes.
Initial sample size, $n$ | 15 | 30 | 45 | 60
---|---|---|---|---
Optimal additional sample size, $n^*$ | 14 | 27 | 15 | 10
Figure 2.
Expected net gain (y-axis, in 1000s) for additional allocation and total sample size alternatives (x-axis). In particular, from top left to bottom right, additional claims up to sizes of 15, 30, 45 and 60 are added with increments of one to samples with initial claim sizes of 15, 30, 45 and 60, resulting in total numbers of audited claims up to maximum values of 30, 60, 90 and 120, respectively.
Next, we present a discussion based on the fitted Loess smoothing curve for the 45 additional samples taken on top of the initial sample of 45. For this scenario, the total population overpayment is computed as $26,398,584. The total overpayment recovery is estimated to be between $20,595,164 and $22,390,622 for total sample sizes ranging from 46 to 90, as the standard error varies between $2,936,329 and $4,107,211.
Figure 3 provides the expected net gain for the decision alternatives of additional allocation. The expected net gain on the y-axis of this plot is the difference between the extrapolated recovery amount and the sampling cost. The initial steep increase is followed by a flattening out and then a steep decrease of the expected net gain. Two vertical lines are drawn to illustrate how the expected net gain differs with respect to the decision of additional sample size. If 50 claims (5 additional claims on top of the initial 45) are used, it is quite clear that this would lead to a sub-optimal solution. The preferred solutions seem to concentrate between 60 and 70 total claims. Increasing the sample size decreases the standard error of the recovery at a diminishing rate.
Figure 3.
Expected net gain (y-axis, in 1000 s) for select decision alternatives (x-axis). In particular, additional claims up to a size of 45 are added with increments of one to a sample with the initial claim size of 45 to result with a total number of audited claims up to maximum values of 90.
These results show the importance of performing cost analysis for the decision maker to determine the optimal sampling resource allocation. This is where the proposed decision-modeling framework becomes beneficial. The proposed decision model allows the decision maker to assess tradeoffs with respect to the cost and learning due to the additional samples and its impact on expected net gain. This is the main contribution of the proposed integrated decision making and sampling framework.
It is crucial that the sampling output and overpayment estimates are statistically valid to be used in officially acceptable health care audits. Since we use the officially accepted lower limit of a one-sided 90% confidence interval for the total overpayments as the recovery (recoupment) amount from the audited provider, the coverage probabilities are expected to be around 90%. In order to assess this, Figure 4 displays the average coverage probabilities for the aforementioned case. The probabilities for almost all additional sample sizes are around the 90% line. Probabilities well above that line are deemed inefficient, whereas probabilities below 90% risk insufficient coverage, which indicates statistical invalidity. For brevity, we do not provide the coverage probabilities for the other cases. They are reasonable and available on request.
Figure 4.
Expected coverage probability (y-axis) for select decision alternatives (x-axis). In particular, additional claims up to a size of 45 are added with increments of one to a sample with the initial claim size of 45 to result with a total number of audited claims up to maximum values of 90.
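The coverage check itself is straightforward to replicate. In a sketch consistent with the setup above, assuming a vector R_reps of recovery estimates across the 5000 replications and the known simulated total overpayment T_true:

```r
# the one-sided interval [R, Inf) covers the truth whenever R <= T_true;
# the empirical coverage should stay close to the nominal 0.90
coverage <- mean(R_reps <= T_true)
```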
4.1. Sensitivity analysis
So far, we have presented the results for a particular case and set of parameters. In the following, we share results for different parameter values to assess the sensitivity. First, we run a sensitivity analysis for the unit cost of additional samples, $c_2$.

Table 7 gives the optimal additional allocation for different values of $c_2$ for an initial allocation of 45. The inverse relationship between $c_2$ and the optimal additional sample size can easily be recognized. When it becomes very costly to draw additional samples, i.e. $c_2$ = $6000, the optimal additional sample size decreases to 11, whereas when there is no additional cost beyond the regular sampling cost, i.e. $c_2$ = $c_1$ = $1000, the optimal additional sample size is found to be 44.
Table 7.
Optimal additional sample sizes for an initial allocation of 45 with varying $c_2$.
$c_2$ | $1000 | $2250 | $3500 | $4750 | $6000
---|---|---|---|---|---
Optimal $n^*$ | 44 | 44 | 15 | 15 | 11
Next, we analyze the sensitivity of the results for different overpayment scenarios. In doing so, we provide a comparison of the integrated decision and sampling framework for decision alternatives of initial sample sizes of 15, 30, 45 and 60 for all overpayment scenarios and left-skewed payment population.
Table 8 reveals the differences in the optimal allocation for the various overpayment scenarios. For instance, the differences in the optimal decisions of Scenario 6 compared to the other scenarios can be recognized. This can potentially be explained by the relatively low and equal probability of overpayments across all strata; the additional samples may not provide enough value to justify their cost. The additional allocation also highly depends on the initial sample size. For the lower initial allocation alternatives such as 15, the initial allocations result in large standard errors, which in turn increases the optimal additional allocation, even in scenario 6. In contrast, when the initial sample allocation is 60, the optimal additional allocation values are not high compared to the initial allocation for any of the scenarios.
Table 8.
Optimum additional sample sizes for various overpayment scenarios and the left-skewed payment population.
Initial $n$ | Scen 1 | Scen 2 | Scen 3 | Scen 4 | Scen 5 | Scen 6 | Scen 7
---|---|---|---|---|---|---|---
15 | 15 | 14 | 14 | 15 | 14 | 12 | 15 |
30 | 23 | 27 | 29 | 29 | 28 | 1 | 30 |
45 | 29 | 15 | 37 | 12 | 43 | 2 | 43 |
60 | 33 | 10 | 35 | 10 | 17 | 1 | 35 |
Lastly, we replicate the analysis for the real world payment population. Table 9 presents the final sampling resource allocations for all overpayment scenarios. Figure 5 presents the allocation and its impact on expected net gain for the four alternatives of initial sample allocation.
Table 9.
Optimum additional sample size for various overpayment scenarios for the real world payment population.
Initial $n$ | Scen 1 | Scen 2 | Scen 3 | Scen 4 | Scen 5 | Scen 6 | Scen 7
---|---|---|---|---|---|---|---
15 | 15 | 14 | 15 | 14 | 15 | 1 | 15
30 | 30 | 19 | 29 | 9 | 29 | 1 | 30
45 | 34 | 9 | 31 | 3 | 37 | 1 | 38
60 | 21 | 3 | 27 | 1 | 14 | 1 | 42
Figure 5.
Expected net gain (y-axis, in 1000s) for additional allocation and total sample size alternatives (x-axis) for the real world payment population. In particular, from top left to bottom right, additional claims up to sizes of 15, 30, 45 and 60 are added with increments of one to samples with initial claim sizes of 15, 30, 45 and 60, resulting in total numbers of audited claims up to maximum values of 30, 60, 90 and 120, respectively.
The results for the first (real-world) payment population are slightly different, but are still in line with the earlier findings. The resource allocation differs with respect to the initial sample size and the overpayment scenario. The tradeoff between investigation costs and expected recovery amounts becomes more evident for overpayment scenarios with a low probability of overpaid claims. Similar to the previous findings, this is clearly demonstrated for the 6th overpayment scenario, where the population overpayment probabilities are fixed at 25%. The additional sampling allocation is shown to have the least benefit on the expected net gain despite the improvements in expected recovery estimation.
Overall, the proposed frameworks enable auditors to assess the tradeoffs between costs and the expected recovery amount while potentially considering a budget constraint. In particular, the tradeoff between the cost of drawing an additional sample and the improvement in expected recovery due to learning and the decreasing standard error is crucial. Additional sampling can be preferred despite its higher unit cost when the expected recovery becomes sufficiently higher due to learning. The relative advantage of the information-theoretic method is smallest in the cases of small additional sample sizes, especially for lower initial allocations. The monetary gain contribution is shown to decrease with increasing total sample size. It should be emphasized that since auditors generally utilize relatively small samples for medical audits, even small improvements in sampling designs are crucial.
5. Conclusion
The size and heterogeneity of medical claims data prompt the use of statistical and decision modelling tools to aid medical audits. Although statistical and data mining tools have been adopted widely in the last three decades, decision models have not been incorporated into health care audit practice. This creates inefficiency in handling health care overpayments, which are estimated to be in the billions of dollars. This paper fills that gap by presenting an integrated medical audit sampling decision analytics framework. Our decision-making framework builds on the output of an information-theoretic multi-stage stratified sampling approach and ensures that the overpayment amount estimates are statistically valid. In doing so, the impact of learning within the data-driven statistical model on the decision model is evaluated.
The proposed multi-stage sampling framework produces overpayment recovery estimates that are valid with respect to governmental guidelines. This output is utilized by the proposed decision models, which in turn enable health care policy decision makers and auditors to assess the tradeoffs between the audit costs and the expected overpayment recovery for decision alternatives of initial and additional sample sizes. In cases where the auditor is at the beginning of an investigation, Model 1 can help with the sample resource allocation decisions at both the initial and possibly additional phases. Model 2 assumes that the initial allocation is already done, and lets the decision maker choose the size of the additional sampling allocation. In cases where the auditor is not interested in additional samples, which is generally the practice of choice, the models help the auditors evaluate the potential inefficiencies resulting from their decisions. Payment data from U.S. Medicare Part B claims and simulated left-skewed data are used with various overpayment patterns to demonstrate the use of the proposed decision models and to illustrate the tradeoffs.
Recent advances in information technology allow wider adoption of statistical and decision models to deal with challenges in the health care industry. This could make semi-automated decision-making frameworks a viable option for health care audit sampling. This application presents great potential to improve the efficiency of health care audit sampling decision making. Given the scarce nature of health care fraud assessment resources, even a small improvement would correspond to a gain of millions of dollars as well as an improvement in public confidence in health care audit systems. The proposed framework is general in that it could be used within other audit sampling settings, such as tax audits.
There are a number of limitations and potential future research directions. The extent of sensitivity analysis can be improved. For instance, the impact of changes for the unit recoupment percentage could be further analyzed. A more comprehensive decision-making framework could be designed over multi-stages in which the initial investigation decision at the first stage can impact the potential decisions at later stages. An alternative version could include using the entire distributions of the number of overpaid claims and recovery amount in a fully Bayesian framework.
In terms of the decision model, other decision criteria, such as the potential regret cost due to overpayment estimation errors, could be considered. The estimation errors decrease with larger sample sizes. Such incorporation of regret cost could be feasible for cases with known overpayment patterns. In addition, the increments in the decision alternatives of the additional sample size, $n^*$, could be generalized by using any positive integer value instead of the value of one used in our application. Our choice of decision alternative increments of 1 allows maximum flexibility in the decision-making framework, realistically assuming the investigation of claims one at a time. However, for ease of implementation, the auditors may prefer to run the proposed framework for higher alternative values of the additional sample size.
Lastly, the proposed decision-making framework is compatible with alternative multi-stage sampling and estimation frameworks. Potential alternatives could include the proposal of a sequential interval estimation procedure, see [19,26] for overviews. For instance, Chattopadhyay and De [2] propose interval estimation of the Gini index with a specified confidence coefficient and a specified margin of error. They establish an iterative scheme where variance and confidence interval are recomputed at each iteration using the increased sample size, until the pre-specified confidence interval width is obtained. This could be modified to construct an alternative method for computing the required sample size. In cases where the claims data can be grouped by procedure codes to form clusters, sequential estimation frameworks as in [10] could also lead to more efficient sampling procedures.
Acknowledgements
We thank the attendees of the 52nd Hawaii International Conference on System Sciences and the INFORMS 2018 Annual Meeting for their constructive comments.
Appendix.
Overpayment data
Tables A1 and A2 present the average descriptive statistics per claim of the total overpayment data for the real world and simulated payment populations, populations 1 and 2, respectively. The differences in the descriptive statistics between scenarios and the payment populations can be recognized easily. In order to provide a better understanding of the overpayment data, the descriptive statistics of a particular scenario are presented in Table A3.
Table A1.
Overpayment population average descriptive statistics per claim generated for payment population 1 and each overpayment scenario.
Scenario | Mean | Std. dev. | 2.5% | 25% | Median | 75% | 97.5% |
---|---|---|---|---|---|---|---|
i = 1 | 910 | 1464 | 0 | 0 | 0 | 2469 | 4300 |
i = 2 | 388 | 800 | 0 | 0 | 50 | 150 | 3497 |
i = 3 | 896 | 1451 | 0 | 0 | 0 | 2383 | 4300 |
i = 4 | 403 | 833 | 0 | 0 | 50 | 150 | 4299 |
i = 5 | 649 | 1208 | 0 | 0 | 9 | 629 | 4300 |
i = 6 | 324 | 914 | 0 | 0 | 0 | 11 | 4300 |
i = 7 | 974 | 1369 | 0 | 8 | 90 | 2251 | 4300 |
Table A2.
Overpayment population average descriptive statistics per claim generated for payment population 2 and each overpayment scenario.
Scenario | Mean | Std. dev. | 2.5% | 25% | Median | 75% | 97.5% |
---|---|---|---|---|---|---|---|
i = 1 | 5667 | 4568 | 0 | 0 | 8951 | 9504 | 9992 |
i = 2 | 3344 | 4179 | 0 | 0 | 0 | 8410 | 9494 |
i = 3 | 5572 | 4577 | 0 | 0 | 8901 | 9484 | 9992 |
i = 4 | 3439 | 4220 | 0 | 0 | 0 | 8467 | 9966 |
i = 5 | 4505 | 4529 | 0 | 0 | 3433 | 9143 | 9990 |
i = 6 | 2254 | 3918 | 0 | 0 | 0 | 3569 | 9986 |
i = 7 | 6758 | 3943 | 0 | 3249 | 8849 | 9383 | 9991 |
Table A3.
Overpayment average descriptive statistics per claim for each stratum generated for payment population 1 and overpayment scenario 3.
Stratum | Mean | SD | 2.5% | 25% | Median | 75% | 97.5% |
---|---|---|---|---|---|---|---|
h = 1 | 52 | 51 | 0 | 6 | 40 | 78 | 347 |
h = 2 | 687 | 426 | 0 | 223 | 950 | 1000 | 1486 |
h = 3 | 1751 | 1033 | 0 | 860 | 2397 | 2500 | 2500 |
h = 4 | 2306 | 1345 | 0 | 1368 | 2900 | 3229 | 3500 |
h = 5 | 3010 | 1750 | 0 | 1857 | 3871 | 4200 | 4300 |
Funding Statement
This work was supported by the Texas State University 2019/2020 Faculty Development Leave and Presidential Research Leave Award. It was also partially supported by the National Science Foundation under Grant DMS-1638521 to the Statistical and Applied Mathematical Sciences Institute.
Disclosure statement
No potential conflict of interest was reported by the author(s).
References
1. Bastian N.D., Ekin T., Kang H., Griffin P.M., Fulton L.V. and Grannan B.C., Stochastic multi-objective auto-optimization for resource allocation decision-making in fixed-input health systems, Health Care Manage. Sci. 20 (2017), pp. 246–264.
2. Chattopadhyay B. and De S.K., Estimation of Gini index within pre-specified error bound, Econometrics 4 (2016), pp. 30.
3. CMS, Program memorandum carriers transmittal B-01-01, 2001. Available at rb.gy/cpolor (accessed 3 March 2019).
4. CMS, Basic Stand Alone (BSA) Medicare claims public use files (PUFs), 2010. Available at https://www.cms.gov/bsapufs (accessed 7 November 2019).
5. CMS, Medicare fee for service 2014 improper payments report, The Centers for Medicare & Medicaid Services, 2015. Available at https://go.cms.gov/2yt1Dzc (accessed 1 September 2019).
6. CMS, Annual Report to Congress on the Medicare and Medicaid integrity programs for fiscal years 2013 and 2014, The Centers for Medicare & Medicaid Services, 2016. Available at rb.gy/rcjhcf.
7. CMS, CMS Office of the Actuary releases 2018 national health expenditures, The Centers for Medicare & Medicaid Services, 2019. Available at https://www.cms.gov/newsroom/press-releases/cms-office-actuary-releases-2018-national-health-expenditures (accessed 7 December 2019).
8. Cochran W.G., Sampling Techniques, John Wiley & Sons, Hoboken, NJ, 2007.
9. Dalenius T. and Hodges Jr J.L., Minimum variance stratification, J. Am. Stat. Assoc. 54 (1959), pp. 88–101.
10. Darku F.B., Konietschke F. and Chattopadhyay B., Gini index estimation within pre-specified error bound: Application to Indian household survey data, Econometrics 8 (2020), pp. 26.
11. Dionne G., Giuliano F. and Picard P., Optimal auditing with scoring: Theory and application to insurance fraud, Manage. Sci. 55 (2009), pp. 58–70.
12. Edwards D., Ward-Besser G., Lasecki J., Parker B., Wieduwilt K., Wu F. and Moorhead P., The minimum sum method: A distribution-free sampling procedure for medicare fraud investigations, Health Serv. Outcomes Res. Methodol. 4 (2003), pp. 241–263.
13. Ekin T., An integrated decision-making framework for medical audit sampling, Proceedings of the 52nd Hawaii International Conference on System Sciences, Computer Society Press, 2019, pp. 4107–4114.
14. Ekin T., Statistics and Health Care Fraud: How to Save Billions, CRC Press, Boca Raton, FL, 2019.
15. Ekin T., Ieva F., Ruggeri F. and Soyer R., Statistical medical fraud assessment: Exposition to an emerging field, Int. Stat. Rev. 86 (2018), pp. 379–402.
16. Ekin T., Musal R.M. and Fulton L.V., Overpayment models for medical audits: Multiple scenarios, J. Appl. Stat. 42 (2015), pp. 2391–2405.
17. Faltin F., Kenett R.S. and Ruggeri F., Statistical Methods in Healthcare, John Wiley & Sons, West Sussex, UK, 2012.
18. GAO, Medicare fraud prevention: CMS has implemented a predictive analytics system, but needs to define measures to determine its effectiveness, United States Government Accountability Office, 2012. Available at http://www.gao.gov/products/GAO-13-104 (accessed 1 September 2019).
19. Ghosh M., Mukhopadhyay N. and Sen P.K., Sequential Estimation, Vol. 904, John Wiley & Sons, Hoboken, NJ, 2011.
20. Hidiroglou M.A. and Kozak M., Stratification of skewed populations: A comparison of optimisation-based versus approximate methods, Int. Stat. Rev. 86(1) (2018), pp. 87–105.
21. Ignatova I. and Edwards D., Probe samples and the minimum sum method for Medicare fraud investigations, Health Serv. Outcomes Res. Methodol. 8 (2008), pp. 209–221.
22. Iyengar V.S., Hermiz K.B. and Natarajan R., Computer-aided auditing of prescription drug claims, Health Care Manage. Sci. 17 (2014), pp. 203–214.
23. Keskintürk T. and Er Ş., A genetic algorithm approach to determine stratum boundaries and sample sizes of each stratum in stratified sampling, Comput. Stat. Data Anal. 52 (2007), pp. 53–67.
24. Lavallée P. and Hidiroglou M., On the stratification of skewed populations, Surv. Methodol. 14 (1988), pp. 33–43.
25. Lindley D.V., On a measure of the information provided by an experiment, Ann. Math. Stat. 27 (1956), pp. 986–1005.
26. Mukhopadhyay N. and De Silva B.M., Sequential Methods and their Applications, CRC Press, Boca Raton, FL, 2008.
27. Musal R.M. and Ekin T., Medical overpayment estimation: A Bayesian approach, Stat. Model. 17 (2017), pp. 196–222.
28. Musal M. and Ekin T., Information-theoretic multistage sampling framework for medical audits, Appl. Stoch. Models Bus. Ind. 34 (2018), pp. 893–907.
29. Neyman J., On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection, J. R. Stat. Soc. 97 (1934), pp. 558–625.
30. OIG, Rat-Stats statistical software, 2019. Available at http://oig.hhs.gov/compliance/rat-stats/index.asp (accessed 30 April 2019).
31. Phua C., Lee V., Smith K. and Gayler R., A comprehensive survey of data mining-based fraud detection research, preprint (2010). Available at arXiv:1009.6119.
32. Sahin Y., Bulkan S. and Duman E., A cost-sensitive decision tree approach for fraud detection, Expert Syst. Appl. 40 (2013), pp. 5916–5923.
33. Shannon C., A mathematical theory of communication, Bell Syst. Tech. J. 27 (1948), pp. 623–656.
34. Soofi E.S., Capturing the intangible concept of information, J. Am. Stat. Assoc. 89 (1994), pp. 1243–1254.
35. Torgo L. and Lopes E., Utility-based fraud detection, Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, Vol. 2, AAAI Press, 2011, pp. 1517–1522.
36. Ulvila J.W. and Gaffney J.E., A decision analysis method for evaluating computer intrusion detection systems, Decis. Anal. 1 (2004), pp. 35–50.
37. Woodard B., Fighting healthcare fraud with statistics, Significance 12 (2015), pp. 22–25.
38. Zhang H., Wernz C. and Hughes D., Modeling and designing health care payment innovations for medical imaging, Health Care Manage. Sci. 21 (2018), pp. 37–51.