Author manuscript; available in PMC: 2020 Sep 1.
Published in final edited form as: Health Care Manag Sci. 2018 Aug 26;22(3):489–511. doi: 10.1007/s10729-018-9455-5

Measuring efficiency of community health centers: A multi-model approach considering quality of care and heterogeneous operating environments

Ronald G McGarvey a,b,*, Andreas Thorsen c, Maggie L Thorsen d, Rohith Madhi Reddy a
PMCID: PMC6391222  NIHMSID: NIHMS987049  PMID: 30145727

Abstract

Over 1,300 federally-qualified health centers (FQHCs) in the US provide care to vulnerable populations in different contexts, addressing diverse patient health and socioeconomic characteristics. In this study, we use data envelopment analysis (DEA) to measure FQHC performance, applying several techniques to account for both quality of outputs and heterogeneity among FQHC operating environments. To address quality, we examine two formulations: the Two-Model DEA approach of Shimshak and Lenard (denoted S/L), and a variant of the Quality-Adjusted DEA approach of Sherman and Zhu (denoted S/Z). To mitigate the aforementioned heterogeneities, a data science approach utilizing latent class analysis (LCA) is conducted on a set of metrics not included in the DEA, to identify latent typologies of FQHCs. Each DEA quality approach is applied in both an aggregated case (including all FQHCs in a single DEA model) and a partitioned case (solving a DEA model for each latent class, such that an FQHC is compared only to its peer group). We find that the efficient frontier for the aggregated S/L approach disproportionately included smaller FQHCs, whereas the aggregated S/Z approach’s reference set included many larger FQHCs. The partitioned cases revealed that the S/L and S/Z aggregated models each disproportionately disfavored (different) members of certain classes with respect to efficiency scores. Based on these results, we provide general insights into the tradeoffs of using these two models in conjunction with a clustering approach such as LCA.

Keywords: Data envelopment analysis, Data science, Latent class analysis, OR in health services, Public sector OR

1. Introduction

Charnes, Cooper and Rhodes [1] introduced Data Envelopment Analysis (DEA) as a means for comparing the relative efficiencies of different decision making units (DMUs) across multidimensional sets of inputs consumed and outputs produced. DEA concepts have subsequently been used to evaluate the performance of many different types of organizations, considering DMUs as varied as public schools [2] and healthcare providers [3]. Such applications of DEA evaluate the management of an organization (the aforementioned DMU) by comparing its efficiency to that of its peers, typically with the objective of identifying a subset of high-performing organizations whose practices might be adopted by others (although DEA does not provide the means for identifying which practices of high-performing organizations are the ones responsible for its success relative to its peers).

For meaningful interpretations to be drawn from a DEA application, the DMUs under consideration should be sufficiently similar to warrant comparison. The potential impact of scale economies was recognized early in the history of DEA research as one factor that might unduly penalize very small (or very large) DMUs, independent of their managers’ relative efficiency. Banker, Charnes and Cooper [4] developed an alternative DEA formulation that allows for economies of scale to be reflected in the efficiency calculation for any DMU, addressing issues related to the relative size of a DMU (where size can be defined in terms of the inputs consumed or the outputs produced). However, it is possible that a pair of DMUs might be so dissimilar in dimensions unrelated to size (e.g., the socio-economic characteristics of schools’ student bodies, or differing regulatory environments for electricity providers in different states) that any attempt to compare them based solely on a notion of efficiency relating inputs to outputs would systematically advantage one DMU relative to the other.

Consider the evaluation of community health centers. In the United States, there are over 1,300 federally-qualified health centers (FQHCs) providing medical services to nearly 23 million medically under-served and uninsured Americans [5]. FQHC patients are predominately low-income, with 92% of patients served in Fiscal Year 2014 earning 200% or less of the federal poverty threshold [5]. FQHCs provide services that address disparities across a variety of health outcomes, including pregnancy and childbirth; according to Rosenbaum [6], FQHCs account for 10% of low-income births in the United States.

When evaluated from an efficiency standpoint, in the aggregate, FQHCs have struggled to meet targeted output goals. According to the Health Resources and Services Administration annual performance report [5], in Fiscal Year 2014, FQHCs failed to meet overall goals with respect to number of patients served and annual increase in cost per patient served. Of course, production outputs are not the only relevant metrics for evaluating FQHC performance. From a quality perspective, this annual performance report shows that aggregate targets were satisfied with respect to health outcomes such as percent of low birth weight deliveries (7.3%) and quality of care metrics such as percent of pregnant patients beginning prenatal care in the first trimester (72%).

DEA could provide a means to evaluate the efficiency of individual FQHCs, as one component of a broader effort to improve the returns on government investment in health centers. However, while FQHCs target disadvantaged populations who are generally at a higher risk for a variety of health problems, there are significant differences across centers. An FQHC serving low-income patients in a high-density urban location such as New York City may face different challenges (e.g., variations in sociodemographic characteristics, socioeconomic status, and health status of patients served) than an FQHC serving low-income patients in rural Montana. These differences may constrain the potential efficiency that one FQHC can achieve relative to another. Prior DEA studies have found FQHC technical efficiency to be related to such contextual factors as the percent of the population on Medicaid and Medicare, and percent Hispanics living in the service region [7]. Other researchers [8] have found that FQHCs with low efficiency are more likely to be in urban locations and to be located in counties with higher median income. They also find that low-efficiency FQHCs serve fewer children and elderly patients and serve more uninsured patients. Given these findings, a single DEA performed over the set of all FQHCs might present a biased evaluation of health centers by drawing comparisons between such distinct settings. In support of this claim, VanderWielen and Ozcan [9] state, in reference to efficiency evaluation of free clinics, “…it is clear that each clinic provides care to a unique population that likely affects clinic performance” (p. 483).

In this research, we propose to mitigate the impact of such heterogeneity across FQHCs on DEA efficiency scores for pregnancy-related outcomes by utilizing Latent Class Analysis (LCA). LCA, a machine learning technique originally developed by Lazarsfeld and Henry [10] and commonly used in the social sciences, is used to first classify DMUs according to a set of underlying characteristics. A separate DEA can then be performed for each class, utilizing a set of input and output metrics that is disjoint from the characteristics used to identify class membership. Such an approach should allow for comparison of a DMU to a subset of organizations more closely resembling a peer group. This approach is similar to the prior analyses of [11] and [12], each of whom used a two-stage LCA-DEA approach to examine the efficiency of electricity providers. From a methodological perspective, this paper differs from those studies in that technical efficiency is not the only output of interest; quality measures must also be considered when evaluating healthcare providers. In this research, we apply and compare two methods for incorporating quality outputs into the DEA model. First, the Two-Model approach of Shimshak and Lenard [13] is utilized for efficiency evaluation, in an attempt to prevent FQHCs with high production outputs but poor output quality from entering the efficient reference set. As an alternative, we apply the Quality-Adjusted DEA approach of Sherman and Zhu [14] to remove low-quality DMUs from the efficient reference set.

The case study described in this paper makes several contributions to the literature on DEA, data science, and health services delivery. This is the first paper to use a LCA-DEA framework in an application where quality outputs are critical to DMU evaluation. More generally, this paper provides the first DEA evaluation of FQHC performance using any clustering approach (not only LCA) to ensure that each FQHC is only compared against other FQHCs that provide care to similar populations. This paper also contrasts the results of two published approaches to control for quality outputs in DEA models, and finds that one approach [13] can lead to the somewhat counterintuitive result that the efficiency score for a DMU can decrease when it is evaluated with respect to a subset of the original DMU set. We discuss the similarities and differences between the efficient reference sets from each model with respect to the LCA and DEA measures. Our results demonstrate the importance of drawing comparisons between peer organizations when measuring FQHC performance, and the utility of taking a partitioned DEA approach, with some classes of FQHCs being disproportionately disfavored by the Two-Model approach and other classes disproportionately disfavored by the Quality-Adjusted DEA approach.

The remainder of this paper is organized as follows. Section 2 contains a literature review of related DEA research, and provides an overview of LCA techniques. Section 3 provides a description of the data collection procedures utilized for this evaluation of pregnancy-related healthcare provision at FQHCs. Section 4 presents the computational results and discussion, contrasting the results of a single DEA over the set of all FQHCs (referred to as the aggregated case) with our classification-based approach (referred to as the partitioned case). Section 5 then provides concluding remarks and suggestions for further research.

2. Literature review

2.1. DEA for FQHCs

Many papers have been published applying DEA to measure the efficiency of health care organizations, and this area was identified as one of DEA’s most popular application areas in a recent survey [15]. Several reviews of efficiency measurement in health care, with a focus on DEA, have been written (e.g., [16], [17], [18]), as well as entire books and book chapters (e.g., [19], [20]), along with papers examining particular types of service providers, such as medical home hospitals [21] or free clinics [22]. More recent papers have examined aspects of health systems using Dynamic DEA (e.g., [23], [24]). One difference between DEA applied to healthcare organizations and DEA in other application areas is that in healthcare, quality is at least as important as cost, so it is important to account for quality in some fashion when performing the DEA analysis (e.g., [25], [26], [27]). Therefore, in this paper, we consider several approaches for including quality in the models.

There have been recent papers using the DEA methodology that focus specifically on FQHCs. Rahman and Capitman [28] examined factors that affected efficiency of FQHCs and rural health clinics in the San Joaquin Valley, California. Marathe et al. [7] found technical efficiency of FQHCs to be related to such contextual factors as the percent of the population on Medicaid and Medicare, and percent Hispanics living in the service region. A limitation of both of the previous studies was that quality of care was not considered. Amico et al. [8] examine the relationship between grant revenues and FQHC efficiencies. While no quality measures were used in their DEA models, the authors included one quality measure (proportion of patients who had access to prenatal care in the first trimester) as a covariate in their multivariate regression analyses.

While the literature on DEA applications in healthcare organizations is extremely large, it is not without limitations. Hollingsworth and Street [29] suggest that healthcare policy makers are wary of efficiency analyses, such as DEA, because of distrust in the reliability of the method. When comparing the efficiency estimate of a DMU using DEA to the same DMU using other efficiency measurement methods, such as stochastic frontier analysis (SFA), there can be large differences, and those efficiency estimates are sensitive to the particular formulation chosen by the modeler [29]. For instance, Giuffrida and Gravelle [30] find low correlations between DEA efficiencies and SFA in a study of primary care providers. In contrast, Bastian et al. [31] evaluate hospital efficiency using DEA and SFA and find congruency in direction between the two techniques. Indeed, the literature has mixed results when comparing DEA to regression-based methods such as SFA. In this paper, we compare and contrast several formulations for evaluating the efficiency of FQHCs using the DEA approach.

2.2. LCA overview

Latent class analysis (LCA) is a clustering methodology widely used in the social, behavioral, and health sciences [32], [33], and is a special case of the broader family of finite mixture models. This statistical method identifies unobserved, latent subgroups within populations based on observed variables. In the current study, observed indicators of the sociodemographic characteristics of the patient population served by health centers, and of the regional setting in which FQHCs operate, are used to establish the existence of common, latent compositional groups of FQHCs, which may be associated with distinct organizational and patient experiences of health care. Such an approach considers a number of factors that may be associated with the health and health care of FQHC populations broadly, not just prenatal experiences specifically.

This model-based clustering approach estimates parameters iteratively by maximizing the likelihood function. Given a specified model and data set, the likelihood function gives the probability of the observed empirical data, conditional on the parameter estimates of the model [32], [34]. A major advantage of this model-based clustering approach is the flexibility with which it handles observed variable distributions, as no decisions must be made about the scaling or functional form of observed variables [35]. Further, LCA is a probabilistic clustering approach that accounts for uncertainty in the designation of class membership. In practice, individuals are assigned to the latent class for which they have the highest posterior probability of membership, using the classify-analyze approach [36].
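The posterior class-membership step described above can be sketched directly from Bayes' rule. The function below is an illustrative, minimal version of the LCA measurement model for binary indicators (conditionally independent given class), not the software used by the authors; all names are assumptions.

```python
def lca_posteriors(priors, cond_probs, response):
    """Posterior class-membership probabilities for one observation via
    Bayes' rule, assuming conditionally independent binary indicators
    (the standard LCA measurement model).

    priors[k]        -- estimated prevalence of latent class k
    cond_probs[k][j] -- P(indicator j = 1 | class k)
    response[j]      -- observed 0/1 value of indicator j
    """
    joint = []
    for k, pi_k in enumerate(priors):
        lik = 1.0
        for j, y in enumerate(response):
            p = cond_probs[k][j]
            lik *= p if y == 1 else (1.0 - p)
        joint.append(pi_k * lik)  # prior times class-conditional likelihood
    total = sum(joint)
    return [v / total for v in joint]  # normalize to sum to one
```

Under the classify-analyze approach, each DMU would then be assigned to the class with the largest posterior probability.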

Of primary importance in LCA is model selection: determining the optimal number of latent classes. The selection of the latent class model, with a specified number of classes, is not determined a priori but rather relies on several measures of model fit. According to simulation studies, information criteria (the Bayesian Information Criterion (BIC), Sample-Adjusted Bayesian Information Criterion (SABIC), and Akaike Information Criterion (AIC)) and likelihood-based tests (Lo-Mendell-Rubin) should be jointly considered when determining the optimal number of latent classes [37]. Entropy, a measure indicating how unambiguously individual cases can be classified, also aids in model selection. Researchers must also consider the substantive interpretability of the model and strive for parsimony when evaluating and selecting models [32].
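The information criteria named above are simple functions of the maximized log-likelihood, the number of free parameters, and the sample size. The sketch below uses the standard textbook formulas (with the usual SABIC sample-size adjustment of n* = (n + 2)/24); it is illustrative, not the authors' code.

```python
import math

def information_criteria(log_lik, n_params, n_obs):
    """AIC, BIC, and sample-size-adjusted BIC (SABIC) from a fitted
    model's maximized log-likelihood. Lower values indicate better
    relative fit; each criterion penalizes complexity differently."""
    aic = -2.0 * log_lik + 2.0 * n_params
    bic = -2.0 * log_lik + n_params * math.log(n_obs)
    # SABIC replaces n with (n + 2) / 24 in the BIC penalty term
    sabic = -2.0 * log_lik + n_params * math.log((n_obs + 2.0) / 24.0)
    return aic, bic, sabic
```

Because the penalty terms differ, the three criteria can rank candidate class solutions differently, which is one reason fit measures in Table 1 need not agree.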

3. Data collection process

Data for this study include FQHC operational data, patient population data, and data on the region served by each FQHC. The data used in the DEA consisted of FQHC operational data, while the data used in the LCA included both patient population data and regional data. Data sources were at both the FQHC level and the regional level; all regional-level data were converted to the FQHC level.

Data from the 2015 Uniform Data System (UDS) included 1,375 health centers. FQHCs outside of the 50 US states and FQHCs that served fewer than 100 total patients were excluded from the LCA (leaving n = 1,331). An additional 220 FQHCs that served five or fewer prenatal patients in 2015 or had zero Medical Full-Time Equivalents (FTEs) were excluded from the DEA models, yielding a final analytic sample of 1,111 FQHCs.

3.1. DEA data

Data on DEA inputs and outputs come from the 2015 UDS. DEA outputs include what we refer to as production outputs (annual number of prenatal patients served and annual number of prenatal patients who delivered) and quality outputs (percentage of prenatal patients who receive care in first trimester and percentage of low birth weight births (LBW)). These counts of prenatal patients who delivered include those who deliver at a location other than the FQHC; since these deliveries are still the result of the cumulative care received by the prenatal patient through the health center, they are therefore considered to be a production output of the FQHC. DEA inputs include the following ten variables:

  • Total non-patient revenue (in US Dollars): This includes total federal grants, total non-federal grants and contracts, and other non-patient related revenue (e.g., donations, interest income, and rent from tenants). This variable is used as a proxy for inputs bought by the grants/contracts.

  • Nine major service staffing categories (in FTEs): Primary Care Physician, Non-Primary Care Physician, Nurse Practitioner/Physician Assistant/Certified Nurse Midwife, Medical (e.g., nurses, medical technicians), Dental, Mental Health, Substance Abuse, Vision, Enabling. In addition to medical care, other health services have been linked with pregnancy-related health outcomes including dental care [38], substance use treatment [39], mental health care [40], vision [41] and enabling services such as transportation [42]. By including the distribution of these staff at FQHCs in the DEA model, we capture multiple dimensions of care that contribute to better health outcomes. FQHCs with five or fewer Medical FTEs were excluded from this analysis (n = 23).

The DEA model presented in this paper calculates the efficiency of the provision of a single health service (related to pregnancy) at FQHCs, using managerial inputs and clinical outputs. As prescribed by the authors of [43], statistical analyses were performed to ensure that an isotonic relationship exists between inputs and outputs. This analysis found a positive correlation that was significant at the p=0.001 level between each input-production output pair (these correlation coefficients appear in Table A.1 in Appendix A), with the sole exception of Substance Abuse FTE, which exhibited a positive, but not significant, correlation with each output; we retain this input in our DEA given the strong relationship known to exist between substance abuse treatment and pregnancy-related outcomes [39]. The DEA inputs that are included relate to all of the services provided at the FQHCs, not only those for prenatal services. The implications of this issue are discussed in detail in section 5.1.

Table A.2 in Appendix A presents descriptive statistics for this set of DEA input data.

3.2. LCA data

For the LCA, data on FQHC patient demographics come from the 2015 UDS. These data were obtained at the FQHC level, and include the following seven measures: patient age (% of patients who are children, % of patients who are elderly), % of patients who are non-White, % of patients best served in another language, % of patients in poverty (at or below 100% of the federal poverty line), patient insurance status (% of uninsured patients, % of patients on Medicaid, % of patients on Medicare), total number of patients served at the FQHC, and the percent of patients served who were prenatal patients.

Regional data come from the US Census American Community Survey (ACS; 2010–2014) and Behavioral Risk Factor Surveillance System (BRFSS; 2009–2012). These data were obtained at the zip code tabulation area (ZCTA) level. ZCTAs are geographic approximations of US Postal Service ZIP code service areas, since ZIP codes represent postal routes rather than geographic areas. In addition, the UDS data included information on the number of patients served by each FQHC at the ZCTA level. These data were used together to convert the ZCTA-level survey data to FQHC-level data by computing population-weighted averages of the variables of interest across all ZCTAs served by each FQHC. For example, to calculate the percent of the population on Medicaid for the region served by an FQHC, we sum the number of people on Medicaid across all ZCTAs served by that FQHC and divide by the total population of those ZCTAs. This approach yields better measures of FQHC regional populations than alternatives such as using data on the county where the FQHC is located. Because many FQHCs serve patients outside of their county, and the demographic distribution within a county may be variable, using information at the ZCTA level rather than the county level better captures the population living in the service region.
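The Medicaid example above amounts to pooling counts over the service region. A minimal sketch of this aggregation step (function and variable names are our own, for illustration only):

```python
def fqhc_regional_rate(served_zctas, zcta_counts, zcta_pops):
    """Convert ZCTA-level data to an FQHC-level measure: total count
    (e.g., people on Medicaid) across all ZCTAs in the FQHC's service
    region, divided by the total population of those ZCTAs. This is a
    population-weighted regional rate, not a simple mean of ZCTA rates."""
    total_count = sum(zcta_counts[z] for z in served_zctas)
    total_pop = sum(zcta_pops[z] for z in served_zctas)
    return total_count / total_pop
```

Note that pooling counts before dividing weights each ZCTA by its population, so large ZCTAs influence the regional measure more than small ones.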

In summary, the following four regional measures were used in the LCA: % of non-White population in the FQHC service region; % of the region living in poverty; insurance status of the regional population (% uninsured, % on Medicaid); and % of the service region population with no usual source of health care. Two measures of urbanicity were also included: service region population density (population per square mile) and a binary indicator of the US Census designation of urban/rural status (0 “rural”, 1 “urban”). Both the continuous measure of population density and the total number of FQHC patients were transformed to adjust for skewness by taking their natural log.

Table A.3 in Appendix A presents descriptive statistics for this set of LCA data across the restricted set of 1,111 FQHCs included in the DEA models.

4. Computational results and discussion

In this section we first present results from the latent class analysis. We then describe the DEA models and the procedure for implementing them, and present the results. In all DEA models, an input-oriented variable returns-to-scale (VRS) model was used, because FQHCs are assumed to have greater control over inputs such as labor than over the output measures, and this approach allows the model to capture the potential influence of nonlinear scale economies (i.e., an increase in inputs may not result in a proportional increase in outputs). Similar assumptions were made in previous FQHC-related research (e.g., [7], [8], [28]).
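For concreteness, the input-oriented VRS (BCC) model for a DMU can be solved as a linear program in envelopment form: minimize theta subject to the convex combination of peers using no more than theta times the DMU's inputs while producing at least its outputs, with the intensity weights summing to one. The sketch below (not the authors' implementation) uses scipy.optimize.linprog; all names are our own.

```python
import numpy as np
from scipy.optimize import linprog

def vrs_input_efficiency(X, Y, j0):
    """Input-oriented VRS (BCC) efficiency of DMU j0 via the envelopment
    LP: min theta s.t. sum_j lam_j * x_j <= theta * x_j0,
    sum_j lam_j * y_j >= y_j0, sum_j lam_j = 1, lam >= 0.
    X is (n_dmus, n_inputs); Y is (n_dmus, n_outputs)."""
    n, m = X.shape
    s = Y.shape[1]
    c = np.zeros(n + 1)
    c[0] = 1.0  # decision vector is [theta, lam_1, ..., lam_n]; minimize theta
    # input rows: sum_j lam_j x_ij - theta * x_i,j0 <= 0
    A_in = np.hstack([-X[j0].reshape(m, 1), X.T])
    b_in = np.zeros(m)
    # output rows: -sum_j lam_j y_rj <= -y_r,j0
    A_out = np.hstack([np.zeros((s, 1)), -Y.T])
    b_out = -Y[j0]
    # VRS convexity constraint: sum_j lam_j = 1
    A_eq = np.hstack([[0.0], np.ones(n)]).reshape(1, n + 1)
    res = linprog(c, A_ub=np.vstack([A_in, A_out]),
                  b_ub=np.concatenate([b_in, b_out]),
                  A_eq=A_eq, b_eq=[1.0],
                  bounds=[(None, None)] + [(0, None)] * n,
                  method="highs")
    return res.fun  # optimal theta; 1.0 means j0 is on the VRS frontier
```

Solving this LP once per FQHC, with the ten inputs of Section 3.1 and either the production or the quality outputs, yields the efficiency scores discussed below.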

4.1. LCA results

A latent class analysis was performed using thirteen measures of the FQHC regional and patient population characteristics, estimating solutions having between two and seven classes. When comparing model fit statistics and likelihood-based tests, the five-class solution was determined to be the preferred solution, as it fit the data well, significantly improved model fit over alternative solutions, and had a clear interpretation (see Table 1 for a comparison of model fit solutions). Entropy, a measure of disambiguation between classes ranging between 0 and 1, was highest for the four-class solution; it declined slightly for the five-, six-, and seven-class solutions but remained high, indicating a clear distinction between classes. The three information criteria (AIC, BIC, SABIC) declined as the number of classes increased, and were lowest for the seven-class solution, indicating that this was the ideal solution according to these measures. The Lo-Mendell-Rubin test indicates whether a solution with k classes is a significant improvement in model fit over a solution with k-1 classes. This test was significant for solutions with two through five classes, indicating that the five-class solution was best. As this is an exploratory analysis, it is common in LCA for fit measures to disagree, as they capture different aspects of how well the model approximates the data. By performing this analysis across a range of possible class counts, we are able to make comparisons based both on statistical measures and on substantive theory. The process of assessing relative model fit involves striking the right balance between fitting the model to the data and parsimony [32]. When evaluating model fit and interpretability, the five-class solution was determined to best fit the data and was established as the preferred solution. Model solutions with more classes (e.g., the seven-class solution) tend to split up classes identified under the five-class solution in ways that we believe are unnecessarily complicated and add no substantive value. Further, some research suggests that information criteria, such as AIC and BIC, may exhibit an inflated preference for more complex models, particularly with large sample sizes [44].

Table 1:

Comparing Fit Statistics of different Latent Class solutions of FQHCs

Number of Latent Classes   AIC   BIC   Adjusted BIC   Entropy   Lo-Mendell-Rubin Adjusted LRT value   p-value

2 −25220.4 −24955.5 −25117.5 0.886 3625.01 0.000
3 −26636.1 −26277.7 −26496.9 0.887 1440.57 0.001
4 −28048.3 −27596.5 −27872.8 0.903 1412.61 0.000
5 −28854.0 −28308.7 −28642.2 0.894 1328.25 0.009
6 −29453.9 −28815.1 −29205.8 0.885 630.97 0.284
7 −29927.5 −29195.2 −29643.1 0.891 519.33 0.282

Table 2 presents information on the FQHC regional and patient characteristics used to generate the five latent classes. The first column displays the overall means of the observed variables for the full sample of FQHCs; subsequent columns present the means by latent class. The first two classes are comprised of FQHCs primarily in rural areas (class 1: 12% urban, class 2: 7% urban) with low population densities (class 1: 146 individuals/sq. mile, class 2: 185 individuals/sq. mile). While both classes have similar regional characteristics, the patients served by FQHCs in class 1 are significantly more likely to be non-white, children, living in poverty, and on Medicaid or uninsured compared to patients served by the FQHCs in class 2. Therefore, we label class 1 More Diverse, Rural Poor (27% of the sample). Compared to the other latent classes, the patients served by FQHCs in class 2 were more likely to be older, white patients on Medicare. Given these characteristics, we label this class of FQHCs Older Rural Whites. Two distinct urban classes emerge in our LCA. Class 3 is comprised of FQHCs located in urban areas with a moderate population density, a low rate of poverty, and a low rate of medically uninsured individuals in the region compared to other latent classes. Despite the relative socioeconomic advantage of the regions in which these FQHCs are located, the patients served by FQHCs in class 3 are themselves disadvantaged (e.g., in poverty, racial minorities, on Medicaid). FQHCs in this class are also large (over 26,000 annual patients on average), with a significantly higher percentage of children among patients compared to other classes. Thus, we label this class of FQHCs Large Urban Serving Poor in Low Poverty Area (31% of the sample). Class 4 comprises FQHCs situated in densely populated areas, with over 10,000 individuals per square mile on average. The regions in which these health centers are located, as well as the patients served by these FQHCs, have high rates of poverty, racial minorities, medically uninsured individuals, and individuals who lack access to a usual source of medical care. Given these characteristics, we label this group Dense Urban Poor Racial Minorities (14% of the sample). We label the fifth and final class of FQHCs Uninsured Patients, given their distinctively high rate of patients who lack medical insurance (60.8% on average). The regional characteristics of FQHCs in class 5 largely approximate the averages for the full sample, with about half of the health centers residing in urban areas. A large proportion of patients served by class 5 FQHCs live in poverty and are racial minorities. This class also has the highest rate of patients best served in a language other than English, compared to the other classes of FQHCs.

Table 2.

FQHC patient and regional characteristics, means by latent class membership

Full Sample Class 1 Class 2 Class 3 Class 4 Class 5
More Diverse, Rural Poor Older Rural Whites Large Urban Serving Poor in Low Poverty Area Dense Urban Poor Racial Minorities Uninsured Patients

Regional Characteristics
    Population Density (pop/sq. mile) 2,135 146 185 1,492 10,013 1,120
    Proportion Urban 45% 12% 7% 70% 75% 55%
    % non-white 0.279 0.180 0.162 0.293 0.516 0.295
    % in poverty 0.177 0.173 0.175 0.155 0.238 0.177
    % on Medicaid 0.249 0.231 0.243 0.232 0.362 0.217
    % uninsured 0.123 0.122 0.111 0.098 0.160 0.155
    % with no usual source of care 0.199 0.189 0.177 0.177 0.253 0.230
Patient Characteristics
    Total number of patients 17,926 13,360 10,641 26,544 20,259 11,840
    % patients who are children 0.264 0.265 0.185 0.321 0.272 0.199
    % patients who are elderly 0.089 0.103 0.208 0.059 0.057 0.059
    % non-white patients 0.545 0.298 0.180 0.678 0.857 0.711
    % of patients best served in another language 0.170 0.058 0.031 0.233 0.250 0.275
    % of patients in poverty 0.669 0.604 0.476 0.719 0.779 0.734
    % of uninsured patients 0.273 0.232 0.144 0.216 0.234 0.608
    % of patients on Medicaid 0.432 0.371 0.269 0.581 0.595 0.206
    % of patients on Medicare 0.101 0.124 0.231 0.072 0.062 0.053
    % prenatal patients 0.017 0.011 0.005 0.025 0.019 0.015

N 1,331 358 165 418 191 199
Proportion 100% 26.9% 12.4% 31.4% 14.4% 15.0%

Considering only the 1,111 FQHCs included in our DEA analysis, there were 297, 119, 391, 157 and 147 FQHCs for which DEA efficiency scores were computed, across LCA classes 1, 2, 3, 4 and 5, respectively.

4.2. Aggregated S/L approach

4.2.1. Description of the procedure

We first ran the aggregated procedure, using the Two-Model approach of Shimshak and Lenard [13]. Under this approach, which builds on the research of Chilingerian and Sherman [45], one solves a single DEA over the set of all FQHCs for the production outputs, and then solves a single DEA over the set of all FQHCs for the quality outputs, with the same set of inputs used in each case. Refer to the efficiencies obtained by these two model runs as ProdEff_0 and QualEff_0, respectively. In order to ensure that only FQHCs with acceptable quality outputs are included in the efficient reference set for production outputs, all FQHCs with a production efficiency of one and an unacceptably low QualEff_0 score need to be removed from the analysis, and then the production output model is run again to obtain a new production efficiency for the remaining FQHCs (those that have been removed retain their production efficiency score of one, but are not included in the efficient reference set in the next iteration). The process iterates as many times as are necessary until all FQHCs in the efficient reference set for production outputs have an acceptably high QualEff_0 value; let ProdEff_* denote the production efficiency scores obtained at the end of this iterative process. The entire process is then repeated so that only FQHCs with acceptable production outputs are included in the efficient reference set for quality outputs. Here, all FQHCs with a quality efficiency of one and an unacceptably low ProdEff_0 score are removed from the analysis, and the quality output model is run again to obtain new quality efficiency scores for the remaining FQHCs, with this process also repeated as many times as are necessary until all FQHCs in the efficient reference set for quality outputs have an acceptably high ProdEff_0 score; let QualEff_* denote the quality efficiency scores obtained at the end of this iterative process. 
Algorithm 1 and Algorithm 2 in Appendix B present the S/L procedures used to generate the ProdEff_* and QualEff_* scores, respectively.
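
The removal-and-rerun logic above can be sketched compactly. The following is a minimal, runnable illustration, not the paper's implementation: a hypothetical one-input/one-output toy DEA (under constant returns to scale, efficiency reduces to a productivity ratio against the best peer in the reference set) stands in for the full linear-programming model, and `sl_production_scores` mirrors the loop used to obtain ProdEff_*.

```python
# Minimal sketch of the S/L removal-and-rerun loop for ProdEff_*.
# Assumption: a toy one-input/one-output CRS DEA stands in for the full
# LP-based model, so "efficiency" is productivity relative to the best peer.

def toy_dea(x, y, reference):
    """Toy CCR efficiency for one input and one output."""
    best = max(y[j] / x[j] for j in reference)
    return {j: (y[j] / x[j]) / best for j in reference}

def sl_production_scores(x, y_prod, qual_eff0, threshold=0.95):
    """Iteratively drop production-efficient units with QualEff_0 < threshold."""
    reference = set(range(len(x)))
    frozen = {}  # removed units retain their production efficiency of 1.0
    while True:
        scores = toy_dea(x, y_prod, reference)
        low_quality = {j for j in reference
                       if scores[j] >= 1.0 - 1e-9 and qual_eff0[j] < threshold}
        if not low_quality:  # frontier now contains only high-quality units
            return {**frozen, **scores}
        for j in low_quality:
            frozen[j] = 1.0
            reference.remove(j)

# Illustrative data: unit 0 is the most productive but has low QualEff_0,
# so it is removed and unit 1 defines the frontier for the remaining units.
x = [1.0, 1.0, 1.0]
y_prod = [10.0, 8.0, 4.0]
qual_eff0 = [0.50, 0.99, 0.80]
print(sl_production_scores(x, y_prod, qual_eff0))
```

The QualEff_* loop (Algorithm 2) is symmetric, swapping the roles of the production and quality scores.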

A challenge for any implementation of the Shimshak and Lenard approach (denoted hereafter as S/L) is the identification of acceptability thresholds for the QualEff_0 and ProdEff_0 scores. In the 2017 HRSA Annual Performance Report [5], target values were identified for Fiscal Year (FY) 2015 quality of care and health outcomes. From a quality of care perspective, the target value for “Percentage of pregnant Health Center patients beginning prenatal care in the first trimester” was 66% (hereafter, we refer to this measure as Quality_Access). From a health outcomes perspective, the target value for “Rate of births less than 2500 grams (low birth weight) to prenatal Health Center patients compared to the national low birth weight rate” was “5% below national rate”. The National Center for Health Statistics [46] reports that 8.07% of babies born in 2015 were born low birth weight; applying this HRSA target to the national rate gives a target value of 0.95*0.0807=7.67% for the low birth weight rate. Because DEA requires that outputs be scaled such that larger values denote better performance, we utilize one minus the low birth weight rate (i.e., the rate of non-low-birth-weight births) as our low birth weight index (hereafter, we refer to this measure as Quality_LBWI); the target value for Quality_LBWI is thus 92.33%.
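
The target arithmetic above can be verified directly; the following worked check simply reproduces the figures cited:

```python
# Worked check of the HRSA-derived targets used for Quality_LBWI.
national_lbw_rate = 0.0807                    # NCHS: 8.07% low birth weight in 2015
target_lbw_rate = 0.95 * national_lbw_rate    # HRSA target: 5% below national rate
quality_lbwi_target = 1.0 - target_lbw_rate   # larger-is-better index for DEA
print(round(target_lbw_rate, 4), round(quality_lbwi_target, 4))  # 0.0767 0.9233
```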

An examination of the QualEff_0 scores relative to these targets found that a threshold of QualEff_0 = 0.95 worked well to maintain an acceptably high average performance. There were 52 FQHCs with a QualEff_0 score of 0.95 or greater; across this subset of centers, the average performance was Quality_Access = 86% and Quality_LBWI = 92.92%, both of which exceed the target values. There were 1,059 FQHCs with a QualEff_0 score of less than 0.95; across this subset of centers, the average performance was Quality_Access = 75% and Quality_LBWI = 91.51%, so the Quality_LBWI target is not satisfied. Lacking objective criteria for identifying a threshold for ProdEff_0, we utilized an identical threshold value of ProdEff_0 = 0.95; there were 113 FQHCs that satisfied this threshold.

4.2.2. Description of the results and efficient reference sets

Using these threshold values, the aggregated S/L procedure was applied to identify a ProdEff_* and a QualEff_* score for each FQHC; Figure 1 presents a scatterplot of these efficiency scores. Table 3 presents summary statistics of these ProdEff_* and QualEff_* scores, by class.

Fig. 1. Production efficiency versus quality efficiency, aggregated S/L procedure

Table 3.

Production efficiency and quality efficiency, by class, aggregated S/L procedure

Class Efficiency metric Mean Standard deviation Min Q1 Median Q3 Max
1 ProdEff_* 0.481 0.371 0.034 0.159 0.310 1.000 1.000
QualEff_* 0.275 0.262 0.021 0.089 0.171 0.354 1.000
2 ProdEff_* 0.330 0.289 0.037 0.131 0.239 0.373 1.000
QualEff_* 0.405 0.292 0.020 0.172 0.294 0.613 1.000
3 ProdEff_* 0.743 0.361 0.037 0.342 1.000 1.000 1.000
QualEff_* 0.143 0.178 0.007 0.040 0.080 0.159 1.000
4 ProdEff_* 0.736 0.345 0.084 0.368 1.000 1.000 1.000
QualEff_* 0.241 0.300 0.004 0.056 0.107 0.244 1.000
5 ProdEff_* 0.587 0.379 0.050 0.205 0.484 1.000 1.000
QualEff_* 0.237 0.255 0.009 0.075 0.143 0.279 1.000
Total ProdEff_* 0.607 0.385 0.034 0.212 0.572 1.000 1.000
QualEff_* 0.232 0.257 0.004 0.062 0.129 0.293 1.000

Observe in Figure 1 that there is a large number of points with ProdEff_* = 1 and relatively low QualEff_* scores; these correspond to points that were excluded from the reference set for production efficiency. Similarly, the points with QualEff_* = 1 and relatively low ProdEff_* scores are those that were excluded from the reference set for quality efficiency. In total, there were 35 points in the reference set for production efficiency and 34 points in the reference set for quality efficiency. There were 34 FQHCs appearing in both efficient reference sets; one FQHC (AK9) appeared in the reference set for production efficiency only.

Out of this set of 35 efficient FQHCs, 11 were members of class 1, four were members of class 2, three were members of class 3, nine were members of class 4, and eight were members of class 5. Observe in Table 3 that average production and quality efficiency vary across latent classes, with class 2 (Older Rural Whites) exhibiting the lowest average production efficiency and the highest average quality efficiency. With respect to their LCA characteristics, these 35 efficient FQHCs were generally similar to the full sample. An ANOVA analysis found a statistically significant difference (at the p=0.01 level) between the members of the efficient reference set and all other FQHCs for only two of the 17 total LCA characteristics: the Regional % uninsured (with efficient FQHCs having a larger mean value), and the Total number of patients (with efficient FQHCs generally relatively small, treating an average of 3,694 patients, compared to an average of 20,900 patients for all other FQHCs). With respect to their DEA characteristics, however, these 35 efficient FQHCs were very different from the full sample included in the DEA. An ANOVA analysis found a statistically significant difference (at the p=0.01 level) between the members of the efficient reference set and all other FQHCs for seven of the ten input DEA characteristics (all except Non-Primary Care Physician, Substance Abuse, and Vision). This ANOVA analysis found statistically significant differences (at the p=0.01 level) for both DEA production outputs, with members of the efficient reference set serving relatively few prenatal patients (mean 63) and few prenatal patients who delivered (mean 27), versus respective mean values of 497 and 262 for all other FQHCs. For DEA quality outputs, this ANOVA analysis did not find any statistically significant differences between members of the efficient reference set and all other FQHCs.

There were 475 FQHCs excluded from the reference set for production efficiency due to having a QualEff_0 score less than the threshold of 0.95. There were 22 FQHCs excluded from the reference set for quality efficiency due to having a ProdEff_0 score less than the threshold of 0.95. No FQHC was excluded from both reference sets.

4.3. Partitioned S/L approach

4.3.1. Description of the procedure

Next, we consider an approach to evaluate the efficiency of FQHCs in a way that takes into account compositional differences (in terms of patient demographics and regional differences) that may be linked with differential challenges in delivering health care efficiently and providing quality health outcomes. We consider that the latent classes obtained by LCA may better represent FQHC peer groups that could be used as comparison sets for the DEA models. Therefore, we ran separate S/L models for each of the five latent classes. We used the same thresholds (0.95) for the QualEff_0 and ProdEff_0 scores as we did in the aggregated model.

4.3.2. Description of the results and efficient reference sets

Using these threshold values, the partitioned S/L procedure was applied to identify a ProdEff_* and a QualEff_* score for each FQHC; Figure 2 presents a scatterplot of these efficiency scores. Table 4 presents summary statistics of the ProdEff_* and QualEff_* scores, by class, for the partitioned S/L procedure.

Fig. 2. Production efficiency versus quality efficiency, partitioned S/L procedure

Table 4.

Production efficiency and quality efficiency, by class, partitioned S/L procedure

Class Efficiency metric Mean Standard deviation Min Q1 Median Q3 Max
1 ProdEff_* 0.739 0.302 0.126 0.422 0.976 1.000 1.000
QualEff_* 0.421 0.292 0.047 0.196 0.331 0.591 1.000
2 ProdEff_* 0.618 0.356 0.066 0.274 0.547 1.000 1.000
QualEff_* 0.514 0.351 0.020 0.184 0.453 0.915 1.000
3 ProdEff_* 0.714 0.308 0.077 0.429 0.843 1.000 1.000
QualEff_* 0.365 0.332 0.017 0.109 0.218 0.539 1.000
4 ProdEff_* 0.765 0.312 0.105 0.453 1.000 1.000 1.000
QualEff_* 0.344 0.327 0.011 0.104 0.194 0.472 1.000
5 ProdEff_* 0.676 0.318 0.092 0.372 0.733 1.000 1.000
QualEff_* 0.475 0.332 0.026 0.193 0.360 0.795 1.000
Total ProdEff_* 0.712 0.316 0.066 0.398 0.862 1.000 1.000
QualEff_* 0.408 0.327 0.011 0.140 0.288 0.619 1.000

Observe in Figure 2 that there is a large number of points with ProdEff_* = 1 and relatively low QualEff_* scores; these correspond to points that were excluded from their class’s reference set for production efficiency. Similarly, the points with QualEff_* = 1 and relatively low ProdEff_* scores are those that were excluded from their class’s reference set for quality efficiency. In total, there were 104 points in the five reference sets for production efficiency and 105 points in the five reference sets for quality efficiency. There were 102 FQHCs appearing in both a production efficiency reference set and a quality efficiency reference set, two FQHCs (MI30 and NY3) in their class’s production efficiency reference set but not their class’s quality efficiency reference set, and three FQHCs (DC8, KY17 and OK16) in their class’s quality efficiency reference set but not their class’s production efficiency reference set. FQHC KY17 demonstrates a somewhat counterintuitive result that can occur using the S/L procedure: KY17 was excluded from its class’s production efficiency reference set because its QualEff_0 value was 0.360; however, it was included in its class’s quality efficiency reference set, since it achieved QualEff_* = 1.

Out of this set of 107 efficient FQHCs appearing in at least one reference set, 29 were members of class 1, five of class 2, 30 of class 3, 21 of class 4, and 22 of class 5. Observe in Table 4 that, as was the case with the aggregated approach, average production and quality efficiency also varied across classes, despite separate DEA model runs being performed for each class; once again, class 2 (Older Rural Whites) exhibits the lowest average production efficiency and the highest average quality efficiency.

With respect to their LCA characteristics, these 107 efficient FQHCs were generally similar to the other members of their classes, with the same consistent exception as observed for the aggregated S/L procedure: across all classes, these efficient FQHCs were generally relatively small. At a significance level of p=0.01, ANOVA analyses contrasting the members of an efficient reference set versus all other FQHCs found a statistically significant difference for only one LCA metric (Total number of patients) for two classes (class 3 and class 4), and a statistically significant difference for two LCA metrics (Total number of patients, along with the % patients on Medicaid, with efficient FQHCs having a smaller mean value) for two classes (class 1 and class 5).

With respect to their DEA characteristics, these 107 efficient FQHCs were again very different from the full sample included in the DEA. For one class (class 3), an ANOVA analysis found a statistically significant difference (at the p=0.01 level) between the members of the efficient reference set and all other FQHCs for seven of the ten input DEA characteristics (all except Non-Primary Care Physician, Substance Abuse and Vision); for two classes (class 1 and class 5), this ANOVA analysis did not find a statistically significant difference for four DEA inputs (the three previously mentioned for class 3, along with Mental Health); and for one class (class 4) this ANOVA analysis did not demonstrate a statistically significant difference for six DEA inputs (the four previously mentioned for class 1, along with Total non-patient revenue and Primary Care Physician). For two classes (class 1 and class 3), this ANOVA analysis found statistically significant differences (at the p=0.01 level) for both DEA production outputs, with members of the efficient reference sets serving relatively few prenatal patients and few prenatal patients who delivered, versus the mean values for all other FQHCs. For one class (class 4), this ANOVA analysis found a statistically significant difference for only one DEA production output (number of prenatal patients served), while for the remaining two classes (class 2 and class 5) this ANOVA analysis found no statistically significant differences for DEA production outputs. For the DEA quality outputs this ANOVA analysis did not find any statistically significant differences between members of the efficient reference sets and all other FQHCs. Note that for class 2, these ANOVA analyses found no statistically significant differences between the members of the efficient reference set and all other FQHCs for any LCA or DEA metric, likely due to the small number of FQHCs in the class 2 efficient reference set.

There were 413 FQHCs excluded from the reference set for production efficiency due to having a QualEff_0 score less than the threshold of 0.95. There were 64 FQHCs excluded from the reference set for quality efficiency due to having a ProdEff_0 score less than the threshold of 0.95. Under partitioning, there were three FQHCs (CA121, MA11, ME2) that were excluded from both reference sets.

4.3.3. Comparing the results of Partitioned S/L and Aggregated S/L approaches

Consider now the impact of partitioning on production efficiency scores. Note that all 35 FQHCs that appeared in the efficient reference sets under the aggregated S/L procedure also appeared in the efficient reference sets under the partitioned S/L procedure. Figure 3 presents a scatterplot contrasting the ProdEff_* scores obtained by the aggregated S/L procedure with those obtained by the partitioned S/L procedure. Table 5 presents descriptive statistics, by latent class, on the difference between the partitioned ProdEff_* value and the aggregated ProdEff_* value. Unexpectedly, the mean production efficiency for class 3 (Large Urban Serving Poor in Low Poverty Area) under the partitioned case was lower than under the aggregated case. Section 4.6 discusses the conditions leading to this somewhat counterintuitive result.

Fig. 3. Production efficiency, aggregated vs. partitioned, S/L procedure

Table 5.

Production efficiency (partitioned) minus production efficiency (aggregated), by class, S/L procedure

Class Mean Standard deviation Min Q1 Median Q3 Max
1 0.258 0.262 0.000 0.000 0.197 0.401 0.963
2 0.288 0.339 0.000 0.004 0.071 0.638 0.963
3 −0.029 0.297 −0.869 −0.048 0.000 0.116 0.758
4 0.029 0.158 −0.717 0.000 0.000 0.041 0.574
5 0.090 0.248 −0.783 0.000 0.092 0.192 0.716
Total 0.105 0.300 −0.869 0.000 0.011 0.230 0.963

An ANOVA analysis, using the Bonferroni test for pairwise comparisons, found a statistically significant difference (at the p=0.001 level) for the change in production efficiency values for the following pairs of classes: 1 and 3, 1 and 4, 1 and 5, 2 and 3, 2 and 4, 2 and 5, 3 and 5. This suggests that, compared to the partitioned approach, the aggregated approach disproportionately favors members of class 3 (Large Urban Serving Poor in Low Poverty Area) and disproportionately disfavors members of classes 1 (More Diverse, Rural Poor) and 2 (Older Rural Whites).
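
The pairwise testing pattern used here can be sketched as follows. The data below are synthetic placeholders (group means loosely echoing Table 5, not the paper's values), and the two-step recipe — an overall one-way ANOVA followed by Bonferroni-corrected pairwise t-tests across the five classes — is an assumed reconstruction of the analysis:

```python
# Sketch of a one-way ANOVA with Bonferroni-corrected pairwise comparisons.
# The five groups are SYNTHETIC stand-ins for per-FQHC efficiency changes.
from itertools import combinations

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
means = {1: 0.26, 2: 0.29, 3: -0.03, 4: 0.03, 5: 0.09}  # placeholder class means
groups = {c: rng.normal(loc=mu, scale=0.25, size=50) for c, mu in means.items()}

# Overall test: do the class means differ at all?
f_stat, p_overall = stats.f_oneway(*groups.values())

# Pairwise t-tests with a Bonferroni-adjusted threshold (alpha / #pairs).
pairs = list(combinations(sorted(groups), 2))
alpha = 0.001 / len(pairs)
results = {}
for a, b in pairs:
    _, p = stats.ttest_ind(groups[a], groups[b])
    results[(a, b)] = p

significant = sorted(pair for pair, p in results.items() if p < alpha)
print(f"overall ANOVA p = {p_overall:.1e}; significant pairs: {significant}")
```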

4.4. Aggregated S/Z approach

4.4.1. Description of the procedure

An examination of the quality output metrics for the efficient reference sets generated by the aggregated S/L procedure demonstrates a potential drawback of the S/L procedure. Despite limiting the reference set for production efficiency to FQHCs with a quality efficiency score QualEff_0 greater than or equal to 0.95, many members of this set had relatively poor performance on the output metrics Quality_Access and Quality_LBWI. Out of the 35 members of the reference set for production efficiency generated by the aggregated S/L procedure, five had Quality_Access scores less than the target value of 66%, and ten had Quality_LBWI rates less than the target of 92.33%; these counts include two FQHCs that failed to meet either quality target yet were included in this efficient reference set. For example, FQHC IL37 achieved Quality_Access = 0.59 and Quality_LBWI = 81.82%, and yet it was deemed efficient with respect to quality, with a QualEff_0 score equal to one, because it consumed relatively little of the DEA inputs to achieve these outputs.

As an alternative approach to overcome this limitation, we utilized a variant of the Quality-Adjusted DEA approach of Sherman and Zhu [14] to limit the reference set for production efficiency to high-quality FQHCs. However, whereas these authors utilized a single quality output metric as a threshold for inclusion in the efficient reference set for production outputs, in this analysis we utilize both the quality of care output Quality_Access and the health outcomes output Quality_LBWI. Our implementation of the Sherman and Zhu procedure (denoted hereafter as S/Z) begins by solving a single DEA model over the set of all FQHCs for the production outputs, using the same set of inputs as before. Refer to the efficiencies obtained by this model run as ProdEff_0. To ensure that only FQHCs with acceptable quality outputs are included in the efficient reference set for production outputs, all FQHCs with both a production efficiency of one and at least one quality output that does not satisfy the HRSA standards (a Quality_Access score less than the target value of 0.66, and/or a Quality_LBWI rate less than the target of 92.33%) are removed from the analysis, and the production output model is run again to obtain new production efficiencies for the remaining FQHCs (those that have been removed retain their production efficiency score of one, but are excluded from the efficient reference set in the next iteration). The process iterates until all FQHCs in the efficient reference set for production outputs have quality output values greater than or equal to the target values; let ProdEff_* denote the production efficiency scores obtained at the end of this iterative process. Algorithm 3 in Appendix B presents the S/Z procedure used to generate these ProdEff_* scores.
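
The S/Z loop differs from the S/L loop only in its exclusion rule: raw quality outputs are compared against the fixed HRSA targets rather than against a DEA quality efficiency threshold. A minimal, runnable sketch follows, again using a hypothetical one-input/one-output toy DEA in place of the full linear-programming model:

```python
# Minimal sketch of the S/Z removal-and-rerun loop (Algorithm 3).
# Assumption: a toy one-input/one-output CRS DEA stands in for the full model.

ACCESS_TARGET = 0.66    # HRSA Quality_Access target
LBWI_TARGET = 0.9233    # HRSA-derived Quality_LBWI target

def toy_dea(x, y, reference):
    """Toy CCR efficiency for one input and one output."""
    best = max(y[j] / x[j] for j in reference)
    return {j: (y[j] / x[j]) / best for j in reference}

def sz_production_scores(x, y_prod, access, lbwi):
    """Drop production-efficient units failing either raw quality target."""
    reference = set(range(len(x)))
    frozen = {}  # removed units retain their production efficiency of 1.0
    while True:
        scores = toy_dea(x, y_prod, reference)
        failing = {j for j in reference
                   if scores[j] >= 1.0 - 1e-9
                   and (access[j] < ACCESS_TARGET or lbwi[j] < LBWI_TARGET)}
        if not failing:  # all frontier units meet both HRSA targets
            return {**frozen, **scores}
        for j in failing:
            frozen[j] = 1.0
            reference.remove(j)

# Illustrative data: unit 0 is most productive but misses the access target.
x = [1.0, 1.0, 1.0]
y_prod = [10.0, 8.0, 4.0]
access = [0.60, 0.70, 0.70]
lbwi = [0.95, 0.93, 0.95]
print(sz_production_scores(x, y_prod, access, lbwi))
```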

4.4.2. Description of the results and efficient reference sets

Using these quality targets, the aggregated S/Z procedure was applied to identify a ProdEff_* score for each FQHC; Figure 4 presents a scatterplot of these efficiency scores versus the quality of care metric Quality_Access, while Figure 5 presents a scatterplot of these efficiency scores versus the health outcomes metric Quality_LBWI. Table 6 presents summary statistics of the ProdEff_*, Quality_Access, and Quality_LBWI scores, by class, for the aggregated S/Z procedure.

Fig. 4. Efficiency vs. access, aggregated S/Z procedure

Fig. 5. Efficiency vs. LBW index, aggregated S/Z procedure

Table 6.

Production efficiency, access, and LBW index, by class, aggregated S/Z procedure

Class Metric Mean Standard deviation Min Q1 Median Q3 Max
1 ProdEff_* 0.342 0.259 0.034 0.148 0.263 0.450 1.000
Quality_Access 0.789 0.160 0.000 0.701 0.813 0.900 1.000
Quality_LBWI 0.910 0.124 0.000 0.890 0.930 1.000 1.000
2 ProdEff_* 0.276 0.227 0.035 0.127 0.221 0.337 1.000
Quality_Access 0.886 0.118 0.357 0.815 0.920 1.000 1.000
Quality_LBWI 0.916 0.148 0.167 0.900 0.959 1.000 1.000
3 ProdEff_* 0.456 0.301 0.036 0.223 0.352 0.644 1.000
Quality_Access 0.738 0.129 0.063 0.667 0.751 0.826 1.000
Quality_LBWI 0.915 0.093 0.000 0.902 0.927 0.948 1.000
4 ProdEff_* 0.481 0.295 0.075 0.252 0.383 0.676 1.000
Quality_Access 0.737 0.153 0.111 0.654 0.756 0.834 1.000
Quality_LBWI 0.919 0.094 0.333 0.905 0.936 0.961 1.000
5 ProdEff_* 0.428 0.302 0.045 0.191 0.317 0.600 1.000
Quality_Access 0.671 0.176 0.169 0.545 0.685 0.789 1.000
Quality_LBWI 0.924 0.077 0.556 0.897 0.933 0.977 1.000
Total ProdEff_* 0.406 0.290 0.034 0.181 0.313 0.549 1.000
Quality_Access 0.759 0.157 0.000 0.667 0.778 0.871 1.000
Quality_LBWI 0.916 0.107 0.000 0.900 0.932 0.966 1.000

There is again a large number of cases with ProdEff_* = 1 that had relatively poor quality output metrics; these were excluded from the reference set for production efficiency. In total, there were 57 points in the reference set for production efficiency.

Out of this set of 57 efficient FQHCs, seven were members of class 1, seven were members of class 2, 20 were members of class 3, 13 were members of class 4, and ten were members of class 5. Table 6 highlights that substantial variation exists across latent classes in terms of average efficiency, quality of care, and pregnancy outcomes. The rank ordering of classes by average ProdEff_* score is similar to that obtained by the aggregated S/L procedure, with class 3 (Large Urban Serving Poor in Low Poverty Area) and class 4 (Dense Urban Poor Racial Minorities) having the two highest average production efficiencies in both instances, and class 2 (Older Rural Whites) having the lowest average production efficiency in both instances. With respect to their LCA characteristics, these 57 efficient FQHCs were again generally similar to the full sample. An ANOVA analysis found a statistically significant difference (at the p=0.01 level) between the members of the efficient reference set and all other FQHCs for only three of the 17 total LCA characteristics: Total number of patients, the % of patients best served in another language (with efficient FQHCs having a larger mean value), and the % prenatal patients (with efficient FQHCs having a larger mean value). Recall that LCA metric Total number of patients showed a statistically significant difference for the aggregated S/L procedure and for four of the five classes under the partitioned S/L procedure, with efficient FQHCs relatively small in all cases. However, for the aggregated S/Z procedure, LCA metric Total number of patients is significant at the p=0.01 level, but now the efficient FQHCs are generally relatively large, treating an average of 29,673 patients, compared to an average of 19,855 patients for all other FQHCs.

With respect to their DEA input characteristics, these 57 efficient FQHCs were much more similar to the full sample included in the DEA than were the 35 FQHCs in the efficient reference sets obtained from the aggregated S/L procedure. An ANOVA analysis found a statistically significant difference (at the p=0.01 level) between the members of the efficient reference set and all other FQHCs for only two of the ten input DEA characteristics (only Medical and Primary Care Physician). This ANOVA analysis found statistically significant differences (at the p=0.01 level) for both DEA production outputs and for one DEA quality output (Quality_LBWI), with members of the efficient reference set serving relatively many prenatal patients (mean 1,278) and many prenatal patients who delivered (mean 705), versus respective mean values of 440 and 230 for all other FQHCs, and members of the efficient reference set achieving higher values on both quality metrics (mean Quality_Access level of 80% and mean Quality LBWI rate of 96.35%), versus respective mean values of 76% and 91.31% for all other FQHCs.

There were 77 total FQHCs excluded from the reference set for production efficiency due to having quality output scores less than the target values. There were 36 FQHCs excluded due to having a Quality_Access score less than 66%, and 54 FQHCs excluded due to having a Quality_LBWI rate less than the target value of 92.33% (these totals include 13 FQHCs that were excluded for failing both of the quality output metrics). Observe that, unlike the S/L procedure, there is no potential confusion about the performance of any FQHC relative to these HRSA standards for quality of care and health outcomes based on a DEA quality efficiency score.

4.5. Partitioned S/Z approach

4.5.1. Description of the procedure

Following a similar line of reasoning as for the implementation of the partitioned S/L procedure, we utilized the same LCA classes and then ran the S/Z model, using the same fixed quality thresholds, to obtain new results for each of the five classes.

4.5.2. Description of the results and efficient reference sets

Table 7 presents summary statistics of the ProdEff_* scores, by class, for this partitioned S/Z procedure.

Table 7.

Production efficiency, by class, partitioned S/Z procedure

Class Mean Standard deviation Min Q1 Median Q3 Max
1 0.607 0.310 0.072 0.330 0.545 1.000 1.000
2 0.420 0.309 0.062 0.177 0.293 0.531 1.000
3 0.650 0.286 0.112 0.411 0.630 1.000 1.000
4 0.681 0.314 0.111 0.390 0.704 1.000 1.000
5 0.681 0.311 0.096 0.382 0.729 1.000 1.000
Total 0.623 0.311 0.062 0.344 0.574 1.000 1.000

Examining the average production efficiency within latent classes in Table 7, a pattern similar to the partitioned S/L results emerges. Despite separate DEA model runs being performed for each class, class 2 (Older Rural Whites) has the lowest average production efficiency of all classes and class 4 (Dense Urban Poor Racial Minorities) has the highest average production efficiency of all classes in both the partitioned S/L and the partitioned S/Z approaches. Under the partitioned S/Z approach, there were 153 points across the five reference sets for production efficiency. Out of this set of 153 efficient FQHCs, 40 were members of class 1, 15 of class 2, 40 of class 3, 36 of class 4, and 22 of class 5.

With respect to their LCA characteristics, these 153 efficient FQHCs were again relatively similar to the other members of their classes. Recall that total patients served was the primary LCA metric for which the partitioned S/L efficient reference sets differed from the full samples within their classes (with a statistically significant difference for four of the five classes, where the members of the efficient reference sets served relatively few patients, on average). For the partitioned S/Z procedure, at a significance level of p=0.01, ANOVA analyses contrasting the members of the efficient reference sets versus all other FQHCs found no statistically significant difference for any LCA metric for one class (class 5), and a statistically significant difference for only one LCA metric (% prenatal patients) for three classes (class 1, class 2 and class 4), with efficient FQHCs having a higher mean value in each case. This ANOVA analysis found the efficient reference set of class 3 to exhibit a statistically significant difference for three LCA metrics (% prenatal patients, with efficient FQHCs having a higher mean value, % of patients who are elderly, with efficient FQHCs having a smaller mean value, and % of patients on Medicare, with efficient FQHCs having a smaller mean value).

With respect to their DEA input characteristics, these 153 efficient FQHCs were quite similar to the other members of their classes, a very different result than was observed for the efficient reference sets generated from the partitioned S/L procedure. An ANOVA analysis found no statistically significant differences (at the p=0.01 level) between the members of the efficient reference set and all other FQHCs for any DEA input metric for class 1, class 2, class 3 and class 4. This ANOVA analysis found the efficient reference set of class 5 to exhibit a statistically significant difference for two DEA input metrics (Nurse Practitioner/Physician Assistant/Certified Nurse Midwife, with efficient FQHCs having a smaller mean value, and Dental, with efficient FQHCs having a smaller mean value). For the DEA production outputs, this ANOVA analysis found statistically significant differences (at the p=0.01 level) between the members of the efficient reference set and all other FQHCs for only class 3 and class 4 (with efficient FQHCs serving more patients and more patients who delivered in both cases). This ANOVA analysis found a statistically significant difference for the DEA quality outputs for four classes (class 1, class 3, class 4, and class 5), with efficient FQHCs having a higher mean Quality_Access value and a higher mean Quality_LBWI value in all cases.

4.5.3. Comparing the results of Partitioned S/Z to the other approaches

Observe that, when compared to the efficient reference set generated by the aggregated S/Z procedure, the partitioned S/Z approach identified efficient reference sets that could be viewed as more representative of their full populations. Unlike the aggregated S/Z approach, whose efficient reference set differed substantially from the full sample with respect to Total number of patients, % of patients best served in another language, and % prenatal patients, there were no LCA metrics for which, across all five classes, the members of the partitioned S/Z efficient reference sets exhibited statistically significant differences from the other members of their classes.

When compared to the DEA outputs generated by the aggregated S/Z procedure, the overall average values of outputs for the aggregated S/Z efficient reference set (1,278 prenatal patients and 705 patients who delivered) were larger than the average values for individual class’s efficient reference sets under the partitioned S/Z procedure in all ten cases (the efficient reference set for class 3 had the largest values for each metric, with 1,258 prenatal patients and 677 patients who delivered). This demonstrates that the partitioned S/Z procedure allows many smaller FQHCs to enter the efficient reference sets, overcoming some potential scale economy advantages enjoyed by larger centers. For the Quality_Access output, the five efficient reference sets obtained by the partitioned S/Z approach each had an average score of at least 80%, performing as well or better than the aggregated S/Z procedure’s efficient reference set. For the Quality_LBWI output, the five efficient reference sets obtained by the partitioned S/Z approach had average scores ranging between 95.77% (for class 4) and 97.97% (for class 5), comparable to the average value of 96.35% achieved by the aggregated S/Z procedure’s efficient reference set.

There were 179 total FQHCs excluded from the efficient reference sets due to having quality output scores less than the target values. There were 85 FQHCs excluded from the efficient reference sets due to having a Quality_Access score less than 66%, and 134 FQHCs excluded from the efficient reference sets due to having a Quality_LBWI rate less than 92.33% (these totals include 40 FQHCs that were excluded from the efficient reference sets due to not satisfying both of the quality output metrics).

Consider now the impact of partitioning on production efficiency scores under the S/Z approach. Note that all 57 FQHCs that appeared in the efficient reference set under the aggregated S/Z procedure also appeared in the efficient reference sets under the partitioned S/Z procedure. Figure 6 presents a scatterplot contrasting the ProdEff_* scores obtained by the aggregated S/Z procedure with those obtained by the partitioned S/Z procedure. Table 8 presents descriptive statistics, by latent class, on the difference between the partitioned ProdEff_* value and the aggregated ProdEff_* value. Observe from Figure 6 that efficiency scores never worsen under partitioning for the S/Z model. This is because the reference set for the partitioned S/Z model is necessarily a subset of the reference set for the aggregated S/Z model: the S/Z approach eliminates all DMUs with low quality (Quality_Access < 0.66 and/or Quality_LBWI < 0.9233) from potential inclusion in the efficient reference set at the start of the analysis, so an efficiency computed against a class-specific subset cannot be less than the efficiency computed against the entire set (as observed in Figure 6). Recall from Figure 3 that the partitioned S/L procedure did not demonstrate this property. This phenomenon is discussed in more detail in Section 4.6 below.
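
The monotonicity argument above can be illustrated numerically. In the toy one-input/one-output CRS DEA below (an assumed stand-in for the paper's full model), a unit's efficiency is its productivity divided by the best productivity in its reference set, so shrinking the reference set to a class-specific subset can only raise or preserve the score:

```python
# Illustration: DEA efficiency cannot decrease when the reference set shrinks.
# Toy one-input/one-output CRS DEA (assumed stand-in for the full LP model).

def toy_eff(j, x, y, reference):
    """Efficiency of unit j relative to the best performer in reference."""
    best = max(y[k] / x[k] for k in reference)
    return (y[j] / x[j]) / best

x = [1.0, 1.0, 1.0, 1.0]
y = [10.0, 9.0, 6.0, 3.0]
aggregated = {0, 1, 2, 3}   # all high-quality units
partitioned = {1, 3}        # a class-specific subset containing unit 3

agg_score = toy_eff(3, x, y, aggregated)    # judged against the global best
part_score = toy_eff(3, x, y, partitioned)  # judged against the class best
print(agg_score, part_score)                # partitioned score is no smaller
```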

Fig. 6.

Aggregated vs. partitioned, S/Z procedure

Table 8.

Production efficiency, partitioned minus production efficiency, aggregated, S/Z procedure

Class Mean Standard deviation Min Q1 Median Q3 Max
1 0.265 0.166 0.000 0.147 0.237 0.368 0.764
2 0.144 0.217 0.000 0.005 0.046 0.158 0.806
3 0.195 0.189 0.000 0.056 0.144 0.281 0.798
4 0.200 0.181 0.000 0.052 0.150 0.322 0.718
5 0.253 0.219 0.000 0.067 0.204 0.400 0.811
Total 0.216 0.193 0.000 0.064 0.170 0.322 0.811

We performed an ANOVA, using the Bonferroni test for pairwise comparisons, and found a statistically significant difference (at the p=0.01 level) in the change in production efficiency values for the following pairs of classes: 1 and 2, 1 and 3, 1 and 4, and 2 and 5. This suggests that, compared to the partitioned S/Z approach, the aggregated S/Z approach disproportionately disfavors members of classes 1 (More Diverse, Rural Poor) and 5 (Uninsured Patients).
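This test procedure can be sketched in executable form. The minimal Python example below uses synthetic per-class scores (the group sizes and random draws are illustrative, not the study's data; only the means and standard deviations echo Table 8) to show an overall one-way ANOVA followed by Bonferroni-adjusted pairwise t-tests:

```python
from itertools import combinations

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic per-class efficiency-score differences (illustrative only)
groups = {
    "class 1": rng.normal(0.265, 0.166, 60),
    "class 2": rng.normal(0.144, 0.217, 60),
    "class 5": rng.normal(0.253, 0.219, 60),
}

# Overall one-way ANOVA across the classes
f_stat, p_overall = stats.f_oneway(*groups.values())

# Pairwise t-tests with a Bonferroni correction: multiply each raw
# p-value by the number of comparisons (capped at 1)
pairs = list(combinations(groups, 2))
for a, b in pairs:
    _, p_raw = stats.ttest_ind(groups[a], groups[b])
    p_adj = min(p_raw * len(pairs), 1.0)
    print(f"{a} vs {b}: Bonferroni-adjusted p = {p_adj:.4f}")
```

A pair of classes is then declared significantly different when its adjusted p-value falls below the chosen level (0.01 in the comparison above).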

Consider next the impact of the S/L versus the S/Z procedure on the production efficiency scores generated by partitioning. Figure 7 presents a scatterplot contrasting the ProdEff_* scores obtained by the partitioned S/L procedure with those obtained by the partitioned S/Z procedure, and Table 9 presents descriptive statistics, by latent class, on the difference between the partitioned ProdEff_* value under the S/Z approach and that under the S/L approach. Observe from Figure 7 that class 5 (Uninsured Patients) has more FQHCs with a higher ProdEff_* score under the S/Z approach than under the S/L approach, although 20% of the class 5 FQHCs achieved a higher ProdEff_* score under the S/L approach. For class 4 (Dense Urban Poor Racial Minorities), 27% of the FQHCs had a higher ProdEff_* score under the S/Z approach, while 28% had a higher score under the S/L approach. Classes 1 (More Diverse, Rural Poor) and 3 (Large Urban Serving Poor in Low Poverty Area) had more FQHCs with a higher score under the S/L approach (46% and 43%, respectively) than under the S/Z approach (27% and 35%, respectively). Class 2 (Older Rural Whites) again differs from the other classes, with zero FQHCs achieving a higher ProdEff_* score under the S/Z approach and 57% achieving a higher score under the S/L approach (the remaining 43% scored the same under both approaches).

Fig. 7.

Production efficiency, partitioned S/L procedure vs. partitioned S/Z procedure

Table 9.

Production efficiency: partitioned S/Z procedure minus partitioned S/L procedure

Class Mean Standard deviation Min Q1 Median Q3 Max
1 −0.132 0.261 −0.895 −0.173 0.000 0.002 0.618
2 −0.198 0.275 −0.905 −0.332 −0.020 0.000 0.000
3 −0.063 0.229 −0.801 −0.161 0.000 0.044 0.611
4 −0.084 0.214 −0.839 −0.034 0.000 0.001 0.186
5 0.005 0.227 −0.695 0.000 0.002 0.087 0.600
Total −0.090 0.247 −0.905 −0.151 0.000 0.012 0.618

An ANOVA, using the Bonferroni test for pairwise comparisons, found a statistically significant difference (at the p=0.001 level) in the change in efficiency values for the following pairs of classes: 1 and 5, 2 and 3, 2 and 4, and 2 and 5. Statistically significant differences were also observed at the p=0.05 level for the pairs 1 and 3, 3 and 5, and 4 and 5. This suggests that the partitioned S/Z approach generates higher average efficiencies for members of class 5 (Uninsured Patients), while the partitioned S/L approach generates higher average efficiencies for members of class 2 (Older Rural Whites). This underscores the importance of utilizing an appropriate control mechanism for quality outputs when making efficiency comparisons between health care organizations.

4.6. Discussion

Comparing the aggregated S/L model and the aggregated S/Z model, we found that the FQHCs composing their efficient frontiers resembled and differed from the full sample in distinct ways. In particular, the efficient frontier for the aggregated S/L approach included FQHCs that were similar to the full sample for 15 of the 17 measures used in the LCA; however, this efficient reference set disproportionately included smaller FQHCs. Given that our DEA model used the variable returns-to-scale (VRS) formulation, it was somewhat surprising to observe that all 35 FQHCs in the efficient reference sets for the aggregated S/L model served fewer prenatal patients than the full sample's average of 483 and fewer prenatal patients who delivered than the full sample's average of 255 (across these 35 FQHCs, the averages were 63 prenatal patients and 27 prenatal patients who delivered). Conversely, the efficient frontier for the aggregated S/Z approach included FQHCs that were similar to the full sample for 14 of the 17 LCA measures, and included a mix of both large and small FQHCs. Of this set of 57 FQHCs, 28 served more than the full sample's average number of prenatal patients, and 27 served more than the full sample's average number of prenatal patients who delivered (across these 57 FQHCs, the averages were 1,278 prenatal patients and 705 deliveries).

This discrepancy in the composition of the efficient reference sets is primarily attributable to the S/L model's use of a quality efficiency metric rather than a fixed quality standard. Reflecting on our results, we believe it is important to ensure that the conceptualization of quality fits the application being studied. For example, FQHC IL37 achieved Quality_Access=59% and Quality_LBWI=81.82%, yet it was deemed efficient with respect to quality, with a QualEff_0 score equal to one, because it consumed relatively little of the DEA inputs to achieve these outputs. In fact, of the 35 FQHCs in the aggregated S/L model's efficient reference sets, five had Quality_Access below the target value of 66% and ten had Quality_LBWI below the target value of 92.33%. Stakeholders must decide whether the concept of health care quality described by the S/L model, which is partially driven by the use of resources (inputs) to achieve health outcomes (outputs), is suitable, particularly when low use of inputs can yield relatively poor outputs and still achieve quality efficiency from a DEA perspective.

The use of LCA to partition the full sample and account for heterogeneity among FQHC operating environments appears to be both necessary and effective as a strategy for ensuring that each FQHC is compared against other FQHCs that provide care in similar contexts. Its necessity is illustrated by an analysis of the difference between efficiency scores obtained under the aggregated and partitioned approaches, which found that the aggregated S/L model disproportionately disfavors members of the rural classes 1 (More Diverse, Rural Poor) and 2 (Older Rural Whites), while the aggregated S/Z model disproportionately disfavors members of classes 1 and 5 (Uninsured Patients). Its effectiveness is demonstrated by the fact that, compared to the efficient reference set generated by the aggregated S/Z procedure, the partitioned S/Z approach identified efficient reference sets that can be viewed as more representative of their populations: there was no LCA metric for which, across all five classes, the members of the partitioned S/Z efficient reference sets exhibited consistently large discrepancies relative to the other members of their classes. Moreover, compared to the DEA outputs generated by the aggregated S/Z procedure, the partitioned S/Z procedure allowed many smaller FQHCs to enter the efficient reference sets, mitigating some of the scale economy advantages enjoyed by larger FQHCs.

As illustrated in Figure 6, each FQHC's efficiency score under the partitioned S/Z model is at least as high as its aggregated S/Z efficiency score. This occurs because partitioning reduces the DEA sample, which may remove DMUs from an FQHC's reference set and replace them with less efficient DMUs, thereby increasing that FQHC's efficiency score. The S/L model does not maintain this property, because the S/L approach does not necessarily eliminate the same "high efficiency-low quality" DMUs under the aggregated and partitioned approaches. For example, DMU X might be efficient with acceptable quality in the partitioned approach, yet be excluded from consideration in the aggregated approach due to poor quality when compared against the larger original set of DMUs. In this case, DMU Y might suffer from comparison with X in the partitioned model while not being subject to comparison with X in the aggregated approach. This can be viewed as a limitation of the partitioned S/L approach.
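The subset property behind this monotonicity can be illustrated numerically. The sketch below, assuming an input-oriented VRS (BCC) envelopment model solved with scipy.optimize.linprog on entirely hypothetical DMU data (the helper name and all values are illustrative, not from this study), shows that evaluating a DMU against a reduced comparison set cannot lower its efficiency score, provided the DMU itself remains in the set:

```python
import numpy as np
from scipy.optimize import linprog

def vrs_efficiency(x, y, o, ref):
    """Input-oriented VRS (BCC) efficiency of DMU o, evaluated against
    the DMUs indexed by `ref`.  x: (n, m) inputs; y: (n, s) outputs."""
    n = len(ref)
    c = np.r_[1.0, np.zeros(n)]                          # minimize theta
    A_in = np.c_[-x[o][:, None], x[ref].T]               # sum(lam*X) <= theta*x_o
    A_out = np.c_[np.zeros((y.shape[1], 1)), -y[ref].T]  # sum(lam*Y) >= y_o
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.r_[np.zeros(x.shape[1]), -y[o]]
    A_eq = np.r_[0.0, np.ones(n)][None, :]               # VRS: sum(lam) = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(None, None)] + [(0.0, None)] * n)
    return res.x[0]

# Five hypothetical DMUs: two inputs, one output
x = np.array([[2., 3.], [4., 2.], [6., 6.], [3., 5.], [8., 8.]])
y = np.array([[1.], [1.], [2.], [1.], [1.5]])

full = vrs_efficiency(x, y, o=4, ref=[0, 1, 2, 3, 4])
part = vrs_efficiency(x, y, o=4, ref=[0, 2, 4])  # two peers removed
print(full, part)  # the score against the smaller set is never lower
```

Because the partitioned linear program optimizes over a restricted set of peer weights, its optimal value can only stay the same or rise, which is exactly the behavior seen for the S/Z model in Figure 6.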

To illustrate this effect in our results, consider FQHC IL31. Table 10 shows IL31’s production efficiency from the partitioned S/L model for each iteration as well as its set of reference DMUs. Based on the final comparison with MI37, NJ17, OH22 and PA6, the model returns ProdEff_*=0.429. Each of these four reference set members has QualEff_0=1.0 in the partitioned model.

Table 10.

Partitioned S/L model results for IL31 (* indicates reference DMUs dropped in the next iteration due to low QualEff_0)

Iteration ProdEff Reference Set
1 0.407 KY23, NJ17, PA6, *IL17
2, 3, 4, 5 0.429 MI37, NJ17, OH22, PA6

However, in the aggregated case, NJ17 is excluded in iteration 2 because its QualEff_0=0.26; thus, in the aggregated case, IL31 does not have to compete with NJ17 after iteration 1. Presumably, IL31's reference set of DMUs for the partitioned model is more efficient than its reference set for the aggregated model, and thus IL31 ends up with a higher efficiency value of ProdEff_*=0.641 in the aggregated case. The efficiency of IL31 for each iteration of the aggregated S/L model is shown in Table 11. Observe that the final reference sets of IL31 for the aggregated and partitioned S/L models have only one FQHC in common (MI37).

Table 11.

Aggregated S/L model results for IL31 (* indicates reference DMUs dropped in the next iteration due to low QualEff_0)

Iteration ProdEff Reference Set
1 0.354 NC9, OR31, PA6, *IL17, *NJ17
2 0.523 CA166, NJ3, OR31, PA32, *MI18
3 0.617 CA166, MI37, NJ3, *PA16
4, 5, 6, 7, 8, 9, 10 0.641 CA166, CA174, MI37, NJ3

Figure 3 shows that class 3 (Large Urban Serving Poor in Low Poverty Area) has most of the FQHCs that are subject to this efficiency score reduction when going from the aggregated to the partitioned model. To understand why this is more prevalent for class 3, the second column of Table 12 shows the percentage of FQHCs, by class, that were efficient and not excluded in the partitioned model but were excluded in the aggregated model (such as NJ17 in the case described above). Since class 3 has many such FQHCs (at least 1.7 times as many, proportionally, as any other class), it stands to reason that class 3 would have more points whose efficiency worsens under partitioning. As a robustness check on this reasoning, notice that class 2 (Older Rural Whites) has zero such FQHCs, and, as expected, none of the class 2 FQHCs saw their efficiency scores worsen in the partitioned model. The third column of Table 12 shows, for each class, the average aggregated-model quality efficiency for FQHCs that were efficient but not excluded in the partitioned model; the low value for class 3 suggests that the partitioned efficient reference set for class 3 has relatively poor quality compared to the full FQHC set. Why this is particularly an issue for class 3 is a topic for future research.

Table 12.

Quality under aggregated model for FQHCs in partitioned model’s reference set for production efficiency

Class Percent of FQHCs in partitioned model’s reference set for production efficiency that were excluded from aggregated model Average aggregated model quality efficiency for FQHCs in partitioned model’s reference set for production efficiency
1 7% 84%
2 0% 88%
3 34% 51%
4 20% 87%
5 18% 71%

Decision-makers who adopt the S/L approach but prefer to avoid efficiency score reductions when moving from the aggregated to the partitioned model could eliminate the issue by using the aggregated quality efficiencies as the QualEff_0 thresholds for the partitioned model. However, additional computational testing found that this technique can produce partitioned reference sets with very low aggregated production efficiency. It is therefore important to consider carefully how to interpret the partitioned DEA results when utilizing the S/L approach.

5. Conclusions

In this paper, we modeled the efficiency of FQHCs using two DEA models (the S/L and S/Z approaches) that handle quality outputs differently. For each model, we contrasted the results of a single DEA over the set of all FQHCs (the aggregated case) with a classification-based approach (the partitioned case) that utilized LCA, where the LCA included FQHC patient population measures as well as regional measures. We examined how the FQHCs comprising the efficient frontiers for these models compare on the LCA measures. We found that the efficient frontier for the aggregated S/L approach disproportionately included smaller FQHCs, whereas the FQHCs in the aggregated S/Z approach's efficient reference set were larger than average. The efficient frontier for the aggregated S/Z approach also much more closely approximated the sample means for DEA inputs and outputs, relative to the set from the aggregated S/L approach. These same general contrasts were observed between the partitioned S/L and partitioned S/Z approaches.

Overall, we found the S/Z approach to be superior in several ways. The efficient reference set generated by the S/Z approach was not dominated by small FQHCs, and it included only FQHCs that achieved the target values for quality outputs, unlike the efficient reference set generated by the S/L approach. Methodologically, we found that FQHC efficiencies could decrease in the partitioned S/L case relative to the aggregated case, which is counterintuitive, since partitioning compares an FQHC against a subset of the original comparison set; the S/Z approach is immune to such peculiar results.

Substantively, the literature makes clear that patient population differences can affect FQHC performance. While other papers in the FQHC DEA literature have used measures such as socioeconomic status, insurance status, and race as covariates controlled for in multivariate analyses, those analyses have been post hoc (i.e., conducted after the DEA has been run), potentially leading to biased efficiency scores. A major contribution of our paper is to perform the LCA first, so that each FQHC is compared only against its relevant peer group in the DEA. Based on ANOVA analyses, we observed significant differences in efficiencies and quality metrics across latent classes. Future research should continue to examine this variation in efficiency across FQHC typologies to better understand its sources and its consequences for patient health outcomes.

5.1. Limitations and future research directions

The data used in this study are publicly available through HRSA. However, one limitation of this study relates to the proprietary nature of some FQHC staffing and utilization data: while we were able to obtain aggregate FQHC-level data by major service category (e.g., the Nurse Practitioner/Physician Assistant/Certified Nurse Midwife category), data broken out by specific staffing type were not available. Second, the DEA output for prenatal patients served represents the total number of unique patients, not the total number of patient encounters. Since the number of times a given patient visits an FQHC may be highly variable, this may influence the measured productivity of the health center. Patient encounters would be a useful additional measure, since an FQHC serving patients who visit more frequently may have more difficulty remaining efficient (because of the extra resources needed to serve those patients) while, at the same time, being able to deliver better health outcomes. Third, the FQHC service data are aggregated in the sense that they do not separate the data for FQHCs that operate multiple service locations. This limitation does not affect the patient population data, however, since the ZCTA-level patient data capture, at a fine-grained level of detail, the residence location of all patients who receive service at any of an FQHC's service locations.

As previously mentioned, the DEA inputs included relate to all of the services provided at the FQHCs, not only prenatal services. Therefore, an FQHC with relatively many resources that serves few prenatal patients could be interpreted in this analysis either as inefficient or as an FQHC whose strategy does not heavily prioritize prenatal service delivery (perhaps because of low demand for that service). Particularly for the aggregated case, we do not claim that our results distinguish between these two interpretations. This issue is not easily resolved, since FQHC staff serve patients with many needs (e.g., medical, dental, mental health, and substance abuse). Clearly, a perfectly matching set of inputs and outputs is not easily found for complex service organizations (e.g., banking [47]), and the issue is common in the assessment of primary care health providers. For example, Amado and Dyson [48] note that "…performance assessment in primary care is necessarily partial, as we cannot measure all the essential quality aspects" (p. 928). However, for the partitioned case, we propose that the DEA results are a better measure of efficiency, since FQHCs are compared only to their peer group (which uses the percentage of prenatal patients as a measure in the peer-group classification). This study is the first attempt to use LCA to construct such peer groups in an effort to mitigate the input-output mismatch issue; future research is needed to explore this approach in other settings.

There are many other future research directions related to this study that would interest scholars across several disciplines. First, while this paper's results point to variations in efficiency across latent classes, an in-depth interpretation of the underlying sociodemographic processes contributing to such variation is beyond the scope of the current manuscript. Accordingly, to better understand why these latent classes matter in the context of pregnancy-related health inequalities and FQHC efficiency, future research aims to extend this study using an interdisciplinary approach drawing on sociology and demography. A second area of future research could involve focusing on how efficiently FQHCs deliver other health outcomes (e.g., chronic illness care, dental health, mental health). Third, future research should continue to explore how both compositional and operational characteristics of health organizations contribute to unique outcomes when modeling and evaluating efficiency. Finally, other application areas may benefit from a partitioned DEA modeling approach that accounts for variation in populations served, as well as institutional and environmental characteristics that may impact quality and efficiency.

Acknowledgements:

Research reported in this publication was supported by the National Institute of General Medical Sciences (NIGMS) of the National Institutes of Health, Award Number 5P20GM104417. The funding source had no role in study design; collection, analysis and interpretation of data; in the writing of the report; or in the decision to submit the article for publication. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Appendix A: Statistical analyses for data sets

Table A.1.

Pairwise correlation coefficients between DEA inputs and DEA production outputs

DEA input Annual number of prenatal patients served Annual number of prenatal patients who delivered
Total non-patient revenue 0.6029* 0.5782*
Primary Care Physician 0.8038* 0.7883*
Non-Primary Care Physician 0.2248* 0.2285*
Nurse Practitioner/ Physician Assistant/ Certified Nurse Midwife 0.6827* 0.6719*
Medical 0.8067* 0.8066*
Dental 0.5403* 0.5540*
Mental Health 0.3137* 0.3015*
Substance Abuse 0.0350 0.0360
Vision 0.2521* 0.2563*
Enabling 0.6424* 0.6270*

“*” indicates statistical significance at the p=0.001 level
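Significance flags of this kind can be reproduced with a standard correlation test. The sketch below, using synthetic stand-ins for one staffing input and one production output (the variable names and data are hypothetical, not the study's), computes a Pearson correlation and marks it when its p-value falls below 0.001:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Synthetic stand-ins for one DEA input (FTE staffing) and one output
staffing = rng.gamma(shape=2.0, scale=5.0, size=200)
patients = 40.0 * staffing + rng.normal(0.0, 100.0, size=200)

# Pearson correlation coefficient with its two-sided p-value
r, p = stats.pearsonr(staffing, patients)
flag = "*" if p < 0.001 else ""   # mark significance at the p=0.001 level
print(f"r = {r:.4f}{flag}")
```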

Table A.2.

Descriptive statistics for DEA input data

Metric Mean Standard deviation Min Q1 Median Q3 Max
Total non-patient revenue (thousands of $) 6,033 8,195 0 2,220 3,809 7,068 157,885
Primary Care Physician 9.5 13.2 0.0 2.4 5.3 11.5 153.7
Non-Primary Care Physician 0.3 3.6 0.0 0.0 0.0 0.0 109.1
Nurse Practitioner/Physician Assistant/Certified Nurse Midwife 8.7 8.8 0.0 3.0 5.9 10.8 64.4
Medical 38.2 47.5 0.0 10.9 22.2 46.5 456.1
Dental 12.3 17.6 0.0 1.6 7.0 15.9 176.4
Mental Health 6.5 15.1 0.0 0.8 2.3 6.1 217.8
Substance Abuse 0.8 4.4 0.0 0.0 0.0 0.0 93.5
Vision 0.5 1.8 0.0 0.0 0.0 0.0 29.1
Enabling 15.6 20.9 0.0 4.0 8.6 19.1 204.0

Table A.3.

Descriptive statistics for LCA data, across 1,111 FQHCs in DEA models

Metric Mean Standard deviation Min Q1 Media n Q3 Max
Population Density (pop/sq. mile) 1,895.7 5,179.6 0.6 89.5 275.1 1,527.6 54,951.6
Proportion Urban 0.456 0.498 0.000 0.000 0.000 1.000 1.000
% non-white 0.280 0.176 0.026 0.138 0.249 0.401 0.891
% in poverty 0.176 0.046 0.051 0.145 0.173 0.201 0.367
% on Medicaid 0.249 0.071 0.063 0.201 0.237 0.286 0.553
% uninsured 0.123 0.045 0.031 0.089 0.120 0.148 0.318
% with no usual source of care 0.199 0.061 0.061 0.153 0.195 0.234 0.442
Total number of patients 20,358 22,901 543 6,699 12,898 25,563 188,122
% patients who are children 0.278 0.122 0.000 0.199 0.271 0.357 0.845
% patients who are elderly 0.088 0.056 0.000 0.049 0.070 0.113 0.350
% non-white patients 0.564 0.311 0.002 0.280 0.628 0.838 0.998
% of patients best served in another language 0.179 0.206 0.000 0.018 0.097 0.283 1.000
% of patients in poverty 0.671 0.185 0.000 0.559 0.702 0.807 1.000
% of uninsured patients 0.262 0.171 0.011 0.137 0.224 0.354 1.000
% of patients on Medicaid 0.444 0.188 0.000 0.291 0.452 0.596 0.955
% of patients on Medicare 0.099 0.065 0.000 0.051 0.083 0.135 0.395
% prenatal patients 0.019 0.020 0.000 0.005 0.014 0.027 0.182

Appendix B: Algorithms for implementation of S/L and S/Z procedures

Sets:

  • R: DEA production outputs; R = {r1, r2}

  • Q: DEA quality outputs; Q = {q1, q2}

    • q1: Quality_Access

    • q2: Quality_LBWI

  • J: FQHCs; J = {j1, …, j1111}

Parameters:

  • k: iteration counter

  • Ek: number of FQHCs excluded in iteration k

Algorithm 1:

S/L procedure, ProdEff_*

k=1
do while k=1 or Ek−1 > 0
    Ek = 0
    for j in J do
        Solve DEA model across set J, using production outputs R, recording efficiency Θj
        ProdEffj*=Θj
    end for
    for j in J do
        Solve DEA model across set J, using quality outputs Q, recording efficiency Θ^j
    end for
    for j in J do
        if Θj = 1 and Θ^j<0.95 then
            J = J\j
            Ek = Ek + 1
        end if
    end for
    k=k+1
end do

Algorithm 2:

S/L procedure, QualEff_*

k=1
do while k=1 or Ek−1 > 0
    Ek = 0
    for j in J do
        Solve DEA model across set J, using production outputs R, recording efficiency Θj
    end for
    for j in J do
        Solve DEA model across set J, using quality outputs Q, recording efficiency Θ^j
        QualEffj*=Θ^j
    end for
    for j in J do
        if Θj < 0.95 and Θ^j=1 then
            J = J\j
            Ek = Ek + 1
        end if
    end for
    k=k+1
end do

Algorithm 3:

S/Z procedure

k=1
do while k=1 or Ek−1 > 0
    Ek = 0
    for j in J do
        Solve DEA model across set J, using production outputs R, recording efficiency Θj
        ProdEffj*=Θj
    end for
    for j in J do
        if Θj = 1 and (q1j < 0.66 or q2j < 0.9233) then
            J = J\j
            Ek = Ek + 1
        end if
    end for
    k=k+1
end do
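Algorithm 3 can be sketched in executable form. The toy example below assumes an input-oriented VRS DEA solved with scipy.optimize.linprog (the DMU data, quality values, single quality threshold, and function names are illustrative, not the paper's); it iteratively drops efficient DMUs that miss a quality target and re-solves over the survivors until no further exclusions occur, mirroring the loop above:

```python
import numpy as np
from scipy.optimize import linprog

def vrs_eff(x, y, o, idx):
    """Input-oriented VRS efficiency of DMU o against the DMUs in idx."""
    n = len(idx)
    c = np.r_[1.0, np.zeros(n)]
    A_ub = np.vstack([np.c_[-x[o][:, None], x[idx].T],              # inputs
                      np.c_[np.zeros((y.shape[1], 1)), -y[idx].T]])  # outputs
    b_ub = np.r_[np.zeros(x.shape[1]), -y[o]]
    A_eq = np.r_[0.0, np.ones(n)][None, :]                           # sum(lam) = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(None, None)] + [(0.0, None)] * n)
    return res.x[0]

def sz_procedure(x, y, quality, targets):
    """Drop efficient DMUs whose quality outputs miss the targets,
    then re-solve the DEA over the survivors until no more exclusions."""
    J = list(range(len(x)))
    scores = {}
    while True:
        for j in J:
            scores[j] = vrs_eff(x, y, j, J)
        drop = [j for j in J
                if scores[j] > 1 - 1e-6 and np.any(quality[j] < targets)]
        if not drop:
            return scores
        J = [j for j in J if j not in drop]

# Four hypothetical DMUs: one input, one output, one quality metric
x = np.array([[2.], [3.], [4.], [8.]])
y = np.array([[2.], [2.], [3.], [4.]])
quality = np.array([[0.50], [0.90], [0.95], [0.99]])

scores = sz_procedure(x, y, quality, targets=np.array([0.66]))
```

In the paper's S/Z variant, the quality targets are Quality_Access ≥ 0.66 and Quality_LBWI ≥ 0.9233; here a single illustrative quality metric plays that role. Note that a DMU excluded for low quality retains its last recorded score, while the DMUs it previously dominated are re-evaluated against the reduced set and can only improve.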

References

  • [1].Charnes A, Cooper WW, Rhodes E (1978) Measuring the efficiency of decision making units. Eur J Oper Res, 2(6), 429–444 [Google Scholar]
  • [2].Charnes A, Cooper WW, Rhodes E (1981) Evaluating program and managerial efficiency: an application of data envelopment analysis to program follow through. Manage Sci, 27(6), 668–697 [Google Scholar]
  • [3].Nunamaker TR (1983) Measuring routine nursing service efficiency: a comparison of cost per patient day and data envelopment analysis models. Health Serv Res, 18(2 Pt 1), 183–208 [PMC free article] [PubMed] [Google Scholar]
  • [4].Banker RD, Charnes A, Cooper WW (1984) Some models for estimating technical and scale inefficiencies in data envelopment analysis. Manage Sci, 30(9), 1078–1092 [Google Scholar]
  • [5].Health Resources and Service Administration (2017) FY 2017 Annual Performance Report. Available online at https://www.hrsa.gov/about/budget/fy17annualperformancereport.pdf Accessed 22 February, 2018
  • [6].Rosenbaum S (2015) Will health centers go over the “Funding Cliff”? Milbank Quart, 93(1) 32–35 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [7].Marathe S, Wan TTH, Zhang J, Sherin K (2007) Factors influencing community health centers’ efficiency: a latent growth curve modeling approach. J Med Syst, 31, 365–374 [DOI] [PubMed] [Google Scholar]
  • [8].Amico PR, Chilingerian JA, van Hasselt M (2014) Community health center efficiency: the role of grant revenues in health center efficiency. Health Serv Res, 49(2), 666–682 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [9].VanderWielen LM, Ozcan YA (2015) An assessment of the health care safety net: performance evaluation of free clinics. Nonprof Volunt Sec Q, 44(3), 474–486 [Google Scholar]
  • [10].Lazarsfeld PF, Henry NW (1968) Latent Structure Analysis. Houghton Mifflin, Boston [Google Scholar]
  • [11].Agrell PJ, Farsi M, Filippini M, Koller M (2013) Unobserved heterogeneous effects in the cost efficiency analysis of electricity distribution systems. CERETH Economics working papers series 13/171, CER-ETH - Center of Economic Research (CER-ETH) at ETH Zurich
  • [12].Llorca M, Orea L, Pollitt MG (2014) Using the latent class approach to cluster firms in benchmarking: an application to the US electricity transmission industry. Oper Res Perspect, 1, 6–17 [Google Scholar]
  • [13].Shimshak D, Lenard ML (2007) A two-model approach to measuring operating and quality efficiency with DEA. INFOR, 45(3), 143–151 [Google Scholar]
  • [14].Sherman HD, Zhu J (2006) Benchmarking with quality-adjusted DEA (Q-DEA) to seek lower-cost high-quality service: evidence from a US bank application. Ann Oper Res, 145(1), 301–319 [Google Scholar]
  • [15].Emrouznejad A, Yang GL (2017) A survey and analysis of the first 40 years of scholarly literature in DEA: 1978–2016. Socio Econ Plan Sci, 61, 4–8 [Google Scholar]
  • [16].Hollingsworth B, Dawson PJ, Maniadakis N (1999) Efficiency measurement of health care: a review of non-parametric methods and applications. Health Care Manag Sc, 2(3), 161–172 [DOI] [PubMed] [Google Scholar]
  • [17].Worthington AC (2004) Frontier efficiency measurement in health care: a review of empirical techniques and selected applications. Med Care Res Rev, 61(2), 135–170 [DOI] [PubMed] [Google Scholar]
  • [18].Hollingsworth B (2008) The measurement of efficiency and productivity of health care delivery. Health Econ, 17(10), 1107–1128 [DOI] [PubMed] [Google Scholar]
  • [19].Ozcan YA (2014) Health care benchmarking and performance evaluation: An assessment using data envelopment analysis (DEA), 2nd ed., Springer, Boston [Google Scholar]
  • [20].Sherman HD, Zhu J (2006) Service productivity management: Improving service performance using data envelopment analysis (DEA), Springer, New York [Google Scholar]
  • [21].DePuccio M, Ozcan YA (2017) Exploring efficiency differences between medical home and non-medical home hospitals. Intl J Heathcare Manage, 10(3), 147–153 [Google Scholar]
  • [22].VanderWielen LM, Ozcan YA (2015) An Assessment of the Healthcare Safety Net: Performance Evaluation of Free Clinics. Nonprof Volunt Sec Q, 44(3), 474–486 [Google Scholar]
  • [23].Khushalani J, Ozcan YA (2017) Are hospitals producing quality care efficiently? An analysis using Dynamic Network Data Envelopment Analysis (DEA). Socio-Econ Plan Sci, 60, 15–23. [Google Scholar]
  • [24].Ozcan YA, Khushalani J (2017) Assessing efficiency of public health and medical care provision in OECD countries after a decade of reform. Cent Eur J Oper Res, 25(2), 325–343 [Google Scholar]
  • [25].Highfill T, Ozcan YA (2016) Productivity and quality of hospitals that joined the Medicare Shared Savings Accountable Care Organization Program, Intl J Heathcare Manage, 9(3), 210–217 [Google Scholar]
  • [26].Nayar P, Ozcan YA, Yu F, Nguyen AT (2013). Benchmarking urban acute care hospitals: Efficiency and quality perspectives. Health Care Manage R, 38(2), 137–145 [DOI] [PubMed] [Google Scholar]
  • [27].Mark BA, Jones CB, Lindley L, Ozcan YA (2009). An Examination of Technical Efficiency, Quality and Patient Safety on Acute Care Nursing Units. Policy Polit Nurs Pract, 10(3), 180–186 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [28].Rahman MA, Capitman JA (2012) Can more use of supporting primary care health practitioners increase efficiency of health clinics? Evidence from California’s San Joaquin Valley. J Health Care Financ, 38(3), 78–92 [PubMed] [Google Scholar]
  • [29].Hollingsworth B, Street A (2006) The market for efficiency analysis of health care organisations. Health Econ, 15(10), 1055–1059 [DOI] [PubMed] [Google Scholar]
  • [30].Giuffrida A, Gravelle H (2001) Measuring performance in primary care: econometric analysis and DEA. Applied Economics, 33, 163–175 [Google Scholar]
  • [31].Bastian ND, Kang H, Swenson ER, Fulton LV, Griffin PM (2016) Evaluating the impact of hospital efficiency on wellness in the military system. Military Medicine, 181(8), 827–834. [DOI] [PubMed] [Google Scholar]
  • [32].Collins LM, Lanza ST (2010) Latent class and latent transition analysis: With applications in the social, behavioral, and health sciences. Wiley, Hoboken, NJ [Google Scholar]
  • [33].McCutcheon AL (1987) Latent class analysis. Sage, Newbury Park, CA [Google Scholar]
  • [34].Agresti A (1990) Categorical data analysis. Wiley, New York [Google Scholar]
  • [35].Vermunt JK, Magidson J (2002) Latent class cluster analysis In: Hagenaars J, McCutcheon A (eds) Applied latent class analysis. Cambridge University Press, Cambridge, pp 89–106 [Google Scholar]
  • [36].Clogg CC (1995) Latent class models: Recent developments and prospects for the future In: Arminger G, Clogg CC, Sobel MW (eds), Handbook of statistical modeling for the social and behavioral sciences. Plenum Press, New York, pp 311–359 [Google Scholar]
  • [37].Nyland KL, Asparouhov T, Muthén BO (2007) Deciding on the number of classes in latent class analysis and growth mixture modeling: a Monte Carlo simulation study Struct Equ Modeling, 14(4), 535–69 [Google Scholar]
  • [38].Corbella S, Taschieri S, Del Fabbro M, Francetti L, Weinstein R, Ferrazzi E (2016) Adverse pregnancy outcomes and periodontitis: a systematic review and meta-analysis exploring potential association. Quintessence Int, 47(3), 193–204 [DOI] [PubMed] [Google Scholar]
  • [39].Grote NK, Bridge JA, Gavin AR, Melville JL, Iyengar S, Katon WJ (2010) A meta-analysis of depression during pregnancy and the risk of preterm birth, low birth weight, and intrauterine growth restriction. Arch Gen Psychiat, 67(10), 1012–1024 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [40].Kelly RH, Russo J, Holt VL, Danielsen BH, Zatzick DF, Walker E, Katon W (2002) Psychiatric and substance use disorders as risk factors for low birth weight and preterm delivery. Obstet Gynecol, 100(2), 297–304 [DOI] [PubMed] [Google Scholar]
  • [41].Hellström A, Smith LEH, Dammann O (2013) Retinopathy of prematurity. Lancet 382(9902), 1445–1457 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [42].Zhang S, Cardarelli K, Shim R, Ye J, Booker KL, Rust G (2013) Racial disparities in economic and clinical outcomes of pregnancy among Medicaid recipients. Matern Child Healt J, 17(8), 1518–1525 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [43].Chilingerian JA, Sherman HD (2011) Health-care applications: From hospitals to physicians, from productive efficiency to quality frontiers In: Cooper WW, Seiford LM, Zhu J (eds) Handbook on data envelopment analysis. Springer, Berlin, pp 445–490 [Google Scholar]
  • [44].Dziak JJ, Coffman DL, Lanza ST, Li R (2012) Sensitivity and specificity of information criteria. The Methodology at the Pennsylvania State University, Technical Report Series #12–119. https://methodology.psu.edu/media/techreports/12-119.pdf [Google Scholar]
  • [45].Chilingerian JA, Sherman HD (1990) Managing physician efficiency and effectiveness in providing hospital services. Health Serv Manage Res, 3(1), 3–15 [DOI] [PubMed] [Google Scholar]
  • [46].Centers for Disease Control (2017) National Center for Health Statistics, Birthweight and Gestation, https://www.cdc.gov/nchs/fastats/birthweight.htm Accessed 22 February 2018
  • [47].Luo X (2003) Evaluating the profitability and marketability efficiency of large banks: an application of data envelopment analysis. Journal of Business Research, 56(8), 627–635 [Google Scholar]
  • [48].Amado CAF, Dyson RG (2008) On comparing the performance of primary care providers. Eur J Oper Res, 185(3), 915–932 [Google Scholar]