Annals of Surgery. 2006 May;243(5):636–644. doi: 10.1097/01.sla.0000216508.95556.cc

National Surgical Quality Improvement Program (NSQIP) Risk Factors Can Be Used to Validate American Society of Anesthesiologists Physical Status Classification (ASA PS) Levels

Daniel L Davenport, Edwin A Bowe, William G Henderson, Shukri F Khuri, Robert M Mentzer Jr
PMCID: PMC1570549  PMID: 16632998

Abstract

Objective:

The purpose of this study was to determine the relationship between the American Society of Anesthesiologists’ Physical Status (ASA PS) classifications and the other National Surgical Quality Improvement Program (NSQIP) preoperative risk factors.

Background:

The ASA PS has been shown to predict morbidity and mortality in surgical patients but is inconsistently applied and clinically imprecise. It is desirable to have a method for validating ASA PS classification levels.

Methods:

The NSQIP preoperative risk factors, including ASA PS, were recorded from a random sample of 5878 surgical patients on 6 services between October 1, 2001 and September 30, 2003 at the University of Kentucky Medical Center. Mortality, morbidity, costs, and length of stay were obtained and compared across ASA PS levels. The ability of 1) ASA PS alone, 2) the other NSQIP risk factors, and, 3) all factors combined to predict outcomes was analyzed. A model using the other NSQIP risk factors was developed to predict ASA PS.

Results:

ASA PS alone was a strong predictor of outcomes (P < 0.01). However, the other NSQIP risk factors were better predictors as a group. There was significant interdependence between the ASA PS and the other NSQIP risk factors. Predictions of ASA PS using the other factors showed strong agreement with the anesthesiologists’ assignments.

Conclusions:

The NSQIP risk factors other than ASA PS can and should be used to validate ASA PS classifications.


The American Society of Anesthesiologists' Physical Status Classification (ASA PS) is known to be a strong predictor of surgical outcomes. However, the ASA PS is limited because it lacks clinical precision and is inconsistently applied by different anesthesiologists. The National Surgical Quality Improvement Program tracks 60 preoperative risk factors, including the ASA PS. This study demonstrates that the other 59 risk factors are effective in predicting ASA PS and therefore can be used to validate ASA PS levels by hospital or anesthesiologist.

The American Society of Anesthesiologists' Physical Status Classification (ASA PS) has been shown to be a significant predictor of morbidity and mortality in surgical patients.1,2 It has been used by clinicians,3 hospital and outpatient surgery administrators,4 government policymakers,5 and researchers6 to classify the severity of illness of surgical patients and to evaluate the risk related to surgical outcomes. This recognition of the effectiveness of the ASA PS as a risk adjustor coincides with a need expressed by some researchers for a limited set of clinical factors that can be added inexpensively to administrative data sets to improve their risk-adjusting capability.7 Because of its predictive power and ubiquity, the ASA PS may present itself as one of these factors.

ASA PS was first proposed by Drs. Saklad, Taylor, and Rovenstine in 1941 as a method for categorizing surgical patients for study8 and developed essentially into its current form under recommendation by Dr. Dripps and colleagues in 1961.9 It has undergone slight modification by the ASA and is now a standard element of the anesthesiologist's preoperative assessment of surgical patients worldwide.10 The goal of the system is to assess the overall physical status of the patient prior to surgery and not to assess surgical risk per se because it neglects the impact of the surgery itself on the patient's outcomes.11 The definitions of the 6 physical status classes are relatively straightforward and are shown in Table 1. 12

TABLE 1. Definitions of American Society of Anesthesiologists' Physical Status (ASA PS) Class Levels


The simplicity of these definitions is both the classification system's strength and weakness. It is easily applied and communicated, which makes it practical to use; however, ASA PS lacks specificity, which leads to inconsistent13,14 ratings between anesthesiologists and imprecise clinical interpretation.15 Indeed, the lack of precision in such a widely used measure has resulted in numerous inquiries to the ASA16 and may have prompted the following statement found on the ASA web site12:

“These definitions appear in each annual edition of the ASA Relative Value Guide. There is no additional information that will help you further define these categories.”

If risk adjustment using ASA PS becomes more financially or clinically important in comparing healthcare providers, the lack of precision and poor interrater consistency may lead to inaccuracy. Worse, it could provide opportunity for artificial inflation and “gaming” of risk-adjusting systems.

Given the weaknesses and potential abuses of the ASA PS, it is important to develop a more objective method for validating and understanding the ASA PS. The NSQIP database, because of the robustness of its 60 prospectively tracked clinical preoperative risk variables (including the ASA PS), is well suited to the task. The Department of Surgery at the University of Kentucky Medical Center (UKMC) was one of the alpha sites to implement NSQIP in a non-VA hospital. With several years of data from 6 surgical specialties, the Surgery Department had sufficient information to examine the ASA PS with regard to other surgical risk factors and to outcomes. The purpose of this study, therefore, was 2-fold: 1) to examine the relationship between the ASA PS levels and the other NSQIP risk factors in predicting surgical outcomes, and 2) to determine whether the other NSQIP risk factors can be used to predict ASA PS levels and thereby validate them.

METHODS

As part of the NSQIP, the UKMC Department of Surgery conducted a random sample of major surgical procedures performed by its General Surgery, Neurosurgery, Orthopaedics, Plastic Surgery, Thoracic Surgery, and Vascular Surgery services between October 1, 2001 and September 30, 2003. The NSQIP methodology in the private sector has been published elsewhere.12,17,18 Briefly, the NSQIP includes major surgery patients as defined by those procedures having general, epidural, or spinal anesthesia along with some monitored anesthetic care procedures. The NSQIP excludes the primary admission related to trauma and patients younger than 17 years. The program also limits the number of certain low-morbidity, high-volume procedures such as hernia repairs, lumpectomies, and transurethral prostatectomies to 5 operations in every 8-day cycle. Patient selection at UKMC was randomized by taking the first 70 patients from the operating room log every 8 days that matched the inclusion criteria. Using an 8-day cycle ensured that a different daily operating room schedule was included as the majority of cases in consecutive cycles.
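The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record keys `eligible` and `high_volume` are hypothetical, the cap of 5 is treated as a single combined limit on all capped procedure types within a cycle, and the first cycle is assumed to start on October 1, 2001. None of these details is specified in the text.

```python
from datetime import date, timedelta

CYCLE_DAYS = 8         # from the text: one sampling cycle every 8 days
CASES_PER_CYCLE = 70   # from the text: first 70 eligible patients per cycle
HIGH_VOLUME_CAP = 5    # from the text: cap on low-morbidity, high-volume procedures


def cycle_starts(first_day, n_cycles):
    """Start dates of consecutive 8-day cycles."""
    return [first_day + timedelta(days=CYCLE_DAYS * k) for k in range(n_cycles)]


def select_cycle(or_log, cases_per_cycle=CASES_PER_CYCLE, cap=HIGH_VOLUME_CAP):
    """Take eligible cases in operating room log order, capping high-volume procedures.

    or_log is a list of dicts with the hypothetical keys
    'eligible' (bool) and 'high_volume' (bool).
    """
    selected = []
    high_volume_count = 0
    for case in or_log:
        if len(selected) == cases_per_cycle:
            break
        if not case["eligible"]:
            continue
        if case["high_volume"]:
            if high_volume_count == cap:
                continue  # cap reached: skip further high-volume cases
            high_volume_count += 1
        selected.append(case)
    return selected


# Assumed start date; each 8-day cycle begins one weekday later than the last.
starts = cycle_starts(date(2001, 10, 1), 7)
start_weekdays = [d.weekday() for d in starts]  # 0 = Monday ... 6 = Sunday
```

Because 8 is one more than the length of a week, seven consecutive cycle starts fall on seven different weekdays, which is how an 8-day cycle rotates the sample through the weekly operating room schedule.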

For each patient, a clinical nurse recorded 60 preoperative risk factors, 18 intraoperative factors, and 22 postoperative complications, including death, for 30 days postoperatively. Information after discharge was obtained through hospital and clinic medical document review as well as follow-up contact by letter and phone. Morbidity was defined as a patient having 1 or more of the 21 specific NSQIP postoperative complications. Inpatient hospital variable direct costs were obtained from the hospital cost accounting system, TSI (Eclipsys Technologies Corporation, Boca Raton, FL).

The strength of the associations between ASA PS and the surgical outcomes of 30-day morbidity and mortality, hospital costs, and length of stay was analyzed. Next, the predictive power for these same outcomes was measured for 1) ASA PS alone, 2) the other NSQIP preoperative risk factors without ASA PS, and 3) a combination of ASA PS and all the other NSQIP risk factors. Finally, the strength of correlation between ASA PS and the other risk factors was assessed, and the correlating risk factors were used to develop a model to predict ASA PS.

Statistical Analyses

The strength of the associations between ASA PS and 30-day morbidity and mortality, hospital costs, and length of stay was assessed using the appropriate χ2 or analysis of variance (ANOVA) test. Logistic regressions for morbidity and mortality and linear regressions for costs and length of stay were performed. Costs and length of stay were transformed by taking the natural logarithm before regressing against ASA PS. In the logistic regression models, the χ2 likelihood ratio test was used to assess model significance and the c-index to assess model explanatory power. P values for the increase in c-indices were calculated using the Hanley and McNeil method.19 In the linear regression models, ANOVA was used to assess model significance and adjusted R2 to assess model explanatory power. Kendall's Tau-b nonparametric correlations were calculated between the ASA PS and the other risk factors. Finally, ordinal regression was performed using the other NSQIP risk factors to predict ASA PS, and the strength of the prediction was measured by the agreement between predicted and assigned values.
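To make the c-index concrete, the following minimal Python sketch computes it as a concordance fraction and attaches a standard error using the approximation published by Hanley and McNeil in 1982. The toy scores and outcomes are hypothetical, and the text does not state which Hanley and McNeil formula was applied. Comparing two c-indices estimated on the same patients would also require the correlation between them, which is not shown here.

```python
import math


def c_index(scores, outcomes):
    """Fraction of (event, non-event) pairs in which the event case has the
    higher score; tied scores count as one half."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def hanley_mcneil_se(auc, n_events, n_non_events):
    """Standard error of a single c-index (Hanley and McNeil 1982 approximation)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_events - 1) * (q1 - auc ** 2)
        + (n_non_events - 1) * (q2 - auc ** 2)
    ) / (n_events * n_non_events)
    return math.sqrt(variance)


# Hypothetical toy data: ASA PS level used as the score, death as the outcome.
asa_levels = [1, 2, 2, 3, 4, 5]
died = [0, 0, 1, 0, 1, 1]
toy_c = c_index(asa_levels, died)
toy_se = hanley_mcneil_se(toy_c, n_events=3, n_non_events=3)
```

A c-index of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of patients with and without the outcome.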

RESULTS

A total of 5878 cases were available for analysis. The distribution by the 6 surgical services is listed in Table 2. The distribution by service reflects the NSQIP exclusion of pediatric, trauma, and “minor” surgery cases along with the relative case load of the services at UKMC. The sample represented 34% of all procedures performed by those services at UKMC during that period. Data collection for Plastic Surgery began in January 2002 and for Thoracic Surgery in February 2002, which reduced slightly their representation in this 2-year sample.

TABLE 2. Sample Size by Service


ASA PS Distribution and Association With Surgical Outcomes

The median ASA PS level in this sample was level II; only 4.2% of patients were assigned to ASA PS level IV or V. The distribution of ASA PS levels along with the associated mortality rates, morbidity rates, mean costs, and mean lengths of stay are shown in Table 3. For example, 997 patients (16.7%) were assigned to ASA PS I. These patients had no deaths, 1.7% morbidity, mean variable direct costs of $1986, and a mean length of stay of 1.8 days. This compares to level V patients, who had 70% mortality, 40% morbidity, mean costs of $22,889, and mean length of stay of 16.7 days. There was a significant relationship between ASA PS and each of these 4 outcomes.

TABLE 3. Mortality Rate, Morbidity Rate, Mean Variable Direct Costs, and Mean Length of Stay by ASA PS Level


Relationship Between ASA PS and the Other NSQIP Risk Factors

A comparison of the power of ASA PS alone versus the other NSQIP preoperative risk variables in predicting outcomes is shown in Table 4. The other NSQIP risk variables are stronger predictors of each outcome than ASA PS alone. Adding ASA PS back to the other NSQIP variables increased predictive power, but the increase was modest. This indicated strong interdependence between ASA PS and the other NSQIP preoperative risk variables in predicting outcomes. This interdependence was further demonstrated by the observation that the ASA PS correlated significantly with 57 of the other 59 NSQIP preoperative risk factors (P < 0.01). The risk factors that did not correlate with ASA PS were minority status (P = 0.057) and high hematocrit (P = 0.463).
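For readers unfamiliar with Kendall's Tau-b, the short Python sketch below shows how the statistic handles the many ties that arise when a 5-level scale is correlated with a yes/no risk factor. The data are hypothetical and the function is a plain pairwise implementation for illustration, not the software used in the study.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's Tau-b: (C - D) / sqrt((C + D + Tx) * (C + D + Ty)), where
    C and D are concordant and discordant pairs, Tx are pairs tied only on x,
    and Ty are pairs tied only on y."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: ignored by Tau-b
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denominator = math.sqrt(
        (concordant + discordant + tied_x_only)
        * (concordant + discordant + tied_y_only)
    )
    return (concordant - discordant) / denominator


# Hypothetical toy data: ASA PS level and presence (1) or absence (0) of a risk factor.
asa_levels = [1, 2, 2, 3]
risk_factor = [0, 0, 1, 1]
tau = kendall_tau_b(asa_levels, risk_factor)
```

The function is undefined (division by zero) when either variable is constant, which is an accepted limitation of this sketch.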

TABLE 4. A Comparison of the Power of ASA PS Alone, the Other NSQIP Preoperative Risk Variables, and All Factors Combined in Predicting Outcomes: Regression Results


NSQIP Risk Factors Influencing ASA PS Levels

Ordinal regression of the other NSQIP risk factors produced a model that showed a good statistical fit in predicting ASA PS (P < 0.001). The risk factors that most influenced assigned ASA PS levels were age, current smoking, history of hypertension, morbid obesity, preoperative coma, dyspnea, preoperative impaired sensorium, low hematocrit, and history of a previous cardiac operation. The overall influence of a risk factor on ASA PS depended on its incidence rate and the likelihood that when it occurred there was an increase in ASA PS level (odds ratio). The 20 most influential risk factors in predicting ASA PS from the ordinal regression model are shown in Table 5 with their associated incidence rates and odds ratios. They are shown in descending order of odds ratio. For instance, preoperative coma occurred in 0.2% of these patients; and when it occurred, the patient was 18 times more likely to have a higher ASA PS than an equivalent patient without preoperative coma.
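The meaning of an odds ratio for a higher ASA PS level can be illustrated with a proportional-odds sketch in Python. The odds ratio of 18 for preoperative coma is taken from the text; the cutpoints, the baseline patient, and the choice of the proportional-odds form are assumptions made for illustration only, since the fitted model's thresholds and link function are not reported here.

```python
import math


def class_probabilities(eta, cutpoints):
    """Probabilities of levels 1..K under a proportional-odds model, where
    P(level <= j) = 1 / (1 + exp(-(cutpoint_j - eta)))."""
    cumulative = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints] + [1.0]
    probabilities = []
    previous = 0.0
    for cum in cumulative:
        probabilities.append(cum - previous)
        previous = cum
    return probabilities


def odds_of_higher(probabilities, level):
    """Odds that the level is above `level` (levels are numbered from 1)."""
    p_higher = sum(probabilities[level:])
    return p_higher / (1.0 - p_higher)


# Hypothetical thresholds between levels I|II, II|III, III|IV, IV|V.
CUTPOINTS = [-1.5, 1.0, 3.5, 6.0]
BETA_COMA = math.log(18.0)  # coefficient implied by the odds ratio of 18 in the text

baseline = class_probabilities(0.0, CUTPOINTS)         # hypothetical reference patient
with_coma = class_probabilities(BETA_COMA, CUTPOINTS)  # same patient with coma
ratio_above_level_2 = odds_of_higher(with_coma, 2) / odds_of_higher(baseline, 2)
```

Under this model form the ratio of odds is the same at every threshold, so the odds of being above level II, above level III, and so on are all multiplied by 18 for the patient with coma.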

TABLE 5. The 20 Most Influential Risk Factors in Predicting ASA PS, Incidence Rates, and Odds Ratios for a Higher ASA PS Level


Anesthesiologists’ Assigned ASA PS Levels Versus NSQIP Predicted Levels

The NSQIP risk factor predictions of ASA PS agreed with the anesthesiologists’ assignments in 67.6% of the cases, were within one level in 99.1% of the cases, and disagreed by 2 or more levels in only 0.9% of the cases. The differences between predicted and assigned ASA PS levels are shown in Figure 1 and are normally distributed around agreement (difference = 0) with no apparent skew or significant outliers. Table 6 shows agreement by level. For example, of the patients who were predicted as level I, 344 (72.1%) were assigned a level I, 129 (27.0%) were assigned a level II, 4 (0.8%) were assigned to level III, and none were assigned to level IV or V. By contrast, for level V predictions, only 3 (18.8%) were assigned level V, 10 (62.5%) were assigned level IV, 2 (12.5%) were assigned level III, and 1 (6.3%) was assigned level II. In general, the NSQIP risk factors tended to upgrade assigned level I's to II's and tended to downgrade assigned level IV's to III's and level V's to IV's and III's.
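The agreement figures can be reproduced from a cross-tabulation of predicted versus assigned levels. The Python sketch below recomputes the row percentages for predicted level I from the counts quoted above and shows how overall agreement and the kappa statistic follow from such a table. The 2 x 2 table used for kappa is hypothetical, because the full cross-tabulation is not reproduced in the text.

```python
def row_percentages(counts):
    """Percentage of a predicted-level row falling in each assigned level."""
    total = sum(counts)
    return [100.0 * c / total for c in counts]


def agreement_and_kappa(table):
    """table[i][j] = number of patients predicted level i and assigned level j.
    Returns (observed agreement, Cohen's kappa)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Counts quoted in the text: assigned levels I to V among patients predicted level I.
predicted_level_1 = [344, 129, 4, 0, 0]
pct_level_1 = row_percentages(predicted_level_1)

# Hypothetical 2 x 2 table, used only to demonstrate the kappa calculation.
toy_table = [[40, 10], [5, 45]]
toy_agreement, toy_kappa = agreement_and_kappa(toy_table)
```

Kappa discounts the agreement that would be expected by chance given the marginal totals, which is why it is lower than the raw percentage agreement.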


FIGURE 1. The NSQIP risk factors other than ASA PS were used to predict ASA PS using an ordinal regression model. The agreement between the predictions and the anesthesiologist assignments is shown as the distribution of the difference between them for 5878 patients. The difference terms are normally distributed around agreement (0). Predictions disagreed by more than one level in only 0.9% of the patients.

TABLE 6. The Number of Patients Categorized by the NSQIP Risk Factor Predicted ASA PS Class Versus Anesthesiologist Assignments (Agreement = 67.6%, Kappa Statistic = 42.8%)


DISCUSSION

Considerable controversy exists regarding the use of ASA PS as a risk adjustor for surgical outcomes. Investigators of the ASA PS as a clinical measure have repeatedly noted its poor interrater consistency. Reported levels of agreement between the ASA PS classes of different anesthesiologists reviewing standard patients on paper vary from 40% to 60%.13 In other words, they are about as likely to disagree as to agree on the particular ASA class for a patient. Haynes and Lawler concluded that ASA class alone cannot satisfactorily describe the physical status of a patient because of the marked inconsistency of ratings.20 Owens et al concluded that, as a measure, it is useful but suffers from lack of scientific precision.15 Mack et al suggested that there is a need for a new, more precise scoring system,13 and Lema16 concluded, in a brief review of ASA PS, that it has little meaningful clinical application in today's practice of anesthesia.

These criticisms of the inconsistency and imprecision of the ASA PS classification system appear to conflict with the observations that: 1) ASA PS does provide useful information between anesthesiologists in the care of their patients,13 and 2) there are numerous studies that highlight the effectiveness of ASA PS as a risk adjustor.21–25 The NSQIP results1,2 from 44 Veterans Administration hospitals showed ASA PS to be one of the top 10 predictors (out of a possible 60 preoperative risk factors) for morbidity and mortality in 8 separate surgical specialty models and the second highest predictor in the all-services-combined models. The recent American College of Surgeons NSQIP report (unpublished) from 18 private sector hospitals shows ASA PS to be the single strongest predictor of morbidity and mortality for combined general and vascular surgery patients. One can only conclude that, imprecise and inconsistent as it may be, asking the anesthesiologist to summarize the preoperative physical status of the patient along a 5-point scale anchored by “healthy” at one end and “moribund” at the other provides a strong predictor of surgical outcomes.

This strong predictive utility of the ASA PS may be problematic in the current environment where government and other payers are looking for simple clinical measures with which to risk-adjust hospital and provider performance. ASA PS could be proposed as a single strong predictor of surgical outcomes, but it carries a high potential for inaccuracy because of its imprecision and inconsistency. For instance, if a hospital's outcomes were risk-adjusted using ASA PS and then compared with those of a competitor that listed erroneously inflated ASA PS values, the former hospital would be at a disadvantage. Because of the potential financial and patient access impact of erroneous assessments of quality, a method for validating ASA PS levels is needed. The results of this study suggest that the clinically robust NSQIP data set is such a method.

In this study, ASA PS was indeed shown to be a single strong predictor of outcomes; however, the other NSQIP risk variables without ASA PS were better predictors than ASA PS alone. Adding ASA PS to the other risk factors did not significantly improve mortality prediction and only modestly improved prediction of morbidity, costs, and length of stay. These findings demonstrate the power of the other NSQIP risk factors in predicting outcomes and the interdependence between ASA PS and the other risk factors. This interaction was also demonstrated by the significant correlations between ASA PS and almost all of the other risk factors. This intercorrelation would be expected between a summary measure seeking to describe the overall physical status of the patient and individual, more objective risk factors. These risk factors form the basis for development of a model with which to predict and evaluate ASA PS classification levels.

The agreement between the NSQIP model-predicted and anesthesiologist-assigned ASA PS levels in this study is better than that previously published. In only 0.9% of the cases at UKMC did the predicted level differ from the assigned level by more than one level. The disagreement that did occur could very well have been due to differences in assignments by different anesthesiologists at UKMC and illustrates the potential benefit of evaluating ASA PS classification levels using the other risk factors.

The percentage disagreement between the NSQIP predictions and the assigned ASA PS values was greatest at the low and high levels. This may be an artifact of the statistical modeling, which tends to be more accurate around the mean occurrence, but may also reflect real differences between model and anesthesiologist classifications. In practice, there is little difference to the anesthesiologist between level I and level II patients and therefore little incentive to reflect mild preoperative conditions in a level II. We hypothesize therefore that the modeled predictions include multiple minor clinical conditions which are insufficient to motivate a level II classification by the anesthesiologist in some cases. These conditions may, however, reflect “mild systemic disease” that justifies a level II classification. This would represent an area where the model could improve upon the anesthesiologists’ assignments.

With respect to the downgrading of some of the high-level anesthesiologist classifications, we hypothesize the opposite. That is, the anesthesiologist is able to judge the overall impact of acute conditions on the likelihood of death more accurately than the model. Our data support this to some extent: the mortality rate among anesthesiologist-assigned level V patients is 70%, while the rate for the model-predicted level V patients is 56%. The higher level classifications are relatively rare, however, and though this weakness in the model should be noted, it has a relatively small impact on the overall ability to assess ASA PS levels.

Limitations and Further Research

Data for this study were analyzed for a single site, so the conclusions need to be confirmed by a larger study with participation from multiple institutions before they can be generalized. The inclusion of a random sample of all major surgical cases from 6 different surgical specialties somewhat ameliorated this limitation. However, the benefits of an ASA PS assessment model with which to evaluate hospital or provider ASA PS levels will need to come from a broad multisite study.

CONCLUSION

The clinical redundancy in the robust NSQIP risk factors does allow for modeling and validation of ASA PS. With policy makers’ current need for clinical variables to risk adjust surgical outcomes, the ASA PS offers predictive power and ubiquity. It appears to indicate acute clinical states of patients especially well at levels IV and V. However, it is important to acknowledge its imprecision and inconsistency and to recognize that use of the NSQIP can ensure unbiased risk adjustment of surgical outcomes among surgeons and hospitals.

ACKNOWLEDGMENTS

The authors thank the dedicated and skilled nurse coordinators, Ms. Mary Beth Rice and Ms. Devauna Riley, who performed data collection.

Discussions

Dr. Lewis M. Flint, Jr. (Tampa, Florida): This report is another contribution from Mr. Davenport and his colleagues which seeks to precisely quantify the relationship between risk scoring and outcome. In the coming era of “pay for performance,” precision in this area will be vitally important as policy makers begin to produce and make public data regarding risk-adjusted outcomes.

In Florida, our governor mandated that the trauma centers which comprise the in-hospital patient care capability of the state trauma system go to a “pay for performance” methodology to qualify for state funding. The most convenient methodology, from the policy makers’ standpoint, was to use data from the statewide hospital discharge and financial database.

Dr. Joseph Tepas, a member of this Association, produced compelling data to show that this hospital discharge data, though readily available, was flawed. His work has gone far to convince our state that state trauma registry data, validated by comparisons with the National Trauma Data Bank, is a much more reliable means of arriving at risk-adjusted outcomes that provide a meaningful performance indicator.

Work by many in the trauma field has shown that combinations of anatomic, historical, and physiologic variables supply superior risk adjustment. The Veterans Administration quality initiatives also demonstrated this fact, and this has led to the NSQIP data set that is now in use.

The biggest problem for these data sets is the cost of maintaining them. For example, to enter and validate the trauma registry data for my level 1 trauma center, which admits about 3000 injured patients annually, requires four full-time equivalent nurse clinicians who gather and validate the data and 2.5 full-time equivalent certified registrars to enter the data and maintain the registry.

The main point that Mr. Davenport and his colleagues make is that clinicians assigning risk scores from a clinical “snapshot” of a patient at an isolated point in time are not likely to be as dependable as experienced risk assignment nurses working with combinations of data points which include physiologic data. This parallels, exactly, our experience in the trauma field.

Now that we know this, what do we do with it? As has been amply demonstrated, government policy makers are likely to prefer a single score. How will we respond when one of these well-meaning individuals points out that your data show that the most likely area where scores are incorrectly assigned is in the less sick cohort? They will go on to point out that doctors assigning the ASA PS are very likely to overscore the sickest patients. They will conclude that the ASA PS, if anything, will work to make the doctors and the healthcare facility look better not worse. How will we respond to that?

I have the following additional questions: Will you recommend a basic data set for each doctor and each hospital that will be required for participation in the “pay for performance” program? What will be the size of the data set? Finally, how much time, effort, and cost will be required to maintain it?

Dr. Daniel L. Davenport (Lexington, Kentucky): In regard to our response to the well-meaning policy maker who suggests using ASA PS alone because it is “good enough,” this study clearly indicates that it has significant weaknesses and needs to be validated by more objective risk factors. In terms of the underscoring of level 5 patients by the model, this will need to be validated in a multisite study. It is important to note, however, that level 5 patients are moribund and represent a small percentage of the patient population. Therefore, this finding doesn't change the overall conclusions regarding validation of ASA class.

In response to the question regarding which data set to use for physician or site “pay for performance,” our experience with the NSQIP suggests that the sample size is too limited for analysis at the physician level. For this reason, we would not recommend physician-level application. Physician-level application ignores the reality that patient care is provided through a complex healthcare center.

With regard to evaluating hospitals, the American College of Surgeons and our single-site experience have shown that systems like the NSQIP can be applied throughout the country. Since the top 10 or so risk factors have been fairly consistent, it may be possible to reduce the size of the data set and provide effective risk adjustment at the national level. If this were done, however, it would still be necessary to have a subgroup of centers that would continue to review and revise a larger data set as populations and practice patterns change.

With respect to costs, the value of the NSQIP can be found in two areas. First, as noted in the presentation and documented in the manuscript, there is strong association between costs and the preoperative state of the patient. It is our belief that, if acted upon, the clinical data available in the NSQIP will allow for accurate targeting of clinical process improvements that will ultimately result in the system paying for itself.

Secondly, in a national “pay for performance” environment, there is significant marketing and public relations value to a hospital avoiding an erroneous label as an outlier site for mortality. There is also value nationally to CMS in not limiting access by steering patients away from hospitals that are erroneously labeled as poor performers. In these contexts, the cost of using clinically robust versus administrative data sets is more than justified.

Dr. J. Bradley Aust (San Antonio, Texas): The authors have presented a careful statistical analysis of the value of the ASA classification for evaluation of surgical risk. They have compared it to a composite of preoperative values collected in a NSQIP database from their primary hospital. Let me point out the differences between the two variables.

The ASA PS classification is a scaled classification on the severity of illness that the patient has. And while it is very simple, examples of the cases as to how they fit into the classification are helpful.

Class I is a healthy patient. Class II has one systemic illness which is controlled. Class III has multiple systemic illnesses which are controlled. Class IV is a serious disease, which in our environment represents a patient who is not going to live 6 months without improving their health status. This would include cardiac failure, cirrhosis of the liver with ascites, or advanced cancer. Class V is a moribund patient who is not expected to live unless the operation is successful. Class IV and V account for less than 5% of all the cases that are evaluated.

On the other side, the NSQIP composite represents primarily discrete observations. The patient is in coma or isn't in coma, has COPD or doesn't have COPD, has hypertension or doesn't have hypertension. There are very few scaled components; functional status is one of these.

The value of the ASA has been questioned because of its interrater inconsistency. The scaling from 1 to 5 is based on overall anesthetic risk. The risk factors are included in the NSQIP database but not scaled. It is not surprising that the collection of 57 or 60 variables would equal or exceed the accuracy of the ASA in risk evaluations since they are additive parts of the same database. The surprising fact to me is that the simple relatively subjective evaluation, namely, ASA, is almost as strong a predictor as the composite, thereby allowing us to calculate risk by simplified formulas.

The paper also suggests a technique for validating ASA across groups, hospitals, by comparing the estimates of the composite of preoperative variables with the ASA evaluations from the same groups. This evaluation would curb the perceived effect of gaming the ASA values to distort potential operative risk for any group.

Consider IV and V classes of ASA. These are going to be very difficult to game because there are only 5% of the people that are going to fall in these classifications. A group that starts coding 6% to 7% of their population as class IV and V surely is suspect for gaming.

Gaming is counterproductive. Is it desirable for a group to upgrade to any class? Hardly. The overall mortality is unchanged. Putting lesser-risk patients into a higher class dilutes the higher class and lowers the mortality of the higher class but increases the mortality of the lower class. This exposes a higher mortality for lower-risk patients, flagging the lower groups for further scrutiny. Quality control is primarily concerned with ensuring that healthy patients have low mortality. Upgrading patients from these lower classes increases their percent of deaths. This is a bad consequence of upward gaming to reduce percent mortality in class IV and V. Of interest, too, is that in your own database, the IV and V classes were better evaluated by the ASA class than they were by the composite.
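The dilution arithmetic in this argument can be checked with a small Python sketch. All counts are hypothetical, and the sketch assumes that the upgraded patients are lower-risk than the rest of their original class (here, none of them die); the effect on the lower class depends on that assumption.

```python
def mortality_rate(deaths, patients):
    """Deaths divided by patients for one class."""
    return deaths / patients


def upgrade(lower, higher, moved_patients, moved_deaths):
    """Move patients, and any deaths among them, from the lower class to the
    higher class. Each class is a (deaths, patients) tuple."""
    new_lower = (lower[0] - moved_deaths, lower[1] - moved_patients)
    new_higher = (higher[0] + moved_deaths, higher[1] + moved_patients)
    return new_lower, new_higher


# Hypothetical classes: 10 deaths in 1000 patients (1.0%) and 30 deaths in 100 patients (30.0%).
lower, higher = (10, 1000), (30, 100)

# Assumption: the 100 upgraded patients are the healthiest of the lower class, so none die.
new_lower, new_higher = upgrade(lower, higher, moved_patients=100, moved_deaths=0)

overall_before = mortality_rate(lower[0] + higher[0], lower[1] + higher[1])
overall_after = mortality_rate(new_lower[0] + new_higher[0], new_lower[1] + new_higher[1])
```

With these numbers the higher class falls from 30% to 15% mortality, the lower class rises from 1.0% to about 1.1%, and overall mortality is unchanged.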

I note that 12 to 14 of the first high scoring variables relate to respiratory, neurologic, endocrine, and cardiac factors. Clearly, these factors are dominant as anesthesia risk factors. Age and albumin, the other two major components of surgical risk, are lower in the listed ranking, suggesting their contribution to ASA risk lies outside the ASA evaluation. I would be interested in knowing whether emergency or cancer surgery, the other main components of operative risk, were considered among preoperative variables since both are preoperative evaluations.

In summary, the authors are to be congratulated for this careful analysis of the value and potential limitation of ASA in surgical risk evaluation and for suggesting a mechanism for ASA validation using a composite of NSQIP Risk Factors. But don't throw out the ASA yet.

Dr. Daniel L. Davenport (Lexington, Kentucky): Yes, the emergency status of the operation and disseminated cancer were included in the risk factors that were used to model ASA class levels in the study. Although they were not among the top 20 predictors of ASA class, they, along with almost all of the other risk factors that were tracked, did influence ASA class levels. This is not surprising since the ASA class is a summary measure of the physical status.

Dr. Michael J. Zinner (Boston, Massachusetts): I want to congratulate Drs. Mentzer, Khuri and Henderson for the VA NSQIP development and bringing it to the private sector. We have admired this effort and also been early adopters of this program in our private sector.

In that transition from VA to private sector, I have a couple of concerns. One of them has to do with the risk model. In the current private application of NSQIP, ASA turns out to be the highest predictor of morbidity and mortality in the risk-adjusted model. We looked at our own data and became concerned because it is clear that the anesthesiologists who are doing the classification have no vested interest in the outcome. That is, they do not use ASA for any quality measure, for any billing measure, for any outside measure at all. So they only use it as a rough guide and don't have feedback for what they put down.

We looked at our Partners System and unblinded our system hospitals. In Partners, there are five hospitals: two downtown tertiary care hospitals, the MGH and the Brigham, and three smaller community hospitals. When we first unblinded it, the highest ASA ratings (ie, sickest patients) were from the smallest community hospital in the system. That clearly was a problem.

We then looked back at our own data and did an internal audit at the Brigham. It showed that we had a 17% error rate between a senior anesthesiologist looking over the record and the one recorded on the day of operation and a 20% error rate between the preoperative testing center visit and the day of surgery. Interestingly, our data had a lower error rate than yours but still a high error rate. In your abstract, one third of the patients had disagreement on the ASA.
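As an illustration of how an audit figure of this kind might be computed, the Python sketch below compares two sets of ASA PS assignments for the same patients and reports the raw disagreement rate alongside Cohen's kappa, a chance-corrected agreement statistic. The ratings are hypothetical, and kappa is offered only as one common choice; the discussant does not say which statistic the Brigham audit used.

```python
from collections import Counter

def disagreement_rate(ratings_a, ratings_b):
    """Fraction of patients whose two ASA PS assignments differ."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    differing = sum(a != b for a, b in zip(ratings_a, ratings_b))
    return differing / len(ratings_a)

def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa: agreement corrected for chance."""
    n = len(ratings_a)
    observed = 1 - disagreement_rate(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement: product of each rater's marginal proportions, summed over classes.
    expected = sum(counts_a[c] * counts_b[c] for c in set(counts_a) | set(counts_b)) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical class throughout
    return (observed - expected) / (1 - expected)

# Hypothetical ASA PS classes for ten patients (not data from any audit).
day_of_surgery = [1, 2, 2, 3, 3, 3, 4, 2, 3, 2]
senior_review = [1, 2, 3, 3, 3, 2, 4, 2, 3, 2]

print(f"disagreement rate: {disagreement_rate(day_of_surgery, senior_review):.0%}")
print(f"Cohen's kappa: {cohen_kappa(day_of_surgery, senior_review):.2f}")
```

Because ASA PS classes are ordered, a weighted kappa that penalizes a two-level disagreement more than a one-level disagreement may be a better fit in practice; the unweighted form is shown here only because it is the simplest.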

So my concern remains that we are using an imprecise measure as one of the most important predictors of risk-adjusted outcome. I shared these concerns with the Executive Committee of the NSQIP and made the recommendation that they consider (after the NSQIP is rolled out) a “risk model committee” that could be used on an ongoing basis as a kind of CQI for risk modeling in the NSQIP.

As an additional comment, over the last year, Dr. Atul Gawande and I (from our institution) have been developing and validating what would be the equivalent of an intraoperative APGAR score. We hope to be able to use this as a quantitative measure of what outcomes could be expected from simple operating room measures to take out the subjectivity and add some objective precision in the operating room.

Finally, my question to you is, now that you have reviewed this group of patients, when you take this information back to your anesthesiologists, what do they say or what did they do to change their ASA scores going forward?

Dr. Daniel L. Davenport (Lexington, Kentucky): Dr. Bowe, who is one of the co-authors on this paper, is the chairman of the Department of Anesthesiology at the University of Kentucky. We did not collect data regarding specific anesthesiologists in our data, so we were not able to show individuals how they compared. Our objective with this study was to demonstrate the ability of the NSQIP to evaluate ASA class in response to the concerns you raise regarding the problem of interrater reliability. As we mentioned, this issue has clearly been identified as a problem in the literature.

Dr. Matthew M. Hutter (Boston, Massachusetts): I would like to congratulate the authors for examining this interesting question about the validity of using ASA classifications. I would also like to congratulate you on your use of such technically demanding statistical models which include non-parametric regression.

I am from the Massachusetts General Hospital, and we at the Partners hospitals in the Boston area, when comparing our data between the community and academic hospitals, came upon some serious concerns about ASA classifications. Surgeons would like variables in a risk-adjusted model to be as objective as possible, and when a variable like ASA class, which is perceived by some to be subjective, shows up as the number one predictive factor, this causes great concern. ASA class is perceived to be subjective and you show that it adds little if anything to the key index of the overall model. Is this something you think that we should be using for base validity?

Dr. Daniel L. Davenport (Lexington, Kentucky): The ASA class is one of 60 risk factors that are tracked by the NSQIP. The modeling is repeated now in the ACS NSQIP on a 6-month basis using a rolling 2-year time period of data. In these models, ASA class keeps showing up at the top of the list as a major predictor of outcomes.

There is a significant amount of clinical redundancy in the NSQIP risk factors as it pertains to risk adjustment. So, to keep or drop the ASA class is a good question. I think the information it provides is useful, especially for the level IV and V patients, and that in the context of a robust clinical information system like NSQIP, the risks due to the subjective nature of the measure can be curtailed. Certainly, using it alone or in a smaller subset of data increases the risk of biased evaluations, and we believe it would be inadvisable to do so.

Dr. Michael L. Hawkins (Augusta, Georgia): I think perhaps we are comparing apples and oranges. We are comparing a straightforward, simple evaluation of surgical risk done before the surgery with a well-defined, extensive, retrospective analysis of risk that occurred when all the data are gathered.

To me, the ASA value potentially, although it is certainly not achieved all the time, would be an anesthesiologist's recognition that this patient is at significant risk and maybe the preoperative/intraoperative care would be done somewhat differently if their risk is assigned to be higher.

Secondly, I am curious, I guess on my own personal bias, why you excluded trauma in your analysis and whether or not you included or excluded other emergency cases, or were all these cases elective, scheduled surgeries?

Dr. Daniel L. Davenport (Lexington, Kentucky): In terms of trauma, the NSQIP started in the VA; and because of the limited size of the trauma population, it was excluded early on. When the ACS NSQIP was implemented in the private sector, trauma patients were excluded because of the added complexity of this patient population and the lack of comparative data from the historical NSQIP. So our study, following the NSQIP protocol, did not include trauma patients.

I agree with you that the ASA class is subjective in nature compared to the other more objective risk factors. They are all obtained preoperatively and the ASA class is a summary measure of the physical status of the patient by the anesthesiologist. So it is a bit apples and oranges. But, as was shown in one of our first slides, it does provide additional clinical information that impacts estimation of outcomes.

Dr. R. Scott Jones (Charlottesville, Virginia): I would just like to make a brief comment in that this particular issue has been under active consideration by the Steering Committee of the ACS NSQIP.

And as a follow-up to Dr. Zinner's comments, we have had active communication with that group about this matter last Tuesday. We discussed it in some detail, and have developed a Modeling Committee, which will include an external biostatistician to look at this particular matter: ASA on the risk adjustment calculation as well as other matters. So this is being attended to and will be analyzed and handled in a biostatistically and scientifically correct manner.

Footnotes

Supported in part by a grant from the Agency for Healthcare Research and Quality through the American College of Surgeons under the direction of the National Surgical Quality Improvement Program.

Reprints: Daniel L. Davenport, PhD, Office of Decision Support, Department of Surgery, University of Kentucky, 800 Rose Street, Lexington, KY 40536-0298. E-mail: dldave0@email.uky.edu.

REFERENCES

  • 1. Khuri SF, Daley J, Henderson WG, et al. Risk adjustment of the postoperative mortality rate for the comparative assessment of the quality of surgical care: results of the National Veterans Affairs Surgical Risk Study. J Am Coll Surg. 1997;185:325–338.
  • 2. Daley J, Khuri SF, Henderson WG, et al. Risk adjustment of the postoperative morbidity rate for the comparative assessment of the quality of surgical care: results of the National Veterans Affairs Surgical Risk Study. J Am Coll Surg. 1997;185:339–352.
  • 3. Mantilla CB, Horlocker TT, Schroeder DR, et al. Risk factors for clinically relevant pulmonary embolism and deep venous thrombosis in patients undergoing primary hip or knee arthroplasty. Anesthesiology. 2003;99:552–560.
  • 4. Texas Department of Health. 24. 3.2. 5 Day Surgery Services. In: 2001 Texas Medicaid Provider Procedures Manual. Texas Department of Health, 2001:20–24.
  • 5. Turpin SD. Summary of 2001 State Legislative and Regulatory Activities, ASA Newsletter December 65(12) accessed online at http://www.asahq.org/Newsletters/2001/12_01/turpin.htm on October 12, 2005.
  • 6. Han KR, Kim HL, Pantuck AJ, et al. Use of American Society of Anesthesiologists Physical Status Classification to assess perioperative risk in patients undergoing radical nephrectomy for renal cell carcinoma. Urol. 2004;63:841–846;discussion 846–847.
  • 7. Gordon HS, Johnson ML, Wray NP, et al. Mortality after noncardiac surgery: prediction from administrative versus clinical data. Med Care. 2005;43:159–167.
  • 8. Saklad M. Grading of patients for surgical procedures. Anesthesiology. 1941;2:281–284.
  • 9. Dripps RD, Lamont A, Eckenhoff JE. The role of anesthesia in surgical mortality. JAMA. 1961;178:261–266.
  • 10. Miller RD, ed. Miller's Anesthesia. Philadelphia: Elsevier Churchill Livingstone, 2005.
  • 11. Owens WD. American Society of Anesthesiologists Physical Status classification system is not a risk classification system [correspondence]. Anesthesiology. 2001;94:378.
  • 12. American Society of Anesthesiologists, ASA Physical Status Classification System, Accessed at http://www.asahq.org/clinical/physicalstatus.htm on September 21, 2005.
  • 13. Mak PH, Campbell RC, Irwin MG. The ASA Physical Status Classification: inter-observer consistency. Anaesth Intensive Care. 2002;30:633–640.
  • 14. Haynes SR, Lawler PGP. An assessment of the consistency of ASA Physical Status Classification allocation. Anaesthesia. 1995;50:195.
  • 15. Owens WD, Felts JA, Spitznagel EL Jr. ASA Physical Status Classifications: a study of consistency of ratings. Anesthesiology. 1978;49:239–243.
  • 16. Lema MJ. Using the ASA Physical Status Classification may be risky business. ASA Newsletter: Ventilations 2002;66(9) accessed online at http://www.asahq.org/Newsletters/2002/9_02/vent_0902.htm on October 12, 2005.
  • 17. Fink AS, Campbell DA, Mentzer RM, et al. The National Surgical Quality Improvement Program in non-Veterans Administration hospitals: initial demonstration of feasibility. Ann Surg 2002;236:344–353;discussion 353–354.
  • 18. Davenport DL, Henderson WG, Khuri SF, et al. Preoperative risk factors and surgical complexity are more predictive of costs than postoperative complications: a case study using the National Surgical Quality Improvement Program (NSQIP) database. Ann Surg. 2005;242:463–471.
  • 19. Hanley JA, McNeil BJ. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology. 1983;148:839–843.
  • 20. Haynes SR, Lawler PGP. An assessment of the consistency of ASA Physical Status Classification allocation. Anaesthesia. 1995;50:195.
  • 21. Reich DL, Hossain S, Krol M, et al. Predictors of hypotension after induction of general anesthesia. Anesth Analg. 2005;101:622–628.
  • 22. Brock MV, Kim MP, Hooker CM, et al. Pulmonary resection in octogenarians with stage I nonsmall cell lung cancer: a 22-year experience. Ann Thorac Surg. 2004;77:271–277.
  • 23. Castellano P, López-Escámez JA. American Society of Anesthesiology Classification may predict severe post-tonsillectomy haemorrhage in children. J Otolaryngol. 2003;32:302–307.
  • 24. Manku K, Bacchetti P, Leung JM. Prognostic significance of postoperative in-hospital complications in elderly patients: long-term survival. Anesth Analg. 2003;96:583–589.
  • 25. Giannice R, Foti E, Poerio A, et al. Perioperative morbidity and mortality in elderly gynecological oncological patients (>/= 70 years) by the American Society of Anesthesiologists physical status classes. Ann Surg Oncol. 2004;11:219–225.

Articles from Annals of Surgery are provided here courtesy of Lippincott, Williams, and Wilkins
