This study evaluates whether thyroidectomy-specific outcomes vary among hospitals, whether the addition of thyroidectomy-specific variables affects risk adjustment, and whether differences in hospital performance are associated with thyroidectomy-specific care processes.
Key Points
Question
Can thyroidectomy-specific outcomes be used for national hospital quality improvement metrics?
Findings
This cohort study detected significant variation in hospital performance after thyroidectomy for hypocalcemia and recurrent laryngeal nerve injury but not for hematoma. Whether thyroidectomy-specific variables were used for risk adjustment did not substantively affect hospital rankings; however, compared with the worst-performing hospitals, the best-performing hospitals more frequently used intraoperative nerve monitoring and energy devices and prescribed oral calcium or vitamin D supplement at discharge.
Meaning
Hypocalcemia and recurrent laryngeal nerve injury, but not hematoma, after thyroidectomy may be used as national hospital quality improvement metrics.
Abstract
Importance
Current surgical quality metrics might be insufficient to fully judge the quality of certain operations because they are not procedure specific. Hypocalcemia, recurrent laryngeal nerve (RLN) injury, and hematoma are considered to be the most relevant outcomes to measure after thyroidectomy. Whether these outcomes can be used as hospital quality metrics is unknown.
Objectives
To evaluate whether thyroidectomy-specific outcomes vary among hospitals, whether the addition of thyroidectomy-specific variables affects risk adjustment, and whether differences in hospital performance are associated with thyroidectomy-specific care processes.
Design, Setting, and Participants
In this retrospective cohort study, patients undergoing thyroidectomies from January 1, 2013, through December 31, 2015, at hospitals participating in the American College of Surgeons’ National Surgical Quality Improvement Program were studied.
Exposure
Thyroidectomy-related care.
Main Outcomes and Measures
Clinically severe hypocalcemia, RLN injury, and clinically significant hematoma within 30 days of thyroid surgery and hospital-level performance variation, change in risk adjustment, and association with processes.
Results
Overall, 14 540 patients (mean [SD] age, 52.1 [15.0] years; 11 499 [79.1%] female) underwent operations at 98 hospitals. Because operations missing thyroidectomy-specific outcomes were excluded, the numbers of operations and hospitals analyzed differed by outcome. Of 14 540 operations included, clinically severe hypocalcemia occurred in 450 patients (3.3% overall, 0.6% after partial, and 4.7% after subtotal or total thyroidectomy), RLN injury in 755 patients (5.7% overall, 4.2% after partial, and 6.6% after subtotal or total thyroidectomy), and hematoma in 175 patients (1.3%). Hospital performance varied for hypocalcemia and RLN injury but not for hematoma. Hospital performance rankings were largely unaffected by the inclusion of thyroidectomy-specific data in risk adjustment. With regard to processes, patients undergoing thyroidectomies at the best-performing vs worst-performing hospitals less frequently had their postoperative parathyroid hormone level measured (593 [19.9%] vs 457 [31.7%], P < .001) and more often were prescribed oral calcium, vitamin D, or both (2281 [76.6%] vs 962 [66.8%], P < .001). When profiled by RLN injury, use of energy devices (1517 [69.1%] vs 507 [55.2%], P < .001) and intraoperative nerve monitoring (1223 [55.7%] vs 346 [37.7%], P < .001) were more prevalent at the best- compared with the worst-performing hospitals.
Conclusions and Relevance
Postoperative hypocalcemia and RLN injury, but not hematoma, potentially could be used as thyroidectomy-specific national hospital quality improvement metrics. Strategies aimed at reducing these complications after thyroidectomy may improve the care of these patients.
Introduction
Outcomes such as mortality and surgical site infections (SSIs) are broadly applicable for surgical quality improvement across all procedures. However, measuring procedure-specific outcomes may be necessary to advance surgical quality improvement in some realms. Recognizing that certain procedures have unique processes and outcomes that deserve more specificity and granularity, the American College of Surgeons’ National Surgical Quality Improvement Program (ACS-NSQIP) augmented its core program with the option to collect procedure-specific data.
More than 70 000 thyroidectomies are performed annually in the United States. Because complications, such as mortality and SSIs, are rare after thyroidectomy, many consider hypocalcemia (ie, hypoparathyroidism), recurrent laryngeal nerve (RLN) injury, and cervical hematoma to be the most relevant complications after thyroidectomy. Permanent hypocalcemia reportedly occurs in up to 3% of patients after their operation, RLN injury in 3% to 11%, and hematoma in 2%. These outcomes are vital in thyroid surgery, having direct implications for postoperative care, resource use, and patient quality of life. Furthermore, certain care processes, such as the use of intraoperative nerve monitoring (IONM) or measuring perioperative serum parathyroid hormone (PTH) levels, are thought by many experts to influence postthyroidectomy outcomes. In January 2013, the ACS-NSQIP launched a thyroidectomy-specific module to support measurement of factors, processes, and outcomes for quality improvement purposes.
It remains unclear whether thyroidectomy-specific outcomes can feasibly be used for hospital performance improvement. Using these novel thyroidectomy-specific data from the ACS-NSQIP, our objectives were to evaluate (1) whether thyroidectomy-specific outcomes vary among hospitals, (2) whether the addition of thyroidectomy-specific variables affects risk adjustment, and (3) whether thyroidectomy-specific processes of care are associated with differences in hospital performance.
Methods
Data Source and Patient Population
This cohort study included thyroidectomies performed at ACS-NSQIP hospitals from January 1, 2013, through December 31, 2015 (eAppendix in the Supplement). The ACS-NSQIP data collection and validation processes have been previously described. In brief, dedicated, trained surgical clinical reviewers collect patient characteristics, operative details, and postoperative outcomes following standardized definitions within 30 days of the index operation irrespective of patient discharge status. Data are obtained from the medical record, through discussions with treating physicians, and, when necessary, by contacting patients directly. Data collected are adherent with the Health Insurance Portability and Accountability Act and are routinely audited. Because this study analyzed preexisting, deidentified data, the Chesapeake Institutional Review Board deemed it exempt from oversight; therefore, no informed consent was necessary.
Thirty-Day Outcomes
Three thyroidectomy-specific outcomes (hypocalcemia, RLN injury, and hematoma) and 3 standard ACS-NSQIP outcomes (morbidity, SSI, and readmission) were studied. Clinically severe postoperative hypocalcemia was deemed to have occurred if the patient was evaluated in the emergency department or in an office setting because of signs and symptoms potentially related to low calcium levels; was admitted to a health care facility for the signs, symptoms, or treatment of low calcium levels; or was prescribed intravenous calcium. Injury or dysfunction of the RLN occurred if there was evidence of persistent hoarseness or vocal cord dysfunction beyond the first postoperative day. This timeframe was chosen to minimize the misclassification of patients who have soreness or hoarseness immediately after extubation from endotracheal anesthesia. Clinically significant postoperative hematoma occurred if a patient developed a cervical hematoma that resulted in increased length of stay, readmission, or intervention (eg, open evacuation, bedside aspiration). Standard ACS-NSQIP outcomes included morbidity, SSI (superficial incisional, deep incisional, or organ/space), and unplanned readmission (eAppendix in the Supplement).
Risk Adjustment Variables
Standard ACS-NSQIP variables (patient demographics, comorbidities, and laboratory values) and thyroidectomy-specific variables were considered for risk adjustment. Thyroidectomy-specific variables included operative indication (single nodule or neoplasm, multinodular goiter, Graves disease, differentiated malignant tumor, poorly differentiated malignant tumor, other malignant tumor, or other indication), presence of thyrotoxicosis, and prior anterior neck surgery.
Available intraoperative and postoperative information was also considered: whether an alternative exposure approach (eg, robotic) was used, whether a central neck dissection was performed, and whether parathyroid autotransplantation was performed. For malignant tumors, multifocality (multifocal and unilateral, multifocal and bilateral) and American Joint Committee on Cancer’s Cancer Staging Manual, seventh edition, pathologic TNM classification were incorporated.
Thyroidectomy-Specific Processes of Care
Because care processes might reflect differences in hospital performance, they were not considered for risk adjustment. Instead, 7 thyroidectomy-specific care processes were examined to generate hypotheses about whether their use was associated with hospital performance differences: (1) preoperative needle biopsy performed, (2) an energy device (eg, LigaSure, Harmonic) used, (3) IONM used, (4) surgical drain placed, (5) serum calcium or (6) PTH levels measured before discharge, and (7) patient given oral calcium or vitamin D replacement during the hospital stay or discharged with these medications.
Statistical Analysis
Continuous variables were compared using the 2-tailed, unpaired t test. Pearson χ² test for association or Fisher exact test was used to compare categorical variables, when appropriate. In accordance with standard ACS-NSQIP processes, missing data were imputed using maximum likelihood estimation. Operations missing thyroidectomy-specific outcomes were excluded from analyses. All tests of statistical significance were 2-sided with α = .05. SAS statistical software, version 9.4 (SAS Institute Inc) was used.
First, to assess whether hospital performance varied by thyroidectomy-specific outcomes, mixed effects regression models were constructed for each outcome using previously reported methods, whereby hospitals acted as random intercepts and risk adjustment variables as fixed effects. A forward selection process with entry set at P < .05 determined the optimal variables with the greatest amount of explanatory power for risk adjustment; both thyroidectomy-specific and standard ACS-NSQIP variables were available for selection. Odds ratios (ORs) with 95% CIs then were generated for each hospital, representing that hospital’s performance relative to the statistically estimated average hospital performing thyroidectomies on the same types of patients. If both the lower and upper bounds of the CIs were less than 1.0, hospitals were better than average. If both the lower and upper bounds of the CIs were greater than 1.0, hospitals were worse than average. The Wald test for the covariance matrix was interpreted to determine whether statistically significant variability existed among hospitals (ie, whether the outcome could theoretically be used to profile hospitals).
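The outlier rule described above can be sketched in a few lines. This is a minimal illustration, not the ACS-NSQIP implementation: the function name `classify_hospital` and the example intervals are hypothetical, and the sketch assumes each hospital's 95% CI for its odds ratio has already been estimated from the mixed effects model.

```python
def classify_hospital(ci_lower, ci_upper):
    """Classify a hospital from the 95% CI of its risk-adjusted odds ratio.

    Hypothetical helper: 'low' means better than average (CI entirely
    below 1.0), 'high' means worse than average (CI entirely above 1.0),
    and every other hospital is 'average'.
    """
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")
    if ci_upper < 1.0:
        return "low"
    if ci_lower > 1.0:
        return "high"
    return "average"


# Illustrative intervals only; these are not values from the study.
examples = {"A": (0.20, 0.85), "B": (0.70, 1.40), "C": (1.10, 3.20)}
for hospital, (lower, upper) in examples.items():
    print(hospital, classify_hospital(lower, upper))
```

Because the rule requires the whole interval to fall on one side of 1.0, a hospital with a wide interval (for example, one with few cases) is classified as average even when its point estimate is far from 1.0.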
Second, to assess whether hospital performance on thyroidectomy-specific outcomes is affected by adding thyroidectomy-specific variables to standard ACS-NSQIP variables for risk adjustment, compared with using standard ACS-NSQIP variables alone, the aforementioned analyses were repeated with only the standard ACS-NSQIP variables available in the forward selection process. Changes in hospital performance attributable to the availability of thyroidectomy-specific variables for risk adjustment were then assessed in 3 ways. First, hospital rankings for each outcome when thyroidectomy-specific variables were and were not available for risk adjustment were plotted, and the Pearson correlation coefficient was computed to reflect the change in hospital performance attributable to the inclusion of thyroidectomy-specific variables for risk adjustment. Second, changes in hospital outlier status for each outcome when thyroidectomy-specific variables were and were not available for risk adjustment were assessed using the Cohen weighted κ. Third, model fit statistics (C statistic, Hosmer-Lemeshow goodness of fit test, and Brier score) were computed and visually compared.
Because thyroidectomy-specific variables also might affect hospital performance on standard ACS-NSQIP outcomes, we repeated these analyses using morbidity, SSI, and readmission as outcomes. In sum, 12 models were constructed (3 thyroidectomy-specific outcomes with thyroidectomy-specific variables and 3 without and 3 standard outcomes with thyroidectomy-specific variables and 3 without).
Third, we compared the frequency of certain processes of care between the best- and worst-performing hospitals to examine whether differences in hospital performance appeared to be associated with certain thyroidectomy-specific care processes. Hospitals were sorted into deciles, such that hospitals in the first decile represented the best-performing hospitals and those in the 10th decile represented the worst. Frequencies of thyroidectomy-specific care processes were compared between these 2 performance deciles.
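The decile sorting described above can be illustrated as follows. This is a sketch under stated assumptions: the text does not say how ties or uneven group sizes were handled, so the rank-based rule below is one plausible choice, and the function name `assign_deciles` and the hospital identifiers are hypothetical.

```python
def assign_deciles(odds_ratios):
    """Map each hospital to a performance decile.

    odds_ratios: dict of hospital id -> risk-adjusted odds ratio.
    Decile 1 holds the lowest odds ratios (best performing) and decile 10
    the highest (worst performing). Assumed rule: rank hospitals by odds
    ratio and split the ranks into 10 near-equal groups.
    """
    ranked = sorted(odds_ratios, key=odds_ratios.get)
    n = len(ranked)
    return {hospital: (rank * 10) // n + 1 for rank, hospital in enumerate(ranked)}


# Hypothetical odds ratios for 20 hospitals (not study data).
hypothetical = {f"H{i:02d}": 0.3 + 0.1 * i for i in range(20)}
deciles = assign_deciles(hypothetical)
best = [h for h, d in deciles.items() if d == 1]
worst = [h for h, d in deciles.items() if d == 10]
print("best decile:", best)
print("worst decile:", worst)
```

Patients treated at hospitals in the first and tenth deciles would then form the two groups whose care-process frequencies are compared.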
Results
Overall, 14 540 patients (mean [SD] age, 52.1 [15.0] years; 11 499 [79.1%] female) underwent operations at 98 hospitals. Because operations missing thyroidectomy-specific outcomes were excluded, the numbers of operations and hospitals analyzed differed by outcome. There were 13 242 operations with complete hypocalcemia data, 13 144 operations with complete RLN injury data, and 13 197 operations with complete hematoma data. These numbers represented 96 hospitals (median hospital case volume, 89.5; interquartile range [IQR], 20-183) profiled by hypocalcemia, 95 hospitals (median case volume, 92; IQR, 20-187) by RLN injury, and 95 (median case volume, 93; IQR, 20-188) by hematoma. Clinically severe hypocalcemia occurred in 450 patients (3.3% overall, 0.6% after partial, and 4.7% after subtotal or total thyroidectomy), RLN injury in 755 patients (5.7% overall, 4.2% after partial, and 6.6% after subtotal or total thyroidectomy), and hematoma in 175 patients (1.3%) (eTable 1 in the Supplement).
Determinants of Outcomes
Significant patient- and thyroidectomy-specific factors associated with clinically severe hypocalcemia, RLN injury, and clinically significant hematoma are given in Table 1. Renal failure and extensive operations were positively associated with hypocalcemia, whereas older age was associated with lower odds. Larger tumors were associated with developing RLN injury. Having an elevated preoperative hematocrit, bleeding disorder, or contaminated wound was associated with postoperative hematomas; no thyroidectomy-specific factors were selected during the model-building process. eTable 2 in the Supplement reports factors associated with thyroidectomy-specific outcomes when thyroidectomy-specific variables were unavailable.
Table 1. Odds Ratios of Fixed Effects Included in Hierarchical Models to Profile Hospitals by Thyroidectomy-Specific Outcomes With Thyroidectomy-Specific Variables Available^a
Variable | Odds Ratio (95% CI) |
---|---|
Hypocalcemia | |
Step 1. Thyroidectomy type | |
Partial | 0.16 (0.11-0.24) |
Total, simple | 1 [Reference] |
Completion | 0.66 (0.40-1.11) |
Total, neck dissection | 1.01 (0.72-1.41) |
Substernal goiter, cervical | 0.79 (0.46-1.36) |
Substernal goiter, thoracic | 0.79 (0.22-2.86) |
Step 2. Alkaline phosphatase level >125 U/L | 2.41 (1.60-3.64) |
Step 3. Age | 0.98 (0.98-0.99) |
Step 4. Renal failure | 4.09 (1.68-9.95) |
Step 5. Systemic inflammatory response syndrome or sepsis | 3.57 (1.49-8.53) |
Step 6. Outpatient | 0.45 (0.35-0.58) |
Step 7. Female | 1.96 (1.45-2.66) |
Step 8. pN classification | |
Not applicable | 1 [Reference] |
Nx | 1.58 (1.19-2.11) |
N0 | 1.33 (0.96-1.85) |
N1 | 1.91 (1.30-2.81) |
Step 9. Not admitted from home | 2.68 (1.14-6.28) |
Step 10. Dyspnea | 1.41 (0.97-2.04) |
Step 11. Central neck dissection performed | 1.31 (1.00-1.72) |
Step 12. Indication | |
Single nodule | 1 [Reference] |
Multinodular goiter | 1.24 (0.92-1.66) |
Graves disease | 1.59 (1.06-2.39) |
Differentiated malignant tumor | 1.18 (0.83-1.68) |
Poorly differentiated or other malignant tumor | 1.34 (0.58-3.09) |
Other | 1.23 (0.63-2.38) |
Step 13. Creatinine concentration >1.0 mg/dL | 1.56 (0.92-2.66) |
Recurrent Laryngeal Nerve Injury | |
Step 1. pT classification | |
Not applicable | 1 [Reference] |
Tx | 0.97 (0.45-2.08) |
T0 | 1.74 (0.77-3.93) |
T1 | 1.00 (0.56-1.79) |
T2 | 0.99 (0.52-1.91) |
T3 | 1.20 (0.64-2.24) |
T4 | 2.63 (1.20-5.74) |
Step 2. Race | |
White | 1 [Reference] |
Asian | 1.56 (1.09-2.23) |
African American | 1.45 (1.16-1.81) |
Other or unknown | 1.24 (0.90-1.71) |
Step 3. Age | 1.02 (1.01-1.02) |
Step 4. Indication | |
Single nodule | 1 [Reference] |
Multinodular goiter | 1.16 (0.94-1.43) |
Graves disease | 0.96 (0.65-1.41) |
Differentiated malignant tumor | 1.32 (1.01-1.73) |
Poorly differentiated or other malignant tumor | 0.65 (0.29-1.47) |
Other | 1.48 (0.95-2.31) |
Step 5. Thyroidectomy type | |
Partial | 1 [Reference] |
Total (or subtotal), simple | 1.45 (1.17-1.79) |
Completion | 1.59 (1.11-2.28) |
Total (or subtotal), neck dissection | 1.31 (0.94-1.82) |
Substernal goiter, cervical | 2.15 (1.49-3.10) |
Substernal goiter, thoracic | 1.78 (0.66-4.80) |
Step 6. Thyrotoxicosis | 1.45 (1.08-1.95) |
Step 7. Hispanic ethnicity | |
No | 1 [Reference] |
Yes | 1.14 (0.79-1.65) |
Unknown | 0.83 (0.58-1.18) |
Step 8. pM classification | |
Not applicable | 1 [Reference] |
pMx/M0 | 1.03 (0.79-1.35) |
pM1 | 2.02 (0.85-4.82) |
Step 9. Central neck dissection performed | 0.94 (0.74-1.19) |
Step 10. pN classification | |
Not applicable | 1 [Reference] |
pNx | 1.22 (0.69-2.16) |
pN0 | 1.15 (0.63-2.08) |
pN1 | 1.88 (0.99-3.55) |
Hematoma | |
Step 1. Outpatient | 0.44 (0.31-0.63) |
Step 2. American Society of Anesthesiologists class | |
1-2 | 1 [Reference] |
3 | 1.33 (0.95-1.87) |
≥4 | 1.96 (0.84-4.55) |
Step 3. Race | |
White | 1 [Reference] |
Asian | 1.82 (0.90-3.69) |
African American | 2.04 (1.40-2.95) |
Other or unknown | 1.34 (0.82-2.18) |
Step 4. Hematocrit >45% | 2.02 (1.28-3.20) |
Step 5. Age | 1.01 (1.00-1.02) |
Step 6. Bleeding disorder^b | 2.48 (1.10-5.62) |
Step 7. Wound classification | |
Clean or clean contaminated | 1 [Reference] |
Contaminated or dirty | 4.43 (1.01-19.46) |
SI conversion factors: to convert alkaline phosphatase to microkatals per liter, multiply by 0.0167; creatinine to micromoles per liter, multiply by 88.4; and hematocrit to proportion of 1.0, multiply by 0.01.
^a Age is a continuous variable. The CIs account for clustering of patients and outcomes within hospitals. Steps represent the point when a variable was selected in a logistic, forward stepwise regression process (entrance criteria, P < .05). All standard American College of Surgeons’ National Surgical Quality Improvement Program and thyroidectomy-specific variables were available for selection. Selected variables were included as fixed effects for risk adjustment when evaluating hospital performance for each outcome.
^b A preoperative bleeding disorder was present in patients with an underlying hematologic disorder or receiving long-term anticoagulation treatment.
Hospital Performance
Using thyroidectomy-specific variables, hospital performance varied for hypocalcemia and RLN injury, as reflected by outliers in Figure 1A and B and Table 2. When profiled by hypocalcemia, 4 hospitals were low and 7 were high outliers (median risk-adjusted OR, 0.95; range, 0.29-6.42) (Figure 1A). When profiled by RLN injury, 8 hospitals were low and 14 were high outliers (median risk-adjusted OR, 0.98; range, 0.16-18.2) (Figure 1B). No significant variability was found in between-hospital performance for hematoma (Figure 1C).
Table 2. Changes in Hospital Outlier Status by Outcome With and Without Thyroidectomy-Specific Variables Available for Risk Adjustment for 6 Outcomes^a
With Thyroidectomy-Specific Variables | Without Thyroidectomy-Specific Variables | | | | | | | | |
---|---|---|---|---|---|---|---|---|---|
 | Hypocalcemia (n = 96) | | | RLN Injury (n = 95) | | | Hematoma (n = 95) | | |
 | Low | Average | High | Low | Average | High | Low | Average | High |
Low performance | 3 | 1 | 0 | 7 | 1 | 0 | 0 | 0 | 0 |
Average performance | 0 | 82 | 3 | 1 | 70 | 2 | 0 | 95 | 0 |
High performance | 0 | 0 | 7 | 0 | 0 | 14 | 0 | 0 | 0 |
κ (95% CI)^b | 0.82 (0.65-0.99) | | | 0.90 (0.80-1.00) | | | NA | | |
 | Morbidity (n = 98) | | | SSI (n = 98) | | | Readmission (n = 98) | | |
 | Low | Average | High | Low | Average | High | Low | Average | High |
Low performance | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Average performance | 0 | 96 | 0 | 0 | 97 | 0 | 0 | 98 | 0 |
High performance | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
κ (95% CI)^b | 1.00 (1.00-1.00) | | | NA | | | NA | | |
Abbreviations: NA, not applicable; RLN, recurrent laryngeal nerve; SSI, surgical site infection.
^a Changes in hospital outlier status when thyroidectomy-specific variables were (with) and were not (without) included for risk adjustment. Rows give status with and columns give status without thyroidectomy-specific variables. Hospital outlier status represents statistically better (low) or worse (high) performance relative to the average hospital using the 95% CI for each hospital. Hospitals with 95% CIs completely below 1.0 represent low outliers (better performance), hospitals with 95% CIs completely above 1.0 represent high outliers (worse performance), and all other hospitals were designated average.
^b κ statistics with 95% CIs were computed where possible.
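As a worked check on the κ values in Table 2, the sketch below computes a Cohen weighted κ from the hypocalcemia and RLN injury cross-tabulations. The weighting scheme is an assumption: the text does not state whether linear or quadratic weights were used, and linear weights are used here. The function name `linear_weighted_kappa` is hypothetical, and the CIs are not reproduced.

```python
def linear_weighted_kappa(table):
    """Cohen weighted kappa with linear weights for a square agreement table.

    table[i][j] = number of hospitals in category i with and category j
    without thyroidectomy-specific variables (order: low, average, high).
    Returns None when kappa is undefined (no expected disagreement).
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    if n == 0:
        return None
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Disagreement is weighted by how many categories apart the two ratings are.
    observed = sum(abs(i - j) * table[i][j] for i in range(k) for j in range(k))
    expected = sum(
        abs(i - j) * row_totals[i] * col_totals[j] / n
        for i in range(k)
        for j in range(k)
    )
    if expected == 0:
        return None
    return 1 - observed / expected


# Counts copied from Table 2.
hypocalcemia = [[3, 1, 0], [0, 82, 3], [0, 0, 7]]
rln_injury = [[7, 1, 0], [1, 70, 2], [0, 0, 14]]
hematoma = [[0, 0, 0], [0, 95, 0], [0, 0, 0]]

print(f"{linear_weighted_kappa(hypocalcemia):.2f}")  # 0.82
print(f"{linear_weighted_kappa(rln_injury):.2f}")    # 0.90
print(linear_weighted_kappa(hematoma))               # None (undefined)
```

Under the linear-weight assumption, the point estimates agree with the tabulated 0.82 and 0.90, and the hematoma table yields no estimate because every hospital was classified as average under both models.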
With use of thyroidectomy-specific variables, performance among hospitals also varied for standard ACS-NSQIP outcomes of morbidity (Figure 1D) and SSI (Figure 1E), although there were still few outliers (Table 2). No between-hospital variability for readmissions (Figure 1F) was detected.
Thyroidectomy-Specific Variables in Risk Adjustment
Figure 2 depicts the result of profiling hospitals with and without thyroidectomy-specific variables available for risk adjustment across all outcomes studied. Hospital rankings were largely unaltered across the outcomes; Pearson correlation coefficients ranged from 0.99 to 1.00 (all P < .001), reflecting near-perfect correlation.
Table 2 also indicates how the availability of thyroidectomy-specific variables for risk adjustment influenced hospital outlier status. Of the 6 outcomes, outlier status changed only for hypocalcemia and RLN injury; for the other outcomes, thyroidectomy-specific adjusters had no influence. Status changed for only 4 hospitals when profiled by hypocalcemia and 4 when profiled by RLN injury, and in no case did a hospital move farther than the adjacent class. For example, when profiled by hypocalcemia without thyroidectomy-specific variables available for risk adjustment, 10 hospitals were deemed to be high outliers; however, when thyroidectomy-specific variables were available, 3 of these hospitals became average, whereas 7 remained high. No hospitals changed from high to low or low to high. Agreement in hospital outlier status was near complete (weighted κ range, 0.82-1.00, when available). Inclusion of thyroidectomy-specific variables had minimal influence on model fit statistics (eTable 3 in the Supplement).
Association of Care Processes With Hospital Performance
To explore whether certain thyroidectomy-specific care processes were associated with differences in hospital performance, Table 3 compares care processes for patients undergoing thyroid surgery at hospitals in the highest decile (worst performing) with those in the lowest decile (best performing). Because no hospital variability was detected for hematomas, care processes were not explored for this outcome. These associations are not assumed to be causal and were examined only to generate hypotheses.
Table 3. Associations Between Care Processes and Hospital Performance^a
Process Metric | Hypocalcemia | | | RLN Injury | | |
---|---|---|---|---|---|---|
 | Best-Performing Hospitals (n = 2979) | Worst-Performing Hospitals (n = 1441) | P Value | Best-Performing Hospitals (n = 2194) | Worst-Performing Hospitals (n = 919) | P Value |
Preoperative needle biopsy performed | 2290 (76.9) | 1013 (70.3) | <.001 | 1586 (72.3) | 632 (68.8) | .05 |
Energy device used | 1771 (59.5) | 928 (64.4) | .002 | 1517 (69.1) | 507 (55.2) | <.001 |
Intraoperative nerve monitoring used | 1190 (40.0) | 542 (37.6) | .14 | 1223 (55.7) | 346 (37.7) | <.001 |
Surgical drain used | 458 (15.4) | 292 (20.3) | <.001 | 510 (23.3) | 172 (18.7) | .005 |
Postoperative calcium level checked | 2114 (71.0) | 986 (68.4) | .09 | 1435 (65.4) | 648 (70.5) | .006 |
Postoperative PTH level checked | 593 (19.9) | 457 (31.7) | <.001 | 462 (21.1) | 383 (41.7) | <.001 |
Postoperative calcium and/or vitamin D prescribed | 2281 (76.6) | 962 (66.8) | <.001 | 1281 (58.4) | 572 (62.2) | .05 |
Abbreviations: PTH, parathyroid hormone; RLN, recurrent laryngeal nerve.
^a Data represent the number (percentage) of patients who experienced the care process at the best-performing and worst-performing hospitals for each outcome.
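The comparisons in Table 3 are 2 × 2 tests of proportions, which can be sketched with a Pearson χ² test. This is an illustration under assumptions: no continuity correction is applied (the text does not say whether one was used), the function name `pearson_chi_square_2x2` is hypothetical, and the P value uses the closed form for 1 degree of freedom, so results may differ slightly from the published values.

```python
import math


def pearson_chi_square_2x2(events_a, total_a, events_b, total_b):
    """Pearson chi-square test (no continuity correction) for two proportions.

    Returns (chi-square statistic, 2-sided P value) with 1 degree of freedom.
    """
    a, b = events_a, total_a - events_a
    c, d = events_b, total_b - events_b
    n = total_a + total_b
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero; the test is undefined")
    chi_square = n * (a * d - b * c) ** 2 / denominator
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Counts copied from Table 3 (best-performing vs worst-performing hospitals).
comparisons = {
    "IONM used, profiled by RLN injury": (1223, 2194, 346, 919),
    "PTH level checked, profiled by hypocalcemia": (593, 2979, 457, 1441),
    "Calcium level checked, profiled by hypocalcemia": (2114, 2979, 986, 1441),
}
for label, counts in comparisons.items():
    chi_square, p_value = pearson_chi_square_2x2(*counts)
    print(f"{label}: chi-square = {chi_square:.2f}, P = {p_value:.3g}")
```

The first two comparisons fall well below P = .001 and the third does not reach the α = .05 threshold, in line with the direction of the reported results.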
With regard to hospital performance on postoperative hypocalcemia, differences were not associated with whether postoperative calcium levels were measured (best-performing vs worst-performing hospitals, 71.0% vs 68.4%; P = .09). However, patients who underwent thyroidectomies at the best-performing hospitals less frequently had their postoperative PTH level measured compared with those who underwent their thyroidectomies at the worst-performing hospitals (593 [19.9%] vs 457 [31.7%], P < .001). Patients who underwent thyroidectomies at the best-performing hospitals were more often prescribed postoperative calcium, vitamin D, or both compared with those at the worst-performing hospitals (2281 [76.6%] vs 962 [66.8%], P < .001).
With respect to RLN injury, IONM (1223 [55.7%] vs 346 [37.7%], P < .001) and energy devices (1517 [69.1%] vs 507 [55.2%], P < .001) were more frequently used at the best-performing hospitals compared with the worst.
Discussion
The first objective of this study was to evaluate whether thyroidectomy-specific outcomes vary among hospitals. Current quality metrics might be insufficient because they are not adequately procedure specific. Using thyroidectomy-specific ACS-NSQIP data, we detected significant hospital performance differences for clinically severe postoperative hypocalcemia and RLN injury but not for clinically significant cervical hematoma. When hospitals were evaluated based on standard ACS-NSQIP outcomes, hospital performance varied significantly for morbidity and SSI (although with few outliers) but not for readmission. These results suggest that continued quality improvement efforts that target hypocalcemia, RLN injury, morbidity, and SSI after thyroidectomy are warranted.
The second objective was to assess whether adding thyroidectomy-specific variables to standard ACS-NSQIP variables for risk adjustment affects hospital rankings. Risk adjustment helps ensure that fair comparisons are made among hospitals by accounting for immutable patient factors outside the surgeon’s control and improves insight into improvement targets. No substantive changes in hospital rankings were found when thyroidectomy-specific data were included in the risk adjustment process. The thyroidectomy-specific data, as currently implemented, did not provide additional statistical explanatory power beyond variables currently routinely collected in the ACS-NSQIP for hospital-level performance measurement. Therefore, risk adjustment using currently collected ACS-NSQIP data is acceptable when comparing hospitals by thyroidectomy-specific and standard outcomes, but potential for improvement remains. New thyroidectomy-specific variables with more explanatory power are needed (eg, prior anterior neck irradiation, concomitant parathyroidectomy for primary hyperparathyroidism), or current thyroidectomy-specific variables require revisions (eg, assess outcomes beyond 30 days).
The third objective was to evaluate whether potential processes of care were associated with hospital performance differences in thyroidectomy-specific outcomes, to highlight potential actionable items for improvement and to generate insight and hypotheses to inform future work, not to establish causality. The best-performing hospitals in 30-day postoperative hypocalcemia more often prescribed oral calcium, vitamin D, or both and less often measured postoperative PTH levels before discharge. Some surgeons routinely prescribe oral supplements, whereas others favor a selective approach using postoperative levels of serum calcium, PTH, or both. These data suggest that more frequently prescribing oral supplementation is associated with decreased incidence of clinically severe hypocalcemia (ie, cases resulting in clinic or emergency department evaluations or readmission). A previous study found that routine calcium and vitamin D supplementation is cost-effective because it decreases health care resource use, whereas another study argued that selective supplementation is equally cost-effective. Removal of underlying biases from these results is difficult. High surgeon confidence in the operative quality may have contributed to fewer postoperative laboratory measurements. Alternatively, routinely prescribing oral supplementation may allow surgeons to circumvent the need to measure PTH levels. Given the lack of consensus on the optimal approach, future studies are needed to better define the association between supplementation and hypocalcemia.
When profiled by risk-adjusted RLN injury occurrences, the best-performing hospitals—or, more precisely, surgeons at these hospitals—more frequently used energy devices and IONM compared with the worst-performing hospitals. Benefits of IONM include improved identification of RLN anatomy and immediate intraoperative assessment of RLN function. However, whether the use of IONM is associated with decreased rates of RLN injury compared with direct visualization or whether intraoperative signal loss indicates postoperative vocal cord dysfunction remains controversial. In this study, the use of IONM was associated with improved hospital performance. Data on whether the RLN was directly visualized intraoperatively and whether an IONM signal was present at the end of the operation were unavailable and thus could not be assessed. These and other data might be needed to better understand these findings.
Surgeons at hospitals with the lowest aggregate (ie, hospital) rates of RLN injury more frequently used energy devices. Although it might be counterintuitive that energy device use is protective, bias could exist. The effect of energy devices on outcomes remains ambiguous. Hospitals with surgeons who more commonly use IONM might simply have more resources and thus more availability of energy devices. Alternately, perhaps surgeons who on average have superior outcomes are more attuned to these technology investments and their limitations. Care processes such as those studied are generally more attributable to the surgeons, and higher-quality hospitals may reflect higher-quality surgeons. These data could not directly examine how individual surgeon care practices and performance affect overall hospital performance on these outcomes. Those insights should be pursued in future work.
The ACS-NSQIP tracks outcomes up to 30 days postoperatively. Although this follow-up period is appropriate for many surgical quality metrics, certain procedure-specific outcomes might require longer periods of follow-up to be appropriately assessed. This issue of follow-up horizon is particularly salient for thyroidectomies because hypocalcemia and RLN injury can be transient with only a subgroup becoming permanent. Methodologically, low rates of permanent hypocalcemia and RLN injury would likely prevent reliable comparisons in performance. Moreover, assessing the validity of the outcome involves more than the chosen follow-up period. For instance, our definition of hypocalcemia represents clinically severe circumstances resulting in increased health care resource use. Although a 30-day assessment of hypocalcemia may appear on the surface to be incomplete, increased health care resource use related to hypocalcemia of any duration is important to measure from quality improvement and societal perspectives.
Limitations
Other limitations must be recognized. This study’s definition of RLN injury is more inclusive than the criterion standard of directly examining postoperative vocal cord function, and our results might represent overestimations. The current data were the first iteration of thyroidectomy-specific data collection in the ACS-NSQIP, and a relatively small cohort of hospitals (approximately 13%) participated. Therefore, the generalizability of our findings to non–ACS-NSQIP hospitals might be limited. Many hospitals in this cohort were academic institutions with high thyroidectomy volumes. Although this mitigates concerns about the influence of unmeasured volume-outcome variables, homogeneity might explain the lack of detectable performance variation in postoperative hematomas. However, we can expect wider performance variation for hypocalcemia and RLN injury with the inclusion of additional, less expert hospitals, suggesting that thyroidectomy-specific outcomes are more important to measure for quality improvement purposes. In the ACS-NSQIP, collected data and variable definitions are continuously reviewed and revised with input from participants and experts. Subsequent iterations of thyroidectomy-specific data collection rely on analyses such as these to make improvements.
Conclusions
Short-term, clinically severe hypocalcemia and RLN injury, but not hematoma, after thyroidectomy may be valuable quality improvement metrics. The detectable performance differences suggest that the ACS-NSQIP may be well suited for hospital benchmarking using thyroidectomy-specific outcomes. Current ACS-NSQIP data used for risk adjustment provide adequate distinctions of quality when profiling hospitals by thyroidectomy-specific and traditional outcomes. Routine prescription of calcium and vitamin D supplementation and use of energy devices and IONM were associated with differences in hospital performance, warranting further investigation.