Abstract
Objective:
To assess the appropriateness of hysterectomies performed at a large tertiary health system using the 1997 RAND appropriateness classification system and an updated algorithm.
Design:
We abstracted structured and unstructured data from electronic medical records on patient demographics, primary indication(s) for hysterectomy, diagnosis codes associated with the hysterectomy, previous treatments, and laboratory results.
Subjects:
Patients aged 18–44 years.
Exposure:
Receipt of hysterectomy for benign and nonobstetric conditions from October 2014 to December 2017.
Main Outcome Measures:
Using these data, we provided a RAND-based (dichotomous: inappropriate/appropriate) and Wright-based (3-level: inappropriate/ambiguous/appropriate) appropriateness rating and characterized missing information patterns associated with inappropriate ratings.
Results:
We analyzed 1,829 hysterectomies across 30 nonmutually exclusive primary indications for surgery. Nearly a third (32.8%) of surgeries had only one primary indication for surgery. Using the RAND-based classifier, 31.3% of hysterectomies were rated as appropriate and 68.7% as inappropriate. Using the Wright-based algorithm, 58.1% of hysterectomies were rated as appropriate, 15.7% as ambiguous, and 26.2% as inappropriate. Missing information on diagnostic procedures was the most common characteristic related to both RAND-based (46.1%) and Wright-based (51.2%) inappropriate ratings.
Conclusions:
The 1997 RAND classification lacked guidance for several contemporary indications, including gender-affirming care. RAND also has an outdated requirement for diagnostic surgeries such as laparoscopies, which have decreased in practice as diagnostic imaging has improved. Sensitivity analyses suggest that inappropriate surgeries cannot all be attributed to bias from missing electronic medical record data. Accurately documenting care delivery for benign gynecological conditions is key to ensuring quality and equity in gynecological care.
Keywords: hysterectomy, gynecology, fibroids, women’s health, benign, abnormal uterine bleeding, endometriosis, quality metrics, prophylactic hysterectomy, quality improvement
Introduction
Benign gynecological conditions are prevalent diseases and disorders that affect one or more gynecological organs. While benign conditions typically do not result in mortality, they can cause serious symptoms and greatly affect quality of life.1 Treatment of benign gynecological conditions is nuanced, guided by both the degree to which the patient is bothered by the symptoms and the tolerability of available interventions.2–6 In addition, as many of the symptoms can resolve with menopause, estimated time until menopause is an important consideration. Therefore, treatment plans are often individualized per patient preference.7–10
At least 400,000 hysterectomies, the leading treatment for benign gynecological conditions, occur annually in the United States.11,12 In the past few decades, gynecological medicine has introduced multiple treatment options as alternatives to hysterectomy that vary in degree of invasiveness, efficacy, and side effects.4,5,10,13–16 Hysterectomy can provide the most definitive resolution of symptoms for conditions such as uterine leiomyomas (fibroids); however, the procedure is invasive, permanently ends patients’ ability to carry a child, and may negatively affect long-term cardiovascular health.17–22
Due to the invasiveness and permanence of hysterectomy, the complexity of decision-making for benign gynecological conditions, including the limitations of hysterectomy treatment in addressing the multifactorial sources of pain for conditions such as chronic pelvic pain,23,24 and the history of overutilization of sterilization, especially among marginalized populations,25,26 it is important to ensure that hysterectomy is performed appropriately. However, defining appropriateness of hysterectomy is challenging. Patient preferences, individual patient and clinician characteristics, hospital resources, knowledge and acceptability of alternative treatment options, and structural factors, such as insurance access and affordability, geographic access to a skilled surgical provider, or access to postoperative support, can all affect treatment decisions.
In 1997, the RAND Corporation published guidance for classifying the appropriateness of hysterectomy performance. This guidance was based on an expert panel’s review of indications that might be seen in a typical clinical practice.27–29 In addition to indication, the expert panel considered case profiles defined by patient age, symptoms, medical or family history, and results of previous diagnostic tests and procedures. Specifically, the authors described their recommendations as follows, “a recommendation of ‘inappropriate’ represents the unanimous agreement of the expert panel that it is wrong to perform hysterectomy in the usual patient with the condition. All other women are considered possible candidates for hysterectomy. The category of possible candidate for hysterectomy is not meant to imply that hysterectomy is indicated in these women, merely it may be a treatment option.”29
However, since the RAND publications, gynecological care has evolved. Clinical and scientific knowledge about benign gynecological conditions has progressed, with less invasive and fertility-sparing alternatives to hysterectomy becoming more available. Thus, the objectives of this study are to apply the RAND appropriateness classification system in a contemporary clinical context and compare it to an updated appropriateness algorithm that incorporates new indications for hysterectomy, changes in presurgery diagnostic procedure requirements, and changes to clinical care of gynecological symptoms.
Materials and Methods
Data for this study were obtained from the Carolina Hysterectomy Cohort (CHC), a case series of patients who received a hysterectomy for benign gynecological conditions at a large not-for-profit health system in North Carolina between October 2, 2014, and December 31, 2017. CHC includes structured data from electronic medical records (EMR) and unstructured free text manually abstracted from EMR notes. We used Current Procedural Terminology (CPT), International Classification of Diseases-9 Clinical Modification (ICD-9-CM), and International Classification of Diseases-10 Procedure Coding System (ICD-10-PCS) procedure codes30 (Supplementary Table S1) to identify patients who had a hysterectomy at 18–44 years of age (n = 2,205). We excluded patients with pregnancy or cancer (breast, uterine, ovarian, or colorectal) ICD-9-CM or ICD-10-CM diagnosis codes at the time of surgery (n = 331; Supplementary Table S2); those incarcerated at the time of surgery (n = 28); nonresidents of North Carolina (n = 13); and patients with incomplete abstraction records (n = 4). The final analytic sample for this study included 1,829 patients.
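The cohort flow described above can be checked with simple arithmetic. The following is a minimal sketch in Python using only the counts reported in the text; the dictionary keys are illustrative labels rather than variable names from the CHC dataset, and the sketch assumes the exclusion groups were counted without overlap.

```python
# Cohort flow check using the counts reported in the text.
# Assumption: the exclusion groups do not overlap.
identified = 2205  # hysterectomies identified at ages 18-44 years

exclusions = {
    "pregnancy_or_cancer_codes": 331,
    "incarcerated_at_surgery": 28,
    "nonresident_of_north_carolina": 13,
    "incomplete_abstraction_record": 4,
}

analytic_sample = identified - sum(exclusions.values())
print(analytic_sample)  # 1829
```

Under that assumption, the four exclusion counts sum to 376 and reproduce the reported analytic sample of 1,829.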
The primary indication for surgery was abstracted from preoperative notes in the EMR. The primary indications for surgery are listed in Table 1. A patient could have more than one primary indication for surgery.
Table 1.
Primary Indications for Surgery in Carolina Hysterectomy Cohort for 1,829 Noncancerous and Nonobstetric Hysterectomies Received Between October 2, 2014, and December 31, 2017, Among Patients 18-44 Years of Age
| Primary indications for surgerya | N (%) |
|---|---|
| Uterine Leiomyomas | 683 (37.3) |
| Menorrhagia | 638 (34.9) |
| Abnormal Uterine Bleeding | 458 (25.0) |
| Chronic Pelvic Pain | 416 (22.7) |
| Dysmenorrhea | 322 (17.6) |
| Endometriosis | 143 (7.8) |
| Ovarian Cyst or Pelvic mass | 141 (7.7) |
| Anemiab,c | 139 (7.6) |
| Prolapse | 105 (5.7) |
| Cervical Dysplasia | 75 (4.1) |
| Incontinenceb,c | 68 (3.7) |
| Dyspareunia | 52 (2.8) |
| Prophylactic for Cancer | 45 (2.5) |
| Adenomyosis | 42 (2.3) |
| Metrorrhagia | 41 (2.2) |
| Endometrial Hyperplasia | 38 (2.1) |
| Bulk Symptomsb,c | 35 (1.9) |
| Gender Affirming Careb | 35 (1.9) |
| Other Pain | 27 (1.5) |
| Post-Ablation Tubal Sterilization Syndromeb | 19 (1.0) |
| Polypsc | 14 (0.8) |
| Structural Abnormalitiesc | 11 (0.6) |
| Failed Medical Managementb | 10 (0.6) |
| Infection (PID, TOA) | 8 (0.4) |
| Polycystic Ovarian Syndromeb,c | 8 (0.4) |
| Infertilityb,c | 7 (0.4) |
| Premenstrual Syndrome/Premenstrual Dysphoric Disorder (PMDD) | 6 (0.3) |
| Low Grade Cervical Dysplasia | 5 (0.3) |
| Sterilizationc | d |
| Othere | d |
| No Primary Indication | 83 (4.5) |
(a) Primary indications for surgery are not mutually exclusive.
(b) Primary indications for surgery not described in RAND.
(c) Primary indications for surgery never the lone primary indication.
(d) Table cells with sample sizes <5 have been suppressed.
(e) Muscle spasm and female genital organ symptoms.
Wright appropriateness algorithm
We convened an expert panel of three practicing obstetrician-gynecologists and clinician scientists to create the Wright appropriateness algorithm, an updated adaptation of the 1997 RAND appropriateness classification system.29 The expert panel provided updated appropriateness ratings of the primary indications described in the 1997 RAND recommendations29 and the additional primary indications available in the CHC that were not described by RAND. In addition, the Wright algorithm added an ambiguous category to account for surgeries that were missing elements of clinical care that are considered good practice but are not absolutely necessary in extenuating circumstances or when a patient declines care. After providing independent appropriateness classification recommendations, the expert panel convened virtually to discuss any discrepancies until consensus was reached. The Wright appropriateness algorithm is not intended to be applied in clinical settings; its purpose is to interrogate ideas about appropriateness of hysterectomy in the current clinical context.
Protocol to operationalize appropriateness ratings
The RAND classification system has 2 levels (inappropriate/appropriate), and the Wright appropriateness algorithm has 3 levels (inappropriate/ambiguous/appropriate). For both, if not all criteria are met for a surgery with a given indication, then the surgery can receive an inappropriate or ambiguous rating. For most indications, the criteria include documentation of specific diagnoses, symptoms, and diagnostic procedures to assess the origin of symptoms. For indications related to specific symptoms like abnormal uterine bleeding, the criteria could include certain levels of impairment and documentation of laboratory values. Finally, some indications are required to have documentation of previous medication use or receipt of operative procedures such as bilateral oophorectomy at the time of hysterectomy to receive an appropriate rating. Primary indications not described in the 1997 RAND recommendations29 received an inappropriate rating in the RAND classification system.
The number and type of criteria differ between the RAND classification system and the Wright appropriateness algorithm. For example, diagnostic procedures described in the RAND classification system include laparoscopy or endometrial sampling, whereas the Wright appropriateness algorithm relies on diagnostic imaging by ultrasound, computerized tomography, or magnetic resonance imaging. In addition, the RAND classification system requires certain levels of impairment and/or laboratory values for the surgery to be classified as appropriate, whereas documentation of any symptoms regardless of impairment is sufficient for the Wright appropriateness algorithm (Supplementary Table S3).
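The criteria-based logic described above can be illustrated with a small sketch. This is not the study's implementation (the analyses were run in SAS, and the actual criteria are listed in Supplementary Table S3); `rate_indication` and the criterion names are hypothetical, chosen only to show how a two-level and a three-level scheme can rate the same record differently.

```python
from typing import Iterable, Set


def rate_indication(documented: Set[str],
                    required: Iterable[str],
                    good_practice: Iterable[str] = ()) -> str:
    """Rate one indication from the criteria documented in the record.

    Hypothetical helper: a missing required criterion yields 'inappropriate';
    a missing good-practice criterion yields 'ambiguous' (three-level scheme).
    """
    if any(c not in documented for c in required):
        return "inappropriate"
    if any(c not in documented for c in good_practice):
        return "ambiguous"
    return "appropriate"


# Illustrative record: symptoms and imaging documented, nothing else.
record = {"symptoms", "imaging"}

# Two-level scheme: every criterion is required (criterion names are assumed).
rand_like = rate_indication(
    record, required=["symptoms", "endometrial_sampling", "medical_management"])

# Three-level scheme: medical management treated as good practice (assumed).
wright_like = rate_indication(
    record, required=["symptoms", "imaging"], good_practice=["medical_management"])

print(rand_like, wright_like)  # inappropriate ambiguous
```

Separating required from good-practice criteria is one simple way to represent the ambiguous category; a two-level scheme is recovered by passing no good-practice criteria.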
We were able to match the elements of the RAND classification system and the Wright appropriateness algorithm to both structured data from the EMR on variables such as age, diagnoses, laboratory values, and procedures performed at time of hysterectomy and unstructured manually abstracted data on symptoms, imaging, previous treatments, and prior diagnostic procedures.31 In rare cases, when no data were available pertaining to a criterion, we removed that criterion from the classification system or algorithm (Supplementary Table S3).
Our application of the RAND classification system and the Wright appropriateness algorithm is conservative in two main ways. First, if information on severity of disease for a certain primary indication was required and we did not have detailed information on disease progression, we assumed that any patient diagnosed with that condition had the level of disease sufficient to meet the criteria for an appropriate rating. For example, the RAND classification system has different criteria depending on the severity of a patient’s endometriosis (mild, moderate, or severe), on which we often lacked detailed data. We assumed that patients who received a hysterectomy for endometriosis had a sufficient level of disease to receive a RAND-based appropriate rating (Supplementary Fig. S4). Second, patients in the CHC can have multiple primary indications related to their hysterectomy. If one primary indication for hysterectomy was rated as inappropriate and any other indication for that hysterectomy was rated as appropriate, the hysterectomy was rated as appropriate. That is, we erred toward the most favorable rating.
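The rule for surgeries with multiple primary indications can be sketched as taking the most favorable indication-level rating. The function below is a hypothetical illustration, not the study code. Two points are assumptions: that "ambiguous" ranks between "inappropriate" and "appropriate" when ratings are combined, and that a surgery with no primary indication is rated inappropriate (consistent with the "No Primary Indication" row of Table 2).

```python
from typing import Sequence

# Assumed ordering of ratings, from least to most favorable.
ORDER = {"inappropriate": 0, "ambiguous": 1, "appropriate": 2}


def rate_surgery(indication_ratings: Sequence[str]) -> str:
    """Combine indication-level ratings into one surgery-level rating."""
    if not indication_ratings:
        # No primary indication documented (assumed to be inappropriate).
        return "inappropriate"
    # Err toward the most favorable rating across indications.
    return max(indication_ratings, key=ORDER.__getitem__)


print(rate_surgery(["inappropriate", "appropriate"]))  # appropriate
print(rate_surgery(["inappropriate", "ambiguous"]))    # ambiguous
print(rate_surgery([]))                                # inappropriate
```

For the two-level RAND classification, the same function applies with only the "inappropriate" and "appropriate" labels in use.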
Statistical Analysis
We estimated the prevalence of each level of appropriateness ratings in the CHC using the RAND classification system and the Wright appropriateness algorithm. We also estimated prevalence of appropriateness ratings stratified by primary indication for surgery. In addition, we estimated the agreement of the appropriateness ratings between the RAND classification system and the Wright appropriateness algorithm using a weighted kappa statistic with 95% confidence intervals (CIs). The weighted kappa statistic ranges from −1 (perfect disagreement) to 1 (perfect agreement), with zero representing agreement equivalent to chance expectation.
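For readers unfamiliar with the statistic, the following is a minimal sketch of Cohen's weighted kappa in plain Python. It is not the study's SAS code. It assumes linear disagreement weights and that both ratings are already coded as integers on a shared ordinal scale; the text does not state the weighting scheme or how the two RAND levels were aligned with the three Wright levels, so both are assumptions here. The function name `weighted_kappa` is illustrative.

```python
from typing import Sequence


def weighted_kappa(a: Sequence[int], b: Sequence[int], k: int) -> float:
    """Cohen's weighted kappa with linear disagreement weights.

    a, b: paired ratings coded 0..k-1 on a shared ordinal scale (k >= 2).
    """
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("ratings must be non-empty and of equal length")
    # Observed joint proportions.
    obs = [[0.0] * k for _ in range(k)]
    for i, j in zip(a, b):
        obs[i][j] += 1.0 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    # Linear disagreement weight: 0 on the diagonal, 1 at maximum distance.
    def weight(i: int, j: int) -> float:
        return abs(i - j) / (k - 1)

    d_obs = sum(weight(i, j) * obs[i][j] for i in range(k) for j in range(k))
    d_exp = sum(weight(i, j) * row[i] * col[j]
                for i in range(k) for j in range(k))
    if d_exp == 0:
        raise ValueError("kappa is undefined: no expected disagreement")
    return 1.0 - d_obs / d_exp


print(weighted_kappa([0, 1, 2, 0, 1, 2], [0, 1, 2, 0, 1, 2], k=3))  # 1.0
print(weighted_kappa([0, 0, 1, 1], [0, 1, 0, 1], k=2))              # 0.0
print(weighted_kappa([0, 1], [1, 0], k=2))                          # -1.0
```

The three toy inputs reproduce the reference points named above: perfect agreement, chance-level agreement, and perfect disagreement.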
We also characterized the missing criteria associated with a RAND-based or Wright-based inappropriate rating. There are 6 nonmutually-exclusive pathways that led to an inappropriate rating (Table 3, Fig. 1). SAS version 9.4 (SAS Institute Inc.; Cary, NC USA) was used for all analyses.
Table 3.
Description of Inappropriateness Criteria for 1,256 RAND-Based and 479 Wright-Based Inappropriate Rated Noncancerous and Nonobstetric Hysterectomies in the Carolina Hysterectomy Cohort Received Between October 2, 2014, and December 31, 2017, Among Patients Aged 18–44 Yearsa,b
| Description | RAND classification, N (%) | Wright classification, N (%) |
|---|---|---|
| Inappropriate Indication | 150 (3.8) | 141 (9.1) |
| | 141 (3.3) | 141 (9.1) |
| | 6 (0.1) | N.A.c |
| | d | N.A. |
| Missing Diagnostic Procedures | 1959 (46.1) | 793 (51.2) |
| | 634 (14.9) | N.A. |
| | 1071 (25.2) | N.A. |
| | 99 (2.3) | 59 (3.8) |
| | d | 6 (0.4) |
| | 200 (4.7) | 728 (47.0) |
| Missing Medical Management | 1294 (29.0) | 461 (29.7)e |
| | 1033 (30.4) | 461 (29.7) |
| | 260 (6.1) | N.A. |
| Missing adequate identification of a condition and/or level of severity | 721 (17.0) | 149 (9.6) |
| | 68 (1.6) | 58 (3.7) |
| | 63 (1.5) | 91 (5.9) |
| | 616 (14.5) | N.A. |
| Missing operative procedure | 95 (2.2) | d |
| | | |
| Ineligible due to individual patient characteristics | 34 (0.8) | 2 (0.1) |
| | 24 (0.5) | N.A. |
| | 10 (0.2) | d |
| Total Criteria | 4253 (100) | 1549 (100) |
(a) Inappropriateness criteria are not mutually exclusive. Inappropriate rated hysterectomies could be missing multiple items.
(b) Excluding indications not mentioned in RAND and primary indications for surgery that were never the lone indication.
(c) Not applicable for the Wright appropriateness algorithm.
(d) Table cells with sample sizes <5 have been suppressed.
(e) Missing medical management was a criterion for an ambiguous rating in the Wright-based appropriateness algorithm.
FIG. 1.
Bar Chart representing the six nonmutually-exclusive pathways that led to an inappropriate rating for 1,256 RAND-based and 479 Wright-based inappropriate rated noncancerous and nonobstetric hysterectomies received between October 2, 2014, and December 31, 2017, among patients 18–44 years of age. *Inappropriateness criteria are not mutually exclusive. Inappropriate rated hysterectomies could be missing multiple items. †Percentages are calculated by the number of each type of inappropriateness criteria divided by the total number of inappropriateness criteria for RAND-based or Wright-based inappropriate rated hysterectomies. ‡Excluding primary indications not mentioned in RAND and that were never the lone primary indication.
We ran several sensitivity analyses. In one, we estimated the weighted kappa statistics comparing the RAND classification system and the Wright appropriateness algorithm after excluding the primary indications for surgery that were not discussed in the 1997 RAND recommendations.29
Other sensitivity analyses focused on the extent to which missing EMR data could lead us to incorrectly classify a surgery as inappropriate because recommended documentation or clinical care was not provided. The underlying assumption of the main analysis is that missing EMR data reflects incomplete care delivery. To evaluate whether missing EMR data could be overestimating the prevalence of inappropriate ratings, we considered several scenarios that would cause potential misclassification of EMR data as incomplete care delivery. We distinguished two causes of missing data as follows: incomplete care delivery (clinical care not provided to the patient) and missing records (clinical care provided to patient but not recorded in the EMR data).
First, we ran a coarse sensitivity analysis in which we estimated the prevalence of appropriateness ratings if 10%–100% of the missing EMR data had been misclassified as incomplete care delivery. In the second sensitivity analysis, we estimated the prevalence of each of the six pathways associated with an inappropriate rating. We then estimated the prevalence of appropriateness ratings if 10%–100% of the six pathways associated with an inappropriate rating were misclassified as incomplete care delivery. In both sensitivity analyses, the presence of an inappropriate primary indication for surgery resulted in an inappropriate rating, just as it did in the main analysis.
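The general idea of the coarse sensitivity analysis can be illustrated with a simplified expected-value calculation. The sketch below is not the authors' procedure and is not expected to reproduce the supplementary figures. It assumes that every inappropriate rating not caused by an inappropriate primary indication is driven by missing documentation, and that a given fraction of those ratings reflects missing records rather than care that was not delivered. The function name and arguments are hypothetical; the counts are the RAND-based figures reported in the Results.

```python
def inappropriate_prevalence(n_total: int,
                             n_inappropriate: int,
                             n_bad_indication: int,
                             frac_missing_records: float) -> float:
    """Expected share rated inappropriate under a misclassification scenario.

    frac_missing_records: assumed fraction of documentation-driven
    inappropriate ratings that reflect missing records, not missing care.
    """
    if not 0.0 <= frac_missing_records <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    # Ratings driven by an inappropriate indication are never reclassified.
    documentation_driven = n_inappropriate - n_bad_indication
    remaining = n_bad_indication + documentation_driven * (1.0 - frac_missing_records)
    return remaining / n_total


# RAND-based counts from the Results: 1,829 surgeries, 1,256 rated
# inappropriate, 34 of them because of an inappropriate primary indication.
for pct in range(0, 101, 10):
    share = inappropriate_prevalence(1829, 1256, 34, pct / 100)
    print(f"{pct:3d}% misclassified -> {share:.1%} inappropriate")
```

At 0% the sketch returns the main-analysis estimate, and at 100% only the surgeries with an inappropriate primary indication remain inappropriate, which mirrors the bounding role the coarse analysis plays in the text.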
This study was approved by the Institutional Review Board at the University of North Carolina Chapel Hill (IRB # 21–2305).
Results
Descriptive statistics of total population
Based on the abstracted data in the CHC dataset, there were 30 primary indications observed for these surgeries (Table 1). Eighty-three patients (4.5%) did not have a primary indication for surgery listed in the EMR, and 32.8% of surgeries had only one primary indication for surgery. There were eight primary indications for surgery (polyps, polycystic ovarian syndrome, structural abnormality, infertility, incontinence, anemia, bulk symptoms, and sterilization) that only occurred in tandem with another primary indication for surgery. The most prevalent primary indications for surgery were for uterine leiomyomas (37.3%) and menorrhagia (34.9%) (Table 1).
Appropriateness estimates
Using the RAND classification system, 68.7% of hysterectomies were rated as inappropriate and 31.3% as appropriate (Table 2). Thirty-four (1.9%) hysterectomies received a RAND-based inappropriate rating due to the presence of inappropriate primary indications (premenstrual syndrome and ovarian cyst or pelvic mass). Missing information on diagnostic procedures was the most common characteristic related to a RAND-based inappropriate rating (46.1%) (Table 3, Fig. 1).
Table 2.
Appropriateness Rating by Primary Indication for Hysterectomy Based on RAND Appropriateness Classification System and the Wright Appropriateness Algorithm in the Carolina Hysterectomy Cohort for 1,829 Noncancerous and Nonobstetric Hysterectomies Received Between October 2, 2014, and December 31, 2017, Among Patients Aged 18–44 Years
| Primary indications for surgerya | Total | RAND: Inappropriate (N = 1256; 68.7%), %b | RAND: Appropriate (N = 573; 31.3%), %b | Wright: Inappropriate (N = 479; 26.2%), %b | Wright: Ambiguous (N = 287; 15.7%), %b | Wright: Appropriate (N = 1063; 58.1%), %b |
|---|---|---|---|---|---|---|
| Uterine Leiomyomas | 683 | 59.3 | 40.7 | 32.2 | 20.2 | 47.6 |
| Menorrhagia | 638 | 95.1 | 4.9 | 28.0 | 39.5 | 32.5 |
| Abnormal Uterine Bleeding | 458 | 89.1 | 10.9 | 11.4 | 40.2 | 48.4 |
| Chronic Pelvic Pain | 416 | 91.3 | 8.7 | 31.5 | 14.9 | 53.6 |
| Dysmenorrhea | 322 | 78.3 | 21.7 | 40.7 | 7.1 | 52.2 |
| Endometriosis | 143 | 67.1 | 32.9 | 35.0 | 4.9 | 60.1 |
| Ovarian Cyst or Pelvic mass | 141 | 100.0 | 0.0 | 100.0 | 0.0 | 0.0 |
| Anemiac,d | 139 | 100.0 | 0.0 | 100.0 | 0.0 | 0.0 |
| Prolapse | 105 | 9.5 | 90.5 | 2.9 | 6.7 | 90.4 |
| Cervical Dysplasia | 75 | 77.3 | 22.7 | 24.0 | 52.0 | 24.0 |
| Incontinencec,d | 68 | 100.0 | 0.0 | 100.0 | 0.0 | 0.0 |
| Dyspareunia | 52 | 96.1 | 3.9 | 19.2 | 30.8 | 50.0 |
| Prophylactic for Cancer | 45 | 73.3 | 26.7 | 8.9 | 15.6 | 75.5 |
| Adenomyosis | 42 | 92.9 | 7.1 | 26.2 | 9.5 | 64.3 |
| Metrorrhagia | 41 | 95.1 | 4.9 | 9.8 | 41.5 | 48.7 |
| Endometrial Hyperplasia | 38 | 65.8 | 34.2 | 44.7 | 0.0 | 55.3 |
| Bulk Symptomsc,d | 35 | 100.0 | 0.0 | 100.0 | 0.0 | 0.0 |
| Gender Affirming Carec | 35 | 100.0 | 0.0 | 34.3 | 11.4 | 54.3 |
| Other Pain | 27 | 88.9 | 11.1 | 18.5 | 29.6 | 51.9 |
| Post-Ablation Tubal Sterilization Syndromec | 19 | 100.0 | 0.0 | 0.0 | 0.0 | 100.0 |
| Polypsd | 14 | 92.9 | 7.1 | 7.1 | 0.0 | 92.9 |
| Structural Abnormalitiesd | 11 | 100.0 | 0.0 | 100.0 | 0.0 | 0.0 |
| Failed Medical Managementc | 10 | 100.0 | 0.0 | 100.0 | 0.0 | 0.0 |
| Infection (PID, TOA) | 8 | 62.5 | 37.5 | 62.5 | 0.0 | 37.5 |
| Polycystic Ovarian Syndromec,d | 8 | 100.0 | 0.0 | 100.0 | 0.0 | 0.0 |
| Infertilityc,d | 7 | 100.0 | 0.0 | 100.0 | 0.0 | 0.0 |
| Premenstrual Syndrome/Premenstrual Dysphoric Disorder | 6 | 100.0 | 0.0 | 83.3 | 0.0 | 16.7 |
| Low Grade Cervical Dysplasia | 5 | 80.0 | 20.0 | 40.0 | 40.0 | 20.0 |
| Sterilizationd | e | 100.0 | 0.0 | 100.0 | 0.0 | 0.0 |
| Otherf | e | 100.0 | 0.0 | 100.0 | 0.0 | 0.0 |
| No Primary Indication | 83 | 100.0 | 0.0 | 100.0 | 0.0 | 0.0 |
(a) Primary indications for surgery are not mutually exclusive.
(b) Row percent.
(c) Primary indications for surgery not described in RAND.
(d) Primary indications for surgery never the lone primary indication.
(e) Table cells with sample sizes <5 have been suppressed.
(f) Muscle spasm and female genital organ symptoms.
Using the Wright appropriateness algorithm, 26.2% of hysterectomies were rated as inappropriate, 15.7% as ambiguous, and 58.1% as appropriate (Table 2). Twenty-eight (1.4%) hysterectomies received a Wright-based inappropriate rating due to the presence of inappropriate primary indications (failed medical management and ovarian cyst or pelvic mass). Similarly, missing information on diagnostic procedures was the most common characteristic related to a Wright-based inappropriate rating (51.2%) (Table 3, Fig. 1).
Results of our sensitivity analyses regarding missing data are presented in Supplementary Figures S5–S8. Under the assumption that 50% of the missing criteria related to diagnostic procedures, medical management, or identification of diagnosis or symptoms were misclassified as incomplete care delivery, 36.6% of hysterectomies would receive a RAND-based inappropriate rating (Supplementary Fig. S7), whereas 18.2% and 10.2% of hysterectomies would receive a Wright-based inappropriate and ambiguous rating, respectively (Supplementary Fig. S8).
Agreement between appropriateness ratings
The weighted kappa statistic to characterize agreement between RAND-based and Wright-based appropriateness ratings was 0.16 (95% CI = 0.14–0.17), suggesting only slight agreement. This estimate was robust to our restriction to primary indications addressed in RAND recommendations (κw= 0.16, 95% CI = 0.15–0.18).
Discussion
In a study exploring the appropriateness of hysterectomies as treatment for benign gynecological conditions, we applied appropriateness ratings based on the 1997 RAND appropriateness classification system and an updated adaptation, the Wright appropriateness algorithm. The majority (68.7%) of the hysterectomies received a RAND-based inappropriate rating, compared with a minority (26.2%) that received a Wright-based inappropriate rating. For both the RAND appropriateness classification system and the Wright appropriateness algorithm, the most common reason that surgeries were classified as inappropriate was lack of documentation of recommended diagnostic procedures before the hysterectomy.
When applying the RAND-based appropriateness ratings, our findings are comparable to those of a study conducted by Broder et al.32 in 2000. Broder et al. also used RAND and three American College of Obstetricians and Gynecologists criteria sets to rate the appropriateness of 497 hysterectomies with nonmalignant diagnoses from 9 medical groups in southern California from August 1, 1993, to July 31, 1995. Using data from a structured chart review and interviews with patients, Broder et al. estimated that 70% of hysterectomies were inappropriate. Most of the hysterectomies deemed inappropriate did not have diagnostic or therapeutic procedures performed before the hysterectomy.
The lower prevalence of Wright-based inappropriate rated surgeries was likely due to changes in standards of gynecological care since the 1997 publication of the RAND recommendations.29 There were several primary indications for surgery in the study population that were not discussed in RAND. In particular, RAND lacked guidance for hysterectomies as a part of gender-affirming care. In addition, with the increased understanding of hereditary breast and ovarian cancer syndromes, hysterectomies are considered prophylactic treatment for patients with BRCA gene mutations and other genetic markers.33,34 However, these newly recognized indications account for only a small amount of the discrepancy between the two classification systems.
The major difference between the Wright appropriateness algorithm and the RAND appropriateness classification system was RAND’s requirement for diagnostic surgeries to assess potential causes of pain or bleeding for gynecological conditions such as dysmenorrhea and abnormal uterine bleeding (Table 3, Fig. 1). Today, with advancements in imaging, the Wright appropriateness algorithm’s expert panel concluded that invasive diagnostic surgeries are not necessary to confirm diagnoses such as dysmenorrhea and abnormal uterine bleeding. Instead, symptomatic patients with confirmatory imaging should be counseled on benefits and risks of hysterectomy versus alternative treatments such as uterine artery embolization, minimally invasive myomectomy, and other uterus-sparing procedural therapies.2,4 If definitive treatment with hysterectomy is the patient’s preferred treatment, it is reasonable to proceed without additional diagnostic surgeries. Still, missing data on diagnostic procedures accounted for 51.2% of the Wright-based inappropriate ratings. Although these estimates are an upper limit, they indicate that some hysterectomies may be proceeding without adequate diagnostic workup.
This study used structured and unstructured data abstracted from the EMR. Missing data are a particular concern when conducting research with EMR data. In a review of EMR data quality, Chan et al. found a wide range of missing data percentages, with a high of 53%.35 The inability to determine why data are missing from the EMR can affect the validity of study conclusions.36 In our analysis, we recognized that the absence of recommended clinical criteria may reflect incomplete care delivery or may simply reflect failure to document care that was delivered. Our coarse sensitivity analyses set extreme bounds for appropriateness ratings, and the inappropriateness criterion-specific sensitivity analyses provided more realistic estimates when considering the possibility of misclassification. When using the Wright appropriateness algorithm, even if 50% of missing criteria related to diagnostic procedures, medical management, or identification of diagnosis or symptoms were misclassified as incomplete care delivery, 18.2% of the hysterectomies would be rated as inappropriate and 10.2% as ambiguous (Supplementary Fig. S8). This suggests that the inappropriate ratings cannot solely be attributed to missing records from the EMR.
This study has several strengths. First, we did not solely rely on procedure and diagnosis codes. We were able to incorporate information on primary indications and symptoms based on providers’ text abstracted from the preoperative surgical notes. Second, we conducted crude sensitivity analyses to consider the potential effects of missing data on our findings. Third, using EMR data rather than insurance claims data allowed us to include uninsured patients, who may be particularly at risk for inappropriate care.37,38 Finally, to our knowledge, our Wright-based appropriateness estimates are the most recent estimates of appropriateness since the Broder et al. study in 2000.32
This study also has some limitations. We provided appropriateness estimates using an updated algorithm that takes into consideration the current clinical context; however, this is not a tool to be applied in clinical settings. Relatedly, there is no gold standard of appropriateness that we could compare our findings against. A gold standard of appropriateness would need to incorporate the patient perspective, which is currently difficult to capture using EMR-based data. In addition, the members of the Wright appropriateness algorithm expert panel have all worked primarily in academic medical centers. Academic medical centers often have more imaging and procedural resources than would be available at smaller, private hospitals or independent medical practices. The assessment of minimum requirements for appropriateness may differ among clinicians practicing in different settings. Finally, using both structured and unstructured data abstracted from EMR, a key strength, may also be a limitation for researchers trying to repeat this work in other settings where abstracted data are incomplete or not readily accessible. Researchers may only have access to structured administrative data which may be a less accurate reflection of the primary indication for surgery.31
Conclusions
Due to changes in standards of care in the past 20 years, the RAND-based appropriateness estimates for benign hysterectomy are outdated and dramatically overestimate the prevalence of inappropriate hysterectomy. However, our Wright-based appropriateness estimates suggest that a notable percentage of young patients in a large hospital system in the South may be treated with hysterectomy without getting the recommended prediagnostic workup and other recommended clinical care.
The updated Wright appropriateness algorithm is a research tool that can renew the conversation around improving clinical documentation of hysterectomy decision-making. Indications for surgery should be clear and transparent, whether they are disease based, symptom based, or patient desired. More consistent documentation can be used by clinicians to improve their own practice, by clinicians’ employers for evidence-based accountability, by researchers to prospectively follow changes in clinical practice, and by patients to better understand the care they may be receiving.
Hysterectomy is one of the most prevalent procedures performed in gynecology today. This fact, combined with its history of unethical application, underscores the need to systematically monitor its use.
Acknowledgments
The project described was supported by the National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, through Grant Award Number UL1TR002489.
Portions of this work were presented at three research meetings: A virtual plenary podium presentation at the 28th Annual AHRQ NRA Trainees Research Conference on May 25, 2022, a poster presentation in Washington, DC at the Academy Health Research Meeting on June 5, 2022, and a poster presentation in Chicago, Illinois at the Society for Epidemiological Research Meeting on June 16, 2022.
The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH or the CDC Foundation.
Portions of this article are available online as part of the dissertation of Maya Wright, Doctor of Philosophy in Epidemiology, submitted on May 26, 2022, at the University of North Carolina at Chapel Hill, Gillings School of Global Public Health (https://cdr.lib.unc.edu/concern/dissertations/pn89dh139).
Authors’ Contributions
M.W.: Writing – original draft (lead), review & editing (lead), Visualization (lead), Methodology (equal), Formal Analysis (lead), Data Curation (lead), and Investigation (lead); A.C.K.: Writing – review & editing (equal), Supervision (equal), and Methodology (equal); A.B.M.: Writing – review & editing (equal); E.C.: Writing – review & editing (equal); K.M.D.: Writing – review & editing (equal); A.I.V.: Writing – review & editing (equal); A.F.O.: Writing – review & editing (equal) and Supervision (equal); W.R.R.: Writing – review & editing (equal), Supervision (equal), Funding acquisition (lead), and Resources (lead).
Attestation Statement
Data regarding any of the subjects in the study have not been previously published unless specified.
Data will be made available to the editors of the journal for review or query upon request.
Author Disclosure Statement
The authors have no financial disclosures and report no conflicts of interest.
Funding Information
Dr. M.W. was partially supported by a National Research Service Award Pre-Doctoral/Post-Doctoral Traineeship from the Agency for Healthcare Research and Quality sponsored by The Cecil G. Sheps Center for Health Services Research, The University of North Carolina at Chapel Hill, Grant No. T32-HS000032.
Drs. E.C., K.D., and W.R. were partially supported by the National Institute on Minority Health and Health Disparities of the National Institutes of Health under award number R01MD011680.
The funders had no role in the study design, collection, analysis, or interpretation of data, in the writing of the report, or in the decision to submit the article for publication.
Supplementary Material
References
- 1. Laughlin-Tommaso SK, Jacoby VL, Myers ER. Disparities in fibroid incidence, prognosis, and management. Obstet Gynecol Clin North Am 2017;44(1):81–94; doi: 10.1016/j.ogc.2016.11.007
- 2. American College of Obstetricians and Gynecologists. ACOG practice bulletin. Alternatives to hysterectomy in the management of leiomyomas. Obstet Gynecol 2008;112(2 Pt 1):387–400; doi: 10.1097/AOG.0b013e318183fbab
- 3. Nguyen NT, Merchant M, Ritterman Weintraub ML, et al. Alternative treatment utilization before hysterectomy for benign gynecologic conditions at a large integrated health system. J Minim Invasive Gynecol 2019;26(5):847–855; doi: 10.1016/j.jmig.2018.08.013
- 4. Fortin C, Flyckt R, Falcone T. Alternatives to hysterectomy: The burden of fibroids and the quality of life. Best Pract Res Clin Obstet Gynaecol 2018;46:31–42; doi: 10.1016/j.bpobgyn.2017.10.001
- 5. Farris M, Bastianelli C, Rosato E, et al. Uterine fibroids: An update on current and emerging medical treatment options. Ther Clin Risk Manag 2019;15:157–178; doi: 10.2147/TCRM.S147318
- 6. Stewart EA, Lytle BL, Thomas L, et al. The comparing options for management: Patient-centered results for uterine fibroids (COMPARE-UF) registry: Rationale and design. Am J Obstet Gynecol 2018;219(1):95.e1–95.e10.
- 7. Kuppermann M, Learman LA, Schembri M, et al. Contributions of hysterectomy and uterus-preserving surgery to health-related quality of life. Obstet Gynecol 2013;122(1):15–25; doi: 10.1097/AOG.0b013e318292aea4
- 8. Kuppermann M, Learman LA, Schembri M, et al. Effect of noncancerous pelvic problems on health-related quality of life and sexual functioning. Obstet Gynecol 2007;110(3):633–642; doi: 10.1097/01.AOG.0000279153.56275.b5
- 9. Jones GL, Kennedy SH, Jenkinson C. Health-related quality of life measurement in women with common benign gynecologic conditions: A systematic review. Am J Obstet Gynecol 2002;187(2):501–511; doi: 10.1067/mob.2002.124940
- 10. Kuppermann M, Learman LA, Schembri M, et al. Predictors of hysterectomy use and satisfaction. Obstet Gynecol 2010;115(3):543–551; doi: 10.1097/AOG.0b013e3181cf46a0
- 11. Doll KM, Dusetzina SB, Robinson W. Trends in inpatient and outpatient hysterectomy and oophorectomy rates among commercially insured women in the United States, 2000-2014. JAMA Surg 2016;151(9):876–877; doi: 10.1001/jamasurg.2016.0804
- 12. Wright JD, Herzog TJ, Tsui J, et al. Nationwide trends in the performance of inpatient hysterectomy in the United States. Obstet Gynecol 2013;122(2 Pt 1):233–241; doi: 10.1097/AOG.0b013e318299a6cf
- 13. Lethaby A, Wise MR, Weterings MA, et al. Combined hormonal contraceptives for heavy menstrual bleeding. Cochrane Database Syst Rev 2019;2(2):CD000154; doi: 10.1002/14651858.CD000154.pub3
- 14. Dawood MY. Primary dysmenorrhea: Advances in pathogenesis and management. Obstet Gynecol 2006;108(2):428–441; doi: 10.1097/01.AOG.0000230214.26638.0c
- 15. AAGL Advancing Minimally Invasive Gynecology Worldwide. AAGL position statement: Route of hysterectomy to treat benign uterine disease. J Minim Invasive Gynecol 2011;18(1):1–3; doi: 10.1016/j.jmig.2010.10.001
- 16. Sandberg EM, Tummers FH, Cohen SL, et al. Reintervention risk and quality of life outcomes after uterine-sparing interventions for fibroids: A systematic review and meta-analysis. Fertil Steril 2018;109(4):698–707.e1.
- 17. Doğanay M, Kokanalı D, Kokanalı MK, et al. Comparison of female sexual function in women who underwent abdominal or vaginal hysterectomy with or without bilateral salpingo-oophorectomy. J Gynecol Obstet Hum Reprod 2019;48(1):29–32; doi: 10.1016/j.jogoh.2018.11.004
- 18. Hendrix SL. Bilateral oophorectomy and premature menopause. Am J Med 2005;118 Suppl 12B(12):131–135; doi: 10.1016/j.amjmed.2005.09.056
- 19. Evans EC, Matteson KA, Orejuela FJ, et al. Salpingo-oophorectomy at the time of benign hysterectomy: A systematic review. Obstet Gynecol 2016;128(3):476–485; doi: 10.1097/AOG.0000000000001592
- 20. Rocca WA, Gazzuola-Rocca L, Smith CY, et al. Accelerated accumulation of multimorbidity after bilateral oophorectomy: A population-based cohort study. Mayo Clin Proc 2016;91(11):1577–1589; doi: 10.1016/j.mayocp.2016.08.002
- 21. Ingelsson E, Lundholm C, Johansson ALV, et al. Hysterectomy and risk of cardiovascular disease: A population-based cohort study. Eur Heart J 2011;32(6):745–750; doi: 10.1093/eurheartj/ehq477
- 22. Moorman PG, Myers ER, Schildkraut JM, et al. Effect of hysterectomy with ovarian preservation on ovarian function. Obstet Gynecol 2011;118(6):1271–1279; doi: 10.1097/AOG.0b013e318236fd12
- 23. Lamvu G, Carrillo J, Ouyang C, et al. Chronic pelvic pain in women: A review. JAMA 2021;325(23):2381–2391; doi: 10.1001/jama.2021.2631
- 24. As-Sanie S, Till SR, Schrepf A, et al. Incidence and predictors of persistent pelvic pain following hysterectomy in women with chronic pelvic pain. Am J Obstet Gynecol 2021;225(5):568.e1–568.e11; doi: 10.1016/j.ajog.2021.08.038
- 25. Price GN, Darity WA. The economics of race and eugenic sterilization in North Carolina: 1958-1968. Econ Hum Biol 2010;8(2):261–272; doi: 10.1016/j.ehb.2010.01.002
- 26. Jarrell RH. Native American women and forced sterilization, 1973-1976. Caduceus Springf Ill 1992;8(3):45–58.
- 27. Bernstein SJ, Fiske M, McGlynn EA, et al. Hysterectomy: A review of the literature on indications, effectiveness, and risks. Published online January 1, 1997. Available from: https://www.rand.org/pubs/monograph_reports/MR592z2.html [Last accessed: November 24, 2020].
- 28. Bernstein SJ. Hysterectomy: Ratings of appropriateness. Published online January 1, 1997. Available from: https://www.rand.org/pubs/monograph_reports/MR592z3.html [Last accessed: May 13, 2021].
- 29. Leape LL, Bernstein SJ, Bohon CJ, et al. Hysterectomy: Clinical recommendations and indications for use. Published online January 1, 1997. Available from: https://www.rand.org/pubs/monograph_reports/MR592z1.html [Last accessed: November 24, 2020].
- 30. Healthcare Cost and Utilization Project (HCUP). HCUP tools and software. Agency for Healthcare Research and Quality, Rockville, MD. Published February 2022. Available from: https://www.hcup-us.ahrq.gov/tools_software.jsp [Last accessed: February 23, 2022].
- 31. Doll KM, Howard AG, Stürmer T, et al. Development of an algorithm to assess unmeasured symptom severity in gynecologic care. Am J Obstet Gynecol 2021;226(3):388.e1–388.e11; doi: 10.1016/j.ajog.2021.11.020
- 32. Broder MS, Kanouse DE, Mittman B, et al. The appropriateness of recommendations for hysterectomy. Obstet Gynecol 2000;95(2):8.
- 33. American Society of Clinical Oncology. Practice bulletin No 182: Hereditary breast and ovarian cancer syndrome. Obstet Gynecol 2017;130(3):e110–e126; doi: 10.1097/AOG.0000000000002296
- 34. Shu CA, Pike MC, Jotwani AR, et al. Uterine cancer after risk-reducing salpingo-oophorectomy without hysterectomy in women with BRCA mutations. JAMA Oncol 2016;2(11):1434–1440; doi: 10.1001/jamaoncol.2016.1820
- 35. Chan KS, Fowles JB, Weiner JP. Review: Electronic health records and the reliability and validity of quality measures: A review of the literature. Med Care Res Rev 2010;67(5):503–527; doi: 10.1177/1077558709359007
- 36. Feder SL. Data quality in electronic health records research: Quality domains and assessment methods. West J Nurs Res 2018;40(5):753–766; doi: 10.1177/0193945916689084
- 37. Nguyen KH, Sommers BD. Access and quality of care by insurance type for low-income adults before the Affordable Care Act. Am J Public Health 2016;106(8):1409–1415; doi: 10.2105/AJPH.2016.303156
- 38. Gardner L, Vishwasrao S. Physician quality and health care for the poor and uninsured. Inquiry 2010;47(1):62–80; doi: 10.5034/inquiryjrnl_47.01.62