Learn Health Sys. 2019 Apr 25;3(3):e10193. doi: 10.1002/lrh2.10193

Advantages of large medical record database for outcomes research: Insights into post‐menopausal hormone therapy

Richard L Tannen 1, Kurt T Barnhart 1, Joshua C Rubin 2,3
PMCID: PMC6628979  PMID: 31317074

Abstract

Approximately 25 years ago, our team initiated studies to determine whether outcome studies using a large medical record database would yield valid results. We used data from the United Kingdom (UK) General Practice Research Database (GPRD) to replicate randomized controlled trials (RCTs) and compared the database results with the RCT results to assess their validity. The initial studies compared favorably, but some subsequent studies did not. This prompted development of a new strategy to detect and correct for unrecognized confounding in the database. This strategy divided the outcome rates during the treatment interval by the outcome rates prior to initiation of therapy in the database study (which should reflect both identified and unidentified confounders). When the results differed from the Cox‐adjusted results, this indicated unrecognized confounding. We called this strategy the Prior Event Rate Ratio (PERR)–adjusted outcome.

One of our previously published observational studies replicated the Women's Health Initiative (WHI) RCT of hormone therapy in post‐menopausal women. Our results replicated the WHI RCT results except that, in contrast to the RCT, they did not show an increase in heart attack. Furthermore, we could not evaluate death reliably, since our analytic approach to overcome unrecognized confounding does not work for this outcome. In Volume 1, Issue 1 of the Learning Health Systems open access journal, we published a new study (titled “A new method to address unmeasured confounding of mortality in observational studies”) that reported a novel method, based on our prior methodology, for analyzing unrecognized confounding of the death outcome. This new methodology, termed Post Treatment Event Rate Ratio (PTERR), permitted a reliable examination of mortality in post‐menopausal women undergoing hormone therapy. These results are reported in this manuscript. The study used the data from our previous observational study. It demonstrates that estrogen therapy markedly reduced death in post‐menopausal women.

This work also illuminates principles of database construction and correspondingly demonstrates the use of novel methodologies for obtaining valid results, which can be applied to enable learning from such databases. Work to advance such methodologies is essential to advancing the scientific integrity Core Value underpinning learning health systems (LHSs). Indeed, in the absence of such efforts to develop and refine methodologies for learning trustworthy lessons from real‐world data, we risk inadvertently creating mis‐learning systems.

Keywords: death, estrogen, menopause, progesterone


Key acronyms

BMI: body mass index

GPRD: General Practice Research Database

HR: hazard ratio

LHSs: learning health systems

PCORI: Patient‐Centered Outcomes Research Institute

PERR: Prior Event Rate Ratio

PTERR: Post Treatment Event Rate Ratio

RCT: randomized controlled trial

SBP: systolic blood pressure

WHI: Women's Health Initiative

1. INTRODUCTION

The benefits of hormonal therapy for menopausal women have been debated for decades. The health benefits of hormonal therapy have been hypothesized, refuted, and re‐hypothesized. Findings on the risks and benefits for specific outcomes such as cardiovascular disease, cancer, or osteoporosis are often contradictory. Moreover, despite the large amount of data on this subject, it has not been possible to evaluate mortality reliably.

The Women's Health Initiative (WHI) was a large and important randomized controlled trial (RCT) study. Subgroup analysis of the WHI RCT provided a hint that combined estrogen‐progestin (EPT) therapy or estrogen (ET) therapy itself might decrease mortality in younger women (ages 50‐60).1, 2, 3 However, these results did not achieve statistical significance. Furthermore, secondary subgroup analysis could be distorted by unrecognized confounding, because subgroup analysis is similar to an observational study rather than an RCT.

Our prior observational studies that replicated the WHI RCT using the United Kingdom (UK) General Practice Research Database (GPRD) also found a decrease in mortality with both ET and EPT therapy in women ages 50‐54 and 55‐79.4, 5, 6, 7 Other outcomes in these studies could be corrected for unrecognized confounding by a technique we developed called Prior Event Rate Ratio (PERR).8, 9 PERR analysis is a tool to identify and quantitate unrecognized confounding for many outcomes; it analyzes outcomes in the time period preceding the therapy period and applies the result to the data from the analytic period. However, this strategy could not be applied to mortality (which cannot be a prior event). Therefore, we could not conclude with certainty that mortality was reduced.

Recently, we published new studies suggesting that a strategy similar to PERR can overcome unrecognized confounding for mortality.10 In Volume 1, Issue 1 of the Learning Health Systems open access journal, we published a new study (titled “A new method to address unmeasured confounding of mortality in observational studies”) that discussed how this novel methodology, termed Post Treatment Event Rate Ratio (PTERR), permitted a reliable examination of mortality in post‐menopausal women undergoing hormone therapy. PTERR analysis applies an approach analogous to PERR analysis, but to the post‐treatment period, which permits the study of the death outcome. The WHI replication studies were among the studies used to test this method, and this manuscript expands upon those findings. This work and its implications are discussed further below.

Beyond the specific findings, our work also serves to advance the development of methodologies for learning trustworthy lessons from real‐world data, which is an important component of learning health systems (LHSs). One of the multi‐stakeholder consensus Core Values of LHSs is scientific integrity: rigorously applying science to ensure validity and credibility of findings. In a similar vein, communities working to mobilize computable biomedical knowledge have emphasized in their founding principles the paramount importance of ensuring that such knowledge can be trusted to improve health and health care; communities working to advance policies, practice, and research related to the ethical, social, and legal implications (ELSI) of LHSs also underscore the trust factor, grounded in part in the rigorous application of scientific methods. Federal government agencies and key nonprofit organizations supporting research to advance LHSs, including the Patient‐Centered Outcomes Research Institute (PCORI), have convened methodology committees and invested in synergistic efforts to advance the development of methodologies to learn trustworthy lessons from (every) data. More generally, LHSs are anchored in a common cultural commitment to learn and improve from (every) experience, and in doing so, LHSs also share an implicit value for continuously learning how to learn better. Our work here, which builds upon the PERR analysis method and develops, tests, and refines the PTERR method on an outcome we had not previously had the capacity to study, serves to advance such methodological development and refinement. This methodological research, along with our other publications, emphasizes that appropriately constructed medical databases, along with proper analysis of the data they contain, promise to advance medicine. Our ability to analyze most outcomes, and now mortality, and to overcome potential unrecognized confounding in doing so has propelled database construction and analysis into a major tool for advancing clinical research and clinical medicine.

2. METHODS

The initial database replications of the WHI RCT included subjects aged 50 to 79 years. First, exposed subjects were selected from those who met the entry criteria for the RCT and were being treated with conjugated estrogen plus progestin (norgestrel) if they had an intact uterus, or with conjugated estrogen alone if they had a prior hysterectomy. Start time for the exposed subjects was defined as the time therapy began during a predetermined recruitment interval. The unexposed subjects were then matched to the exposed subjects by age using a random computer matching technique and were assigned the same start time as the matched exposed subject.
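The matching step described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the source does not give the matching ratio or the software used, so the one-to-one matching, the `Subject` record, and the `match_by_age` function are all hypothetical.

```python
import random
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Subject:
    # Hypothetical record layout; field names are illustrative only.
    subject_id: str
    age: int
    start_day: Optional[int] = None  # day therapy began; unset for unexposed


def match_by_age(
    exposed: List[Subject], unexposed_pool: List[Subject], seed: int = 0
) -> List[Tuple[Subject, Subject]]:
    """Randomly pair each exposed subject with an unexposed subject of the
    same age and copy the exposed subject's start time to the match.

    Assumption: one-to-one matching without replacement.
    """
    rng = random.Random(seed)
    pool_by_age = defaultdict(list)
    for subject in unexposed_pool:
        pool_by_age[subject.age].append(subject)
    for candidates in pool_by_age.values():
        rng.shuffle(candidates)  # random order within each age stratum

    pairs = []
    for e in exposed:
        candidates = pool_by_age[e.age]
        if not candidates:
            continue  # no unexposed subject of this age remains
        u = candidates.pop()
        u.start_day = e.start_day  # matched subject inherits the start time
        pairs.append((e, u))
    return pairs
```

Assigning the exposed subject's start time to the match is what later allows both cohorts to be followed over comparable intervals.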

The original cohorts that included subjects aged 50 to 79 were then subdivided into two groups: ages 50 to 54 and 55 to 79 years. The older group more closely matched the age of the RCT, and the 50‐ to 54‐year‐old group encompassed a younger cohort than the RCT. Two different types of analytics were employed to assess outcomes of the database studies. In the first type, analyses were performed using a simulated intention‐to‐treat analysis where a fixed end time was set and subjects were followed until they dropped out of the database, died, or reached a predetermined end time. In the second type, an as‐treated analysis was performed where, in addition to the above criteria, the subject's study ended if the post‐menopausal drug treatment was altered.
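The two end-of-follow-up rules can be sketched as a single function. This is a minimal illustration, assuming times are expressed as integer days since the subject's start time; the function and parameter names are hypothetical and do not come from the study.

```python
from typing import Optional


def follow_up_end(
    fixed_end: int,
    dropout: Optional[int] = None,
    death: Optional[int] = None,
    treatment_change: Optional[int] = None,
    as_treated: bool = False,
) -> int:
    """Return the day on which a subject's follow-up ends.

    Simulated intention-to-treat: earliest of dropout, death, or the
    predetermined end time. As-treated: the same, but follow-up also
    ends when the post-menopausal drug treatment is altered.
    """
    candidates = [fixed_end]
    if dropout is not None:
        candidates.append(dropout)
    if death is not None:
        candidates.append(death)
    if as_treated and treatment_change is not None:
        candidates.append(treatment_change)
    return min(candidates)
```

For a subject whose treatment changed on day 400 and who otherwise reached the fixed end on day 1000, the intention-to-treat end is day 1000 and the as-treated end is day 400.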

The database study outcomes' hazard ratios (HRs) were analyzed using Cox‐unadjusted and Cox‐adjusted results and compared with the RCT event results. Database results also were analyzed using the PERR method to overcome unrecognized confounding, and these results for stroke and acute myocardial infarction were compared with both the Cox‐adjusted and the RCT results.7 Prior Event Rate adjustment compared outcomes before study entry, when neither the exposed nor the unexposed cohorts were taking medications. Theoretically, this should delineate the aggregate effect of all confounders (both measured and unmeasured) on an outcome. Dividing the HR of an outcome during the study by the PERR HR should result in a value similar to the Cox‐adjusted HR if there is no unmeasured confounding. The validity of this approach has been substantiated by comparisons of database and RCT outcome results and also by theoretical analyses of the method.8, 9 Mortality cannot be examined by this method, because death prior to study entry would eliminate entry into the study.
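In symbols, the adjustment described in this paragraph can be written as below. The notation is illustrative and is not taken from the cited papers: HR_study denotes the unadjusted hazard ratio for an outcome during the study period, and HR_prior denotes the hazard ratio for the same outcome in the period before study entry.

```latex
\[
\mathrm{HR}_{\text{PERR-adj}}
  = \frac{\mathrm{HR}_{\text{study}}}{\mathrm{HR}_{\text{prior}}}
\]
```

As a purely hypothetical example, HR_study = 1.50 and HR_prior = 1.25 would give a PERR‐adjusted HR of 1.20; a value far from the Cox‐adjusted HR would point to unmeasured confounding.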

A new method (PTERR) was developed to address unmeasured confounding of mortality; it uses the death rate after the treatment interval in a fashion similar to the PERR use of the outcome rate prior to the study.10 Our prior database studies were used to test the validity of this method. This strategy was feasible because all database studies were analyzed using both “intention‐to‐treat” and “as‐treated” analyses. Since the “intention‐to‐treat” analysis often had a duration longer than the “as‐treated” analysis, a post‐treatment time period could be delineated for the exposed cohort. Because the unexposed cohort subjects were matched by start time to the exposed subjects, their follow‐up could be divided in the same way as that of their matched exposed subjects, permitting comparison over a similar “post‐treatment” period. As noted in the published manuscript, this new method appears to yield valid results, although it could not be verified with the same rigor as the PERR method.10 One of the prior studies evaluated by this method was the database replication of the WHI RCT.4 The striking results for mortality in this analysis led to publication of this manuscript. We analyzed the database mortality results for the same age groups as the WHI secondary study, which encompassed women 50 to 60 and 50 to 70 years of age.
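One way to read the delineation of the post‐treatment period is sketched below. This is an interpretation, not the authors' code: times are assumed to be integer days since the shared start time, and both function names are hypothetical.

```python
from typing import Optional, Tuple


def post_treatment_window(
    as_treated_end: int, itt_end: int
) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the post-treatment period, or None if the
    intention-to-treat follow-up does not extend past the as-treated end."""
    if itt_end <= as_treated_end:
        return None
    return (as_treated_end, itt_end)


def matched_unexposed_window(
    exposed_as_treated_end: int, unexposed_itt_end: int
) -> Optional[Tuple[int, int]]:
    """Assumed rule for an unexposed subject: because the pair shares a
    start time, the unexposed subject's follow-up is split at the day the
    matched exposed subject's as-treated period ended."""
    return post_treatment_window(exposed_as_treated_end, unexposed_itt_end)
```

Under this reading, an exposed subject whose as‐treated period ended on day 400 and whose intention‐to‐treat follow‐up ended on day 1000 contributes days 400 to 1000 to the post‐treatment period.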

3. SUMMARY OF METHODOLOGIES DEVELOPED

  • PERR analysis is a tool we developed to identify and quantitate unrecognized confounding for many outcomes; it involves analysis of outcomes in the time period preceding the therapy period and is then applied to the data during the analytic period.

  • PTERR analysis is a tool we developed specifically to identify and quantitate unrecognized confounding for the death outcome, applying an approach analogous to PERR analysis, but for the post‐treatment period.

4. RESULTS

The comparisons between the RCT intact uterus and hysterectomy studies and the database PTERR‐corrected results for these studies are shown in Table 1. The details of the death analysis of the database studies are shown in Table 2. The results for mortality from the WHI RCT intact uterus and hysterectomy studies were compared with database study results for women in the same age brackets (50‐60 y and 50‐70 y). The intact uterus and hysterectomy results for the WHI RCT for the 50‐ to 60‐year‐old cohort showed a decrease in mortality in the treated group that was not statistically significant. The database studies of this same age group demonstrated a decrease in mortality that achieved statistical significance but was not reported as significantly different from the WHI RCT results. It should be noted that the cohorts in the database study were markedly larger than the cohorts in the secondary analysis of the WHI RCT.

Table 1.

RCT vs database results

| Study and cohort | N, case | N, control | HR (95% CI) | P vs RCT |
|---|---|---|---|---|
| Intact uterus: RCT | | | | |
| Age 50‐60 | 2839 | 2683 | 0.69 (0.44‐1.07) | |
| Age 50‐70 | 6692 | 6340 | 0.98 (0.77‐1.25) | |
| Intact uterus: Database PTERR adj | | | | |
| Age 50‐60 | 29 972 | 51 584 | 0.39 (0.29‐0.50) | NS 0.03 |
| Age 50‐70 | 34 006 | 64 226 | 0.38 (0.31‐0.48) | <0.01 |
| Hysterectomy: RCT | | | | |
| Age 50‐60 | 1637 | 1673 | 0.71 (0.46‐1.11) | |
| Age 50‐70 | 4024 | 4128 | 0.94 (0.75‐1.16) | |
| Hysterectomy: Database Adjusted PTERR adj | | | | |
| Age 50‐60 | 10 802 | 15 902 | 0.44 (0.28‐0.70) | NS 0.14 |
| Age 50‐70 | 13 659 | 20 206 | 0.39 (0.28‐0.54) | <0.01 |

Abbreviations: HR, hazard ratio; PTERR adj, Post Treatment Event Rate Ratio adjusted; RCT, randomized controlled trial.
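The article does not state how the "P vs RCT" column was computed. One common way to compare two independent hazard ratios is a z‐test on the difference of their logarithms, with each standard error recovered from the reported 95% CI. The sketch below assumes that approach; the function names are hypothetical. Applied to the age 50‐60 rows of Table 1, it gives values close to the 0.03 and 0.14 shown in the table.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% CI


def se_log_hr(lower: float, upper: float) -> float:
    """Standard error of log(HR), recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * Z95)


def p_difference(hr_a, ci_a, hr_b, ci_b) -> float:
    """Two-sided P value for the difference between two independent HRs,
    using a z-test on the log scale (an assumed method, see lead-in)."""
    se = math.hypot(se_log_hr(*ci_a), se_log_hr(*ci_b))
    z = (math.log(hr_a) - math.log(hr_b)) / se
    return math.erfc(abs(z) / math.sqrt(2))


# Age 50-60 rows of Table 1: RCT vs database PTERR-adjusted result.
p_intact = p_difference(0.69, (0.44, 1.07), 0.39, (0.29, 0.50))
p_hyst = p_difference(0.71, (0.46, 1.11), 0.44, (0.28, 0.70))
print(f"intact uterus: P = {p_intact:.2f}")
print(f"hysterectomy:  P = {p_hyst:.2f}")
```

The same function applied to the age 50‐70 rows yields P values well below 0.01, in line with the "<0.01" entries.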

Table 2.

Database study results

| Study and age group | N, exposed | N, unexposed | As‐treated period: Cox univariable HR (95% CI) | As‐treated period: Cox multivariable HR (95% CI) | As‐treated period: PTERR adj HR (95% CI) | Cox multivariable vs PTERR adj, P value | Post RX period: PTERR HR (95% CI) |
|---|---|---|---|---|---|---|---|
| Intact uterus study, age 50‐70 | 34 006 | 64 226 | 0.38 (0.29‐0.50) | 0.55 (0.46‐0.62) | 0.38 (0.31‐0.48) | 0.01 | 0.99 (0.85‐1.16) |
| Intact uterus study, age 50‐60 | 29 972 | 51 584 | 0.42 (0.36‐0.49) | 0.55 (0.47‐0.66) | 0.39 (0.29‐0.50) | 0.27 | 1.08 (0.89‐1.31) |
| Hysterectomy study, age 50‐70 | 13 659 | 20 206 | 0.37 (0.30‐0.46) | 0.44 (0.35‐0.54) | 0.39 (0.28‐0.54) | NS | 1.02 (0.79‐1.31) |
| Hysterectomy study, age 50‐60 | 10 802 | 15 902 | 0.38 (0.29‐0.50) | 0.43 (0.26‐0.65) | 0.44 (0.28‐0.70) | NS | 0.86 (0.61‐1.20) |

Abbreviations: HR, hazard ratio; PTERR, Post Treatment Event Rate Ratio; PTERR adj, Post Treatment Event Rate Ratio adjusted.

The intact uterus and hysterectomy results for the WHI RCT for the 50‐ to 70‐year‐old cohorts did not exhibit a decrease in mortality. By contrast, the results for this age group in both database studies exhibited a significant decrease in mortality that was also significantly lower than the results in the RCT study. It is important to note that the cohorts in the database studies were dramatically larger than those in the secondary RCT analyses.

In our prior publication, death was decreased significantly in both older (55‐79) and younger (50‐54) women in both the intact uterus study (estrogen and progesterone) and the hysterectomy study (estrogen only).7 There was a hint, however, that cohorts without missing data on baseline confounders (BMI, SBP, and smoking) did not exhibit as large a decrease in death. We revisited this issue using our current data on death in both the intact uterus and the hysterectomy cohorts. These studies were performed on subjects aged 50 to 70 with no missing confounders. The results for these “no missing” studies are shown in Table 3. They do not differ meaningfully from the results shown in Table 2 for the studies that included subjects with missing data on baseline confounders.

Table 3.

No missing baseline data (smoking, SBP, BMI)

| Study and cohort | Subjects, N | Death as RX, N (%) | Post subjects, N | Death post, N (%) | As RX HR | Post RX HR | PTERR adj HR (95% CI) |
|---|---|---|---|---|---|---|---|
| Intact uterus study, age 50‐70 | | | | | 0.487 | 1.094 | 0.44 (0.35‐0.59) |
| Exposed | 26 690 | 183 (0.69%) | 11 023 | 183 (1.66%) | | | |
| Unexposed | 39 253 | 391 (1.00%) | 13 863 | 261 (1.88%) | | | |
| Hysterectomy study, age 50‐70 | | | | | 0.381 | 1.099 | 0.35 (0.27‐0.44) |
| Exposed | 9250 | 68 (0.74%) | 3420 | 69 (2.02%) | | | |
| Unexposed | 12 029 | 137 (1.14%) | 3404 | 78 (2.29%) | | | |

Abbreviations: HR, hazard ratio; PTERR adj, Post Treatment Event Rate Ratio adjusted.
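The point estimates in Table 3 are consistent with the PTERR‐adjusted HR being the as‐treated HR divided by the post‐treatment HR, by analogy with PERR. The short check below uses only the values reported in Table 3; the function name is hypothetical, and the check says nothing about how the confidence intervals were derived.

```python
# Point estimates copied from Table 3 (no missing baseline data).
TABLE3 = {
    "intact uterus, age 50-70": {"as_rx_hr": 0.487, "post_rx_hr": 1.094, "reported": 0.44},
    "hysterectomy, age 50-70": {"as_rx_hr": 0.381, "post_rx_hr": 1.099, "reported": 0.35},
}


def pterr_adjusted_hr(as_rx_hr: float, post_rx_hr: float) -> float:
    """As-treated HR divided by the post-treatment HR."""
    return as_rx_hr / post_rx_hr


for name, row in TABLE3.items():
    adj = pterr_adjusted_hr(row["as_rx_hr"], row["post_rx_hr"])
    print(f"{name}: computed {adj:.3f}, reported {row['reported']}")
```

The computed ratios (about 0.445 and 0.347) round to within 0.01 of the reported 0.44 and 0.35.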

5. DISCUSSION

Our replication of the WHI RCT provides a deeper insight to support the striking decrease in mortality for women treated with estrogen hormone replacement therapy during menopause. Estrogen is clearly identified as the medication responsible for this effect, since women who ingested both estrogen and progesterone exhibited decreases in mortality similar to those who ingested only estrogen.

The methods utilized aimed to address the potential for unrecognized confounding. Such an approach can be compared with other previous studies. For instance, the secondary subgroup analysis of the WHI RCT tends to support this finding, but the cohorts were too small to achieve statistically meaningful results and also suffer from the possibility of unrecognized confounding.1 The meta‐analysis by Salpeter and colleagues found a decrease in mortality in women with a mean age of <60 years, but not in older women. Furthermore, it does not identify estrogen as the protective medication and also is subject to unrecognized confounding.11 Mikkola et al used data from the nationwide reimbursement register and the Cause of Death Register of Finland for their studies of post‐menopausal hormone therapy.12 They reported a decrease in death with estradiol and/or progesterone treatment, along with a decrease in coronary heart disease, stroke, or other causes of mortality. However, this database study did not directly address the possibility of unrecognized confounding.

How post‐menopausal estrogen accounts for the prevention of death is not delineated by our study. Both the WHI RCT and our database study examined other outcomes that can provide some clue as to potential ways in which estrogen can protect against death. Both the RCT and the database study found an increase in stroke, no reduction in acute myocardial infarction, an increase in breast cancer, and an increase in events related to venous clotting in subjects treated with estrogen. Estrogen did result in a decrease in colon cancer and also a decrease in hip fracture, but the degree of protection seems unlikely to have resulted in the magnitude of the decline in mortality. Thus, the basis for the protective effect of estrogen therapy remains unexplained. Nevertheless, our findings strongly support the use of post‐menopausal estrogen therapy.

More generally, this work demonstrates the potential of the PTERR analysis method, building upon the PERR analysis method, to help illuminate valid and trustworthy lessons that can be learned from real‐world health data derived from properly constructed medical databases. Doing so promises to stimulate further discussion of and research into methodologies grounded in the scientific integrity Core Value of LHSs, which is paramount to building trustworthiness and engendering trust both in the lessons learned in LHSs aimed at improving health and in the LHSs themselves.

CONFLICTS OF INTEREST

No conflicts of interest are reported by authors.

This work did not require review by University of Pennsylvania ethics committee since it only used information in the UK national health database.

FUNDING INFORMATION

This work was supported by a research grant (ME‐1310‐06601) from the Patient‐Centered Outcomes Research Institute (PCORI).

The Joseph H. Kanter Family Foundation supported the payment of the open access article publication charge to John Wiley & Sons, Inc.

Tannen RL, Barnhart KT, Rubin JC. Advantages of large medical record database for outcomes research: Insights into post‐menopausal hormone therapy. Learn Health Sys. 2019;3:e10193. doi: 10.1002/lrh2.10193

Contributor Information

Richard L. Tannen, Email: tannen@mail.med.upenn.edu

Joshua C. Rubin, Email: rubinjc@umich.edu

REFERENCES

  • 1. Rossouw JE, Prentice RL, Manson JE, et al. Postmenopausal hormone therapy and risk of cardiovascular disease by age and years since menopause. JAMA. 2007;297(13):1465‐1477.
  • 2. Writing Group for the Women's Health Initiative Investigators. Risks and benefits of estrogen plus progestin in healthy postmenopausal women. Principal results from the Women's Health Initiative randomized controlled trial. JAMA. 2002;288(3):321‐333.
  • 3. Women's Health Initiative Steering Committee. Effects of conjugated equine estrogen in postmenopausal women with hysterectomy. The Women's Health Initiative randomized controlled trial. JAMA. 2004;291:1701‐1712.
  • 4. Tannen RL, Weiner MG, Xie D, Barnhart K. Simulation of the Women's Health Initiative trial using data from a primary care practice database. J Clin Epidemiol. 2007;60(7):686‐695.
  • 5. Weiner MG, Barnhart K, Xie D, Tannen RL. Hormone therapy and coronary heart disease in young women. Menopause. 2008;15(1):86‐93.
  • 6. Tannen RL, Weiner MG, Xie D, Barnhart K. Estrogen affects post‐menopausal women differently than estrogen plus progestin therapy. Hum Reprod. 2007;22(6):1769‐1777.
  • 7. Tannen RL, Weiner MG, Xie D, Barnhart K. Perspective on hormone replacement therapy: the Women's Health Initiative and new observational studies sampling the overall population. Fertil Steril. 2008;90(2):258‐264.
  • 8. Tannen RL, Weiner MG, Xie D. Use of primary care electronic medical record database in drug efficacy research on cardiovascular outcomes: comparison of database and randomized controlled trial findings. BMJ. 2009;338:b81.
  • 9. Yu M, Xie D, Wang X, Weiner MG, Tannen RL. Prior event ratio adjustment: numerical studies of statistical method to address unrecognized confounding in observational studies. Pharmacoepidemiol Drug Saf. 2012;21:60‐68.
  • 10. Tannen R, Yu M. A new method to address unmeasured confounding of mortality in observational studies. Learn Health Sys. 2016 Dec;1(1):e10016. doi: 10.1002/lrh2.10016
  • 11. Salpeter SR, Walsh JME, Greyber E, Ormiston TM, Salpeter EE. Mortality associated with hormone replacement therapy in younger and older women. A meta‐analysis. J Gen Intern Med. 2004;19(7):791‐804.
  • 12. Mikkola TS, Tuomikoski P, Lyytinen H, et al. Estradiol‐based postmenopausal hormone therapy and risk of cardiovascular and all‐cause mortality. Menopause. 2015;22(9):976‐983.
