Tree-based scan statistics to generate drug repurposing hypotheses: a test case using sodium-glucose cotransporter-2 inhibitors

George S Q Tan; Judith C Maro; Shirley V Wang; Sengwee Toh; Jedidiah I Morton; Jenni Ilomäki; Jenna Wong; Xiaojuan Li

doi:10.1093/aje/kwae355

2024 Sep 11;194(7):1999–2011.

PMCID: PMC12461578 PMID: 39270669

Abstract

Most drug repurposing studies using real-world data focused on validating, instead of generating, hypotheses. We used tree-based scan statistics to generate repurposing hypotheses for sodium-glucose cotransporter-2 inhibitors (SGLT2i). We used an active-comparator, new-user study design to create a 1:1 propensity-score matched cohort of SGLT2i and dipeptidyl peptidase-4 inhibitors (DPP4i) initiators in the Merative MarketScan Research Databases. Tree-based scan statistics were estimated across an ICD-10-CM-based hierarchical outcome tree using incident outcomes identified from hospital and outpatient diagnoses. We used an adjusted P ≤ .01 as the threshold for statistical alert to prioritize associations for evaluation as repurposing signals. We varied the analyses by tree size, scanning level, and clinical settings for outcomes. There were 80 510 matched SGLT2i-DPP4i initiator pairs with 215 333 outcomes among SGLT2i initiators and 223 428 outcomes among DPP4i initiators. There were 18 prioritized associations, which included chronic kidney disease (P = .0001), an expected signal, and anemia (P = .0001). Heart failure (P = .0167), another expected signal, was identified slightly beyond the statistical alert threshold. Narrowing the outcome tree, scanning at different tree levels, and including outcomes from different clinical settings influenced the scan statistics. We identified signals aligning with recently approved indications of SGLT2i, plus potential repurposing signals supported by existing evidence but requiring future validation.

Keywords: drug repurposing, drug repositioning, tree-based scan statistics, TreeScan, data-mining, real-world data, pharmacoepidemiology

Introduction

Drug repurposing, defined as finding new indications for existing drugs, has garnered much interest in the past decade due to significant cost and time savings, as well as greater success rates across the drug development and regulatory approval pipeline compared to de novo drug development.¹^‑⁴ One of the computational approaches for drug repurposing is retrospective analysis of real-world data (RWD), defined by the United States Food and Drug Administration (FDA) as data collected during routine delivery of healthcare.⁵ Most previous drug repurposing studies using RWD have focused on validating, rather than generating, repurposing hypotheses.⁶ Using RWD to generate novel repurposing hypotheses holds much promise given improving data quality and availability.⁷^,⁸

Tree-based scan statistics (TBSS), enabled by TreeScan, is a data mining method originally developed to conduct scan statistics across a hierarchical tree.⁹ In general, a hierarchical tree consists of variables arranged in a tree structure, for example, occupations, pharmaceutical drugs, and clinical diagnoses. Applications of TBSS thus far have been predominantly for occupational disease and medication safety surveillance.⁹^‑¹³

We aimed to demonstrate how TBSS can be used to generate new drug repurposing hypotheses from RWD. In essence, an inverse association between drug exposure and a health outcome identified by the scan statistics may suggest a potential repurposing signal relating to the outcome. As a test case, we used sodium-glucose cotransporter-2 inhibitors (SGLT2i), a new class of glucose-lowering drugs initially approved for the treatment of type 2 diabetes. Sodium-glucose cotransporter-2 inhibitors were additionally approved in the United States for the treatment of heart failure in 2020 and chronic kidney disease in 2021.¹⁴^‑²⁰ These new indications could serve as “positive controls” to evaluate the performance of this approach.

Methods

Data sources

We used data from the Merative MarketScan Research Databases from October 1, 2014, to December 31, 2021, where the data were converted to the Sentinel Common Data Model (version 8.1). MarketScan captures one of the largest convenience samples of individuals (and their spouses and dependents) with employer-sponsored health insurance plans across the United States.²¹^,²² It provides de-identified patient-level health data, including insurance enrollment status, diagnosis and procedure codes for inpatient and outpatient services, and outpatient prescription medication dispensing data based on National Drug Codes. This study was approved by the Institutional Review Board of Harvard Pilgrim Health Care Institute and Monash University.

Study design and cohort

We used an active-comparator, new-user study design by comparing initiators of SGLT2i (canagliflozin, dapagliflozin, empagliflozin, ertugliflozin, other SGLT2i-containing combination products) to initiators of dipeptidyl peptidase-4 inhibitors (DPP4i; alogliptin, linagliptin, saxagliptin, sitagliptin, other DPP4i-containing combination products; Table S1). Dipeptidyl peptidase-4 inhibitors were chosen as the active comparator because, like SGLT2i, they are second-line glucose-lowering drugs for type 2 diabetes.²³ Commonly, DPP4i have been used as active comparators for SGLT2i in comparative studies.²⁴^‑²⁷

The study cohort consisted of beneficiaries aged ≥18 years who initiated treatment with an SGLT2i or DPP4i between October 1, 2015, and October 31, 2019. The latter date was selected because the pivotal DAPA-HF trial, published in November 2019, was the first to report that dapagliflozin use was associated with a reduced risk of heart failure outcomes irrespective of diabetes status, which could have influenced prescribing practices of SGLT2i.²⁸ The index date was defined as the first dispensing of either SGLT2i or DPP4i. Eligible individuals were required to have at least 1 year of continuous medical and pharmacy coverage prior to the index date, with allowable gaps of no more than 45 days. We used a 1-year washout period (with no prior dispensing of either SGLT2i or DPP4i) prior to the index date to identify new users. We excluded individuals who initiated treatment with both SGLT2i and DPP4i on the index date. The drug exposure periods were constructed using the days’ supply of medication in a dispensing, allowing for stockpiling when an additional dispensing occurred before the end of the days’ supply of the previous dispensing. A grace period of up to 14 days was allowed to bridge brief gaps between exposure periods. Using an adapted version of the Chronic Condition Warehouse algorithm for type 2 diabetes,²⁹ we required eligible individuals to have a diagnosis of type 2 diabetes and excluded those with a diagnosis of type 1 diabetes, using at least 1 inpatient diagnosis or at least 2 ambulatory or emergency department diagnoses on separate days, within 1 year prior to the index date. Figure 1 illustrates the complete study design.³⁰
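The construction of exposure periods described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the Sentinel module logic: the function name `build_episodes`, the tuple layout, and the exact rules (an early refill appends its full days' supply to the current end date, and a gap of up to 14 unexposed days is bridged) are hypothetical choices made for this example.

```python
from datetime import date, timedelta


def build_episodes(dispensings, grace_days=14):
    """Collapse dispensings into continuous exposure episodes.

    dispensings: iterable of (dispensing_date, days_supply) tuples.
    Returns a list of (start, end) dates, both inclusive.
    Assumed rules (not taken from the source): early refills are
    stockpiled, and gaps of up to `grace_days` days are bridged.
    """
    episodes = []
    for disp_date, days_supply in sorted(dispensings):
        if episodes:
            start, end = episodes[-1]
            gap = (disp_date - end).days - 1  # unexposed days before this refill
            if gap < 0:
                # Refill before the previous supply ran out: stockpile it.
                episodes[-1] = (start, end + timedelta(days=days_supply))
                continue
            if gap <= grace_days:
                # Brief gap: bridge it and continue the same episode.
                new_end = disp_date + timedelta(days=days_supply - 1)
                episodes[-1] = (start, new_end)
                continue
        # First dispensing, or a gap longer than the grace period.
        episodes.append((disp_date, disp_date + timedelta(days=days_supply - 1)))
    return episodes


# Hypothetical dispensing history for one person.
example = [
    (date(2016, 1, 1), 30),
    (date(2016, 1, 25), 30),  # early refill, stockpiled
    (date(2016, 3, 10), 30),  # 9-day gap, bridged by the grace period
    (date(2016, 6, 1), 30),   # long gap, starts a new episode
]
episodes = build_episodes(example)
```

In a new-user design such as this one, only the first episode would contribute follow-up time, because follow-up ends at the end of the drug exposure period.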

Figure 1. Graphical representation of longitudinal study design. Modified from: Schneeweiss S, Rassen JA, Brown JS, et al. Graphical Depiction of Longitudinal Study Designs in Health Care Databases. *Ann Intern Med* 2019; 170: 398-406. DOI: 10.7326/m18-3079.

Propensity score matching

To reduce potential confounding across all outcomes, we used 1:1 propensity score matching where SGLT2i initiators were matched with DPP4i initiators using optimal nearest-neighbor matching with a caliper of 0.025 and no replacement. We estimated propensity scores for initiating SGLT2i using a predefined set of baseline covariates measured in a 1-year baseline period prior to the index date.⁵ These included demographic factors (age and sex); calendar year of the index date; combined Charlson/Elixhauser comorbidity score³¹^,³²; adapted Diabetes Complications Severity Index³³; baseline use of glucose-lowering drugs; comorbidities and other medications; procedures; and healthcare utilization characteristics (Table S2). We examined covariate balance after matching, with covariate imbalance defined as an absolute value of the standardized mean difference of > .1.
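The balance diagnostic can be illustrated with the commonly used definition of the standardized mean difference, in which the difference in means (or proportions) is divided by the square root of the average of the two group variances. The text does not spell out the formula, so this definition and the function names are assumptions; the inputs are the before-matching age and sex values from Table 1.

```python
import math


def smd_continuous(mean1, sd1, mean2, sd2):
    """Standardized mean difference for a continuous covariate."""
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (mean1 - mean2) / pooled_sd


def smd_binary(p1, p2):
    """Standardized mean difference for a binary covariate (proportions)."""
    pooled_sd = math.sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / 2)
    return (p1 - p2) / pooled_sd


# Before-matching values from Table 1 (SGLT2i vs DPP4i initiators).
age_before = smd_continuous(54.7, 9.8, 58.0, 11.9)
female_before = smd_binary(47272 / 106143, 55173 / 118575)

# A covariate is flagged as imbalanced when |SMD| > 0.1.
age_imbalanced = abs(age_before) > 0.1
female_imbalanced = abs(female_before) > 0.1
```

Under this definition, age is flagged as imbalanced before matching and sex is not, which agrees with the corresponding rows of Table 1.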

Hierarchical outcome tree

We used a pruned version of the hierarchical tree based on International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) diagnosis codes. The ICD-10-CM codes are inherently organized into a hierarchical tree-like structure with up to 7 levels, corresponding to the maximum 7 digits of the diagnosis codes. Broad categories of diagnoses start at the “root” and progressively “branch” into more specific groups of diagnoses, culminating in specific diagnosis codes at the “leaf” (Figure S1). Each level has multiple nodes, which encompass all downstream diagnoses. We pruned the ICD-10-CM tree to remove branches containing diagnoses that are less plausible as drug-related outcomes: external causes of morbidity (V00-Y99) and factors influencing health status and contact with health services (Z00-Z99). We also excluded codes for conditions originating in the perinatal period (P00-P96) and codes for pregnancy, childbirth, and the puerperium (O00-O9A), as we did not intend to evaluate pregnancy-related outcomes. The tree was further pruned in the sensitivity analyses (described later on). Refer to Table S3 for specifications of the tree.
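To make the tree structure concrete, the sketch below derives the ancestor nodes of a diagnosis code by truncating its characters. It assumes that level k (for k of 3 or more) corresponds to the first k characters of the code, which matches nodes such as E83, E83.4, and E83.42 in Table 2; the function names are hypothetical, and pruning is approximated by the first letter of the code, which covers the excluded ranges O00-O9A, P00-P96, V00-Y99, and Z00-Z99.

```python
# First letters of the ICD-10-CM ranges pruned from the primary tree.
PRUNED_FIRST_LETTERS = set("OPVWXYZ")


def ancestors_by_level(code, min_level=3, max_level=7):
    """Map an ICD-10-CM code to its nodes at each tree level.

    Assumes level k is the first k characters of the code (dot removed),
    written back with the dot after the third character.
    """
    chars = code.replace(".", "").upper()
    nodes = {}
    for level in range(min_level, min(max_level, len(chars)) + 1):
        prefix = chars[:level]
        nodes[level] = prefix if level == 3 else prefix[:3] + "." + prefix[3:]
    return nodes


def is_pruned(code):
    """True if the code falls in a branch removed from the primary tree."""
    return code[0].upper() in PRUNED_FIRST_LETTERS


hypomagnesemia_nodes = ancestors_by_level("E83.42")
```

Levels 1 and 2 (broader groupings above the 3-character category) are not derivable from the code characters alone and would need a lookup table, so they are left out of this sketch.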

Follow-up for outcomes

Follow-up began on the day following the first dispensing of the drug of interest and continued until the earliest of any of the following events: end of the drug exposure period, disenrollment, death, end of data availability (December 31, 2021), initiation of the opposite study drug, censoring of 1 person from the matched pair for any of the aforementioned reasons, or end of the 2-year (730 days) follow-up period. We defined incident outcomes based on diagnoses (in any diagnosis position) from inpatient admissions, emergency department presentations, or ambulatory care. Each incident outcome was considered separately. However, to be considered an incident outcome, the individual must have had no diagnosis with the same first 3 digits of the ICD-10-CM code recorded in at least the 1 year preceding its occurrence. In other words, incidence was defined at level 3 of the outcome tree. This was to exclude closely related diagnoses (categorized within the same level 3) that were recorded within the same timeframe, which could reflect a related follow-up diagnosis or nuanced differences when coding a similar condition.
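The incidence rule can be sketched as follows. This is a simplified illustration rather than the query logic actually used: the function name, the 365-day lookback, and the treatment of same-day codes in the same 3-character category (only the first is counted) are assumptions made for the example.

```python
from datetime import date


def incident_outcomes(diagnoses, followup_start, followup_end, lookback_days=365):
    """Return follow-up diagnoses that are incident at level 3.

    diagnoses: list of (diagnosis_date, icd10cm_code) for one person,
    covering both the baseline and follow-up periods.
    A diagnosis is incident if no code sharing its first 3 characters
    was recorded in the preceding `lookback_days` days.
    """
    incident = []
    last_seen = {}  # 3-character category -> most recent diagnosis date
    for dx_date, code in sorted(diagnoses):
        category = code.replace(".", "")[:3]
        previous = last_seen.get(category)
        in_followup = followup_start <= dx_date <= followup_end
        clean = previous is None or (dx_date - previous).days > lookback_days
        if in_followup and clean:
            incident.append((dx_date, code))
        last_seen[category] = dx_date
    return incident


# Hypothetical diagnosis history for one person.
history = [
    (date(2016, 1, 10), "I10"),    # baseline diagnosis
    (date(2016, 9, 1), "N18.3"),   # incident: first N18 code
    (date(2016, 10, 1), "N18.9"),  # same level 3 category, not incident
    (date(2016, 11, 1), "I10"),    # seen within the prior year, not incident
    (date(2016, 12, 1), "D64.9"),  # incident: first D64 code
]
found = incident_outcomes(history, date(2016, 6, 1), date(2018, 5, 31))
```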

Scan statistics

As the interest of this study was identifying repurposing signals rather than safety signals, we looked for nodes where the observed probability of the outcome in the exposure group was lower than the corresponding expected probability if there was truly no difference with the comparator group (inverse associations). In the TreeScan software, this was implemented by interchanging the exposure and comparator groups because TreeScan was designed to evaluate safety signals (positive associations) “out of the box.”⁵^,¹² The expected number of outcomes at each node was calculated as half of the total number of outcomes from both exposure groups, given that follow-up time was matched between groups. Any node that was scanned had to have at least 2 outcomes among the exposed. We used the unconditional Bernoulli scan statistic as we assumed that outcomes in the exposed group occur with a fixed probability of .5 within the 1:1 matched cohort.
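For a single node, the unconditional Bernoulli log likelihood ratio can be written out directly. The sketch below uses the standard binomial form with a fixed null probability of .5 and scans for a deficit among the exposed directly; by symmetry, this is equivalent at p = .5 to swapping the groups and scanning for an excess. The function name is hypothetical, and the counts are taken from Table 2.

```python
import math


def bernoulli_llr(exposed, comparator, p=0.5):
    """Log likelihood ratio for fewer exposed outcomes than expected.

    Compares the maximum-likelihood proportion exposed/n against the
    fixed null probability p. Returns 0 when there is no deficit.
    """
    n = exposed + comparator
    if n == 0 or exposed / n >= p:
        return 0.0

    def term(k):
        # k * log(k / n), with the convention 0 * log(0) = 0
        return k * math.log(k / n) if k > 0 else 0.0

    log_lik_alt = term(exposed) + term(comparator)
    log_lik_null = exposed * math.log(p) + comparator * math.log(1 - p)
    return log_lik_alt - log_lik_null


# Counts from Table 2: observed SGLT2i outcomes and total outcomes.
llr_ckd = bernoulli_llr(594, 1470 - 594)  # N18, chronic kidney disease
llr_hf = bernoulli_llr(357, 846 - 357)    # I50, heart failure
```

Evaluated on the N18 counts, this expression gives approximately 27.217, in line with the log likelihood ratio reported for that node in Table 2.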

Because thousands of outcomes were evaluated concurrently in this study, it was important to limit false positive signals.³⁴ Tree-based scan statistics derives multiplicity-adjusted P values nonparametrically using Monte Carlo simulations.⁹ A P value can be interpreted as the 1-sided probability of observing a difference between observed and expected outcomes at least as large as that at the specific node if the composite null hypothesis were true. The composite null hypothesis was that there is no difference in observed and expected outcomes across all nodes. The alternative hypothesis in this study was that there is an inverse association at 1 or more nodes, unlike in drug safety studies, which look for a positive association.⁵^,¹² We describe how the P values were derived in more detail in Appendix S1. However, it is important to note that the P values were used to prioritize signals for further evaluation.³⁵ We specified in the TreeScan software to output all inverse associations with P < 1.
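The general idea behind the Monte Carlo adjustment can be sketched on a toy tree. In each replicate, every outcome is reassigned to the exposed group with probability .5, the largest log likelihood ratio over all nodes is recorded, and a node's P value is its rank among those maxima. This is not the procedure in Appendix S1 or the TreeScan implementation; the leaf counts, the number of replications, and all function names are hypothetical.

```python
import math
import random


def llr_deficit(exposed, total, p=0.5):
    """Bernoulli log likelihood ratio for a deficit among the exposed."""
    if total == 0 or exposed / total >= p:
        return 0.0
    comparator = total - exposed
    term = lambda k: k * math.log(k / total) if k > 0 else 0.0
    null = exposed * math.log(p) + comparator * math.log(1 - p)
    return term(exposed) + term(comparator) - null


def node_counts(leaf_counts, parents):
    """Sum leaf (exposed, total) counts into every ancestor node."""
    counts = {}
    for leaf, (exposed, total) in leaf_counts.items():
        node = leaf
        while node is not None:
            e, t = counts.get(node, (0, 0))
            counts[node] = (e + exposed, t + total)
            node = parents.get(node)
    return counts


def monte_carlo_p_values(leaf_counts, parents, replications=999, seed=0):
    rng = random.Random(seed)
    observed = {
        node: llr_deficit(e, t)
        for node, (e, t) in node_counts(leaf_counts, parents).items()
    }
    max_llrs = []
    for _ in range(replications):
        # Under the null, each outcome is exposed with probability .5.
        simulated = {
            leaf: (sum(rng.random() < 0.5 for _ in range(total)), total)
            for leaf, (_observed, total) in leaf_counts.items()
        }
        sim_counts = node_counts(simulated, parents)
        max_llrs.append(max(llr_deficit(e, t) for e, t in sim_counts.values()))
    # Rank each observed statistic against the replicate maxima.
    return {
        node: (1 + sum(m >= llr for m in max_llrs)) / (replications + 1)
        for node, llr in observed.items()
    }


# Toy tree with hypothetical (exposed, total) outcome counts per leaf.
toy_leaves = {"N18.3": (20, 60), "N18.9": (25, 50), "I50.9": (30, 60)}
toy_parents = {"N18.3": "N18", "N18.9": "N18", "I50.9": "I50"}
p_values = monte_carlo_p_values(toy_leaves, toy_parents)
```

Because every node is compared against the maximum over the whole tree, the resulting P values are adjusted for the number of nodes scanned. With R replications, the smallest attainable P value is 1/(R + 1).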

Repurposing signals

We only looked for associations using outcome nodes in levels 3, 4, and 5, so as not to expend statistical power looking for signals that were clinically either too broad or too specific. Similar to some previous TBSS studies,⁵^,¹² we used P ≤ .01 as the threshold for statistical alerts prioritizing associations for evaluation as potential repurposing signals, rather than the conventional P ≤ .05 to further guard against type 1 error. However, we presented all inverse associations with P < 1 sorted by ascending P values for transparency.

The established cardiorenal benefits of SGLT2i, specifically for heart failure and chronic kidney disease (CKD), were expected signals and served as positive controls in this study. Evaluation of unexpected signals as potential repurposing signals included consideration of biological and pharmacological plausibility, clinical context, confounding, and bias by study design. We summarize the workflow for using TBSS to identify potential repurposing signals in Figure 2.

Figure 2. Workflow using tree-based scan statistics to identify repurposing signals.

Sensitivity analyses

We conducted a number of sensitivity analyses to investigate the impact of modifying certain analytic parameters on the repurposing signals identified. First, we further pruned the ICD-10-CM outcome tree to preserve statistical power (Table S3). Codes for neoplasms (C00-D49) were excluded, as outcomes with long induction and latent periods, such as cancers, are less likely to be causally associated with the exposure within 2 years of follow-up.³⁶ Codes for diabetes mellitus (E10-E14) were also excluded as both the exposure and comparator drugs are already indicated for diabetes. Finally, codes relating to symptoms, signs, and abnormal laboratory findings (R00-R99) were excluded as most are nonspecific or subclinical symptoms of diseases. Second, we repeated the analyses to also scan across nodes at level 2 (in addition to levels 3, 4, and 5), where the incidence of outcomes was redefined at level 2 of the ICD-10-CM outcome tree. Third, we restricted the analyses such that incident outcomes were identified using diagnoses from only inpatient admissions or emergency department presentations and not ambulatory care.

Software

Sentinel Routine Query Modules (version 12.1.2) were executed in SAS Studio 3.7 (SAS Institute, Inc., Cary, North Carolina) to extract the matched cohorts and outcome data (see Table S4 for parameter specifications used for the modules). Sentinel Query Request Package Reporting Tool (version 2.1.0) was used to generate tables and figures. We used TreeScan software (version 2.1.1; www.treescan.org) to conduct the TBSS.

Results

Cohort characteristics

We identified a total of 106 143 SGLT2i initiators and 118 575 DPP4i initiators. The baseline characteristics of individuals in the 2 exposure groups before matching are included in Table 1. Briefly, compared to DPP4i initiators, SGLT2i initiators were slightly younger (mean age, 55 vs 58 years), had fewer comorbidities (mean Charlson/Elixhauser combined comorbidity score, 0.9 vs 1.3), and had fewer or less severe diabetes complications (mean adapted Diabetes Complications Severity Index, 0.9 vs 1.1). Initiators of SGLT2i were also less likely than DPP4i initiators to have a baseline diagnosis of CKD (9.9% vs 14.8%) or heart failure (4.1% vs 6.6%). The median follow-up time before matching for SGLT2i and DPP4i initiators was 116 (interquartile range [IQR], 43-336) and 104 (IQR, 43-290) days, respectively. The distribution of censoring reasons for both groups before matching was comparable (Table S5), with most censored due to end of treatment episode (76%-77%) and disenrollment (18%-19%).

Table 1.

Baseline characteristics of the study cohort before and after 1:1 propensity score matching.

	Before matching					After 1:1 propensity score matching
	SGLT2i initiators		DPP4i initiators		Standardized mean difference	SGLT2i initiators		DPP4i initiators		Standardized mean difference
	Number/Mean	%/SD	Number/Mean	%/SD	Standardized mean difference	Number/Mean	%/SD	Number/Mean	%/SD	Standardized mean difference
Number of patients	106 143		118 575			80 510		80 510
Patient characteristics
Age, years	54.7	9.8	58.0	11.9	−0.303	55.3	9.8	55.2	10.6	0.018
Female	47 272	44.5%	55 173	46.5%	−0.040	35 984	44.7%	35 816	44.5%	0.004
Index year of initiation
2015	6386	6.0%	8756	7.4%	−0.055	5235	6.5%	5203	6.5%	0.002
2016	27 446	25.9%	37 974	32.0%	−0.136	22 441	27.9%	22 501	27.9%	−0.002
2017	26 589	25.1%	31 015	26.2%	−0.025	20 636	25.6%	20 704	25.7%	−0.002
2018	21 933	20.7%	23 022	19.4%	0.031	16 633	20.7%	16 685	20.7%	−0.002
2019	23 789	22.4%	17 808	15.0%	0.190	15 565	19.3%	15 417	19.1%	0.005
Diabetes-related covariates
Adapted diabetes complications severity index	0.9	1.3	1.1	1.7	−0.171	0.9	1.4	0.8	1.4	0.020
Glucose-lowering drugs
Metformin	85 236	80.3%	89 307	75.3%	0.120	64 009	79.5%	64 245	79.8%	−0.007
Sulfonylurea	34 389	32.4%	41 780	35.2%	−0.060	26 771	33.3%	26 712	33.2%	0.002
GLP-1 agonist	25 303	23.8%	7166	6.0%	0.515	7461	9.3%	7077	8.8%	0.017
Thiazolidinedione	7800	7.3%	6215	5.2%	0.087	4961	6.2%	4847	6.0%	0.006
α-glucosidase inhibitor	299	0.3%	347	0.3%	−0.002	222	0.3%	216	0.3%	0.001
Insulin	28 243	26.6%	16 994	14.3%	0.308	14 617	18.2%	14 287	17.7%	0.011
Comorbidities and comedications
Charlson/Elixhauser combined comorbidity score	0.9	1.6	1.3	2.2	−0.204	0.9	1.7	0.9	1.7	0.020
Diagnoses and procedures
Anemia	10 232	9.6%	16 068	13.6%	−0.122	8014	10.0%	7881	9.8%	0.006
Arrhythmia	9362	8.8%	14 176	12.0%	−0.103	7230	9.0%	6928	8.6%	0.013
Autoimmune disease	8951	8.4%	10 008	8.4%	−0.000	6353	7.9%	6171	7.7%	0.008
Bacterial infection	15 519	14.6%	21 251	17.9%	−0.090	11 958	14.9%	11 795	14.7%	0.006
Coagulopathy	1309	1.2%	2454	2.1%	−0.066	1086	1.3%	1057	1.3%	0.003
Colonoscopy	10 006	9.4%	11 217	9.5%	−0.001	7505	9.3%	7529	9.4%	−0.001
Degenerative disease of the central nervous system	11 246	10.6%	14 741	12.4%	−0.058	8525	10.6%	8409	10.4%	0.005
Durable medical equipment	2615	2.5%	4536	3.8%	−0.078	2053	2.5%	2030	2.5%	0.002
Fecal occult blood test	6340	6.0%	7030	5.9%	0.002	4865	6.0%	4954	6.2%	−0.005
Fluid and electrolyte disorder	6464	6.1%	11 682	9.9%	−0.139	5176	6.4%	5027	6.2%	0.008
Gallstones	1719	1.6%	2398	2.0%	−0.030	1342	1.7%	1327	1.6%	0.001
Human papillomavirus DNA test	54	0.1%	75	0.1%	−0.005	39	0.0%	50	0.1%	−0.006
Hyperparathyroidism	460	0.4%	766	0.6%	−0.029	354	0.4%	325	0.4%	0.006
Kawasaki disease	1	0.0%	3	0.0%	−0.004	1	0.0%	3	0.0%	−0.005
Mammogram	21 314	20.1%	22 794	19.2%	0.022	15 912	19.8%	15 999	19.9%	−0.003
Organ transplant	531	0.5%	1033	0.9%	−0.045	434	0.5%	432	0.5%	0.000
Other infections	5545	5.2%	6223	5.2%	−0.001	4159	5.2%	4143	5.1%	0.001
Prostate-specific antigen test	23 039	21.7%	23 668	20.0%	0.043	17 541	21.8%	17 726	22.0%	−0.006
Pap smear	11 602	10.9%	11 635	9.8%	0.037	8564	10.6%	8632	10.7%	−0.003
Psychosis	11 903	11.2%	13 444	11.3%	−0.004	8652	10.7%	8610	10.7%	0.002
Pulmonary circulation disorders	569	0.5%	1086	0.9%	−0.045	461	0.6%	425	0.5%	0.006
Pulmonary disease	10 912	10.3%	14 737	12.4%	−0.068	8492	10.5%	8398	10.4%	0.004
Renal failure	4698	4.4%	11 991	10.1%	−0.220	4004	5.0%	3623	4.5%	0.022
Reye’s syndrome	0	0.0%	0	0.0%	-	0	0.0%	0	0.0%	-
Screening, examinations and disease management training	8372	7.9%	8472	7.1%	0.028	6073	7.5%	6126	7.6%	−0.002
Thrombotic and thrombocytopenic purpura	4	0.0%	26	0.0%	−0.016	4	0.0%	11	0.0%	−0.009
Weight loss	248	0.2%	781	0.7%	−0.064	219	0.3%	188	0.2%	0.008
Acute myocardial infarction	1640	1.5%	2207	1.9%	−0.024	1251	1.6%	1191	1.5%	0.006
Alzheimer’s disease	78	0.1%	614	0.5%	−0.082	71	0.1%	134	0.2%	−0.022
Asthma	7494	7.1%	8813	7.4%	−0.014	5670	7.0%	5684	7.1%	−0.001
Benign prostatic hyperplasia	5208	4.9%	7612	6.4%	−0.065	4185	5.2%	4090	5.1%	0.005
Cataract	14 747	13.9%	19 153	16.2%	−0.063	11 137	13.8%	11 091	13.8%	0.002
Chronic kidney disease	10 543	9.9%	17 564	14.8%	−0.149	7884	9.8%	7526	9.3%	0.015
Chronic obstructive pulmonary disease	7262	6.8%	10 748	9.1%	−0.082	5828	7.2%	5703	7.1%	0.006
Depressive bipolar disorder	13 215	12.5%	14 561	12.3%	0.005	9540	11.8%	9501	11.8%	0.002
Diabetes	106 142	100.0%	118 572	100.0%	0.004	80 509	100.0%	80 509	100.0%	0.000
Glaucoma	6878	6.5%	9338	7.9%	−0.054	5231	6.5%	5137	6.4%	0.005
Heart failure	4375	4.1%	7822	6.6%	−0.110	3437	4.3%	3252	4.0%	0.012
Hip fracture	93	0.1%	344	0.3%	−0.047	78	0.1%	58	0.1%	0.009
Hyperlipidemia	81 260	76.6%	87 646	73.9%	0.061	60 287	74.9%	60 218	74.8%	0.002
Hypertension	80 440	75.8%	90 103	76.0%	−0.005	60 265	74.9%	60 177	74.7%	0.003
Hyperthyroidism	15 875	15.0%	17 662	14.9%	0.002	11 692	14.5%	11 488	14.3%	0.007
Ischemic heart disease	13 969	13.2%	17 381	14.7%	−0.043	10 325	12.8%	10 091	12.5%	0.009
Non-Alzheimer’s dementia	287	0.3%	1747	1.5%	−0.130	262	0.3%	424	0.5%	−0.031
Osteoporosis	1141	1.1%	2252	1.9%	−0.068	969	1.2%	920	1.1%	0.006
Parkinson	142	0.1%	488	0.4%	−0.053	120	0.1%	105	0.1%	0.005
Pneumonia	3712	3.5%	6208	5.2%	−0.085	3020	3.8%	2893	3.6%	0.008
Rheumatoid arthritis	20 303	19.1%	25 487	21.5%	−0.059	15 499	19.3%	15 408	19.1%	0.003
Stroke and transient ischemic attack	2456	2.3%	4537	3.8%	−0.088	1973	2.5%	1916	2.4%	0.005
Attention deficit and hyperactivity disorder	1299	1.2%	1144	1.0%	0.025	881	1.1%	871	1.1%	0.001
Alcohol use	1042	1.0%	1365	1.2%	−0.016	827	1.0%	845	1.0%	−0.002
Autism	48	0.0%	57	0.0%	−0.001	35	0.0%	35	0.0%	0.000
Anxiety disorder	12 323	11.6%	13 626	11.5%	0.004	9144	11.4%	9098	11.3%	0.002
Bipolar disorder	1617	1.5%	1929	1.6%	−0.008	1139	1.4%	1232	1.5%	−0.010
Cerebral palsy	31	0.0%	49	0.0%	−0.006	24	0.0%	30	0.0%	−0.004
Cystic fibrosis	810	0.8%	845	0.7%	0.006	576	0.7%	568	0.7%	0.001
Depressive disorder	11 533	10.9%	12 720	10.7%	0.004	8374	10.4%	8247	10.2%	0.005
Drug use disorder	1223	1.2%	1463	1.2%	−0.008	906	1.1%	933	1.2%	−0.003
Epilepsy	570	0.5%	895	0.8%	−0.027	454	0.6%	435	0.5%	0.003
Fibromyalgia and chronic pain	16 575	15.6%	18 119	15.3%	0.009	12 159	15.1%	12 119	15.1%	0.001
Human immunodeficiency virus	305	0.3%	403	0.3%	−0.009	239	0.3%	238	0.3%	0.000
Intellectual disability	49	0.0%	84	0.1%	−0.010	36	0.0%	48	0.1%	−0.007
Learning disability	44	0.0%	85	0.1%	−0.013	32	0.0%	40	0.0%	−0.005
Leukemia and lymphoma	657	0.6%	988	0.8%	−0.025	507	0.6%	506	0.6%	0.000
Liver disease	9089	8.6%	10 095	8.5%	0.002	6690	8.3%	6754	8.4%	−0.003
Migraine	4683	4.4%	5207	4.4%	0.001	3497	4.3%	3451	4.3%	0.003
Mobility impairment	569	0.5%	1389	1.2%	−0.069	460	0.6%	550	0.7%	−0.014
Muscular dystrophy	17	0.0%	26	0.0%	−0.004	16	0.0%	13	0.0%	0.003
Multiple sclerosis	313	0.3%	397	0.3%	−0.007	241	0.3%	255	0.3%	−0.003
Obesity	42 081	39.6%	38 386	32.4%	0.152	29 189	36.3%	29 058	36.1%	0.003
Opioid use disorder	751	0.7%	900	0.8%	−0.006	546	0.7%	575	0.7%	−0.004
Developmental disorder	15	0.0%	20	0.0%	−0.002	13	0.0%	11	0.0%	0.002
Peripheral vascular disorder	5681	5.4%	9350	7.9%	−0.102	4449	5.5%	4327	5.4%	0.007
Personality disorder	1223	1.2%	1322	1.1%	0.004	877	1.1%	877	1.1%	0.000
Post-traumatic stress disorder	659	0.6%	654	0.6%	0.009	454	0.6%	472	0.6%	−0.003
Pressure and chronic ulcer	2113	2.0%	3286	2.8%	−0.051	1586	2.0%	1508	1.9%	0.007
Schizophrenia	149	0.1%	300	0.3%	−0.025	132	0.2%	151	0.2%	−0.006
Schizophrenic psychosis	421	0.4%	894	0.8%	−0.047	358	0.4%	348	0.4%	0.002
Blind and visual impairment	65	0.1%	178	0.2%	−0.027	49	0.1%	74	0.1%	−0.011
Deaf and hearing impairment	3040	2.9%	4252	3.6%	−0.041	2402	3.0%	2213	2.7%	0.014
Spina bifida	38	0.0%	78	0.1%	−0.013	31	0.0%	29	0.0%	0.001
Spinal injury	151	0.1%	321	0.3%	−0.028	118	0.1%	107	0.1%	0.004
Tobacco use	6862	6.5%	8418	7.1%	−0.025	5439	6.8%	5449	6.8%	−0.000
Traumatic brain injury	149	0.1%	243	0.2%	−0.016	111	0.1%	127	0.2%	−0.005
Viral hepatitis	1620	1.5%	2751	2.3%	−0.058	1347	1.7%	1286	1.6%	0.006
Mental and physical impairment	3717	3.5%	5833	4.9%	−0.071	2936	3.6%	2878	3.6%	0.004
Other comedications
Gout medications	56 857	53.6%	63 665	53.7%	−0.003	42 995	53.4%	42 772	53.1%	0.006
Oxicam medications	19 350	18.2%	21 135	17.8%	0.011	14 496	18.0%	14 509	18.0%	−0.000
Sertraline	4683	4.4%	4981	4.2%	0.010	3351	4.2%	3359	4.2%	−0.000
Sulfa antibiotics	9487	8.9%	11 044	9.3%	−0.013	7094	8.8%	7058	8.8%	0.002
Health care utilization characteristics
Mean number of ambulatory encounters	14.2	12.8	14.9	15.3	−0.053	13.8	12.8	13.7	13.6	0.008
Mean number of emergency room encounters	0.4	1.1	0.5	1.3	−0.092	0.4	1.1	0.4	1.0	0.007
Mean number of inpatient hospital encounters	0.1	0.4	0.2	0.6	−0.164	0.1	0.4	0.1	0.4	0.011
Mean number of nonacute institutional encounters	0.0	0.0	0.0	0.0	−0.004	0.0	0.0	0.0	0.0	−0.003
Mean number of other ambulatory encounters	3.0	5.0	3.7	7.5	−0.105	3.0	5.1	3.0	5.3	−0.013
Mean number of filled prescriptions	35.1	26.2	33.2	26.3	0.075	33.0	25.1	32.8	26.0	0.007
Mean number of generics dispensed	10.0	5.8	9.7	5.9	0.061	9.6	5.7	9.5	5.8	0.016
Mean number of unique drug classes dispensed	8.9	5.0	8.7	5.2	0.035	8.6	4.9	8.5	5.0	0.010

Abbreviations: SD: Standard deviation; %: Percentage.

After 1:1 propensity score matching, there were 80 510 pairs (Figure 3), corresponding to a reduction in sample size of approximately 25% after matching. All baseline characteristics were balanced after matching, as indicated by absolute standardized mean differences ≤ .1 (Table 1). Figure S2 of the supplemental material shows the propensity score distribution of both groups before and after matching. From inpatient admissions, emergency department presentations, and ambulatory care, there were 215 333 incident outcomes among 45 444 SGLT2i initiators and 223 428 among 45 931 DPP4i initiators.

Figure 3. Cohort attrition in preparing the analytic cohort for tree-based scan statistic analysis. ED: Emergency department; PS: Propensity score; T1DM(+): Presence of a diagnosis for type 1 diabetes; T2DM(-): Absence of a diagnosis for type 2 diabetes.

Repurposing signals

In the original pruned outcome tree, there were 175 922 incident outcomes among SGLT2i initiators and 183 824 among DPP4i initiators, across 30 555 nodes (levels 3, 4, and 5). Tree-based scan statistics analysis using the original pruned tree yielded 18 statistical alerts (ie, prioritized associations that met the statistical threshold for alerting; P ≤ .01; Table 2). The statistical alerts were predominantly outcomes relating to kidney diseases, anemia, and clinical symptoms, such as edema and dyspnea. As for the expected signals, CKD (N18) was identified as the most likely node (P = .0001), while heart failure (I50) was the first node that fell beyond the threshold for prioritization (P = .0167). We present the complete list of inverse associations with P < 1 in Table S6 of the supplemental material.

Table 2.

Tree-based scan statistics for associations between SGLT2i vs DPP4i and outcomes (only for associations with P < .1).^a

Node	Description	Total outcomes	Observed outcomes (SGLT2i)	Expected outcomes ^b (SGLT2i)	Observed: expected outcomes (SGLT2i)	Log likelihood ratio (scan statistic)	P value
Associations with statistical alert (P ≤ .01)
N18	Chronic kidney disease (CKD)	1470	594	735	0.81	27.21738	0.0001
N18.3	Chronic kidney disease, stage 3 (moderate)	722	270	361	0.75	23.18839	0.0001
D64	Other anemias	1415	581	707.5	0.82	22.7401	0.0001
D64.9	Anemia, unspecified	1356	556	678	0.82	22.07283	0.0001
R60.0	Localized edema	941	371	470.5	0.79	21.20169	0.0001
R60	Edema, not elsewhere classified	1564	656	782	0.84	20.39056	0.0001
I12	Hypertensive chronic kidney disease	833	333	416.5	0.8	16.85408	0.0001
E83.4	Disorders of magnesium metabolism	307	106	153.5	0.69	14.94276	0.0002
I12.9	Hypertensive chronic kidney disease with stage 1-4 or unspecified chronic kidney disease	802	324	401	0.81	14.87777	0.0002
R06	Abnormalities of breathing	4176	1920	2088	0.92	13.53186	0.0005
R80	Proteinuria	852	351	426	0.82	13.2733	0.0006
E83.42	Hypomagnesemia	287	101	143.5	0.7	12.7779	0.0008
N25.8	Other disorders resulting from impaired renal tubular function	74	17	37	0.46	11.41062	0.0055
R06.0	Dyspnea	3177	1454	1588.5	0.92	11.40191	0.0055
R80.9	Proteinuria, unspecified	754	312	377	0.83	11.26309	0.0056
D63.1	Anemia in chronic kidney disease	124	36	62	0.58	11.24766	0.0058
N20	Calculus of kidney and ureter	1070	458	535	0.86	11.12082	0.0066
I51	Complications and ill-defined descriptions of heart disease	1232	534	616	0.87	10.94805	0.0089
Nonsignificant associations (P > .01)^a
I50	Heart failure	846	357	423	0.84	10.34007	0.0167
R14.2	Eructation	42	7	21	0.33	10.18861	0.018
N25	Disorders resulting from impaired renal tubular function	95	26	47.5	0.55	10.09454	0.0198
D63	Anemia in chronic diseases classified elsewhere	217	76	108.5	0.7	9.886091	0.0236
N25.81	Secondary hyperparathyroidism of renal origin	70	17	35	0.49	9.715734	0.0266
R09	Other symptoms and signs involving the circulatory and respiratory system	1967	887	983.5	0.9	9.483731	0.0372
N13.2	Hydronephrosis with renal and ureteral calculous obstruction	127	40	63.5	0.63	8.907116	0.0713
N19	Unspecified kidney failure	119	37	59.5	0.62	8.72376	0.0784

^a Refer to Table S6 for associations with P > .1.

^b The number of expected outcomes at a node was calculated as half of the total number of outcomes from both the exposure and comparator groups.

Sensitivity analyses

After further pruning of the outcome tree, the total node count scanned decreased by 14% from 30 555 to 26 288 (Table S3). The number of incident outcomes decreased by 24% after additional pruning (133 821 incident outcomes among SGLT2i initiators; 139 083 among DPP4i initiators). The analysis using this further pruned tree yielded a total of 12 statistical alerts (P ≤ .01; Table S7). All the inverse associations in this sensitivity analysis were also identified, and in the same order, in the primary analysis using the original pruned tree, but most had lower P values, such as for heart failure (P = .0167 [original pruned tree] vs P = .0134 [further pruned tree]). No additional inverse associations with P < 1 were identified using the further pruned tree.

When scanning additionally at level 2 of the original outcome tree, TBSS analysis yielded a total of 15 statistical alerts (P ≤ .01; Table S8). Notably, several level 2 outcome nodes were also identified as statistical alerts. As for the expected signals, CKD (N18) remained as one of the statistical alerts (P = .0027); heart failure (I50) remained slightly beyond the threshold for prioritization (P = .0333).

When restricting incident outcomes to diagnoses from inpatient admissions and emergency department presentations only, there were 29 773 incident outcomes among 5942 SGLT2i initiators and 34 001 incident outcomes among 6473 DPP4i initiators. The analysis conducted without including diagnoses from ambulatory care yielded a total of 5 statistical alerts (P ≤ .01; Table S9). Neither expected signal was identified as a statistical alert (CKD, P = .9695; heart failure, P = .2922).

Discussion

This study demonstrated a novel implementation of TBSS to generate drug repurposing hypotheses. Our test case using the glucose-lowering drug class SGLT2i identified the 2 expected signals, CKD and heart failure, which align with recently approved indications of SGLT2i. Chronic kidney disease was identified as the most statistically significant alert (P = .0001); heart failure (P = .0167) fell just beyond the statistical alert threshold (P ≤ .01), which might have been influenced by the specification of the statistical alert threshold and outcome tree (discussed below), in addition to the number of events and the magnitude of the observed association. Furthermore, most of the statistical alerts could be related to clinical signs, symptoms, and abnormal laboratory results linked to heart failure and/or CKD, such as dyspnea, edema, and proteinuria.³⁷,³⁸ Statistical alerts pertaining to anemia are also suggestive of complications of heart failure and/or CKD.³⁹,⁴⁰ However, previous clinical studies have reported an association between SGLT2i use and improved hematocrit,⁴¹–⁴³ which is supported by emerging evidence that SGLT2i may stimulate erythropoiesis independent of their diuretic effect.⁴⁴–⁴⁶

The multiplicity-adjusted P value was used as a metric to prioritize nodes to be evaluated as repurposing signals, similar to previous studies using TBSS to look for drug safety signals.⁵,¹⁰,¹²,⁴⁷ The list of statistical alerts drew our attention to nodes where the reduced risk of outcomes associated with SGLT2i use was least likely to have occurred by chance. We used a conservative P value threshold of .01, but a standard significance level of .05 has been used in some TBSS studies for safety signals.¹⁰,⁴⁸ The threshold for prioritization may also be further relaxed (eg, P ≤ .1) in underpowered studies, such as for rare diseases.⁴⁸ It is important to note the arbitrary nature of prespecifying the statistical alert threshold for prioritization, and one should consider the trade-off between minimizing the false-positive rate and missing some “true” repurposing signals when using a more stringent threshold. If the significance level were relaxed to .05 in our study, there would be 6 additional statistical alerts, including heart failure. This finding also highlights the value of having clinicians review not only associations meeting the prespecified threshold for prioritization but also those falling slightly beyond the arbitrary threshold. In fact, a clinician might have been able to point out heart failure as a repurposing signal by piecing together a clinical picture based on the many clinical signs and symptoms of heart failure that were among the statistical alerts (eg, dyspnea and edema). Additionally, one could consider circumventing the use of a threshold for statistical alerting and instead review the list of associations in order of increasing P value, with an emphasis on associations with lower P values.
Although some studies have also prioritized associations based on their magnitude in addition to statistical significance,⁴⁷ we did not do this because some collateral drug benefits may have a relatively small effect size but still carry substantial public health implications owing to the prevalence of the condition. We also did not prioritize associations based on absolute effect size, as a small absolute effect size may still suggest important novel therapeutics for rare or orphan diseases. Lastly, the multiplicity adjustment employed in TBSS accounts for dependencies between the associations evaluated and is less conservative than traditional methods of accounting for multiple testing, such as Bonferroni corrections, which could have led to higher rates of false negatives.⁴⁷
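The contrast with a Bonferroni correction can be illustrated with a rough sketch, which is not an analysis from this study. It takes one tabulated node (N13.2, read as 40 outcomes among SGLT2i initiators out of 127 in total, an assumed reading of the table columns) and computes an exact one-sided binomial P value under a null probability of 0.5. It then multiplies this value by the 30 555 nodes scanned in the original pruned tree. The choice of test and the function names are assumptions made for illustration.

```python
import math


def lower_tail_p(exposed, total):
    """Exact one-sided P(X <= exposed) for X ~ Binomial(total, 0.5)."""
    favourable = sum(math.comb(total, k) for k in range(exposed + 1))
    return favourable / 2 ** total


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)


N_NODES = 30_555        # nodes scanned in the original pruned tree
TBSS_ADJUSTED_P = 0.0713  # multiplicity-adjusted P value tabulated for N13.2

raw_p = lower_tail_p(40, 127)
bonferroni_p = bonferroni(raw_p, N_NODES)
print(f"unadjusted={raw_p:.2e}  Bonferroni={bonferroni_p:.3f}  TBSS={TBSS_ADJUSTED_P}")
```

On these inputs the Bonferroni-adjusted value is well above the tabulated TBSS P value of .0713. This is in line with the point that treating every node as an independent test is the more conservative approach.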

This study demonstrated several other important considerations when designing drug repurposing studies using TBSS. Notably, the size of the outcome tree (ie, total number of nodes scanned) affects the ability to prioritize associations and identify repurposing signals. When using a “narrower” tree with a lower total number of nodes scanned, the maximum likelihood ratios generated from the 9999 simulated data sets decrease and have a narrower distribution across the smaller number of nodes. This effectively increases the probability that the likelihood ratios of the observed nodes rank higher, which leads to smaller P values for the same node and makes it easier for the node to be prioritized. In our test case of SGLT2i, the P value for heart failure was slightly smaller after further pruning the outcome tree (P = .0167 before vs P = .0134 after; Table S7). Outcome trees may be pruned to remove outcomes that are of less interest or less informative, such as nonspecific signs and symptoms, or those potentially affected by incorrect temporality relative to exposure (eg, reverse causation for cancer outcomes). Going a step further, the entire tree could be restricted to a specific therapeutic area if warranted by a priori knowledge, which would address a different and more targeted research aim (eg, identifying repurposing signals of SGLT2i for cardiovascular outcomes). If we had restricted the outcome tree to only cardiovascular diseases (I00-I99; post hoc analysis), then the P value for heart failure would have decreased notably further and met the prespecified threshold for prioritization (P = .0167 before vs P = .0018 after; see Table S10).
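The effect of tree size on the Monte Carlo P value can be sketched under strong simplifying assumptions. The nodes here are independent leaves with a fixed, hypothetical number of outcomes each, so the hierarchy is ignored. The null draws are Binomial(n, 0.5), and the P value is the rank of the observed log-likelihood ratio among the simulated maxima divided by the number of replicates plus 1. The sketch uses 999 replicates rather than 9999 to keep it quick, and every name and count in it is illustrative.

```python
import math
import random


def inverse_llr(exposed, total):
    """Bernoulli log-likelihood ratio (p = 0.5) for a deficit of exposed outcomes."""
    if total == 0 or exposed >= total / 2:
        return 0.0
    comparator = total - exposed
    llr = comparator * math.log(comparator / total) - total * math.log(0.5)
    if exposed > 0:
        llr += exposed * math.log(exposed / total)
    return llr


def simulate_node_llrs(node_totals, n_sims, rng):
    """Return one row per simulated data set holding one null LLR per node."""
    rows = []
    for _ in range(n_sims):
        row = []
        for total in node_totals:
            # Counting set bits in `total` fair random bits is a Binomial(total, 0.5) draw.
            exposed = bin(rng.getrandbits(total)).count("1")
            row.append(inverse_llr(exposed, total))
        rows.append(row)
    return rows


def monte_carlo_p(observed_llr, simulated_maxima):
    """Rank of the observed LLR among the simulated maxima, over replicates + 1."""
    rank = 1 + sum(1 for m in simulated_maxima if m >= observed_llr)
    return rank / (len(simulated_maxima) + 1)


rng = random.Random(0)
node_totals = [100] * 200            # hypothetical tree: 200 leaves, 100 outcomes each
sims = simulate_node_llrs(node_totals, 999, rng)
observed = inverse_llr(35, 100)      # hypothetical node: 35 of 100 outcomes in the exposed

p_full = monte_carlo_p(observed, [max(row) for row in sims])         # all 200 nodes
p_narrow = monte_carlo_p(observed, [max(row[:50]) for row in sims])  # first 50 nodes only
print(p_full, p_narrow)
```

Because the narrower tree takes each simulated maximum over a subset of the same nodes, that maximum can never exceed the full-tree maximum, so `p_narrow` is never larger than `p_full` in this sketch. With 9999 replicates the smallest attainable value under this formula is 1/10 000 = .0001. In a real analysis, pruning also changes the observed data, which this toy example does not model.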

The hierarchical grouping of diagnoses within the outcome tree influences the identification of potential repurposing signals using TBSS. Clinical conditions whose parent and related diagnoses fall within the same branch, or sit close to each other within the tree, yield larger aggregated sample sizes at the nodes, which translates to greater power to detect potential signals. In our study, dyspnea (R06.0; P = .0055) was identified as a more likely cut than heart failure (I50; P = .0167) owing to the substantially larger number of outcomes (3177 vs 846). It is possible that heart failure could have been identified as a more likely node (ie, with a lower P value) if the occurrence of dyspnea, a common clinical presentation of heart failure that is nonetheless distant from it within the tree, could have been taken into account. However, it is important to acknowledge that subclinical symptoms, such as dyspnea, can suggest a myriad of other parent diagnoses, such as obstructive respiratory diseases. Furthermore, parent diagnoses of interest may be grouped at higher hierarchical levels of the tree, and scanning at higher levels of the tree may affect the associations prioritized. The statistical significance of the scan statistics for liver diseases increased when scanning additionally at level 2 of the outcome tree (K70-K77; P = .0452; Table S8). Indeed, previous clinical studies have reported reduced hepatic fibrosis and steatosis with SGLT2i use.⁴⁹,⁵⁰ This finding has been attributed to various pharmacological effects, including a reduction in oxidative stress and inflammation.⁵¹,⁵² However, it is important to note that scanning at a higher level of the tree requires a more stringent incidence criterion (ie, defining the incidence of outcomes at the highest level of the tree scanned). Similar to reducing the size of the outcome tree via pruning, scanning across fewer levels (eg, at level 3 only) will theoretically increase statistical power.
However, one should consider the trade-off between power and the ability to detect potentially important associations at a finer-grained level, which may be more useful for drug repurposing. Lastly, future studies may also customize or construct a bespoke outcome tree with an enhanced grouping of diagnoses.
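The way outcomes aggregate up a branch can be sketched as follows. The leaf counts are hypothetical and the roll-up rule is a simplification: each ICD-10-CM code contributes to every shorter prefix down to its 3-character category. The study's tree also has broader levels above the category (eg, K70-K77) and applies an incidence criterion, neither of which is modeled here.

```python
from collections import defaultdict


def ancestors(code):
    """Yield a code and each shorter ICD-10-CM prefix, down to the 3-character category."""
    compact = code.replace(".", "")
    for length in range(len(compact), 2, -1):
        prefix = compact[:length]
        yield prefix if length == 3 else prefix[:3] + "." + prefix[3:]


def roll_up(leaf_counts):
    """Sum (exposed, comparator) outcome counts from leaves into all ancestor nodes."""
    totals = defaultdict(lambda: [0, 0])
    for code, (exposed, comparator) in leaf_counts.items():
        for node in ancestors(code):
            totals[node][0] += exposed
            totals[node][1] += comparator
    return {node: tuple(counts) for node, counts in totals.items()}


# Hypothetical leaf-level counts: (SGLT2i initiators, DPP4i initiators)
leaf_counts = {"I50.21": (10, 14), "I50.22": (12, 18), "I50.9": (30, 41)}

nodes = roll_up(leaf_counts)
for node in sorted(nodes):
    print(node, nodes[node])
```

In this toy example each leaf holds only a few dozen outcomes, whereas the parent node I50 pools all of them. Pooling of this kind is what gives a higher-level node more power than its individual children. Related diagnoses that sit in a different branch, such as R06.0 relative to I50, never pool in this way.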

Another consideration is the clinical settings from which the outcome data are sourced, which influence the overall number of outcomes as well as the prevalence and distribution of recorded diagnoses. In our study, using diagnoses from inpatient admissions, emergency department presentations, and ambulatory care conferred an approximately 7-fold larger number of incident outcomes (n = 438 761) compared with using diagnoses from inpatient admissions and emergency department presentations only (n = 63 774). Moreover, the total number of incident outcomes was slightly more balanced between SGLT2i initiators and DPP4i initiators when considering diagnoses from all 3 clinical settings (Figure 2). This balance may suggest better exchangeability between the analytical cohorts, as people have a comparable number of incident outcomes diagnosed during follow-up regardless of the study drug received.⁵³ Inpatient admissions and emergency department presentations may better capture acute medical events or diseases of greater severity, while ambulatory care may provide a more comprehensive record of subacute medical conditions or milder stages of diseases. Therefore, including diagnoses from all clinical settings may provide a more complete picture of clinical outcomes. For example, in our study, neither expected signal was identified as a statistical alert when restricting the analysis to diagnoses from only inpatient admissions and emergency department presentations (heart failure, P = .2922; CKD, P = .9695; Table S9).

Our study had some notable strengths. First, we used data from a large set of linked administrative databases capturing longitudinal records of healthcare utilization and outcomes from hospital and ambulatory settings, which provided a large cohort size and a large number of events across the hierarchical outcome tree, especially at finer-grained levels. Second, we used an example drug class where recently approved indications for these medications could serve as positive controls (ie, expected repurposing signals) to evaluate the utility of the methodology.

However, our study had several limitations. First, there may have been residual confounding in the observed associations since granular clinical characteristics (eg, renal function and glycemic control) were not available in the data. Furthermore, SGLT2i were initially contraindicated in individuals with poor renal function, and there was an early perception of less benefit with SGLT2i use in people with impaired renal function.⁵⁴ This might have introduced some confounding by indication in the observed inverse association between SGLT2i use and renal-related outcomes. Although we accounted for a general list of common confounders across all the outcomes assessed, prioritized repurposing signals would need to be further scrutinized and validated in follow-up pharmacoepidemiologic studies with confounding control tailored to the specific drug-outcome pair, or in a randomized trial. Second, signals identified using TBSS could theoretically reflect safety signals of the comparator drug instead of repurposing signals of the exposure drug. This concern can be mitigated by excluding signals for known adverse effects of the comparator drug when evaluating the prioritized nodes. Third, we used a 1-year lookback period to ascertain baseline comorbidities, which may have resulted in under-ascertainment. A longer lookback period could have improved the sensitivity of capturing chronic comorbidities but would have limited the sample size and representativeness of the study population. Fourth, we censored both members of a matched pair when either member was censored for any of the censoring reasons. This design was required for the propensity score-matched TBSS approach, but it reduced the number of events and hence the power of the analyses. Fifth, we did not use a baseline washout period for outcomes, which means prevalent health outcomes, especially chronic diseases, may have been included.
Sixth, MarketScan data have not included death data since 2016 for patient privacy.⁵⁵ Hence, censoring due to deaths might not be complete. Last, while MarketScan data are nationally representative of individuals in the United States with employer-sponsored insurance, who account for a significant portion (~65%) of the population,⁵⁶ it is possible that these data may not be generalizable to individuals with public insurance, such as Medicare and Medicaid services, as well as those who are uninsured.

Conclusion

In our case study using the SGLT2i drug class, TBSS was able to identify expected repurposing signals representing additional indications recently approved for this drug class. Several potential repurposing signals, such as for anemia and liver disease, were detected and should be further investigated. There are several important considerations when conducting TBSS for drug repurposing, including the statistical threshold used to prioritize associations, specification of the outcome tree, and clinical settings used to capture outcomes. Future studies could apply this methodology to other drugs of interest to generate repurposing hypotheses from RWD.

Author contributions

G.S.Q.T. contributed to the design of the study, performed the statistical analysis and literature search, and wrote and revised the manuscript. X.L., J.W., J.C.M., S.V.W., J.I.M., and J.I. contributed to the design of the study and revision of the manuscript. S.T. contributed to the acquisition of data, design of the study, and revision of the manuscript. G.S.Q.T. is the guarantor of this work and, as such, had full access to all the data in the study, and takes responsibility for the integrity of the data and the accuracy of the data analyses.

Acknowledgments

We thank Jenice Ko from Harvard Pilgrim Health Care Institute for her assistance with the Sentinel Routine Query Modules, and Dr. Thuy Thai from Harvard Pilgrim Health Care Institute for her help with using and interpreting results from the TreeScan software.

Contributor Information

George S Q Tan, Centre for Medicine Use and Safety, Faculty of Pharmacy and Pharmaceutical Sciences, Monash University, Parkville, Australia; Baker Heart and Diabetes Institute, Melbourne, Australia.

Judith C Maro, Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, United States.

Shirley V Wang, Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, United States.

Sengwee Toh, Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, United States; Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States.

Jedidiah I Morton, Centre for Medicine Use and Safety, Faculty of Pharmacy and Pharmaceutical Sciences, Monash University, Parkville, Australia; Baker Heart and Diabetes Institute, Melbourne, Australia.

Jenni Ilomäki, Centre for Medicine Use and Safety, Faculty of Pharmacy and Pharmaceutical Sciences, Monash University, Parkville, Australia.

Jenna Wong, Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, United States.

Xiaojuan Li, Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, United States.

Supplementary material

Supplementary material is available at American Journal of Epidemiology online.

Funding

G.S.Q.T. was supported by the Monash Graduate Scholarship and the Enhanced Research Experience program, Monash University, Australia. J.C.M. received support from the Harvard Pilgrim Health Care Institute Robert H. Ebert Career Development Award. X.L. received support from grant K01AG073651 from the National Institute on Aging.

Conflict of interest

S.V.W. has consulted for Veracity Healthcare Analytics, Exponent Inc, and MITRE, an FFRDC for the Centers for Medicare and Medicaid, for unrelated work. S.T. consults for Pfizer, Inc. and TriNetX, LLC. for unrelated work. J.I. has received funding from AstraZeneca, PLC., and Amgen, Inc. for unrelated work.

Data availability

The MarketScan data that support the findings of this study are available from Merative and were licensed for use by Harvard Pilgrim Health Care Institute. Restrictions apply to the availability of these data, and so they are not publicly available. Results are, however, available from the authors upon reasonable request and according to the data-use agreement. The computing code came from the Sentinel Routine Query Modules (version 12.1.2), namely the Cohort Identification and Descriptive Analysis, Propensity Score Analysis, and Signal Identification modules.

References

1. Baker NC, Ekins S, Williams AJ. A bibliometric review of drug repurposing. Drug Discov Today. 2018;23(3):661-672. doi:10.1016/j.drudis.2018.01.018
2. Parvathaneni V, Kulkarni NS, Muth A. Drug repurposing: a promising tool to accelerate the drug discovery process. Drug Discov Today. 2019;24(10):2076-2085. doi:10.1016/j.drudis.2019.06.014
3. Pushpakom S, Iorio F, Eyers PA, et al. Drug repurposing: progress, challenges and recommendations. Nat Rev Drug Discov. 2019;18(1):41-58. doi:10.1038/nrd.2018.168
4. Roy S, Dhaneshwar S, Bhasin B. Drug repurposing: an emerging tool for drug reuse, recycling and discovery. Curr Drug Res Rev. 2021;13(2):101-119. doi:10.2174/2589977513666210211163711
5. Wang SV, Maro JC, Gagne JJ, et al. A general propensity score for signal identification using tree-based scan statistics. Am J Epidemiol. 2021;190(7):1424-1433. doi:10.1093/aje/kwab034
6. Tan GSQ, Sloan EK, Lambert P, et al. Drug repurposing using real-world data. Drug Discov Today. 2023;28(1):103422. doi:10.1016/j.drudis.2022.103422
7. Liu F, Panagiotakos D. Real-world data: a brief review of the methods, applications, challenges and opportunities. BMC Med Res Methodol. 2022;22(1):287. doi:10.1186/s12874-022-01768-6
8. Brown JS, Maro JC, Nguyen M. Using and improving distributed data networks to generate actionable evidence: the case of real-world outcomes in the Food and Drug Administration's sentinel system. J Am Med Inform Assoc. 2020;27(5):793-797. doi:10.1093/jamia/ocaa028
9. Kulldorff M, Fang Z, Walsh SJ. A tree-based scan statistic for database disease surveillance. Biometrics. 2003;59(2):323-331. doi:10.1111/1541-0420.00039
10. Kulldorff M, Dashevsky I, Avery TR, et al. Drug safety data mining with a tree-based scan statistic. Pharmacoepidemiol Drug Saf. 2013;22(5):517-523. doi:10.1002/pds.3423
11. Yih WK, Kulldorff M, Dashevsky I. Using the self-controlled tree-temporal scan statistic to assess the safety of live attenuated herpes zoster vaccine. Am J Epidemiol. 2019;188(7):1383-1388. doi:10.1093/aje/kwz104
12. Wang SV, Maro JC, Baro E, et al. Data Mining for Adverse Drug Events with a propensity score-matched tree-based scan statistic. Epidemiology. 2018;29(6):895-903. doi:10.1097/EDE.0000000000000907
13. Yih WK, Daley MF, Duffy J, et al. A broad assessment of covid-19 vaccine safety using tree-based data-mining in the vaccine safety datalink. Vaccine. 2023;41(3):826-835. doi:10.1016/j.vaccine.2022.12.026
14. McGuire DK, Shih WJ, Cosentino F, et al. Association of SGLT2 inhibitors with cardiovascular and kidney outcomes in patients with type 2 diabetes: a meta-analysis. JAMA Cardiol. 2021;6(2):148-158. doi:10.1001/jamacardio.2020.4511
15. Nuffield Department of Population Health Renal Studies Group; SGLT2 inhibitor Meta-Analysis Cardio-Renal Trialists' Consortium. Impact of diabetes on the effects of sodium glucose co-transporter-2 inhibitors on kidney outcomes: collaborative meta-analysis of large placebo-controlled trials. Lancet. 2022;400(10365):1788-1801. doi:10.1016/S0140-6736(22)02074-8
16. Zelniker TA, Wiviott SD, Raz I, et al. SGLT2 inhibitors for primary and secondary prevention of cardiovascular and renal outcomes in type 2 diabetes: a systematic review and meta-analysis of cardiovascular outcome trials. Lancet. 2019;393(10166):31-39. doi:10.1016/S0140-6736(18)32590-X
17. Fadiran O, Nwabuo C. The evolution of sodium-glucose Co-Transporter-2 inhibitors in heart failure. Cureus. 2021;13(11):e19379. doi:10.7759/cureus.19379
18. Heerspink HJL, Stefansson BV, Correa-Rotter R, et al. Dapagliflozin in patients with chronic kidney disease. N Engl J Med. 2020;383(15):1436-1446. doi:10.1056/NEJMoa2024816
19. Giorgino F, Vora J, Fenici P. Renoprotection with SGLT2 inhibitors in type 2 diabetes over a spectrum of cardiovascular and renal risk. Cardiovasc Diabetol. 2020;19(1):196. doi:10.1186/s12933-020-01163-9
20. Kaneto H, Obata A, Kimura T, et al. Unexpected pleiotropic effects of SGLT2 inhibitors: pearls and pitfalls of this novel antidiabetic class. Int J Mol Sci. 2021;22(6):3062. doi:10.3390/ijms22063062
21. Butler AM, Nickel KB, Overman RA, et al. IBM MarketScan Research Databases. In: Sturkenboom M, Schink T, eds. Databases for Pharmacoepidemiological Research. Cham: Springer International Publishing; 2021:243-251.
22. Kulaylat AS, Schaefer EW, Messaris E. Truven health analytics MarketScan databases for clinical research in colon and rectal surgery. Clin Colon Rectal Surg. 2019;32(1):54-60. doi:10.1055/s-0038-1673354
23. ElSayed NA, Aleppo G, Aroda VR, et al. 9. Pharmacologic approaches to glycemic treatment: standards of Care in Diabetes-2023. Diabetes Care. 2023;46(Suppl 1):S140-S157. doi:10.2337/dc23-S009
24. Kosiborod M, Cavender MA, Fu AZ, et al. Lower risk of heart failure and death in patients initiated on sodium-glucose Cotransporter-2 inhibitors versus other glucose-lowering drugs: the CVD-REAL study (comparative effectiveness of cardiovascular outcomes in new users of sodium-glucose Cotransporter-2 inhibitors). Circulation. 2017;136(3):249-259. doi:10.1161/CIRCULATIONAHA.117.029190
25. Huang W, Whitelaw J, Kishore K, et al. The comparative epidemiology and outcomes of hospitalized patients treated with SGLT2 or DPP4 inhibitors. J Diabetes Complications. 2021;35(12):108052. doi:10.1016/j.jdiacomp.2021.108052
26. D'Andrea E, Wexler DJ, Kim SC, et al. Comparing effectiveness and safety of SGLT2 inhibitors vs DPP-4 inhibitors in patients with type 2 diabetes and varying baseline HbA1c levels. JAMA Intern Med. 2023;183(3):242-254. doi:10.1001/jamainternmed.2022.6664
27. Tan GSQ, Morton JI, Wood S, et al. SGLT-2 inhibitor use and cause-specific hospitalization rates: an outcome-wide study to identify novel associations of SGLT-2 inhibitors. Clin Pharmacol Ther. 2024;115(6):1304-1315. doi:10.1002/cpt.3194
28. McMurray JJV, Solomon SD, Inzucchi SE, et al. Dapagliflozin in patients with heart failure and reduced ejection fraction. N Engl J Med. 2019;381(21):1995-2008. doi:10.1056/NEJMoa1911303
29. Sakshaug JW, Weir DR, Nicholas LH. Identifying diabetics in Medicare claims and survey data: implications for health services research. BMC Health Serv Res. 2014;14(1):150. doi:10.1186/1472-6963-14-150
30. Schneeweiss S, Rassen JA, Brown JS, et al. Graphical depiction of longitudinal study designs in health care databases. Ann Intern Med. 2019;170(6):398-406. doi:10.7326/M18-3079
31. Gagne JJ, Glynn RJ, Avorn J, et al. A combined comorbidity score predicted mortality in elderly patients better than existing scores. J Clin Epidemiol. 2011;64(7):749-759. doi:10.1016/j.jclinepi.2010.10.004
32. Sun JW, Rogers JR, Her Q, et al. Adaptation and validation of the combined comorbidity score for ICD-10-CM. Med Care. 2017;55(12):1046-1051. doi:10.1097/MLR.0000000000000824
33. Chang HY, Weiner JP, Richards TM, et al. Validating the adapted diabetes complications severity index in claims data. Am J Manag Care. 2012;18(11):721-726.
34. Ranganathan P, Pramesh CS, Buyse M. Common pitfalls in statistical analysis: the perils of multiple testing. Perspect Clin Res. 2016;7(2):106-107. doi:10.4103/2229-3485.179436
35. Amrhein V, Greenland S, McShane B. Scientists rise up against statistical significance. Nature. 2019;567(7748):305-307. doi:10.1038/d41586-019-00857-9
36. Pottegard A, Friis S, Sturmer T, et al. Considerations for Pharmacoepidemiological studies of drug-cancer associations. Basic Clin Pharmacol Toxicol. 2018;122(5):451-459. doi:10.1111/bcpt.12946
37. Watson RD, Gibbs CR, Lip GY. ABC of heart failure. Clinical features and complications. BMJ. 2000;320(7229):236-239. doi:10.1136/bmj.320.7229.236
38. Webster AC, Nagler EV, Morton RL. Chronic kidney disease. The Lancet. 2017;389(10075):1238-1252. doi:10.1016/S0140-6736(16)32064-5
39. Silverberg DS, Wexler D, Blum M, et al. The use of subcutaneous erythropoietin and intravenous iron for the treatment of the anemia of severe, resistant congestive heart failure improves cardiac and renal function and functional cardiac class, and markedly reduces hospitalizations. J Am Coll Cardiol. 2000;35(7):1737-1744. doi:10.1016/S0735-1097(00)00613-6
40. Portoles J, Martin L, Broseta JJ, et al. Anemia in chronic kidney disease: from pathophysiology and current treatments, to future agents. Front Med (Lausanne). 2021;8:642296. doi:10.3389/fmed.2021.642296
41. Docherty KF, Curtain JP, Anand IS, et al. Effect of dapagliflozin on anaemia in DAPA-HF. Eur J Heart Fail. 2021;23(4):617-628. doi:10.1002/ejhf.2132
42. Murashima M, Tanaka T, Kasugai T, et al. Sodium-glucose cotransporter 2 inhibitors and anemia among diabetes patients in real clinical practice. J Diabetes Investig. 2022;13(4):638-646. doi:10.1111/jdi.13717
43. Maruyama T, Takashima H, Oguma H, et al. Canagliflozin improves erythropoiesis in diabetes patients with anemia of chronic kidney disease. Diabetes Technol Ther. 2019;21(12):713-720. doi:10.1089/dia.2019.0212
44. Ghanim H, Abuaysheh S, Hejna J, et al. Dapagliflozin suppresses hepcidin and increases erythropoiesis. J Clin Endocrinol Metab. 2020;105(4):e1056-e1063. doi:10.1210/clinem/dgaa057
45. Marathias KP, Lambadiari VA, Markakis KP, et al. Competing effects of renin angiotensin system blockade and sodium-glucose Cotransporter-2 inhibitors on erythropoietin secretion in diabetes. Am J Nephrol. 2020;51(5):349-356. doi:10.1159/000507272
46. Osonoi T, Shirabe S, Saito M, et al. Dapagliflozin improves erythropoiesis and iron metabolism in type 2 diabetic patients with renal anemia. Diabetes Metab Syndr Obes. 2023;16:1799-1808. doi:10.2147/DMSO.S411504
47. Wang SV, Kulldorff M, Poor S, et al. Screening medications for association with progression to wet age-related macular degeneration. Ophthalmology. 2021;128(2):248-255. doi:10.1016/j.ophtha.2020.08.004
48. Suarez EA, Nguyen M, Zhang D, et al. Novel methods for pregnancy drug safety surveillance in the FDA sentinel system. Pharmacoepidemiol Drug Saf. 2023;32(2):126-136. doi:10.1002/pds.5512
49. Zhou P, Tan Y, Hao Z, et al. Effects of SGLT2 inhibitors on hepatic fibrosis and steatosis: a systematic review and meta-analysis. Front Endocrinol (Lausanne). 2023;14:1144838. doi:10.3389/fendo.2023.1144838
50. Hsiang JC, Wong VW. SGLT2 inhibitors in liver patients. Clin Gastroenterol Hepatol. 2020;18(10):2168-2172.e2. doi:10.1016/j.cgh.2020.05.021
51. Androutsakos T, Nasiri-Ansari N, Bakasis AD, et al. SGLT-2 inhibitors in NAFLD: expanding their role beyond diabetes and Cardioprotection. Int J Mol Sci. 2022;23(6). doi:10.3390/ijms23063107
52. Miyamoto Y, Honda A, Yokose S, et al. The effects of SGLT2 inhibitors on liver cirrhosis patients with refractory ascites: a literature review. J Clin Med. 2023;12(6):2253. doi:10.3390/jcm12062253
53. Grimes DA, Schulz KF. Bias and causal associations in observational research. Lancet. 2002;359(9302):248-252. doi:10.1016/S0140-6736(02)07451-2
54. Davidson JA. SGLT2 inhibitors in patients with type 2 diabetes and renal disease: overview of current evidence. Postgrad Med. 2019;131(4):251-260. doi:10.1080/00325481.2019.1601404
55. Xie F, Beukelman T, Sun D, et al. Identifying inpatient mortality in MarketScan claims data using machine learning. Pharmacoepidemiol Drug Saf. 2023;32(11):1299-1305. doi:10.1002/pds.5658
56. Bundorf MK, Gupta S, Kim C. Trends in US health insurance coverage during the COVID-19 pandemic. JAMA Health Forum. 2021;2(9):e212487. doi:10.1001/jamahealthforum.2021.2487

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Web_Material_kwae355

web_material_kwae355.zip^{(185.4KB, zip)}
