Key Points
Question
Can a phenome-wide association study enable the use of genetics to inform drug development?
Findings
In this phenome-wide association study using electronic health record and genetic data from 332 799 US veterans, the association of a genetic variant of the interleukin 6 receptor (IL6R) with potential effects of IL6R blocker therapy was assessed. The variant was associated with a reduced risk of aortic aneurysm, which corresponds with a recently approved indication for IL6R blocker therapy, and with off-target effects observed in clinical trials.
Meaning
The phenome-wide association study approach using large biobanks and genetics is a promising tool to assess potential beneficial and adverse effects of therapeutic agents with known pathways and related genes.
This phenome-wide association study assesses clinical associations between interleukin 6 receptor (IL6R) single-nucleotide polymorphisms and known IL6R drug effects and whether large biobanks and genetics can be used to assess potential beneficial and adverse effects of therapeutic agents with known pathways and related genes.
Abstract
Importance
Electronic health record (EHR) biobanks containing clinical and genomic data on large numbers of individuals have great potential to inform drug discovery. Individuals with interleukin 6 receptor (IL6R) single-nucleotide polymorphisms (SNPs) who are not receiving IL6R blocking therapy have biomarker profiles similar to those treated with IL6R blockers. This gene–drug pair provides an example to test whether associations of IL6R SNPs with a broad range of phenotypes can inform which diseases may benefit from treatment with IL6R blockade.
Objective
To determine whether screening for clinical associations with the IL6R SNP in a phenome-wide association study (PheWAS) using EHR biobank data can identify drug effects from IL6R clinical trials.
Design, Setting, and Participants
Diagnosis codes and routine laboratory measurements were extracted from the VA Million Veteran Program (MVP); diagnosis codes were mapped to phenotype groups using published PheWAS methods. A PheWAS was performed by fitting logistic regression models for testing associations of the IL6R SNP with 1342 phenotype groups and by fitting linear regression models for testing associations of the IL6R SNP with 26 routine laboratory measurements. Significance was reported using a false discovery rate of 0.05 or less. Findings were replicated in 2 independent cohorts using UK Biobank and Vanderbilt University Biobank data. The Million Veteran Program included 332 799 US veterans; the UK Biobank, 408 455 individuals from the general population of the United Kingdom; and the Vanderbilt University Biobank, 13 835 patients from a tertiary care center.
Exposures
Main Outcomes and Measures
Phenotypes defined by International Classification of Diseases, Ninth Revision codes.
Results
Of the 332 799 veterans included in the main cohort, 305 228 (91.7%) were men, and the mean (SD) age was 66.1 (13.6) years. The IL6R SNP was most strongly associated with a reduced risk of aortic aneurysm phenotypes (odds ratio, 0.87-0.90; 95% CI, 0.84-0.93) in the MVP. We observed known off-target effects of IL6R blockade from clinical trials (eg, higher hemoglobin level). The reduced risk for aortic aneurysms among those with the IL6R SNP in the MVP was replicated in the Vanderbilt University Biobank, and the reduced risk for coronary heart disease was replicated in the UK Biobank.
Conclusions and Relevance
In this proof-of-concept study, we demonstrated application of the PheWAS using large EHR biobanks to inform drug effects. The findings of an association of the IL6R SNP with reduced risk for aortic aneurysms correspond with the newest indication for IL6R blockade, giant cell arteritis, of which a major complication is aortic aneurysm.
Introduction
Naturally occurring variants in the human genome can serve as experiments of nature to study potential drug targets.1,2,3,4 Individuals with genetic variants detected as single-nucleotide polymorphisms (SNPs) can have profiles similar to individuals receiving a treatment. An example of a gene–drug pair is the interleukin 6 receptor (IL6R) genetic variant Asp358Ala (rs2228145) and the IL6R antagonists tocilizumab and sarilumab.5 Both are indicated for the treatment of rheumatoid arthritis (RA), and tocilizumab is indicated for giant cell arteritis (GCA). Individuals with the IL6R variant not taking IL6R blockade have biochemical parameters similar to individuals taking the drug. For example, patients initiating IL6R blockade experience a significant reduction in C-reactive protein (CRP) levels. Among individuals with the Asp358Ala IL6R genetic variant, carriers had an 8.3% lower CRP level compared with those without the variant.5 Interleukin 6 is a proinflammatory cytokine that triggers inflammation by binding IL6R on the cell membrane.6 Functional studies of the Asp358Ala genetic variant showed that carriers have reduced expression of membrane-bound IL6R, leading to an impaired response to IL6.7 Similarly, tocilizumab and sarilumab impair response to IL6 by blocking its ability to bind to IL6R.
This gene–drug relationship suggests that a large-scale screen of phenotypes, or a phenome-wide association study (PheWAS), of the IL6R genetic variant may uncover potential therapeutic targets for IL6R antagonists (Figure). The phenome-wide association study is a bioinformatics approach that enables investigators to screen for associations of a genetic variant of interest with a broad range of phenotypes available in the electronic health record (EHR).8,9,10 A phenome-wide association study can also identify potential detrimental effects of the drug to inform screening for potential adverse effects. This concept and approach have been discussed in the literature.11 However, it is only recently that large biobanks with linked EHR data, such as the Veterans Affairs Million Veteran Program (MVP)12 and the UK Biobank,13 have become available to fully test this hypothesis.
In this proof-of-concept study, we performed a PheWAS on the Asp358Ala IL6R genetic variant to determine whether known effects for IL6R blockade as well as known off-target effects from clinical trials can be detected using data from a large US-based biobank study. Additionally, results were replicated in 2 independent biobank cohorts using freely available online data.
Methods
Study Populations
The Million Veteran Program served as the main cohort for this PheWAS. The Million Veteran Program is a longitudinal cohort study with clinical EHR data containing inpatient and outpatient data linked with genomic data. The Million Veteran Program recruits from approximately 50 Veterans Affairs facilities across the United States. Inclusion criteria in the MVP include age of 18 years or older, having a valid mailing address, and having the ability to provide informed consent. At recruitment, individuals completed baseline and lifestyle questionnaires, including self-reported race/ethnicity, and provided blood samples for genotyping and biomarker studies.12 All individuals in the study provided written informed consent as part of the MVP. This study was approved through the Veterans Affairs central institutional review board as part of the MVP.
The UK Biobank is a prospective study of the effects of lifestyle, environmental, and genomic factors on disease outcomes. The study recruited approximately 500 000 volunteers from the general population of the United Kingdom aged 40 to 69 years from 2006 to 2010.13 The phenotypes available in the UK Biobank are derived from diverse sources, including inpatient International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-10) codes. Currently online, only grouped inpatient ICD-10 codes are available. Data for individual ICD-10 and any outpatient ICD-10 codes were not available. Estimates of the genome–phenome associations along with their P values from the UK Biobank were obtained using Gene ATLAS.14
The Vanderbilt University Biobank (BioVU) is a DNA biobank at Vanderbilt University Medical Center linked to a copy of their EHR containing inpatient and outpatient data with a goal to explore the connection between genetics and health outcomes.15 All estimates of the genome–phenome associations along with their P values (<.05) based on 13 835 BioVU participants are freely available online.8 Deidentified data from the UK Biobank and BioVU are available freely online, and informed consent was waived.
Statistical Methods
The phenome-wide association study analysis included both ICD-9–based phenotypes and a list of routine laboratory measurements that were available in 75% or more of patients in the MVP. The ICD-9–based phenotypes were defined by mapping ICD-9 codes to PheWAS codes, as published by Denny et al.8,16 Using the standard approach, a participant was defined as having a phenotype if they had 2 or more PheWAS codes. We excluded PheWAS codes with a prevalence of 0.1% or less from the analysis. The laboratory measurements, defined by the average of all available measurements for each patient, consisted of complete blood count, including white blood cell count, hemoglobin level, and platelet count; creatinine level; estimated glomerular filtration rate; liver function tests; and lipid levels, including total cholesterol level, high-density lipoprotein cholesterol level, low-density lipoprotein cholesterol level, and triglyceride level.
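The case definition and prevalence filter described above can be sketched as follows. This is a minimal illustration rather than the study's code (the analyses were implemented in R); the input layout, the function names, and the placeholder codes are assumptions made for the example, and counting every recorded occurrence of a code is one possible reading of "2 or more PheWAS codes."

```python
from collections import Counter


def define_cases(code_events, min_count=2):
    """Return {phewas_code: set of participant ids with >= min_count codes}.

    code_events is a hypothetical input: one (participant_id, phewas_code)
    pair per recorded occurrence of a mapped diagnosis code.
    """
    counts = Counter(code_events)  # (participant, code) -> occurrences
    cases = {}
    for (participant, code), n in counts.items():
        if n >= min_count:
            cases.setdefault(code, set()).add(participant)
    return cases


def drop_rare_phenotypes(cases, n_participants, min_prevalence=0.001):
    """Keep phenotypes whose prevalence is strictly above min_prevalence."""
    return {
        code: ids
        for code, ids in cases.items()
        if len(ids) / n_participants > min_prevalence
    }


# Toy data with placeholder codes (not real PheWAS codes).
events = [
    ("p1", "CODE_A"), ("p1", "CODE_A"),                    # 2 codes: case
    ("p2", "CODE_A"),                                      # 1 code: not a case
    ("p3", "CODE_B"), ("p3", "CODE_B"), ("p3", "CODE_B"),  # 3 codes: case
]
primary = define_cases(events, min_count=2)      # main analysis definition
sensitivity = define_cases(events, min_count=1)  # looser definition
```

Setting `min_count=1` gives the looser definition used in the sensitivity analysis, at the cost of admitting participants with a single, possibly rule-out, code.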
The primary study screened for associations of rs2228145 (risk allele A; Asp358Ala) with each individual phenotype defined by PheWAS codes by fitting logistic regression models. Linear regression models were fitted to test for associations of rs2228145 with the log-transformed laboratory measurements. We applied standard quality control pipelines, such as checking for discordance between sex inferred by genotyping and self-reported sex. We also excluded related individuals (halfway between second-degree and third-degree relatives or closer) as measured by the Kinship-Based Inference for GWAS software.17 All models were adjusted for age at the last visit, sex, the 20 leading principal components to adjust for population stratification,18,19 follow-up time in months, and total number of ICD-9 codes. The follow-up time and total number of ICD-9 codes were included to adjust for the density of EHR data for each patient; both variables were log-transformed.
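To illustrate how a per-allele odds ratio arises from such a model, the sketch below fits a logistic regression of a binary phenotype on allele dosage (0, 1, or 2 copies) by Newton-Raphson. It is a simplified stand-in under stated assumptions: the covariates listed above (age, sex, principal components, follow-up time, and code count) are omitted, the data are synthetic, and the function names are hypothetical. In practice a standard statistical package would be used.

```python
import math


def fit_logistic(x, y, n_iter=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson.

    Returns (b0, b1, se_b1); se_b1 comes from the inverse information
    matrix evaluated at the (converged) estimates.
    """
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p           # score for the intercept
            g1 += (yi - p) * xi    # score for the slope
            h00 += w               # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + I^{-1} * score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    se_b1 = math.sqrt(h00 / det)
    return b0, b1, se_b1


def odds_ratio_with_ci(b1, se_b1, z=1.96):
    """Per-allele odds ratio with a Wald 95% confidence interval."""
    return math.exp(b1), math.exp(b1 - z * se_b1), math.exp(b1 + z * se_b1)


# Synthetic counts per dosage group: (dosage, cases, controls).
groups = [(0, 30, 170), (1, 20, 180), (2, 5, 95)]
x, y = [], []
for dosage, n_cases, n_controls in groups:
    x += [dosage] * (n_cases + n_controls)
    y += [1] * n_cases + [0] * n_controls

b0, b1, se = fit_logistic(x, y)
or_, lo, hi = odds_ratio_with_ci(b1, se)
print(f"per-allele OR = {or_:.2f} (95% CI, {lo:.2f}-{hi:.2f})")
```

Coding the genotype as a dosage makes the odds ratio a per-allele effect, so an odds ratio below 1 means each additional copy of the variant is associated with lower odds of the phenotype.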
To adjust for multiple testing, we defined statistical significance as a P value less than a threshold controlling for a false discovery rate (FDR) of 5% using the Benjamini-Hochberg procedure.20 Controlling the FDR with the Benjamini-Hochberg procedure means that, among the associations considered significant, the expected proportion of false-positive associations is at most 5%. For reference, we also reported the threshold for Bonferroni correction at a familywise error rate of 5%. As a sensitivity analysis, we additionally performed the PheWAS using 1 or more PheWAS codes to define a phenotype.
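The two multiple-testing thresholds can be sketched as follows. This is a minimal, hedged illustration of the Benjamini-Hochberg step-up procedure and the Bonferroni cutoff, with hypothetical function names and toy P values; it is not the study's implementation.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns (critical_value, rejected), where rejected[i] is True when
    p_values[i] is declared significant at the requested FDR.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0  # largest rank k with p_(k) <= k * fdr / m
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * fdr / m:
            k_max = rank
    critical_value = k_max * fdr / m
    rejected = [k_max > 0 and p <= critical_value for p in p_values]
    return critical_value, rejected


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test threshold controlling the familywise error rate at alpha."""
    return alpha / n_tests


# Toy P values: the smallest fails its own rank-1 cutoff (0.0125) but is
# still rejected because the rank-3 P value passes its cutoff (0.0375).
p_values = [0.03, 0.035, 0.036, 0.9]
critical_value, rejected = benjamini_hochberg(p_values)
```

The toy example shows why the procedure is called step-up: significance is decided by the largest rank that passes its cutoff, and every smaller P value is then rejected with it. The Bonferroni threshold, by contrast, is a single fixed cutoff that shrinks with the number of tests.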
Based on prior knowledge regarding the hypothesized effects of IL6R treatment on inflammatory and cardiovascular phenotypes, we additionally extracted laboratory data on CRP levels, cardiac troponin levels, creatine kinase–MB (CK-MB) level, and brain natriuretic peptide level. We then tested the associations of rs2228145 with these additional laboratory phenotypes using linear regression models.
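The laboratory analyses can be illustrated with an ordinary least-squares fit of a log-transformed measurement on allele dosage. The sketch below is an assumption-laden simplification: it uses the natural logarithm (the base is not stated in the text), omits the adjustment covariates, and runs on made-up values; `ols_fit` is a hypothetical helper name.

```python
import math


def ols_fit(x, y):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up per-participant values: allele dosage and a mean laboratory level.
dosage = [0, 0, 1, 1, 2, 2]
lab_level = [3.1, 2.9, 2.8, 2.6, 2.5, 2.4]

log_level = [math.log(v) for v in lab_level]  # log-transform before fitting
intercept, beta = ols_fit(dosage, log_level)
```

On the log scale, `beta` is the change in the log of the measurement per additional allele, so a negative value corresponds to a lower level among carriers.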
Validation of Significant Outcomes With Medical Record Review
For phenotypes considered significant after FDR control, we validated the accuracy of each phenotype through medical record review; 20 participants were randomly selected among those who had 1 or more PheWAS codes and reviewed for evidence of the phenotype in the narrative notes. All reviewers were clinically trained health care professionals (T.A.C., J.H., S.D., and K.P.L.). We reported the positive predictive value (PPV) of participants with 1 or more PheWAS codes and 2 or more PheWAS codes. The PPV in general was calculated as the number of confirmed phenotypes based on medical record review divided by participants with either 1 or more or 2 or more PheWAS codes.
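The PPV calculation can be written out directly. The sketch below assumes a hypothetical record layout in which each reviewed participant is stored as a pair of code count and review outcome; the values are invented for illustration.

```python
def positive_predictive_value(reviews, min_codes):
    """PPV among reviewed participants with at least min_codes PheWAS codes.

    reviews: list of (n_codes, confirmed) pairs, where confirmed is True
    when medical record review found evidence of the phenotype.
    """
    outcomes = [confirmed for n_codes, confirmed in reviews if n_codes >= min_codes]
    if not outcomes:
        raise ValueError("no reviewed participants meet the code-count cutoff")
    return sum(outcomes) / len(outcomes)


# Invented review results for one phenotype.
reviews = [(1, False), (1, True), (2, True), (3, True), (2, False)]
ppv_1_or_more = positive_predictive_value(reviews, min_codes=1)
ppv_2_or_more = positive_predictive_value(reviews, min_codes=2)
```

Because participants with 2 or more codes are a subset of those with 1 or more, both PPVs can be computed from the same reviewed sample.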
Replication Using UK Biobank and BioVU Online Data
The phenotypes with significant associations with the IL6R genetic variant in MVP were further examined in the BioVU8 and UK Biobank. Since the exact SNP used for the MVP PheWAS was not available in these data sets, we compiled a list of SNPs in high linkage disequilibrium with our SNP of interest, rs2228145. Fourteen SNPs in high linkage disequilibrium (ie, R² > 0.9) with rs2228145 were identified from the African and European populations obtained from the LDlink website using the LDproxy function.21 From this list, rs4129267 (R² = 0.99) was available in both the BioVU and UK Biobank and was used to replicate findings from the MVP.
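For readers unfamiliar with the linkage disequilibrium statistic, the sketch below computes the squared correlation between two biallelic SNPs from phased haplotypes and then filters candidate proxies. It is only an illustration of the standard definition; it does not reproduce LDproxy, and the haplotypes, SNP labels, and function names are hypothetical.

```python
def ld_r_squared(haplotypes):
    """Squared correlation between two biallelic SNPs.

    haplotypes: list of (a, b) pairs, each 0 or 1, giving the alleles
    carried at SNP A and SNP B on the same chromosome.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("a SNP is monomorphic in this sample")
    return d * d / denom


def pick_proxies(r2_by_snp, available_sets, min_r2=0.9):
    """SNPs above the LD cutoff that are present in every data set."""
    return sorted(
        snp
        for snp, r2 in r2_by_snp.items()
        if r2 > min_r2 and all(snp in s for s in available_sets)
    )


# Hypothetical proxy table and per-cohort SNP availability.
r2_by_snp = {"snp_x": 0.99, "snp_y": 0.95, "snp_z": 0.50}
available = [{"snp_x", "snp_z"}, {"snp_x", "snp_y"}]
proxies = pick_proxies(r2_by_snp, available)
```

A value near 1 means the two SNPs are almost always inherited together, which is why a high-LD proxy can stand in for the variant of interest in a replication cohort.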
In the online UK Biobank data, phenotypes were grouped by inpatient ICD-10 codes, and analyses could not be performed with individual ICD-10 data. To replicate significant findings from the MVP, ICD-10 groups in UK Biobank were mapped back to ICD-9 codes and then to PheWAS codes. In some cases, the UK Biobank ICD-10 groups could not be mapped to a PheWAS code because the group was too broad. Using the closest match based on phenotype description, we then extracted results on the association of rs4129267 with the available phenotypes of interest using the Gene ATLAS website.14
Since phenotypes from BioVU were defined by PheWAS code, a direct look-up online of the associations of rs4129267 with the phenotypes of interest was performed.8 For the replication studies, a P value less than .05 was considered significant. All analyses were implemented in R, version 3.2.2 (the R Foundation).
Results
The IL6R PheWAS in the MVP studied 332 799 participants, of whom 305 228 (91.7%) were men with a mean (SD) age of 66.1 (13.6) years and a mean (SD) follow-up time of 11.9 (4.9) years. General characteristics of the population, including the most common conditions based on PheWAS codes and representation by region of the United States, are shown in Table 1.
Table 1. Demographic and Clinical Characteristics of Million Veteran Program Participants With Genetic Data (n = 332 799)ᵃ
Characteristic | No. (%) |
---|---|
Age, mean (SD), y | 66.1 (13.6) |
Male | 305 228 (91.7) |
Duration of EHR follow-up, mean (SD), y | 11.94 (4.92) |
Most common diagnosis by phenotype | |
Osteoarthritis and joint pain | 276 085 (83.0) |
Hypertension and complications | 244 866 (73.6) |
Dyslipidemia | 244 385 (73.4) |
Visual acuity | 230 318 (69.2) |
Mood disorder | 183 750 (55.2) |
Geographic region | |
Northeast | 41 190 (12.4) |
South | 157 214 (47.2) |
Midwest | 49 304 (14.8) |
West | 85 090 (25.6) |
Abbreviation: EHR, electronic health record.
ᵃ Groupings based on phenome-wide association study codes.
Twenty-two significant phenotypes were associated with the IL6R genetic variant, of which 13 (59%) were associated with vascular and cardiac disease; the threshold for significance was P < 6.6 × 10⁻⁴. The phenotypes with the strongest association with IL6R were aortic aneurysm (odds ratio [OR], 0.90; 95% CI, 0.87-0.93) as well as a specific type of aortic aneurysm, abdominal aortic aneurysm (AAA) (OR, 0.87; 95% CI, 0.84-0.90), and coronary atherosclerosis and ischemic heart disease (OR, 0.95; 95% CI, 0.94-0.97) (Figure; Table 2) (eTable 1 in the Supplement). Based on medical record review, the PPV of the PheWAS codes ranged from 55% to 100% (eTable 2 in the Supplement).
Table 2. Significant Associations of IL6R With Phenome-Wide Association Study (PheWAS) Codes in the Million Veteran Program (MVP) and Replication Results From UK Biobank and Vanderbilt University Biobank (BioVU)ᵃ
Clinical Phenotype | PheWAS Code Description | MVP Prevalence, % | MVP OR (95% CI) | MVP P Value | Replication in UK Biobankᵇ | Replication in BioVUᶜ |
---|---|---|---|---|---|---|
Aortic aneurysm | Abdominal aortic aneurysm | 2.4 | 0.87 (0.84-0.90) | 3.73 × 10⁻¹⁵ | Yesᵈ | Yes |
Aortic aneurysm | Aortic aneurysm | 3.0 | 0.90 (0.87-0.93) | 1.02 × 10⁻¹¹ | Yesᵈ | NA |
Aortic aneurysm | Other aneurysm | 3.4 | 0.92 (0.89-0.94) | 2.57 × 10⁻⁹ | NA | NA |
Aortic aneurysm | Aneurysm of iliac artery | 0.3 | 0.83 (0.75-0.92) | 2.73 × 10⁻⁴ | NA | NA |
Ischemic heart disease | Coronary atherosclerosis | 20.5 | 0.95 (0.94-0.97) | 3.43 × 10⁻¹² | Yes | NA |
Ischemic heart disease | Ischemic heart disease | 25.4 | 0.95 (0.94-0.97) | 3.97 × 10⁻¹² | Yes | NA |
Ischemic heart disease | Other chronic ischemic heart disease | 14.9 | 0.95 (0.93-0.97) | 6.04 × 10⁻¹² | Yes | NA |
Ischemic heart disease | Myocardial infarction | 5.5 | 0.94 (0.92-0.97) | 1.69 × 10⁻⁶ | Yes | NA |
Vascular disease | Atherosclerosis of native arteries of the extremities with intermittent claudication | 1.6 | 0.90 (0.87-0.94) | 3.92 × 10⁻⁶ | NA | NA |
Vascular disease | Peripheral vascular disease, unspecified | 6.5 | 0.95 (0.93-0.97) | 5.02 × 10⁻⁶ | NA | NA |
Vascular disease | Atherosclerosis of the extremities | 2.3 | 0.92 (0.89-0.96) | 9.73 × 10⁻⁶ | NA | NA |
Vascular disease | Atherosclerosis | 3.1 | 0.93 (0.91-0.96) | 1.80 × 10⁻⁵ | NA | NA |
Vascular disease | Peripheral vascular disease | 6.9 | 0.96 (0.94-0.98) | 3.87 × 10⁻⁵ | NA | NA |
Skin conditions | Degenerative skin conditions | 19.4 | 1.03 (1.01-1.04) | 1.72 × 10⁻⁴ | NA | NA |
Skin conditions | Seborrheic dermatitis | 4.5 | 1.05 (1.02-1.07) | 3.30 × 10⁻⁴ | NA | NA |
Skin conditions | Atopic/contact dermatitis | 11.9 | 1.03 (1.01-1.05) | 3.72 × 10⁻⁴ | Yes | Yes |
Skin conditions | Erythematosquamous dermatosis | 4.6 | 1.04 (1.02-1.07) | 5.10 × 10⁻⁴ | Yes | NA |
Musculoskeletal | Acquired deformities of finger | 0.3 | 1.19 (1.09-1.31) | 2.57 × 10⁻⁴ | NA | NA |
Musculoskeletal | Gouty arthropathy | 1.5 | 1.08 (1.03-1.13) | 5.23 × 10⁻⁴ | Yes | NA |
Pulmonary | Pleurisy/pleural effusion | 1.8 | 1.07 (1.03-1.12) | 3.47 × 10⁻⁴ | Yes | NA |
Renal | Disorders resulting from impaired renal function | 1.1 | 1.10 (1.04-1.16) | 4.92 × 10⁻⁴ | NA | NA |
Eye | Conjunctivitis, non-infectious | 1.4 | 1.08 (1.03-1.13) | 6.60 × 10⁻⁴ | Yes | NA |
Abbreviations: NA, not applicable; OR, odds ratio.
ᵃ Details on mapping and association study in UK Biobank and BioVU can be found in eTable 3 in the Supplement.
ᵇ For UK Biobank, NA indicates that direct mapping of International Statistical Classification of Diseases and Related Health Problems, Tenth Revision groups provided in UK Biobank to PheWAS codes could not be performed, and thus, the association was not tested.
ᶜ For BioVU, NA indicates that the association was nonsignificant with a P value of .05 or greater.
ᵈ Aortic aneurysms phenotype in UK Biobank grouped with aortic dissection; P = .05.
Association of IL6R Variant With Laboratory Findings
Each allele of the IL6R variant was associated with higher levels of hemoglobin and albumin (Figure). These 2 laboratory measurements are known to increase with IL6R antagonist therapy.5,22,23,24,25,26 The known reduction in CRP level was observed in the PheWAS. Consistent with a reduced risk of ischemic heart disease, lower levels of troponin I and CK-MB were also observed (Table 3).
Table 3. Association of the IL6R Variant With Biomarkers Associated With Inflammation and Cardiovascular Disease Risk.
Laboratory Measurement | β (95% CI) | P Value |
---|---|---|
C-reactive protein | −0.06 (−0.08 to −0.04) | 2.76 × 10⁻¹¹ |
Troponin I | −0.04 (−0.06 to −0.02) | 7.12 × 10⁻⁵ |
CK-MB | −0.01 (−0.02 to −0.003) | .01 |
Pro–B-type natriuretic peptide | −0.02 (−0.05 to 0.01) | .18 |
Abbreviation: CK-MB, creatine kinase–MB.
The sensitivity analysis showed similar associations of the IL6R variant with reduced risk for aortic aneurysms and cardiovascular disease phenotypes (eFigure in the Supplement). Additionally, in a secondary analysis, we examined the association of the IL6R variant with procedure codes for aortic rupture repair (n = 1667), as these patients may be considered to have a more severe form of AAA. We observed a 16% reduction in risk for AAA rupture repair among individuals with the IL6R variant (OR, 0.84; 95% CI, 0.78-0.90).
Replicating MVP PheWAS Findings in the UK Biobank and BioVU
From UK Biobank, we replicated an association of the IL6R variant with coronary heart disease phenotypes, including chronic ischemic heart disease (OR, 0.99; 95% CI, 0.9968-0.9991; P = .002) among the 408 455 individuals with genomic and inpatient ICD-10 billing group data (Table 2) (eTable 3 in the Supplement). From BioVU, we performed the replication on 13 835 individuals and confirmed associations of the IL6R variant with abdominal aortic aneurysm (OR, 0.83; 95% CI, 0.7082-0.9726; P = .02) and atopic or contact dermatitis (OR, 1.08; 95% CI, 1.0020-1.1554; P = .04).
Discussion
This study demonstrated an application of large EHR biobanks as a research platform using genetics to inform drug development pipelines using a PheWAS approach. From previous studies, the IL6R genetic variant was known to have similar biochemical effects as IL6R antagonist therapy5; a higher number of IL6R alleles (0, 1, or 2) is associated with effects of a higher dose of IL6R blockers. This phenome-wide association study identified several potential associations. The discussion will focus on phenotypes associated with the IL6R variant in the MVP with replication in the UK Biobank or BioVU.
The strongest association observed in this PheWAS of IL6R was a reduced risk for aortic aneurysm phenotypes, including a 13% reduced risk for AAA and other aneurysms of the aorta. These findings are in line with a 2017 study27 demonstrating the effectiveness of IL6R blockade therapy for GCA, for which aortic aneurysm is a clinical presentation.28,29 Despite potential limitations stemming from the noisiness of EHR data, the PheWAS also corroborated prior studies showing an association of the IL6R variant with reduced risk of AAA.30 Additionally, results from mendelian randomization and network analyses suggest that IL6R is part of the causal pathway for AAA.30,31 In this study, we observed a 16% reduced risk of aortic rupture repair among individuals with the IL6R variant, suggesting that blocking the IL6R pathway may reduce severity of the condition.
The IL6R PheWAS detected known off-target effects of IL6R blockade observed from clinical trials of tocilizumab and sarilumab,5,22,23,24,25,26,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48 including associations with higher hemoglobin levels and atopic dermatitis. Data from the IL6R clinical trials25,37 showed improvements in hemoglobin levels attributed to control of inflammation and resolution of anemia of long-term disease. Mechanistic studies in humans found that the anemia is a result of increased plasma volume.49,50 The association of the IL6R variant with lower CRP levels was also anticipated because they are on the same pathway, and lower CRP levels were observed among individuals treated with IL6R blockade in the clinical trials.5,33,34,39,40,42,45,51,52,53 Atopic dermatitis, broadly characterized as a rash, is also a common adverse effect of IL6R blockers.36,54 Conjunctivitis and pleuritis, though uncommon, have also been reported in clinical trials.22,39,41,44 The association with gout was interesting; however, there is a paucity of data regarding a link with IL6R in the literature.
The IL6R PheWAS confirmed findings from a mendelian randomization study5 and a meta-analysis55 of a reduced risk for a range of coronary heart disease phenotypes among carriers of the IL6R variant in both the MVP and the UK Biobank. Further, studies of cardiac biomarkers in the MVP showed associations of the IL6R variant with lower CK-MB and troponin levels in a dose-dependent manner. These findings agree with results from a randomized clinical trial56 of tocilizumab compared with placebo administered to patients with RA within 2 days after a non–ST elevation myocardial infarction. Patients treated with tocilizumab after a non–ST elevation myocardial infarction had lower troponin elevations than those who received placebo. A randomized clinical trial, Assessing the Effect of Anti-IL-6 Treatment in Myocardial Infarction (NCT03004703), is underway and will administer tocilizumab to patients without RA shortly after a myocardial infarction, which may serve to further validate these PheWAS findings.
A notable absent finding was an association of the IL6R variant with RA and GCA. The prevalence of the PheWAS codes for RA was 3.2% in the MVP; the code had a PPV of 66.7% for RA after medical record review. The prevalence of PheWAS codes including GCA was 0.45%, with a PPV of 77.8%. The low prevalence and relatively low accuracy of the codes likely limited the power to detect an association with the single IL6R variant, demonstrating a limitation of the PheWAS. In contrast, the aortic aneurysm phenotypes had a prevalence of 3% but a PPV of 80% to 90%. Additionally, aortic aneurysms related to GCA are included in the aneurysm codes, as there are no specific codes for aneurysms caused by GCA. In the MVP cohort, less than 0.002% received IL6R blockade therapy, and thus the drug itself was unlikely to affect findings.
Limitations
This study had limitations. Since the PheWAS used a single SNP to screen for associations, small effect sizes for associations were anticipated. As in the genome-wide association study,57 the expected effect size for an association of a common genetic variant with a phenotype is typically modest. In a meta-analysis for AAA30 where a precise phenotype, infrarenal diameter greater than 30 mm, was used, the OR of the IL6R variant with AAA was 0.85 (95% CI, 0.79-0.90). In the MVP IL6R PheWAS, a similar modest effect size was observed (OR, 0.87; 95% CI, 0.84-0.90).
Other limitations include the accuracy of ICD-9 codes for defining phenotypes. Studies are underway to improve the accuracy of phenotypes, which in turn would improve the power of PheWAS.58 Because the PheWAS is a hypothesis-generating tool, we applied an FDR of 5% to determine significant associations. Whether the threshold should be more or less stringent or if an alternate approach is needed is also under investigation. Additionally, the phenotypes used in the standard PheWAS mapping8,16 are not independent and can be highly correlated (eg, myocardial infarction and coronary atherosclerosis). Both the Bonferroni and the Benjamini-Hochberg FDR control procedures applied in this study are considered robust in accounting for the presence of positive correlations between outcomes of interest.59 As with all hypothesis-generating approaches, the results require validation in independent cohorts.
The lack of replication for more phenotypes from the MVP in UK Biobank and BioVU may also reflect a lack of power stemming from differences in sample size and depth of data. The online data currently available for UK Biobank are inpatient ICD-10 code groups only. Therefore, the association analysis performed in UK Biobank was between the IL6R variant and an inpatient ICD-10 code group for aortic aneurysm and dissection, a broader phenotype. The prevalence of this broad ICD-10 group in UK Biobank was 0.34%, substantially lower than in the MVP or BioVU. Although BioVU contains both inpatient and outpatient data, it has a substantially smaller population than the MVP or UK Biobank. We believe the ability to replicate the association of a single IL6R variant with AAA further supports the strength of the association and this approach. Lastly, while the PheWAS may identify potential associations of IL6R with potential drug targets, the effect size of a genetic variant's association with a phenotype can differ from the actual treatment effect.4,60
Conclusions
In conclusion, our study highlighted the PheWAS as a promising approach leveraging clinical EHR data to query potential effects of new therapeutics in cases in which the drug mechanism of action has a clear link with a genetic variant. As a proof-of-concept study, we performed a PheWAS using a genetic variant with biochemical effects similar to those of known therapies, tocilizumab and sarilumab. The strongest association observed was a reduced risk of aortic aneurysms, which was in line with the newest indication for IL6R blockade in GCA, in which aortic aneurysm can be a presenting sign. The findings of associations with different subphenotypes of aneurysms, such as AAA and aneurysm of the iliac artery, suggest that the beneficial effects of IL6R antagonist therapy may extend beyond treatment for aneurysms associated with large vessel vasculitis. Additionally, the IL6R PheWAS identified expected drug effects, such as reduction in CRP level, and off-target effects, including atopic dermatitis. Our findings support previous studies of reduced risk for coronary heart disease with IL6R blockade, and ongoing clinical trials may validate these findings. This study also demonstrated the important role of biobanks with freely available data, such as UK Biobank and BioVU, as resources that can help to catalyze studies for the research community.
References
- 1.Plenge RM, Scolnick EM, Altshuler D. Validating therapeutic targets through human genetics. Nat Rev Drug Discov. 2013;12(8):581-594. doi: 10.1038/nrd4051 [DOI] [PubMed] [Google Scholar]
- 2.Robinson JR, Denny JC, Roden DM, Van Driest SL. Genome-wide and phenome-wide approaches to understand variable drug actions in electronic health records. Clin Transl Sci. 2018;11(2):112-122. doi: 10.1111/cts.12522 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Diogo D, Tian C, Franklin C, et al. Phenome-wide association studies (PheWAS) across large “real-world data” population cohorts support drug target validation [published online November 13, 2017]. bioRxiv. doi: 10.1101/218875 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Stitziel NO, Won HH, Morrison AC, et al. ; Myocardial Infarction Genetics Consortium Investigators . Inactivating mutations in NPC1L1 and protection from coronary heart disease. N Engl J Med. 2014;371(22):2072-2082. doi: 10.1056/NEJMoa1405386 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5. Swerdlow DI, Holmes MV, Kuchenbaecker KB, et al.; Interleukin-6 Receptor Mendelian Randomisation Analysis (IL6R MR) Consortium. The interleukin-6 receptor as a target for prevention of coronary heart disease: a mendelian randomisation analysis. Lancet. 2012;379(9822):1214-1224. doi: 10.1016/S0140-6736(12)60110-X
- 6. Heinrich PC, Behrmann I, Haan S, Hermanns HM, Müller-Newen G, Schaper F. Principles of interleukin (IL)-6-type cytokine signalling and its regulation. Biochem J. 2003;374(pt 1):1-20. doi: 10.1042/bj20030407
- 7. Ferreira RC, Freitag DF, Cutler AJ, et al. Functional IL6R 358Ala allele impairs classical IL-6 receptor signaling and influences risk of diverse inflammatory diseases. PLoS Genet. 2013;9(4):e1003444. doi: 10.1371/journal.pgen.1003444
- 8. Denny JC, Bastarache L, Ritchie MD, et al. Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data. Nat Biotechnol. 2013;31(12):1102-1110. doi: 10.1038/nbt.2749
- 9. Pendergrass SA, Brown-Gentry K, Dudek S, et al. Phenome-wide association study (PheWAS) for detection of pleiotropy within the Population Architecture using Genomics and Epidemiology (PAGE) Network. PLoS Genet. 2013;9(1):e1003087. doi: 10.1371/journal.pgen.1003087
- 10. Bush WS, Oetjens MT, Crawford DC. Unravelling the human genome-phenome relationship using phenome-wide association studies. Nat Rev Genet. 2016;17(3):129-145. doi: 10.1038/nrg.2015.36
- 11. Rastegar-Mojarad M, Ye Z, Kolesar JM, Hebbring SJ, Lin SM. Opportunities for drug repositioning from phenome-wide association studies. Nat Biotechnol. 2015;33(4):342-345. doi: 10.1038/nbt.3183
- 12. Gaziano JM, Concato J, Brophy M, et al. Million Veteran Program: a mega-biobank to study genetic influences on health and disease. J Clin Epidemiol. 2016;70:214-223. doi: 10.1016/j.jclinepi.2015.09.016
- 13. Sudlow C, Gallacher J, Allen N, et al. UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015;12(3):e1001779. doi: 10.1371/journal.pmed.1001779
- 14. Canela-Xandri O, Rawlik K, Tenesa A. An atlas of genetic associations in UK Biobank [published online August 16, 2017]. bioRxiv. doi: 10.1101/176834
- 15. Crawford D. Exploring data and cohort discovery in the synthetic derivative. https://slideplayer.com/slide/8161934/. Accessed April 30, 2018.
- 16. Denny JC, Ritchie MD, Basford MA, et al. PheWAS: demonstrating the feasibility of a phenome-wide scan to discover gene-disease associations. Bioinformatics. 2010;26(9):1205-1210. doi: 10.1093/bioinformatics/btq126
- 17. Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen WM. Robust relationship inference in genome-wide association studies. Bioinformatics. 2010;26(22):2867-2873. doi: 10.1093/bioinformatics/btq559
- 18. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38(8):904-909. doi: 10.1038/ng1847
- 19. Cook JP, Morris AP. Multi-ethnic genome-wide association study identifies novel locus for type 2 diabetes susceptibility. Eur J Hum Genet. 2016;24(8):1175-1180. doi: 10.1038/ejhg.2016.17
- 20. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B. 1995;57(1):289-300.
- 21. Machiela MJ, Chanock SJ. LDlink: a web-based application for exploring population-specific haplotype structure and linking correlated alleles of possible functional variants. Bioinformatics. 2015;31(21):3555-3557. doi: 10.1093/bioinformatics/btv402
- 22. De Benedetti F, Brunner HI, Ruperto N, et al.; PRINTO; PRCSG. Randomized trial of tocilizumab in systemic juvenile idiopathic arthritis. N Engl J Med. 2012;367(25):2385-2395. doi: 10.1056/NEJMoa1112802
- 23. Genovese MC, Kremer JM, van Vollenhoven RF, et al. Transaminase levels and hepatic events during tocilizumab treatment: pooled analysis of long-term clinical trial safety data in rheumatoid arthritis. Arthritis Rheumatol. 2017;69(9):1751-1761. doi: 10.1002/art.40176
- 24. Hammoudeh M, Al Awadhi A, Hasan EH, Akhlaghi M, Ahmadzadeh A, Sadeghi Abdollahi B. Safety, tolerability, and efficacy of tocilizumab in rheumatoid arthritis: an open-label phase 4 study in patients from the Middle East [published online April 30, 2015]. Int J Rheumatol. doi: 10.1155/2015/975028
- 25. Hashimoto M, Fujii T, Hamaguchi M, et al. Increase of hemoglobin levels by anti-IL-6 receptor antibody (tocilizumab) in rheumatoid arthritis. PLoS One. 2014;9(5):e98202. doi: 10.1371/journal.pone.0098202
- 26. Navarro G, Taroumian S, Barroso N, Duan L, Furst D. Tocilizumab in rheumatoid arthritis: a meta-analysis of efficacy and selected clinical conundrums. Semin Arthritis Rheum. 2014;43(4):458-469. doi: 10.1016/j.semarthrit.2013.08.001
- 27. Stone JH, Tuckwell K, Dimonaco S, et al. Trial of tocilizumab in giant-cell arteritis. N Engl J Med. 2017;377(4):317-328. doi: 10.1056/NEJMoa1613849
- 28. Austen WG, Blennerhassett JB. Giant-cell aortitis causing an aneurysm of the ascending aorta and aortic regurgitation. N Engl J Med. 1965;272:80-83. doi: 10.1056/NEJM196501142720205
- 29. Evans JM, Bowles CA, Bjornsson J, Mullany CJ, Hunder GG. Thoracic aortic aneurysm and rupture in giant cell arteritis: a descriptive study of 41 cases. Arthritis Rheum. 1994;37(10):1539-1547. doi: 10.1002/art.1780371020
- 30. Jones GT, Tromp G, Kuivaniemi H, et al. Meta-analysis of genome-wide association studies for abdominal aortic aneurysm identifies four new disease-specific risk loci. Circ Res. 2017;120(2):341-353. doi: 10.1161/CIRCRESAHA.116.308765
- 31. Harrison SC, Smith AJ, Jones GT, et al.; Aneurysm Consortium. Interleukin-6 receptor pathways in abdominal aortic aneurysm. Eur Heart J. 2013;34(48):3707-3716. doi: 10.1093/eurheartj/ehs354
- 32. Bijlsma JWJ, Welsing PMJ, Woodworth TG, et al. Early rheumatoid arthritis treated with tocilizumab, methotrexate, or their combination (U-Act-Early): a multicentre, randomised, double-blind, double-dummy, strategy trial. Lancet. 2016;388(10042):343-355. doi: 10.1016/S0140-6736(16)30363-4
- 33. Burmester GR, Lin Y, Patel R, et al. Efficacy and safety of sarilumab monotherapy versus adalimumab monotherapy for the treatment of patients with active rheumatoid arthritis (MONARCH): a randomised, double-blind, parallel-group phase III trial. Ann Rheum Dis. 2017;76(5):840-847. doi: 10.1136/annrheumdis-2016-210310
- 34. Fleischmann R, van Adelsberg J, Lin Y, et al. Sarilumab and nonbiologic disease-modifying antirheumatic drugs in patients with active rheumatoid arthritis and inadequate response or intolerance to tumor necrosis factor inhibitors. Arthritis Rheumatol. 2017;69(2):277-290. doi: 10.1002/art.39944
- 35. Gabay C, Emery P, van Vollenhoven R, et al.; ADACTA Study Investigators. Tocilizumab monotherapy versus adalimumab monotherapy for treatment of rheumatoid arthritis (ADACTA): a randomised, double-blind, controlled phase 4 trial [published correction appears in Lancet. 2013;381(9877):1540]. Lancet. 2013;381(9877):1541-1550. doi: 10.1016/S0140-6736(13)60250-0
- 36. Genovese MC, Fleischmann R, Kivitz AJ, et al. Sarilumab plus methotrexate in patients with active rheumatoid arthritis and inadequate response to methotrexate: results of a phase III study. Arthritis Rheumatol. 2015;67(6):1424-1437. doi: 10.1002/art.39093
- 37. Isaacs JD, Harari O, Kobold U, Lee JS, Bernasconi C. Effect of tocilizumab on haematological markers implicates interleukin-6 signalling in the anaemia of rheumatoid arthritis. Arthritis Res Ther. 2013;15(6):R204. doi: 10.1186/ar4397
- 38. Kim GW, Lee NR, Pi RH, et al. IL-6 inhibitors for treatment of rheumatoid arthritis: past, present, and future. Arch Pharm Res. 2015;38(5):575-584. doi: 10.1007/s12272-015-0569-8
- 39. Maini RN, Taylor PC, Szechinski J, et al.; CHARISMA Study Group. Double-blind randomized controlled clinical trial of the interleukin-6 receptor antagonist, tocilizumab, in European patients with rheumatoid arthritis who had an incomplete response to methotrexate. Arthritis Rheum. 2006;54(9):2817-2829. doi: 10.1002/art.22033
- 40. McInnes IB, Thompson L, Giles JT, et al. Effect of interleukin-6 receptor blockade on surrogates of vascular risk in rheumatoid arthritis: MEASURE, a randomised, placebo-controlled study. Ann Rheum Dis. 2015;74(4):694-702. doi: 10.1136/annrheumdis-2013-204345
- 41. Nishimoto N, Ito K, Takagi N. Safety and efficacy profiles of tocilizumab monotherapy in Japanese patients with rheumatoid arthritis: meta-analysis of six initial trials and five long-term extensions. Mod Rheumatol. 2010;20(3):222-232. doi: 10.3109/s10165-010-0279-5
- 42. Sieper J, Braun J, Kay J, et al. Sarilumab for the treatment of ankylosing spondylitis: results of a phase II, randomised, double-blind, placebo-controlled study (ALIGN). Ann Rheum Dis. 2015;74(6):1051-1057. doi: 10.1136/annrheumdis-2013-204963
- 43. Strang AC, Bisoendial RJ, Kootte RS, et al. Pro-atherogenic lipid changes and decreased hepatic LDL receptor expression by tocilizumab in rheumatoid arthritis. Atherosclerosis. 2013;229(1):174-181. doi: 10.1016/j.atherosclerosis.2013.04.031
- 44. Villiger PM, Adler S, Kuchen S, et al. Tocilizumab for induction and maintenance of remission in giant cell arteritis: a phase 2, randomised, double-blind, placebo-controlled trial. Lancet. 2016;387(10031):1921-1927. doi: 10.1016/S0140-6736(16)00560-2
- 45. Zhang X, Georgy A, Rowell L. Pharmacokinetics and pharmacodynamics of tocilizumab, a humanized anti-interleukin-6 receptor monoclonal antibody, following single-dose administration by subcutaneous and intravenous routes to healthy subjects. Int J Clin Pharmacol Ther. 2013;51(6):443-455. doi: 10.5414/CP201819
- 46. Nishimoto N, Yoshizaki K, Miyasaka N, et al. Treatment of rheumatoid arthritis with humanized anti-interleukin-6 receptor antibody: a multicenter, double-blind, placebo-controlled trial. Arthritis Rheum. 2004;50(6):1761-1769. doi: 10.1002/art.20303
- 47. Nishimoto N, Hashimoto J, Miyasaka N, et al. Study of active controlled monotherapy used for rheumatoid arthritis, an IL-6 inhibitor (SAMURAI): evidence of clinical and radiographic benefit from an x ray reader-blinded randomised controlled trial of tocilizumab. Ann Rheum Dis. 2007;66(9):1162-1167. doi: 10.1136/ard.2006.068064
- 48. Patel AM, Moreland LW. Interleukin-6 inhibition for treatment of rheumatoid arthritis: a review of tocilizumab therapy. Drug Des Devel Ther. 2010;4:263-278. doi: 10.2147/DDDT.S14099
- 49. Nieken J, Mulder NH, Buter J, et al. Recombinant human interleukin-6 induces a rapid and reversible anemia in cancer patients. Blood. 1995;86(3):900-905.
- 50. Atkins MB, Kappler K, Mier JW, Isaacs RE, Berkman EM. Interleukin-6-associated anemia: determination of the underlying mechanism. Blood. 1995;86(4):1288-1291.
- 51. Dougados M, Kissel K, Sheeran T, et al. Adding tocilizumab or switching to tocilizumab monotherapy in methotrexate inadequate responders: 24-week symptomatic and structural results of a 2-year randomised controlled strategy trial in rheumatoid arthritis (ACT-RAY). Ann Rheum Dis. 2013;72(1):43-50. doi: 10.1136/annrheumdis-2011-201282
- 52. Huizinga TW, Fleischmann RM, Jasson M, et al. Sarilumab, a fully human monoclonal antibody against IL-6Rα in patients with rheumatoid arthritis and an inadequate response to methotrexate: efficacy and safety results from the randomised SARIL-RA-MOBILITY Part A trial. Ann Rheum Dis. 2014;73(9):1626-1634. doi: 10.1136/annrheumdis-2013-204405
- 53. Lee EB, Daskalakis N, Xu C, et al. Disease-drug interaction of sarilumab and simvastatin in patients with rheumatoid arthritis. Clin Pharmacokinet. 2017;56(6):607-615. doi: 10.1007/s40262-016-0462-8
- 54. Lin CT, Huang WN, Hsieh CW, et al. Safety and effectiveness of tocilizumab in treating patients with rheumatoid arthritis: a three-year study in Taiwan. J Microbiol Immunol Infect. 2017;S1684-1182(17)30105-6.
- 55. Sarwar N, Butterworth AS, Freitag DF, et al.; IL6R Genetics Consortium Emerging Risk Factors Collaboration. Interleukin-6 receptor pathways in coronary heart disease: a collaborative meta-analysis of 82 studies. Lancet. 2012;379(9822):1205-1213. doi: 10.1016/S0140-6736(11)61931-4
- 56. Kleveland O, Kunszt G, Bratlie M, et al. Effect of a single dose of the interleukin-6 receptor antagonist tocilizumab on inflammation and troponin T release in patients with non-ST-elevation myocardial infarction: a double-blind, randomized, placebo-controlled phase 2 trial. Eur Heart J. 2016;37(30):2406-2413. doi: 10.1093/eurheartj/ehw171
- 57. Bush WS, Moore JH. Chapter 11: genome-wide association studies. PLoS Comput Biol. 2012;8(12):e1002822. doi: 10.1371/journal.pcbi.1002822
- 58. Sinnott JA, Cai F, Yu S, et al. PheProb: probabilistic phenotyping using diagnosis codes to improve power for genetic association studies [published online May 17, 2018]. J Am Med Inform Assoc.
- 59. Benjamini Y, Yekutieli D. The control of the false discovery rate in multiple testing under dependency. Ann Stat. 2001;29(4):1165-1188.
- 60. Cannon CP, Blazing MA, Giugliano RP, et al.; IMPROVE-IT Investigators. Ezetimibe added to statin therapy after acute coronary syndromes. N Engl J Med. 2015;372(25):2387-2397. doi: 10.1056/NEJMoa1410489
Associated Data