Abstract
Platelet aggregation at the site of atherosclerotic vascular injury is the underlying pathophysiology of myocardial infarction and stroke. To build upon prior GWAS, here we report on 16 loci identified through a whole genome sequencing (WGS) approach in 3,855 NHLBI Trans-Omics for Precision Medicine (TOPMed) participants deeply phenotyped for platelet aggregation. We identify the RGS18 locus, which encodes a myeloerythroid lineage-specific regulator of G-protein signaling that co-localizes with expression quantitative trait loci (eQTL) signatures for RGS18 expression in platelets. Gene-based approaches implicate the SVEP1 gene, a known contributor to coronary artery disease risk. Sentinel variants at RGS18 and PEAR1 are associated with thrombosis risk and increased gastrointestinal bleeding risk, respectively. Our WGS findings add to previously identified GWAS loci, provide insights regarding the mechanism(s) by which genetics may influence cardiovascular disease risk, and underscore the importance of rare variant and regulatory approaches to identifying loci contributing to complex phenotypes.
Subject terms: Genome-wide association studies, Quantitative trait loci, Coagulation system, Cardiovascular genetics, Platelets
Platelet aggregation is associated with myocardial infarction and stroke. Here, the authors have conducted a whole genome sequencing association study on platelet aggregation, discovering a locus in RGS18, where enhancer assays suggest an effect on activity of haematopoietic lineage transcription factors.
Introduction
Atherosclerotic cardiovascular diseases (ASCVD) remain the major cause of morbidity and mortality worldwide. The hallmark of ASCVD is aggregation of activated platelets on a ruptured atherosclerotic plaque followed by thrombus formation1. Hemostasis and platelet aggregation are evolutionarily conserved processes maintained by a delicate balance between agonists like ADP and epinephrine and antagonists like prostaglandins2. Prior studies have shown that platelet aggregation in response to agonists is highly heritable, with heritability estimates between 40 and 60% (refs. 3–5). High platelet reactivity at baseline and after inhibition with aspirin is associated with poor cardiovascular outcomes6,7. Antiplatelet therapies are standard-of-care for secondary prevention of the complications of occlusions in coronary, cerebral, and peripheral arteries. Prior genome- and exome-wide association studies have identified at least 8 common variants for platelet aggregation in response to different agonists8–11. With the exception of a few limited gene-based scans9,12, no previous genome-wide studies have systematically evaluated the contribution of both common and rare variants to heritability of agonist-induced platelet reactivity. Thus, it is likely that significant missing heritability remains for platelet function traits.
In this work leveraging the scientific resources of the NHLBI Trans-Omics for Precision Medicine (TOPMed) Program, we report the first association study of platelet aggregation in response to a variety of physiological stimuli using whole-genome sequencing (WGS) data. We sought to 1) refine previously identified GWAS loci, 2) identify novel loci that determine platelet aggregation in response to different doses of ADP, epinephrine and collagen, 3) examine the collective burden of coding variants on platelet aggregation, and 4) evaluate the collective burden of rare non-coding variants of megakaryocyte-specific super-enhancer regions on platelet aggregation. Extension of genetic findings using biobank resources as well as ex vivo cell-based functional systems was also performed.
Results
Single-variant based tests for association
There were a total of 19 harmonized phenotypic measures of platelet aggregation evaluated in this investigation (Supplementary Table 1). This includes 7 phenotypes for adenosine diphosphate (ADP) as an agonist, 8 for epinephrine, and 4 for collagen. Genome-wide single variant tests for association were performed on ~28 million variants in 3,125 European Americans (EA) and 730 African Americans (AA) (Supplementary Table 2) from the Framingham Heart Study (FHS), Old Order Amish Study (OOA), and the Genetic Study of Atherosclerosis Risk (GeneSTAR). We identified 101 variants associated with platelet aggregation in response to ADP, epinephrine, or collagen (P value < 5 × 10−8, Fig. 1, Supplementary Fig. 1). Using iterative conditional analyses, genome-wide significant variants were refined down to 16 independent loci (Table 1). With the exception of two variants (rs12041331 and chr17:21960955), all loci were associated with platelet aggregation in response to a single agonist (Fig. 1B), and most of the identified loci were not present in the prior array-based approaches8–10 (Table 1, Supplementary Figs. 2–4).
Table 1.
Known vs Novel^a | chr:pos (hg38) | rsID | ref/alt | Nearest Gene | MAF | ADP P | ADP beta | Collagen lag time P | Collagen lag time beta | Epinephrine P | Epinephrine beta |
---|---|---|---|---|---|---|---|---|---|---|---|
Novel | 1:20567949 | rs12137738 | A/T | FAM43B,CDA | 0.077 | 2.62E−04 | 0.22 | 1.71E−02 | −0.108 | 1.04E−08 | 0.306 |
Novel | 1:67128641 | rs142001088 | C/T | C1orf141 | 0.018 | 2.88E−02 | 0.203 | 9.25E−09 | −0.503 | 2.51E−03 | 0.276 |
Known | 1:156899922 | rs12041331 | G/A | PEAR1 | 0.148 | 7.61E−17 | −0.329 | 7.58E−17 | 0.317 | 2.31E−18 | −0.358 |
Novel | 1:192194880 | rs1175170 | G/C | RGS18,RGS21 | 0.442 | 7.86E−06 | 0.123 | 2.37E−02 | −0.057 | 1.96E−09 | 0.155 |
Novel | 5:19109993 | rs112157462 | T/C | LINC02223,CDH18 | 0.022 | 1.64E−02 | −0.281 | 1.19E−08 | 0.458 | 5.90E−03 | −0.23 |
Novel | 6:121921871 | rs58250884 | A/G | GJA1,HSF2 | 0.087 | 1.69E−03 | −0.153 | 2.34E−02 | 0.106 | 2.22E−08 | −0.273 |
Novel | 9:28873884 | rs185159562 | T/A | LINGO2 | 0.005 | 1.16E−02 | −0.447 | 1.45E−01 | 0.243 | 3.87E−08 | −0.988 |
Novel | 10:75490891 | rs138028657 | A/G | LRMDA | 0.006 | 6.44E−01 | 0.073 | 1.52E−08 | −0.858 | 6.14E−01 | −0.102 |
Known | 10:111139289 | rs7097060 | T/A | ADRA2A,GPAM | 0.137 | 5.45E−01 | −0.028 | 6.73E−01 | −0.015 | 6.68E−12 | −0.251 |
Novel | 11:92185065 | rs183146849 | A/T | DISC1FP1,FAT3 | 0.012 | 3.11E−08 | −0.702 | 4.38E−01 | −0.084 | 5.73E−04 | −0.376 |
Novel | 12:132589485 | rs140148392 | G/A | FBRSL1,LRCOL1 | 0.009 | 1.51E−01 | 0.254 | 2.88E−08 | 0.669 | 3.54E−01 | 0.138 |
Novel | 13:96912429 | rs61974290 | A/G | HS6ST3,LINC00359 | 0.057 | 9.58E−02 | −0.088 | 9.43E−03 | 0.134 | 3.36E−09 | −0.4 |
Novel | 17:16451482 | rs575524466 | G/A | LRRC75A-AS1 | 0.003 | 1.90E−01 | −0.315 | 3.13E−08 | 1.169 | 5.35E−02 | −0.463 |
Novel | 17:21960955 | – | A/T | KCNJ18,UBBP4 | 0.276 | 2.87E−08 | 0.167 | 1.88E−03 | −0.089 | 7.73E−19 | 0.26 |
Novel | 18:29059923 | rs138845468 | TAAATA/T | CDH2,MIR302F | 0.082 | 6.99E−02 | 0.087 | 3.23E−08 | −0.25 | 1.21E−01 | 0.087 |
Novel | 20:50142397 | rs542707094 | CTG/C | TMEM189,TMEM189-UBE2V1 | 0.003 | 3.16E−02 | −0.486 | 3.53E−08 | 1.194 | 3.05E−02 | −0.487 |
P values presented are a summary across all individual phenotypes for the single agonist (i.e., the minimum P value for the SNV from 8, 7 and 4 individual phenotypes for epinephrine, ADP, and collagen, respectively). The alternate allele (alt) represents the minor allele (see Supplementary Table 1). P values are from a two-sided score test with no adjustment for multiple testing.
^a Loci were defined as known if they were identified in the prior array-based GWAS approaches8–10; otherwise they were labeled as novel.
Replication of the single-variant results
Replication of discovery findings was performed in up to 2,009 independent samples from FHS, OOA, and GeneSTAR (Supplementary Data 1), and extended into an independent cohort (the Caerphilly Prospective Study [CaPS], N = 1183) for ADP and collagen-induced platelet aggregation phenotypes8,13 (Supplementary Table 4). Among the 7 previously reported loci10, 2 were replicated in this investigation (PEAR1 and ADRA2A, Table 1). Reduction in sample size, a low overlapping percentage (<75%) of participants in 2 of the previously studied cohorts (FHS and GeneSTAR European samples), addition of subjects (OOA and GeneSTAR African Americans), and the difference between WGS data and HapMap imputed dosage data (Supplementary Table 5a) may explain, in part, the lack of association observed with the other 5 previously identified loci. Meta-analysis, as opposed to mega-analysis approaches, did not meaningfully change the interpretation of these findings (Supplementary Table 5a, b) comparing the current WGS results to prior studies. Comparison of previous results with the current investigation for the RGS18 variant is shown in Supplementary Table 5b; all other newly-identified WGS variants from this study were not available in the previous investigation.
Co-localization of the genetic loci with eQTLs in platelets
Given that all 16 loci identified using single-variant approaches are located in non-coding regions of the genome, we tested for co-localization between these regions and eQTL data available through RNA sequencing of platelets in 180 European Americans from GeneSTAR (Supplementary Table 6). We found that sentinel variants in the PEAR1 and RGS18 loci were eQTLs for PEAR1 and RGS18, respectively. No co-localization was noted for any of the remaining 14 loci. As noted in Fig. 2a, there is likely only a single variant accounting for the PEAR1 GWAS peak, in contrast to RGS18 where there are likely several causal variants.
PheWAS in external Biobanks
An examination of the sentinel variants reported in Table 1 was performed in the UK Biobank and BioVU as presented in Supplementary Data 2. The minor allele (A) of PEAR1 at rs12041331, which is known to be associated with lesser platelet aggregation, PEAR1 RNA, and protein expression14,15, was associated with increased odds of gastrointestinal bleeding in both EAs and AAs in the BioVU Biobank PheWAS.
Functional follow up of the RGS18 locus
In the RGS18 region, several variants were replicated using independent samples (Supplementary Data 1), and additional evidence was also observed for ADP and collagen aggregation phenotypes in the CaPS study (Supplementary Table 4). Overlaying the associated variants with platelet eQTLs and megakaryocytic epigenome features, there are several potential candidate polymorphisms (Supplementary Table 8). Consistent with our human results, independent Rgs18−/− mouse studies suggest Rgs18 inhibits pre-agonist stimulated platelet reactivity, with knockouts exhibiting exaggerated platelet reactivity to multiple agonist pathways, decreased bleeding times, and increased arterial occlusion16,17. This is attributed to a loss of inhibition of multiple G-protein coupled receptor signaling pathways in platelets18. The minor allele (C) of RGS18 at rs1175170 is associated with arterial thrombosis/embolization in both EAs and AAs in the BioVU BioBank (Supplementary Data 2). Allele-specific and transcription-factor overexpression studies suggest that rs12070423, which may disrupt a GATA1 target site, and rs4495675, which may disrupt an NFE2 target site, both reduce RGS18 expression (Fig. 2b, Supplementary Table 9, Supplementary Figs. 4 and 5). These SNPs are in LD (minimum r2 0.614) with rs1175170, suggesting they may be functional variants on the same haplotype that affect RGS18-mediated platelet activation.
Genes identified through rare variant based approaches
SKAT19 gene-based tests using a MAF threshold of 0.05 were conducted for deleterious variants mapping to 17,774 protein-coding genes (Supplementary Table 10, Supplementary Fig. 6) with significant findings after Bonferroni correction for SVEP1 (ADP-induced platelet aggregation, P value = 2.6 × 10−6), BCO1 (epinephrine-induced platelet aggregation, P = 8.9 × 10−7), NELFA (collagen-induced platelet aggregation, P = 1.7 × 10−6) and IDH3A (collagen-induced platelet aggregation, P value = 2.6 × 10−6). Through leave-one-out analysis, we observed that these associations were driven mainly by single or limited sets of rare variants (Supplementary Fig. 7, Supplementary Table 11). For example, the SVEP1 association with ADP-induced platelet aggregation was solely driven by a nonsynonymous variant (Gly229Arg) in the second exon (rs61751937, MAF 0.028, P value = 5.8 × 10−6). This variant alters a highly conserved residue located in the protein’s VWFa domain (Fig. 3A–C). The finding remained significant in the replication cohort (P value = 0.004, Supplementary Table 3) and CaPS (P value = 0.008, Supplementary Table 4), both of which demonstrated an association with increased ADP-induced platelet reactivity. Both variants are modestly associated with CVD outcomes in the UK BioBank (Supplementary Table 7).
The role of genetic variants in MK-specific super-enhancers
To investigate the regulatory role of genetic variation within super-enhancers, we aggregated rare non-coding variants across a set of 1,065 published MK-specific super-enhancers (Supplementary Fig. 8)20. We found that rare non-coding variants in a super-enhancer at the PEAR1 locus were significantly associated with ADP- (P = 2.4 × 10−8), epinephrine- (P value = 1.1 × 10−7) and collagen- (P value = 2.7 × 10−5) induced platelet aggregation. We observed, in marked contrast to our gene-based coding variant analyses, that the association signal in the PEAR1 super-enhancer is driven by multiple rare variants in the region (Supplementary Fig. 9).
Discussion
In this WGS study of platelet aggregation, we identify and replicate several loci contributing to trait variation. A WGS approach continues to validate the importance of the PEAR1 locus. Previous work demonstrated that a single, common (~14% MAF) intronic peak variant in PEAR1 (rs12041331) is associated with platelet phenotypes using GWAS, regional sequencing, and exonic approaches8,12,14,21. The minor allele of rs12041331 is linked to decreased PEAR1 platelet protein levels14,15, potentially through alteration of a methylation site in MKs22. In addition, the role of this gene in platelet signaling is supported by mechanistic studies23,24. Here, a sequencing-based approach followed by co-localization with platelet eQTLs reveals that results are consistent with a model in which a single, common causal variant explains the platelet reactivity signal with respect to PEAR1. Similar to the case of PEAR1, we recently identified a single strong regulatory SNP, rs10886430 intronic to GRK5, that affects a GATA1 transcription factor site and regulates platelet gene expression in a highly cell-type specific manner, ultimately accounting for ~20% of variation in thrombin-platelet reactivity via PAR4 receptor regulation, and being causally related to both venous and arterial disease risk11. These examples demonstrate how single SNPs of large effect can be identified and ultimately associated with CVD endpoints, but doing so requires detailed studies of agonist-specific phenotypes and cell-specific expression patterns; without them, such variants will be missed.
The proteins RGS10 and RGS18 are highly expressed in platelets and are important regulators of G protein signaling that play a role in multiple pathways of activation in platelets. Our results indicate common RGS18 platelet regulatory alleles modulate human platelet function likely through GATA1 and/or NFE2 interacting sites. Furthermore, our findings in independent biobanks and ancestry groups that the allele that leads to increased platelet reactivity is also associated with cardiovascular and thrombotic outcomes including occlusions, cerebrovascular disease, cardiac arrest, embolism and deep vein thrombosis suggest that RGS18 may be a critical node for intervention in platelets.
The WGS approach allowing for a rare-variant gene-based analysis suggests that SVEP1 may have previously unappreciated and multifactorial roles in contributing to CVD. Homozygous Svep1−/− mice die from edema, and heterozygous mice, as well as zebrafish, experience arterial and lymphatic vessel malformations25–27. Consistent with previous investigations28, RNAseq data in a subset of GeneSTAR participants do not indicate expression of SVEP1 in platelets. We find that rs61751937 is the strongest plasma protein QTL for SVEP1 (P value = 5.2 × 10−64), reducing expression29, suggesting the effects may be mediated through interactions of platelets with other cell types in circulation. This conserved protein could potentially affect platelet function and CVD through several mechanisms including cell-cell adhesion, cell differentiation, and functions in bone marrow niches30. Recent functional work demonstrates that SVEP1 is expressed in plaques. Further experiments suggested that deficiency of Svep1 affects Cxcl1 endothelial release and promotes proinflammatory leukocyte recruitment to plaques31. Given our assays are ex vivo assessments of platelet function in PRP lacking endothelial, leukocyte and smooth muscle cells, this suggests that alteration of SVEP1 levels or other related factors in plasma may also have direct effects on platelets that may influence thrombus formation.
In conclusion, there is a large body of evidence supporting the hypothesis that hyper-reactive platelets may predict future thromboses in both healthy individuals6 and those who have already experienced thrombosis32,33. Therefore, better understanding of the genetic determinants of heightened platelet aggregation is likely critical in the early prediction of thrombosis events as well as aiding in pharmacogenetic efforts pertaining to antiplatelet therapy. By applying contemporary WGS strategies in participants with extensive platelet reactivity phenotype data, we show the potential for such approaches to identify genetic determinants that may impact such traits.
Methods
Description of study populations
GeneSTAR
The Genetic Study of Atherosclerosis Risk (GeneSTAR) is an ongoing, prospective family-based study designed to explore environmental, phenotypic, and genetic causes of premature cardiovascular disease. Participants were recruited from European- and African-American families (n = 891) identified from probands who were hospitalized for a coronary disease event prior to 60 years of age in any of 10 Baltimore, Maryland area hospitals. Apparently healthy siblings of the probands, offspring of the siblings and probands, and the co-parents of the offspring were screened for traditional coronary disease and stroke risk factors as part of a study of platelet function prior to and following a 2-week trial of 81 mg/day of aspirin from 2003 to 2006 (refs. 3,34). All measures described here were obtained prior to the commencement of aspirin. Exclusion criteria included: 1) any coronary heart disease or vascular thrombotic event, 2) any bleeding disorder or hemorrhagic event (e.g., stroke or gastrointestinal bleed), 3) current use of any anticoagulants or antiplatelet agents (e.g., warfarin, persantin, clopidogrel), 4) current use of chronic or acute nonsteroidal anti-inflammatory agents, including COX-2 inhibitors that could not be discontinued, 5) recent active gastrointestinal disorder, 6) current pharmacotherapy for a gastrointestinal disorder, 7) pregnancy or risk of pregnancy during the trial, 8) recent menorrhagia, 9) known aspirin intolerance or allergic side effects, 10) serious medical disorders (e.g., autoimmune diseases, renal or hepatic failure, cancer or HIV-AIDS), 11) current chronic or acute use of glucocorticosteroid therapy or any drug that may interfere with the measured outcomes, 12) serious psychiatric disorders, and 13) inability to independently make a decision to participate.
Of the 3003 participants in the aspirin trial, 1786 were selected for whole-genome sequencing (WGS) in the Trans-Omics for Precision Medicine (TOPMed) Program based on 1) complete platelet function phenotyping and 2) largest family size.
Framingham Heart Study
The Framingham Heart Study (FHS) is a longitudinal family-based study that began recruiting participants of European ancestry in 1948 and is now on its third generation of participants. The Original cohort (first generation) contains 5209 participants; the Offspring cohort (second generation), which began recruiting in 1971, contains 5124 participants; and the Third Generation cohort, which began recruiting in 2002, contains 4095 participants. In the present study, we use data from the Offspring cohort10. For FHS, aspirin use was determined based on arachidonic acid and review of platelet aggregation curves.
Old Order Amish (OOA)
As part of the Amish Complex Disease Research Program, a prospective cohort trial examining the relationship between genetic variants and agonist-induced platelet function at baseline and in response to clopidogrel and aspirin was performed. Characteristics of this cohort have been described previously35. Briefly, Amish participants who were over age 20, generally healthy, and agreed to discontinue the use of medications, supplements, and vitamins for at least one week prior to study initiation were eligible for recruitment. Medical and family histories, anthropometry, physical examinations, and blood samples were obtained after an overnight fast. All measures described here were obtained prior to clopidogrel or aspirin administration. Participants were excluded from participation if any of the following criteria were met: 1) currently pregnant or breastfeeding, 2) history of a bleeding disorder or major spontaneous bleed, 3) severe hypertension (bp >160/95 mm Hg), 4) coexisting malignancy, 5) creatinine >2.0 mg/dl, 6) AST or ALT >2 times the upper limit of normal, 7) Hct <32%, 8) TSH <0.4 or >5.5 mIU/L, 9) platelet count >500,000/ul or <75,000/ul, 10) surgery within the last 6 months, 11) allergy to aspirin or clopidogrel, or 12) unwilling or unable to discontinue any medications that may interfere with the results of the study outcomes.
Written informed consent was obtained from all participants, and each study was approved by their local review board (GeneSTAR- Johns Hopkins Institutional Review Board; FHS- Boston University Institutional Review Board; and OOA- University of Maryland, Baltimore Institutional Review Board).
Platelet function tests and phenotype harmonization
Methods to assess ex vivo platelet function have been described in detail previously8,10. In brief, blood samples were obtained after an overnight fast into 3.2% (or 3.8% in FHS) citrated vacutainer tubes. Platelet-rich and platelet-poor plasma (PRP and PPP, respectively) were isolated by centrifugation (PRP, 180 × g for 15 min in GeneSTAR and OOA, 160 × g for 5 min in FHS; PPP 2000 × g for 10 min in GeneSTAR and OOA, 2500 × g for 20 min for FHS). Light transmittance aggregometry was performed in PRP using a PAP-4 (GeneSTAR and FHS) or a PAP-8E (OOA) aggregometer after stimulation with ADP, epinephrine, or collagen using PPP as a referent. In GeneSTAR, maximal aggregation (% aggregation) was recorded for periods of 5 min after stimulation with ADP (2.0 and 10.0 μM, Chronolog Corp, Havertown, PA) or epinephrine (2.0 and 10.0 μM, Chronolog Corp, Havertown, PA); and lag time to initiation of aggregation was recorded after stimulation with equine tendon–derived type I collagen (1, 2, 5 and 10 μg/ml; Chronolog Corp, Havertown, PA). The same methods, agonists, and agonist concentrations were used in the OOA cohort with the exception that only one concentration of epinephrine (10 μM) was used and an extra concentration of ADP (5 μM) was tested. FHS tested aggregation for periods of 4 min after administration of ADP (1.0, 3.0, 5.0, and 10.0 μM) and 5 min after administration of epinephrine (0.5, 1.0, 3.0, 5.0 and 10.0 μM); and lag time to aggregation was assessed after stimulation with 190 μg/ml calf skin–derived type I collagen (Bio/Data Corporation, Horsham, PA). Threshold concentrations to ADP and epinephrine (EC50) were determined as the minimal concentration of agonist required to produce >50% aggregation.
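The threshold-concentration definition above can be illustrated with a short sketch. The function name, the dictionary layout, and the dose-response values below are hypothetical and chosen only for illustration; they are not taken from the study data or its software.

```python
def threshold_concentration(responses, cutoff=50.0):
    """Return the lowest agonist concentration (in uM) whose maximal
    aggregation exceeds `cutoff` percent, or None if none does.

    `responses` maps agonist concentration -> % maximal aggregation.
    """
    above = [conc for conc, pct in responses.items() if pct > cutoff]
    return min(above) if above else None


# Hypothetical dose-response values for one participant (illustration only)
example = {1.0: 22.0, 3.0: 48.5, 5.0: 71.0, 10.0: 88.0}
ec50 = threshold_concentration(example)
```

In this toy example the 5.0 μM dose is the first to exceed 50% aggregation, so it is returned as the threshold concentration.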
Using an adapted two-stage procedure36, platelet aggregation traits were adjusted for age, sex and aspirin use using a linear model, and the residuals from the linear model were inverse normal transformed within each cohort. Given the difference in agonist concentrations used between the GeneSTAR/OOA and FHS cohorts, predefined phenotypes were identified and harmonized across studies. For ADP, epinephrine, and collagen independently, the transformed residuals for identical or closely matching agonist concentrations were combined across studies for analysis, to test for association between genetic variants and low as well as high concentrations of each agonist. In total, 19 traits were defined: three low-dose ADP traits, four high-dose ADP traits, five low-dose epinephrine traits, three high-dose epinephrine traits, two low-dose collagen traits, and two high-dose collagen traits. Additional details regarding platelet phenotype harmonization are shown in Supplementary Table 1.
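As a minimal sketch of the second stage, a rank-based inverse normal transformation applied within each cohort might look as follows. It assumes the covariate-adjusted residuals are already available, uses a rank offset of 0.5 and average ranks for ties (both are assumptions, since the text does not specify them), and all function names and numbers are hypothetical.

```python
from statistics import NormalDist


def inverse_normal_transform(values, offset=0.5):
    """Rank-based inverse normal transform; ties receive their average rank.

    The offset of 0.5 is an assumed convention, not one stated in the text.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    nd = NormalDist()
    return [nd.inv_cdf((r - offset) / n) for r in ranks]


def transform_within_cohorts(residuals_by_cohort):
    """Apply the transform separately to each cohort's residuals."""
    return {c: inverse_normal_transform(r) for c, r in residuals_by_cohort.items()}


# Hypothetical residuals from the covariate-adjustment stage
toy = {"cohort_A": [3.2, -1.0, 0.4], "cohort_B": [0.1, 0.1, 2.0, -5.0]}
transformed = transform_within_cohorts(toy)
```

Transforming within cohort puts each cohort's residuals on the same standard normal scale before they are combined across studies.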
TOPMed whole-genome sequencing
WGS was performed to an average depth of 38X using DNA isolated from blood, PCR-free library construction, and Illumina HiSeq X technology. All samples used in this set of TOPMed genomes were from Freeze 5b. Details for variant calling and quality control are described in a companion paper by Taliun et al.37. Briefly, variant discovery and genotype calling were performed jointly, across all the available TOPMed Freeze 5b studies, using the GotCloud pipeline, resulting in a single, multi-study, genotype call set. Sample-level quality control was performed to check for pedigree errors, discrepancies between self-reported and genetic sex, and concordance with prior genotyping array data.
Variant annotation
Variant annotation was performed using the WGSA738 and dbNSFP39. Variants were annotated as exonic, splicing, ncRNA, UTR5, UTR3, intronic, upstream, downstream, or intergenic. Exonic variants were further annotated as frameshift insertion, frameshift deletion, frameshift block substitution, stopgain, stoploss, nonframeshift insertion, nonframeshift deletion, nonframeshift block substitution, nonsynonymous variant, synonymous variant, or unknown. Additional scores available included REVEL40, MCAP41 or CADD42 effect prediction algorithms.
Single variant tests for association
All analyses in this study were performed on the Analysis Commons43. Variants with minor allele count (MAC) of at least 5 and depth of coverage (DP) of at least 10 were selected for single variant analyses. The GWAS were conducted using GENetic EStimation and Inference in Structured samples (GENESIS)44,45 apps on Analysis Commons. GENESIS uses a linear mixed model with a genetic relationship matrix (GRM) that is robust to population structure and can account for known or cryptic relatedness. The combined transformed residuals were used to conduct null model analysis adjusting for cohort indicators using genesis_nullmodel app (https://github.com/AnalysisCommons/). Single variant analysis and Sequence Kernel Association Test (SKAT) gene-based analyses were performed using genesis_tests app.
We used p < 5 × 10−8 as our genome-wide significance threshold in single variant analysis, including conditional analysis for identifying independent signals. Conditional analysis was conducted by selecting the genome-wide significant variant with the lowest p value on a chromosome for conditioning and performing single variant analysis on the same chromosome. The procedure was repeated until no genome-wide significant variant was identified in conditional analysis by chromosome. Any variant surpassing genome-wide significance in conditional analysis was considered to be an additional signal independent of the conditioned variant(s).
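The iterative conditioning procedure can be sketched as a simple loop over one chromosome. Here `scan` is a hypothetical stand-in for the association run (performed with GENESIS in the study), and the toy p values are invented purely to exercise the loop.

```python
def iterative_conditional_analysis(scan, threshold=5e-8):
    """Collect independent signals on one chromosome.

    `scan(conditioned)` is a hypothetical callable that returns
    {variant: p_value} after conditioning on the given tuple of variants.
    """
    independent = []
    while True:
        pvals = scan(tuple(independent))
        hits = {v: p for v, p in pvals.items()
                if p < threshold and v not in independent}
        if not hits:
            return independent
        # Condition next on the most significant remaining variant
        independent.append(min(hits, key=hits.get))


def toy_scan(conditioned):
    """Invented p values: varA_proxy is explained by varA; varB is independent."""
    pvals = {"varA": 1e-12, "varA_proxy": 3e-9, "varB": 2e-8, "varC": 0.2}
    if "varA" in conditioned:
        pvals["varA"] = 1.0
        pvals["varA_proxy"] = 0.4
    if "varB" in conditioned:
        pvals["varB"] = 1.0
    return pvals


signals = iterative_conditional_analysis(toy_scan)
```

In the toy run, the proxy variant loses significance once the lead variant is conditioned on, so only two independent signals are retained.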
Gene-based coding variant tests for association
To improve the power to identify rare variants in coding regions, we aggregated deleterious rare coding variants in 17,774 protein-coding genes and then tested for association with platelet aggregation phenotypes. To enrich for functional variants, only variants with a “deleterious” consequence for their corresponding gene or genes (http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences) were included. For each protein-coding gene, a set of rare coding variants (MAF < 0.05) was constructed, which was composed of all stop-gain, stop-loss and frameshift variants as well as exonic missense variants that fulfilled one of these criteria: 1) REVEL score >0.5, 2) M_CAP score was “Deleterious”, or 3) CADD score >30. The protein coding variant groupings were tested using SKAT with the beta-distribution parameters of 1 and 25 as proposed by Wu et al.19. Significance was evaluated for each platelet aggregation trait after Bonferroni correction (0.05/17,744 = 2.82 × 10−6).
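The variant-selection rules and the Beta(1, 25) weighting can be illustrated with a small sketch. The dictionary keys, consequence labels, and example annotation values are hypothetical; the actual annotations came from WGSA and dbNSFP, and the actual test was run with SKAT rather than this code.

```python
LOSS_OF_FUNCTION = {"stop_gained", "stop_lost", "frameshift_variant"}


def is_qualifying(variant, maf_max=0.05):
    """Apply the rare, deleterious coding-variant mask described in the text.

    `variant` is a dict with hypothetical keys: maf, consequence, revel,
    mcap_pred, cadd.
    """
    if variant["maf"] >= maf_max:
        return False
    if variant["consequence"] in LOSS_OF_FUNCTION:
        return True
    if variant["consequence"] == "missense_variant":
        return ((variant.get("revel") or 0.0) > 0.5
                or variant.get("mcap_pred") == "Deleterious"
                or (variant.get("cadd") or 0.0) > 30)
    return False


def beta_1_25_density(maf):
    """Beta(1, 25) density at `maf`, which simplifies to 25 * (1 - maf)**24."""
    return 25.0 * (1.0 - maf) ** 24


# Hypothetical annotated variants (values invented for illustration)
rare_missense = {"maf": 0.02, "consequence": "missense_variant", "revel": 0.62}
common_stop = {"maf": 0.12, "consequence": "stop_gained"}
benign_missense = {"maf": 0.01, "consequence": "missense_variant",
                   "revel": 0.1, "mcap_pred": "Tolerated", "cadd": 12.0}
kept = [is_qualifying(v) for v in (rare_missense, common_stop, benign_missense)]
```

Because the Beta(1, 25) density decreases steeply with allele frequency, rarer variants in a gene set contribute more to the test statistic than variants near the 0.05 frequency ceiling.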
Next we sought to determine which rare deleterious variants in each significant gene were driving the association signal. We iterated through the variants and removed one variant at a time (leave-one-out approach) and repeated the SKAT analysis. If a variant made a large contribution to the original association signal, one would expect the signal to significantly weaken with removal of the variant from the gene set.
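A leave-one-out scan of this kind can be sketched generically. Here `gene_test` is a hypothetical stand-in for the SKAT run, and the toy statistic (a sum of squared per-variant scores) is used only to make the example self-contained; it is not the SKAT statistic or its p value.

```python
def leave_one_out(variants, gene_test):
    """Re-run a set-based test with each variant removed in turn.

    `gene_test(variant_list)` is a hypothetical callable returning a test
    statistic (larger = stronger signal).
    """
    full = gene_test(variants)
    reduced = {}
    for v in variants:
        reduced[v] = gene_test([u for u in variants if u != v])
    return full, reduced


# Invented per-variant scores; v1 dominates the aggregate signal
scores = {"v1": 4.5, "v2": 0.3, "v3": -0.2}


def toy_stat(variant_list):
    return sum(scores[v] ** 2 for v in variant_list)


full_stat, loo_stats = leave_one_out(list(scores), toy_stat)
# The variant whose removal weakens the signal most is the main driver
driver = min(loo_stats, key=loo_stats.get)
```

With a real test one would compare p values rather than raw statistics, looking for the removal that most weakens the association.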
Super-enhancer based rare variant tests for association
We investigated rare non-coding variants with putative regulatory potential by focusing on megakaryocyte-specific super enhancers (MK SEs). The published MK SEs20 were called based on regions identified as enhancers through genome segmentation across a set of six histone modifications (H3K4me1, H3K4me3, H3K9me3, H3K27ac, H3K27me3 and H3K36me3) profiled in the BLUEPRINT project46, aggregating together elements within 12.5 kb and then ranking upon H3K27ac signal with the ROSE algorithm47,48. We annotated rare (MAF < 0.05) non-coding variants located within megakaryocyte DNase I Hypersensitivity Site (DHS) peaks generated by BLUEPRINT and subset to those overlapping with MK SEs. We then applied SKAT, aggregating these non-coding variants on the set of MK SEs (n = 1065) to identify the association of these regulatory elements with platelet aggregation phenotypes. Significance was evaluated for each platelet aggregation trait after Bonferroni correction (0.05/1065 = 4.69 × 10−5). When a gene was identified, we conducted leave-one-out analysis to identify if a variant(s) contributed to the observed signal.
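As a quick arithmetic check of the super-enhancer significance threshold, the sketch below recomputes the Bonferroni cutoff and compares it with the PEAR1 super-enhancer P values reported in the Results. Variable names are illustrative only.

```python
# Bonferroni threshold for the super-enhancer scan
n_super_enhancers = 1065
alpha = 0.05
threshold = alpha / n_super_enhancers  # about 4.69e-5

# PEAR1 super-enhancer P values as reported in the Results section
pear1_pvalues = {"ADP": 2.4e-8, "epinephrine": 1.1e-7, "collagen": 2.7e-5}
significant = {agonist: p < threshold for agonist, p in pear1_pvalues.items()}
```

All three agonist-specific P values fall below the threshold, with the collagen result the closest to it.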
Replication
Additional samples from each cohort which were not included in TOPMed and therefore not included in the discovery analyses were used to replicate the genome-wide significant variants identified in the discovery analyses. In brief, genotype imputation and replication analyses were conducted by each cohort, and then meta-analysis was used to combine cohort replication analysis results. For signals identified in our gene-based tests, instead of conducting gene-based replication analysis, we replicated the single rare variants that drove the signals and that were identified from leave-one-out analyses, as not all selected rare variants had good imputation quality and were available in each cohort49. Each cohort independently, and separately by race for GeneSTAR, imputed the 22 autosomes using the TOPMed Freeze5b reference panel with Minimac450. We implemented sample quality control procedures (excluding duplicate/reference samples and gender mismatches) and genotyping quality control procedures (excluding variants with call rate <95%, HWE p value <10−6, or MAF < 0.5%). After lifting over to build 38, non-ambiguous strand flips were resolved and ambiguous strand flips were removed. Post-imputation quality control was performed considering 6 MAF bins and using an imputation R2 cutoff between 0.3 and 0.8, incrementing by 0.1 such that the mean R2 exceeded 0.8 for each MAF bin. A maximum of 2009 samples were available for replication analyses: 395 OOA, 1289 FHS, 246 GS-EA, and 100 GS-AA.
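One way to read the post-imputation filter is sketched below: for each MAF bin, step the imputation R2 cutoff from 0.3 to 0.8 in increments of 0.1 and stop at the first cutoff for which the retained variants have mean R2 above 0.8. Choosing the smallest such cutoff and retaining variants with R2 at or above it are assumptions, and the R2 values are invented.

```python
def choose_r2_cutoff(r2_values, target_mean=0.8):
    """Return the smallest cutoff in 0.3, 0.4, ..., 0.8 such that variants with
    R2 >= cutoff have mean R2 > target_mean; None if no cutoff qualifies.

    Selecting the smallest qualifying cutoff is an assumption of this sketch.
    """
    for step in range(3, 9):
        cutoff = step / 10
        kept = [r for r in r2_values if r >= cutoff]
        if kept and sum(kept) / len(kept) > target_mean:
            return cutoff
    return None


# Invented imputation R2 values for two hypothetical MAF bins
r2_by_bin = {
    "rare_bin": [0.35, 0.45, 0.90, 0.95, 0.99],
    "common_bin": [0.95, 0.97, 0.99],
}
cutoffs = {name: choose_r2_cutoff(vals) for name, vals in r2_by_bin.items()}
```

In this toy data the rarer bin needs a stricter cutoff than the common bin, which is the usual pattern because imputation quality tends to fall with allele frequency.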
GS-EA, GS-AA and OOA used the same statistical methods as the discovery analysis for the replication analysis, again using the Analysis Commons. In brief, the platelet traits were adjusted for age, sex and aspirin use using a linear model separately in each cohort (OOA, GS-EA, GS-AA), and the residuals from these models were then inverse normal transformed within each cohort. Using GENESIS, null models were fitted within each cohort using a linear mixed model with a GRM and no covariates. Using these null models, single variant analyses were performed using GENESIS, again within each cohort. FHS conducted its replication analysis using a linear mixed effects model with a relationship coefficient matrix that accounts for familial correlation, implemented in the coxme51 R package; a linear model adjusting for age, sex and aspirin use was used to obtain residuals, which were inverse normal transformed and then used for testing genetic association. Replication meta-analyses were performed using the sample size weighted approach implemented in METAL52. An imputation quality filter of R2 ≥ 0.7 was applied prior to meta-analysis; that is, at a particular variant, any cohorts with R2 < 0.7 did not contribute to the variant’s meta-analysis. Replication p-values were reported based on a one-sided test since the same effect direction was the expected result to reject the null hypothesis.
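The sample size weighted scheme combines per-cohort Z scores with weights proportional to the square root of each cohort's sample size. A minimal sketch follows, including the per-variant R2 ≥ 0.7 filter and a one-sided p value in the expected direction. It does not call METAL itself; the function name and input tuples are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def weighted_z_meta(studies, expected_sign=1, min_r2=0.7):
    """Sample size weighted Z-score meta-analysis for one variant.

    `studies` is a list of (z, n, imputation_r2) tuples. Cohorts with
    imputation R2 below `min_r2` do not contribute. Returns
    (z_meta, one_sided_p), or None if no cohort passes the filter.
    """
    used = [(z, n) for z, n, r2 in studies if r2 >= min_r2]
    if not used:
        return None
    numerator = sum(sqrt(n) * z for z, n in used)
    denominator = sqrt(sum(n for _, n in used))
    z_meta = numerator / denominator
    # One-sided test in the direction observed in discovery
    p_one_sided = 1.0 - NormalDist().cdf(expected_sign * z_meta)
    return z_meta, p_one_sided


# Invented Z scores; the third cohort is dropped for low imputation quality
result = weighted_z_meta([(1.0, 100, 0.9), (1.0, 100, 0.85), (3.0, 100, 0.5)])
```

With two equally sized cohorts that each have Z = 1, the combined Z is the square root of 2, which shows how concordant evidence accumulates across cohorts.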
Extension of results to an independent GWAS cohort
The Caerphilly Prospective Study (CaPS) participants were relatively healthy, middle-aged men at recruitment, and their ages at the time of blood draw (Exam 2) ranged from 47 to 66 years. The extent of platelet aggregation in response to three agonists was measured in PRP adjusted to 300,000 platelets/µl with autologous PPP53. Agonists included collagen (42.7 µg/ml), ADP (0.725 µmol/l), and full-length thrombin (0.056 units/ml). The maximal optical density increase due to platelet aggregation was measured and expressed as a proportion of the difference in optical density between PRP and PPP. Genotyping was performed with the Affymetrix UK BioBank array using the Affymetrix Axiom Analysis program. Following sample and genotyping quality control, imputation on 22 autosomes was performed using the HRC 1.1 reference panel, resulting in ~7.6 million variants with MAF > 0.01 and R2 > 0.4. GWAS was performed with the Efficient Mixed-Model Association eXpedited (EMMAX) package. For each trait, a linear model was constructed adjusting for age and medication use (anticoagulant, antiplatelet, antilipid, hypoglycemics) and single variant analyses were performed on transformed platelet reactivity values. Maximum sample sizes and phenotype transformations were as follows: ADP (n = 1177, natural log), thrombin (n = 1183, square root), and collagen (n = 811, cube root). Although the agonists differed in some cases in dose or type from the discovery efforts, our prior hypothesis was that alleles that increase platelet reactivity to one agonist are likely to increase reactivity to other agonists and doses. Note that in CaPS collagen maximal aggregation was measured and collagen lag time was unavailable; thus, the expected effect direction is opposite to that in our discovery analyses (as observed for PEAR1 rs12041331 in Supplementary Table 4 versus the collagen lag time discovery results in Supplementary Data 1).
Association extension results in CaPS are reported with beta, standard-error, and one-sided p values relative to the hypothesized direction.
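A one-sided p value relative to a hypothesized direction can be computed from a reported beta and standard error as sketched below. This is a minimal illustration using a normal approximation; the function name and the `expected_sign` argument are hypothetical and the exact test statistic used in CaPS is not assumed here.

```python
from statistics import NormalDist


def one_sided_p(beta, se, expected_sign):
    """One-sided p value for an effect in the hypothesized direction.

    expected_sign is +1 if the tested allele is expected to increase the trait
    and -1 if it is expected to decrease it (for example, when maximal
    aggregation is compared against a lag-time discovery result).
    """
    if expected_sign not in (1, -1):
        raise ValueError("expected_sign must be +1 or -1")
    if se <= 0:
        raise ValueError("se must be positive")
    z = expected_sign * beta / se
    # Upper-tail probability P(Z >= z), written as cdf(-z) for numerical stability.
    return NormalDist().cdf(-z)
```

An estimate in the hypothesized direction yields half the corresponding two-sided p value, whereas an estimate in the opposite direction yields a p value above 0.5.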
Co-localization of expression quantitative trait loci (eQTL) signatures from platelets in GeneSTAR European Americans
A subset of 180 TOPMed GeneSTAR European American samples also had RNA-seq data generated from platelets. eQTL analysis was performed as previously described54. Here, formal Bayesian co-localization was performed using the coloc55–57 package in R for each of the 16 independent loci (Table 1) against all gene transcripts for which at least one SNP within 20 kb of the peak variant had an eQTL p value < 0.003125 (0.05/16) for the specific gene. This yielded 10 locus-gene pairs (Supplementary Table 4). coloc tests five mutually exclusive hypotheses: H0, no GWAS and no eQTL association; H1, association with GWAS, but no eQTL; H2, association with eQTL, but no GWAS; H3, eQTL and GWAS association, but with two independent causal variants; and H4, a shared causal variant for both eQTL and GWAS. The main interest is to assess whether there is a shared causal variant between eQTL and GWAS (i.e., H4). The package provides a posterior probability for each of the five hypotheses (PP0, PP1, PP2, PP3, and PP4), and PP4 > 75% was considered evidence of colocalization of GWAS and eQTL signals. Posterior probabilities for individual variants were evaluated once the PP4 threshold was met.
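To illustrate how the five posterior probabilities arise, the sketch below combines per-SNP log approximate Bayes factors for the two traits in the way described for coloc's summary-statistic method. It is a simplified illustration, not the coloc implementation. The prior values are assumptions (the commonly cited coloc defaults), since the priors used in this analysis are not stated in the text, and the function names are hypothetical.

```python
from math import exp, expm1, log

# Screening threshold from the text: Bonferroni correction for 16 loci.
EQTL_P_THRESHOLD = 0.05 / 16  # 0.003125


def logsumexp(values):
    """log(sum(exp(v))) computed stably."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + log(sum(exp(v - m) for v in values))


def logdiffexp(a, b):
    """log(exp(a) - exp(b)); returns -inf when the difference is not positive."""
    if b >= a:
        return float("-inf")
    return a + log(-expm1(b - a))


def coloc_posteriors(labf_gwas, labf_eqtl, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [PP0, PP1, PP2, PP3, PP4] for one locus-gene pair.

    labf_gwas, labf_eqtl: per-SNP log approximate Bayes factors for the same
    SNPs in the same order. Priors p1, p2, p12 are assumed values.
    """
    if len(labf_gwas) != len(labf_eqtl) or not labf_gwas:
        raise ValueError("need two non-empty lists of equal length")
    s1 = logsumexp(labf_gwas)
    s2 = logsumexp(labf_eqtl)
    s12 = logsumexp([a + b for a, b in zip(labf_gwas, labf_eqtl)])
    log_h = [
        0.0,                                          # H0: no association
        log(p1) + s1,                                 # H1: GWAS only
        log(p2) + s2,                                 # H2: eQTL only
        log(p1) + log(p2) + logdiffexp(s1 + s2, s12), # H3: two distinct variants
        log(p12) + s12,                               # H4: one shared variant
    ]
    total = logsumexp(log_h)
    return [exp(h - total) for h in log_h]


def colocalizes(posteriors, threshold=0.75):
    """Apply the PP4 > 75% criterion used in the text."""
    return posteriors[4] > threshold
```

When the strongest evidence for both traits falls on the same SNP, PP4 dominates; when it falls on different SNPs, the probability mass shifts to PP3.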
Allele-specific and transcription-factor enriched enhancer assays
Cell culture
K562 is a lymphoblastoid human erythroleukemia cell line derived from a female donor; it grows in suspension. K562 cells were cultured and maintained in RPMI 1640 media supplemented with 10% FBS (Sigma-Aldrich), penicillin/streptomycin and L-glutamine, and were passaged every 24–48 h. HEK293 cells were cultured and maintained at low passage in DMEM media supplemented with 5% FBS (Sigma-Aldrich). All cultures were maintained in a humidified environment at 37 °C with 5% CO2.
Lentivirus production
The following vectors were used for lentivirus production: pInducer-21 lentiviral vector (Addgene), pMD2.G envelope plasmid (Addgene), and psPAX2 packaging plasmid (Addgene). 293T-17 cells (ATCC) were cultured and maintained at low passage in DMEM media supplemented with 5% FBS (Sigma-Aldrich). Cultures were maintained in a humidified environment at 37 °C with 5% CO2, and cells were passaged every 24–48 h. Open reading frames of GFP (Empty), POLR2A, NRF1, CTCF, FOSL1, GATA1, GATA2, CEBPB, and NFE2 were cloned into the pInducer-21 lentiviral vector. For lentivirus production, 293T-17 cells were transfected with third generation packaging plasmids pMD2.G and psPAX2 (Addgene) and the lentiviral plasmids for POLR2A, NRF1, CTCF, FOSL1, GATA1, GATA2, CEBPB, and NFE2. Viruses were harvested 48 h post-transfection and concentrated by ultracentrifugation at 71,286 × g for 2 h at 4 °C. Viruses were titrated by serial dilution on 293T cells using GFP as an indicator.
RNA extraction, reverse transcription, and RT-qPCR
RNA was extracted from the transduced HEK293 and K562 cells using an RNeasy kit (Qiagen). Reverse transcription was performed using Superscript III (Invitrogen) with an oligo(dT)15 primer. Quantitative PCRs were performed in triplicate with TaqMan primer-probe assays, shown in Methods Table 1, on a CFX96 real-time PCR detection system (Bio-Rad). Target transcript abundance was calculated relative to ACTB (reference gene) using the 2^−ΔΔCT method. Gene-specific primer pairs are listed in the Methods section Oligonucleotides.
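The 2^−ΔΔCT calculation can be illustrated as follows. This is a minimal sketch: the function name and the choice of calibrator condition (for example, cells transduced with the GFP/Empty vector) are assumptions made for illustration, and technical replicates are simply averaged.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Fold change of a target transcript by the 2^-ddCt method.

    Each argument is a list of Ct values from technical replicates.
    The reference gene (e.g. ACTB) normalizes for input amount; the
    calibrator is the condition against which the sample is compared.
    """
    delta_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_calibrator = mean(ct_target_calibrator) - mean(ct_ref_calibrator)
    delta_delta_ct = delta_sample - delta_calibrator
    return 2.0 ** (-delta_delta_ct)
```

For example, a target with ΔCT of 4 in the sample and 7 in the calibrator has ΔΔCT = −3, which corresponds to an 8-fold higher relative abundance in the sample.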
Enhancer function reporter assay
The following vectors were used in this protocol: pGL3 luciferase reporter (Promega) and pGL4.74[hRluc/TK] control vector (Promega). Non-coding regions of ~200–300 base pairs from RGS18, ADRA2A and PEAR1, surrounding each SNP of interest, were cloned into the pGL3 luciferase vector. For each locus we created two constructs to assess functionality: a wild-type construct carrying no SNP and a knock-in construct carrying the SNP. The constructs were generated by VectorBuilder. Constructs were sequenced to confirm the expected genotype and to ensure that no off-target mutations were introduced. Dual luciferase reporter assays were performed as described previously with minor modifications11,58. Briefly, HEK293 or K562 cells overexpressing GFP, POLR2A, NRF1, CTCF, FOSL1, GATA1, GATA2, CEBPB, or NFE2 were co-transfected with one of the two pGL3 luciferase vectors described above as well as pGL3 control according to the manufacturer’s instructions. Firefly and Renilla luciferase reporter activities of cell extracts were measured using the Dual-Glo Luciferase Assay System (Promega) on a microplate reader according to the manufacturer’s instructions. Each treatment was performed in duplicate and the experiment was repeated three times. Assay primer information is given in Supplementary Table 12.
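Dual luciferase readouts are conventionally summarized by dividing the firefly signal by the Renilla signal in each well and then comparing constructs. The sketch below shows that conventional normalization for an allele comparison. It is an assumption about the analysis, which the text does not spell out, and the function names and input layout are hypothetical.

```python
from statistics import mean


def normalized_activity(wells):
    """Mean firefly/Renilla ratio across replicate wells.

    wells: list of (firefly, renilla) luminescence readings.
    """
    ratios = []
    for firefly, renilla in wells:
        if renilla <= 0:
            raise ValueError("Renilla signal must be positive")
        ratios.append(firefly / renilla)
    return mean(ratios)


def allele_fold_change(variant_wells, wildtype_wells):
    """Normalized activity of the SNP knock-in construct relative to wild type."""
    return normalized_activity(variant_wells) / normalized_activity(wildtype_wells)
```

A fold change above 1 would indicate higher reporter activity for the knock-in construct than for the wild-type construct under the same transcription factor condition.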
Phenome-wide association study (PheWAS)
The 16 genome-wide significant variants identified from WGS in our discovery cohort (Table 1), as well as significant variants identified in gene-based tests, were examined against clinical phenotypes in the UKBB and BioVU cohorts where available. We queried UKBB GWAS results using SAIGE-calculated summary statistics (http://www.nealelab.is/uk-biobank/faq). The BioVU repository contains >250,000 DNA samples obtained from discarded blood samples of consented patients at Vanderbilt University Medical Center (Nashville, TN). De-identified DNA samples in the BioVU repository are linked to 1543 clinical diagnostic codes. Of these 1543 codes, we identified 71 diagnoses for which platelet function could be in the pathophysiologic pathway to disease expression. Numerous overlapping disease processes were represented among these 71 codes, which we further categorized into the following 6 phenotypes: arterial thrombosis (30 codes), venous thrombosis (3 codes), hypercoagulable state (2 codes), platelet (5 codes), bleeding (26 codes), and anti-thrombotic medication usage (5 codes). The phecodes were matched between UKBB and BioVU for corresponding allelic results.
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Acknowledgements
Whole genome sequencing (WGS) for the Trans-Omics in Precision Medicine (TOPMed) program was supported by the National Heart, Lung and Blood Institute (NHLBI). WGS for GeneSTAR (Genetic Study of Atherosclerosis Risk) was performed at Macrogen, Illumina, and the Broad Institute of MIT and Harvard (HHSN268201500014C). WGS for the Old Order Amish (Genetics of Cardiometabolic Health in the Amish) was performed at the Broad Institute of MIT and Harvard (3R01HL121007-01S1). WGS for The Framingham Heart Study (Whole Genome Sequencing and Related Phenotypes in the Framingham Heart Study) was performed at the Broad Institute of MIT and Harvard (HHSN268201500014C). Centralized read mapping and genotype calling, along with variant quality metrics and filtering were provided by the TOPMed Informatics Research Center (3R01HL-117626-02S1). Phenotype harmonization, data management, sample-identity QC, and general study coordination, were provided by the TOPMed Data Coordinating Center (3R01HL-120393-02S1). For the Old Order Amish this investigation was supported by National Institutes of Health grants U01 GM074518, U01 HL105198, R01 HL137922, R01 HL121007, and the University of Maryland Mid-Atlantic Nutrition and Obesity Research Center (P30 DK072488). GeneSTAR was supported by the National Institutes of Health/National Heart, Lung, and Blood Institute (U01 HL72518, HL087698, HL112064, HL11006, HL118356) and by a grant from the National Institutes of Health/National Center for Research Resources (M01-RR000052) to the Johns Hopkins General Clinical Research Center. The Framingham Heart Study is conducted and supported by the NHLBI in collaboration with Boston University (Contract No. N01-HC-25195), its contract with Affymetrix, Inc., for genome-wide genotyping services (Contract No. N02-HL-6–4278 and Contract No. HHSN268201500001I). MHC, BATR and ADJ were supported by NHLBI Intramural funding.
The Caerphilly Prospective Study was undertaken by the former MRC Epidemiology Unit (South Wales) and was funded by the Medical Research Council of the UK. The data archive is maintained by the School of Social and Community Medicine, University of Bristol. This study makes use of data generated by the BLUEPRINT Consortium. A full list of the investigators who contributed to the generation of the data is available from www.blueprint-epigenome.eu. Funding for the project was provided by the European Union’s Seventh Framework Programme (FP7/2007-2013) under grant agreement no 282510 BLUEPRINT. Additional support came from the National Blood Foundation/American Association of Blood Banks (FP01021164), the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK; U54DK110805) and the National Research Service Award (NRSA)’s Joint Program in Transfusion Medicine (T32 4T32HL066987-15 to A.B.). BioVU resource analyses were supported by National Institutes of Health/National Human Genome Research Institute grant U01HG009086. The views expressed in this manuscript are those of the authors and do not necessarily represent the views of the National Heart, Lung, and Blood Institute; the National Institutes of Health; or the U.S. Department of Health and Human Services.
Author contributions
A.R.K., M.-H.C. and B.A.T.R. had equal roles in analysis, interpretation and writing of the paper. J.P.L., R.A.M. and A.D.J. had equal senior roles for the study. A.R.K. wrote the first draft of the manuscript with contribution and editing from M.-H.C., B.A.T.R., L.R.Y., M.A.T., J.A.B., L.C.B., N.F., J.P.L., R.A.M. and A.D.J. Genome-wide analyses were performed by A.R.K., B.A.T.R., B.J.G., L.R.Y., M.-H.C. and K.R. RNA-sequencing and eQTL analyses were performed by R.A.M., Kai.K., M.A.T., I.R., L.R.Y., A.R.K., Kan.K. and K.I. Imputation of genomic data and replication analyses were performed by M.-H.C., B.A.T.R., L.R.Y., B.J.G., A.P., L.A.C., and M.H.K. L.R.Y., N.F., L.C.B. and R.A.M. were involved in the guidance, collection and analysis for Genetic Study of Atherosclerosis Risk (GeneSTAR) phenotype data. B.J.G., K.R., B.D.M., J.P.L., J.R.O. and A.R.S. were involved in the guidance, collection and analysis for Old Order Amish Study (OOA) phenotype data. B.A.T.R., M.-H.C. and A.D.J. were involved in the guidance and analysis for Framingham Heart Study (FHS) phenotype data. T.M.S. and A.B. established the imMKCL system, and A.B. designed all functional experiments with input from B.A.T.R. and A.D.J. and carried them out. X.Z., Q.W. and B.L. carried out BioVU pheWAS analyses. A.D.J. funded genotyping of the CaPS cohort. M.-H.C. performed genotype QC, calling and imputation of the CaPS cohort with input from A.D.J. B.A.T.R., M.-H.C. and A.D.J. conducted CaPS genotype-phenotype analyses.
Funding
Open Access funding provided by the National Institutes of Health (NIH).
Data availability
TOPMed WGS variant calls are available for all samples through the Database of Genotypes and Phenotypes (dbGaP) under accession number phs001218 for GeneSTAR, phs000956 and phs000391 for OOA and phs000974 for Framingham. Phenotype data for GeneSTAR, OOA and Framingham are also available through this mechanism. Summary statistics are being deposited in the TOPMed GSR (Genomic Summary Results) site. eQTL analysis results used in the co-localization analysis are hosted on a website at: http://www.biostat.jhsph.edu/~kkammers/GeneSTAR/.
Competing interests
The authors declare no competing interests.
Footnotes
Peer review information Nature Communications thanks Willem Ouwehand and the other, anonymous, reviewer for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Ali R. Keramati, Ming-Huei Chen, Benjamin A.T. Rodriguez.
These authors jointly supervised this work: Joshua P. Lewis, Rasika A. Mathias, Andrew D. Johnson.
A list of authors and their affiliations appears at the end of the paper.
Contributor Information
Joshua P. Lewis, Email: jlewis2@som.umaryland.edu
Rasika A. Mathias, Email: rmathias@jhmi.edu
Andrew D. Johnson, Email: johnsonad2@nhlbi.nih.gov
NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium:
Namiko Abe, Goncalo Abecasis, Francois Aguet, Christine Albert, Laura Almasy, Alvaro Alonso, Seth Ament, Peter Anderson, Pramod Anugu, Deborah Applebaum-Bowden, Kristin Ardlie, Dan Arking, Donna K. Arnett, Allison Ashley-Koch, Stella Aslibekyan, Tim Assimes, Paul Auer, Dimitrios Avramopoulos, Najib Ayas, Adithya Balasubramanian, John Barnard, Kathleen Barnes, R. Graham Barr, Emily Barron-Casella, Lucas Barwick, Terri Beaty, Gerald Beck, Diane Becker, Lewis Becker, Rebecca Beer, Amber Beitelshees, Emelia Benjamin, Takis Benos, Marcos Bezerra, Larry Bielak, Joshua Bis, Thomas Blackwell, John Blangero, Eric Boerwinkle, Donald W. Bowden, Russell Bowler, Jennifer Brody, Ulrich Broeckel, Jai Broome, Deborah Brown, Karen Bunting, Esteban Burchard, Carlos Bustamante, Erin Buth, Brian Cade, Jonathan Cardwell, Vincent Carey, Julie Carrier, Cara Carty, Richard Casaburi, Juan P. Casas Romero, James Casella, Peter Castaldi, Mark Chaffin, Christy Chang, Yi-Cheng Chang, Daniel Chasman, Sameer Chavan, Bo-Juen Chen, Wei-Min Chen, Yii-DerIda Chen, Michael Cho, Seung Hoan Choi, Lee-Ming Chuang, Mina Chung, Ren-Hua Chung, Clary Clish, Suzy Comhair, Matthew Conomos, Elaine Cornell, Adolfo Correa, Carolyn Crandall, James Crapo, L. Adrienne Cupples, Joanne Curran, Jeffrey Curtis, Brian Custer, Coleen Damcott, Dawood Darbar, Sean David, Colleen Davis, Michelle Daya, Mariza de Andrade, Lisa de las Fuentes, Paul de Vries, Michael DeBaun, Ranjan Deka, Dawn DeMeo, Scott Devine, Huyen Dinh, Harsha Doddapaneni, Qing Duan, Shannon Dugan-Perez, Ravi Duggirala, Jon Peter Durda, Susan K. Dutcher, Charles Eaton, Lynette Ekunwe, Adel El Boueiz, Patrick Ellinor, Leslie Emery, Serpil Erzurum, Charles Farber, Jesse Farek, Tasha Fingerlin, Matthew Flickinger, Myriam Fornage, Nora Franceschini, Chris Frazar, Mao Fu, Stephanie M. 
Fullerton, Lucinda Fulton, Stacey Gabriel, Weiniu Gan, Shanshan Gao, Yan Gao, Margery Gass, Heather Geiger, Bruce Gelb, Mark Geraci, Soren Germer, Robert Gerszten, Auyon Ghosh, Richard Gibbs, Chris Gignoux, Mark Gladwin, David Glahn, Stephanie Gogarten, Da-Wei Gong, Harald Goring, Sharon Graw, Kathryn J. Gray, Daniel Grine, Colin Gross, C.Charles Gu, Yue Guan, Xiuqing Guo, Namrata Gupta, David M. Haas, Jeff Haessler, Michael Hall, Yi Han, Patrick Hanly, Daniel Harris, Nicola L. Hawley, Jiang He, Ben Heavner, Susan Heckbert, Ryan Hernandez, David Herrington, Craig Hersh, Bertha Hidalgo, James Hixson, Brian Hobbs, John Hokanson, Elliott Hong, Karin Hoth, Chao Agnes Hsiung, Jianhong Hu, Yi-Jen Hung, Haley Huston, Chii Min Hwu, Marguerite Ryan Irvin, Rebecca Jackson, Deepti Jain, Cashell Jaquish, Jill Johnsen, Andrew Johnson, Craig Johnson, Rich Johnston, Kimberly Jones, Hyun Min Kang, Robert Kaplan, Sharon Kardia, Shannon Kelly, Eimear Kenny, Michael Kessler, Alyna Khan, Ziad Khan, Wonji Kim, John Kimoff, Greg Kinney, Barbara Konkle, Charles Kooperberg, Holly Kramer, Christoph Lange, Ethan Lange, Leslie Lange, Cathy Laurie, Cecelia Laurie, Meryl LeBoff, Jiwon Lee, Sandra Lee, Wen-Jane Lee, Jonathon LeFaive, David Levine, Dan Levy, Joshua Lewis, Xiaohui Li, Yun Li, Henry Lin, Honghuang Lin, Xihong Lin, Simin Liu, Yongmei Liu, Yu Liu, Ruth J. F. Loos, Steven Lubitz, Kathryn Lunetta, James Luo, Ulysses Magalang, Michael Mahaney, Barry Make, Ani Manichaikul, Alisa Manning, JoAnn Manson, Lisa Martin, Melissa Marton, Susan Mathai, Rasika Mathias, Susanne May, Patrick McArdle, Merry-Lynn McDonald, Sean McFarland, Stephen McGarvey, Daniel McGoldrick, Caitlin McHugh, Becky McNeil, Hao Mei, James Meigs, Vipin Menon, Luisa Mestroni, Ginger Metcalf, Deborah A. Meyers, Emmanuel Mignot, Julie Mikulla, Nancy Min, Mollie Minear, Ryan L. Minster, Braxton D. Mitchell, Matt Moll, Zeineen Momin, May E. Montasser, Courtney Montgomery, Donna Muzny, Josyf C. 
Mychaleckyj, Girish Nadkarni, Rakhi Naik, Take Naseri, Pradeep Natarajan, Sergei Nekhai, Sarah C. Nelson, Bonnie Neltner, Caitlin Nessner, Deborah Nickerson, Osuji Nkechinyere, Kari North, Jeff O’Connell, Tim O’Connor, Heather Ochs-Balcom, Geoffrey Okwuonu, Allan Pack, David T. Paik, Nicholette Palmer, James Pankow, George Papanicolaou, Cora Parker, Gina Peloso, Juan Manuel Peralta, Marco Perez, James Perry, Ulrike Peters, Patricia Peyser, Lawrence S. Phillips, Jacob Pleiness, Toni Pollin, Wendy Post, Julia Powers Becker, Meher Preethi Boorgula, Michael Preuss, Bruce Psaty, Pankaj Qasba, Dandi Qiao, Zhaohui Qin, Nicholas Rafaels, Laura Raffield, Mahitha Rajendran, Vasan S. Ramachandran, D. C. Rao, Laura Rasmussen-Torvik, Aakrosh Ratan, Susan Redline, Robert Reed, Catherine Reeves, Elizabeth Regan, Alex Reiner, Muagututi’a Sefuiva Reupena, Ken Rice, Stephen Rich, Rebecca Robillard, Nicolas Robine, Dan Roden, Carolina Roselli, Jerome Rotter, Ingo Ruczinski, Alexi Runnels, Pamela Russell, Sarah Ruuska, Kathleen Ryan, Ester Cerdeira Sabino, Danish Saleheen, Shabnam Salimi, Sejal Salvi, Steven Salzberg, Kevin Sandow, Vijay G. Sankaran, Jireh Santibanez, Karen Schwander, David Schwartz, Frank Sciurba, Christine Seidman, Jonathan Seidman, Frederic Series, Vivien Sheehan, Stephanie L. Sherman, Amol Shetty, Aniket Shetty, Wayne Hui-Heng Sheu, M. Benjamin Shoemaker, Brian Silver, Edwin Silverman, Robert Skomro, Albert Vernon Smith, Jennifer Smith, Josh Smith, Nicholas Smith, Tanja Smith, Sylvia Smoller, Beverly Snively, Michael Snyder, Tamar Sofer, Nona Sotoodehnia, Adrienne M. Stilp, Garrett Storm, Elizabeth Streeten, Jessica Lasky Su, Yun Ju Sung, Jody Sylvia, Adam Szpiro, Daniel Taliun, Hua Tang, Margaret Taub, Kent D. Taylor, Matthew Taylor, Simeon Taylor, Marilyn Telen, Timothy A. 
Thornton, Machiko Threlkeld, Lesley Tinker, David Tirschwell, Sarah Tishkoff, Hemant Tiwari, Catherine Tong, Russell Tracy, Michael Tsai, Dhananjay Vaidya, David Van Den Berg, Peter VandeHaar, Scott Vrieze, Tarik Walker, Robert Wallace, Avram Walts, Fei Fei Wang, Heming Wang, Jiongming Wang, Karol Watson, Jennifer Watt, Daniel E. Weeks, Bruce Weir, Scott T. Weiss, Lu-Chen Weng, Jennifer Wessel, Cristen Willer, Kayleen Williams, L. Keoki Williams, Carla Wilson, James Wilson, Lara Winterkorn, Quenna Wong, Joseph Wu, Huichun Xu, Lisa Yanek, Ivana Yang, Ketian Yu, Seyedeh Maryam Zekavat, Yingze Zhang, Snow Xueyan Zhao, Wei Zhao, Xiaofeng Zhu, Michael Zody, and Sebastian Zoellner
Supplementary information
The online version contains supplementary material available at 10.1038/s41467-021-23470-9.
References
- 1. Hennekens CH, Dyken ML, Fuster V. Aspirin as a therapeutic agent in cardiovascular disease: a statement for healthcare professionals from the American Heart Association. Circulation. 1997;96:2751–2753. doi: 10.1161/01.CIR.96.8.2751.
- 2. Jin RC, Voetsch B, Loscalzo J. Endogenous mechanisms of inhibition of platelet function. Microcirculation. 2005;12:247–258. doi: 10.1080/10739680590925493.
- 3. Faraday N, et al. Heritability of platelet responsiveness to aspirin in activation pathways directly and indirectly related to cyclooxygenase-1. Circulation. 2007;115:2490–2496. doi: 10.1161/CIRCULATIONAHA.106.667584.
- 4. Johnson AD. The genetics of common variation affecting platelet development, function and pharmaceutical targeting. J. Thromb. Haemost. 2011;9:246–257. doi: 10.1111/j.1538-7836.2011.04359.x.
- 5. O’Donnell CJ, et al. Genetic and environmental contributions to platelet aggregation: the Framingham heart study. Circulation. 2001;103:3051–3056. doi: 10.1161/01.CIR.103.25.3051.
- 6. Puurunen MK, et al. ADP platelet hyperreactivity predicts cardiovascular disease in the FHS (Framingham Heart Study). J. Am. Heart Assoc. 2018;7. doi: 10.1161/JAHA.118.008522.
- 7. Qayyum R, et al. Greater collagen-induced platelet aggregation following cyclooxygenase 1 inhibition predicts incident acute coronary syndromes. Clin. Transl. Sci. 2015;8:17–22. doi: 10.1111/cts.12195.
- 8. Chen MH, et al. Exome-chip meta-analysis identifies association between variation in ANKRD26 and platelet aggregation. Platelets. 2019;30:164–173. doi: 10.1080/09537104.2017.1384538.
- 9. Eicher JD, et al. Whole exome sequencing in the Framingham Heart Study identifies rare variation in HYAL2 that influences platelet aggregation. Thromb. Haemost. 2017;117:1083–1092. doi: 10.1160/TH16-09-0677.
- 10. Johnson AD, et al. Genome-wide meta-analyses identifies seven loci associated with platelet aggregation in response to agonists. Nat. Genet. 2010;42:608–613. doi: 10.1038/ng.604.
- 11. Rodriguez BAT, et al. A platelet function modulator of thrombin activation is causally linked to cardiovascular disease and affects PAR4 receptor signaling. Am. J. Hum. Genet. 2020;107:211–221. doi: 10.1016/j.ajhg.2020.06.008.
- 12. Keramati AR, et al. Targeted deep sequencing of the PEAR1 locus for platelet aggregation in European and African American families. Platelets. 2019;30:380–386. doi: 10.1080/09537104.2018.1447659.
- 13. Eicher JD, Xue L, Ben-Shlomo Y, Beswick AD, Johnson AD. Replication and hematological characterization of human platelet reactivity genetic associations in men from the Caerphilly Prospective Study (CaPS). J. Thromb. Thrombolysis. 2016;41:343–350. doi: 10.1007/s11239-015-1290-7.
- 14. Faraday N, et al. Identification of a specific intronic PEAR1 gene variant associated with greater platelet aggregability and protein expression. Blood. 2011;118:3367–3375. doi: 10.1182/blood-2010-11-320788.
- 15. Herrera-Galeano JE, et al. A novel variant in the platelet endothelial aggregation receptor-1 gene is associated with increased platelet aggregability. Arterioscler Thromb. Vasc. Biol. 2008;28:1484–1490. doi: 10.1161/ATVBAHA.108.168971.
- 16. Delesque-Touchard N, et al. Regulator of G-protein signaling 18 controls both platelet generation and function. PLoS One. 2014;9:e113215. doi: 10.1371/journal.pone.0113215.
- 17. Alshbool FZ, et al. The regulator of G-protein signaling 18 regulates platelet aggregation, hemostasis and thrombosis. Biochem Biophys. Res. Commun. 2015;462:378–382. doi: 10.1016/j.bbrc.2015.04.143.
- 18. Ma P, et al. Modulating platelet reactivity through control of RGS18 availability. Blood. 2015;126:2611–2620. doi: 10.1182/blood-2015-04-640037.
- 19. Wu MC, et al. Rare-variant association testing for sequencing data with the sequence kernel association test. Am. J. Hum. Genet. 2011;89:82–93. doi: 10.1016/j.ajhg.2011.05.029.
- 20. Petersen R, et al. Platelet function is modified by common sequence variation in megakaryocyte super enhancers. Nat. Commun. 2017;8:16058. doi: 10.1038/ncomms16058.
- 21. Kim Y, et al. Targeted deep resequencing identifies coding variants in the PEAR1 gene that play a role in platelet aggregation. PLoS One. 2013;8:e64179. doi: 10.1371/journal.pone.0064179.
- 22. Izzi B, et al. Allele-specific DNA methylation reinforces PEAR1 enhancer activity. Blood. 2016;128:1003–1012. doi: 10.1182/blood-2015-11-682153.
- 23. Nanda N, et al. Platelet endothelial aggregation receptor 1 (PEAR1), a novel epidermal growth factor repeat-containing transmembrane receptor, participates in platelet contact-induced activation. J. Biol. Chem. 2005;280:24680–24689. doi: 10.1074/jbc.M413411200.
- 24. Izzi B, Noro F, Cludts K, Freson K, Hoylaerts MF. Cell-specific PEAR1 methylation studies reveal a locus that coordinates expression of multiple genes. Int. J. Mol. Sci. 2018;19. doi: 10.3390/ijms19041069.
- 25. Samuelov L, et al. SVEP1 plays a crucial role in epidermal differentiation. Exp. Dermatol. 2017;26:423–430. doi: 10.1111/exd.13256.
- 26. Karpanen T, et al. An evolutionarily conserved role for polydom/Svep1 during lymphatic vessel formation. Circ. Res. 2017;120:1263–1275. doi: 10.1161/CIRCRESAHA.116.308813.
- 27. Morooka N, et al. Polydom is an extracellular matrix protein involved in lymphatic vessel remodeling. Circ. Res. 2017;120:1276–1288. doi: 10.1161/CIRCRESAHA.116.308825.
- 28. Eicher JD, et al. Characterization of the platelet transcriptome by RNA sequencing in patients with acute myocardial infarction. Platelets. 2016;27:230–239. doi: 10.3109/09537104.2015.1083543.
- 29. Sun BB, et al. Genomic atlas of the human plasma proteome. Nature. 2018;558:73–79. doi: 10.1038/s41586-018-0175-2.
- 30. Sato-Nishiuchi R, et al. Polydom/SVEP1 is a ligand for integrin alpha9beta1. J. Biol. Chem. 2012;287:25615–25630. doi: 10.1074/jbc.M112.355016.
- 31. Winkler MJ, et al. Functional investigation of the coronary artery disease gene SVEP1. Basic Res. Cardiol. 2020;115:67. doi: 10.1007/s00395-020-00828-6.
- 32. Stuckey TD, et al. Impact of aspirin and clopidogrel hyporesponsiveness in patients treated with drug-eluting stents: 2-year results of a prospective, multicenter registry study. JACC Cardiovasc. Inter. 2017;10:1607–1617. doi: 10.1016/j.jcin.2017.05.059.
- 33. Price MJ, et al. Platelet reactivity and cardiovascular outcomes after percutaneous coronary intervention: a time-dependent analysis of the Gauging Responsiveness with a VerifyNow P2Y12 assay: Impact on Thrombosis and Safety (GRAVITAS) trial. Circulation. 2011;124:1132–1137. doi: 10.1161/CIRCULATIONAHA.111.029165.
- 34. Bray PF, et al. Heritability of platelet function in families with premature coronary artery disease. J. Thromb. Haemost. 2007;5:1617–1623. doi: 10.1111/j.1538-7836.2007.02618.x.
- 35. Shuldiner AR, et al. Association of cytochrome P450 2C19 genotype with the antiplatelet effect and clinical efficacy of clopidogrel therapy. JAMA. 2009;302:849–857. doi: 10.1001/jama.2009.1232.
- 36. Sofer T, et al. A fully adjusted two-stage procedure for rank-normalization in genetic association studies. Genet Epidemiol. 2019;43:263–275. doi: 10.1002/gepi.22188.
- 37. Taliun D, et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature. 2021;590:290–299. doi: 10.1038/s41586-021-03205-y.
- 38. Liu X, et al. WGSA: an annotation pipeline for human genome sequencing studies. J. Med Genet. 2016;53:111–112. doi: 10.1136/jmedgenet-2015-103423.
- 39. Liu X, Wu C, Li C, Boerwinkle E. dbNSFP v3.0: a one-stop database of functional predictions and annotations for human nonsynonymous and splice-site SNVs. Hum. Mutat. 2016;37:235–241. doi: 10.1002/humu.22932.
- 40. Ioannidis NM, et al. REVEL: an ensemble method for predicting the pathogenicity of rare missense variants. Am. J. Hum. Genet. 2016;99:877–885. doi: 10.1016/j.ajhg.2016.08.016.
- 41. Jagadeesh KA, et al. M-CAP eliminates a majority of variants of uncertain significance in clinical exomes at high sensitivity. Nat. Genet. 2016;48:1581–1586. doi: 10.1038/ng.3703.
- 42. Kircher M, et al. A general framework for estimating the relative pathogenicity of human genetic variants. Nat. Genet. 2014;46:310–315. doi: 10.1038/ng.2892.
- 43. Brody JA, et al. Analysis commons, a team approach to discovery in a big-data environment for genetic epidemiology. Nat. Genet. 2017;49:1560–1563. doi: 10.1038/ng.3968.
- 44. Conomos MP, Miller MB, Thornton TA. Robust inference of population structure for ancestry prediction and correction of stratification in the presence of relatedness. Genet Epidemiol. 2015;39:276–293. doi: 10.1002/gepi.21896.
- 45.Manichaikul A, et al. Robust relationship inference in genome-wide association studies. Bioinformatics. 2010;26:2867–2873. doi: 10.1093/bioinformatics/btq559. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Stunnenberg, H. G. International Human Epigenome Consortium; Martin Hirst. The International Human Epigenome Consortium: a blueprint for scientific collaboration and discovery. Cell167, 1145–1149 (2016). [DOI] [PubMed]
- 47.Whyte WA, et al. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell. 2013;153:307–319. doi: 10.1016/j.cell.2013.03.035.
- 48.Loven J, et al. Selective inhibition of tumor oncogenes by disruption of super-enhancers. Cell. 2013;153:320–334. doi: 10.1016/j.cell.2013.03.036.
- 49.Van Hout CV, et al. Exome sequencing and characterization of 49,960 individuals in the UK Biobank. Nature. 2020;586:749–756. doi: 10.1038/s41586-020-2853-0.
- 50.Das S, et al. Next-generation genotype imputation service and methods. Nat. Genet. 2016;48:1284–1287. doi: 10.1038/ng.3656.
- 51.Ripatti S, Palmgren J. Estimation of multivariate frailty models using penalized partial likelihood. Biometrics. 2000;56:1016–1022. doi: 10.1111/j.0006-341X.2000.01016.x.
- 52.Willer CJ, Li Y, Abecasis GR. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 2010;26:2190–2191. doi: 10.1093/bioinformatics/btq340.
- 53.Elwood PC, et al. Ischemic heart disease and platelet aggregation. The Caerphilly Collaborative Heart Disease Study. Circulation. 1991;83:38–44. doi: 10.1161/01.CIR.83.1.38.
- 54.Kammers K, et al. Transcriptional profile of platelets and iPSC-derived megakaryocytes from whole genome and RNA sequencing. Blood. 2020. doi: 10.1182/blood.2020006115.
- 55.Giambartolomei C, et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 2014;10:e1004383. doi: 10.1371/journal.pgen.1004383.
- 56.Wallace C. Statistical testing of shared genetic control for potentially related traits. Genet Epidemiol. 2013;37:802–813. doi: 10.1002/gepi.21765.
- 57.Wallace C, et al. Statistical colocalization of monocyte gene expression and genetic risk variants for type 1 diabetes. Hum. Mol. Genet. 2012;21:2815–2824. doi: 10.1093/hmg/dds098.
- 58.Bhan A, et al. Antisense transcript long noncoding RNA (lncRNA) HOTAIR is transcriptionally induced by estradiol. J. Mol. Biol. 2013;425:3707–3722. doi: 10.1016/j.jmb.2013.01.022.
- 59.Katoh K, Rozewicki J, Yamada KD. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Brief. Bioinform. 2019;20:1160–1166. doi: 10.1093/bib/bbx108.
Data Availability Statement
TOPMed WGS variant calls are available for all samples through the Database of Genotypes and Phenotypes (dbGaP) under accession numbers phs001218 for GeneSTAR, phs000956 and phs000391 for OOA, and phs000974 for Framingham. Phenotype data for GeneSTAR, OOA, and Framingham are also available through dbGaP. Summary statistics are being deposited in the TOPMed Genomic Summary Results (GSR) site. The eQTL analysis results used in the co-localization analysis are hosted at http://www.biostat.jhsph.edu/~kkammers/GeneSTAR/.