Genes. 2021 Feb 25;12(3):328. doi: 10.3390/genes12030328

A Workflow for Selection of Single Nucleotide Polymorphic Markers for Studying of Genetics of Ischemic Stroke Outcomes

Gennady Khvorykh 1,*, Andrey Khrunin 1, Ivan Filippenkov 1, Vasily Stavchansky 1, Lyudmila Dergunova 1, Svetlana Limborska 1
Editor: Isabella Ceccherini
PMCID: PMC7996278  PMID: 33668793

Abstract

In this paper we propose a workflow for studying the genetic architecture of ischemic stroke outcomes. It extends the candidate gene approach. The workflow combines an animal model of brain ischemia, comparative genomics, human genomic variation, and algorithms for selecting tagging single nucleotide polymorphisms (tagSNPs) in genes whose expression changes after ischemic stroke. It starts from a set of rat genes that changed their expression in response to brain ischemia and yields a set of tagSNPs that represent other SNPs in the analyzed human genes and also influence the expression of those genes.

Keywords: single nucleotide polymorphisms, models of brain ischemia, human orthologues, ischemic stroke

1. Introduction

Ischemic stroke (IS) is a multifactorial disease to which genetic factors contribute substantially [1]. The same seems to be true for outcomes after IS. However, their associations with particular genetic factors are poorly understood and require further investigation [2,3]. There are two main approaches to identifying the genes involved in the development of complex traits: the candidate gene approach and the genome-wide association (GWA) study (GWAS) [4]. Both have been applied extensively to study the genetic basis of IS and have revealed several tens of genes involved in stroke development and risk [5]. In contrast, only a few GWA studies of outcomes after IS have been published [6,7]. Therefore, the genetic control of these outcomes remains a black box, and the full list of risk (prognostic) loci is yet to be identified. In this paper we describe an approach to exploring the genetic basis of variability in IS outcomes.

GWAS does not require prior knowledge of the functional features of the trait under consideration. At the same time, it is less precise in pinpointing causal loci (genes): the associated signals generally fall in chromosomal regions that may contain no genes or, alternatively, many of them [8]. The usefulness of the candidate gene approach has been restricted mainly by incomplete knowledge of the biology of the phenotypes studied. To break this information bottleneck, several strategies extending the candidate gene approach have been proposed [4]. They are based on linkage information for a chromosomal segment, methods of comparative genomics, and gene expression at different stages. There are also approaches that combine two or more strategies. One such method is the digital candidate gene approach (DigiCGA), which extracts, filters, and analyzes publicly available web resources [9]. The method we propose incorporates the strongest strategies of the approaches mentioned above and arranges them in the form of a workflow.

The idea of this research originates from the models of brain ischemia in laboratory animals that were developed to understand the biological processes underlying cerebral ischemic injury [10]. Studies of the rat and mouse genomes showed that most human disease genes (99.5%) have orthologues in rodents [11]. Furthermore, a comparison of the conservation rates of rodent orthologues associated with different types of diseases demonstrated that the gene set related to neurological conditions evolved slowly. Together, these findings suggest that rodent models of human neurological diseases are appropriate representations of the disease processes in humans. Many of the results obtained in model experiments were subsequently confirmed by (or correlated with) corresponding GWA studies in humans, including those addressing outcomes after IS [6]. Although no animal model covers all aspects of human ischemic stroke [12], one such model, transient middle cerebral artery occlusion (tMCAO), is quite promising and is actively used in the development of neuroprotective therapeutic approaches. It is based on temporary artery occlusion and subsequent restoration of blood flow. According to Howells et al. [12], this model was used in 42.2% of 2582 neuroprotection experiments. Occlusion with subsequent restoration of blood flow can influence the functioning of different genes. Recently, Dergunova et al. identified a list of rat genes whose expression in the brain changed substantially in response to tMCAO [13]. We propose to explore the genomic variation in the human orthologues of these genes in a search for genomic markers of IS outcome. Below, we describe in detail the workflow that starts from the list of rat genes and leads to a set of tagging SNPs (tagSNPs) that can be used in case–control studies with conventional TaqMan real-time PCR assays.

2. Materials and Methods

The main steps of the proposed workflow are shown in Figure 1. The starting point is a set of rat genes whose expression levels were evaluated 24 h after tMCAO [13]. The twenty-four genes with the most significant changes in expression level (change in expression >6-fold and p-value < 0.01) were chosen for further analysis.
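
The selection rule for the starting gene set can be written as a simple filter. The sketch below is illustrative only: the gene names and values are placeholders rather than data from [13], `select_genes` is a hypothetical helper, and it assumes that the fold change is stored as a ratio, so that a 6-fold decrease appears as a value below 1/6.

```python
from dataclasses import dataclass


@dataclass
class GeneExpression:
    gene: str
    fold_change: float  # expression after tMCAO / control (assumed representation)
    p_value: float


def select_genes(records, min_fold=6.0, max_p=0.01):
    """Keep genes changed more than min_fold-fold in either direction with p < max_p."""
    selected = []
    for rec in records:
        # Treat up- and down-regulation symmetrically.
        magnitude = max(rec.fold_change, 1.0 / rec.fold_change)
        if magnitude > min_fold and rec.p_value < max_p:
            selected.append(rec.gene)
    return selected


# Placeholder values for illustration only (not taken from the study).
records = [
    GeneExpression("GeneA", 8.2, 0.001),   # up-regulated, passes
    GeneExpression("GeneB", 0.1, 0.004),   # 10-fold down-regulated, passes
    GeneExpression("GeneC", 7.0, 0.05),    # fails on p-value
    GeneExpression("GeneD", 2.5, 0.0001),  # fails on fold change
]
print(select_genes(records))  # ['GeneA', 'GeneB']
```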

Figure 1. The workflow to identify the tagging SNPs for studying ischemic stroke outcomes.

The human orthologues of the rat genes were identified and compared by querying several resources: Ensembl [14], PANTHER 8.0 [15], PhylomeDB 4 [16], and MetaPhOrs [17]. The data from the Ensembl Genes 97 database were retrieved with BioMart through its web-based interface [18].
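
Comparing the answers of several orthology resources amounts to checking that each rat gene maps to the same single human gene everywhere. The following is a minimal sketch of such a check, assuming the query results have already been collected into dictionaries by hand; `consensus_orthologues` and the data layout are hypothetical and do not reproduce the API of any of the resources. The three example genes and their mappings follow Table 1.

```python
def consensus_orthologues(resources):
    """Split rat genes into those with one agreed human orthologue and the rest.

    resources: {resource name: {rat gene: frozenset of human gene symbols}}
    """
    all_genes = set()
    for mapping in resources.values():
        all_genes.update(mapping)

    agreed, unresolved = {}, []
    for gene in sorted(all_genes):
        calls = [mapping.get(gene, frozenset()) for mapping in resources.values()]
        first = calls[0]
        # One-to-one orthology: exactly one human gene, identical in every resource.
        if len(first) == 1 and all(call == first for call in calls):
            agreed[gene] = next(iter(first))
        else:
            unresolved.append(gene)
    return agreed, unresolved


# Hand-typed stand-ins for the query results (same answer from each resource).
calls = {"Ptx3": frozenset({"PTX3"}), "Il6": frozenset({"IL6"}), "Glycam1": frozenset()}
resources = {name: dict(calls) for name in ("Ensembl", "PANTHER", "PhylomeDB", "MetaPhOrs")}

agreed, unresolved = consensus_orthologues(resources)
print(agreed)      # {'Il6': 'IL6', 'Ptx3': 'PTX3'}
print(unresolved)  # ['Glycam1']
```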

The next step was the identification of SNPs within the human genes, including their 5′ and 3′ flanking regions of 5000 bp each. For the SNP frequencies to be relevant to a potential case–control study, the genotype data should be taken from an appropriate population [19]. To choose such a population, we used the collection of population samples of the 1000 Genomes Project. The project comprises one of the most comprehensively characterized sets of populations, with a detailed history of each of them [20]. For our purposes we selected the CEU population because its genotype data had been shown to be appropriate for selecting loci to assess genetic variability in most European populations, including those living in Russia [21,22,23,24]. We extracted the required set of SNPs from the bulk CEU genotype data using VCFtools (0.1.15) [25]. To capture the most common genetic variants, only SNPs with a minor allele frequency (MAF) higher than 10% were considered.
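
The two numerical rules in this step, the 5000 bp flanks and the MAF threshold, can be illustrated in a few lines. This is a sketch under stated assumptions, not the extraction procedure itself (which used VCFtools): the helper names are hypothetical, genotypes are assumed to be biallelic and coded as pairs of 0/1 alleles, and the toy genotypes are invented. The PTX3 coordinates are those listed in Table 1.

```python
FLANK_BP = 5000        # flanking length stated in the text
MAF_THRESHOLD = 0.10   # SNPs with MAF higher than 10% are kept


def region_with_flanks(start, end, flank=FLANK_BP):
    """Extend a gene interval by `flank` bp on both sides (1-based, clipped at 1)."""
    return max(1, start - flank), end + flank


def minor_allele_frequency(genotypes):
    """MAF of a biallelic SNP from diploid genotypes given as (allele, allele) pairs."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    alt_freq = sum(alleles) / len(alleles)
    return min(alt_freq, 1.0 - alt_freq)


def is_common(genotypes, threshold=MAF_THRESHOLD):
    """True if the SNP passes the 'MAF higher than threshold' rule."""
    return minor_allele_frequency(genotypes) > threshold


# PTX3 gene boundaries from Table 1, extended by the flanks.
print(region_with_flanks(157154578, 157161417))  # (157149578, 157166417)

# Invented genotypes: 4 alternative alleles out of 10, then 1 out of 20.
common = [(0, 0), (0, 1), (1, 1), (0, 0), (0, 1)]
rare = [(0, 0)] * 9 + [(0, 1)]
print(is_common(common), is_common(rare))  # True False
```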

Then, we explored the associations between the alleles of the selected loci using the squared correlation coefficient r² and revealed the patterns of linkage disequilibrium (LD) in each of the regions considered. To do this, we applied the CLUSTAG tool [26], the Tagger algorithm [27] implemented in Haploview 4.2 [28], and the gpart R package (version 1.2.0) [29], all with default parameters.
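
For reference, the r² statistic between two biallelic SNPs follows from the haplotype and allele frequencies: D = p_AB − p_A·p_B and r² = D² / (p_A(1 − p_A)·p_B(1 − p_B)). The sketch below computes it directly from phased haplotypes coded as 0/1. It is only an illustration of the formula; the tools named above have their own implementations, and the function name and toy haplotypes are assumptions made here.

```python
def r_squared(haplotypes_a, haplotypes_b):
    """Squared correlation r^2 between two biallelic SNPs from phased 0/1 haplotypes."""
    n = len(haplotypes_a)
    if n == 0 or n != len(haplotypes_b):
        raise ValueError("haplotype vectors must be non-empty and of equal length")
    p_a = sum(haplotypes_a) / n                       # frequency of allele 1 at SNP A
    p_b = sum(haplotypes_b) / n                       # frequency of allele 1 at SNP B
    p_ab = sum(a * b for a, b in zip(haplotypes_a, haplotypes_b)) / n  # 1-1 haplotype
    d = p_ab - p_a * p_b                              # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # convention chosen here for monomorphic sites
    return d * d / denom


# Invented haplotypes for eight chromosomes.
snp1 = [0, 0, 1, 1, 0, 1, 0, 1]
snp2 = [0, 1, 0, 1, 0, 1, 0, 1]
print(r_squared(snp1, snp1))  # 1.0
print(r_squared(snp1, snp2))  # 0.25
```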

The input files were generated with custom scripts from the VCF files obtained in the previous step. All of the tools could reveal patterns of LD (LD blocks) using distinct algorithms, but only CLUSTAG and Haploview could compute tagSNPs, which represent groups of highly correlated SNPs in a chromosomal region. Thus, these two tools were used to identify tagSNPs in the gene regions studied (threshold of squared correlation between SNPs r² ≥ 0.8). For both tools, we estimated the tagging effectiveness (TE) as the ratio of the number of tagSNPs to the number of SNPs they tagged.
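
To make the notions of tagSNP and TE concrete, the following sketch applies a generic greedy pairwise tagging at r² ≥ 0.8 and then computes TE as worded above. It is not the algorithm of CLUSTAG or Tagger, only a simplified stand-in; the function names, the SNP labels, and the r² values are invented, and it is assumed that a tagSNP counts among the SNPs it tags.

```python
def greedy_tag_snps(snps, r2, threshold=0.8):
    """Greedy pairwise tagging.

    Repeatedly pick the untagged SNP that captures the most still-untagged SNPs
    at r^2 >= threshold. Returns {tagSNP: [SNPs it captures, itself included]}.
    r2 maps frozenset({snp_x, snp_y}) to the pairwise r^2 (missing pairs count as 0).
    """
    def captures(tag, other):
        return tag == other or r2.get(frozenset((tag, other)), 0.0) >= threshold

    untagged = list(snps)
    tags = {}
    while untagged:
        best = max(untagged, key=lambda s: sum(captures(s, o) for o in untagged))
        captured = [o for o in untagged if captures(best, o)]
        tags[best] = captured
        untagged = [o for o in untagged if o not in captured]
    return tags


def tagging_effectiveness(tags):
    """TE as worded in the text: number of tagSNPs / number of SNPs they tag."""
    n_tagged = sum(len(captured) for captured in tags.values())
    return len(tags) / n_tagged


# Invented pairwise r^2 values for five SNPs.
snps = ["s1", "s2", "s3", "s4", "s5"]
r2 = {
    frozenset(("s1", "s2")): 0.95,
    frozenset(("s1", "s3")): 0.85,
    frozenset(("s2", "s3")): 0.90,
    frozenset(("s3", "s4")): 0.30,
    frozenset(("s4", "s5")): 0.82,
}
tags = greedy_tag_snps(snps, r2)
print(tags)                         # {'s1': ['s1', 's2', 's3'], 's4': ['s4', 's5']}
print(tagging_effectiveness(tags))  # 0.4
```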

Because the number of potential tagSNPs was large and not all of them could mark functionally important SNPs, the subsequent step was to annotate all the possible tagSNPs from high-LD regions with expression quantitative trait loci (eQTLs). For each gene, we downloaded the Significant Single-Tissue eQTLs using the web interface of the Genotype-Tissue Expression (GTEx) project (Release V8) [30]. The eQTLs were then intersected with the tagSNPs determined with the Tagger algorithm and filtered by tissue: Brain, Artery, Nerve, Blood, and Heart.
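
The intersection and tissue filter can be sketched as follows. This is a minimal illustration that assumes the downloaded eQTL table has been reduced to (SNP, tissue) pairs and that a tissue is selected when its name contains one of the five keywords; the function name, the SNP identifiers, and the example rows are placeholders, not GTEx records.

```python
TISSUE_KEYWORDS = ("Brain", "Artery", "Nerve", "Blood", "Heart")


def annotate_tag_snps(tag_snps, eqtls, keywords=TISSUE_KEYWORDS):
    """Return {tagSNP: sorted tissues} for tagSNPs that are eQTLs in a tissue of interest.

    eqtls: iterable of (snp_id, tissue_name) pairs.
    """
    wanted = set(tag_snps)
    hits = {}
    for snp_id, tissue in eqtls:
        if snp_id in wanted and any(key in tissue for key in keywords):
            hits.setdefault(snp_id, set()).add(tissue)
    return {snp_id: sorted(tissues) for snp_id, tissues in hits.items()}


# Placeholder identifiers and rows for illustration only.
tag_snps = ["snp_1", "snp_2", "snp_3"]
eqtls = [
    ("snp_1", "Brain - Cortex"),
    ("snp_1", "Whole Blood"),
    ("snp_2", "Lung"),             # tissue not of interest
    ("snp_4", "Artery - Tibial"),  # not a tagSNP
]
print(annotate_tag_snps(tag_snps, eqtls))
# {'snp_1': ['Brain - Cortex', 'Whole Blood']}
```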

At the final step, the tagSNPs that came from the Haploview Tagger runs with the maximal capture efficiency (maximal mean r²) and were identified as eQTLs were selected to form a list of markers for case–control association studies using an appropriate genotyping approach (e.g., TaqMan real-time PCR assay).
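
A compact way to express this final rule is shown below. The sketch assumes that each Tagger run has been summarized by its mean r² and its list of tagSNPs; this data layout, the function name, and the values are hypothetical.

```python
def select_markers(runs, eqtl_snps):
    """Take the run with the highest mean r^2 and keep its tagSNPs that are eQTLs."""
    best = max(runs, key=lambda run: run["mean_r2"])
    return [snp for snp in best["tag_snps"] if snp in eqtl_snps]


# Placeholder summaries of two tagging runs for one gene.
runs = [
    {"mean_r2": 0.93, "tag_snps": ["snp_1", "snp_5"]},
    {"mean_r2": 0.97, "tag_snps": ["snp_1", "snp_2", "snp_3"]},
]
eqtl_snps = {"snp_1", "snp_3"}
print(select_markers(runs, eqtl_snps))  # ['snp_1', 'snp_3']
```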

The scripts used in this research are freely available at the repository https://github.com/inzilico/tagSNP (accessed on 9 August 2020).

3. Results

We identified human orthologues for 23 of the 24 rat genes using Ensembl, PANTHER, PhylomeDB, and MetaPhOrs. The different repositories yielded the same list of orthologues, which showed a one-to-one relationship between human and rat genes. The exception was the Glycam1 gene, for which no orthologue was identified; human GLYCAM1 is a pseudogene. The genes extracted from Ensembl are presented in Table 1. The numbers of SNPs identified in each gene, including flanking regions, are given in Supplementary Table S1. The high-LD regions revealed with the three approaches were in good agreement. The TE values for CLUSTAG and Tagger are presented in Figure 2. In general, Tagger demonstrated higher TE values than CLUSTAG. Therefore, the tagSNPs revealed by Tagger were used for further analyses, in particular the search for eQTLs.

Table 1. The human orthologues of rat genes identified with Ensembl.

| Rat gene | Rat chrom | Rat start (bp) | Rat end (bp) | Human gene | Human chrom | Human start (bp) | Human end (bp) | Metric 1 | Metric 2 | Metric 3 | Metric 4 | Metric 5 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Adora2a | 20 | 16449385 | 16466147 | ADORA2A | 22 | 24813847 | 24838328 | 82 | 82 | 100 | 87.44 | 1 |
| Bcl3 | 1 | 81996116 | 82010351 | BCL3 | 19 | 45250962 | 45263301 | 82 | 83 | 100 | 59.56 | 1 |
| Ccl22 | 19 | 10668403 | 10675173 | CCL22 | 16 | 57392684 | 57400102 | 65 | 65 | 100 | 53.85 | 1 |
| Ccr1 | 8 | 132147929 | 132153481 | CCR1 | 3 | 46243200 | 46249887 | 80 | 80 | 75 | 100 | 1 |
| Cd14 | 18 | 29265353 | 29266946 | CD14 | 5 | 140011313 | 140013286 | 62 | 63 | 75 | 100 | 1 |
| Cd44 | 3 | 99339455 | 99426032 | CD44 | 11 | 35160417 | 35253949 | 71 | 68 | 100 | 91.02 | 1 |
| Csf2rb | 7 | 119544873 | 119558539 | CSF2RB | 22 | 37309670 | 37336491 | 56 | 56 | 100 | 100 | 1 |
| Emp1 | 4 | 233415324 | 233449254 | EMP1 | 12 | 13349650 | 13369708 | 76 | 74 | 75 | 100 | 1 |
| Fosl1 | 1 | 227755887 | 227764393 | FOSL1 | 11 | 65659520 | 65668044 | 92 | 91 | 100 | 100 | 1 |
| Glycam1 | 7 | 142951738 | 142953998 | * | | | | | | | | |
| Gpr6 | 20 | 47518790 | 47521561 | GPR6 | 6 | 110299514 | 110301921 | 94 | 94 | 50 | 99.76 | 1 |
| Gpr88 | 2 | 237334865 | 237339419 | GPR88 | 1 | 101003693 | 101007574 | 95 | 95 | 50 | 100 | 1 |
| Hmox1 | 19 | 25622556 | 25629372 | HMOX1 | 22 | 35776354 | 35790207 | 80 | 80 | 50 | 100 | 1 |
| Il6 | 4 | 3095536 | 3100112 | IL6 | 7 | 22765503 | 22771621 | 40 | 40 | 0 | 100 | 0 |
| Lcn2 | 3 | 16763059 | 16766466 | LCN2 | 9 | 130911350 | 130915734 | 64 | 64 | 100 | 100 | 1 |
| Lgals3 | 15 | 28094062 | 28106276 | LGALS3 | 14 | 55590828 | 55612126 | 82 | 78 | 100 | 96.41 | 1 |
| Mcm5 | 19 | 25637492 | 25681915 | MCM5 | 22 | 35796056 | 35821423 | 47 | 97 | 50 | 99.53 | 0 |
| Olr1 | 4 | 211883405 | 211905489 | OLR1 | 12 | 10310902 | 10324737 | 66 | 49 | 75 | 100 | 0 |
| Osmr | 2 | 75851664 | 75892056 | OSMR | 5 | 38845960 | 38945698 | 56 | 57 | 100 | 99.6 | 1 |
| Ptx3 | 2 | 177457263 | 177463073 | PTX3 | 3 | 157154578 | 157161417 | 81 | 81 | 100 | 100 | 1 |
| Rgs9 | 10 | 97225541 | 97298645 | RGS9 | 17 | 63133549 | 63223821 | 91 | 90 | 75 | 67.67 | 1 |
| Sdc1 | 6 | 43667444 | 43689898 | SDC1 | 2 | 20400558 | 20425194 | 77 | 76 | 100 | 100 | 1 |
| Serpine1 | 12 | 24653385 | 24663763 | SERPINE1 | 7 | 100770370 | 100782547 | 81 | 81 | 100 | 100 | 1 |
| Spp1 | 14 | 6653093 | 6658953 | SPP1 | 4 | 88896819 | 88904562 | 63 | 62 | 100 | 66.28 | 1 |

Metrics: 1, %id. target rat gene identical to query gene; 2, %id. query gene identical to target rat gene; 3, rat gene-order conservation score; 4, rat whole-genome alignment coverage; 5, rat orthology confidence [0 low, 1 high]. * Glycam1 has no orthologues in human.

Figure 2. Tagging effectiveness of the CLUSTAG and Tagger tools.

Figure 3 shows the patterns of LD and the tagSNPs revealed in the PTX3 gene. All the tagSNPs obtained are given in Supplementary Table S2. Only some of them were found to be eQTLs, and some of these were eQTLs in several tissues. On the other hand, no eQTLs were identified among the tagSNPs located in the BCL3, CCL22, FOSL1, GLYCAM1, GPR6, HMOX1, IL6, and LCN2 genes. After checking the identified sets of eQTLs, we selected nine tagSNPs as potential candidates for further analysis in a case–control study using real-time PCR with TaqMan probes. Eight of them were associated with changes of expression in brain tissues and are thus first-priority markers. The ninth locus, the SNP in the CCR1 gene, had the most extreme eQTL-related statistics, namely the p-value and the normalized effect size (10^−47 and −0.40, respectively).

Figure 3. The heatmap of LD between the SNPs in the region of the PTX3 gene in the CEU population. The numbers at the bottom of the LD plot designate the SNPs included in the analyses. Their coordinates, as well as the boundaries of the gene, are shown on the line below the LD plot. SNPs 2, 3, 4, 6, 9, 10, 13, 15, and 17 are the members of the first group of strongly associated (r² ≥ 0.8) SNPs, while SNPs 5, 7, 8, 14, 16, 18 and SNPs 1, 11, 12 form the second and the third groups, respectively.

4. Discussion

In this paper we proposed a workflow to identify genetic markers associated with the outcomes of ischemic stroke. It is based on the candidate gene approach, which requires prior knowledge about the system under consideration. We hypothesized that such information, in particular a list of candidate genes, can be taken from model studies of brain ischemia in the rat. Namely, we took 24 genes that exhibited substantial changes in their expression in the rat brain after tMCAO and, using the proposed workflow, obtained a list of SNPs (tagSNPs that are also eQTLs) that can potentially be applied in case–control studies.

Within the workflow, we additionally compared four different sources of human orthologues of rat genes and three different methods for identifying high-LD regions and selecting tagSNPs. Ensembl, PANTHER, PhylomeDB, and MetaPhOrs were chosen because they showed the best accuracy and call rate in orthologue inference [31]. They all returned the same list of human orthologues, and thus any of them can be used to search for orthologues. Nevertheless, in this study the human orthologue of each rat gene of interest was identified and confirmed with all four resources.

To explore patterns of LD and identify tagSNPs we used the CLUSTAG, Tagger, and gpart tools. These methods were chosen because they represent three different approaches to the problem of identifying groups of highly correlated SNPs. Although they all use LD and MAF to split the list of SNPs into high-LD regions (blocks), their algorithms differ. Tagger is based on the analysis of single markers and multi-marker haplotypes, CLUSTAG on cluster analysis, and gpart on graph analysis. gpart can effectively identify LD blocks of different sizes but cannot select tagSNPs. In terms of TE, Tagger outperformed CLUSTAG, and thus its tagSNPs were used for further analysis. However, the number of tagSNPs computed was still too high for practical use, which is why we annotated the SNPs from high-LD regions with eQTLs and selected the appropriate subset of tagSNPs manually. Because the expression of a particular gene can potentially be affected not only by loci located inside the gene (cis-eQTLs) but also by loci lying outside the gene (trans-eQTLs) [32], the workflow may be extended with a search for additional distant loci associated with changes in the expression of target genes, particularly the genes in which no cis-eQTLs were identified.

Like other studies aimed at establishing the genomic landscape of complex traits, our approach is based on the exploration of data of different types (mRNA transcription, population genetic variation, eQTLs) [33,34]. However, it does not rely on GWAS data, which are known to be poorly suited to identifying the actual causative variants and genes [35], and thus offers greater confidence from the start. Another characteristic of our approach is the greater genetic complexity it can handle: the use of whole-genome sequence data makes it possible to include a larger number of real (not imputed) genetic loci in the analysis. It should also be noted that although the workflow was applied to SNPs with a frequency higher than 10%, it can be used for selecting and testing SNPs with lower frequencies (e.g., loci with frequencies from 5% down to 1%). However, this will require larger human samples (i.e., population sample, case and control samples). The data of the Genome Aggregation Database project [36], which includes sequencing data from the 1000 Genomes Project and other projects, can be used to assemble samples of appropriate size.

A limitation of the proposed approach is that it has not been experimentally validated in a cohort of patients. Nevertheless, we believe that the workflow will help both in studying the genomics of individual variability in ischemic stroke outcomes and in looking inside the black box of their polygenic control.

Supplementary Materials

The following are available online at https://www.mdpi.com/2073-4425/12/3/328/s1, Table S1: The number of SNPs identified in human genes, Table S2: The gene based list of tagSNPs including those being eQTLs.

Author Contributions

Conceptualization, A.K. and S.L.; formal analysis, G.K.; investigation, A.K., I.F., V.S., and L.D.; methodology, G.K. and A.K.; project administration, L.D.; supervision, S.L.; writing—original draft, G.K. and A.K.; writing—review & editing, A.K., G.K., and S.L. All authors have read and agreed to the published version of the manuscript.

Funding

The study was funded by RFBR (Russian Foundation for Basic Research) according to the research project No 19-04-00397.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Conflicts of Interest

The authors have no conflict of interest to declare.

Footnotes

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Jood K., Ladenvall C., Rosengren A., Blomstrand C., Jern C. Family history in ischemic stroke before 70 years of age: The Sahlgrenska academy study on ischemic stroke. Stroke. 2005;36:1383–1387. doi: 10.1161/01.STR.0000169944.46025.09.
  • 2. Jickling G.C., Kittner S.J. A SNP-it of stroke outcome. Neurology. 2019;92:549–550. doi: 10.1212/WNL.0000000000007118.
  • 3. Torres-Aguila N.P., Carrera C., Muiño E., Cullell N., Cárcel-Márquez J., Gallego-Fabrega C., González-Sánchez J., Bustamante A., Delgado P., Ibañez L., et al. Clinical variables and genetic risk factors associated with the acute outcome of ischemic stroke: A systematic review. J. Stroke. 2019;21:276–289. doi: 10.5853/jos.2019.01522.
  • 4. Zhu M., Zhao S. Candidate gene identification approach: Progress and challenges. Int. J. Biol. Sci. 2007;3:420–427. doi: 10.7150/ijbs.3.420.
  • 5. Falcone G.J., Malik R., Dichgans M., Rosand J. Current concepts and clinical applications of stroke genetics. Lancet Neurol. 2014;13:405–418. doi: 10.1016/S1474-4422(14)70029-8.
  • 6. Söderholm M., Pedersen A., Lorentzen E., Stanne T.M., Bevan S., Olsson M., Cole J.W., Fernandez-Cadenas I., Hankey G.J., Jimenez-Conde J., et al. Genome-wide association meta-analysis of functional outcome after ischemic stroke. Neurology. 2019;92:E1271–E1283. doi: 10.1212/WNL.0000000000007138.
  • 7. Ibanez L., Heitsch L., Carrera C., Farias F.H., Dhar R., Budde J., Cruchaga C. Multi-ancestry genetic study in 5,876 patients identifies an association between excitotoxic genes and early outcomes after acute ischemic stroke. medRxiv. 2020. doi: 10.1101/2020.10.29.20222257.
  • 8. Visscher P.M., Wray N.R., Zhang Q., Sklar P., McCarthy M.I., Brown M.A., Yang J. 10 Years of GWAS Discovery: Biology, Function, and Translation. Am. J. Hum. Genet. 2017;101:5–22. doi: 10.1016/j.ajhg.2017.06.005.
  • 9. Zhu M.-J., Li X., Zhao S.-H. Digital candidate gene approach (DigiCGA) for identification of cancer genes. Methods Mol. Biol. 2010;653:105–129. doi: 10.1007/978-1-60761-759-4_7.
  • 10. White B.C., Sullivan J.M., DeGracia D.J., O’Neil B.J., Neumar R.W., Grossman L.I., Rafols J.A., Krause G.S. Brain ischemia and reperfusion: Molecular mechanisms of neuronal injury. J. Neurol. Sci. 2000;179:1–33. doi: 10.1016/S0022-510X(00)00386-5.
  • 11. Huang H., Winter E.E., Wang H., Weinstock K.G., Xing H., Goodstadt L., Stenson P.D., Cooper D.N., Smith D., Albà M.M., et al. Evolutionary Conservation and Selection of Human Disease Gene Orthologs in the Rat and Mouse Genomes. 2004. doi: 10.1186/gb-2004-5-7-r47. Available online: http://genomebiology.com/2004/5/7/R47 (accessed on 21 October 2019).
  • 12. Howells D.W., Porritt M.J., Rewell S.S.J., O’Collins V., Sena E.S., Van Der Worp H.B., Traystman R.J., MacLeod M.R. Different strokes for different folks: The rich diversity of animal models of focal cerebral ischemia. J. Cereb. Blood Flow Metab. 2010;30:1412–1431. doi: 10.1038/jcbfm.2010.66.
  • 13. Dergunova L.V., Filippenkov I.B., Stavchansky V.V., Denisova A.E., Yuzhakov V.V., Mozerov S.A., Limborska S.A. Genome-wide transcriptome analysis using RNA-Seq reveals a large number of differentially expressed genes in a transient MCAO rat model. BMC Genom. 2018;19:1–16. doi: 10.1186/s12864-018-5039-5.
  • 14. Hunt S.E., McLaren W., Gil L., Thormann A., Schuilenburg H., Sheppard D., Parton A., Armean I.M., Trevanion S.J., Flicek P., et al. Ensembl variation resources. Database. 2018;2018:1–12. doi: 10.1093/database/bay119.
  • 15. Mi H., Muruganujan A., Ebert D., Huang X., Thomas P.D. PANTHER version 14: More genomes, a new PANTHER GO-slim and improvements in enrichment analysis tools. Nucleic Acids Res. 2019;47:D419–D426. doi: 10.1093/nar/gky1038.
  • 16. Huerta-Cepas J., Capella-Gutiérrez S., Pryszcz L.P., Marcet-Houben M., Gabaldón T. PhylomeDB v4: Zooming into the plurality of evolutionary histories of a genome. Nucleic Acids Res. 2014;42:897–902. doi: 10.1093/nar/gkt1177.
  • 17. Pryszcz L.P., Huerta-Cepas J., Gabaldón T. MetaPhOrs: Orthology and paralogy predictions from multiple phylogenetic evidence using a consistency-based confidence score. Nucleic Acids Res. 2011;39:e32. doi: 10.1093/nar/gkq953.
  • 18. BioMart. Available online: http://grch37.ensembl.org/biomart/martview/a117c9fd556f996c278019dae08cfa00 (accessed on 5 July 2019).
  • 19. Wojcik G.L., Fuchsberger C., Taliun D., Welch R., Martin A.R., Shringarpure S., Carlson C.S., Abecasis G., Kang H.M., Boehnke M., et al. Imputation-aware tag SNP selection to improve power for large-scale, multi-ethnic association studies. G3 Genes Genomes Genet. 2018;8:3255–3267. doi: 10.1534/g3.118.200502.
  • 20. 1000 Genomes Project. Available online: ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502 (accessed on 2 April 2018).
  • 21. Lundmark P.E., Liljedahl U., Boomsma D.I., Mannila H., Martin N.G., Palotie A., Syvänen A.C. Evaluation of HapMap data in six populations of European descent. Eur. J. Hum. Genet. 2008;16:1142–1150. doi: 10.1038/ejhg.2008.77.
  • 22. Nelis M., Esko T., Mägi R., Zimprich F., Zimprich A., Toncheva D., Karachanak S., Piskácková T., Balascák I., Peltonen L., et al. Genetic structure of Europeans: A view from the North-East. PLoS ONE. 2009;4:e5472. doi: 10.1371/journal.pone.0005472.
  • 23. Khrunin A.V., Khokhrin D.V., Filippova I.N., Esko T., Nelis M., Bebyakova N.A., Bolotova N.L., Klovins J., Nikitina-Zake L., Rehnström K., et al. A genome-wide analysis of populations from European Russia reveals a new pole of genetic diversity in northern Europe. PLoS ONE. 2013;8:e58552. doi: 10.1371/journal.pone.0058552.
  • 24. Khrunin A.V., Aliev A.M., Limborska S.A. DNA markers from genome-wide association studies of cardiovascular diseases. Microbiol. Virol. 2018;33:245–247. doi: 10.3103/S0891416818040031.
  • 25. Danecek P., Auton A., Abecasis G., Albers C.A., Banks E., DePristo M.A. The variant call format and VCFtools. Bioinformatics. 2011;27:2156–2158. doi: 10.1093/bioinformatics/btr330.
  • 26. Ao S.I., Yip K., Ng M., Cheung D., Fong P.Y., Melhado I. CLUSTAG: Hierarchical clustering and graph methods for selecting tag SNPs. Bioinformatics. 2005;21:1735–1736. doi: 10.1093/bioinformatics/bti201.
  • 27. De Bakker P.I.W., Yelensky R., Pe’Er I., Gabriel S.B., Daly M.J., Altshuler D. Efficiency and power in genetic association studies. Nat. Genet. 2005;37:1217–1223. doi: 10.1038/ng1669.
  • 28. Barrett J.C., Fry B., Maller J., Daly M.J. Haploview: Analysis and visualization of LD and haplotype maps. Bioinformatics. 2004;21:263–265. doi: 10.1093/bioinformatics/bth457.
  • 29. Kim S.A., Brossard M., Roshandel D., Paterson A.D., Bull S.B., Yoo Y.J. gpart: Human genome partitioning and visualization of high-density SNP data by identifying haplotype blocks. Bioinformatics. 2019;35:4419–4421. doi: 10.1093/bioinformatics/btz308.
  • 30. Genotype-Tissue Expression (GTEx) Project. Available online: gtexportal.org (accessed on 8 July 2020).
  • 31. Altenhoff A.M., Boeckmann B., Capella-Gutierrez S., Dalquen D.A., DeLuca T., Forslund K. Standardized benchmarking in the quest for orthologs. Nat. Methods. 2016;13:425–430. doi: 10.1038/nmeth.3830.
  • 32. Nodzak C. Methods in Molecular Biology. Humana Press Inc.; Totowa, NJ, USA: 2020. Introductory methods for EQTL analyses; pp. 3–14.
  • 33. Zhao S., Jiang H., Liang Z.H., Ju H. Integrating Multi-Omics Data to Identify Novel Disease Genes and Single-Neucleotide Polymorphisms. Front. Genet. 2020;10:1–8. doi: 10.3389/fgene.2019.01336.
  • 34. Zheng Q., Ma Y., Chen S., Che Q., Chen D. The Integrated Landscape of Biological Candidate Causal Genes in Coronary Artery Disease. Front. Genet. 2020;11:1–14. doi: 10.3389/fgene.2020.00320.
  • 35. Gallagher M.D., Chen-Plotkin A.S. The Post-GWAS Era: From Association to Function. Am. J. Hum. Genet. 2018;102:717–730. doi: 10.1016/j.ajhg.2018.04.002.
  • 36. Karczewski K.J., Francioli L.C., Tiao G., Cummings B.B., Alföldi J., Wang Q., Collins R.L., Laricchia K.M., Ganna A., Birnbaum D.P., et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020;581:434–443. doi: 10.1038/s41586-020-2308-7.

Articles from Genes are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)
