Abstract
BACKGROUND:
Genetic ancestry plays a role in asthma health disparities.
OBJECTIVE:
To evaluate the impact of ancestry on, and identify genetic variants associated with, asthma, serum immunoglobulin E (IgE), and lung function.
METHODS:
436 Peruvian children (9–19 years) with asthma and 291 without asthma were genotyped using the Illumina Multi-Ethnic Global Array. Genome-wide proportions of Indigenous ancestry from continental America (NAT) and European ancestry from Iberian Populations in Spain (IBS) were estimated using ADMIXTURE. We assessed the relationship between ancestry and the phenotypes, and performed a genome-wide association study.
RESULTS:
Mean ancestry was 84.7% NAT (cases: 84.2%; controls: 85.4%) and 15.3% IBS (15.8%; 14.6%). Adjusting for asthma, NAT was associated with higher IgE (p<0.001) and IBS with lower IgE levels (p<0.001). NAT was associated with higher FEV1 percent-predicted (p=0.001) while IBS was associated with lower FEV1 in the controls but not cases. The HLA-DR/DQ region on chromosome 6 was strongly associated with IgE (rs3135348, p =3.438×10−10), and is independent of an association with the haplotype HLA-DQA1~HLA-DQB1:04.01~04.02 (p = 1.55×10−05). For lung function, we identified a locus (rs4410198, p = 5.536×10−11) mapping to chromosome 19, near a cluster of zinc finger interacting genes that co-localizes to the LncRNA CTD-2537I9.5. This novel locus was replicated in an independent sample of pediatric asthma cases with similar admixture from Brazil (p = 0.005).
CONCLUSION:
This study confirms the role of HLA in atopy, and identifies a novel locus mapping to a LncRNA for lung function that may be specific to children with Indigenous ancestry from continental America.
Keywords: asthma, immunoglobulin E, lung function, admixture, genome wide association analyses, Peru, ancestry, allergy
CAPSULE SUMMARY:
We replicate prior associations between genetic ancestry and asthma-related phenotypes in a cohort of Peruvian children. We identify a novel locus for lung function which was replicated in an independent sample of pediatric asthma cases.
INTRODUCTION
Asthma is the most prevalent non-communicable chronic disease in children and a major cause of emergency department visits, hospitalizations, and school absences.(1, 2) African Americans and Hispanics are more likely to have severe asthma and worse asthma-related outcomes.(3) Indeed, asthma is a complex disease to which environmental and socioeconomic factors may contribute. Nonetheless there is evidence that these factors do not completely explain these disparities, and that genetics plays a role in asthma development, atopy, and pulmonary function.(4)
The genome-wide association study (GWAS) approach has been successful in identifying numerous loci/genes associated with asthma.(5) Recently, Pividori, Schoettler, and colleagues reported 60 unique loci associated with childhood onset of asthma in a cohort of individuals of European ancestry.(6) Genetic risk loci may, however, differ by ancestral population.(7) For instance, the Genes-environments & Admixture in Latino Americans (GALA II) study confirmed a strong association between asthma and the ORMDL3 locus on 17q21, previously identified in European populations, but also identified a novel locus (MUC22) at 6p21 in Latinos with asthma.(5) The Consortium on Asthma among African Ancestry Populations (CAAPA) also showed that the effect of the genetic variants at the ORMDL3 locus differed based on the ancestral haplotype on which the variant was found.(8) Thus, the study of admixed populations such as those in Peru may provide opportunities to confirm previously identified loci in this specific population, and also facilitate the identification of novel loci.
In the International Study of Asthma and Allergies in Childhood (ISAAC), Peru was one of the countries with the highest prevalence of childhood asthma with a significant proportion of these children having severe disease.(9) Our study goal was to bridge gaps stemming from a minimal representation of this population in asthma genetics interrogations to-date. We do so by determining the global estimates of ancestry for children with and without asthma from the Genetic Asthma Susceptibility to Indoor Pollution in Peru (GASP study), and evaluating if these global ancestry estimates are associated with asthma status, lung function and total serum IgE levels (IgE). Acknowledging our limited sample size, we also performed a GWAS for asthma, IgE and lung function in this pediatric sample from Peru.
METHODS
Study Participants and Setting
We analyzed data from the GASP study, which evaluates the association between genetics, environment and asthma status among children and adolescents residing in Peru. As described in a previous paper(10), cases and controls were recruited from two adjacent communities in Lima, Pampas de San Juan de Miraflores and Villa El Salvador, between 2011 and 2014. Subjects were eligible if they were 9–19 years of age. Children were considered to have asthma if they reported a physician diagnosis of asthma and either asthma symptoms or use of asthma medications within the past year. Patients with another chronic respiratory condition, pregnancy, current or past history of tuberculosis, history of hospitalization for cardiovascular disease in the preceding 3 months, and/or history of ocular, abdominal, or thoracic surgery in the past 3 months were excluded. Patients who were unwilling or unable to provide a blood sample were also excluded. Controls were children without asthma symptoms or use of asthma medications in the past year, with a normal ratio of forced expiratory volume in the first second to forced vital capacity (FEV1/FVC) and FEV1 above 80%.
At baseline, questionnaires were completed by each child or caregiver. The questionnaire included demographic data, comorbidities including other allergic diseases, and data on asthma control and severity. Baseline anthropometry and lung function were assessed and predicted values and Z-scores calculated using multi-ethnic reference values derived by the Global Lung Function Initiative.(11) IgE was measured using the ImmunoCAP 250 (ThermoFisher Scientific, Waltham, Massachusetts); all samples were above detection thresholds. For specific IgE antibody testing, a level higher than 0.10 kUa/L indicated a positive IgE antibody response to mixes of 3 common allergens (animal, mold, and dust mite). The institutional review boards at the Johns Hopkins University School of Medicine (Baltimore, Maryland) and AB PRISMA (Lima, Peru) approved this study, and all subjects/parents provided written consent.
Genotyping and quality control for GASP samples
GASP subjects were genotyped using the Illumina Multi Ethnic Genotyping Array (MEGA) which was specifically designed to capture genetic variation in populations with significant African and Native American contribution.(12) Genotyping plates were balanced by asthma status and sex. Duplicate sample concordance, HapMap concordance and Mendelian errors of HapMap trios were used as controls for each set of 91 samples plated.(13) Any unresolved sex mismatches, and ambiguously imputed sex defined as samples with an F-stat between 0.20–0.65 as assessed using PLINK1.9, were removed.(14, 15) Thereafter, we excluded samples with a genotyping rate below 98.5%. We then excluded all samples with strong cryptic relatedness (PI_HAT>0.3) and excess heterozygosity (+/− 3 standard deviations, SD, from mean).
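The sample-level filters above can be illustrated with a minimal sketch. It assumes the per-sample summary statistics (sex-check F statistic, genotyping rate, largest pairwise PI_HAT, heterozygosity rate) have already been computed, for example with PLINK. The dictionary keys, the toy values, and the function name are hypothetical and are not taken from the study pipeline. Treating relatedness through a per-sample maximum PI_HAT is a simplification; in practice one member of each related pair is usually retained.

```python
def passes_sample_qc(sample, het_mean, het_sd):
    """Apply the sample-level thresholds quoted in the text to one sample.

    `sample` is a hypothetical dict; all key names are illustrative.
    """
    # Ambiguously imputed sex: F statistic between 0.20 and 0.65
    if 0.20 <= sample["sex_f"] <= 0.65:
        return False
    # Genotyping rate below 98.5%
    if sample["call_rate"] < 0.985:
        return False
    # Strong cryptic relatedness (PI_HAT > 0.3); simplified to a per-sample maximum
    if sample["max_pi_hat"] > 0.3:
        return False
    # Excess heterozygosity: more than 3 SD from the cohort mean
    if abs(sample["het"] - het_mean) > 3 * het_sd:
        return False
    return True


# Toy data (invented values, for illustration only)
toy_samples = [
    {"id": "S1", "sex_f": 0.98, "call_rate": 0.999, "max_pi_hat": 0.05, "het": 0.30},
    {"id": "S2", "sex_f": 0.40, "call_rate": 0.999, "max_pi_hat": 0.05, "het": 0.30},
    {"id": "S3", "sex_f": 0.01, "call_rate": 0.970, "max_pi_hat": 0.05, "het": 0.31},
    {"id": "S4", "sex_f": 0.99, "call_rate": 0.999, "max_pi_hat": 0.50, "het": 0.30},
    {"id": "S5", "sex_f": 0.02, "call_rate": 0.999, "max_pi_hat": 0.05, "het": 0.40},
]

kept_ids = [s["id"] for s in toy_samples
            if passes_sample_qc(s, het_mean=0.30, het_sd=0.02)]
print(kept_ids)
```

In this toy example only S1 survives; each of the other samples fails exactly one of the four criteria.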
Genotyped SNPs that did not pass quality control procedures were removed using the following criteria: <99% call rate, and/or deviations from Hardy-Weinberg equilibrium (p<10−6). Ancestral outliers in the dataset were evaluated using a set of LD-pruned SNPs in a combined dataset of GASP and Thousand Genomes Project (TGP) samples. Principal components analysis was performed using King, GENESIS, and PC-AiR.(16) Study samples ≥6 SD away from the mean of the Peru sample for principal components (PCs) 1 and 2 were dropped. Approximately 15% of subjects were filtered out, resulting in a total of 743 samples with high-quality genotype data (Supplementary Figure 1). Prior to analyses, we excluded 16 additional subjects due to ambiguity in their case-control status, leaving 727 subjects (Pampas: 408; Villa: 319).
Assessment of genetic ancestry: ADMIXTURE and PCA
Three ancestral reference populations were used in the ancestry deconvolution for the GASP samples: 107 TGP samples from the Iberian Population in Spain (IBS), 88 TGP Yoruban (YRI) samples, and 43 Indigenous ancestry samples from continental America (NAT).(17) The specific NAT samples selected were those reported to have no admixture themselves: Bolivian Aymara (n=25), Maya (n=6), Mixtec (n=5), Nahua (n=1), Peruvian Quechua (n=2), and Tlapanec (n=4). We used IBS as the European population given Spanish introgression and gene flow patterns in Peruvians following the Spanish conquest.(18) SNPs were LD-pruned based on a SNP window size of 50, a variant count per step of 5, and a variance inflation factor of 2(14, 15) on the combined dataset of GASP and the three reference populations (IBS, YRI and NAT). The number of SNPs after LD-pruning was 107,095. Principal components analysis (PCA) was performed on this set of 107,095 SNPs, including all three reference populations, using King, GENESIS, and PC-AiR.(16)
To obtain global estimates of ancestry, we used ADMIXTURE.(19) The first step was an unsupervised analysis conducted for K=1–5 clusters with cross-validation, to find the number of putative source populations in GASP alone. We found that K=2 had the smallest cross-validation error. Data from GASP were then merged with the IBS and NAT reference populations, based on the confirmation from the PCA above that these were the two major contributing reference populations.(17) Global estimates of admixture using maximum likelihood estimates were then obtained assuming K=2 ancestral clusters.
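The choice of K by cross-validation can be sketched as follows. ADMIXTURE, when run with its cross-validation option, writes one line of the form `CV error (K=2): 0.52305` per run. The snippet below parses such lines and returns the K with the smallest error. The log text and error values are invented for illustration and do not reproduce the study's output; the function name is likewise hypothetical.

```python
import re

# Matches lines such as "CV error (K=2): 0.52305"
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.]+)")


def best_k(log_text):
    """Return (K with the smallest CV error, dict of all CV errors by K)."""
    errors = {int(k): float(err) for k, err in CV_LINE.findall(log_text)}
    if not errors:
        raise ValueError("no CV error lines found")
    return min(errors, key=errors.get), errors


# Hypothetical concatenated log output for K = 1..5 (values are made up)
example_log = """
CV error (K=1): 0.61210
CV error (K=2): 0.58340
CV error (K=3): 0.58910
CV error (K=4): 0.59870
CV error (K=5): 0.60450
"""

k, cv_errors = best_k(example_log)
print(k, cv_errors[k])
```

With these invented values the minimum falls at K=2, mirroring the pattern reported above.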
Imputation of data from the 1000 Genomes Project Pilot
Subsequent to the SNP-based quality control described above, we removed SNPs with a minor allele frequency (MAF) < 0.0001, and all ambiguous allele SNPs. Additionally, strand flips were resolved prior to imputation using the Michigan Imputation Server against the TGP.(20) A total of 956,459 SNPs were used as input for imputation. Only imputed variants with an imputation quality Rsq>0.3 were used in downstream association analysis; these included SNPs and short indels (insertion/deletion variants) returned from the imputation server. Additional filters on allele frequency are described below.
Analysis of association between global genetic ancestry and asthma susceptibility, total serum IgE levels, and lung function
We tested for an association between the global estimates of ancestry and asthma, lung function and total serum IgE (IgE). For these analyses, the global estimates of ancestry from ADMIXTURE were used for %IBS and %NAT. For asthma susceptibility, we used logistic regression, and for IgE and lung function we used a linear model. All models included age, sex, and socioeconomic status (SES). Asthma status was included as a covariate for analyses of lung function and IgE that combined cases and controls. Other potential confounders were also included if they were associated with the phenotype in univariate models. For asthma, body mass index (BMI) was also included as a covariate, as was site in models combining Pampas and Villa. For lung function, we used prebronchodilator FEV1 percent predicted, and included height, BMI, and site as additional covariates. Results were, however, similar using Z-scores. IgE was log-transformed, and additionally adjusted for site.
Statistical Models for Genetic Association Analysis
All tests for association were performed in the R package, GENESIS version 2.4.0(21, 22) testing each variant (genotyped and imputed) under an additive model. Primary GWAS were performed for (1) asthma, (2) log-transformed IgE (log10[IgE]) in the combined sample of cases and controls, and (3) lung function also in the combined sample using covariates specified below. Stratified analyses by case and control group were also performed for any GWAS signals identified for IgE or lung function. An MAF of >=5% was applied universally for genotyped and imputed SNPs given our small sample size. As described above, any imputed SNP with an Rsq <= 0.3 and any genotyped SNP with a <99% call rate, and/or deviations from Hardy-Weinberg equilibrium (p<10−6) was also discarded. Finally, the quality of the resulting data set used for the genomewide analysis was confirmed by plotting Q-Q plots (Supplementary Figure 2).
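The variant-level inclusion rules can be summarized in a short sketch. It assumes each variant is represented by a small record holding its alternate-allele frequency, whether it was imputed, and the relevant quality metric. The field names and toy variants are hypothetical. Computing MAF as the smaller of the alternate-allele frequency and its complement is an assumption of this sketch.

```python
def passes_variant_qc(v):
    """Apply the MAF, Rsq, call-rate and HWE thresholds described in the text."""
    maf = min(v["alt_freq"], 1.0 - v["alt_freq"])
    if maf < 0.05:                      # MAF >= 5% required for all variants
        return False
    if v["imputed"]:
        return v["rsq"] > 0.3           # imputed: imputation quality Rsq > 0.3
    # genotyped: call rate >= 99% and no HWE deviation at p < 1e-6
    return v["call_rate"] >= 0.99 and v["hwe_p"] >= 1e-6


# Toy variants (invented, for illustration only)
toy_variants = [
    {"id": "imputed_ok", "alt_freq": 0.296, "imputed": True, "rsq": 0.832},
    {"id": "imputed_low_rsq", "alt_freq": 0.300, "imputed": True, "rsq": 0.25},
    {"id": "rare", "alt_freq": 0.010, "imputed": True, "rsq": 0.90},
    {"id": "geno_ok", "alt_freq": 0.719, "imputed": False,
     "call_rate": 0.999, "hwe_p": 0.40},
    {"id": "geno_hwe", "alt_freq": 0.250, "imputed": False,
     "call_rate": 0.999, "hwe_p": 1e-8},
]

kept = [v["id"] for v in toy_variants if passes_variant_qc(v)]
print(kept)
```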
A logistic mixed effect model was used for asthma and linear mixed effects models for IgE and lung function. All three models included GASP-specific PCs as covariates in addition to age, sex, and SES. The PCs used in these tests for association were derived using a total of 246,361 LD-pruned genotyped SNPs in the GASP Peruvian samples, i.e. these PCs were calculated without any reference ancestral samples. This is a larger set of SNPs than that used in the ancestry analysis described above: there, we had to limit the starting set of SNPs to those overlapping between the MEGA array and the array used for the 43-sample NAT reference population(17); here, we were able to perform LD pruning on the full set of SNPs from the MEGA array. The first 20 PCs generated using King, GENESIS and PC-AiR(16) were visually examined on a scree plot; the first 4 PCs were identified as accounting for the most variance in the dataset and were used as covariates in the association analysis models. Site was not included as a covariate, as ancestry differences were addressed with the inclusion of the PCs. BMI was also included as a covariate for asthma and lung function. For IgE and lung function, asthma was included as a covariate to account for any association that may be with asthma rather than the IgE or lung function phenotype. For any identified associations, we further performed stratified analyses within the cases and controls separately. Data cleanup was performed, and Manhattan and Q-Q plots were generated, using custom-written scripts and the qqman R package version 0.1.(23)
We implemented standard GWAS thresholds for discovery (p<5×10−8) and suggestive evidence (p<1×10−5) for each of the three phenotypes. Additionally, we sought to replicate the 60 significant childhood asthma loci identified by Pividori and Schoettler, et al(6) within extracted flanking regions of +/− 0.4Mb (similar to scale of Pividori and Schoettler, et al).(6) We used two significance thresholds for these lookups: first, a simple correction for number of loci tested which assumes a single causal variant per locus (p < 0.05/60), and second, a correction for the number of independent SNPs (at an Rsq of 0.7) tested across the 60 loci (p<0.05/1,599).
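The thresholds in this paragraph are simple arithmetic and can be checked with a short sketch. The function names and tier labels are illustrative only; the numeric thresholds are those stated in the text.

```python
GENOME_WIDE = 5e-8          # discovery threshold
SUGGESTIVE = 1e-5           # suggestive threshold
PER_LOCUS = 0.05 / 60       # Bonferroni over 60 loci      (about 8.3e-4)
PER_SNP = 0.05 / 1599       # Bonferroni over 1,599 independent SNPs (about 3.1e-5)


def gwas_tier(p):
    """Classify a GWAS p-value against the discovery and suggestive thresholds."""
    if p < GENOME_WIDE:
        return "genome-wide"
    if p < SUGGESTIVE:
        return "suggestive"
    return "not significant"


def lookup_tier(p):
    """Classify a p-value from the lookup of previously reported loci."""
    if p < PER_SNP:
        return "passes SNP-level correction"
    if p < PER_LOCUS:
        return "passes locus-level correction only"
    return "not replicated"


print(f"{PER_LOCUS:.2e}", f"{PER_SNP:.2e}")
print(gwas_tier(5.536e-11), "|", gwas_tier(4.257e-6))
```

The SNP-level correction is roughly 27 times stricter than the locus-level correction, which is why the two are reported separately.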
Colocalization analysis of GWAS signals and cis-eQTLs
GTEx analysis V7 (dbGap Accession phs000424.v7.p2) eQTL results were downloaded from GTEx portal for lung and whole blood tissues, and false discovery rate (FDR) of ≤0.05 was used to ascertain the significant transcripts. We performed pairwise colocalization analysis of GWAS signals with cis-eQTL data using R-package coloc.(24, 25) The method uses approximate Bayes Factor computations and tests pairwise colocalization of SNPs in GWAS dataset with eQTLs. It generates five posterior probabilities (PP0, PP1, PP2, PP3 and PP4) for the locus weighting the evidence for competing hypotheses of either no colocalization or colocalization.(24) A PP3 ≥ 75% indicates evidence against colocalization, and in contrast, PP4 ≥ 75% supports evidence of colocalization; therefore the first step is to find all genes with a PP3<75% and PP4≥ 75%, and then examine the posterior probability for each SNP within the region for the likely causal variant.
Imputation of HLA alleles, haplotypes and amino acids.
HLA alleles for HLA class I genes HLA-A, HLA-B, HLA-C and HLA class II genes HLA-DPB1, HLA-DQA1, HLA-DQB1 and HLA-DRB1 were imputed using the HIBAG R package v1.3(26, 27) employing attribute bagging method to impute HLA alleles using hg19 genome assembly and Illumina Infinium Multi-Ethnic Global BeadChip prediction model with Two-field (4-digit) resolution. We filtered out all the genes with call rate <95% and analysis was limited to those alleles with a frequency ≥5% for each passing gene. Each allele was then analyzed under an additive model in HIBAG for asthma, log-transformed IgE and lung function incorporating the same covariates as used in the GWAS analysis above.
HLA haplotypes were generated across HLA-DQA1 and HLA-DRB1 using the BIGDAWG R package.(28) Only samples with complete haplotype data (i.e. those with alleles at both genes) were retained for analysis, and analysis was limited to those haplotypes with a frequency ≥5%. Each haplotype was then analyzed under an additive model in PLINK using the same models for asthma, log10[IgE] and lung function stated above for the SNP association analysis, because BIGDAWG cannot accommodate covariates or quantitative traits. Additional analyses were performed within PLINK to facilitate a conditional analysis evaluating the independence of the identified SNP from the HLA alleles/haplotypes.
Replication of Chromosome 19 locus for lung function.
Replication of the genome-wide significant association on chromosome 19 with lung function (FEV1 percent-predicted) in GASP was assessed in four studies: (1) ProAR (Program for Control of Asthma and Allergic Rhinitis), (2) the SCAALA (Social Changes, Asthma and Allergies in Latin America) Program, (3) Hispanic participants randomized to long-agonist and inhaled corticosteroids with or without rescue long-acting beta agonist and inhaled corticosteroid combination therapy for six months in the Astra-Zeneca-sponsored COMPASS trial,(29) and (4) a large published GWAS of lung function in 400,102 individuals of European ancestry from the UKBiobank and the SpiroMeta Consortium.(30) A total of 12 SNPs with a discovery p<5E-08 were examined for replication in each of these datasets. No correction for multiple testing was made, as these 12 SNPs represent a single association signal in GASP.
The ProAR severe asthma case-control study(31) was carried out in the city of Salvador, Bahia, Brazil, in 2013. Adult individuals (> 18 years old) with no other lung diseases were recruited, and a total of 1,065 individuals with FEV1 percent-predicted data available were included in the analysis. ProAR DNA samples were genotyped on the Illumina MEGA array and imputed to the CAAPA reference panel on the Michigan imputation server. Association tests were performed using PLINK, including age, sex and the first principal component as covariates. The SCAALA cohort aimed to assess risk factors for asthma and allergies in children and adolescents aged 4–11 years living in the city of Salvador (State of Bahia, Brazil) and, later on, the genetic determinants of such conditions. For this replication we included 947 unrelated children with available FEV1 percent-predicted data.
SCAALA DNA samples were genotyped using a commercial panel 2.5 HumanOmni Beadchip available from Illumina (www.illumina.com) and imputed to the 1000 Genomes reference panel on the Michigan imputation server. Association tests were performed using PLINK, including age, sex and the first four principal components as covariates.
COMPASS trial participants were genotyped on the Illumina Human OmniExpress-12v1 chip and imputed to the TOPMed freeze 5 reference panel on the Michigan imputation server. Pulmonary function was assessed from data obtained during the run-in period of the COMPASS trial in 249 individuals with asthma from Argentina and 312 from Mexico, of which 471 were adults (>18 years of age). Association tests were performed using PLINK, including age, sex and the first four principal components as covariates (N=947).
Summary statistics from the UKBiobank and SpiroMeta Consortium GWAS were obtained from the GWAS catalog (ftp://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/ShrineN_30804560_GCST007432/Shrine_30804560_FEV1_meta-analysis.txt.gz), and the chr19 variants associated with lung function in GASP were extracted and assessed for replication.
RESULTS
Study subject characteristics.
Characteristics of participants at enrollment are shown in Table 1. The cohorts from the two sites were similar in age, gender, IgE, FEV1 percent predicted, and distribution of the two ancestral populations. There was a higher proportion of males among cases compared to controls (56 vs 47%), cases had higher IgE levels (1233 vs 652 kU/L), and cases from both sites had significantly lower FEV1 (percent predicted: 114 vs 118%, p=0.003). Figure 1A shows the ADMIXTURE global genetic ancestry proportions among the study cohort by case and control status and sampling site, allowing for 2-way admixture. The 2-way admixture is also evident from the PCA in Figure 1B, which shows the distribution of the GASP samples relative to the three ancestral reference groups, with variability largely due to NAT and IBS contribution. On average, the global genetic ancestry in the study subjects was 15% European (IBS) and 85% Indigenous ancestry. Differences were observed between the two sites (Table 1, Figure 1), with higher Indigenous and lower European ancestry in Pampas (86%/14%) than in Villa (82%/18%) (p<0.001 for both).
Table 1:
Characteristic | Pampas: Cases (n=256) | Pampas: Controls (n=152) | Pampas: p value | Villa: Cases (n=180) | Villa: Controls (n=139) | Villa: p value |
---|---|---|---|---|---|---|
Demographic characteristics | | | | | | |
Age (years), mean (SD) | 13.5 (2.7) | 13.8 (2.6) | 0.28 | 13.3 (2.6) | 12.8 (2.7) | 0.07 |
Male, n (%) | 151 (59) | 73 (48) | 0.03 | 94 (52) | 65 (47) | 0.33 |
SES, mean (SD) [a, b] | −0.35 (1.6) | −0.81 (1.6) | 0.005 | 0.92 (1.4) | 0.38 (1.5) | 0.001 |
Total IgE, kU/L (%) [c] | 1202 (1523) | 599 (758) | <0.001 | 1257 (1498) | 658 (1092) | <0.001 |
Atopic diseases, n (%) [d] | | | | | | |
Eczema | 36 (14.6) | 19 (13.2) | 0.69 | 17 (10.3) | 3 (2.2) | 0.005 |
Rhinitis | 21 (15.2) | 2 (3.7) | 0.03 | 21 (20.4) | 1 (3.1) | 0.02 |
Ancestry, mean (SD) | | | | | | |
Indigenous (NAT) | 85.7 (9.2) | 86.8 (9.7) | 0.22 | 82.0 (10.3) | 83.9 (9.6) | 0.10 |
European (IBS) | 14.3 (9.2) | 13.2 (9.7) | 0.22 | 18.0 (10.3) | 16.1 (9.6) | 0.10 |
Clinical characteristics | | | | | | |
BMI, n (%) | | | | | | |
Underweight/Normoweight [e] | 146 (57.0) | 102 (67.1) | 0.10 | 77 (42.8) | 83 (59.7) | 0.01 |
Overweight | 77 (30.1) | 32 (21.1) | | 66 (36.7) | 37 (26.6) | |
Obese | 33 (12.9) | 18 (11.8) | | 37 (20.6) | 19 (13.7) | |
Baseline FEV1 (percent predicted), mean (SD) | 114.1 (14.5) | 118.4 (13.8) | 0.003 | 113.6 (15.6) | 119.2 (13.9) | <0.001 |
Baseline FEV1 (raw) in liters, mean (SD) | 2.86 (0.82) | 3.03 (0.80) | 0.04 | 2.86 (0.78) | 2.80 (0.77) | 0.48 |
Baseline FEV1 (Z-score), mean (SD) | 1.22 (1.25) | 1.58 (1.20) | 0.004 | 1.18 (1.34) | 1.66 (1.22) | <0.001 |
Asthma severity, n (%) | | | | | | |
Mild intermittent | 40 (20.4) | N/A | N/A | 31 (22.6) | N/A | N/A |
Mild persistent | 82 (41.8) | N/A | | 64 (46.7) | N/A | |
Moderate persistent | 41 (20.9) | N/A | | 23 (16.8) | N/A | |
Severe persistent | 33 (16.8) | N/A | | 19 (13.9) | N/A | |
Sensitivity to environmental allergens, n (%) | | | | | | |
House dust mix (D. pter, D. far, Bla g) | 195 (76) | 88 (58) | 0.001 | 132 (73) | 76 (55) | <0.001 |
Mold and yeast mix (pen, clad, asp, candida, alternaria, setomelanomma) | 154 (62) | 49 (35) | <0.001 | 76 (42) | 35 (25) | 0.001 |
Animal and epidermal mix (Can f, Fel d, Mus m1, rat, guinea pig) | 118 (46) | 30 (20) | <0.001 | 79 (44) | 28 (20) | <0.001 |
Exhaled nitric oxide, eNO (ppb), median [IQR] | 37.1 (34.9) | 21.7 (24.6) | <0.001 | 44.2 (41.0) | 24.3 (31.1) | <0.001 |
a: SD, standard deviation.
b: Socioeconomic status score based on a principal component analysis, which includes 12 household assets, parental level of education, and number of persons in the household. A higher number denotes higher socioeconomic status.
c: All IgE levels in the samples were above the level of assay detection.
d: Eczema (390 in Pampas, 300 in Villa; 690 of 727 subjects) and rhinitis (192 in Pampas, 135 in Villa; 327 of 727 subjects) data were collected only on a subset of patients.
e: Between both sites, only 3 individuals fell into the ‘Underweight’ category.
Association of genetic ancestry with asthma susceptibility, total serum IgE, and lung function
We did not find any significant association between Indigenous ancestry or European ancestry and asthma susceptibility (Supplementary Table 1). European ancestry was associated with lower FEV1 (percent predicted and z-score) (FEV1-percent predicted: β: −.192; p = 0.001), and given 2-way admixture the exact opposite is noted for Indigenous ancestry (β: .192; p = 0.001) (Supplementary Table 2). In analyses stratified by asthma status, these associations between ancestry and lung function remained statistically significant in controls but not in cases; although the direction of effect was the same in both groups, the magnitude of the effect was larger in controls (NAT: β: .272; p = 0.001) than in cases (NAT: β: .111; p = NS). As seen in Supplementary Table 2, European ancestry was associated with lower IgE levels (β: −.010; p < 0.001), and this effect was highly consistent between asthmatic cases (β: −.009; p = 0.004) and controls (−.025011; p = 0.02). Once again, given 2-way admixture, the exact opposite was observed for Indigenous ancestry. Given the observation in Figure 1B of some African contribution to a small set of samples, a sensitivity analysis was performed using K=3 in ADMIXTURE. We were able to confirm the robustness of our K=2 results; there was virtually no change in the NAT estimation (i.e. any YRI ancestry was absorbed into the IBS ancestry) and the described patterns with NAT remained unchanged.
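The mirror-image coefficients for the two ancestries follow directly from the two-way admixture constraint. As a short sketch, assume ancestry is expressed in percent so that NAT + IBS = 100 for every subject, and write Z for the remaining covariates (notation introduced here for illustration only):

```latex
\begin{aligned}
y &= \alpha + \beta_{\mathrm{IBS}}\,\mathrm{IBS} + \gamma^{\top} Z + \varepsilon \\
  &= \alpha + \beta_{\mathrm{IBS}}\,(100 - \mathrm{NAT}) + \gamma^{\top} Z + \varepsilon \\
  &= \left(\alpha + 100\,\beta_{\mathrm{IBS}}\right) - \beta_{\mathrm{IBS}}\,\mathrm{NAT} + \gamma^{\top} Z + \varepsilon
\end{aligned}
```

Hence the coefficient for NAT equals the negative of the coefficient for IBS, with an identical standard error and p-value; only the intercept changes. The same argument holds if ancestry is coded as a proportion, with 1 in place of 100.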
Genomewide Association Analyses
Manhattan plots and Q-Q plots of the primary GWAS for asthma, IgE and lung function are shown in Figure 2 and Supplementary Figure 2 (lambda ~1), respectively.
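The genomic inflation factor (lambda) quoted alongside the Q-Q plots is conventionally computed as the median observed 1-df chi-square statistic divided by its expected median under the null (about 0.455). The following is a minimal sketch of that conventional calculation using only the Python standard library; it is not the study's own script, and the evenly spaced p-values are synthetic.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a 1-df chi-square: square of the 75th percentile of N(0, 1), about 0.4549
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Lambda = median(observed chi-square) / median(expected chi-square, 1 df).

    Each two-sided p-value (0 < p <= 1) is converted to a 1-df chi-square
    statistic via the squared normal quantile at p / 2.
    """
    chi2 = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Synthetic null p-values, evenly spaced on (0, 1)
null_p = [(i + 0.5) / 1000 for i in range(1000)]
lam = genomic_inflation(null_p)
print(round(lam, 3))
```

A value close to 1, as with these synthetic null p-values, indicates no systematic inflation of the test statistics.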
Asthma status
We did not find any variants that met GWAS significance (p < 5×10−8) in the asthma GWA analysis (Figure 2A). However, there were 13 variants spanning 9 loci that met the suggestive GWAS threshold (p < 1×10−5); 12 out of 13 map to intronic regions, while one SNP is located in an intergenic region, with the nearest gene being IP6K2. (Supplementary table 3).
Total serum IgE
Association analysis for IgE resulted in 30 variants that met the GWAS threshold, all of which are located on chromosome 6 in the HLA region, closest to HLA-DQA1 (Table 2, Figure 2B). The most significant SNP in the region was rs3135348 (p=3.438×10−10). An additional 414 variants in the same HLA region also met the suggestive GWAS threshold (Supplementary table 3). The region spans a broader range of HLA genes containing HLA-DR/DQ genes. There were an additional 68 variants that met the suggestive GWAS threshold outside of the HLA region, primarily spanning chromosomes 1, 2, 16, and 17 (Supplementary table 3).
TABLE II.
rslD | Chr | hg19 Position | Ref/Alt | MAF | Genotyped | R 2 | Nearby gene(s) | Asthma |
log10 (total serum IgE) |
Lung function |
|||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
OR | P value | Effect size | P value | Effect size | P value | ||||||||
| |||||||||||||
rs6913471 | 6 | 32339925 | T/A | 0.273 | Imputed | 0.955 | 0.965605416 | .7811 | 0.215 | 4.93E–08 | −1.099 | .2016 | |
rs4496841 | 6 | 32389997 | C/T | 0.719 | Imputed | 0.988 | 0.854704059 | .1852 | −0.209 | 1.39E–08 | 1.363 | .08916 | |
rs3135359 | 6 | 32390578 | T/C | 0.736 | Imputed | 0.995 | 0.878973966 | .2832 | −0.218 | 5.33E–09 | 1.147 | .1599 | |
rs9296027 | 6 | 32393062 | C/G | 0.241 | Imputed | 0.995 | 1.093080656 | .4805 | 0.218 | 2.74E–08 | −1.303 | .1278 | |
rs9469109 | 6 | 32393161 | A/T | 0.241 | Imputed | 0.995 | 1.094174284 | .4793 | 0.218 | 2.73E–08 | −1.304 | .1275 | |
rs3135348 | 6 | 32394098 | A/G | 0.687 | Imputed | 0.966 | 0.922193691 | .4957 | −0.229 | 3.44E–10 | 1.464 | .06675 | |
rs9501400 | 6 | 32394184 | G/A | 0.247 | Imputed | 0.996 | 1.112934254 | .398 | 0.221 | 1.50E–08 | −1.433 | .09281 | |
rs9501622 | 6 | 32394251 | T/A | 0.241 | Imputed | 0.998 | 1.095269005 | .4695 | 0.218 | 2.64E–08 | −1.311 | .1254 | |
rs732163 | 6 | 32394911 | G/A | 0.24 | Imputed | 1.000 | 1.11516235 | .3907 | 0.218 | 2.70E–08 | −1.213 | .1563 | |
rs4959100 | 6 | 32397813 | C/T | 0.237 | Imputed | 1.000 | 1.122995872 | .3612 | 0.216 | 4.63E–08 | −1.358 | .1152 | |
rs9469110 | 6 | 32398525 | G/T | 0.237 | Genotyped | 1.000 | 1.122995872 | .3613 | 0.216 | 4.64E–08 | −1.358 | .1151 | |
rs3129854 | 6 | 32398781 | G/C | 0.719 | Imputed | 1.000 | 0.862431115 | .2136 | −0.224 | 1.16E–09 | 1.231 | .1263 | |
NA | 6 | 32398853 | T/TA | 0.719 | Imputed | 1.000 | 0.862431115 | .2136 | −0.224 | 1.16E–09 | 1.231 | .1263 | |
rs984778 | 6 | 32400088 | C/T | 0.719 | Genotyped | 1.000 | 0.862431115 | .2136 | −0.224 | 1.16E–09 | 1.231 | .1263 | |
rs9501626 | 6 | 32400344 | C/A | 0.237 | Genotyped | 1.000 | 1.122995872 | .3613 | 0.216 | 4.64E–08 | −1.358 | .1151 | |
rs3135338 | 6 | 32401217 | C/T | 0.719 | Genotyped | 1.000 | 0.862431115 | .2136 | −0.224 | 1.16E–09 | 1.231 | .1263 | |
rs3135336 | 6 | 32401829 | G/A | 0.72 | Imputed | 0.995 | 0.860707976 | .2091 | −0.225 | 1.16E–09 | 1.238 | .1254 | |
rs3135335 | 6 | 32401845 | C/G | 0.72 | Imputed | 0.995 | 0.860707976 | .2091 | −0.225 | 1.16E–09 | 1.238 | .1254 | |
rs2027856 | 6 | 32402705 | G/A | 0.237 | Genotyped | 1.000 | 1.122995872 | .3613 | 0.216 | 4.64E–08 | −1.358 | .1151 | |
rs3135397 | 6 | 32403941 | A/T | 0.719 | Imputed | 1.000 | 0.862431115 | .2136 | −0.224 | 1.16E–09 | 1.231 | .1263 | |
rs3129866 | 6 | 32404065 | G/C | 0.719 | Imputed | 1.000 | 0.862431115 | .2136 | −0.224 | 1.16E–09 | 1.231 | .1263 | |
rs3129867 | 6 | 32404220 | G/C | 0.723 | Imputed | 0.999 | 0.884263663 | .3042 | −0.22 | 2.57E–09 | 1.243 | .1233 | |
rs2395173 | 6 | 32404859 | A/G | 0.719 | Genotyped | 1.000 | 0.862431115 | .2136 | −0.224 | 1.16E–09 | 1.231 | .1263 | |
rs3135395 | 6 | 32405192 | T/G | 0.719 | Genotyped | 1.000 | 0.862431115 | .2136 | −0.224 | 1.16E–09 | 1.231 | .1263 | |
rs2395178 | 6 | 32405362 | G/C | 0.719 | Imputed | 1.000 | 0.862431115 | .2138 | −0.224 | 1.16E–09 | 1.231 | .1263 | |
rs3129869 | 6 | 32405671 | A/C | 0.72 | Genotyped | 0.999 | 0.863293977 | .2186 | −0.224 | 1.09E–09 | 1.237 | .1244 | |
rs3129871 | 6 | 32406342 | A/C | 0.709 | Imputed | 0.999 | 0.832768156 | .1245 | −0.211 | 1.05E–08 | 1.153 | .1516 | |
rs9271364 | 6 | 32586787 | A/G | 0.689 | Imputed | 1.000 | 0.923116346 | .5037 | −0.201 | 4.97E–08 | 0.497 | .5375 | |
rs9272518 | 6 | 32606446 | G/T | 0.495 | Imputed | 0.701 | HLA-DQA1 | 1.044982355 | .7238 | −0.228 | 4.78E–09 | 1.725 | .04204 |
rs9273395 | 6 | 32627094 | C/T | 0.634 | Imputed | 0.936 | 0.88603396 | .3118 | −0.206 | 3.47E–08 | 1.365 | .09296 | |
rs4335869 | 19 | 56085656 | T/A | 0.297 | Imputed | 0.872 | 1.065026839 | .6251 | −0.022 | .5873 | 5.489 | 1.33E–10 | |
NA | 19 | 56086758 | GA/GAA | 0.328 | Imputed | 0.848 | 1.030454534 | .8187 | −0.023 | .5706 | 5.433 | 1.43E–10 | |
rs28699417 | 19 | 56087272 | T/C | 0.323 | Imputed | 0.887 | 1.034584607 | .7907 | −0.021 | .6048 | 5.279 | 2.54E–10 | |
rs28379489 | 19 | 56087281 | T/A | 0.323 | Imputed | 0.887 | 1.035619709 | .7818 | −0.021 | .6044 | 5.276 | 2.65E–10 | |
NA | 19 | 56088110 | C/CA | 0.249 | Imputed | 0.593 | 1.09089668 | .6007 | −0.025 | .6331 | 6.372 | 5.19E–09 | |
rsl 2972695 | 19 | 56088487 | A/G | 0.33 | Imputed | 0.892 | 1.010050167 | .9379 | −0.026 | .5089 | 5.144 | 4.73E–10 | |
rs10403008 | 19 | 56089947 | C/G | 0.307 | Imputed | 0.903 | ZNF579 | 1.057597684 | .6543 | −0.009 | .8208 | 5.211 | 3.39E–10 |
rs34164618 | 19 | 56090076 | G/T | 0.306 | Imputed | 0.903 | ZNF579 | 1.071436209 | .5838 | −0.009 | .8128 | 5.271 | 2.44E–10 |
rs12609355 | 19 | 56105932 | G/A | 0.3 | Imputed | 0.904 | FIZ1 | 1.108491409 | .4255 | −0.014 | .7245 | 5.195 | 1.14E–09 |
rs3803890 | 19 | 56110700 | G/A | 0.288 | Imputed | 0.879 | FIZ1 | 1.133148453 | .3485 | −0.008 | .8501 | 5.369 | 8.98E–10 |
rs4410198 | 19 | 56122538 | G/A | 0.296 | Imputed | 0.832 | ZNF865 | 1.087628894 | .5364 | 0.001 | .9873 | 5.83 | 5.54E–11 |
rs146619376 | 19 | 56127441 | C/G | 0.277 | | 0.854 | ZNF865 | 1.068226717 | .626 | −0.006 | .8831 | 5.351 | 2.51E–09 |
Alt, Alternative; NA, not available; OR, odds ratio; Ref, reference; rsID, rs identifier.
A strong association between the previously implicated HLA-DR/DQ region on Chr6 and total serum IgE level was identified, as was a novel locus for lung function mapping to chromosome 19, near zinc finger interacting genes. Boldface indicates significance at the P < 5 × 10−8 threshold.
Lung function: FEV1
We identified 12 variants that met the GWAS threshold of significance (p < 5×10−8). All of these SNPs are located on chromosome 19, near zinc finger interacting genes (Table 2, Figure 2C). The SNP with the most significant p-value (p = 5.536×10−11) was rs4410198, which is located in LOC107985322, a non-coding transcript near the zinc finger interacting genes. An additional 118 variants met the suggestive GWAS threshold of significance (p < 1×10−5) and are outlined in Supplementary Table 3.
Overview of the chromosome 6 and 19 loci showing significant association with total serum IgE and lung function, respectively
To further investigate the two GWAS loci identified for IgE and lung function, we performed stratified analyses within the asthma cases and controls separately and plotted the variants in each region using LocusZoom.(32) The peak SNP for IgE in the analysis including all subjects (with adjustment for asthma status), rs3135348, is located at position Chr6:32,394,098 (Figure 3A, left panel). We observe evidence for association at rs3135348 in the control analysis (n=291, Figure 3C, left panel, beta=−0.2588, p=5.448×10−5), and a slightly stronger signal in the asthma-only analysis (n=436, Figure 3B, left panel, beta=−0.2020, p=4.257×10−6). rs3135348 is a significant eQTL for a large number of HLA genes across a variety of tissues in the GTEx data (HLA-DQA1, HLA-DQA2, HLA-DQB1, HLA-DQB1-AS1, HLA-DQB2, HLA-DRA, HLA-DRB1, HLA-DRB5, HLA-DRB6, HLA-DRB9).(33)
The peak SNP for lung function, rs4410198, is located at position Chr19:56,122,538 (Figure 3A, right panel). At this locus, the association is strongest in the asthma-only analysis (n=436, Figure 3B, right panel, beta=6.870, p=1.547×10−8) and weaker in the non-asthma analysis (n=291, Figure 3C, right panel, beta=4.514, p=0.00057). Formal colocalization analysis was performed for this novel locus to identify a potential target gene.
Replication of prior asthma GWAS
We compared the results from our GWAS to what is, to date, the most comprehensive GWAS focused on childhood asthma.(6) That study identified 61 significant asthma loci using UK Biobank data from both children and adults. Since GASP is a pediatric cohort, we focused on the 60 loci identified exclusively from childhood-onset asthma or shared between children and adults. We selected variants from our asthma association analyses that overlapped with each locus, highlighted these on a Manhattan plot (Figure 2A), and used two significance thresholds for these lookups: first, a simple correction for the number of loci tested and second, a correction for the number of independent SNPs tested across the 60 loci. Although none of the variants within the 60 known loci reached GWAS significance, we were able to replicate two loci at the locus-corrected threshold (Figure 2A, green line). Both loci identified, 12q13.2 and 17q12, are exclusively childhood-related (Figure 2A, red dots). The peak SNP on chromosome 12 was rs12578859 (p=0.0001) and on chromosome 17 was rs12450091 (p=0.0002); neither of these crosses the threshold correcting for the number of independent SNPs tested.
Next, we investigated the overlap of the 60 loci with our results for the IgE association analysis. One shared locus (i.e. adult and childhood asthma), 6p21.32, overlaps the same HLA-DR/DQ gene region that met the GWAS level of significance for IgE in our analysis (Figure 2B). An additional 4 loci, 2p25.1, 6p21.33, 7p15.1, and 18q21.33, replicated when using the locus-corrected threshold. The 6p21.33 locus spans a broader HLA region that includes both HLA-B and HLA-C genes. The peak SNP on chromosome 2 was rs57838855 (p=0.0005), on chromosome 7 was rs6962289 (p=0.0004), at 6p21.33 was rs9378247 (p=0.0002), and on chromosome 18 was rs56173102 (p=1.82E-05). Only rs56173102 meets the threshold correcting for the number of independent SNPs tested, and it is interesting to note that this is the only childhood-specific locus among these four from the prior published GWAS; the other three were shared between adult and childhood asthma.
Using the same approach for the lung function association analysis, we found that a single locus, 8q24.21, replicated at the locus-corrected threshold (Figure 2C). The peak SNP was rs4645958 (p=0.0007), and it did not meet the threshold correcting for the number of independent SNPs tested.
Finally, because the Pividori and Schoettler, et al GWAS is largely European in ancestry, and may not be representative of genetic contributions to asthma and its related phenotypes in this Peruvian cohort of predominantly Indigenous ancestry, we looked up a set of additional variants identified from the non-European samples within the EVE(34) and CAAPA(8) consortia. We were able to replicate only one variant, rs335016 (p = 0.003), for asthma; however, the effect was not in the same direction (Supplementary Table 4).
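The locus-corrected lookups described above can be illustrated with a short calculation. The sketch below assumes that the "simple correction for number of loci tested" is a Bonferroni correction at alpha = 0.05 over the 60 loci; the alpha level and the function names are assumptions made for illustration, not details confirmed in the text. The p-values are the peak-SNP values reported above. The stricter threshold based on the number of independent SNPs is not computed, because that number is not reported here.

```python
# Minimal sketch of the locus-corrected replication lookup.
# Assumption: Bonferroni correction at alpha = 0.05 across the 60 loci.
ALPHA = 0.05
N_LOCI = 60

# Peak-SNP p-values as reported in the text for each phenotype.
REPORTED_PEAKS = {
    "asthma": {"rs12578859": 0.0001, "rs12450091": 0.0002},
    "IgE": {
        "rs57838855": 0.0005,
        "rs6962289": 0.0004,
        "rs9378247": 0.0002,
        "rs56173102": 1.82e-05,
    },
    "FEV1": {"rs4645958": 0.0007},
}


def locus_corrected_threshold(alpha=ALPHA, n_loci=N_LOCI):
    """Bonferroni threshold when one test is counted per locus."""
    return alpha / n_loci


def passing_snps(peaks, threshold):
    """Return, per phenotype, the sorted SNPs whose p-value is below the threshold."""
    return {
        trait: sorted(snp for snp, p in snps.items() if p < threshold)
        for trait, snps in peaks.items()
    }


if __name__ == "__main__":
    thr = locus_corrected_threshold()
    print(f"locus-corrected threshold: {thr:.2e}")
    for trait, snps in passing_snps(REPORTED_PEAKS, thr).items():
        print(trait, snps)
```

Under this assumed threshold, every peak SNP listed above falls below the cutoff, which is consistent with the loci being described as replicated at the locus-corrected level.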
Chr19 region colocalization with cis eQTLs of CTD-2537I9.5 in lung tissue
We jointly analyzed the GWAS data with cis-eQTLs in lung and whole blood tissues from GTEx. We identified 37 significant transcripts in lung tissue within a ±1Mb region of the peak SNP for lung function, rs4410198. Among these transcripts, only LncRNA CTD-2537I9.5 showed strong evidence of colocalization (PP3 = 0.02 and PP4 = 98%, see Supplementary Table 5). The top colocalizing SNP, rs34164618, had a GWAS association p-value of 2.44E-10, an eQTL association p-value of 3.47E-06, and a 50% posterior probability (Supplementary Figure 3). A total of 30 significant transcripts were identified in whole blood tissue; however, none of them showed evidence of colocalization (Supplementary Table 5). A similar analysis was done for the peak SNP for IgE levels, rs3135348, in both tissues, but no evidence of colocalization was observed for any significant transcript.
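To make the PP3 and PP4 quantities above more concrete, the following is a minimal sketch of a Bayesian colocalization calculation based on approximate Bayes factors, in the spirit of the colocalization methods cited in this paper (24, 25). It is not the authors' pipeline. The prior probabilities (p1, p2, p12), the prior effect standard deviation, the toy effect sizes, and all function names are assumptions chosen for illustration. PP3 is the posterior probability that the two traits have distinct causal variants in the region, and PP4 is the posterior probability that they share one.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_diff_exp(a, b):
    """log(exp(a) - exp(b)); requires a > b (true when the region has 2+ SNPs)."""
    return a + math.log1p(-math.exp(b - a))


def wakefield_log_abf(beta, se, prior_sd):
    """Log approximate Bayes factor for one SNP from its effect and standard error."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (math.log(1.0 - r) + r * z * z)


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4 given per-SNP log ABFs for two traits.

    The prior values are illustrative assumptions, not the study's settings.
    """
    s1 = log_sum_exp(labf1)
    s2 = log_sum_exp(labf2)
    s12 = log_sum_exp([a + b for a, b in zip(labf1, labf2)])
    log_h = [
        0.0,                                                       # H0: no association
        math.log(p1) + s1,                                         # H1: trait 1 only
        math.log(p2) + s2,                                         # H2: trait 2 only
        math.log(p1) + math.log(p2) + log_diff_exp(s1 + s2, s12),  # H3: distinct variants
        math.log(p12) + s12,                                       # H4: shared variant
    ]
    denom = log_sum_exp(log_h)
    return {f"H{i}": math.exp(lh - denom) for i, lh in enumerate(log_h)}


def toy_labf(peak_index, n_snps=5, peak_beta=0.6, se=0.1, prior_sd=0.15):
    """Hypothetical region: one SNP carries an effect, the others have none."""
    return [
        wakefield_log_abf(peak_beta if i == peak_index else 0.0, se, prior_sd)
        for i in range(n_snps)
    ]


if __name__ == "__main__":
    shared = coloc_posteriors(toy_labf(2), toy_labf(2))    # same peak SNP in both traits
    distinct = coloc_posteriors(toy_labf(1), toy_labf(3))  # different peak SNPs
    print("shared peak:  ", {k: round(v, 4) for k, v in shared.items()})
    print("distinct peak:", {k: round(v, 4) for k, v in distinct.items()})
```

In the toy data, placing the peak at the same SNP for both traits shifts the posterior mass toward H4, while placing the peaks at different SNPs shifts it toward H3. This is the pattern that a high PP4 for CTD-2537I9.5 is summarizing.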
Testing for association with HLA alleles imputed for the Chr6 region
Given that the HLA region is one of the strongest genetic loci across atopic diseases,(35) we imputed HLA alleles for Class I and II genes. Supplementary Table 6 shows the genes and alleles that were imputed in these data. There were 14 alleles with a frequency ≥5%, which were tested individually for the three primary phenotypes as shown in Supplementary Table 7. Similar to the associations at the SNP level, the strongest associations were observed for IgE. Using a multiple-testing threshold of 0.05/(14*3), three alleles had significant associations with log10(IgE): HLA-DQA1*04:01 (p=4.47×10−5), HLA-DQB1*03:02 (p=2.13×10−4) and HLA-DQB1*04:02 (p=4.34×10−5). These allelic associations are largely represented by the haplotype across the HLA-DQA1 and HLA-DQB1 genes, HLA-DQA1~HLA-DQB1:04.01~04.02 (p=1.55×10−5). Given the strong linkage disequilibrium well documented in this HLA region, it is difficult to fully separate the association at the peak SNP in this region for log10(IgE) from that of the haplotype HLA-DQA1~HLA-DQB1:04.01~04.02. Nevertheless, even after adjusting for the effect of the haplotype, the peak variant from the GWAS, rs3135348, retains its significance, although somewhat reduced (Supplementary Table 7).
Replication of the novel Chr19 locus for lung function
Replication evidence for the Chr19 locus for lung function in the four independent datasets is presented in Supplementary Table 8. Significant replication is noted at 7 of the SNPs within the pediatric SCAALA study, with the strongest evidence at rs12609355 (p = 0.005) and rs10403008 (p = 0.008) and with consistent directions of effect between the discovery (Beta = 5.195) and replication (Beta = 2.028) data. No evidence for replication is noted in the adult PROAR cohort, the COMPASS study (with only a limited number of pediatric cases), or the study by Shrine et al based on the largely European UKBB sample.
DISCUSSION
Given the evidence that Peru is one of the countries with the highest prevalence of childhood asthma and high disease severity(9), our study goal was to understand if ancestry was associated with asthma status, lung function and IgE levels in a cohort of Peruvian children. Recognizing the limitations of a relatively modest sample size, we not only performed a GWAS, but specifically sought to replicate prior risk loci associated with childhood-onset asthma and with lung function.
We found that participants had a high proportion of Indigenous ancestry (85%) and a lower proportion of European ancestry (15%), similar to prior studies showing Indigenous American ancestry as the predominant ancestry in Peruvians.(18) Latino populations worldwide represent some of the most diverse and admixed populations, which has implications for health and disease.(36, 37) The source of Indigenous American ancestry could also differ between Latino populations, with differing frequencies of risk alleles between these sources,(38) and could account for the difference in the effect of the ancestral population on development of asthma.(39) The differences in the relationship between European ancestry and asthma in these admixed populations are complex, and these differing effects (i.e. risk vs. protection) could be due to significant contributions from African ancestry (where there is three-way admixture), and/or the variability in origin of Indigenous ancestry.
With respect to IgE, we found that European ancestry is associated with lower IgE levels and Indigenous ancestry is associated with higher IgE levels. These findings are consistent with prior studies showing that populations of Indigenous American descent have higher IgE levels compared to populations with higher European ancestry.(40–42)
Finally, with respect to lung function, consistent with multiple prior studies,(5, 43, 44) we found Indigenous American ancestry to be associated with higher lung function, and European ancestry with lower lung function. When stratified by asthma status, this effect remained statistically significant in controls, but not in cases. There is remarkable diversity in the genes associated with lung function, and these genes could differ between patients with asthma and those without asthma.(45) It is also possible that the larger effect seen in controls is a better reflection of the overall effect of genetic ancestry on lung function, because the presence of asthma in itself could mediate the relationship between ancestry and lung function. Following the development of asthma, ancestry might account for smaller differences in lung function between one child with asthma and another, while in those without asthma, ancestry may play a larger role in lung function.
Our GWAS was unable to identify genome-wide significant associations with asthma, and this is likely due to the small sample size; a power calculation on the extremes of effect sizes (OR = 1.013–1.97) noted in Pividori and Schoettler, et al(6) shows that our power even for asthma replications was <60%. However, we were able to identify associations for IgE and FEV1. We found a strong association between the HLA-DQA1 region and IgE levels. Although the most significant SNP in the region, rs3135348, has not been previously reported in the GWAS literature or the GWAS catalog, the major histocompatibility complex (MHC) region on chromosome 6, which harbors HLA genes, has consistently been documented to associate with asthma(46–48) and IgE(40, 49–53) levels in diverse populations. In the GABRIEL consortium, which looked at genetic signals for asthma, the SNP with the strongest association, rs9271300, was located in the MHC region between HLA-DRB1 and HLA-DQA1.(46) Similarly, in the EVE Asthma Genetics Consortium, which had better representation of Latino and African-American participants, HLA-DRB1 had the strongest association with IgE levels.(40) The MHC region contains immune-regulating genes, including HLA-DRB1 variants, which modulate antigen presentation to T-helper cells leading to Th2 skewing.(50) This is an important step in the development of allergies, and in IgE production.(50, 54) Multiple other genes in the HLA region, including HLA-G, HLA-A and HLA-DQA2, have also been shown to be associated with IgE levels.(52)
Beyond associations with specific SNPs, associations with specific alleles and haplotypes for Class I and II MHC genes have also been demonstrated in prior studies.(35, 55–58) In these populations from Peru, our extension to alleles at the HLA class I and II genes that were imputed with MAF ≥5% mirrors the patterns of association from the GWAS; associations are identified for HLA-DQA1 (04:01) and HLA-DQB1 (03:02 and 04:02) alleles only for IgE, with evidence coming from both the cases and controls. The extensive linkage disequilibrium within this region is recapitulated in haplotypes across the two genes that include the alleles driving the association with IgE. The strongest association is with HLA-DQA1~HLA-DQB1:04.01~04.02, which has a haplotype frequency of 19% in this Peru sample. While the haplotype is associated with an increased risk for asthma (OR=1.18), this is not statistically significant (p=0.296), which supports the interpretation that this association is driving atopy and not asthma in these samples. A conditional model with the peak SNP from the GWAS along with the haplotypes reveals that there is an independent effect at the SNP beyond the HLA haplotype.
We found a GWAS association between 12 SNPs mapping to a locus on chromosome 19 and FEV1 amongst cases and controls. Notably, we were able to replicate this novel finding at 7 SNPs in similarly admixed pediatric samples from SCAALA, but were unable to replicate it in either similarly admixed adult samples or the largely European UKBB samples. This region on chromosome 19 houses multiple zinc finger genes including ZNF579, ZNF865, and FIZ1. Prior studies have shown that zinc finger genes are associated with bronchodilator response and airway remodeling. The Childhood Asthma Management Program (CAMP) study showed that variants in the ZNF432 gene were involved in the bronchodilator response among children with asthma and that inhaled steroids modified this response,(59) while prior studies in patients with chronic obstructive pulmonary disease (COPD) have suggested that transcription factors of the zinc finger family of proteins are involved in airway remodeling and COPD pathogenesis.(60, 61) Co-localization analysis was unable to identify a regulatory overlap between our peak association signal and these zinc finger genes. We did find strong evidence for co-localization with a long non-coding RNA (LncRNA), CTD-2537I9.5. Although not much is known specifically about this LncRNA, there is well documented evidence for the role of LncRNAs in lung biology in general. LncRNAs are a diverse class of transcribed but not translated RNAs that are longer than ~200 nucleotides. While they do not encode proteins, they can interact with both RNA and DNA in the cell and have been shown to lead to transcriptional activation of other proteins such as HIF1a and Myc.(62) In addition, they have been implicated in altering methylation of DNA, presumably by binding to DNA fragments and preventing methylation, thereby influencing epigenetic regulation.(62–64) While the majority of studies have implicated LncRNAs in cancer, and specifically in lung cancer,(63) they have also been implicated in almost all types of lung disease:(65) acute lung injury, where they have been proposed to function as decoys for some miRs; COPD, where their regulation is altered by cigarette smoke; IPF, where they have been implicated in fibroblast proliferation; and PAH, where they have been implicated in smooth muscle proliferation.
Several loci from prior GWAS replicated in our Peruvian cohort. For asthma, we found replication of 17q12 and 12q13.2, shown by Pividori and Schoettler, et al(6) to be associated with childhood-onset asthma. 17q12 was the locus most significantly associated with childhood onset, with each copy of the risk allele conferring ~2.5 years earlier onset compared to individuals without the risk allele.(6) GSDMB, a major protein-encoding gene in this region, maps to a broad region of high LD with multiple other genes associated with asthma in Latino and non-Latino populations, including ORMDL3 and IKZF3.(66, 67) Chromosome 17q12–q21 is associated with asthma risk, and demonstrates significant ancestral heterogeneity, with the risk increasing as the proportion of European ancestry increases.(8, 68) For IgE, 6p21.32, a shared locus for adult and childhood asthma, directly overlaps the HLA-DR region which met the GWAS level of significance for IgE in our analyses. The 6p21.33 locus spans a broader region that includes both HLA-B and HLA-C genes. This validates prior evidence that the MHC region on chromosome 6 plays an important role in allergy.(40, 47, 48, 51, 52) For lung function, 8q24.21 reached the Bonferroni-corrected threshold. This was one of two loci identified to be associated with asthma in the genome-wide meta-analyses from populations of African ancestry in CAAPA.(8) It is also the only locus that passes our more stringent correction for multiple testing based on the number of independent SNPs evaluated. This replication in our sample of individuals with predominantly Indigenous ancestry supports that this locus may be particularly relevant in admixed populations. The TATDN1 gene in the region of chromosome 8q24 has increased expression in airway smooth muscle cells of asthma patients, and the region includes an adjacent binding site for CCAAT/enhancer-binding protein beta, a transcription factor involved in the IL-17 signaling pathway, which modulates the effect of house dust mite on lung function.(8, 69)
One of the biggest limitations of our work is the limited sample size. Nonetheless, we were able to identify statistically significant associations for IgE and FEV1. We also replicate prior observations regarding the relationship between Indigenous ancestry and lung function and IgE. While the association between HLA and IgE is not novel, it recapitulates the importance of the HLA region in allergy, with a strong association for IgE in this allergic asthmatic pediatric sample from Peru. We show that there is an HLA haplotype, DQA1*04:01~DQB1*04:02, that is strongly associated with IgE. Even after accounting for this haplotype, the peak GWAS SNP retains some significance, confirming the inability to fully disentangle quantitative (i.e., regulatory eQTL effects) and qualitative (i.e., alleles representing antigen specific binding) effects.(70) Despite the limitation in sample size, we identified a novel locus for lung function that we were able to replicate in an independent pediatric population also with Indigenous continental American admixture. We acknowledge that while our co-localization analysis does home in on a potential LncRNA (CTD-2537I9.5) as a target for our novel chromosome 19 signal, little is known about its specific role in lung physiology and further work is needed to gain mechanistic insights. The peak variants mapping to our GWAS signal are common in both the European and American ancestry populations of the 1000 Genomes Project. A notable exception is rs146619376, which has a MAF of 27% in GASP, but a MAF of 4% and 13% in 1000 Genomes European and American populations, respectively. The higher MAF of rs146619376 in GASP suggests that this association may be population specific. Formal analyses utilizing local ancestry would be valuable in the future to resolve the potential for ancestry-specific effects at this locus, but in our current data these are hampered by sparse genotyping in the reference data.
In conclusion, our replication of a novel finding only in a similarly admixed pediatric population, but not in adults or Europeans, validates the importance of including under-represented samples in additional explorations of the genetics of allergy and asthma. This is particularly important as we continue to work towards precision medicine initiatives and eliminating health disparities in genetics research.
Supplementary Material
KEY MESSAGES:
Genetic ancestry is associated with asthma-related phenotypes of lung function and total serum IgE.
A novel locus mapping to a LncRNA for lung function may be specific to children with continental American admixture.
Acknowledgments
FUNDING:
This research study was supported by National Institute of Environmental Health Sciences/National Institutes of Health (NIH) grants R01ES018845 and R01ES018845-S1 and NHLBI R01 HL142992. A.T.A. was supported by the National Institute of Allergy and Infectious Diseases T32 Research Training Grant in Pediatric Allergy and Immunology (grant no. 2T32AI007007-41) and the Johns Hopkins University Provost’s Postdoctoral Award. K.R. was a Fogarty Global Health Fellow through the consortium comprised of the University of North Carolina, Johns Hopkins University, Morehouse School of Medicine, and Tulane University during the conduct of this work through grant 5R25TW009340. W.C. was supported by Pathway to Independence Award R00HL096955 from the National Heart, Lung, and Blood Institute/NIH.
ABBREVIATIONS:
- GWAS: genome wide association analyses
- SNP: single nucleotide polymorphism
- IgE: total serum immunoglobulin E
- SES: socioeconomic status
- BMI: body mass index
- PC: principal components
- FEV1: forced expiratory volume in 1 second
- CAAPA: Consortium on Asthma among African Ancestry Populations
- TGP: The Thousand Genomes Project
- IBS: Iberian populations in Spain
- NAT: Indigenous ancestry populations from continental America
- LD: linkage disequilibrium
- MHC: major histocompatibility complex
Footnotes
CONFLICTS OF INTEREST: The authors have no conflict of interest relevant to this article to disclose.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
REFERENCES
- 1. Centers for Disease Control and Prevention: Asthma’s Impact on the Nation 2012. [Available from: https://www.cdc.gov/asthma/impacts_nation/asthmafactsheet.pdf]
- 2. World Health Organization [WHO]: Asthma Fact Sheet 2017. [Available from: https://www.who.int/news-room/fact-sheets/detail/asthma]
- 3. Mersha TB. Mapping asthma-associated variants in admixed populations. Front Genet. 2015;6:292.
- 4. Mathias RA, Taub MA, Gignoux CR, Fu W, Musharoff S, O’Connor TD, et al. A continuum of admixture in the Western Hemisphere revealed by the African Diaspora genome. Nat Commun. 2016;7:12522.
- 5. Galanter JM, Gignoux CR, Torgerson DG, Roth LA, Eng C, Oh SS, et al. Genome-wide association study and admixture mapping identify different asthma-associated loci in Latinos: the Genes-environments & Admixture in Latino Americans study. J Allergy Clin Immunol. 2014;134(2):295–305.
- 6. Pividori M, Schoettler N, Nicolae DL, Ober C, Im HK. Shared and distinct genetic risk factors for childhood-onset and adult-onset asthma: genome-wide and transcriptome-wide studies. Lancet Respir Med. 2019;7(6):509–22.
- 7. Torgerson DG, Capurso D, Mathias RA, Graves PE, Hernandez RD, Beaty TH, et al. Resequencing candidate genes implicates rare variants in asthma susceptibility. Am J Hum Genet. 2012;90(2):273–81.
- 8. Daya M, Rafaels N, Brunetti TM, Chavan S, Levin AM, Shetty A, et al. Association study in African-admixed populations across the Americas recapitulates asthma risk loci in non-African populations. Nat Commun. 2019;10(1):880.
- 9. Worldwide variations in the prevalence of asthma symptoms: the International Study of Asthma and Allergies in Childhood (ISAAC). Eur Respir J. 1998;12(2):315–35.
- 10. Hansel NN, Romero KM, Pollard SL, Bose S, Psoter KJ, L JU, et al. Ambient Air Pollution and Variation in Multiple Domains of Asthma Morbidity among Peruvian Children. Ann Am Thorac Soc. 2019;16(3):348–55.
- 11. Quanjer PH, Stanojevic S, Cole TJ, Baur X, Hall GL, Culver BH, et al. Multi-ethnic reference values for spirometry for the 3–95-yr age range: the global lung function 2012 equations. Eur Respir J. 2012;40(6):1324–43.
- 12. Bien SA, Wojcik GL, Zubair N, Gignoux CR, Martin AR, Kocarnik JM, et al. Strategies for Enriching Variant Coverage in Candidate Disease Loci on a Multiethnic Genotyping Array. PLoS One. 2016;11(12):e0167758.
- 13. A haplotype map of the human genome. Nature. 2005;437(7063):1299–320.
- 14. PLINK 1.9. Purcell, Shaun and Chang, Christopher. [Available from: www.cog-genomics.org/plink/2.0]
- 15. Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience. 2015;4:7.
- 16. Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen WM. Robust relationship inference in genome-wide association studies. Bioinformatics. 2010;26(22):2867–73.
- 17. Mao X, Bigham AW, Mei R, Gutierrez G, Weiss KM, Brutsaert TD, et al. A genomewide admixture mapping panel for Hispanic/Latino populations. Am J Hum Genet. 2007;80(6):1171–8.
- 18. Harris DN, Song W, Shetty AC, Levano KS, Caceres O, Padilla C, et al. Evolutionary genomic dynamics of Peruvians before, during, and after the Inca Empire. Proc Natl Acad Sci U S A. 2018;115(28):E6526–e35.
- 19. Alexander DH, Novembre J, Lange K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 2009;19(9):1655–64.
- 20. Das S, Forer L, Schonherr S, Sidore C, Locke AE, Kwong A, et al. Next-generation genotype imputation service and methods. Nat Genet. 2016;48(10):1284–7.
- 21. Conomos MP, Miller MB, Thornton TA. Robust inference of population structure for ancestry prediction and correction of stratification in the presence of relatedness. Genet Epidemiol. 2015;39(4):276–93.
- 22. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2019. [Available from: http://www.r-project.org/]
- 23. Turner S. qqman: an R package for visualizing GWAS results using Q-Q and manhattan plots. Journal of Open Source Software. 2018;3(25):731.
- 24. Giambartolomei C, Vukcevic D, Schadt EE, Franke L, Hingorani AD, Wallace C, et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 2014;10(5):e1004383.
- 25. Wallace C. Statistical testing of shared genetic control for potentially related traits. Genet Epidemiol. 2013;37(8):802–13.
- 26. Zheng X, Shen J, Cox C, Wakefield JC, Ehm MG, Nelson MR, et al. HIBAG--HLA genotype imputation with attribute bagging. Pharmacogenomics J. 2014;14(2):192–200.
- 27. Zheng X. Imputation-Based HLA Typing with SNPs in GWAS Studies. Methods Mol Biol. 2018;1802:163–76.
- 28. Pappas DJ, Marin W, Hollenbach JA, Mack SJ. Bridging ImmunoGenomic Data Analysis Workflow Gaps (BIGDAWG): An integrated case-control analysis pipeline. Hum Immunol. 2016;77(3):283–7.
- 29. Kuna P, Peters MJ, Manjra AI, Jorup C, Naya IP, Martínez-Jimenez NE, et al. Effect of budesonide/formoterol maintenance and reliever therapy on asthma exacerbations. Int J Clin Pract. 2007;61(5):725–36.
- 30. Shrine N, Guyatt AL, Erzurumluoglu AM, Jackson VE, Hobbs BD, Melbourne CA, et al. New genetic signals for lung function highlight pathways and chronic obstructive pulmonary disease associations across multiple ancestries. Nat Genet. 2019;51(3):481–93.
- 31. Cruz AA, Riley JH, Bansal AT, Ponte EV, Souza-Machado A, Almeida PCA, et al. Asthma similarities across ProAR (Brazil) and U-BIOPRED (Europe) adult cohorts of contrasting locations, ethnicity and socioeconomic status. Respir Med. 2020;161:105817.
- 32. Pruim RJ, Welch RP, Sanna S, Teslovich TM, Chines PS, Gliedt TP, et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics. 2010;26(18):2336–7.
- 33. Yizhak K, Aguet F, Kim J, Hess JM, Kubler K, Grimsby J, et al. RNA sequence analysis reveals macroscopic somatic clonal expansion across normal tissues. Science. 2019;364(6444).
- 34. Torgerson DG, Ampleford EJ, Chiu GY, Gauderman WJ, Gignoux CR, Graves PE, et al. Meta-analysis of genome-wide association studies of asthma in ethnically diverse North American populations. Nat Genet. 2011;43(9):887–92.
- 35. Schoettler N, Rodríguez E, Weidinger S, Ober C. Advances in asthma and allergic disease genetics: Is bigger always better? J Allergy Clin Immunol. 2019;144(6):1495–506.
- 36. Vergara C, Murray T, Rafaels N, Lewis R, Campbell M, Foster C, et al. African ancestry is a risk factor for asthma and high total IgE levels in African admixed populations. Genet Epidemiol. 2013;37(4):393–401.
- 37. Burchard EG, Avila PC, Nazario S, Casal J, Torres A, Rodriguez-Santana JR, et al. Lower bronchodilator responsiveness in Puerto Rican than in Mexican subjects with asthma. Am J Respir Crit Care Med. 2004;169(3):386–92.
- 38. Lorenzo Bermejo J, Boekstegers F, Gonzalez Silos R, Marcelain K, Baez Benavides P, Barahona Ponce C, et al. Subtypes of Native American ancestry and leading causes of death: Mapuche ancestry-specific associations with gallbladder cancer risk in Chile. PLoS Genet. 2017;13(5):e1006756.
- 39. Rosenberg NA, Pritchard JK, Weber JL, Cann HM, Kidd KK, Zhivotovsky LA, et al. Genetic structure of human populations. Science. 2002;298(5602):2381–5.
- 40. Levin AM, Mathias RA, Huang L, Roth LA, Daley D, Myers RA, et al. A meta-analysis of genome-wide association studies for serum total IgE in diverse study populations. J Allergy Clin Immunol. 2013;131(4):1176–84.
- 41. Yang JJ, Burchard EG, Choudhry S, Johnson CC, Ownby DR, Favro D, et al. Differences in allergic sensitization by self-reported race and genetic ancestry. J Allergy Clin Immunol. 2008;122(4):820–7.e9.
- 42. Litonjua AA, Celedon JC, Hausmann J, Nikolov M, Sredl D, Ryan L, et al. Variation in total and specific IgE: effects of ethnicity and socioeconomic status. J Allergy Clin Immunol. 2005;115(4):751–7.
- 43. Salari K, Choudhry S, Tang H, Naqvi M, Lind D, Avila PC, et al. Genetic admixture and asthma-related phenotypes in Mexican American and Puerto Rican asthmatics. Genet Epidemiol. 2005;29(1):76–86.
- 44. Pino-Yanes M, Thakur N, Gignoux CR, Galanter JM, Roth LA, Eng C, et al. Genetic ancestry influences asthma susceptibility and lung function among Latinos. J Allergy Clin Immunol. 2015;135(1):228–35.
- 45. Brehm JM, Man Tse S, Croteau-Chonka DC, Forno E, Litonjua AA, Raby BA, et al. A Genome-Wide Association Study of Post-bronchodilator Lung Function in Children with Asthma. Am J Respir Crit Care Med. 2015;192(5):634–7.
- 46. Moffatt MF, Schou C, Faux JA, Abecasis GR, James A, Musk AW, et al. Association between quantitative traits underlying asthma and the HLA-DRB1 locus in a family-based population sample. Eur J Hum Genet. 2001;9(5):341–6.
- 47. Li X, Howard TD, Zheng SL, Haselkorn T, Peters SP, Meyers DA, et al. Genome-wide association study of asthma identifies RAD50-IL13 and HLA-DR/DQ regions. J Allergy Clin Immunol. 2010;125(2):328–35.e11.
- 48.Moffatt MF, Gut IG, Demenais F, Strachan DP, Bouzigon E, Heath S, et al. A large-scale, consortium-based genomewide association study of asthma. N Engl J Med. 2010;363(13):1211–21. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Kim JH, Lee SY, Kang MJ, Yoon J, Jung S, Cho HJ, et al. Association of Genetic Polymorphisms with Atopic Dermatitis, Clinical Severity and Total IgE: A Replication and Extended Study. Allergy Asthma Immunol Res. 2018;10(4):397–405. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Potaczek DP, Kabesch M. Current concepts of IgE regulation and impact of genetic determinants. Clin Exp Allergy. 2012;42(6):852–71. [DOI] [PubMed] [Google Scholar]
- 51.Pino-Yanes M, Gignoux CR, Galanter JM, Levin AM, Campbell CD, Eng C, et al. Genome-wide association study and admixture mapping reveal new loci associated with total IgE levels in Latinos. J Allergy Clin Immunol. 2015;135(6):1502–10. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Granada M, Wilk JB, Tuzova M, Strachan DP, Weidinger S, Albrecht E, et al. A genome-wide association study of plasma total IgE concentrations in the Framingham Heart Study. J Allergy Clin Immunol. 2012;129(3):840–5.e21. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Upham JW, Holt PG, Taylor A, Thornton CA, Prescott SL. HLA-DR expression on neonatal monocytes is associated with allergen-specific immune responses. J Allergy Clin Immunol. 2004;114(5):1202–8. [DOI] [PubMed] [Google Scholar]
- 54.Georas SN, Guo J, De Fanis U, Casolaro V. T-helper cell type-2 regulation in allergic disease. Eur Respir J. 2005;26(6):1119–37. [DOI] [PubMed] [Google Scholar]
- 55.Movahedi M, Moin M, Gharagozlou M, Aghamohammadi A, Dianat S, Moradi B, et al. Association of HLA class II alleles with childhood asthma and Total IgE levels. Iran J Allergy Asthma Immunol. 2008;7(4):215–20. [PubMed] [Google Scholar]
- 56.Kontakioti E, Domvri K, Papakosta D, Daniilidis M. HLA and asthma phenotypes/endotypes: a review. Hum Immunol. 2014;75(8):930–9. [DOI] [PubMed] [Google Scholar]
- 57.Takejima P, Agondi RC, Rodrigues H, Aun MV, Kalil J, Giavina-Bianchi P. Allergic and Nonallergic Asthma Have Distinct Phenotypic and Genotypic Features. Int Arch Allergy Immunol. 2017;172(3):150–60. [DOI] [PubMed] [Google Scholar]
- 58.Mubarak B, Afzal N, Javaid K, Talib R, Aslam R, Latif W, et al. Frequency of HLA DQβ1*0201 and DQβ1*0301 Alleles and Total Serum IgE in Patients with Bronchial Asthma: A Pilot Study from Pakistan. Iran J Allergy Asthma Immunol. 2017;16(4):313–20. [PubMed] [Google Scholar]
- 59.Wu AC, Himes BE, Lasky-Su J, Litonjua A, Peters SP, Lima J, et al. Inhaled corticosteroid treatment modulates ZNF432 gene variant’s effect on bronchodilator response in asthmatics. J Allergy Clin Immunol. 2014;133(3):723–8.e3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Koczulla AR, Jonigk D, Wolf T, Herr C, Noeske S, Klepetko W, et al. Kruppel-like zinc finger proteins in end-stage COPD lungs with and without severe alpha1-antitrypsin deficiency. Orphanet J Rare Dis. 2012;7:29. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Abe K, Sugiura H, Hashimoto Y, Ichikawa T, Koarai A, Yamada M, et al. Possible role of Kruppel-like factor 5 in the remodeling of small airways and pulmonary vessels in chronic obstructive pulmonary disease. Respir Res. 2016;17:7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Shih JW, Kung HJ. Long non-coding RNA and tumor hypoxia: new players ushered toward an old arena. J Biomed Sci. 2017;24(1):53. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Jafarzadeh M, Tavallaie M, Soltani BM, Hajipoor S, Hosseini SM. LncRNA HSPC324 plays role in lung development and tumorigenesis. Genomics. 2020;112(3):2615–22. [DOI] [PubMed] [Google Scholar]
- 64.Wu X, Tudoran OM, Calin GA, Ivan M. The Many Faces of Long Noncoding RNAs in Cancer. Antioxid Redox Signal. 2018;29(9):922–35. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Groot M, Zhang D, Jin Y. Long Non-Coding RNA Review and Implications in Lung Diseases. JSM Bioinform Genom Proteom. 2018;3(2). [PMC free article] [PubMed] [Google Scholar]
- 66.Yan Q, Brehm J, Pino-Yanes M, Forno E, Lin J, Oh SS, et al. A meta-analysis of genome-wide association studies of asthma in Puerto Ricans. Eur Respir J. 2017;49(5). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Galanter J, Choudhry S, Eng C, Nazario S, Rodriguez-Santana JR, Casal J, et al. ORMDL3 gene is associated with asthma in three ethnically diverse populations. Am J Respir Crit Care Med. 2008;177(11):1194–200. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Stein MM, Thompson EE, Schoettler N, Helling BA, Magnaye KM, Stanhope C, et al. A decade of research on the 17q12–21 asthma locus: Piecing together the puzzle. J Allergy Clin Immunol. 2018;142(3):749–64.e3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Forno E, Sordillo J, Brehm J, Chen W, Benos T, Yan Q, et al. Genome-wide interaction study of dust mite allergen on lung function in children with asthma. J Allergy Clin Immunol. 2017;140(4):996–1003.e7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Kanchan K, Irizar CS, Bunyavanich S, Mathias RA. Current insights into the genetics of food allergy. Journal of Allergy and Clinical Immunology. 2020 (In press). [DOI] [PMC free article] [PubMed] [Google Scholar]