Summary
Whole-genome sequencing (WGS), a powerful tool for detecting novel coding and non-coding disease-causing variants, has largely been applied to clinical diagnosis of inherited disorders. Here we leveraged WGS data in up to 62,653 ethnically diverse participants from the NHLBI Trans-Omics for Precision Medicine (TOPMed) program and assessed statistical association of variants with seven red blood cell (RBC) quantitative traits. We discovered 14 single variant-RBC trait associations at 12 genomic loci that have not been reported previously. Several of the RBC trait-variant associations (RPN1, ELL2, MIDN, HBB, HBA1, PIEZO1, and G6PD) were replicated in independent GWAS datasets imputed to the TOPMed reference panel. Most of these discovered variants are rare/low frequency, and several are observed disproportionately among non-European ancestry (African, Hispanic/Latino, or East Asian) populations. We identified a 3 bp indel, p.Lys2169del (g.88717175_88717177TCT[4]), in PIEZO1, a gene responsible for the Mendelian red cell disorder hereditary xerocytosis (MIM: 194380); the indel is common only in the Ashkenazi Jewish population and is associated with higher mean corpuscular hemoglobin concentration (MCHC). In stepwise conditional analysis and in gene-based rare variant aggregated association analysis, we identified several variants in HBB, HBA1, TMPRSS6, and G6PD that represent the carrier state for known coding, promoter, or splice site loss-of-function variants that cause inherited RBC disorders. Finally, we applied base and nuclease editing to demonstrate that the sentinel variant rs112097551 (nearest gene RPN1) acts through a cis-regulatory element that exerts long-range control of the gene RUVBL1, which is essential for hematopoiesis.
Together, these results demonstrate the utility of WGS in ethnically diverse population-based samples and gene editing for expanding knowledge of the genetic architecture of quantitative hematologic traits and suggest a continuum between complex trait and Mendelian red cell disorders.
Keywords: whole-genome sequencing, red blood cell traits, base editing
Introduction
Red blood cells (RBCs) or erythrocytes contain hemoglobin, an iron-rich tetramer composed of two alpha-globin and two beta-globin chains. RBCs play an essential role in oxygen transport and also serve important secondary functions in nitric oxide production, regulation of vascular tone, and immune response to pathogens.1 RBC indices, including hemoglobin (HGB), hematocrit (HCT), mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC), mean corpuscular volume (MCV), RBC count, and red blood cell width (RDW), are primary indicators of RBC development, size, and hemoglobin content.2 These routinely measured clinical laboratory assays may be altered in Mendelian genetic conditions (e.g., hemoglobinopathies such as sickle cell disease [MIM: 603903] or thalassemia [MIM: 613985, 604131], hereditary spherocytosis [MIM: 182900], or G6PD deficiency [MIM: 300908])3 as well as by non-genetic or nutritional factors (e.g., vitamin B and iron deficiency).
RBC indices have estimated family-based heritability values ranging from 40% to 90%4,5 and have been extensively studied as complex quantitative traits in genome-wide association studies (GWASs). Early GWASs identified common genetic variants with relatively large effects associated with RBC indices.6, 7, 8 With improved imputation, increased sample sizes, and deeper interrogation of coding regions of the genome, additional common variants associated with RBC indices with progressively smaller effect sizes and coding variants of larger effect with lower minor allele frequency (MAF) have been identified.9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 However, the full allelic spectrum (e.g., lower frequency non-coding variants, indels, structural variants) that explain the genetic architecture of complex traits remains incomplete.9 In addition, non-European populations (including admixed U.S. minority populations such as African Americans and Hispanics/Latinos) have been under-represented in these studies. Since RBCs play a key role in pathogen invasion and defense, associated quantitative trait loci may be relatively isolated to a particular ancestral population due to local evolutionary selective pressures and population history. Emerging studies with greater inclusion of East Asian, African, and Hispanic ancestry populations have identified ancestry-specific variants associated with RBC quantitative traits.15, 16, 17,20,21 These may account, at least in part, for inter-population differences in RBC indices as well as ethnic disparities in rates of hematologic and other related chronic diseases.18,22
Whole-genome sequencing (WGS) data have been generated through the NHLBI Trans-Omics for Precision Medicine (TOPMed) program in very large and ethnically diverse population samples with existing hematologic laboratory measures. These TOPMed WGS data provide novel opportunities to assess rare and common single-nucleotide and indel variants across the genome, including variants more common in African, East Asian, or Native American ancestry individuals that are not captured by existing GWAS arrays or imputation reference panels. We thereby aimed to identify previously undescribed genetic variants and genes associated with the seven RBC indices and to dissect association signals at previously reported regions through conditional analysis and fine-mapping.
Subjects and methods
TOPMed study population
The analyses reported here included 62,653 participants from 13 TOPMed studies: Genetics of Cardiometabolic Health in the Amish (Amish, n = 1,102), Atherosclerosis Risk in Communities Study VTE cohort (ARIC, n = 8,118), Mount Sinai BioMe Biobank (BioMe, n = 10,993), Coronary Artery Risk Development in Young Adults (CARDIA, n = 3,042), Cardiovascular Health Study (CHS, n = 3,490), Genetic Epidemiology of COPD Study (COPDGene, n = 5,794), Framingham Heart Study (FHS, n = 3,141), Genetic Studies of Atherosclerosis Risk (GeneSTAR, n = 1,713), Hispanic Community Health Study - Study of Latinos (HCHS_SOL, n = 7,655), Jackson Heart Study (JHS, n = 3,033), Multi-Ethnic Study of Atherosclerosis (MESA, n = 2,499), Whole Genome Sequencing to Identify Causal Genetic Variants Influencing CVD Risk - San Antonio Family Studies (SAFS, n = 1,153), and Women’s Health Initiative (WHI, n = 10,920). The composition of the 62,653 participants by race/ethnicity is 54% white, 23% Black, 22% Hispanic/Latino, and 1% Asian (see Table S1 and supplemental methods for details). Further descriptions of the design of the participating TOPMed cohorts and the sampling of individuals within each cohort for TOPMed WGS are provided in the section “Participating studies” in the supplemental methods. We analyzed each of seven red blood cell traits separately, accounting for any unique sampling features within each study. The total counts of participants, mean age, and the count of male participants from each study stratified by trait are shown in Table 1. All studies were approved by the appropriate institutional review boards (IRBs), and informed consent was obtained from all participants.
Table 1.
Study | N (male) | Age (years) | HCT (%) | HGB (g/dL) | MCH (pg) | MCHC (g/dL) | MCV (fL) | RBC (10^6/μL) | RDW (%) |
---|---|---|---|---|---|---|---|---|---|
Amish | 1,102 (557) | 50.6 ± 16.9 | 40.6 ± 3.5 | 13.8 ± 1.2 | 30.9 ± 1.3 | 34.1 ± 0.8 | 90.7 ± 3.4 | 4.5 ± 0.4 | – |
ARIC | 8,113 (3,577) | 54.8 ± 5.8 | 41.6 ± 4.0 | 13.9 ± 1.4 | 30.5 ± 2.1 | 33.3 ± 1.0 | 89.6 ± 5.1 | 4.5 ± 0.5 | 14.1 ± 1.1 |
BioMe | 10,990 (4,559) | 52.1 ± 13.5 | 39.5 ± 5.2 | 13.1 ± 1.7 | 30.3 ± 2.8 | 33.7 ± 1.0 | 89.0 ± 7.2 | 4.4 ± 0.6 | 14.2 ± 1.8 |
CARDIA | 3,042 (1,319) | 25.0 ± 3.6 | 42.1 ± 4.4 | 14.2 ± 1.5 | 29.8 ± 2.1 | 33.8 ± 1.0 | 88.1 ± 5.4 | 4.8 ± 0.5 | – |
CHS | 3,490 (1,459) | 72.6 ± 5.4 | 41.8 ± 3.9 | 14.0 ± 1.3 | – | 33.5 ± 1.0 | – | – | – |
COPDGene | 5,794 (2,913) | 64.8 ± 8.8 | 42.0 ± 4.1 | 13.9 ± 1.5 | 30.3 ± 2.3 | 33.2 ± 1.1 | 91.4 ± 5.8 | 4.6 ± 0.5 | – |
FHS | 3,140 (1,514) | 58.4 ± 15.0 | 41.6 ± 4.0 | 14.1 ± 1.3 | 31.1 ± 1.8 | 33.9 ± 1.0 | 91.9 ± 4.9 | 4.5 ± 0.5 | 13.1 ± 1.0 |
GeneSTAR | 1,713 (699) | 43.7 ± 12.9 | 40.9 ± 3.9 | 13.5 ± 1.4 | 29.6 ± 2.1 | 33.0 ± 0.8 | 89.5 ± 5.3 | 4.6 ± 0.4 | – |
HCHS/SOL | 7,655 (3,186) | 46.6 ± 14.0 | 42.1 ± 4.1 | 13.8 ± 1.5 | 29.1 ± 2.2 | 32.7 ± 1.4 | 89.2 ± 6.0 | 4.7 ± 0.4 | 13.8 ± 1.3 |
JHS | 2,905 (1,089) | 53.5 ± 12.8 | 39.4 ± 4.3 | 13.1 ± 1.5 | 28.9 ± 2.5 | 33.2 ± 0.9 | 86.9 ± 6.3 | 4.5 ± 0.5 | 13.7 ± 1.4 |
MESA | 2,499 (1,211) | 69.4 ± 9.2 | 40.1 ± 4.0 | 13.4 ± 1.4 | 30.1 ± 2.3 | 33.4 ± 1.1 | 89.9 ± 6.0 | 4.5 ± 0.5 | – |
SAFS | 1,152 (492) | 40.6 ± 15.9 | 40.3 ± 4.5 | 13.1 ± 1.5 | 29.0 ± 2.3 | 32.6 ± 1.4 | 88.9 ± 5.4 | 4.5 ± 0.5 | – |
WHI | 10,913 (0) | 66.7 ± 6.8 | 40.2 ± 2.9 | 13.5 ± 1.0 | 29.9 ± 2.1 | 32.9 ± 1.1 | 90.9 ± 5.8 | 4.4 ± 0.4 | 14.2 ± 1.3 |
Values are shown as mean ± SD. Abbreviations are as follows: HCT, hematocrit; HGB, hemoglobin; MCH, mean corpuscular hemoglobin; MCHC, mean corpuscular hemoglobin concentration; MCV, mean corpuscular volume; RBC, red blood cell count; RDW, red blood cell width.
RBC trait measurements and exclusion criteria in TOPMed
The seven RBC traits considered for analyses were measured from freshly collected whole blood samples at local clinical laboratories using automated hematology analyzers calibrated to manufacturer recommendations according to clinical laboratory standards. Each trait was defined as follows. HCT is the percentage of volume of blood that is composed of red blood cells. HGB is the mass per volume (grams per deciliter) of hemoglobin in the blood. MCH is the average mass in picograms of hemoglobin per red blood cell. MCHC is the average mass concentration (grams per deciliter) of hemoglobin per red blood cell. MCV is the average volume of red blood cells, measured in femtoliters. RBC count is the count of red blood cells in the blood, by number concentration in millions per microliter. RDW is the measurement of the ratio of variation in width to the mean width of the red blood cell volume distribution curve taken at ±1 CV. In studies where multiple blood cell measurements per participant were available, we selected a single measurement for each trait and each participant as described further in supplemental methods. Each trait was analyzed to identify extreme values that may have been measurement or recording errors and such observations were removed from the analysis (see supplemental methods). Table 1 displays the mean and standard deviation among participants analyzed after exclusions by study. The pairwise correlation among the seven RBC traits is shown in Table S2.
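The trait definitions above imply simple arithmetic relationships among the indices. The following is a minimal sketch of those standard relationships, not a description of how any study's analyzer reported its values (analyzers may measure some indices directly). The function name `derived_indices` and the input values are illustrative assumptions; the inputs loosely follow the Amish row of Table 1.

```python
def derived_indices(hgb_g_dl: float, hct_pct: float, rbc_million_per_ul: float) -> dict:
    """Derive MCV (fL), MCH (pg), and MCHC (g/dL) from HGB, HCT, and RBC count.

    Inputs use the units given in the text: HGB in g/dL, HCT in percent,
    RBC count in millions of cells per microliter.
    """
    if hct_pct <= 0 or rbc_million_per_ul <= 0:
        raise ValueError("HCT and RBC count must be positive")
    # Packed cell volume divided by cell count gives mean cell volume.
    mcv_fl = hct_pct / rbc_million_per_ul * 10.0
    # Hemoglobin mass divided by cell count gives mean hemoglobin per cell.
    mch_pg = hgb_g_dl / rbc_million_per_ul * 10.0
    # Hemoglobin mass divided by packed cell volume gives concentration per cell.
    mchc_g_dl = hgb_g_dl / hct_pct * 100.0
    return {"MCV": mcv_fl, "MCH": mch_pg, "MCHC": mchc_g_dl}


# Illustrative inputs only (close to the Amish cohort means in Table 1).
example = derived_indices(hgb_g_dl=13.8, hct_pct=40.6, rbc_million_per_ul=4.5)
```

Because a ratio of cohort means is not the mean of per-participant ratios, the outputs are expected to be close to, but not identical with, the tabulated MCV, MCH, and MCHC means.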
WGS data and quality control in TOPMed
WGS was performed as part of the NHLBI TOPMed program. The WGS was performed at an average depth of 38 × by six sequencing centers (Broad Genomics, Northwest Genome Institute, Illumina, New York Genome Center, Baylor, and McDonnell Genome Institute) using Illumina X10 technology and DNA from blood. Here we report analyses from “Freeze 8,” for which reads were aligned to human-genome build GRCh38 using a common pipeline across all centers. To perform variant quality control (QC), a support vector machine (SVM) classifier was trained on known variant sites (positive labels) and Mendelian inconsistent variants (negative labels). Further variant filtering was done for variants with excess heterozygosity and Mendelian discordance. Sample QC measures included: concordance between annotated and inferred genetic sex, concordance between prior array genotype data and TOPMed WGS data, and pedigree checks. Details regarding the genotype “freezes,” laboratory methods, data processing, and quality control are described on the TOPMed website and in a common document accompanying each study’s dbGaP accession.23 Genomic coordinates of variants presented here are based on the GRCh38 build.
Single-variant association analysis
Single-variant association tests were performed for each of the seven RBC traits separately using linear mixed models (LMMs). In each case, a model assuming no association between the outcome and any genetic variant was first fit; we refer to this as the “null model.” In the null model, covariates modeled as fixed effects were sex; age at trait measurement; a variable indicating TOPMed study and phase of genotyping (study_phase); indicators of whether the participant is known to have had a stroke, chronic obstructive pulmonary disease (COPD), or a venous thromboembolism (VTE) event; and the first 11 PC-AiR24 principal components (PCs) of genetic ancestry. A 4th degree sparse empirical kinship matrix (KM) computed with PC-Relate25 was included to account for genetic relatedness among participants. Additional details on the computation of the ancestry PCs and the sparse KM are provided in the supplemental methods. Finally, we allowed for heterogeneous residual variances by study and ancestry group (e.g., ARIC_White), as this has been shown previously to control inflation.26 The details on how we estimated the ancestry group for this adjustment are in the supplemental methods. The numbers of individuals per ancestry group per study and the respective mean and standard deviation for each trait are shown in Table S3.
To improve power and control of false positives when phenotypes have a non-normal distribution, we implemented a fully adjusted two-stage procedure for rank-normalization when fitting the null model, for each of the seven RBC traits in turn:27
1. Fit a LMM, with the fixed effect covariates, sparse KM, and heterogeneous residual variance model as described above. Perform a rank-based inverse-normal transformation of the marginal residuals, and subsequently rescale by their variance prior to transformation. This rescaling allows for clearer interpretation of estimated genotype effect sizes from the subsequent association tests.
2. Fit a second LMM using the rank-normalized and re-scaled residuals as the outcome, with the same fixed effect covariates, sparse KM, and heterogeneous residual variance model as in stage 1.
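The transformation in stage 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the GENESIS implementation: the rank offset of 0.5 is an assumed choice, "rescale by their variance" is interpreted here as multiplying the normal scores by the standard deviation of the original residuals, and the function names are hypothetical.

```python
from statistics import NormalDist, pstdev


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def rank_normalize_rescaled(residuals, offset=0.5):
    """Rank-based inverse-normal transform, rescaled toward the original spread.

    Assumption: the offset (0.5) and the rescaling by the original standard
    deviation are illustrative choices, not taken from the source.
    """
    n = len(residuals)
    normal = NormalDist()
    ranks = average_ranks(residuals)
    # Map each rank to a quantile strictly inside (0, 1), then to a normal score.
    scores = [normal.inv_cdf((r - offset) / (n - 2.0 * offset + 1.0)) for r in ranks]
    original_sd = pstdev(residuals)
    return [original_sd * z for z in scores]


# Toy residuals with one tie and one outlier.
toy_residuals = [0.3, -1.2, 2.5, 0.3, 7.9]
toy_transformed = rank_normalize_rescaled(toy_residuals)
```

The transform keeps the ordering of the residuals while pulling in the outlier, and the rescaling keeps effect sizes on roughly the original trait scale.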
The output of the stage 2 null model was then used to perform genome-wide score tests of genetic association for all individual variants with minor allele count (MAC) ≥ 5 that passed the TOPMed variant quality filters and had less than 10% of samples freeze-wide with sequencing read depth < 10 at that particular variant. We tested up to 102,674,666 SNVs and 7,722,116 indels (Table S4). Genome-wide significance was determined at the p < 5E−9 level.28 For each locus, we defined the top variant as the most significant variant within a 2 Mb window. All association analyses were performed using the GENESIS software.29
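The selection of a top variant per locus can be illustrated with a short sketch. The greedy strategy, the symmetric window of 1 Mb on each side of a lead variant, the function name `lead_variants`, and the toy records (placeholder identifiers and p values) are all assumptions made for illustration; the source states only the significance threshold and the 2 Mb window.

```python
GENOME_WIDE_P = 5e-9      # significance threshold stated in the text
WINDOW_BP = 2_000_000     # window size stated in the text


def lead_variants(results, p_threshold=GENOME_WIDE_P, window_bp=WINDOW_BP):
    """Pick the most significant variant per window from association results.

    results: iterable of (chrom, pos, p_value, variant_id) tuples.
    Assumption: a greedy pass from smallest to largest p value, skipping any
    variant within window_bp / 2 of an already selected lead variant.
    """
    significant = sorted(
        (r for r in results if r[2] < p_threshold), key=lambda r: r[2]
    )
    half_window = window_bp // 2
    leads = []
    for chrom, pos, p_value, variant_id in significant:
        near_existing_lead = any(
            chrom == lead_chrom and abs(pos - lead_pos) <= half_window
            for lead_chrom, lead_pos, _, _ in leads
        )
        if not near_existing_lead:
            leads.append((chrom, pos, p_value, variant_id))
    return leads


# Toy records with placeholder identifiers and p values.
toy_results = [
    ("11", 5_226_774, 1e-21, "varA"),
    ("11", 5_227_100, 1e-15, "varB"),     # same window as varA, less significant
    ("11", 70_462_791, 1e-10, "varC"),    # same chromosome, distant locus
    ("3", 128_603_774, 1e-6, "varD"),     # does not reach the threshold
]
toy_leads = [record[3] for record in lead_variants(toy_results)]
```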
Conditional analysis
Because of the very large number of variants and genomic loci that have recently been associated with quantitative RBC traits, following the single-variant association analyses, we systematically performed a series of conditional association analyses for each trait to determine which genome-wide significant associations were independent of previously reported RBC variants. We gathered the variants known to be associated with each phenotype from previous publications (Table S5) and matched these to TOPMed variants using position and alleles. Then, genome-wide conditional association analyses were performed by including known variants as fixed effects covariates in the null model using the same fully adjusted two-stage LMM association testing procedure described above. We performed three types of conditional analysis, namely the trait-specific, the trait-agnostic, and the iterative, stepwise conditional analysis to identify a set of conditionally independent variants that have not been previously reported (supplemental methods).
Single-variant association analysis of chromosome 16
The alpha-globin gene region on chromosome 16p13.3 contains a large, 3.7 kb structural variant (esv3637548, chr16: 173,529–177,641) common among African ancestry individuals known to be highly significantly associated with all RBC traits.15,18 This large copy number variant is not well-tagged by SNVs in the region. Therefore, we performed genotype calling for the alpha-globin 3.7 kb CNV in 52,772 available TOPMed whole genomes using MosDepth.30 Since the chromosome 16 alpha-globin CNV calls were available for only a subset of the samples in the primary analyses, to assess the effect of conditioning on the alpha-globin CNV, the same set of analyses described above were run for chromosome 16 restricted to the sample set with alpha-globin CNV calls. The most probable alpha-globin copy number was included as a categorical variable to allow for potential non-linear effects on the phenotype.
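Treating copy number as a categorical covariate means that each non-reference copy-number level receives its own indicator column, so no linear dose-response is imposed. A minimal sketch of such indicator coding is shown below; the function name, the choice of reference level, and the toy copy-number calls are hypothetical.

```python
def dummy_code_copy_number(copy_numbers, reference=None):
    """Encode copy-number calls as indicator columns for non-reference levels.

    copy_numbers: list of integer copy-number calls, one per participant.
    reference: level to omit; defaults to the most frequent level (assumption).
    Returns (non_reference_levels, rows), where each row has one 0/1 entry
    per non-reference level.
    """
    levels = sorted(set(copy_numbers))
    if reference is None:
        reference = max(levels, key=copy_numbers.count)
    if reference not in levels:
        raise ValueError("reference level not present in the data")
    non_reference = [level for level in levels if level != reference]
    rows = [
        [1 if call == level else 0 for level in non_reference]
        for call in copy_numbers
    ]
    return non_reference, rows


# Hypothetical calls: 4 copies as the reference, 3 and 2 as deletion states.
toy_calls = [4, 4, 3, 2, 4, 3]
toy_levels, toy_rows = dummy_code_copy_number(toy_calls, reference=4)
```

Each indicator column then enters the null model as a separate fixed effect, which allows the effect of losing two copies to differ from twice the effect of losing one.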
Proportion of variance explained
For each trait, we estimated the proportion of variance explained (PVE) by the set of LD-pruned known associated variants, by the final set of conditionally independent variants we identified following the iterative stepwise conditional analysis, and by both sets together. These cumulative PVE values were estimated jointly from the stage 2 null model using approximations from multi-parameter score tests, thus accounting for covariance between the variant effect size estimates. The PVE estimates were calculated using the full sample set and did not include the alpha-globin CNV as a known variant but did include the set of conditionally independent SNVs and indels identified on chromosome 16 after conditioning on the alpha-globin CNV. More details are provided in the supplemental methods.
Replication studies for single-variant association findings
We sought replication of the lead variants at genome-wide significant loci identified in the trait-specific conditional analysis in independent studies including the INTERVAL study, the Kaiser-Permanente Genetic Epidemiology Research on Aging (GERA) cohort, samples from the Women’s Health Initiative - SNP Health Association Resource (WHI-SHARe)31 not included in TOPMed, European ancestry samples from phase 1 of the UK Biobank (UKBB),9 and African and East Asian ancestry samples from phase 2 of UKBB.21 WGS data were used in INTERVAL, while genotyping on various arrays and imputation to TOPMed WGS data or 1000 Genomes Phase 3 reference panels were performed in GERA, WHI-SHARe, and UKBB. Residuals were obtained by regressing the harmonized RBC traits on age, sex, and the first 10 PCs in each study stratified by ancestry, followed by association analyses testing each genetic variant with the inverse-normalized residual values. Summary statistics from each study were combined through fixed-effect inverse-variance-weighted meta-analysis using METAL.32
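The combination of per-study summary statistics can be sketched with the standard fixed-effect formula, assuming inverse-variance weights. This is an illustration of the general technique rather than of METAL itself; the function name and the example effect sizes are hypothetical.

```python
import math
from statistics import NormalDist


def fixed_effect_meta(estimates):
    """Fixed-effect meta-analysis with inverse-variance weights (assumed scheme).

    estimates: list of (beta, standard_error) pairs, one per study.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not estimates:
        raise ValueError("at least one study is required")
    weights = [1.0 / (se * se) for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p value from the standard normal distribution.
    p_value = 2.0 * NormalDist().cdf(-abs(z))
    return pooled_beta, pooled_se, z, p_value


# Hypothetical effect sizes and standard errors from two replication studies.
toy_beta, toy_se, toy_z, toy_p = fixed_effect_meta([(0.5, 0.1), (0.3, 0.2)])
```

Studies with smaller standard errors receive proportionally more weight, so the pooled estimate here lies closer to the first study.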
Aggregate variant association analysis of rare variants within each gene
Association tests aggregating rare variants by gene were performed for each RBC trait in order to assess the cumulative effect of rare variants within each gene and associated regulatory regions. We applied five strategies for grouping and filtering variants. Three of them aggregated coding variants and two of them aggregated coding and non-coding regulatory variants. For each aggregation strategy we filtered variants using one or more deleterious prediction scores creating relatively relaxed or stringent sets of variants (see details in supplemental methods). The five strategies are referred to as C1-S, C1-R, C2-R, C2-R+NC-S, and C2-R+NC-R by abbreviating coding to “C,” non-coding to “NC,” stringent to “S,” and relaxed to “R.” For all aggregate units, only variants with MAF < 0.01 that passed the quality filters and had less than 10% of samples with sequencing read depth < 10 were considered. The aggregate association tests were performed using the Efficient Variant-Set Mixed Model Association Test (SMMAT).33 The SMMAT test used the same fully adjusted two-stage null model as was fit for the single variant association tests, therefore adjusting for the same covariates, kinship, and residual variance structure as the single variant association analyses. For each aggregation unit, SMMAT efficiently combines a burden test p value with an asymptotically independent adjusted “SKAT-type” test p value using Fisher’s method. This testing approach is more powerful than either a burden or SKAT34 test alone and is computationally more efficient than the SKAT-O test.35 Wu weights34 based on the variant MAF were used to upweight rarer variants in the aggregation units. Significance was determined using a Bonferroni threshold, adjusting for the number of gene-based aggregation units tested genome-wide with cumulative MAC ≥ 5. 
Two types of conditional analysis were run (“trait-specific” and “trait-agnostic”), conditioning on previously reported RBC trait-associated variants as well as those discovered in the TOPMed single variant tests (Table S5). In addition, any previously reported RBC trait-associated variants and the set of conditionally independent variants identified in our single variant analyses were excluded from the gene-based aggregation units.
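Two ingredients of the aggregate test described above lend themselves to a short sketch: combining two independent p values with Fisher's method, and weighting variants by MAF. The code does not compute the burden or SKAT-type p values themselves and is not the SMMAT implementation. The Beta(1, 25) density used for the weights is an assumption (a commonly used default for such weights), and the function names are hypothetical.

```python
import math


def fisher_combine_two(p_burden, p_skat):
    """Combine two independent p values with Fisher's method.

    The statistic -2 * (ln p1 + ln p2) follows a chi-square distribution with
    4 degrees of freedom, whose survival function is exp(-x/2) * (1 + x/2).
    """
    for p in (p_burden, p_skat):
        if not 0.0 < p <= 1.0:
            raise ValueError("p values must lie in (0, 1]")
    x = -2.0 * (math.log(p_burden) + math.log(p_skat))
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


def beta_density_weight(maf, a=1.0, b=25.0):
    """MAF-based variant weight from a Beta(a, b) density (parameters assumed)."""
    if not 0.0 < maf < 1.0:
        raise ValueError("MAF must lie strictly between 0 and 1")
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(maf) + (b - 1.0) * math.log(1.0 - maf))


combined_p = fisher_combine_two(0.05, 0.05)
weight_rare = beta_density_weight(0.001)
weight_less_rare = beta_density_weight(0.01)
```

With these parameters the weight decreases as MAF increases, so the rarest variants in an aggregation unit contribute most to the test statistic.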
Predicted loss-of-function variants and predicted gene knockouts and their association with RBC traits
Our analyses of predicted loss-of-function (pLoF) variants in TOPMed freeze 8 focused on variants annotated by ENSEMBL’s Variant Effect Predictor (VEP) as nonsense, essential splice site, and frameshift insertion-deletion (indel) variants. From this list, we excluded variants that map to predicted transcripts36 and also variants located in the first and last 5% of the gene as these variants are more likely to give rise to transcripts that escape nonsense-mediated mRNA decay.37 We used a method previously described to identify predicted gene knockouts (pKO).38 Briefly, we considered individuals that were homozygotes for LoF variants, but also individuals who inherited two different LoF variants in trans using available phased information (compound heterozygotes).
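The pKO definition above reduces to asking whether both haplotypes of a gene carry at least one LoF allele. A minimal sketch follows; the data layout, the function names, and the use of genomic span to define the first and last 5% of a gene are assumptions made for illustration, not details taken from the cited method.

```python
def in_gene_interior(pos, gene_start, gene_end, trim_fraction=0.05):
    """Return True if pos lies outside the first and last 5% of the gene.

    Assumption: the 5% margins are measured on the genomic span of the gene.
    """
    length = gene_end - gene_start
    lower = gene_start + trim_fraction * length
    upper = gene_end - trim_fraction * length
    return lower <= pos <= upper


def is_predicted_knockout(phased_lof_genotypes):
    """Classify one individual for one gene as a predicted knockout.

    phased_lof_genotypes: list of (hap1_allele, hap2_allele) pairs, one per
    retained pLoF variant, where 1 marks the LoF allele. A homozygote has
    (1, 1) at a single variant; a compound heterozygote carries LoF alleles
    at different variants on opposite haplotypes (in trans).
    """
    hit_on_hap1 = any(h1 == 1 for h1, _ in phased_lof_genotypes)
    hit_on_hap2 = any(h2 == 1 for _, h2 in phased_lof_genotypes)
    return hit_on_hap1 and hit_on_hap2


homozygote = is_predicted_knockout([(1, 1)])
compound_het_trans = is_predicted_knockout([(1, 0), (0, 1)])
two_lof_in_cis = is_predicted_knockout([(1, 0), (1, 0)])
single_carrier = is_predicted_knockout([(0, 1)])
```

Two LoF alleles on the same haplotype leave the other copy of the gene intact, which is why phase information is needed to separate the in-cis case from a true compound heterozygote.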
We analyzed each study-ethnic group separately, adjusting for sex, age, and smoking status. We then normalized the residuals within each group using inverse normal transformation. We performed association testing per ethnic group with EPACTS. We adjusted all analyses using the first ten PCs and a kinship matrix (EMMAX) calculated using 150,000 common variants in LD. For pLoF, we tested an additive genetic model. For pKO, we coded individuals as “0” if they were not a pKO and as “1” if they were a pKO. We meta-analyzed association results using METAL.32 We excluded variants located in the alpha-globin region in self-reported African-ancestry individuals. The genome-wide significance threshold for each ancestral group was defined as p < 0.05/number of variants. Sensitivity analyses testing hemoglobin levels with LoF variants on chromosome 11 showed that adjustment for smoking status had minimal impact on the association results (Pearson’s correlation of p values > 0.99).
Lentivirus packaging
HEK293T cells (ATCC, cat# CRL-3216) were cultured with DMEM with 10% fetal bovine serum and 1% penicillin-streptomycin solution (10,000 U/mL stock). To produce lentivirus, HEK293T cells were transfected at 70%–80% confluence with 13.3 μg psPAX2, 6.7 μg VSV-G, and 20 μg of the lentiviral construct plasmid of interest using 180 μg of linear polyethylenimine in 15 cm tissue culture dishes. Lentiviral supernatant was collected at both 48 h and 72 h post-transfection and concentrated by ultracentrifugation at 24,000 rpm for 4 h at 4°C with a Beckman Coulter SW 32 Ti rotor.
HUDEP-2 cell and human CD34+ hematopoietic stem and progenitor cells (HSPCs) culture
HUDEP-2 cells39 were generously shared by Ryo Kurita (Japanese Red Cross) and Yukio Nakamura (RIKEN BioResource Research Center, University of Tsukuba, Japan) and cultured as previously described.40 Expansion phase medium for HUDEP-2 cells consists of SFEM (StemCell Technologies, Inc. #09650) base medium supplemented with 50 ng/mL recombinant human SCF (R&D systems #255-SC), 1 μg/mL doxycycline (Sigma Aldrich #D9891), 0.4 μg/mL dexamethasone (Sigma Aldrich #D4902), 3 IU/mL EPO (Epoetin Alfa, Epogen, Amgen), and 1% penicillin-streptomycin solution (10,000 U/mL stock). Human CD34+ HSPCs from mobilized peripheral blood of deidentified healthy donors were obtained from Fred Hutchinson Cancer Research Center, Seattle, Washington. CD34+ cells were maintained in SFEM supplemented with 1× StemSpan CD34+ expansion supplement (Cat# 02691, STEMCELL Technology).
Generation of AncBE4max-SpRY-expressing stable HUDEP-2 cell lines
The lentiviral plasmid for AncBE4max-SpRY41 was generated by subcloning the coding sequence of nSpRY(D10A) into the AgeI and XcmI restriction sites of pRDA_257 (pLenti-BPNLS-AncBE4-gsXTENgs-nSpCas9-gs-UGI-gs-BPNLS-P2A-Puro), generously provided by John Doench (Broad Institute). Lentivirus was produced as described above. HUDEP-2 cells were transduced with lentivirus, and 1 μg/mL puromycin was added into culture medium 2 days after lentiviral transduction. After 2-week positive selection, AncBE4max-SpRY editing efficiency was tested using multiple sgRNAs with variable PAM sequence.
C-to-T base editing at the rs112097551 locus in HUDEP-2 cells
The sequence of the single-guide RNA targeting rs112097551 (chr3:128,603,774, GenBank: NC_000003.12, g.128603774G>A) is given in Table S6. Oligos (GENEWIZ) were annealed and ligated into LentiGuide-Puro (Addgene plasmid 52963). Following lentiviral production and transduction into cell lines with stable SpCas9 expression, 1 μg/mL puromycin was added to select for sgRNA integrants in HUDEP-2 cells expressing AncBE4max-SpRY. C-to-T editing efficiency was determined in bulk cells 10 days after lentiviral delivery into AncBE4max-SpRY-expressing HUDEP-2 cells (Figure S1). Briefly, genomic DNA was extracted using the QIAGEN Blood and Tissue kit. The genomic region surrounding the sgRNA target site was amplified using HotStarTaq DNA polymerase (QIAGEN, Cat# 203203), strictly following the manufacturer’s instructions, with variable annealing temperature. PCR products were subjected to Sanger sequencing and then EditR analysis to estimate the editing efficiency based on sequencing chromatograms.42 Single HUDEP-2 cells were plated to obtain highly edited clones. Primers for PCR are summarized in Table S7.
CRISPR-Cas genome editing in CD34+ HSPCs
CD34+ cells were thawed and maintained in SFEM supplemented with 1× StemSpan CD34+ expansion supplement (Cat# 02691, STEMCELL Technology) for 24 h before electroporation. 100,000 cells per condition were electroporated using the Lonza 4D nucleofector with 100 pmol 3xNLS-SpCas943 protein and 300 pmol modified sgRNA targeting the locus of interest. In addition to mock treated cells, “safe-targeting” RNPs were used as experimental controls as indicated in each figure legend. After electroporation, cells were differentiated to erythroblasts as described previously.44 Four days after electroporation, genomic DNA was isolated from an aliquot of cells, and the sgRNA-targeted locus was amplified by PCR. PCR products were subjected to Sanger sequencing and then TIDE analysis to quantify indel mutations.45 Meanwhile, total RNA was extracted from bulk cells and expression of genes of interest was determined by real time RT-qPCR as described below.
Determination of target gene expression
Total RNA was extracted from cell cultures 4 days after electroporation using the RNeasy Plus Mini Kit (QIAGEN) and reverse transcribed using the iScript cDNA synthesis kit (Biorad) according to the manufacturer’s instructions. Expression of target genes was quantified using real-time RT-qPCR with GAPDH (MIM: 138400) as an internal control. All gene expression data represent the mean of at least three biological replicates. Primers for PCR are summarized in Table S7.
Immunophenotyping of human CD34+ HSPCs xenograft from NBSGW mice
NOD.Cg-KitW-41J Tyr + Prkdcscid Il2rgtm1Wjl (NBSGW) mice were obtained from Jackson Laboratory (Stock 026622). CD34+ HSPCs were maintained and edited as described above. After electroporation, cells were allowed to recover for 24–48 h in SFEM medium with 1× StemSpan CD34+ expansion supplement (Cat# 02691, STEMCELL Technology). Cells were then washed twice with PBS, resuspended in 200 μL DPBS per million cells, and infused by retro-orbital injection into non-irradiated NBSGW female mice. Sixteen weeks post transplantation, mice were euthanized, and bone marrow was collected and analyzed as previously described.45 Analysis of bone marrow subpopulations was performed by flow cytometry. Antibodies for flow cytometry included Human TruStainFcX (422302, BioLegend), TruStainfcX (anti-mouse CD16/32, 101320, BioLegend), anti-mouse CD45 (30-F11), anti-human CD45 (HI30), and Fixable Viability Dye eFluor 780 for live/dead staining (65-0865-14, Thermo Fisher). Percentage human engraftment was calculated as hCD45+ cells/(hCD45+ + mCD45+ cells). Cell sorting was performed on a FACSAria II machine (BD Biosciences).
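The engraftment calculation stated above amounts to a single ratio, sketched here for concreteness. The function name and the example event counts are hypothetical.

```python
def percent_human_engraftment(h_cd45_pos: int, m_cd45_pos: int) -> float:
    """Percentage of CD45+ events that are human: hCD45+ / (hCD45+ + mCD45+)."""
    total = h_cd45_pos + m_cd45_pos
    if h_cd45_pos < 0 or m_cd45_pos < 0 or total == 0:
        raise ValueError("counts must be non-negative with a positive total")
    return 100.0 * h_cd45_pos / total


# Hypothetical flow cytometry event counts from one bone marrow sample.
toy_engraftment = percent_human_engraftment(h_cd45_pos=9000, m_cd45_pos=1000)
```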
Results
Single-variant association analysis
In the single-variant association analyses, the genomic inflation factors ranged from 1.015 to 1.038, indicating adequate control of population stratification and relatedness (Table S8). A total of 69 loci reached genome-wide significance for any of the seven RBC traits (p < 5E−9, Figure S2 and Table S9). Of the 69 loci, 9 (HBB, HBA1, RPN1, ELL2, EIF5-MARK3, MIDN, PIEZO1, TMPRSS6, and G6PD [MIM: 141900, 141800, 180470, 601874, 601710, 606700, 611184, 609862, 305900, respectively]) remained significant in the conditional analysis after accounting for RBC trait-specific known loci. In addition, three more loci reached genome-wide significance following RBC trait-specific conditional analysis (19q12, 10q26, and SHANK2 [MIM: 603290], p < 5E−9, Figure S3). Therefore, a total of 12 loci showed genome-wide significance for association with at least one of the seven RBC traits in the trait-specific conditional analysis, indicating signals independent of previously reported variants (p < 5E−9) (Figure S4, Table 2).
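For reference, a genomic inflation factor is conventionally computed as the median observed 1-df chi-square statistic divided by the median expected under the null. The sketch below uses that standard definition; it may differ in detail from the computation behind Table S8. The function name and the toy p values are hypothetical.

```python
from statistics import NormalDist, median

_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 degree of freedom (about 0.455).
CHI2_1DF_MEDIAN = _NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Genomic inflation factor from two-sided p values of 1-df tests."""
    chi2_stats = []
    for p in p_values:
        if not 0.0 < p <= 1.0:
            raise ValueError("p values must lie in (0, 1]")
        # Convert each p value back to its 1-df chi-square statistic.
        chi2_stats.append(_NORMAL.inv_cdf(p / 2.0) ** 2)
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Evenly spaced p values mimic a test with no inflation (median p of 0.5).
toy_p_values = [(i - 0.5) / 1001 for i in range(1, 1002)]
toy_lambda = genomic_inflation(toy_p_values)
```

Values near 1, such as the 1.015 to 1.038 range reported above, indicate that the bulk of the test statistics follow the null distribution.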
Table 2.
Trait | Variant | Chr:Pos (GRCh38) | Gene | CA/NCA | CAF (%) | N | Beta | SE | p | p_conditional1 (a) | p_conditional2 (b) |
---|---|---|---|---|---|---|---|---|---|---|---|
HCT | rs11549407 | 11: 5,226,774 | HBB | A/G | 0.026 | 62,487 | −4.94 | 0.67 | 1.68E−13 | 3.43E−13 | 1.55E−12 |
HGB | rs11549407 | 11: 5,226,774 | HBB | A/G | 0.026 | 62,461 | −2.14 | 0.23 | 2.86E−21 | 4.76E−21 | 1.75E−20 |
HGB | rs1368500441 | 19: 28,868,893 | 19q12 | A/G | 0.005 | 62,461 | 2.65 | 0.46 | 1.02E−8 | 2.49E−9 | 6.64E−8 |
MCH | rs112097551 | 3: 128,603,774 | RPN1 | A/G | 0.398 | 62,461 | 0.78 | 0.12 | 4.01E−10 | 4.27E−11 | 4.08E−10 |
MCH | rs116635225 | 5: 95,989,447 | ELL2 | A/G | 1.307 | 46,241 | −0.43 | 0.07 | 3.37E−9 | 1.18E−11 | 2.58E−11 |
MCH | rs986415672 | 10: 131,440,166 | 10q26 | T/C | 0.006 | 46,241 | −4.26 | 0.82 | 2.16E−7 | 3.06E−9 | 2.49E−9 |
MCH | rs34598529 | 11: 5,227,100 | HBB | C/T | 0.083 | 46,241 | −4.31 | 0.29 | 1.06E−49 | 1.37E−52 | 1.03E−53 |
MCH | rs535577177 | 11: 70,462,791 | SHANK2 | A/G | 0.008 | 46,241 | −4.72 | 0.82 | 1.04E−8 | 8.28E−10 | 3.38E−9 |
MCH | rs370308370 | 14: 103,044,696 | EIF5/MARK3 | A/G | 0.011 | 46,241 | −4.35 | 0.74 | 3.15E−9 | 1.42E−9 | 5.49E−9 |
MCH | rs868351380 | 16: 55,649 | HBA1/2 | C/G | 0.022 | 37,917 | −3.19 | 0.51 | 4.85E−10 | 8.87E−11 | 1.49E−11 |
MCH | rs73494666 | 19: 1,253,643 | MIDN | T/C | 16.5 | 46,241 | −0.16 | 0.03 | 1.11E−9 | 4.27E−11 | 9.00E−9 |
MCH | rs228914 | 22: 37,108,472 | TMPRSS6 | A/C | 89.0 | 46,241 | −0.09 | 0.02 | 3.76E−5 | 6.53E−10 | 2.76E−8 |
MCHC | rs11549407 | 11: 5,226,774 | HBB | A/G | 0.028 | 52,648 | −1.79 | 0.18 | 4.79E−23 | 1.21E−23 | 1.87E−23 |
MCHC | rs763477215 | 16: 88,717,174 | PIEZO1 | A/ATCT | 0.070 | 52,648 | 0.66 | 0.11 | 1.57E−9 | 2.66E−9 | 1.74E−9 |
MCV | rs112097551 | 3: 128,603,774 | RPN1 | A/G | 0.405 | 48,830 | 1.98 | 0.31 | 1.09E−10 | 7.65E−12 | 6.28E−10 |
MCV | rs11549407 | 11: 5,226,774 | HBB | A/G | 0.028 | 48,830 | −16.5 | 1.08 | 3.52E−53 | 1.00E−54 | 1.31E−55 |
MCV | rs868351380 | 16: 55,649 | HBA1/2 | C/G | 0.022 | 39,107 | −7.99 | 1.31 | 1.19E−9 | 2.17E−10 | 3.20E−11 |
MCV | rs73494666 | 19: 1,253,643 | MIDN | T/C | 16.7 | 48,830 | −0.42 | 0.07 | 3.90E−10 | 2.72E−10 | 1.77E−11 |
MCV | rs228914 | 22: 37,108,472 | TMPRSS6 | A/C | 89.1 | 48,830 | −0.20 | 0.06 | 3.80E−4 | 9.53E−10 | 2.52E−6 |
RBC | rs34598529 | 11: 5,227,100 | HBB | C/T | 0.084 | 44,470 | 0.55 | 0.06 | 3.59E−22 | 1.48E−25 | 1.91E−23 |
RBC | rs372755452 | 16: 199,621 | HBA1/2 | A/AG | 0.010 | 36,430 | 1.27 | 0.18 | 1.55E−12 | 6.08E−10 | 3.95E−9 |
RDW | rs34598529 | 11: 5,227,100 | HBB | C/T | 0.092 | 29,385 | 1.96 | 0.22 | 4.44E−19 | 1.35E−20 | 2.16E−20 |
RDW | rs76723693 | X: 154,533,025 | G6PD | G/A | 0.297 | 29,385 | −0.91 | 0.10 | 2.38E−19 | 2.97E−20 | 2.99E−15 |
Conditional analysis at the HBA1/2 locus was performed in a subset of TOPMed samples with available alpha-globin CNV data. Abbreviations are as follows: Chr, chromosome; Pos, position; CA, coded allele; NCA, non-coded allele; CAF, coded allele frequency; HCT, hematocrit; HGB, hemoglobin; MCH, mean corpuscular hemoglobin; MCHC, mean corpuscular hemoglobin concentration; MCV, mean corpuscular volume; RBC, red blood cell count; RDW, red blood cell distribution width.
(a) In the first conditional analysis, trait-specific reported variants were adjusted for in the model.
(b) In the second conditional analysis, all reported variants regardless of associated traits were adjusted for in the model.
At the 12 significant loci identified in the trait-specific conditional analyses which have not been reported previously, the number of genome-wide significant variants ranged from 1 to 162 (Figure S4 and Table S10). Six loci harbored more than one genome-wide significant variant (HBB, HBA1, ELL2, MIDN, TMPRSS6, and G6PD). The lead variants for each trait at each of these 12 loci (comprising, across the 7 traits, 14 distinct variants [12 SNVs and 2 small indels]) are shown in Table 2. Notably, only two lead variants (MIDN-rs73494666, chr19: 1,253,643, GenBank: NC_000019.10, g.1253643C>T and TMPRSS6-rs228914, chr22: 37,108,472, GenBank: NC_000022.11, g.37108472C>A) had MAF > 5% in TOPMed. Most of these 14 lead variants were located within non-coding regions of the genome and most were low frequency (n = 3 between MAF 0.1% and MAF 2%) or rare (n = 9 with MAF < 0.1%). The latter category included three loci (SHANK2-rs535577177 [chr11: 70,462,791, GenBank: NC_000011.10, g.70462791G>A], 10q26-rs986415672 [chr10: 131,440,166, GenBank: NC_000010.11, g.131440166C>T], and 19q12-rs1368500441 [chr19: 28,868,893, GenBank: NC_000019.10, g.28868893G>A]) in which the lead variant was extremely rare with MAF < 0.01%. Several of the lead variants showed large allele frequency differences between race/ethnicity groups as assessed from the Genome Aggregation Database (gnomAD) (Table S11). The RPN1-rs112097551 (chr3: 128,603,774, GenBank: NC_000003.12, g.128603774G>A), HBB-rs34598529 (chr11: 5,227,100, GenBank: NC_000011.10, g.5227100T>C), G6PD-rs76723693 (chrX: 154,533,025, GenBank: NC_000023.11, g.154533025A>G, NP_001346945.1, p.Leu323Pro), MIDN-rs73494666 (chr19: 1,253,643, GenBank: NC_000019.10, g.1253643C>T), and ELL2-rs116635225 (chr5: 95,989,447, GenBank: NC_000005.10, g.95989447G>A) variants are found disproportionately among individuals of African ancestry. 
The EIF5/MARK3-rs370308370 (chr14: 103,044,696, GenBank: NC_000014.9, g.103044696G>A) and chromosome 16p13.3 alpha-globin locus (rs372755452, chr16: 199,621, GenBank: NC_000016.10, g.199622del) variants are found only among East Asians. The alpha-globin locus variant rs868351380 (chr16: 55,649, GenBank: NC_000016.10, g.55649G>C) and PIEZO1 variant rs763477215 (chr16: 88,717,174, GenBank: NC_000016.10, g.88717175_88717177TCT[4], GenBank: NP_001136336.2, p.Lys2169del) are more common among Hispanics/Latinos and Europeans, respectively.
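The frequency breakdown of the 14 lead variants can be checked directly from the coded allele frequencies in Table 2. The Python sketch below folds each coded allele frequency to a minor allele frequency and bins it. It uses the first CAF listed for each variant, and the bin edges (rare below 0.1%, common above 5%, low frequency in between) are an assumption chosen to match the categories described in the text.

```python
from collections import Counter

# Coded allele frequency (%) of each distinct lead variant, first occurrence in Table 2.
TABLE2_CAF_PERCENT = {
    "rs11549407": 0.026, "rs1368500441": 0.005, "rs112097551": 0.398,
    "rs116635225": 1.307, "rs986415672": 0.006, "rs34598529": 0.083,
    "rs535577177": 0.008, "rs370308370": 0.011, "rs868351380": 0.022,
    "rs73494666": 16.5, "rs228914": 89.0, "rs763477215": 0.070,
    "rs372755452": 0.010, "rs76723693": 0.297,
}


def maf_percent(caf_percent):
    """Fold a coded allele frequency (%) to the minor allele frequency (%)."""
    return min(caf_percent, 100.0 - caf_percent)


def frequency_class(maf):
    """Assumed bins: rare < 0.1%, low frequency 0.1% to 5%, common > 5%."""
    if maf > 5.0:
        return "common"
    if maf >= 0.1:
        return "low frequency"
    return "rare"


counts = Counter(frequency_class(maf_percent(caf)) for caf in TABLE2_CAF_PERCENT.values())
print(dict(counts))
```

Under these bins the tally is 2 common, 3 low-frequency, and 9 rare variants, in agreement with the counts given above. Note that rs228914 is counted as common because its coded allele is the major allele (CAF 89.0%, MAF 11.0%).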
Replication of single-variant discoveries
We sought replication for each of the 14 discovered variants in INTERVAL, the Kaiser Permanente GERA Study, the WHI-SHARe study, and UKBB phase 1 European and phase 2 African and East Asian samples (Table S12). Several of the rare variants (SHANK2-rs535577177, 10q26-rs986415672, 19q12-rs1368500441, EIF5/MARK3-rs370308370, and HBB-rs11549407 [chr11: 5,226,774, GenBank: NC_000011.10, g.5226774G>A, GenBank: NP_000509.1, p.Gln40Ter]) were not available for testing in any of the replication studies due to low frequency, population specificity, and/or poor imputation quality. For eight of the nine lead variants with available genotype data for testing, we successfully replicated each of the trait-specific associations for HBB-rs34598529, HBA1-rs868351380 (chr16: 55,649, GenBank: NC_000016.10, g.55649G>C), HBA1-rs372755452 (chr16: 199,622, GenBank: NC_000016.10, g.199622del), RPN1, ELL2, PIEZO1, G6PD, and MIDN (meta-analysis p < 5.6E−3, 0.05/9 loci, with consistent directions of effect). The replication p value for the lead variant at TMPRSS6 did not reach the predetermined significance threshold, but the association was directionally consistent. We further note that several of our identified TOPMed single variant-RBC trait associations (RPN1, HBB-rs11549407 and -rs34598529, and MIDN) reached genome-wide significance in recently published very large European ancestry or multi-ethnic imputed GWASs.19,21,46
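The replication criterion described above combines a Bonferroni-corrected threshold (0.05 divided by the 9 testable variants) with a requirement for a consistent direction of effect. A minimal Python sketch of that rule follows. The function names are hypothetical, and the p values and effect sizes in the usage lines are made-up illustrations rather than results from Table S12.

```python
def bonferroni_threshold(alpha=0.05, n_tests=9):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def replicates(p_meta, beta_discovery, beta_replication, alpha=0.05, n_tests=9):
    """True when the replication meta-analysis p value passes the corrected
    threshold and the effect has the same sign as in the discovery sample."""
    same_direction = beta_discovery * beta_replication > 0
    return same_direction and p_meta < bonferroni_threshold(alpha, n_tests)


threshold = bonferroni_threshold()
print(f"threshold = {threshold:.2e}")  # 0.05 / 9

# Hypothetical inputs for illustration only.
print(replicates(1e-4, -0.43, -0.20))   # significant and directionally consistent
print(replicates(0.02, -0.09, -0.03))   # directionally consistent but above threshold
```

The second call mirrors the situation described for TMPRSS6, where the direction of effect agrees but the p value does not reach the predetermined threshold.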
Relationship of single variants discovered in TOPMed to previously known RBC genetic loci
Several of the variants we discovered in the single-variant association analysis (particularly those replicated in independent samples) in Table 2 are located within genomic regions known to harbor common variants associated with RBC quantitative traits and/or variants responsible for Mendelian blood cell disorders, such as hemoglobinopathies (HBB, HBA1/HBA2 [MIM: 141850]) and various hemolytic or non-hemolytic anemias (G6PD, PIEZO1, TMPRSS6, and GATA2-RPN1 [MIM: 137295]). At the HBB locus, the lead variant associated with lower HCT, HGB, MCHC, and MCV is a LoF variant (rs11549407 encoding p.Gln40Ter, MAF = 0.026%), while the lead variant associated with lower MCH and higher RBC and RDW is located within the HBB promoter region (rs34598529, MAF = 0.083%). At the HBA1/HBA2 locus, the lead variant for MCH and MCV, rs868351380 (MAF = 0.022%), is located ∼125 kb upstream of HBA1/HBA2 in an intron of SNRNP25, and the lead variant for RBC, rs372755452 (MAF = 0.010%), is located ∼30 kb downstream of HBA1/HBA2 in an intron of LUC7L (MIM: 607782). The GATA2-RPN1 locus, which contains variants previously reported for association with MCH and RDW in a European-only analysis (rs2977562 [chr3:128,387,424, GenBank: NC_000003.12, g.128387424A>G] and rs147412900 [chr3:128,575,268, GenBank: NC_000003.12, g.128575268G>A]),13 was associated with MCH and MCV in TOPMed (lead variant rs112097551, p = 4.27E−11). The MAF of the lead variant at the GATA2-RPN1 locus in all TOPMed samples is 0.4%, but the variant is 5.9 times more common among African than non-African samples according to gnomAD. At the G6PD locus, the lead variant associated with lower RDW was the missense variant rs76723693, which encodes p.Leu323Pro. At the PIEZO1 locus, the most significant variant was the in-frame 3 bp deletion rs763477215 (p.Lys2169del), associated with higher MCHC. 
While the index SNP rs228914 at TMPRSS6 has not been previously associated with RBC parameters, rs228914 is a cis-eQTL for TMPRSS6 and an LD surrogate rs228916 (chr22: 37,109,512, GenBank: NC_000022.11, g.37109512C>T) has been previously associated with serum iron levels.47 The remaining genetic loci (SHANK2, ELL2, 19q12, 10q26, EIF5/MARK3, and MIDN) have less clear functional relationships to RBC phenotypes. Moreover, the lead variants at EIF5/MARK3 and MIDN for MCH and the lead variant at TMPRSS6 for MCH and MCV were partially attenuated in the trait-agnostic conditional analysis.
Iterative conditional analysis identifies extensive allelic heterogeneity at HBB locus
We next performed stepwise conditional analysis to dissect association signals within each of the six loci harboring more than one genome-wide significant variant in the RBC trait-specific conditional analysis. One of the six regions (HBB) was found to have multiple genome-wide significant variants independent of previously reported loci. The largest number of independent signals was observed for association with MCH (11 signals, Table S13). All independent variants at the HBB locus had MAF < 1%. No secondary independent signals were discovered in other regions (HBA1/2, ELL2, MIDN, TMPRSS6, and G6PD). For each RBC trait, we estimated the proportion of variance explained (PVE) by the set of LD-pruned known variants, by the conditionally independent variants identified in stepwise conditional analysis, and by both sets together (Table S14). In total, the PVE ranged from 3.4% (HCT) to 21.3% (MCH). The identified set of genetic variants that have not been described previously explained up to 3% of phenotypic variance (for MCH and MCV).
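The logic of stepwise conditional analysis can be sketched in a few lines of Python. The version below is a simplified illustration, not the mixed-model procedure used in TOPMed: it assumes unrelated samples, ignores covariates, and uses n times the squared correlation as an approximate 1-df chi-square statistic. At each step the most significant variant is selected, and the phenotype and all remaining genotypes are residualized on it, which is equivalent to including it as a covariate. All names and the simulated data are hypothetical.

```python
import random
from statistics import NormalDist


def center(v):
    m = sum(v) / len(v)
    return [x - m for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def residualize(v, on):
    """Remove from v its projection onto the centered vector `on`."""
    coef = dot(v, on) / dot(on, on)
    return [x - coef * o for x, o in zip(v, on)]


def stepwise_conditional(genotypes, phenotype, alpha=5e-9, eps=1e-12):
    """Return indices of conditionally independent variants, in selection order."""
    n = len(phenotype)
    chi2_cut = NormalDist().inv_cdf(alpha / 2) ** 2  # 1-df chi-square cutoff for alpha
    y = center(phenotype)
    remaining = {j: center(g) for j, g in enumerate(genotypes)}
    selected = []
    while remaining:
        yy = dot(y, y)
        stats = {}
        for j, g in remaining.items():
            gg = dot(g, g)
            if gg < eps or yy < eps:
                continue  # nothing left to test (e.g. variant in perfect LD with a selected one)
            stats[j] = n * dot(g, y) ** 2 / (gg * yy)  # approximate score statistic
        if not stats:
            break
        best = max(stats, key=stats.get)
        if stats[best] < chi2_cut:
            break
        selected.append(best)
        lead = remaining.pop(best)
        y = residualize(y, lead)
        remaining = {j: residualize(g, lead) for j, g in remaining.items()}
    return selected


# Simulated example (assumption): variants 0 and 2 are causal, variant 1 is a
# perfect-LD copy of variant 0, and variant 3 is null.
random.seed(1)
n_samples = 500
draw = lambda f: [(random.random() < f) + (random.random() < f) for _ in range(n_samples)]
g0, g2, g3 = draw(0.3), draw(0.3), draw(0.3)
g1 = list(g0)
pheno = [1.0 * a + 0.8 * b + random.gauss(0, 0.5) for a, b in zip(g0, g2)]
selected = stepwise_conditional([g0, g1, g2, g3], pheno)
print(selected)
```

In this toy setting the duplicate variant drops out once its partner has been conditioned on, which is the behavior that distinguishes independent signals from variants that are merely in LD with the lead variant.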
Rare variant aggregated association analysis
We next examined rare variants with MAF < 1% in TOPMed, aggregated based on protein-coding and non-coding gene units from GENCODE. To enrich for likely causal variants in the aggregation units, we used five different variant grouping and filtering strategies based on coding sequence and regulatory (gene promoter/enhancer) functional annotations (see supplemental methods). After accounting for all previously reported RBC trait-specific single variants, a total of five loci were significantly associated with one or more RBC traits using various aggregation strategies (Tables 3 and S15). These include genes encoding HBA1/HBA2, TMPRSS6, G6PD, and CD36 (MIM: 173510), as well as several genes and non-coding RNAs within the beta-globin locus on chromosome 11p15 (HBB, HBG1 [MIM: 142200], CTD-264317.6 [MIM: 604927], OR52H1, RF00621, and OR52R1). Some of the gene units in the chromosome 11p15 beta-globin region (HBG1, OR52R1, and RF00621) became non-significant after further adjustment for all known RBC variants in the trait-agnostic conditional analysis (Table 3). After additionally accounting for all 11 independent single-variant signals identified in TOPMed at the HBB locus in stepwise conditional analysis (Table S13), as well as all trait-specific known variants, five coding genes remained significant (HBA1/HBA2, HBB, TMPRSS6, G6PD, and CD36, Table S16) and two additional genes (TFRC [MIM: 190010] and SLC12A7 [MIM: 604879]) reached the significance threshold (Table S16). AC104389.6, a non-coding gene 2 bp downstream of HBB, was also found to be significant in the aggregation approach in which we included upstream regulatory variants, but the variants included in this unit are predominantly the same ones tested in the HBB gene unit, and hence we have not reported this gene unit as a distinct signal.
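The collapsing step behind a gene-based rare variant analysis can be illustrated with a simple burden score. The Python sketch below keeps variants in one aggregation unit with MAF below 1%, sums minor-allele dosages per participant, and reports the number of variants used and the total minor allele count (MAC), analogous to the columns of Table 3. It shows only the aggregation step under simplifying assumptions and is not the association test described in the supplemental methods. The function names and toy genotypes are hypothetical.

```python
def minor_allele_dosages(dosages):
    """Return (MAF, minor-allele dosages) for one variant coded 0/1/2."""
    alt_freq = sum(dosages) / (2 * len(dosages))
    if alt_freq > 0.5:  # coded allele is the major allele, so flip the coding
        return 1 - alt_freq, [2 - d for d in dosages]
    return alt_freq, list(dosages)


def burden(variants, maf_cutoff=0.01):
    """Collapse the rare variants of one aggregation unit.

    variants: list of per-variant dosage lists, each covering the same samples.
    Returns (per-sample burden scores, number of variants used, total MAC).
    """
    scores = [0] * len(variants[0])
    n_used = 0
    for dosages in variants:
        maf, minor = minor_allele_dosages(dosages)
        if 0 < maf < maf_cutoff:  # skip monomorphic and non-rare variants
            n_used += 1
            scores = [s + m for s, m in zip(scores, minor)]
    return scores, n_used, sum(scores)


# Toy gene unit with 200 samples (assumption): two rare variants and one common variant.
var_a = [0] * 200
var_a[0] = 1                      # one carrier, MAF 0.25%
var_b = [0] * 200
var_b[0] = 1
var_b[5] = 1                      # two carriers, MAF 0.5%
var_c = [1] * 50 + [0] * 150      # MAF 12.5%, excluded by the cutoff
scores, n_used, mac = burden([var_a, var_b, var_c])
print(n_used, mac, scores[0], scores[5])
```

The resulting per-sample score would then be tested against the trait in place of a single-variant genotype. Tests that allow variants within a unit to have opposite directions of effect need a different statistic than this simple sum.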
Table 3.
Trait | Chr (GRCh38) | Start (GRCh38) | End (GRCh38) | Gene | No. of variants | MAC | p | p conditional 1 (a) | p conditional 2 (b) |
---|---|---|---|---|---|---|---|---|---|
HCT | 11 | 5225464 | 5229395 | HBB | 15 | 76 | 1.27E−23 | 1.35E−23 | 5.91E−18 |
HCT | 11 | 5224309 | 5225461 | AC104389.6 | 94 | 1,395 | 1.85E−13 | 6.23E−15 | 3.32E−11 |
HGB | 11 | 5225464 | 5229395 | HBB | 15 | 76 | 2.06E−35 | 8.99E−30 | 7.44E−29 |
HGB | 11 | 5224309 | 5225461 | AC104389.6 | 94 | 1,394 | 1.29E−18 | 2.43E−17 | 1.05E−23 |
MCH | 11 | 5224309 | 5225461 | AC104389.6 | 83 | 1,078 | 6.76E−100 | 2.87E−104 | 5.51E−95 |
MCH | 11 | 5225464 | 5229395 | HBB | 34 | 126 | 9.53E−76 | 2.76E−78 | 3.11E−75 |
MCH | 11 | 5224448 | 5224639 | RF00621 | 588 | 12,096 | 1.93E−20 | 4.02E−20 | 1.28E−12 |
MCH | 11 | 5544489 | 5548533 | OR52H1 | 8 | 441 | 6.15E−16 | 6.13E−17 | 9.82E−18 |
MCH | 11 | 5248079 | 5249859 | HBG1 | 526 | 7,852 | 9.95E−9 | 8.61E−9 | 8.36E−4 |
MCH | 16 | 176680 | 177522 | HBA1 | 16 | 30 | 4.97E−6 | 5.95E−9 | 1.98E−9 |
MCH | 22 | 37065436 | 37109713 | TMPRSS6 | 243 | 3,317 | 6.77E−7 | 9.92E−12 | 1.16E−9 |
MCH | X | 154531391 | 154547572 | G6PD | 59 | 599 | 2.32E−6 | 6.59E−7 | 2.50E−7 |
MCHC | 11 | 5224309 | 5225461 | AC104389.6 | 88 | 1,225 | 2.37E−64 | 5.01E−40 | 8.73E−39 |
MCHC | 11 | 5225464 | 5229395 | HBB | 36 | 136 | 4.07E−34 | 1.04E−33 | 2.65E−31 |
MCHC | 11 | 5544489 | 5548533 | OR52H1 | 8 | 502 | 3.88E−7 | 2.12E−6 | 7.50E−7 |
MCV | 11 | 5224309 | 5225461 | AC104389.6 | 86 | 1,148 | 2.29E−153 | 1.40E−148 | 4.75E−108 |
MCV | 11 | 5225464 | 5229395 | HBB | 35 | 130 | 4.10E−82 | 6.02E−86 | 1.11E−81 |
MCV | 11 | 5224448 | 5224639 | RF00621 | 597 | 12,848 | 3.11E−37 | 1.56E−30 | 2.74E−16 |
MCV | 11 | 5544489 | 5548533 | OR52H1 | 8 | 468 | 1.07E−19 | 3.29E−19 | 4.50E−22 |
MCV | 11 | 5248079 | 5249859 | HBG1 | 546 | 8,321 | 4.46E−15 | 5.71E−8 | 1.79E−2 |
MCV | 16 | 176680 | 177522 | HBA1 | 16 | 30 | 5.11E−4 | 2.03E−6 | 9.24E−7 |
MCV | 22 | 37065436 | 37109713 | TMPRSS6 | 252 | 3,567 | 8.61E−6 | 9.11E−10 | 9.90E−8 |
MCV | X | 154531390 | 154547572 | G6PD | 82 | 732 | 2.19E−12 | 2.70E−13 | 7.06E−14 |
RBC | 11 | 5224309 | 5225461 | AC104389.6 | 81 | 1,036 | 9.51E−57 | 5.47E−60 | 2.55E−44 |
RBC | 11 | 5225464 | 5229395 | HBB | 34 | 113 | 2.24E−24 | 5.35E−28 | 6.06E−25 |
RBC | 11 | 5224448 | 5224639 | RF00621 | 576 | 11,551 | 6.13E−15 | 7.39E−15 | 7.31E−7 |
RBC | 11 | 4803433 | 4804380 | OR52R1 | 72 | 1,551 | 4.48E−9 | 1.87E−9 | 9.37E−2 |
RBC | 11 | 5248079 | 5249859 | HBG1 | 517 | 7,502 | 2.74E−7 | 4.09E−8 | 3.49E−1 |
RBC | X | 154531390 | 154547572 | G6PD | 58 | 574 | 1.29E−6 | 2.99E−9 | 3.49E−8 |
RDW | 7 | 80369575 | 80679277 | CD36 | 178 | 1,537 | 3.28E−4 | 6.45E−7 | 2.46E−6 |
RDW | 11 | 5224309 | 5225461 | AC104389.6 | 73 | 702 | 1.55E−29 | 1.19E−30 | 2.84E−24 |
RDW | 11 | 5225464 | 5229395 | HBB | 13 | 54 | 2.06E−24 | 9.07E−27 | 1.14E−24 |
RDW | 11 | 5544489 | 5548533 | OR52H1 | 7 | 300 | 1.20E−8 | 4.55E−9 | 7.08E−9 |
RDW | 11 | 5224448 | 5224639 | RF00621 | 480 | 8,119 | 1.80E−8 | 1.21E−8 | 2.01E−4 |
RDW | 22 | 37065436 | 37109713 | TMPRSS6 | 72 | 614 | 2.89E−7 | 1.38E−7 | 4.86E−8 |
RDW | X | 154531390 | 154547572 | G6PD | 47 | 449 | 2.13E−24 | 6.71E−27 | 8.33E−21 |
Conditional analysis at the HBA1/2 locus was performed in a subset of TOPMed samples with available alpha-globin CNV data. Abbreviations are as follows: Chr, chromosome; MAC, minor allele count; HCT, hematocrit; HGB, hemoglobin; MCH, mean corpuscular hemoglobin; MCHC, mean corpuscular hemoglobin concentration; MCV, mean corpuscular volume; RBC, red blood cell count; RDW, red blood cell distribution width.
(a) In the first conditional analysis, trait-specific reported variants were adjusted for in the model. All genes that reached genome-wide significance in the trait-specific conditional analysis are presented.
(b) In the second conditional analysis, all reported variants regardless of associated traits were adjusted for in the model.
Notably, each of the seven genes (HBA1/HBA2, HBB, TMPRSS6, G6PD, CD36, TFRC, and SLC12A7) identified in rare variant aggregate analyses is known to harbor common non-coding or coding variants previously associated with RBC traits or disorders. We further explored the overall patterns of association, individual rare variants driving the associations, and their annotations (Figure S5 and Table S17). Several observations are noteworthy. (1) In general, for each gene, there are multiple rare missense and small indel (frameshift or stop-gain) variants contributing to the aggregate association signals, rather than a single strongly associated variant. (2) The patterns of phenotypic association are generally uni-directional and consistent with the biologic contribution of these genes to inherited RBC disorders: HBA1 and HBB variants are associated with lower MCV/MCH, with HBB variants additionally associated with lower HCT/HGB and higher RBC/RDW, consistent with ineffective erythropoiesis and shortened red cell survival in alpha and beta thalassemia; TMPRSS6 variants are associated with lower MCH/MCV (Figures S5C16-19 and S5E13-14) and higher RDW (Figure S5G14), consistent with iron-refractory iron deficiency anemia. On the other hand, for G6PD rare variants, a bi-directional pattern of phenotypic association was observed for MCH, MCV, RBC, and RDW. (3) Several of the variants contributing to the HBA1, HBB, TMPRSS6, and G6PD signals are known to be pathogenic for inherited RBC disorders. Other variants that appear to contribute to the gene-based phenotypic effect are classified in ClinVar as variants of uncertain significance (VUSs) or have conflicting evidence to support their pathogenicity. (4) Three of the genes (CD36, TFRC, and SLC12A7) are located within regions of the genome containing common variants previously associated with RBC traits but have a less clear relation to RBC biology. 
The presence of rare coding or LoF variants within these genes provides additional fine-mapping evidence that these three genes are causally responsible for RBC phenotypic variation.
pLoF and pKO variants associated with RBC traits
Predicted loss-of-function (pLoF) and predicted gene knockout (pKO) variants were examined in European, African, Hispanic, and Asian ancestry populations in TOPMed. The European ancestry population subset had the largest sample size and the largest number of both pLoF and pKO variants (Table S18). Two pLoF variants reached genome-wide significance, namely CD36-rs3211938 (chr7:80,671,133, GenBank: NM_000072.3, c.975T>G, GenBank: NP_000063.2, p.Tyr325Ter) for RDW in African participants and HBB-rs11549407 for multiple RBC traits in Hispanic and European participants (Table S19), which have been reported in previously published studies. No pKO variant reached genome-wide significance in any of the ancestral groups (Table S20). All pLoF and pKO variants with p < 1E−4 are presented in Tables S19 and S20.
Gene editing in human erythroid precursors and xenotransplantation of edited primary HSPCs identifies RUVBL1 as likely target gene of RPN1-rs112097551
In silico functional annotation of the RPN1-rs112097551 variant revealed a CADD-PHRED score of 20.4 and that the variant lies in a putative enhancer element bound by erythroid transcription factors GATA1 and TAL1. We therefore undertook additional experiments to investigate the causal gene underlying the association signal. First, we used cytosine base editing to modify the rs112097551 reference G to alternative A allele in HUDEP-2 erythroid precursor cells. Since there was no appropriately positioned NGG PAM motif, we utilized the recently described near-PAMless SpCas9 variant cytosine base editor AncBE4max-SpRY,41 achieving 33% G-to-A conversion efficiency (Figure 1A). Analysis of erythroblast promoter capture Hi-C datasets showed that the SNP interacts with RUVBL1 (MIM: 603449) which is 500 kb upstream but not with intervening genes which include RPN1 and the hematopoietic transcription factor GATA2 (Figure 1B). In five G/A heterozygous HUDEP-2 clones compared to G/G clones, we observed significantly reduced expression of RUVBL1 without significant change in expression of four more proximal genes EEFSEC (MIM: 607695), GATA2, RPN1, and RAB7A (MIM: 602298) (Figure 1C). Next, we performed SpCas9 nuclease editing to produce indels adjacent to rs112097551 in CD34+ hematopoietic stem/progenitor cell (HSPC) derived primary erythroid precursors (Figures 1D and 1E). Cells bearing these short insertions and deletions centered 3 bp from the rs112097551 position demonstrated significantly reduced RUVBL1 expression compared to control cells, while RPN1 and RAB7A expression was unchanged (Figure 1F). Together, these base and nuclease editing results suggest that rs112097551-G contributes to a regulatory element that exerts long-range control of RUVBL1 expression. 
Prior work has shown the mouse homolog of RUVBL1 is required for murine hematopoiesis.48 To test the role of RUVBL1 in human hematopoiesis, we performed gene editing studies in CD34+ HSPCs in which we targeted indels to coding sequences at RUVBL1. We observed 96.1% indels at RUVBL1 compared to 84.2% indels in control cells targeted at a neutral locus. We infused edited HSPCs to immunodeficient NBSGW mice and analyzed bone marrow after 16 weeks for engrafting human hematopoietic chimerism and gene editing. Compared to CD34+ HSPCs edited at a neutral locus which showed 91.6% mean human chimerism, human CD34+ HSPCs edited at RUVBL1 demonstrated only 7.7% mean chimerism (Figures 1G–1I). Engrafting human cells were marked by frequent gene edits (60.1%) when targeted at the neutral locus but only 4.8% gene edits after RUVBL1 editing, indicating that RUVBL1 edited cells inefficiently engrafted. Together these results suggest rs112097551-G contributes to long-range enhancement of RUVBL1 expression, which in turn supports human hematopoiesis.
Discussion
We report here a WGS-based association analysis of RBC traits in an ethnically diverse sample of 62,653 participants from TOPMed. We identified 14 association signals across 12 genomic regions conditionally independent of previously reported RBC trait loci and replicated eight of these (RPN1, ELL2, PIEZO1, G6PD, MIDN, HBB-rs34598529, HBA1-rs868351380, and HBA1-rs372755452) in independent samples with available imputed genome-wide genotype data. The replicated association signals are described further below. Stepwise, iterative conditional analysis of the beta-globin gene region on chromosome 11 additionally identified 12 independent association signals at the HBB locus. Further investigation of aggregated rare variants identified seven genes (HBA1/HBA2, HBB, TMPRSS6, G6PD, CD36, TFRC, and SLC12A7) containing significant rare variant association signals independent of both previously reported and newly discovered RBC trait-associated single variants. For the RPN1 locus, we used base and nuclease editing to demonstrate that the sentinel variant rs112097551 acts through a cis-regulatory element that exerts long-range control of the gene RUVBL1, which is essential for hematopoiesis.
Our study highlights the benefits of increasing participant ethnic diversity and coverage of the genome in genetic association studies of complex polygenic traits. Among the 24 unique independent variants we identified in the single variant association analyses, 21 showed MAF < 1% in all TOPMed samples and 18 were monomorphic in at least one of the four major contributing ancestral populations in our analysis (European, African, East Asian, and Hispanic). These low-frequency or ancestry-specific variants were most likely missed by previous GWAS analysis using imputed genotype data or focusing on one ancestral population (Table S13).
GATA2-RPN1
Here we report and replicate a distinct low-frequency variant (MAF = 0.4% overall but considerably higher frequency among African [0.94%] than European [0.07%] ancestry individuals) associated with higher MCH and MCV in TOPMed (rs112097551). The region between GATA2 and RPN1 on chromosome 3q21 contains several common variants previously associated with various WBC-related traits in European, Asian, and Hispanic ancestry individuals and two variants previously associated with MCH and RDW in Europeans (rs2977562 and rs147412900).13 GATA2 is a hematopoietic transcription factor and heterozygous coding or enhancer mutations of GATA2 are responsible for autosomal-dominant hereditary mononuclear cytopenia (MIM: 614172), immunodeficiency and myelodysplastic syndromes (MIM: 614286), as well as lymphatic dysfunction51,52 (MIM: 137295). There was no evidence of association of the TOPMed MCH/MCV-associated rs112097551 variant with WBC-related traits in TOPMed (data not shown), though the variant was associated with higher monocyte count and percentage in Astle et al.,9 but was not conditionally independent of other variants in the region. The MCV/MCH-associated rs112097551 variant lies in a putative enhancer element bound by erythroid transcription factors GATA1 and TAL1 and demonstrates physical interaction in erythroblasts with RUVBL1 500 kb away. Our results from gene editing of RUVBL1 in primary human HSPCs and xenotransplantation suggest that RUVBL1 plays a role in human hematopoiesis, consistent with data from mouse models suggesting that RUVBL1 (which encodes the protein product pontin) is essential for murine hematopoietic stem cell survival.48 This finding also highlights the complexity and importance of experimentally validating the causal gene(s) underlying GWAS signals for complex traits, which are often assigned according to physical proximity (RPN1) or assumed on the basis of biologic function (GATA2).
ELL2
The chromosome 5q15 non-coding variant rs116635225 associated with lower MCH also has a low frequency in TOPMed (1.3%) and is considerably more common among African ancestry individuals (3.9%). The rs116635225 variant is located ∼27 kb upstream of ELL2, a gene responsible for immunoglobulin mRNA production and transcriptional regulation in plasma cells. Coding and regulatory variants of ELL2 have been associated with risk of multiple myeloma in European and African ancestry individuals as well as reduced levels of immunoglobulin A and G in healthy subjects.53, 54, 55 Another set of genetic variants located ∼200 kb away in the promoter region of GLRX or glutaredoxin-1 (rs10067881 [chr5:95,826,771, GenBank: NC_000005.10, g.95826771G>A], rs17462893 [chr5:95,827,733, GenBank: NC_000005.10, g.95827733A>G], rs57675369 [chr5:95,826,714, GenBank: NC_000005.10, g.95826714_95826715insG]) have been associated with higher reticulocyte count in UKBB Europeans.9 Glutaredoxin-1 is a cytoplasmic enzyme that catalyzes the reversible reduction of glutathione-protein mixed disulfides and contributes to the antioxidant defense system. Congenital deficiencies of other members of the glutaredoxin enzyme family (GLRX5 [MIM: 609588]) have been reported in patients with sideroblastic anemia (MIM: 300751).56, 57, 58 Notably, our ELL2 rs116635225 MCH-associated variant remained genome-wide significant after conditioning on the myeloma or reticulocyte-related variants. Therefore, the precise genetic regulatory mechanisms of the red cell trait associations in this region remain to be determined.
MIDN
The chromosome 19p13 African variant rs73494666 associated with lower MCV/MCH is located in an open chromatin region of an intron of MIDN, which encodes the midbrain nucleolar protein midnolin. The gene-rich region on chromosome 19p13 also includes SBNO2 (MIM: 615729), STK11 (MIM: 602216), CBARP, ATP5F1D (MIM: 603150), CIRBP (MIM: 602649), EFNA2 (MIM: 602756), and GPX4 (MIM: 138322). However, none of these genes has a clear relationship to hematopoiesis or red cell structure/function. Other variants in the region have been associated with MCH and RBC count (rs757293, chr19:1,277,428, GenBank: NC_000019.10, g.1277428T>C)13 or reticulocytes (rs35971149, chr19:1,164,199, GenBank: NC_000019.10, g.1164199del).9 The MIDN-rs73494666 variant overlaps ENCODE cis-regulatory elements for CD34 stem cells and other blood cell progenitors.
PIEZO1
Mutations in the mechanosensitive ion channel PIEZO1 on chromosome 16q24 have been reported in patients with autosomal-dominant hereditary xerocytosis (MIM: 194380), a congenital hemolytic anemia associated with increased calcium influx, red cell dehydration, and potassium efflux along with various red cell laboratory abnormalities including increased MCHC, MCH, and reticulocytosis.59,60 Most reported hereditary xerocytosis PIEZO1 missense mutations are associated with at least partial gain-of-function and are located within the highly conserved C-terminal region near the pore of the ion channel. In some individuals carrying PIEZO1 missense mutations, mild red cell laboratory parameter alterations without frank hemolytic anemia have been reported.61 The PIEZO1 3 bp short tandem repeat (STR) rs763477215 in-frame coding variant (p.Lys2169del) associated with higher MCHC in TOPMed is extremely rare in all populations except for the Ashkenazi Jewish population (frequency of 1.5% in gnomAD), has not been previously associated with hereditary xerocytosis, and therefore has been reported as “benign” in ClinVar. 
The p.Lys2169del variant is located in a highly basic -Lys-Lys-Lys-Lys- motif near the C terminus of the 36 transmembrane domain protein within a 14-residue linker region between the central ion channel pore and the peripheral propeller-like mechanosensitive domains important for modulating PIEZO1 channel function.62,63 Interestingly, another 3 bp in-frame deletion of PIEZO1 (E756del) reported to be highly enriched in prevalence among African populations was recently associated with dehydrated red blood cells and reduced susceptibility to malaria.64,65 In TOPMed, however, we were unable to confirm any association between the rs59446030 (chr16:88,733,965, GenBank: NM_001142864.4, c.2247_2249GGA[7], GenBank: NP_001136336.2, p.Glu756del) putative malaria-susceptibility allele variant and phenotypic variation in MCHC (p value for trait-specific conditional analysis = 0.42).
TMPRSS6
TMPRSS6 on chromosome 22q12 encodes matriptase-2, a transmembrane serine protease that downregulates the production of hepcidin in the liver and therefore plays an essential role in iron homeostasis.66 Rare mutations of TMPRSS6 are associated with iron-refractory iron deficiency anemia (MIM: 206200)67 characterized by microcytic hypochromic anemia and low transferrin saturation. Several common TMPRSS6 variants have been associated with multiple RBC traits through prior GWASs. The common TMPRSS6 intronic variant associated with TMPRSS6 expression and lower MCH/MCV in TOPMed (rs228914/rs228916) was previously reported to be associated with lower iron levels,47 and therefore likely contributes to lower MCH and MCV via iron deficiency. In rare variant aggregated association testing, we were able to identify several additional rare coding missense, stop-gain, or splice variants that appear to drive the gene-based association of TMPRSS6 with lower MCH/MCV and higher RDW. At least one of these variants, the exon 13 missense variant rs387907018 (chr22:37,073,550, GenBank: NC_000022.11, g.37073550C>T, GenBank: NP_705837.1, p.Glu522Lys), has been reported in a compound heterozygous iron-refractory iron deficiency anemia (IRIDA [MIM: 206200]) patient,68 suggesting that inheritance of this or similar LoF variants in the heterozygous state may contribute to mild reductions in MCV/MCH or increased RDW.67
G6PD
Mutations in the X-linked gene G6PD (glucose-6-phosphate dehydrogenase) are the most common cause worldwide of acute and chronic hemolytic anemia. The G6PD-rs76723693 low-frequency missense variant (p.Leu323Pro, referred to as G6PD Nefza69) is common in persons of African ancestry and is associated with lower RDW in TOPMed. In persons of African ancestry, the p.Leu323Pro variant is often co-inherited with another G6PD missense variant, p.Asn126Asp, encoded by rs1050829 (chrX:154,535,277, GenBank: NC_000023.11, g.154535277T>C, GenBank: NP_001346945.1, p.Asn126Asp). The 968C/376G haplotype in African ancestry individuals constitutes one of several forms of the G6PD variant A-.70, 71, 72, 73 Functional studies of the p.Leu323Pro, p.Asn126Asp, and the double mutant suggest the p.Leu323Pro variant is the primary contributor to reduced catalytic activity.74 In the US, another African ancestry G6PD A- variant is due to the haplotypic combination of rs1050829 and rs1050828 (chrX:154,536,002, GenBank: NC_000023.11, g.154536002C>T, GenBank: NP_001346945.1, p.Val68Met), which has an allele frequency of ∼12%. Our finding that rs76723693 is significantly associated with lower RDW after conditioning on rs1050828 is consistent with the independence of effects of the G6PD Nefza and A- variants on red cell physiology and morphology. Importantly, both the rs76723693 and rs1050828 G6PD variants were recently reported to lower hemoglobin A1c (HbA1c) values and therefore should be considered when screening African Americans for type 2 diabetes (MIM: 125853).75
In gene-based analyses, several additional G6PD missense variants contributed to the aggregated rare variant association signals for MCH, MCV, RBC, and RDW, including the class II Southeast Asian Mahidol variant p.Gly163Ser (rs730880992, chr12:112,453,349, GenBank: NC_000012.12, g.112453349G>A, GenBank: NP_002825.3, p.Gly163Cys)76 and the class II Union variant p.Arg454Cys (rs398123546, chrX:154,532,390, GenBank: NC_000023.11, g.154532390G>A, GenBank: NP_001035810.1, p.Arg454Cys).77 For a third previously reported variant associated with G6PD deficiency, the East Asian class II Gahoe variant p.His32Arg (rs137852340, chrX: 154,546,061, GenBank: NM_001360016.2, c.95A>G, GenBank: NP_001346945.1, p.His32Arg),78 there is conflicting evidence of pathogenicity in ClinVar. Of the two female rs137852340 variant allele carriers in TOPMed, one has a normal RDW and one has an elevated RDW. These findings add further to the genotypic-phenotypic complexity and clinical spectrum of G6PD deficiency, which is influenced by its sex-linkage and zygosity, residual G6PD variant enzyme activity and stability, genetic background, and environmental exposures.79
HBB
Heterozygosity for the common African beta-globin structural variants HBB-rs334 hemoglobin S (chr11:5,227,002, GenBank: NC_000011.10, g.5227002T>A, GenBank: NP_000509.1, p.Glu7Val) or rs33930165 hemoglobin C (chr11:5,227,003, GenBank: NC_000011.10, g.5227003C>T, GenBank: NP_000509.1, p.Glu7Lys) has recently been associated with alterations in various red cell laboratory parameters including lower hemoglobin, MCV, MCH, and RDW, along with higher MCHC, RDW, and HbA1c.17,18,20,80, 81, 82 In TOPMed, we were able to identify at least ten additional low-frequency or rare variants within the HBB locus independently associated with HGB, RBC, MCV, MCH, MCHC, and/or RDW. Notably, six of the ten variants correspond to HBB 5′ UTR and promoter, splice site, or nonsense variants previously identified in patients with beta-thalassemia: rs34598529 (chr11:5,227,100, GenBank: NC_000011.10, g.5227100T>C or −29A>G);83 rs33944208 (chr11:5,227,159, GenBank: NC_000011.10, g.5227159G>A or −88C>T);84, 85, 86 splice site rs33915217 (chr11:5,226,925, GenBank: NC_000011.10, g.5226925C>G or IVS1-5G>C);84,87 rs33945777 (chr11:5,226,576, GenBank: NC_000011.10, g.5226576C>T or IVS2-1G>A);84,87 rs35004220 (chr11:5,226,820, GenBank: NC_000011.10, g.5226820C>T or IVS-I-110 G>A);88,89 and the nonsense mutation rs11549407 (chr11:5,226,774, GenBank: NC_000011.10, g.5226774G>T, GenBank: NP_000509.1, p.Gln40Lys or p.Gln40Ter).90,91 These findings confirm the very mild phenotype and clinically “silent” nature of the heterozygote carrier state of these beta-globin gene variants.92 Several of these mutations occur more commonly in populations of South Asian (rs33915217), African (rs34598529, rs33944208), or Mediterranean (rs11549407) ancestry.
Four additional association signals in the region, rs73404549 (HBG2, chr11:5,299,424, GenBank: NC_000011.10, g.5299424C>T), rs77333754 (chr11:5,001,853, GenBank: NC_000011.10, g.5001853T>C), rs1189661759 (chr11:5,183,128, GenBank: NC_000011.10, g.5183128C>A), and rs539384429 (chr11:5,106,319, GenBank: NC_000011.10, g.5106319A>G), are all rare non-coding variants without obvious functional consequences. In addition to the HBB protein-coding variants identified in single-variant analyses, several of the rare variants driving the aggregate HBB gene-based association with lower HGB/HCT and MCH/MCV/MCHC and higher RBC/RDW are likewise missense, frameshift, or nonsense mutations previously identified in beta-thalassemia patients and categorized as pathogenic in ClinVar (Figure S5 and Table S17).
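The idea of aggregating rare variants within a gene can be sketched with a simple burden collapse. This is a swapped-in illustration, not the aggregate test used in the study: it assumes unrelated individuals, no covariates, equal weights, and a shared direction of effect across variants. The simulated allele frequencies and effect size are hypothetical, and `burden_z` and `simulate_gene` are names introduced here for illustration.

```python
import numpy as np

def burden_z(trait, genotypes, max_af=0.01):
    """Collapse rare variants into one count per person and return the
    z statistic of the slope from regressing the trait on that count.

    genotypes: (n_individuals, n_variants) array of 0/1/2 alternate-allele
    counts; the alternate allele is assumed to be the minor allele.
    """
    alt_af = genotypes.mean(axis=0) / 2
    rare = genotypes[:, (alt_af > 0) & (alt_af <= max_af)]
    burden = rare.sum(axis=1).astype(float)
    x = burden - burden.mean()
    y = trait - trait.mean()
    slope = (x @ y) / (x @ x)
    resid = y - slope * x
    se = np.sqrt((resid @ resid) / (len(y) - 2) / (x @ x))
    return slope / se

def simulate_gene(n=20000, n_variants=15, alt_af=0.001, effect=-0.8, seed=1):
    """Simulate rare variants in one gene that all lower a trait.

    All parameter values are hypothetical placeholders.
    """
    rng = np.random.default_rng(seed)
    genotypes = rng.binomial(2, alt_af, size=(n, n_variants))
    trait = effect * genotypes.sum(axis=1) + rng.normal(0.0, 1.0, n)
    return trait, genotypes

if __name__ == "__main__":
    trait, genotypes = simulate_gene()
    print(f"burden z = {burden_z(trait, genotypes):.2f}")
```

Pooling carriers across variants raises the effective carrier count, which is why an aggregate signal can be detectable even when each individual variant is too rare to reach significance on its own.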
HBA1/HBA2
Several common DNA polymorphisms located in the alpha-globin gene cluster on chromosome 16p13.3 have been associated with red cell traits in large GWASs,7,8,93 including heterozygosity for the common African ancestral 3.7 kb deletion which contributes to quantitative RBC phenotypes among African Americans and Hispanics/Latinos. In TOPMed, we identified two low-frequency variants in single-variant testing associated with MCH, MCV, and/or RBC count, independently of the 3.7 kb deletion. The rs868351380 variant is found primarily among Hispanics/Latinos while the rs372755452 variant is found primarily among East Asians. Neither of these two non-coding variants is located in any known alpha-globin regulatory region, and the associations therefore require further mechanistic confirmation. By contrast, in gene-based rare variant analysis, we identified several known alpha-globin variants associated in aggregate with lower MCH and MCV including the South Asian variant Hb Q India (HBA1, rs33984024, chr16:177,026, GenBank: NM_000558.5, c.193G>C, GenBank: NP_000549.1, p.Asp65His)94, 95, 96 and the African variant Hb Groene Hart (HBA1, rs63750751, chr16:177,340, GenBank: NM_000558.5, c.358C>T, GenBank: NP_000549.1, p.Pro120Ser).97, 98, 99 In homozygous or compound heterozygous forms, these latter variants have been reported in probands with alpha-thalassemia, whereas heterozygotes generally have a mild microcytic phenotype. Several additional variants contributing to the HBA1 gene-based rare variant MCH/MCV signal (e.g., a 1 bp indel causing frameshift p.Asn79Ter, rs767911847, chr16:177,070, GenBank: NM_000558.5, c.237del, GenBank: NP_000549.1, p.Asn79fs) may represent previously undetected alpha-thalassemia mutations.
CD36, TFRC, and SLC12A7
The presence of rare coding or LoF variants within CD36, TFRC, and SLC12A7 provides evidence that these genes are causally responsible for RBC phenotypic variation. A common African ancestral null variant of CD36 (rs3211938 or p.Tyr325Ter) has been previously associated with higher RDW and with lower CD36 expression in erythroblasts.100 In TOPMed, additional CD36 rare coding variants were associated in aggregate with higher RDW independent of rs3211938, including several nonsense, frameshift, and splice acceptor mutations that have previously been classified as VUSs. Further characterization of the genetic complexity of the CD36-null phenotype (common in African and Asian populations) may provide information relevant to the tissue-specific expression of this receptor on red cells, platelets, monocytes, and endothelial cells and its role in malaria infection and disease severity.101 TFRC encodes the transferrin receptor (TfR1), which is required for iron uptake and erythropoiesis.102 While common non-coding variants of TFRC have been associated with MCV and RDW, the only known TFRC-related Mendelian disorder is a homozygous p.Tyr20His (rs863225436, chr3:196,075,339, GenBank: NM_001128148.3, c.58T>C, GenBank: NP_001121620.1, p.Tyr20His) substitution reported to cause combined immunodeficiency affecting leukocytes and platelets but not red cells.103 Common variants of SLC12A7 encoding the potassium ion channel KCC4 have been associated with RDW and other RBC phenotypes. While KCC4 is expressed in erythroblasts,104 its role in red blood cell function is not well described.105 Further characterization of KCC4 LoF variants may illuminate the role of this ion transporter in red cell dehydration with potential implications for treatment of patients with sickle cell disease.106
In summary, we illustrate that expanding coverage of the genome using WGS as applied to large, population-based multi-ethnic samples can lead to discovery of variants associated with quantitative RBC traits that have not been described before. Most of the discovered variants were of low frequency and/or disproportionately observed in non-Europeans. We also report extensive allelic heterogeneity at the chromosome 11 beta-globin locus, including associations with several known beta-thalassemia carrier variants. The gene-based association of rare variants within HBA1/HBA2, HBB, TMPRSS6, G6PD, CD36, TFRC, and SLC12A7, independent of known single variants in the same genes, further suggests that rare functional variants in genes responsible for Mendelian RBC disorders contribute to the genetic architecture of RBC phenotypic variation among the population at large. Together, these results demonstrate the utility of WGS in ethnically diverse population-based samples for expanding our understanding of the genetic architecture of quantitative hematologic traits and suggest a continuum between complex traits and Mendelian red cell disorders.
Consortia
The members of the NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium are Namiko Abe, Goncalo Abecasis, Francois Aguet, Christine Albert, Laura Almasy, Alvaro Alonso, Seth Ament, Peter Anderson, Pramod Anugu, Deborah Applebaum-Bowden, Kristin Ardlie, Dan Arking, Donna K Arnett, Allison Ashley-Koch, Stella Aslibekyan, Tim Assimes, Paul Auer, Dimitrios Avramopoulos, Najib Ayas, Adithya Balasubramanian, John Barnard, Kathleen Barnes, R. Graham Barr, Emily Barron-Casella, Lucas Barwick, Terri Beaty, Gerald Beck, Diane Becker, Lewis Becker, Rebecca Beer, Amber Beitelshees, Emelia Benjamin, Takis Benos, Marcos Bezerra, Larry Bielak, Joshua Bis, Thomas Blackwell, John Blangero, Eric Boerwinkle, Donald W. Bowden, Russell Bowler, Jennifer Brody, Ulrich Broeckel, Jai Broome, Deborah Brown, Karen Bunting, Esteban Burchard, Carlos Bustamante, Erin Buth, Brian Cade, Jonathan Cardwell, Vincent Carey, Julie Carrier, Cara Carty, Richard Casaburi, Juan P Casas Romero, James Casella, Peter Castaldi, Mark Chaffin, Christy Chang, Yi-Cheng Chang, Daniel Chasman, Sameer Chavan, Bo-Juen Chen, Wei-Min Chen, Yii-Der Ida Chen, Michael Cho, Seung Hoan Choi, Lee-Ming Chuang, Mina Chung, Ren-Hua Chung, Clary Clish, Suzy Comhair, Matthew Conomos, Elaine Cornell, Adolfo Correa, Carolyn Crandall, James Crapo, L. Adrienne Cupples, Joanne Curran, Jeffrey Curtis, Brian Custer, Coleen Damcott, Dawood Darbar, Sean David, Colleen Davis, Michelle Daya, Mariza de Andrade, Lisa de las Fuentes, Paul de Vries, Michael DeBaun, Ranjan Deka, Dawn DeMeo, Scott Devine, Huyen Dinh, Harsha Doddapaneni, Qing Duan, Shannon Dugan-Perez, Ravi Duggirala, Jon Peter Durda, Susan K. Dutcher, Charles Eaton, Lynette Ekunwe, Adel El Boueiz, Patrick Ellinor, Leslie Emery, Serpil Erzurum, Charles Farber, Jesse Farek, Tasha Fingerlin, Matthew Flickinger, Myriam Fornage, Nora Franceschini, Chris Frazar, Mao Fu, Stephanie M. Fullerton, Lucinda Fulton, Stacey Gabriel, Weiniu Gan, Shanshan Gao, Yan Gao, Margery Gass, Heather Geiger, Bruce Gelb, Mark Geraci, Soren Germer, Robert Gerszten, Auyon Ghosh, Richard Gibbs, Chris Gignoux, Mark Gladwin, David Glahn, Stephanie Gogarten, Da-Wei Gong, Harald Goring, Sharon Graw, Kathryn J. Gray, Daniel Grine, Colin Gross, C. Charles Gu, Yue Guan, Xiuqing Guo, Namrata Gupta, David M. Haas, Jeff Haessler, Michael Hall, Yi Han, Patrick Hanly, Daniel Harris, Nicola L. Hawley, Jiang He, Ben Heavner, Susan Heckbert, Ryan Hernandez, David Herrington, Craig Hersh, Bertha Hidalgo, James Hixson, Brian Hobbs, John Hokanson, Elliott Hong, Karin Hoth, Chao (Agnes) Hsiung, Jianhong Hu, Yi-Jen Hung, Haley Huston, Chii Min Hwu, Marguerite Ryan Irvin, Rebecca Jackson, Deepti Jain, Cashell Jaquish, Jill Johnsen, Andrew Johnson, Craig Johnson, Rich Johnston, Kimberly Jones, Hyun Min Kang, Robert Kaplan, Sharon Kardia, Shannon Kelly, Eimear Kenny, Michael Kessler, Alyna Khan, Ziad Khan, Wonji Kim, John Kimoff, Greg Kinney, Barbara Konkle, Charles Kooperberg, Holly Kramer, Christoph Lange, Ethan Lange, Leslie Lange, Cathy Laurie, Cecelia Laurie, Meryl LeBoff, Jiwon Lee, Sandra Lee, Wen-Jane Lee, Jonathon LeFaive, David Levine, Dan Levy, Joshua Lewis, Xiaohui Li, Yun Li, Henry Lin, Honghuang Lin, Xihong Lin, Simin Liu, Yongmei Liu, Yu Liu, Ruth J.F. Loos, Steven Lubitz, Kathryn Lunetta, James Luo, Ulysses Magalang, Michael Mahaney, Barry Make, Ani Manichaikul, Alisa Manning, JoAnn Manson, Lisa Martin, Melissa Marton, Susan Mathai, Rasika Mathias, Susanne May, Patrick McArdle, Merry-Lynn McDonald, Sean McFarland, Stephen McGarvey, Daniel McGoldrick, Caitlin McHugh, Becky McNeil, Hao Mei, James Meigs, Vipin Menon, Luisa Mestroni, Ginger Metcalf, Deborah A Meyers, Emmanuel Mignot, Julie Mikulla, Nancy Min, Mollie Minear, Ryan L Minster, Braxton D. Mitchell, Matt Moll, Zeineen Momin, May E. Montasser, Courtney Montgomery, Donna Muzny, Josyf C Mychaleckyj, Girish Nadkarni, Rakhi Naik, Take Naseri, Pradeep Natarajan, Sergei Nekhai, Sarah C. Nelson, Bonnie Neltner, Caitlin Nessner, Deborah Nickerson, Osuji Nkechinyere, Kari North, Jeff O'Connell, Tim O'Connor, Heather Ochs-Balcom, Geoffrey Okwuonu, Allan Pack, David T. Paik, Nicholette Palmer, James Pankow, George Papanicolaou, Cora Parker, Gina Peloso, Juan Manuel Peralta, Marco Perez, James Perry, Ulrike Peters, Patricia Peyser, Lawrence S Phillips, Jacob Pleiness, Toni Pollin, Wendy Post, Julia Powers Becker, Meher Preethi Boorgula, Michael Preuss, Bruce Psaty, Pankaj Qasba, Dandi Qiao, Zhaohui Qin, Nicholas Rafaels, Laura Raffield, Mahitha Rajendran, Vasan S. Ramachandran, D.C. Rao, Laura Rasmussen-Torvik, Aakrosh Ratan, Susan Redline, Robert Reed, Catherine Reeves, Elizabeth Regan, Alex Reiner, Muagututi‘a Sefuiva Reupena, Ken Rice, Stephen Rich, Rebecca Robillard, Nicolas Robine, Dan Roden, Carolina Roselli, Jerome Rotter, Ingo Ruczinski, Alexi Runnels, Pamela Russell, Sarah Ruuska, Kathleen Ryan, Ester Cerdeira Sabino, Danish Saleheen, Shabnam Salimi, Sejal Salvi, Steven Salzberg, Kevin Sandow, Vijay G. Sankaran, Jireh Santibanez, Karen Schwander, David Schwartz, Frank Sciurba, Christine Seidman, Jonathan Seidman, Frederic Sériès, Vivien Sheehan, Stephanie L. Sherman, Amol Shetty, Aniket Shetty, Wayne Hui-Heng Sheu, M. Benjamin Shoemaker, Brian Silver, Edwin Silverman, Robert Skomro, Albert Vernon Smith, Jennifer Smith, Josh Smith, Nicholas Smith, Tanja Smith, Sylvia Smoller, Beverly Snively, Michael Snyder, Tamar Sofer, Nona Sotoodehnia, Adrienne M. Stilp, Garrett Storm, Elizabeth Streeten, Jessica Lasky Su, Yun Ju Sung, Jody Sylvia, Adam Szpiro, Daniel Taliun, Hua Tang, Margaret Taub, Kent D. Taylor, Matthew Taylor, Simeon Taylor, Marilyn Telen, Timothy A. Thornton, Machiko Threlkeld, Lesley Tinker, David Tirschwell, Sarah Tishkoff, Hemant Tiwari, Catherine Tong, Russell Tracy, Michael Tsai, Dhananjay Vaidya, David Van Den Berg, Peter VandeHaar, Scott Vrieze, Tarik Walker, Robert Wallace, Avram Walts, Fei Fei Wang, Heming Wang, Jiongming Wang, Karol Watson, Jennifer Watt, Daniel E. Weeks, Bruce Weir, Scott T Weiss, Lu-Chen Weng, Jennifer Wessel, Cristen Willer, Kayleen Williams, L. Keoki Williams, Carla Wilson, James Wilson, Lara Winterkorn, Quenna Wong, Joseph Wu, Huichun Xu, Lisa Yanek, Ivana Yang, Ketian Yu, Seyedeh Maryam Zekavat, Yingze Zhang, Snow Xueyan Zhao, Wei Zhao, Xiaofeng Zhu, Michael Zody, and Sebastian Zoellner.
Declaration of interests
B.P.K. is an inventor on patent applications filed by Mass General Brigham that describe genome engineering technologies, is an advisor to Acrigen Biosciences, and consults for Avectas Inc. and ElevateBio.
Published: April 21, 2021; corrected online April 28, 2021
Footnotes
Supplemental information can be found online at https://doi.org/10.1016/j.ajhg.2021.04.003.
Data and code availability
Data for each participating study can be accessed through dbGaP with the corresponding accession number (Amish, phs000956; ARIC, phs001211; BioMe, phs001644; CARDIA, phs001612; CHS, phs001368; COPDGene, phs000951; FHS, phs000974; GeneSTAR, phs001218; HCHS/SOL, phs001395; JHS, phs000964; MESA, phs001416; SAFS, phs001215; WHI, phs001237). Analysis results for the conditional single variant analyses and the aggregate conditional analyses can be accessed through dbGaP accession number phs001974.
Web resources
INTERVAL study, https://www.intervalstudy.org.uk/
KAISER-Permanente Genetic Epidemiology Research on Aging (GERA) cohort, https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000674.v3.p3/
OMIM, https://www.omim.org/
TOPMed whole genome sequencing methods for freeze 8, https://www.nhlbiwgs.org/topmed-whole-genome-sequencing-methods-freeze-8
References
- 1. Kuhn V., Diederich L., Keller T.C.S., 4th, Kramer C.M., Lückstädt W., Panknin C., Suvorava T., Isakson B.E., Kelm M., Cortese-Krott M.M. Red blood cell function and dysfunction: redox regulation, nitric oxide metabolism, anemia. Antioxid. Redox Signal. 2017;26:718–742. doi: 10.1089/ars.2016.6954.
- 2. Sarma P.R. Red Cell Indices. In: Walker H.K., Hall W.D., Hurst J.W., editors. Clinical Methods: The History, Physical, and Laboratory Examinations. Butterworths; Boston: 1990.
- 3. Lippi G., Mattiuzzi C. Updated worldwide epidemiology of inherited erythrocyte disorders. Acta Haematol. 2020;143:196–203. doi: 10.1159/000502434.
- 4. Evans D.M., Frazer I.H., Martin N.G. Genetic and environmental causes of variation in basal levels of blood cells. Twin Res. 1999;2:250–257. doi: 10.1375/136905299320565735.
- 5. Patel K.V. Variability and heritability of hemoglobin concentration: an opportunity to improve understanding of anemia in older adults. Haematologica. 2008;93:1281–1283. doi: 10.3324/haematol.13692.
- 6. Soranzo N., Spector T.D., Mangino M., Kühnel B., Rendon A., Teumer A., Willenborg C., Wright B., Chen L., Li M. A genome-wide meta-analysis identifies 22 loci associated with eight hematological parameters in the HaemGen consortium. Nat. Genet. 2009;41:1182–1190. doi: 10.1038/ng.467.
- 7. Ganesh S.K., Zakai N.A., van Rooij F.J.A., Soranzo N., Smith A.V., Nalls M.A., Chen M.-H., Kottgen A., Glazer N.L., Dehghan A. Multiple loci influence erythrocyte phenotypes in the CHARGE Consortium. Nat. Genet. 2009;41:1191–1198. doi: 10.1038/ng.466.
- 8. van der Harst P., Zhang W., Mateo Leach I., Rendon A., Verweij N., Sehmi J., Paul D.S., Elling U., Allayee H., Li X. Seventy-five genetic loci influencing the human red blood cell. Nature. 2012;492:369–375. doi: 10.1038/nature11677.
- 9. Astle W.J., Elding H., Jiang T., Allen D., Ruklisa D., Mann A.L., Mead D., Bouman H., Riveros-Mckay F., Kostadima M.A. The allelic landscape of human blood cell trait variation and links to common complex disease. Cell. 2016;167:1415–1429.e19. doi: 10.1016/j.cell.2016.10.042.
- 10. Iotchkova V., Huang J., Morris J.A., Jain D., Barbieri C., Walter K., Min J.L., Chen L., Astle W., Cocca M., UK10K Consortium. Discovery and refinement of genetic loci associated with cardiometabolic risk using dense imputation maps. Nat. Genet. 2016;48:1303–1312. doi: 10.1038/ng.3668.
- 11. CHARGE Consortium Hematology Working Group. Meta-analysis of rare and common exome chip variants identifies S1PR4 and other loci influencing blood cell traits. Nat. Genet. 2016;48:867–876. doi: 10.1038/ng.3607.
- 12. Mousas A., Ntritsos G., Chen M.-H., Song C., Huffman J.E., Tzoulaki I., Elliott P., Psaty B.M., Auer P.L., Johnson A.D., Blood-Cell Consortium. Rare coding variants pinpoint genes that control human hematological traits. PLoS Genet. 2017;13:e1006925. doi: 10.1371/journal.pgen.1006925.
- 13. Kichaev G., Bhatia G., Loh P.-R., Gazal S., Burch K., Freund M.K., Schoech A., Pasaniuc B., Price A.L. Leveraging polygenic functional enrichment to improve GWAS power. Am. J. Hum. Genet. 2019;104:65–75. doi: 10.1016/j.ajhg.2018.11.008.
- 14. van Rooij F.J.A., Qayyum R., Smith A.V., Zhou Y., Trompet S., Tanaka T., Keller M.F., Chang L.-C., Schmidt H., Yang M.-L., BioBank Japan Project. Genome-wide Trans-ethnic Meta-analysis Identifies Seven Genetic Loci Influencing Erythrocyte Traits and a Role for RBPMS in Erythropoiesis. Am. J. Hum. Genet. 2017;100:51–63. doi: 10.1016/j.ajhg.2016.11.016.
- 15. Jo Hodonsky C., Schurmann C., Schick U.M., Kocarnik J., Tao R., van Rooij F.J., Wassel C., Buyske S., Fornage M., Hindorff L.A. Generalization and fine mapping of red blood cell trait genetic associations to multi-ethnic populations: The PAGE Study. Am. J. Hematol. 2018. doi: 10.1002/ajh.25161. Published online June 15, 2018.
- 16. Kanai M., Akiyama M., Takahashi A., Matoba N., Momozawa Y., Ikeda M., Iwata N., Ikegawa S., Hirata M., Matsuda K. Genetic analysis of quantitative traits in the Japanese population links cell types to complex human diseases. Nat. Genet. 2018;50:390–400. doi: 10.1038/s41588-018-0047-6.
- 17. Gurdasani D., Carstensen T., Fatumo S., Chen G., Franklin C.S., Prado-Martinez J., Bouman H., Abascal F., Haber M., Tachmazidou I. Uganda Genome Resource Enables Insights into Population History and Genomic Discovery in Africa. Cell. 2019;179:984–1002.e36. doi: 10.1016/j.cell.2019.10.004.
- 18. Raffield L.M., Ulirsch J.C., Naik R.P., Lessard S., Handsaker R.E., Jain D., Kang H.M., Pankratz N., Auer P.L., Bao E.L., NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium, Hematology & Hemostasis, Diabetes, and Structural Variation TOPMed Working Groups. Common α-globin variants modify hematologic and other clinical phenotypes in sickle cell trait and disease. PLoS Genet. 2018;14:e1007293. doi: 10.1371/journal.pgen.1007293.
- 19. Vuckovic D., Bao E.L., Akbari P., Lareau C.A., Mousas A., Jiang T., Chen M.-H., Raffield L.M., Tardaguila M., Huffman J.E., VA Million Veteran Program. The polygenic and monogenic basis of blood traits and diseases. Cell. 2020;182:1214–1231.e11. doi: 10.1016/j.cell.2020.08.008.
- 20. Hodonsky C.J., Jain D., Schick U.M., Morrison J.V., Brown L., McHugh C.P., Schurmann C., Chen D.D., Liu Y.M., Auer P.L. Genome-wide association study of red blood cell traits in Hispanics/Latinos: The Hispanic Community Health Study/Study of Latinos. PLoS Genet. 2017;13:e1006760. doi: 10.1371/journal.pgen.1006760.
- 21. Chen M.-H., Raffield L.M., Mousas A., Sakaue S., Huffman J.E., Moscati A., Trivedi B., Jiang T., Akbari P., Vuckovic D., VA Million Veteran Program. Trans-ethnic and Ancestry-Specific Blood-Cell Genetics in 746,667 Individuals from 5 Global Populations. Cell. 2020;182:1198–1213.e14. doi: 10.1016/j.cell.2020.06.045.
- 22. Beutler E., West C. Hematologic differences between African-Americans and whites: the roles of iron deficiency and alpha-thalassemia on hemoglobin levels and mean corpuscular volume. Blood. 2005;106:740–745. doi: 10.1182/blood-2005-02-0713.
- 23. Regier A.A., Farjoun Y., Larson D.E., Krasheninina O., Kang H.M., Howrigan D.P., Chen B.-J., Kher M., Banks E., Ames D.C. Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects. Nat. Commun. 2018;9:4038. doi: 10.1038/s41467-018-06159-4.
- 24. Conomos M.P., Miller M.B., Thornton T.A. Robust inference of population structure for ancestry prediction and correction of stratification in the presence of relatedness. Genet. Epidemiol. 2015;39:276–293. doi: 10.1002/gepi.21896.
- 25. Conomos M.P., Reiner A.P., Weir B.S., Thornton T.A. Model-free Estimation of Recent Genetic Relatedness. Am. J. Hum. Genet. 2016;98:127–148. doi: 10.1016/j.ajhg.2015.11.022.
- 26. Conomos M.P., Laurie C.A., Stilp A.M., Gogarten S.M., McHugh C.P., Nelson S.C., Sofer T., Fernández-Rhodes L., Justice A.E., Graff M. Genetic diversity and association studies in US hispanic/latino populations: applications in the hispanic community health study/study of latinos. Am. J. Hum. Genet. 2016;98:165–184. doi: 10.1016/j.ajhg.2015.12.001.
- 27. Sofer T., Zheng X., Gogarten S.M., Laurie C.A., Grinde K., Shaffer J.R., Shungin D., O’Connell J.R., Durazo-Arvizo R.A., Raffield L., NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium. A fully adjusted two-stage procedure for rank-normalization in genetic association studies. Genet. Epidemiol. 2019;43:263–275. doi: 10.1002/gepi.22188.
- 28. Lin D.-Y. A simple and accurate method to determine genomewide significance for association tests in sequencing studies. Genet. Epidemiol. 2019;43:365–372. doi: 10.1002/gepi.22183.
- 29. Gogarten S.M., Sofer T., Chen H., Yu C., Brody J.A., Thornton T.A., Rice K.M., Conomos M.P. Genetic association testing using the GENESIS R/Bioconductor package. Bioinformatics. 2019;35:5346–5348. doi: 10.1093/bioinformatics/btz567.
- 30. Pedersen B.S., Quinlan A.R. Mosdepth: quick coverage calculation for genomes and exomes. Bioinformatics. 2018;34:867–868. doi: 10.1093/bioinformatics/btx699.
- 31. Reiner A.P., Lettre G., Nalls M.A., Ganesh S.K., Mathias R., Austin M.A., Dean E., Arepalli S., Britton A., Chen Z. Genome-wide association study of white blood cell count in 16,388 African Americans: the continental origins and genetic epidemiology network (COGENT). PLoS Genet. 2011;7:e1002108. doi: 10.1371/journal.pgen.1002108.
- 32. Willer C.J., Li Y., Abecasis G.R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 2010;26:2190–2191. doi: 10.1093/bioinformatics/btq340.
- 33. Chen H., Huffman J.E., Brody J.A., Wang C., Lee S., Li Z., Gogarten S.M., Sofer T., Bielak L.F., Bis J.C., NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium. TOPMed Hematology and Hemostasis Working Group. Efficient Variant Set Mixed Model Association Tests for Continuous and Binary Traits in Large-Scale Whole-Genome Sequencing Studies. Am. J. Hum. Genet. 2019;104:260–274. doi: 10.1016/j.ajhg.2018.12.012.
- 34. Wu M.C., Lee S., Cai T., Li Y., Boehnke M., Lin X. Rare-variant association testing for sequencing data with the sequence kernel association test. Am. J. Hum. Genet. 2011;89:82–93. doi: 10.1016/j.ajhg.2011.05.029.
- 35. Lee S., Emond M.J., Bamshad M.J., Barnes K.C., Rieder M.J., Nickerson D.A., Christiani D.C., Wurfel M.M., Lin X., NHLBI GO Exome Sequencing Project—ESP Lung Project Team. Optimal unified approach for rare-variant association testing with application to small-sample case-control whole-exome sequencing studies. Am. J. Hum. Genet. 2012;91:224–237. doi: 10.1016/j.ajhg.2012.06.007.
- 36. Cummings B.B., Karczewski K.J., Kosmicki J.A., Seaby E.G., Watts N.A., Singer-Berk M., Mudge J.M., Karjalainen J., Satterstrom F.K., O’Donnell-Luria A.H., Genome Aggregation Database Production Team. Genome Aggregation Database Consortium. Transcript expression-aware annotation improves rare variant interpretation. Nature. 2020;581:452–458. doi: 10.1038/s41586-020-2329-2.
- 37. Lindeboom R.G.H., Supek F., Lehner B. The rules and impact of nonsense-mediated mRNA decay in human cancers. Nat. Genet. 2016;48:1112–1118. doi: 10.1038/ng.3664.
- 38. Lessard S., Manning A.K., Low-Kam C., Auer P.L., Giri A., Graff M., Schurmann C., Yaghootkar H., Luan J., Esko T., NHLBI GO Exome Sequence Project. GOT2D. T2D-GENES. GIANT Consortium. Testing the role of predicted gene knockouts in human anthropometric trait variation. Hum. Mol. Genet. 2016;25:2082–2092. doi: 10.1093/hmg/ddw055.
- 39. Kurita R., Suda N., Sudo K., Miharada K., Hiroyama T., Miyoshi H., Tani K., Nakamura Y. Establishment of immortalized human erythroid progenitor cell lines able to produce enucleated red blood cells. PLoS ONE. 2013;8:e59890. doi: 10.1371/journal.pone.0059890.
- 40. Vinjamur D.S., Bauer D.E. Growing and Genetically Manipulating Human Umbilical Cord Blood-Derived Erythroid Progenitor (HUDEP) Cell Lines. Methods Mol. Biol. 2018;1698:275–284. doi: 10.1007/978-1-4939-7428-3_17.
- 41. Walton R.T., Christie K.A., Whittaker M.N., Kleinstiver B.P. Unconstrained genome targeting with near-PAMless engineered CRISPR-Cas9 variants. Science. 2020;368:290–296. doi: 10.1126/science.aba8853.
- 42. Kluesner M.G., Nedveck D.A., Lahr W.S., Garbe J.R., Abrahante J.E., Webber B.R., Moriarity B.S. EditR: A Method to Quantify Base Editing from Sanger Sequencing. CRISPR J. 2018;1:239–250. doi: 10.1089/crispr.2018.0014.
- 43. Wu Y., Zeng J., Roscoe B.P., Liu P., Yao Q., Lazzarotto C.R., Clement K., Cole M.A., Luk K., Baricordi C. Highly efficient therapeutic gene editing of human hematopoietic stem cells. Nat. Med. 2019;25:776–783. doi: 10.1038/s41591-019-0401-y.
- 44. Giarratana M.-C., Rouard H., Dumont A., Kiger L., Safeukui I., Le Pennec P.-Y., François S., Trugnan G., Peyrard T., Marie T. Proof of principle for transfusion of in vitro-generated red blood cells. Blood. 2011;118:5071–5079. doi: 10.1182/blood-2011-06-362038.
- 45. Brinkman E.K., Chen T., Amendola M., van Steensel B. Easy quantitative assessment of genome editing by sequence trace decomposition. Nucleic Acids Res. 2014;42:e168. doi: 10.1093/nar/gku936.
- 46. Kowalski M.H., Qian H., Hou Z., Rosen J.D., Tapia A.L., Shan Y., Jain D., Argos M., Arnett D.K., Avery C., NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium. TOPMed Hematology & Hemostasis Working Group. Use of >100,000 NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium whole genome sequences improves imputation quality and detection of rare variant associations in admixed African and Hispanic/Latino populations. PLoS Genet. 2019;15:e1008500. doi: 10.1371/journal.pgen.1008500.
- 47. Benyamin B., Esko T., Ried J.S., Radhakrishnan A., Vermeulen S.H., Traglia M., Gögele M., Anderson D., Broer L., Podmore C., InterAct Consortium. Novel loci affecting iron homeostasis and their effects in individuals at risk for hemochromatosis. Nat. Commun. 2014;5:4926. doi: 10.1038/ncomms5926.
- 48. Bereshchenko O., Mancini E., Luciani L., Gambardella A., Riccardi C., Nerlov C. Pontin is essential for murine hematopoietic stem cell survival. Haematologica. 2012;97:1291–1294. doi: 10.3324/haematol.2011.060251.
- 49. Schofield E.C., Carver T., Achuthan P., Freire-Pritchett P., Spivakov M., Todd J.A., Burren O.S. CHiCP: a web-based tool for the integrative and interactive visualization of promoter capture Hi-C datasets. Bioinformatics. 2016;32:2511–2513. doi: 10.1093/bioinformatics/btw173.
- 50. Javierre B.M., Burren O.S., Wilder S.P., Kreuzhuber R., Hill S.M., Sewitz S., Cairns J., Wingett S.W., Várnai C., Thiecke M.J., BLUEPRINT Consortium. Lineage-Specific Genome Architecture Links Enhancers and Non-coding Disease Variants to Target Gene Promoters. Cell. 2016;167:1369–1384.e19. doi: 10.1016/j.cell.2016.09.037.
- 51. Crispino J.D., Horwitz M.S. GATA factor mutations in hematologic disease. Blood. 2017;129:2103–2110. doi: 10.1182/blood-2016-09-687889.
- 52. Spinner M.A., Sanchez L.A., Hsu A.P., Shaw P.A., Zerbe C.S., Calvo K.R., Arthur D.C., Gu W., Gould C.M., Brewer C.C. GATA2 deficiency: a protean disorder of hematopoiesis, lymphatics, and immunity. Blood. 2014;123:809–821. doi: 10.1182/blood-2013-07-515528.
- 53. Swaminathan B., Thorleifsson G., Jöud M., Ali M., Johnsson E., Ajore R., Sulem P., Halvarsson B.-M., Eyjolfsson G., Haraldsdottir V. Variants in ELL2 influencing immunoglobulin levels associate with multiple myeloma. Nat. Commun. 2015;6:7213. doi: 10.1038/ncomms8213.
- 54. Ali M., Ajore R., Wihlborg A.-K., Niroula A., Swaminathan B., Johnsson E., Stephens O.W., Morgan G., Meissner T., Turesson I. The multiple myeloma risk allele at 5q15 lowers ELL2 expression and increases ribosomal gene expression. Nat. Commun. 2018;9:1649. doi: 10.1038/s41467-018-04082-2.
- 55. Du Z., Weinhold N., Song G.C., Rand K.A., Van Den Berg D.J., Hwang A.E., Sheng X., Hom V., Ailawadhi S., Nooka A.K. A meta-analysis of genome-wide association studies of multiple myeloma among men and women of African ancestry. Blood Adv. 2020;4:181–190. doi: 10.1182/bloodadvances.2019000491.
- 56. Ye H., Jeong S.Y., Ghosh M.C., Kovtunovych G., Silvestri L., Ortillo D., Uchida N., Tisdale J., Camaschella C., Rouault T.A. Glutaredoxin 5 deficiency causes sideroblastic anemia by specifically impairing heme biosynthesis and depleting cytosolic iron in human erythroblasts. J. Clin. Invest. 2010;120:1749–1761. doi: 10.1172/JCI40372.
- 57. Peskin A.V., Pace P.E., Behring J.B., Paton L.N., Soethoudt M., Bachschmid M.M., Winterbourn C.C. Glutathionylation of the active site cysteines of peroxiredoxin 2 and recycling by glutaredoxin. J. Biol. Chem. 2016;291:3053–3062. doi: 10.1074/jbc.M115.692798.
- 58. Furuyama K., Kaneko K. Iron metabolism in erythroid cells and patients with congenital sideroblastic anemia. Int. J. Hematol. 2018;107:44–54. doi: 10.1007/s12185-017-2368-0.
- 59. Zarychanski R., Schulz V.P., Houston B.L., Maksimova Y., Houston D.S., Smith B., Rinehart J., Gallagher P.G. Mutations in the mechanotransduction protein PIEZO1 are associated with hereditary xerocytosis. Blood. 2012;120:1908–1915. doi: 10.1182/blood-2012-04-422253.
- 60. Andolfo I., Alper S.L., De Franceschi L., Auriemma C., Russo R., De Falco L., Vallefuoco F., Esposito M.R., Vandorpe D.H., Shmukler B.E. Multiple clinical forms of dehydrated hereditary stomatocytosis arise from mutations in PIEZO1. Blood. 2013;121:3925–3935, S1–S12. doi: 10.1182/blood-2013-02-482489.
- 61. Knight T., Zaidi A.U., Wu S., Gadgeel M., Buck S., Ravindranath Y. Mild erythrocytosis as a presenting manifestation of PIEZO1 associated erythrocyte volume disorders. Pediatr. Hematol. Oncol. 2019;36:317–326. doi: 10.1080/08880018.2019.1637984.
- 62.Zhang T., Chi S., Jiang F., Zhao Q., Xiao B. A protein interaction mechanism for suppressing the mechanosensitive Piezo channels. Nat. Commun. 2017;8:1797. doi: 10.1038/s41467-017-01712-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Zhao Q., Zhou H., Chi S., Wang Y., Wang J., Geng J., Wu K., Liu W., Zhang T., Dong M.-Q. Structure and mechanogating mechanism of the Piezo1 channel. Nature. 2018;554:487–492. doi: 10.1038/nature25743. [DOI] [PubMed] [Google Scholar]
- 64.Ma S., Cahalan S., LaMonte G., Grubaugh N.D., Zeng W., Murthy S.E., Paytas E., Gamini R., Lukacs V., Whitwam T. Common PIEZO1 allele in african populations causes RBC dehydration and attenuates plasmodium infection. Cell. 2018;173:443–455.e12. doi: 10.1016/j.cell.2018.02.047. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Nguetse C.N., Purington N., Ebel E.R., Shakya B., Tetard M., Kremsner P.G., Velavan T.P., Egan E.S. A common polymorphism in the mechanosensitive ion channel PIEZO1 is associated with protection from severe malaria in humans. Proc. Natl. Acad. Sci. USA. 2020;117:9074–9081. doi: 10.1073/pnas.1919843117. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Wang C.-Y., Meynard D., Lin H.Y. The role of TMPRSS6/matriptase-2 in iron regulation and anemia. Front. Pharmacol. 2014;5:114. doi: 10.3389/fphar.2014.00114. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.De Falco L., Sanchez M., Silvestri L., Kannengiesser C., Muckenthaler M.U., Iolascon A., Gouya L., Camaschella C., Beaumont C. Iron refractory iron deficiency anemia. Haematologica. 2013;98:845–853. doi: 10.3324/haematol.2012.075515. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Silvestri L., Guillem F., Pagani A., Nai A., Oudin C., Silva M., Toutain F., Kannengiesser C., Beaumont C., Camaschella C., Grandchamp B. Molecular mechanisms of the defective hepcidin inhibition in TMPRSS6 mutations associated with iron-refractory iron deficiency anemia. Blood. 2009;113:5605–5608. doi: 10.1182/blood-2008-12-195594. [DOI] [PubMed] [Google Scholar]
- 69.Benmansour I., Moradkhani K., Moumni I., Wajcman H., Hafsia R., Ghanem A., Abbès S., Préhu C. Two new class III G6PD variants [G6PD Tunis (c.920A>C: p.307Gln>Pro) and G6PD Nefza (c.968T>C: p.323 Leu>Pro)] and overview of the spectrum of mutations in Tunisia. Blood Cells Mol. Dis. 2013;50:110–114. doi: 10.1016/j.bcmd.2012.08.005. [DOI] [PubMed] [Google Scholar]
- 70.Beutler B., Cerami A. The biology of cachectin/TNF--a primary mediator of the host response. Annu. Rev. Immunol. 1989;7:625–655. doi: 10.1146/annurev.iy.07.040189.003205. [DOI] [PubMed] [Google Scholar]
- 71.Hamel A.R., Cabral I.R., Sales T.S.I., Costa F.F., Olalla Saad S.T. Molecular heterogeneity of G6PD deficiency in an Amazonian population and description of four new variants. Blood Cells Mol. Dis. 2002;28:399–406. doi: 10.1006/bcmd.2002.0524. [DOI] [PubMed] [Google Scholar]
- 72.Monteiro W.M., Franca G.P., Melo G.C., Queiroz A.L.M., Brito M., Peixoto H.M., Oliveira M.R.F., Romero G.A.S., Bassat Q., Lacerda M.V.G. Clinical complications of G6PD deficiency in Latin American and Caribbean populations: systematic review and implications for malaria elimination programmes. Malar. J. 2014;13:70. doi: 10.1186/1475-2875-13-70. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Reading N.S., Ruiz-Bonilla J.A., Christensen R.D., Cáceres-Perkins W., Prchal J.T. A patient with both methemoglobinemia and G6PD deficiency: A therapeutic conundrum. Am. J. Hematol. 2017;92:474–477. doi: 10.1002/ajh.24683. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Ramírez-Nava E.J., Ortega-Cuellar D., Serrano-Posada H., González-Valdez A., Vanoye-Carlo A., Hernández-Ochoa B., Sierra-Palacios E., Hernández-Pineda J., Rodríguez-Bustamante E., Arreguin-Espinosa R. Biochemical Analysis of Two Single Mutants that Give Rise to a Polymorphic G6PD A-Double Mutant. Int. J. Mol. Sci. 2017;18:18. doi: 10.3390/ijms18112244. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.Sarnowski C., Leong A., Raffield L.M., Wu P., de Vries P.S., DiCorpo D., Guo X., Xu H., Liu Y., Zheng X., TOPMed Diabetes Working Group. TOPMed Hematology Working Group. TOPMed Hemostasis Working Group. National Heart, Lung, and Blood Institute TOPMed Consortium Impact of Rare and Common Genetic Variants on Diabetes Diagnosis by Hemoglobin A1c in Multi-Ancestry Cohorts: The Trans-Omics for Precision Medicine Program. Am. J. Hum. Genet. 2019;105:706–718. doi: 10.1016/j.ajhg.2019.08.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Huang Y., Choi M.Y., Au S.W.N., Au D.M.Y., Lam V.M.S., Engel P.C. Purification and detailed study of two clinically different human glucose 6-phosphate dehydrogenase variants, G6PD(Plymouth) and G6PD(Mahidol): Evidence for defective protein folding as the basis of disease. Mol. Genet. Metab. 2008;93:44–53. doi: 10.1016/j.ymgme.2007.08.122. [DOI] [PubMed] [Google Scholar]
- 77.Wang X.-T., Lam V.M.S., Engel P.C. Marked decrease in specific activity contributes to disease phenotype in two human glucose 6-phosphate dehydrogenase mutants, G6PD(Union) and G6PD(Andalus) Hum. Mutat. 2005;26:284. doi: 10.1002/humu.9367. [DOI] [PubMed] [Google Scholar]
- 78.Chiu D.T., Zuo L., Chao L., Chen E., Louie E., Lubin B., Liu T.Z., Du C.S. Molecular characterization of glucose-6-phosphate dehydrogenase (G6PD) deficiency in patients of Chinese descent and identification of new base substitutions in the human G6PD gene. Blood. 1993;81:2150–2154. [PubMed] [Google Scholar]
- 79.Luzzatto L., Ally M., Notaro R. Glucose-6-phosphate dehydrogenase deficiency. Blood. 2020;136:1225–1240. doi: 10.1182/blood.2019000944. [DOI] [PubMed] [Google Scholar]
- 80.Wojcik G.L., Graff M., Nishimura K.K., Tao R., Haessler J., Gignoux C.R., Highland H.M., Patel Y.M., Sorokin E.P., Avery C.L. Genetic analyses of diverse populations improves discovery for complex traits. Nature. 2019;570:514–518. doi: 10.1038/s41586-019-1310-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Fatumo S., Carstensen T., Nashiru O., Gurdasani D., Sandhu M., Kaleebu P. Complimentary Methods for Multivariate Genome-Wide Association Study Identify New Susceptibility Genes for Blood Cell Traits. Front. Genet. 2019;10:334. doi: 10.3389/fgene.2019.00334. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Velasco-Rodríguez D., Alonso-Domínguez J.-M., González-Fernández F.-A., Muriel A., Abalo L., Sopeña M., Villarrubia J., Ropero P., Plaza M.P., Tenorio M. Laboratory parameters provided by Advia 2120 analyser identify structural haemoglobinopathy carriers and discriminate between Hb S trait and Hb C trait. J. Clin. Pathol. 2016;69:912–920. doi: 10.1136/jclinpath-2015-203556. [DOI] [PubMed] [Google Scholar]
- 83.Antonarakis S.E., Boehm C.D., Serjeant G.R., Theisen C.E., Dover G.J., Kazazian H.H., Jr. Origin of the beta S-globin gene in blacks: the contribution of recurrent mutation or gene conversion or both. Proc. Natl. Acad. Sci. USA. 1984;81:853–856. doi: 10.1073/pnas.81.3.853. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Wong C., Antonarakis S.E., Goff S.C., Orkin S.H., Boehm C.D., Kazazian H.H., Jr. On the origin and spread of beta-thalassemia: recurrent observation of four mutations in different ethnic groups. Proc. Natl. Acad. Sci. USA. 1986;83:6529–6532. doi: 10.1073/pnas.83.17.6529. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Orkin S.H., Antonarakis S.E., Kazazian H.H., Jr. Base substitution at position -88 in a beta-thalassemic globin gene. Further evidence for the role of distal promoter element ACACCC. J. Biol. Chem. 1984;259:8679–8681. [PubMed] [Google Scholar]
- 86.Gonzalez-Redondo J.M., Stoming T.A., Lanclos K.D., Gu Y.C., Kutlar A., Kutlar F., Nakatsuji T., Deng B., Han I.S., McKie V.C. Clinical and genetic heterogeneity in black patients with homozygous beta-thalassemia from the southeastern United States. Blood. 1988;72:1007–1014. [PubMed] [Google Scholar]
- 87.Treisman R., Orkin S.H., Maniatis T. Specific transcription and RNA splicing defects in five cloned beta-thalassaemia genes. Nature. 1983;302:591–596. doi: 10.1038/302591a0. [DOI] [PubMed] [Google Scholar]
- 88.Westaway D., Williamson R. An intron nucleotide sequence variant in a cloned beta +-thalassaemia globin gene. Nucleic Acids Res. 1981;9:1777–1788. doi: 10.1093/nar/9.8.1777. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 89.Spritz R.A., Jagadeeswaran P., Choudary P.V., Biro P.A., Elder J.T., deRiel J.K., Manley J.L., Gefter M.L., Forget B.G., Weissman S.M. Base substitution in an intervening sequence of a beta+-thalassemic human globin gene. Proc. Natl. Acad. Sci. USA. 1981;78:2455–2459. doi: 10.1073/pnas.78.4.2455. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.Trecartin R.F., Liebhaber S.A., Chang J.C., Lee K.Y., Kan Y.W., Furbetta M., Angius A., Cao A. beta zero thalassemia in Sardinia is caused by a nonsense mutation. J. Clin. Invest. 1981;68:1012–1017. doi: 10.1172/JCI110323. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 91.Orkin S.H., Goff S.C. Nonsense and frameshift mutations in beta 0-thalassemia detected in cloned beta-globin genes. J. Biol. Chem. 1981;256:9782–9784. [PubMed] [Google Scholar]
- 92.Atweh G.F., Wong C., Reed R., Antonarakis S.E., Zhu D., Ghosh P.K., Maniatis T., Forget B.G., Kazazian H.H., Jr. A new mutation in IVS-1 of the human beta globin gene causing beta thalassemia due to abnormal splicing. Blood. 1987;70:147–151. [PubMed] [Google Scholar]
- 93.Chen Z., Tang H., Qayyum R., Schick U.M., Nalls M.A., Handsaker R., Li J., Lu Y., Yanek L.R., Keating B., BioBank Japan Project. CHARGE Consortium Genome-wide association analysis of red blood cell traits in African Americans: the COGENT Network. Hum. Mol. Genet. 2013;22:2529–2538. doi: 10.1093/hmg/ddt087. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 94.Harrison A., Mashon R.S., Kakkar N., Das S. Clinico-Hematological Profile of Hb Q India: An Uncommon Hemoglobin Variant. Indian J. Hematol. Blood Transfus. 2018;34:299–303. doi: 10.1007/s12288-017-0864-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 95.Schmidt R.M., Bechtel K.C., Moo-Penn W.F. Hemoglobin QIndia, alpha 64 (E13) Asp replaced by His, and beta-thalassemia in a Canadian family. Am. J. Clin. Pathol. 1976;66:446–448. doi: 10.1093/ajcp/66.2.446. [DOI] [PubMed] [Google Scholar]
- 96.Sukumaran P.K., Merchant S.M., Desai M.P., Wiltshire B.G., Lehmann H. Haemoglobin Q India (alpha 64(E13) aspartic acid histidine) associated with beta-thalassemia observed in three Sindhi families. J. Med. Genet. 1972;9:436–442. doi: 10.1136/jmg.9.4.436. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 97.Yu X., Mollan T.L., Butler A., Gow A.J., Olson J.S., Weiss M.J. Analysis of human alpha globin gene mutations that impair binding to the alpha hemoglobin stabilizing protein. Blood. 2009;113:5961–5969. doi: 10.1182/blood-2008-12-196030. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 98.Giordano P.C., Zweegman S., Akkermans N., Arkesteijn S.G.J., van Delft P., Versteegh F.G.A., Wajcman H., Harteveld C.L. The first case of Hb Groene Hart [alpha119(H2)Pro-->Ser, CCT-->TCT (alpha1)] homozygosity confirms that a thalassemia phenotype is associated with this abnormal hemoglobin variant. Hemoglobin. 2007;31:179–182. doi: 10.1080/03630260701289490. [DOI] [PubMed] [Google Scholar]
- 99.Joly P., Lacan P., Garcia C., Francina A. Description of the phenotypes of 63 heterozygous, homozygous and compound heterozygous patients carrying the Hb Groene Hart [α119(H2)Pro®Ser; HBA1: c.358C>T] variant. Hemoglobin. 2014;38:64–66. doi: 10.3109/03630269.2013.834264. [DOI] [PubMed] [Google Scholar]
- 100.Chami N., Chen M.-H., Slater A.J., Eicher J.D., Evangelou E., Tajuddin S.M., Love-Gregory L., Kacprowski T., Schick U.M., Nomura A. Exome Genotyping Identifies Pleiotropic Variants Associated with Red Blood Cell Traits. Am. J. Hum. Genet. 2016;99:8–21. doi: 10.1016/j.ajhg.2016.05.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 101.Cserti-Gazdewich C.M., Mayr W.R., Dzik W.H. Plasmodium falciparum malaria and the immunogenetics of ABO, HLA, and CD36 (platelet glycoprotein IV) Vox Sang. 2011;100:99–111. doi: 10.1111/j.1423-0410.2010.01429.x. [DOI] [PubMed] [Google Scholar]
- 102.Fillebeen C., Charlebois E., Wagner J., Katsarou A., Mui J., Vali H., Garcia-Santos D., Ponka P., Presley J., Pantopoulos K. Transferrin receptor 1 controls systemic iron homeostasis by fine-tuning hepcidin expression to hepatocellular iron load. Blood. 2019;133:344–355. doi: 10.1182/blood-2018-05-850404. [DOI] [PubMed] [Google Scholar]
- 103.Aljohani A.H., Al-Mousa H., Arnaout R., Al-Dhekri H., Mohammed R., Alsum Z., Nicolas-Jilwan M., Alrogi F., Al-Muhsen S., Alazami A.M., Al-Saud B. Clinical and immunological characterization of combined immunodeficiency due to TFRC mutation in eight patients. J. Clin. Immunol. 2020;40:1103–1110. doi: 10.1007/s10875-020-00851-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104.Pan D., Kalfa T.A., Wang D., Risinger M., Crable S., Ottlinger A., Chandra S., Mount D.B., Hübner C.A., Franco R.S., Joiner C.H. K-Cl cotransporter gene expression during human and murine erythroid differentiation. J. Biol. Chem. 2011;286:30492–30503. doi: 10.1074/jbc.M110.206516. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 105.Marcoux A.A., Garneau A.P., Frenette-Cotton R., Slimani S., Mac-Way F., Isenring P. Molecular features and physiological roles of K+-Cl- cotransporter 4 (KCC4) Biochim. Biophys. Acta, Gen. Subj. 2017;1861:3154–3166. doi: 10.1016/j.bbagen.2017.09.007. [DOI] [PubMed] [Google Scholar]
- 106.Brugnara C. Sickle cell disease: from membrane pathophysiology to novel therapies for prevention of erythrocyte dehydration. J. Pediatr. Hematol. Oncol. 2003;25:927–933. doi: 10.1097/00043426-200312000-00004. [DOI] [PubMed] [Google Scholar]
Data Availability Statement
Data for each participating study can be accessed through dbGaP with the corresponding accession number (Amish, phs000956; ARIC, phs001211; BioMe, phs001644; CARDIA, phs001612; CHS, phs001368; COPDGene, phs000951; FHS, phs000974; GeneSTAR, phs001218; HCHS/SOL, phs001395; JHS, phs000964; MESA, phs001416; SAFS, phs001215; WHI, phs001237). Analysis results for the conditional single variant analyses and the aggregate conditional analyses can be accessed through dbGaP accession number phs001974.