Author manuscript; available in PMC: 2025 Jul 1.
Published in final edited form as: Nature. 2025 Jan 1;637(8048):1186–1197. doi: 10.1038/s41586-024-08346-4

Monoallelic expression can govern penetrance of inborn errors of immunity

O’Jay Stewart 1,2,3, Conor Gruber 4,21, Haley E Randolph 1,2,21, Roosheel Patel 3,21, Meredith Ramba 1,2,3,21, Enrica Calzoni 2, Lei Haley Huang 2,5, Jay Levy 3,5, Sofija Buta 1,2, Angelica Lee 2,3, Christos Sazeides 1,2,3, Zoe Prue 1,2, David P Hoytema van Konijnenburg 6, Ivan K Chinn 7,8, Luis A Pedroza 2, James R Lupski 7,9,10, Erica G Schmitt 11, Megan A Cooper 11, Anne Puel 12,13,14, Xiao Peng 15, Stéphanie Boisson-Dupuis 12,13,14, Jacinta Bustamante 12,13,14,16, Satoshi Okada 17, Marta Martin-Fernandez 1,2,3,18, Jordan S Orange 2, Jean-Laurent Casanova 12,13,16,19,20, Joshua D Milner 2,5, Dusan Bogunovic 1,2,5,17
PMCID: PMC11804961  NIHMSID: NIHMS2050700  PMID: 39743591

Abstract

Inborn errors of immunity (IEIs) are genetic disorders that underlie susceptibility to infection, autoimmunity, autoinflammation, allergy and/or malignancy1. Incomplete penetrance is common among IEIs despite their monogenic basis2. Here we investigate the contribution of autosomal random monoallelic expression (aRMAE), a somatic commitment to the expression of one allele3,4, to phenotypic variability observed in families with IEIs. Using a clonal primary T cell system to assess aRMAE status of genes in healthy individuals, we find that 4.30% of IEI genes and 5.20% of all genes undergo aRMAE. Perturbing H3K27me3 and DNA methylation alters allele expression commitment, in support of two proposed mechanisms5,6 for the regulation of aRMAE. We tested peripheral blood mononuclear cells from individuals with IEIs with shared genetic lesions but discordant clinical phenotypes for aRMAE. Among two relatives who were heterozygous for a mutation in PLCG2 (delEx19), an antibody deficiency phenotype corresponds to selective mutant allele expression in B cells. By contrast, among relatives who were heterozygous for a mutation in JAK1 (c.2099G>A; p.S700N), the unaffected carrier T cells predominantly expressed the wild-type JAK1 allele, whereas the affected carrier T cells exhibited biallelic expression. Allelic expression bias was also documented in phenotypically discordant family members with mutations in STAT1 and CARD11. This study highlights the importance of considering both the genotype and the ‘transcriptotype’ in analyses of the penetrance and expressivity of monogenic disorders.


IEIs are monogenic disorders that arise owing to genetic defects that compromise the innate and/or adaptive functions of immunity1. IEIs can manifest as an increased susceptibility to infection, autoinflammation, allergy, autoimmunity or malignancy. Rapidly developing clinical and biochemical research has led to the discovery of genetic lesions in more than 450 genes that underlie IEIs. Similar to many monogenic disorders, IEIs are generally inherited in a dominant or recessive fashion from autosomal or X-linked genes1,7. However, genetic segregation is often non-Mendelian, as incomplete clinical penetrance is frequent among carriers of these genetic defects. Incomplete penetrance is defined as the absence of clinical disease despite the presence of a genotype that has been otherwise shown to cause disease. Variable expressivity refers to the sliding scale of severity of phenotypes, as well as their diversity. Incomplete penetrance and variable expressivity are often detected across families with an IEI2. Indeed, the prevalence of incomplete penetrance across all IEIs is estimated to be upwards of 20%, but may be higher given the bias toward underreporting in the literature2.

Clinical penetrance in IEIs ranges from complete, in which the presence of the genotype is always reported to confer disease (for example, in the cases of mutations in WAS (ref. 8) and LRBA (ref. 9)), to incomplete, in which many family members are unaffected despite carrying disease-causing mutations (for example, in genes such as IL12RB1 (ref. 10) and CTLA4 (ref. 11)). Beyond clinical penetrance, phenotypic variation can also be observed at the cellular level, whereby individual cells with the same mutation exhibit variable immunological phenotypes. Several explanations for incomplete penetrance have been documented and proposed, including modifier genes12, rescue of genetic defects by adaptive immunity13,14, environmental exposures, complex genetic and regulatory networks2,15, and allele-specific expression16. However, the mechanistic underpinnings remain poorly understood, and these explanations are unlikely to fully account for incomplete penetrance (for example, twin discordance) in many heritable conditions. Thus, the genetic properties of IEIs render them an ideal model for further investigation of this question.

Although most genes are thought to be subject to biallelic expression, forms of monoallelic expression such as X inactivation and imprinting have long been studied. In individuals with two X chromosomes, approximately 50% of cells express maternal X alleles and 50% express paternal X alleles, a process mediated by regulating the relative expression of the non-coding RNAs Xist and Tsix17,18. Imprinted genes are consistently expressed from the same allele in a parent-of-origin manner throughout all relevant cells in an individual and are encoded by epigenetic marks established early in life19. In addition, somatic commitment to one allele is well documented for gene families such as those encoding the olfactory receptors20, protocadherins21, immunoglobulins22 and T cell receptors23, each rooted in combinatorial power and/or necessity for biological diversity. By contrast, the role and mechanism of autosomal random monoallelic expression (aRMAE)—a somatic phenomenon defined by the random commitment to expression from one allele—is not as fully understood. aRMAE is: (1) somatically acquired at random across cells; (2) independent of parent of origin and neighbouring gene expression; and (3) stable over time and cell division (which renders this challenging to study without lineage tracing). For aRMAE, the commitment does not have to be absolute and may exist in only a subset of cells. The ultimate result is a heterogeneous population of cells in regard to the expressed allele3,4,24-26. In some sense, aRMAE can be described as a stable allelic ‘fate’ rather than a transient gene-regulatory ‘state’ such as transcriptional bursting, for which dynamic, rapidly alternating gene expression occurs from either allele of an inducible gene3,4,24-26. aRMAE is estimated to occur in 2% to 10% of autosomal genes3, with no clear segregation on the basis of gene function or presence on particular chromosomes. 
Studies using mouse and human clonal cell lines have identified histone methylation marks that are enriched on aRMAE genes5,27, indicating a potential role for dynamic chromatin modulation in the establishment and regulation of aRMAE. Although aRMAE is a well-documented phenomenon biologically, the implications of these observations for human disease remain poorly understood.

Here we examine the aRMAE status of more than 4,000 genes across the genome by harnessing clonal primary T cells derived from healthy donors. Using IEIs as a prototype to interrogate the effect of aRMAE on monogenic disease, we then examine aRMAE in 189 monogenic IEI genes, characterize some of its epigenetic regulatory features, and provide proof of principle that aRMAE can explain incomplete penetrance of monogenic disorders.

Clonal primary T cell system for aRMAE identification

In this study, we first sought to examine the natural propensity of IEI-causing immune genes to undergo aRMAE. To do so, we established a clonal primary T cell system, in which we clonally expanded single CD4+ or CD8+ T lymphocytes (n = 47) from healthy donor (n = 9) peripheral blood mononuclear cells (PBMCs) for a minimum of 8–12 weeks to obtain more than 100,000 cells per clone. Probing clonal cells enables detection of an allele commitment stemming from a single cell while reducing confounding findings from transcriptional bursting. To distinguish between two alleles, we used whole-exome sequencing (WES) on genomic DNA (gDNA) from healthy donor PBMCs to assess the presence of synonymous and nonsynonymous heterozygous single nucleotide polymorphisms (SNPs) located within gene exons, called ‘het-exonic’ SNPs here. To measure the expression levels of each allele at these SNPs, we performed bulk RNA sequencing (RNA-seq) on our expanded T cell clones. Using this system (Fig. 1a), we identified genes with equal expression from both alleles across all assessed clones, defined as biallelic expression, versus genes that expressed 80% or more of the reference allele or alternate allele (Fig. 1b). Genes that are neither imprinted nor located on the X chromosome were classified as aRMAE if two or more clones were assessable and at least one clone was committed to expression from the reference or alternate allele. Allelic imbalance (AI) refers to differing expression levels from two alleles of a gene. We chose an aRMAE expression cutoff of at least 80% allelic imbalance for either allele28 (AI ≥ 0.80 or AI ≤ 0.20). We also conducted analyses using more stringent thresholds for allelic imbalance (AI ≥ 0.90 or AI ≤ 0.10 and AI ≥ 0.95 or AI ≤ 0.05; Supplementary Table 1).
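The classification rule described above can be illustrated with a minimal Python sketch. It assumes that AI is computed as the fraction of reads carrying the reference allele at a het-exonic SNP, and it treats any clone between the two thresholds as biallelic, which simplifies the 'equal expression' criterion in the text. The function names and labels are hypothetical and are not taken from the analysis pipeline used in this study.

```python
from typing import List

AI_HIGH = 0.80  # AI >= 0.80: clone committed to the reference allele
AI_LOW = 0.20   # AI <= 0.20: clone committed to the alternate allele


def allelic_imbalance(ref_reads: int, alt_reads: int) -> float:
    """Assumed definition: fraction of reads at the SNP carrying the reference allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads at this SNP")
    return ref_reads / total


def classify_clone(ai: float) -> str:
    """Label one clone's expression of one gene using the 0.80/0.20 cutoffs."""
    if ai >= AI_HIGH:
        return "mono_ref"
    if ai <= AI_LOW:
        return "mono_alt"
    return "biallelic"  # simplification: anything between the cutoffs


def classify_gene(clone_ai: List[float], imprinted_or_x: bool = False) -> str:
    """Gene-level call: aRMAE needs >= 2 assessable clones and >= 1 committed clone."""
    if imprinted_or_x:
        return "excluded"  # imprinted and X-linked genes are not aRMAE candidates
    if len(clone_ai) < 2:
        return "not_assessable"
    calls = [classify_clone(ai) for ai in clone_ai]
    return "aRMAE" if any(c != "biallelic" for c in calls) else "biallelic"
```

For example, a gene with AI values of 0.50 and 0.93 in two clones would be labelled aRMAE under this sketch, whereas the same gene observed in only one clone would not be assessable.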

Fig. 1 ∣. Ex vivo clonally expanded primary T cells used for detection of aRMAE.


a, Schematic of experiment to establish a clonal (n = 47) primary T cell system to assess aRMAE in healthy donors (n = 9) via expressed transcripts at the site of heterozygous, synonymous SNPs in gene exons (het-exonic SNP). b, Schematic representing the expression patterns of autosomal genes, namely biallelic (BAE), aRMAE monoallelic reference (Ref) and aRMAE monoallelic alternate (Alt), and the allelic imbalance threshold (AI ≥ 0.80 or AI ≤ 0.20) used to classify allele-committed expression. c–f, WES of healthy donor PBMCs (n = 9) and bulk RNA-seq from expanded clones (n = 47) identified 14,694 genes that contain at least 1 het-exonic SNP (c), 3,438 of which have at least 30× read coverage of SNP loci (e) and 318 IEI genes with at least 1 het-exonic SNP (d), 140 of which have at least 30× read coverage of SNP loci (f).

Using an allele-specific expression analysis pipeline (Extended Data Fig. 1a), we define all genes containing at least one het-exonic SNP detected at minimum 30× depth of coverage (at least 30 reads per SNP) as an ‘assessable gene’ for aRMAE analysis. We only considered T cell clones with sufficient, high-quality sequencing data (at least 500 assessable genes per clone). Furthermore, if a gene contained multiple het-exonic SNPs, we only considered the SNP with the highest coverage (Extended Data Fig. 2). Across the assessed healthy donors, WES analysis confidently detected a nucleotide variation in 18,091 protein-coding genes. Het-exonic SNPs were identified in 81% (n = 14,694) of these genes in at least one donor (Fig. 1c). In the smaller subset of IEI genes, a het-exonic SNP was detected in 70% (n = 318) of genes (Fig. 1d). RNA-seq reads mapped to het-exonic SNPs in 19% (n = 3,438) of protein-coding genes (Fig. 1e). Ultimately, 44% (n = 140) of assessable IEI genes were sufficiently covered (at least 30 reads) by RNA-seq of synonymous het-exonic SNPs (Fig. 1f). The use of missense variants in addition to synonymous variants from donor WES increased all assessable genes to 30% (n = 4,366) (Extended Data Fig. 1b) and IEI genes to 59% (n = 189) (Extended Data Fig. 1d). Consideration of only missense variants resulted in 2,155 (Extended Data Fig. 1c) assessable genes across the genome and 117 IEI genes (Extended Data Fig. 1e). The expression of T cell receptor (TCR)-β variable (TRBV) genes from bulk RNA-seq of the T cell clones was used to verify the largely single-cell origin of each clone (Extended Data Fig. 3a-c and Supplementary Table 3). Up to 22% of single T cells have previously been shown to express multiple TCR-β chains at the level of mRNA29,30. All genes across T cell clones were assessed for reference bias mapping of the heterozygous SNPs (Extended Data Fig. 4). 
This experimental design enabled us to interrogate aRMAE in a genome-wide and IEI-specific manner.
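The filtering steps described in this section (at least 30 reads per SNP, one SNP per gene chosen by highest coverage, and at least 500 assessable genes per clone) can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline shown in Extended Data Fig. 1a: the record layout and names are hypothetical, and the highest-coverage SNP is assumed to be selected separately within each clone.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple

MIN_DEPTH = 30             # minimum reads at a het-exonic SNP
MIN_GENES_PER_CLONE = 500  # minimum assessable genes for a clone to be kept


class SnpCount(NamedTuple):
    """Hypothetical per-clone read counts at one het-exonic SNP."""
    clone: str
    gene: str
    snp_id: str
    ref_reads: int
    alt_reads: int

    @property
    def depth(self) -> int:
        return self.ref_reads + self.alt_reads


def assessable_genes(
    counts: Iterable[SnpCount],
    min_depth: int = MIN_DEPTH,
    min_genes: int = MIN_GENES_PER_CLONE,
) -> Dict[str, Dict[str, SnpCount]]:
    """Return, per retained clone, the highest-coverage qualifying SNP for each gene."""
    best: Dict[str, Dict[str, SnpCount]] = defaultdict(dict)
    for rec in counts:
        if rec.depth < min_depth:
            continue  # SNP not covered deeply enough to assess
        current = best[rec.clone].get(rec.gene)
        if current is None or rec.depth > current.depth:
            best[rec.clone][rec.gene] = rec
    # Drop clones with too few assessable genes.
    return {c: genes for c, genes in best.items() if len(genes) >= min_genes}
```

Lowering `min_genes` is convenient for small test inputs; with the default of 500, a clone with only a handful of covered genes is discarded entirely.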

Detection of aRMAE genes

Using our clonal T cell system, we assessed the aRMAE status of the assessable protein-coding genes defined above. As expected, most autosomal genes expressed alleles in a biallelic manner, but some genes displayed an expression pattern committed to the reference (2.6%) or the alternate (2.9%) alleles (Fig. 2a-c, AI ≥ 0.80 or AI ≤ 0.20, false discovery rate (FDR) < 0.05). Of the 140 IEI genes showing the requisite depth of coverage for detection of a het-exonic SNP, 2.1% committed to expression from the reference allele and 2.8% committed to expression from the alternate allele (Fig. 2b and Supplementary Table 1, AI ≥ 0.80 or AI ≤ 0.20, FDR ≤ 0.05). These findings were similar when assessing missense variants alone or combined with synonymous variants (Extended Data Fig. 5a-f). Overall, 4.30% of IEI genes were found to undergo aRMAE, with 5.20% of all genes expressed from either allele (Fig. 2c and Supplementary Table 1). Using more stringent cutoffs (AI ≥ 0.90 or AI ≤ 0.10, and AI ≥ 0.95 or AI ≤ 0.05), aRMAE was detected at rates of 3.08% and 2.18% genome wide, and 2.87% and 1.42% for IEI genes, respectively (Extended Data Fig. 5c,d and Supplementary Table 1), whereas at AI ≥ 0.80 or AI ≤ 0.20, 5–6% of IEI genes and all genes undergo monoallelic expression when mapping with missense or missense and synonymous variants (Extended Data Fig. 5e,f). Preliminary gene set enrichment analysis (GSEA) of aRMAE genes shows enrichment of genes involved in cellular organization and immune responses (Extended Data Fig. 5g). To further validate our analysis pipeline, we examined the allele-specific expression patterns of known imprinted genes and genes on the X chromosome. We verified that if a gene was indeed imprinted within a given donor (AI ≥ 0.80 or AI ≤ 0.20), all clones derived from that donor were committed to expression from that same allele (Fig. 2d). 
Furthermore, genes on the X chromosome were expressed from a single allele in female donors unless they were known to escape X chromosome inactivation31 (Extended Data Fig. 6a-c and Supplementary Table 2). Unlike imprinted and X chromosome inactivated genes, examination of aRMAE genes across clones within a donor demonstrates that aRMAE genes do not necessarily have the same allelic commitments across clones from the same donor (Fig. 2e and Extended Data Fig. 6d-f). aRMAE genes were detected across all chromosomes, with comparable gene expression between aRMAE and biallelic expression genes (Fig. 2f). In addition, we examined the prevalence of aRMAE across T helper cell subsets and donors, showing limited differences (Extended Data Fig. 7a-g) and—so far—no preference for any T cell subset.

Fig. 2 ∣. aRMAE of widespread autosomal genes and IEI genes revealed with a T cell system.


a,b, Allelic imbalance of genes across T cell clones (n = 29) from healthy donors (n = 7) with at least 500 assessable genes per clone. a, Autosomal genes are predominantly expressed in a biallelic manner (98%) per clone; however, collectively, 5% (171 out of 3,438) of genes are capable of exhibiting monoallelic expression. b,c, IEI genes are predominantly expressed in a biallelic manner (98.4%), with 4% (6 out of 140) (c) being capable of monoallelic expression. d, Imprinted genes in expanded T cell clones have a stable allelic imbalance across clones within the same donor. e, Representative plot of two-sided Pearson correlation for aRMAE genes (n = 28) across two clones from a single donor (HD6). aRMAE genes have a significant (P = 0.04182) but poor correlation (r = 0.387) between clones. Error bands represent the 80% confidence interval. f, Biallelic and aRMAE genes have similar expression levels (P = 0.3444, two-tailed Student’s t-test with Welch’s correction). Biallelic: n = 24,709, minimum = 30, maximum = 10,769, median = 77, 25th percentile = 47, 75th percentile = 151; aRMAE: n = 427, minimum = 30, maximum = 4,879, median = 70, 25th percentile = 43, 75th percentile = 135. Whiskers extend to minimum and maximum values. g, IEI genes with known incomplete penetrance were assessed for aRMAE (≥10× coverage at het-exonic SNP loci). aRMAE was detected in JAK1 (6 out of 26), NOD2 (6 out of 19), PLCG2 (2 out of 7), PRF1 (6 out of 24) and STAT1 (2 out of 5) clones across healthy donors (n = 6). a–c,g, FDR-adjusted P values from a two-sided binomial test of allelic imbalance by the Benjamini–Hochberg method; P = 0.50 under the null hypothesis of biallelic expression. Genes with FDR < 0.05 and AI ≥ 0.8 or AI ≤ 0.2 are shown.
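The statistical call named in this legend, a two-sided binomial test against an expected proportion of 0.50 followed by Benjamini–Hochberg adjustment, can be sketched in plain Python as below. This is a minimal illustration and not the code used for Fig. 2: the function names are hypothetical, AI is again assumed to be the reference-allele read fraction, and in practice an established statistics library would normally be used instead of these hand-written routines.

```python
from math import comb
from typing import List, Sequence, Tuple


def binom_two_sided_p(k: int, n: int) -> float:
    """Exact two-sided binomial P value for k reference reads out of n, null p = 0.5."""
    # The null distribution is symmetric, so sum every outcome that lies at
    # least as far from n/2 as the observed count.
    dist = abs(k - n / 2)
    tail = sum(comb(n, i) for i in range(n + 1) if abs(i - n / 2) >= dist)
    return tail / 2 ** n


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_committed(
    counts: Sequence[Tuple[int, int]],
    fdr: float = 0.05,
    hi: float = 0.80,
    lo: float = 0.20,
) -> List[bool]:
    """Flag (ref, alt) read pairs with FDR < fdr and AI >= hi or AI <= lo."""
    p = [binom_two_sided_p(r, r + a) for r, a in counts]
    q = benjamini_hochberg(p)
    return [
        qv < fdr and (r / (r + a) >= hi or r / (r + a) <= lo)
        for (r, a), qv in zip(counts, q)
    ]
```

Requiring both the FDR criterion and the AI threshold means that a statistically significant but modest skew (for example, 60% reference reads at very high depth) is not called as committed expression.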

aRMAE of genes in incompletely penetrant IEIs

We next used our clonal T cell system to assess aRMAE in selected genes linked to dominantly inherited IEIs that are known to show incomplete clinical penetrance. We chose to focus on IEIs with autosomal dominant inheritance since the presence of a heterozygous mutation can confer disease and is poised to be influenced by aRMAE. Sequencing of many loci did not achieve our initial requirement of 30× depth of coverage, but our previous study using single-cell RNA-seq (scRNA-seq) of JAK1 identified 10× read depth as sufficient to assess allele-specific expression32. As a result, we relaxed our read depth requirements to a minimum 10× read depth at the site of a het-exonic SNP. We found that expression of JAK1, NOD2, PLCG2, PRF1 and STAT1 can be committed to either allele within healthy individuals, with as many as 50% of clones (for example, NOD2) and as few as 20% of clones (for example, JAK1) expressing these genes from a single allele (Fig. 2g). In contrast to the fixed pattern observed with imprinted genes, the expressed allele for aRMAE genes is not consistent across all clones from an individual (Fig. 2g). In addition, assessed donors diverged in terms of the presence, magnitude and content of genes undergoing aRMAE (Fig. 2g), suggesting that this unique form of gene expression provides many opportunities for achieving greater functional diversity within and amongst individuals, with potential to affect the clinical penetrance and severity of monogenic disease.

Epigenetic regulation of aRMAE

DNA methylation is known to have an important role in the expression of imprinted loci. Previous studies have reported a correspondence between aRMAE loci and differential DNA methylation, as well as enrichment for chromatin marks associated with transcriptional activation (histone H3 lysine 4 trimethylation (H3K4me3)) and repression (histone H3 lysine 27 trimethylation (H3K27me3))5,6,32,33. To assess the effect of altering these signals (H3K4me3, H3K27me3 and 5′-methylcytosine) on aRMAE, we perturbed genes encoding the demethylase enzymes for H3K4me3 (JARID1B) and H3K27me3 (JMJD3), as well as the methyltransferase enzyme (DNMT1) for DNA methylation, by treating expanded T cell clones from healthy donors with small interfering RNAs (siRNAs) to reduce expression levels of JARID1B, JMJD3 or DNMT1, and determining the effect on allelic expression of PLCG2 (Fig. 3a-j and Extended Data Fig. 8a-g). Specifically, T cell clones from healthy donor 1 (HD1, n = 5) and healthy donor 2 (HD2, n = 4) were targeted for siRNA-mediated knockdown of genes encoding demethylases for H3K4me3 (JARID1B siRNA) and H3K27me3 (JMJD3 siRNA). T cell clones in the JARID1B siRNA group exhibited no significant changes in allelic imbalance of PLCG2 across clones from HD1 and HD2 (Fig. 3b,e,h). HD2 T cell clones from the JMJD3 siRNA group had a significantly altered PLCG2 allelic imbalance (Fig. 3c), with three clones having a difference of more than 20% (Fig. 3f,i). However, no clones from HD1 exhibited an appreciable change in allelic imbalance (Fig. 3c,f). DNMT1 siRNA clones exhibited significant allelic imbalance shifts of PLCG2 in clones from HD1 (Fig. 3d,g,j), but only a moderate increase in HD2 clones (Fig. 3d), with only one clone from HD2 exhibiting a change of at least 20% in allelic imbalance (Fig. 3g). Of note, the magnitude of change in allelic imbalance was greater among the DNMT1 siRNA clones (Fig. 3d,g) compared with the JMJD3 siRNA clones (Fig. 3c,f).

Fig. 3 ∣. H3K27me3 and DNA methylation regulate allele commitment in aRMAE.


a, Schematic showing siRNA-mediated knockdown of genes encoding demethylating enzymes for H3K4me3 (JARID1B) and H3K27me3 (JMJD3) and DNA methylating enzyme (DNMT1) in expanded T cell clones from healthy donors (HD1 and HD2), and assessment of changes in PLCG2 expression. b–d, Shift in the allelic imbalance of PLCG2 in all T cell clones from two healthy donors targeted for knockdown of JARID1B (b; n = 9), JMJD3 (c; n = 9) and DNMT1 (d; n = 7). Dunnett’s multiple comparisons test. Data are mean ± s.e.m. ***P < 0.001; *P < 0.01. e–g, Shift in allelic imbalance of PLCG2 of individual T cell clones from HD1 and HD2 targeted for knockdown of JARID1B (e), JMJD3 (f) and DNMT1 (g). h–j, Representative ddPCR plots corresponding to e–g, respectively. Results representative of two independent experiments. NS, not significant.

The directionality of changes in allelic imbalance can provide useful insight into the regulation of allele-specific expression. Clones in the JARID1B siRNA group did not show appreciable changes in expression from either allele (Extended Data Fig. 8a). Two clones in the JMJD3 siRNA group shifted from exclusive expression of the reference allele of PLCG2 to biallelic expression (93% to 63% reference allele (Ref) expression, and 89% to 62% Ref expression), whereas one clone further committed to expression from the reference allele (79% to 93% Ref expression; Extended Data Fig. 8b). Prior to siRNA transfection, 5 out of 7 clones in the DNMT1 siRNA group were committed to expression (80–100%) from the reference allele of PLCG2, whereas the others exhibited either biallelic (48% Ref and 52% alternate allele (Alt)) or biased expression (37% Ref and 63% Alt) of PLCG2. In these two latter clones, DNMT1 knockdown led to at least 90% expression from the reference allele, whereas 1 out of 5 clones that previously committed to expression from the reference allele shifted towards biallelic expression (99% Ref to 75% Ref; Extended Data Fig. 8c).

Expression levels of JARID1B and JMJD3 were significantly reduced after siRNA-mediated knockdown (Extended Data Fig. 8d). However, DNMT1 expression levels were not significantly reduced, despite trending downwards (Extended Data Fig. 8d). Protein levels of H3K4me3 were increased, albeit slightly, after siRNA targeting of JARID1B (Extended Data Fig. 8e), and 5′-methylcytosine levels were slightly reduced after targeting of DNMT1 (Extended Data Fig. 8g). Although siRNA reduced mRNA expression levels of JMJD3 (Extended Data Fig. 8d), we were not able to detect significant differences in global H3K27me3 levels via immunofluorescence (Extended Data Fig. 8f), despite a clear functional effect on PLCG2 allelic imbalance (Fig. 3c). Together, these results indicate that disrupting JMJD3 function may, although not exclusively, correspond with a shift from monoallelic expression towards biallelic expression, whereas reduced DNA methylation may, although not exclusively, correspond with a shift from biallelic expression towards monoallelic expression. These signals are also mechanistically linked, particularly for the establishment of heterochromatin, so they may also affect each other. In general, our data and previous studies5,6,32,33 suggest that histone modifications and DNA methylation status both have the potential to affect the allelic expression of a gene. However, further analyses are needed to definitively link epigenetic marks and regulation of aRMAE. Given the complexities of other important contributions to gene regulation, such as long-range cis-regulatory interactions, nucleosome dynamics, and trans-factor recruitment, further studies are needed to unpack the mechanisms underlying the suggestive observations we have made.

aRMAE of PLCG2 in donor T cell clones

The discovery of aRMAE of IEI genes suggests that this form of gene expression may have an effect on disease penetrance, especially in the case of autosomal dominant diseases. Autosomal dominant IEIs provide a well-suited platform for assessing the influence of aRMAE on incomplete penetrance and disease expressivity, as genetically heterozygous cells could either preferentially express the mutant allele to manifest a pathogenic phenotype or be spared by expressing the wild-type allele. As the proportion of these two transcriptional phenotypes can vary across individuals and within cell types, so could the penetrance and expressivity of clinical disease.

RNA-seq of T cell clones identified PLCG2 as an IEI gene that exhibits aRMAE (Fig. 2g). We further evaluated the aRMAE status of PLCG2 using digital droplet PCR (ddPCR). We used nine T cell clones across two healthy donors (healthy donor 5 (HD5) and healthy donor 6 (HD6)), who both carried a silent genetic variant in an exon of PLCG2 (rs1143688) in the heterozygous state. The expression of the reference or alternate allele of this SNP was detected via fluorescent probes specifically designed to distinguish between each allele (Ref-VIC and Alt-FAM). Both assessed donors were heterozygous for rs1143688 in gDNA: 50.5% Ref and 49.5% Alt (HD6), and 50.4% Ref and 49.6% Alt (HD5) (Fig. 4a). Most clones (6 out of 9) exhibited biallelic expression, whereas monoallelic expression was detected in 2 clones from HD5, which expressed 91.5% and 94% from the reference allele, respectively (Fig. 4a). This observation validates our initial finding from bulk RNA-seq data, which demonstrated aRMAE of PLCG2 (Fig. 2g). We next assessed the stability of this allelic imbalance, by testing the same clonally expanded T cells within a four-week span in culture. We noted that the expression of either allele was indeed stable, as the clone initially expressed 94% from the reference allele of PLCG2 and, after re-assessment, expressed 84% from the reference allele (Extended Data Fig. 9a). Given the additional support for aRMAE of PLCG2 from ddPCR, we next assessed the effect of the allele-specific commitment of expression on the phenotypic penetrance of PLCG2 monogenic disease.
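As an illustration of how allele percentages such as those above can be derived from droplet counts, the following sketch applies the standard Poisson correction used in droplet-based digital PCR, in which the mean number of template copies per droplet is estimated as −ln(1 − fraction of positive droplets). It is an assumption that the reported percentages were obtained in this way; the function names and droplet counts are hypothetical, and the instrument software used in this study may differ in detail.

```python
from math import log


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson-corrected mean template copies per droplet for one probe channel."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    return -log(1 - positive / total)


def reference_fraction(ref_positive: int, alt_positive: int, total: int) -> float:
    """Fraction of allele copies that are reference (for example, VIC vs FAM channels)."""
    lam_ref = copies_per_droplet(ref_positive, total)
    lam_alt = copies_per_droplet(alt_positive, total)
    if lam_ref + lam_alt == 0:
        raise ValueError("no positive droplets in either channel")
    return lam_ref / (lam_ref + lam_alt)
```

With equal numbers of positive droplets in both channels the function returns 0.5, the value expected for heterozygous gDNA, whereas a clone committed to the reference allele yields a value close to 1.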

Fig. 4 ∣. Expression of PLCG2 from a single allele influences penetrance of antibody deficiency.


a, aRMAE of PLCG2 detected using nine expanded T cell clones assessed across two healthy donors. Expression of reference or alternate alleles of a PLCG2 het-exonic SNP (rs1143688) determined using ddPCR. Results representative of two experiments. b, Pedigree of a family with carriers of a heterozygous exon 19 deletion in PLCG2 (PLCG2wt/ΔEx19). Allele-specific expression in III-11 (low Ig levels) and IV-4 (normal Ig levels), who have discordant antibody deficiency phenotypes. c, Schematic of Ca2+ flux assay to separate PLCG2wt/ΔEx19 CD3−CD19+ B cells on the basis of cell function. d, Representative flow cytometry plot of Ca2+ flux CD3−CD19+ B cell sort. e–g, Allele-specific expression of PLCG2wt (wild-type (WT)) and PLCG2ΔEx19 (mutant (MUT)) alleles in Ca2+ flux-positive (Flux+) and Ca2+ flux-negative (Flux−) bulk CD3−CD19+ B cells from a healthy donor (e; n = 1) and patients with PLAID (III-11 (f) and IV-4 (g); n = 2), detected by RT–qPCR. Experiments with human samples in e–g were performed three times with similar results.

aRMAE in B cells of patients with PLCG2 mutations

With the additional evidence for PLCG2 aRMAE above, we next assessed the impact of the allele-specific commitment of expression on the phenotypic penetrance of PLCG2-related conditions. PLCG2 encodes PLCγ2, a protein with key roles in calcium flux and cellular activation of B and natural killer (NK) cells that are notably temperature dependent34. Heterozygosity for deletion of exon 19 (PLCG2wt/ΔEx19) and exons 20–22 (PLCG2wt/ΔEx20-22) in PLCG2 is associated with PLCG2-associated antibody deficiency and immune dysregulation (PLAID), which leads to cold-induced urticaria and incompletely penetrant antibody deficiency34. We examined aRMAE in two family members (designated III-11 and IV-4) from a large four-generation family affected by PLAID (Fig. 4b), who are heterozygous for deletion of PLCG2 exon 19 (PLCG2wt/ΔEx19) but show divergent humoral immunity phenotypes (Extended Data Fig. 9b).

First, CD19+ cells from III-11 and IV-4 were sorted on the basis of a functional assay that measures sustained influx of extracellular calcium across the plasma membrane, using Indo-1, a cell-permeant ratiometric Ca2+ indicator, and an anti-IgM crosslinking antibody for cell activation and subsequent calcium flux (Fig. 4c). UV fluorescence intensity detected cells that have normal flux (Flux+) or lack flux (Flux−) (Fig. 4d). III-11 and IV-4 both had decreased amounts of Flux+ B cells (Extended Data Fig. 9c) and reduced peak levels of internal Ca2+ flux (Extended Data Fig. 9d,e) compared with healthy controls. Expression of the PLCG2wt and PLCG2ΔEx19 alleles was detected via quantitative PCR with reverse transcription (RT–qPCR). Patient III-11, who had hypogammaglobulinemia (Extended Data Fig. 9b), showed increased expression of the PLCG2ΔEx19 allele in Flux−CD19+ cells (Fig. 4e-g). IV-4, who had normal levels of serum immunoglobulin (Extended Data Fig. 9b) and no significant infectious history, had similar amounts of PLCG2ΔEx19 and PLCG2wt expression in Flux−CD19+ and Flux+CD19+ cells (Fig. 4e-g). This demonstrates that gene expression bias towards a single allele of a gene that is neither imprinted nor X-inactivated can influence cellular phenotype (signalling for B cell calcium flux) and clinical disease phenotype (antibody deficiency).

In summary, allele-specific PLCG2 expression was demonstrated in bulk-sorted B cells, which corresponded directly with evidence of cellular activity, as B cells deficient in a Ca2+ flux response isolated from a patient who was unable to produce antibodies were found to almost exclusively express the mutant allele. These data suggest that selective expression of mutant alleles may contribute to the penetrance of some immune disease phenotypes, such as antibody deficiency.

aRMAE of JAK1 in donor T cell clones

We previously found evidence for allele-specific expression of JAK1 via targeted scRNA-seq in an individual with a gain-of-function (GOF) mutation (p.S703I) in the pseudokinase domain of JAK135. As these studies were carried out without lineage tracing, it remains unclear whether the biased JAK1 expression represents transient variation in allele usage (for example, transcriptional bursts, which are unlikely to alter the allelic balance of the protein) or stable commitment, as in aRMAE. To assess the propensity of JAK1 to undergo aRMAE, we evaluated 47 clones across 9 healthy donors who were heterozygous for silent exon variants (Fig. 2g). Among these, 26 out of 47 clones across 6 healthy donors were assessable for JAK1 aRMAE and 6 out of 26 clones in 6 donors exhibited committed JAK1 expression from a single allele, confirming aRMAE of JAK1. To technically validate the results, we used cDNA and performed ddPCR on two clones that were assessed above for JAK1 aRMAE via bulk RNA-seq. The ddPCR results were accurate within 3% of reference allele expression (Extended Data Fig. 10a).

Next, also using ddPCR, we detected the expression of the reference and alternate alleles of the JAK1 het-SNP rs3737139 in two unrelated healthy donors (HD3 and HD5) across 12 additional established T cell clones (6 clones each). Both donors demonstrated heterozygosity for rs3737139 in gDNA and mRNA from bulk PBMCs (49.63% Ref and 50.37% Alt for HD3, and 51.01% Ref and 48.99% Alt for HD5) (Fig. 5a). When single cells were isolated and expanded, most clonal samples (four out of six) from HD3 displayed monoallelic expression from the reference allele. However, most clones assessed from HD5 maintained biallelic expression, except one clone for which the reference allele accounted for 97% of JAK1 expression (Fig. 5a). These findings from additional expanded T cell clones of healthy individuals provide further evidence for aRMAE manifesting as stable allele-specific expression at JAK1, but with marked inter-individual variability.

Fig. 5 ∣. Predominant expression of the wild-type allele in T cells drives incomplete penetrance of JAK1-mediated disease.


a, aRMAE of JAK1 identified using 12 additional expanded T cell clones assessed across two healthy donors (HD3 and HD5). Expression of each allele was determined by the intensity of TaqMan probes binding to cDNA of the reference or alternate alleles of a JAK1 het-exonic SNP (rs3737139) using ddPCR. Results are representative of two experiments. b, Pedigree of a family with carriers of a heterozygous mutation in JAK1. II-2 (affected by disease) and III-1 (healthy carrier) display incomplete penetrance of JAK1-mediated inflammatory disease. c, Sanger sequencing of a heterozygous mutation in II-2 and III-1. d, Allele-specific expression analysis of bulk PBMCs from III-1 (78% JAK1wt, 22% JAK1mut) and II-2 (64% JAK1wt, 36% JAK1mut). WT, wild-type. e, Allelic imbalance of [WT/(WT + mut)] JAK1 alleles in CD19+ B cells, CD3+ T cells, CD14+ monocytes and CD56+ NK cells from the affected (II-2) and unaffected (III-1) family members. Data are from ddPCR reactions for detection of fluorescence for VIC (wild type) and FAM (mutant (mut)) using a custom TaqMan probe for the JAK1 c.2099G>A p.S700N locus. Two-sided binomial test, P = 0.50 (biallelic expression). ****P < 0.0001. f, Allelic imbalance of JAK1 c.2099G>A p.(S700N) in CD3+ T cells from affected (II-2) and unaffected (III-1) family members. Data are mean ± 95% confidence interval (binomial test). Experiments with human samples in d–f were performed twice with similar results.

aRMAE in leukocytes of patients with JAK1 GOF

To understand how JAK1 aRMAE may contribute to disease penetrance, we studied a family harbouring heterozygosity for a JAK1 c.2099G>A, (p.S700N) variant that is documented to affect a pseudokinase domain residue of JAK15. JAK1 encodes a key mediator of cytokine signalling through the JAK–STAT signalling axis, and GOF mutations in this gene lead to JAACD syndrome (JAK1 GOF-associated autoimmunity, atopy, colitis and dermatitis), an incompletely penetrant autoinflammatory disorder35-37. Biochemically, we previously showed that the JAK1 variant found in this family leads to cytokine-independent activation of JAK–STAT signalling, as indicated by increased baseline levels of STAT1 and STAT2 phosphorylation, consistent with the predicted GOF effect37. Out of three family members who shared this mutation (I-1, II-2 and III-1; Fig. 5b), only one (II-2) showed clinically penetrant disease (Extended Data Fig. 10b-d), so we compared the allelic presence and expression of wild-type and mutant JAK1 (Fig. 5c) in leukocytes from II-2 and III-1 (we were unable to obtain blood samples from I-1).

Expression levels of the wild-type and c.2099G>A (mutant) alleles were detected using fluorescently labelled TaqMan probes via ddPCR. We first assessed mRNA levels of each allele in bulk PBMCs. The unaffected family member (III-1) expressed the mutant allele at 22%, as opposed to the 50% expected from gDNA heterozygosity. By contrast, the affected relative (II-2) expressed the mutant allele at 36% in bulk PBMCs (Fig. 5d). Although neither PBMC sample exhibited more than 80% expression of the mutant or wild-type allele, the difference in allelic imbalance estimates between the unaffected (III-1) and affected (II-2) heterozygotes suggests that mutant allele expression levels may contribute to disease manifestation.

PBMCs from both heterozygous individuals (II-2 and III-1) and a healthy family member without the JAK1 mutation (II-1) were sorted into bulk CD3+ T cells, CD19+ B cells, CD56+ NK cells and CD14+ monocytes. Using ddPCR, we examined the expression of wild-type and mutant alleles within each sorted cell type to determine the selective expression of either allele (Fig. 5e,f and Extended Data Fig. 10c,d). No cell types demonstrated biased expression favouring the mutant allele. Instead, all cell types tested in the affected proband (II-2), except T cells, showed committed expression to the wild-type allele despite DNA heterozygosity (Fig. 5e,f and Extended Data Fig. 10c,d). Of note, NK cells from the unaffected family member (III-1) also showed biallelic expression of JAK1 (Fig. 5e and Extended Data Fig. 10c,d), which may be attributed to the greater ability of NK cells to tolerate GOF signalling, as we have previously shown for the individual with the GOF p.S703I mutation in JAK135. This suggests that reduced mutant allele expression may correspond with lower penetrance of clinical disease. In other words, evaluation of the sorted cells from heterozygotes III-1 and II-2 (Extended Data Fig. 10c,d) found selective expression of the wild-type allele in cell types tested from the healthy female family member (III-1), whereas B cells, NK cells and monocytes from her affected mother (II-2) generally followed the same trend. In II-2, only T cells exhibited nearly equal mRNA expression of wild-type and mutant alleles, which may result in sufficient production of the mutant protein to cross a critical signalling threshold for pathogenesis (Fig. 5e,f and Extended Data Fig. 10c,d). These findings indicate that allelic imbalances between wild-type and disease-causing alleles can indeed modulate the presence or absence of disease in individuals with the same gDNA dosage of a disease-causing mutation.

aRMAE in STAT1 LOF patient leukocytes

In healthy clonal T cells, we also identified STAT1 as a gene that undergoes aRMAE (Fig. 2g). To evaluate the effect of STAT1 aRMAE on disease penetrance we assessed a family with a heterozygous mutation in STAT1. The STAT1 c.1976T>C (p.I659T) mutation leads to autosomal dominant inheritance of disease due to loss of function (LOF) of STAT1 activity38. Autosomal dominant STAT1 LOF deficiency is a known cause of Mendelian susceptibility to mycobacterial disease39,40, owing to a loss of facilitation of gene transcription by STAT1 after activation by type II interferon (IFNγ) via IFNγR39,40, as well as impaired antiviral immunity if type I IFN responses are affected. We analysed two individuals from the family sharing the c.1976T>C mutation in STAT1 who had discordant disease presentation (Fig. 6a). The proband II-1 had penetrant disease (severe multifocal tuberculosis), whereas I-1 was unaffected despite carrying the deleterious mutation.

Fig. 6 ∣. aRMAE in people with mutations in STAT1 and CARD11.


a, Family pedigree of an affected individual (II-1) and an unaffected carrier (I-1) with a mutation at the STAT1 locus. b, Experimental design to assess aRMAE of STAT1 in lymphoid and myeloid cell subsets. c, Allelic imbalance of [WT/(WT + mut)] STAT1 alleles in CD19+ B cells (CD19), CD3+ T cells (CD3), CD56+ NK cells (CD56), classical monocytes (CM), intermediate monocytes (IM) and non-classical monocytes (NCM) from a healthy control and the affected (II-1) and unaffected (I-1) family members. Data are from dPCR reactions for detection of fluorescence for VIC (wild type) and FAM (mutant) using a custom TaqMan probe for the STAT1 c.1976T>C p.(I659T) locus. Binomial test, with P = 0.50 (biallelic expression). ****P < 0.0001. d, Experimental design to assess aRMAE of STAT1 in CD3+ T cells and monocyte subsets. e, Allelic imbalance [WT/(WT + mut)] of STAT1 alleles in intermediate monocytes from P1 (carrying p.R274Q), CD3+ T cells and non-classical monocytes from P2 (carrying p.K388E), and CD3+ T cells, classical monocytes and non-classical monocytes from P3 (carrying p.T385M). Data are from dPCR reactions for detection of fluorescence for VIC (wild type) and FAM (mutant) using custom TaqMan probes for the STAT1 p.R274Q, p.K388E and p.T385M loci. Two-sided binomial test, P = 0.50 (biallelic expression). *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001. f, Family pedigree of three individuals with CARD11 mutations (I-1, II-1 and II-2). Minimally affected father (I-1) denoted in grey. g, Schematic illustrating design of experiment to assess aRMAE of CARD11. h, Expression of wild-type and mutant CARD11 alleles in naive T cells stimulated with anti-CD3 and anti-CD28 for three days. Data shown are from bulk RNA-seq reads that mapped to the CARD11 c.223C>T locus. Two-sided binomial test, P = 0.50 (biallelic expression). **P < 0.01. Experiments with human samples in c,e,g were performed once.

To determine whether aRMAE contributes to the discrepancy within this family, we sorted six subtypes of leukocytes (CD3+ T cells, CD19+ B cells, CD56+ NK cells, classical monocytes, intermediate monocytes and non-classical monocytes) from each individual and assessed the expression of both alleles across the cell types (Fig. 6b). The expression of the wild-type and mutant STAT1 alleles was determined via digital PCR (dPCR) using custom TaqMan probes, which can distinguish the single nucleotide change at this locus (Fig. 6b).

As expected, cells from a healthy control individual expressed only the wild-type allele of STAT1 (AI = 0.99–1.0). The affected carrier (II-1) exhibited biallelic expression of the wild-type and mutant STAT1 alleles in CD19+ B cells (AI = 0.41) and classical monocytes (AI = 0.50), whereas their CD3+ T cells (AI = 0.90), CD56+ NK cells (AI = 0.99) and intermediate monocytes (AI = 1.0) selectively expressed the wild-type allele of STAT1. Notably, all the cell subsets assessed from the unaffected carrier (I-1) selectively expressed the wild-type allele of STAT1 (AI = 0.98–1.0) (Fig. 6c). Given that the affected carrier II-1 has biallelic expression in B cells and classical monocytes, this committed expression to the wild-type allele across leukocytes in I-1 may contribute to the observed discordant phenotype (Fig. 6c).

aRMAE in STAT1 GOF patient leukocytes

Autosomal dominant STAT1 GOF mutations can lead to enhanced STAT1 protein expression, phosphorylation and transcriptional activity41. The predominant symptom of this disease is a persistent infection of the skin, nails and mucosa by Candida albicans, referred to as chronic mucocutaneous candidiasis (CMC). CMC arises owing to the lack of IL-17-producing T cells in the circulation, which are vital for anti-fungal immunity. In addition to CMC, autosomal dominant STAT1 GOF disease presents with many other symptoms of immune dysregulation, such as bacterial and viral infections, mycobacteria, autoimmunity, atopy, bronchiectasis and cancer42,43.

To determine the effect of STAT1 aRMAE on autosomal dominant STAT1 GOF disease, we assessed STAT1 expression across three mutations—c.821G>A (p.R274Q), c.1162A>G (p.K388E) and c.1154C>T (p.T385M)—which are known to cause STAT1 GOF disease. For each mutation, a single patient with the mutation was examined and found to have varying presence of symptoms associated with autosomal dominant STAT1 GOF disease (Fig. 6d). P1 (carrying p.R274Q) presented with CMC and bacterial infections. P2 (carrying p.K388E) presented with CMC, bacterial infections and autoimmunity (thyroid dysfunction and other autoimmune symptoms). P3 (carrying p.T385M) presented with CMC, bacterial infections and autoimmunity, in addition to viral infections, atopy, lymphangioma and bronchiectasis (Supplementary Table 4).

Custom TaqMan probes were designed to distinguish between the wild-type and mutant alleles of each of the GOF mutations. CD3+ T cells and monocytes (classical, intermediate and non-classical) were sorted from patient PBMCs, and the expression of the wild-type and mutant STAT1 alleles was quantified via dPCR (Fig. 6d). Cells assessed from P1 (p.R274Q) selectively expressed the wild-type allele (AI = 0.99), whereas cells from P2 (p.K388E) and P3 (p.T385M) expressed both alleles or selectively expressed the mutant allele, except for CD3+ T cells from P3, which selectively expressed the wild-type allele (AI = 0.99) (Fig. 6e). Although preliminary, these data suggest that the expressivity of disease in individuals with STAT1 GOF may be affected by the extent of selective expression of the mutant allele.

CARD11 aRMAE and disease expressivity

CARD11 encodes a member of the membrane-associated guanylate kinase (MAGUK) family of proteins, which forms a complex with BCL10 and MALT1 to facilitate IKK and NF-κB activation during lymphocyte activation. Dominant-negative, hypomorphic mutations in CARD11 can lead to CARD11-associated atopy with dominant interference of NF-κB signalling44,45 (CADINS). Signs and symptoms of the disease vary substantially even within affected families, and can include atopic disease (asthma, allergy, atopic dermatitis and increased IgE), viral skin infection, respiratory infection, autoimmunity, colitis and psoriasis44,45. CARD11 aRMAE was not assessable in the aforementioned T cell clones (Fig. 2g), owing to the absence of het-exonic SNPs.

We assessed the effect of CARD11 aRMAE on disease expressivity in a family with three carriers of the CARD11 c.223C>T; p.(R75W) mutation (Supplementary Table 4). I-1 was an adult with minimal symptoms over his lifetime, limited to mild psoriasis. His son (II-1) had moderate symptoms, including respiratory tract infection and psoriasis, and the proband II-2 had severe symptoms of disease, which eventually necessitated haematopoietic stem cell transplantation (Fig. 6f). Previous work has highlighted the deleterious effect of CARD11 dominant-negative mutations on T cell activation and proliferation, and naive T cell differentiation7.

To determine whether CARD11 aRMAE might influence the severity of symptoms among the family members, we sorted naive CD4+ T cells from their PBMCs, activated them for three days with anti-CD3 and anti-CD28 in the presence of IL-2 and quantified the expression of each CARD11 allele using bulk RNA-seq (Fig. 6g). Although the expression of CARD11 was biallelic, it skewed towards the mutant allele (AI = 0.41) in the severely affected family member (II-2) and skewed towards the wild-type allele (AI = 0.58) in the moderately affected family member (II-1). However, there was significant selective expression of the wild-type allele (AI = 0.72) in the minimally affected family member (I-1) (Fig. 6h). This finding indicates that in this family, biased allele expression at the CARD11 locus also correlates with disease severity.

Discussion

Here we provide evidence that the penetrance and expressivity of a monogenic disease can be driven by aRMAE or biased allele expression. This defines a mechanism of phenotypic variability. Biallelic expression has been considered the rule for most autosomal genes and thus, when variable expressivity or incomplete penetrance is seen in Mendelian diseases, it has typically been assumed that these result from differences in environmental exposures, modifier loci and other processes that are extrinsic to the pathogenic allele. Using cells from donors carrying a PLCG2 mutation, we show that cells can commit to gene expression from a single allele and that exclusive expression of the mutant allele in B cells corresponds to failure to activate and to a focal clinical immunophenotype. We also show for family members carrying a JAK1 GOF mutation or a STAT1 LOF mutation that selective expression of the wild-type allele among leukocytes can determine the presence or absence of disease. In the carriers of STAT1 GOF and CARD11 LOF mutations, we demonstrate that severity of disease is linked to expression levels of the mutant allele.

We also study aRMAE in ex vivo clonally expanded human primary T cells using bulk RNA-seq. Previous studies assessed aRMAE using bulk RNA-seq of mouse cell lines25,46-49, human fibroblasts50 and immortalized lymphoblast lines3 or single-cell analysis of T cells51, uncovering up to 5% of genes that display aRMAE. In line with previous studies, our data suggest that 5% of all genes and about 4% of IEI genes show preferential monoallelic expression. We demonstrate aRMAE for a number of IEI genes associated with autosomal dominant inherited conditions, hinting at the selective potential of wild-type versus mutant allele expression for specific cellular outcomes that ultimately result in the presence or absence of disease phenotypes.

In the context of aRMAE, it has been hypothesized that allelic selection in heterozygotes may be influenced by various gene-regulatory processes, including DNA methylation6,26,32 and chromatin modifications5,27,33. Perturbing the enzymes involved in the regulation of H3K4me3 (JARID1B), H3K27me3 (JMJD3) and DNA methylation (DNMT1) in T cell clones using siRNAs allowed us to highlight the role of these epigenetic marks in aRMAE. We show that allelic commitment of aRMAE genes can be altered when JMJD3 or DNMT1 levels are reduced, whereas altering JARID1B expression levels led to no changes in allele commitment at the examined loci. Although these are only some of the many biochemical changes to DNA and nucleosomes that are known to affect transcriptional activity, our results, although preliminary, suggest that some of the same silencing mechanisms used to establish facultative heterochromatin during lineage differentiation may be important for establishing and maintaining stable monoallelic gene expression. Additional work is needed to assess the regulatory dynamics of aRMAE and to determine how the observed changes contribute to the regulatory mechanisms of aRMAE.

Our work expands the understanding of incomplete penetrance and phenotypic variability in genetic disease. Although the term ‘epigenetic’ was historically introduced to refer to all heritable mechanisms acting above the level of gDNA, we propose introducing finer granularity through the use of the term ‘transcriptotype’ to bridge the gap between genotype and phenotype. Dominantly inherited disorders are understood to manifest with gDNA allele frequencies of around 50%, but a transcriptotype of reduced mutant allele expression may be considered consistent with the expectation of clinically minimal or asymptomatic disease, as seen in the JAK1 GOF family and the STAT1 dominant-negative family. aRMAE may also result in a predominantly mutant transcriptotype in certain cell lineages, thereby presenting in transcriptional homozygosity, consistent with the expectation of fully symptomatic disease, as in the case of our PLCG2 family.

In summary, we have shown that IEI genes can be expressed from a single allele in a temporally stable manner, reflecting discordant clinical outcomes even among closely related individuals who share the same disease-causing mutation. Thus, genetically driven disease must be considered at the level of the genome, as well as at the level of the transcriptome, specifically considering expression patterns of wild-type and mutant alleles across relevant cell types. Our findings suggest that many other families with individuals who are variably affected by autosomal dominant inherited IEIs could benefit from the evaluation of allelic imbalance in bulk and single cells, as aRMAE may help to explain at least some of the observed incomplete penetrance and variable expressivity. However, to document true, clonally stable aRMAE in these families, it will be necessary to derive clones from the leukocytes of family members and assess their allele expression patterns. In addition, scRNA-seq with sufficient coverage at the mutant loci would enable us to further probe the mechanisms that establish, select for and maintain aRMAE. The aRMAE genes identified in this study provide a foundation for further analyses of incomplete penetrance and variable expressivity associated with genes that cause IEIs, other monogenic disorders, and—potentially—more complex, polygenic disease landscapes in which the ‘transcriptotype’ might also be key.

Methods

Human samples

Samples from healthy human donors were acquired from the New York Blood Center (NYBC) for exome sequencing from PBMCs and single T cell isolation and expansion. PBMCs from family members with mutations in PLCG2, JAK1, STAT1, and CARD11 were genotyped using WES or with a targeted genotyping panel for IEI genes (Invitae). Informed consent was obtained for use of patient samples in accordance with the human subjects protocol approved by the Mount Sinai Health System IRB (study 16-01286). Experiments were conducted in accordance with regulations as approved by the IRB.

WES sample preparation, raw data acquisition and processing, variant calling and filtration

Healthy donor PBMCs were thawed quickly in a 37 °C water bath and rested overnight in RPMI medium supplemented with 10% fetal bovine serum, 1% GlutaMax, and 1% penicillin–streptomycin (complete RPMI). gDNA was isolated using the QIAamp DNA Blood Mini Kit per the manufacturer’s instructions (Qiagen). Exon capture was performed using the SureSelect Human All Exon V8 kit (Agilent) and paired-end sequencing was performed to average 50× depth of coverage with average 150 bp read lengths (Illumina). FASTQ files were aligned to the GRCh38 reference genome, and variants were called using GATK-DRAGEN software provided by Illumina (v4.2.4). Variants were then filtered for location, confidence, zygosity, and predicted deleterious effect (QCI Interpret, Qiagen). For our initial aRMAE analysis, we considered only heterozygous variants located within 20 bp of an exon that did not confer an amino acid change (‘synonymous’) (Fig. 2). Missense and nonsense variants were subsequently included in the set of variants considered for secondary analysis of aRMAE (Extended Data Fig. 5).

Clonal T cell isolation by single-cell sorting of human CD4+ and CD8+ T cells

Cryopreserved PBMCs were thawed and rested overnight in complete RPMI medium (RPMI complemented with 10% FBS). In preparation for fluorescence-activated cell sorting (FACS), cells were washed in PBS and then stained with Live/Dead Blue (Invitrogen) for 30 min on ice. Cells were again washed in PBS and then stained for T cell markers CD3-APC, CD4-PE, and CD8-FITC (BioLegend) for 30 min on ice in PBS with 0.1% BSA (FACS Buffer). Cells were washed and resuspended in FACS Buffer prior to sorting. Stained donor PBMCs were single-cell sorted for Live+CD3+CD4+ or Live+CD3+CD8+ into 96-well v-bottom plates (Invitrogen) using the Cytek Aurora CS. Each well of the 96-well plate contained 10,000 irradiated, CD3-depleted autologous PBMC feeder cells in 50 μl of complete RPMI supplemented with 10 IU ml−1 IL-2 (BioLegend) and 1 μg ml−1 phytohemagglutinin (PHA) (Sigma Aldrich). Feeder cells were CD3-depleted using REAlease CD3 MicroBead Kit (Miltenyi) and irradiated with 5,000 rad.

Clonal expansion of sorted CD4+ and CD8+ T cells

For the initial two weeks after single-cell sorting of T cell clones, 25 μl of fresh complete RPMI supplemented with 25 IU ml−1 IL-2 and 2.5 μg ml−1 PHA was added every 3–4 days to each well. An additional 10,000 irradiated, CD3-depleted autologous feeder cells were added each week and resuspended in the IL-2- and PHA-containing medium. After 2 weeks, plates were centrifuged at 300g for 5 min to pellet the T cell clones, and 100 μl of medium was removed from each well. The IL-2 concentration was then increased to 100 IU ml−1, PHA to 5 μg ml−1, and the feeder layer was increased to 100,000 cells for continued clonal expansion. Fresh medium with IL-2 was then added every 3–4 days, and feeder cells and PHA were added every 2 weeks. After eight weeks of clonal expansion, T cell clones (~100,000 cells per well) were either separated from the feeder layer by FACS for mRNA isolation or continued to be expanded in RPMI containing 100 IU ml−1 IL-2 and 5 μg ml−1 PHA.

Separating T cell clones from CD3-depleted feeder cells

Cells were stained and prepared for FACS as stated above, using Live/Dead Blue, CD3-APC, CD4-PE, and CD8-FITC to remove CD3-depleted feeder cells from the clonally expanded T cells. Cells from a well containing a clone expanded from a CD4+ T cell were sorted on the basis of Live/Dead, CD3-APC and CD4-PE, and cells that were expanded from a CD8+ T cell clone were sorted using Live/Dead, CD3-APC and CD8-FITC. Approximately 100,000 cells from each clone were sorted into 100 μl of RLT lysis buffer in a 15 ml tube (Eppendorf).

Bulk RNA-seq of expanded T cell clones

Total RNA from T cell clones was isolated from sorted CD3+CD4+ or CD3+CD8+ cells using the RNeasy Mini Kit (Qiagen). Cells in RLT buffer were vortexed for 1 min to homogenize the lysate prior to adding 100 μl of 80% ethanol. mRNA was then isolated following the manufacturer’s instructions, and mRNA concentration was quantified using a NanoDrop 2000/2000c Spectrophotometer (ThermoFisher). Bulk RNA-seq libraries were prepared using NEBNext Ultra II RNA Library Prep (Illumina). Libraries (that is, clones) were sequenced to an average depth of 30,000,000 reads using an Illumina Novaseq instrument or equivalent, with >90% of bases being >Q30.

Read alignment and allele calling in expanded T cell clones

Bulk RNA-seq reads were mapped to the hg38 (GRCh38 release 100) human reference genome using STAR v2.7.5b. STAR was run with the parameter --waspOutputMode set to SAMtag to annotate which reads passed WASP filtering. Reads that passed WASP filtering (indicated by the vW:i:1 tag) were retained, and all reads that did not pass WASP filtering (for example, multi-mapping reads) were removed prior to downstream analysis. Duplicate reads were removed using the Picard (v3.1.1) MarkDuplicates function (parameter REMOVE_DUPLICATES set to true). ASEReadCounter (GATK v4.3.0.0) was then used to calculate read counts per allele for downstream allele-specific expression analysis. Input variant call format (VCF) files used to generate allele count data were derived from filtered donor VCF files that only contained exonic, heterozygous SNPs representing synonymous mutations. Secondary analyses included missense and nonsense variants alongside synonymous variants for allele-specific read count calculations (Extended Data Fig. 5). Gene regions were then defined from exon bed files created from the GRCh38 gtf file (Homo_sapiens.GRCh38.100.gtf). Summary tables of allele-specific read counts were generated at the level of each SNP using a custom script created by the Gimelbrant Lab (counts_to_snp.R).
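The WASP-based read selection described above can be illustrated with a minimal Python sketch that operates on plain-text SAM records. This is not the pipeline used in the study; the function names passes_wasp and filter_wasp are hypothetical, and the sketch assumes a strict reading of the text in which only records carrying the vW:i:1 tag are kept (so records with no vW tag at all are also dropped).

```python
from typing import Iterable, Iterator


def passes_wasp(sam_line: str) -> bool:
    """Return True if a SAM alignment record carries the vW:i:1 tag."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional tags begin after the 11 mandatory SAM columns.
    return "vW:i:1" in fields[11:]


def filter_wasp(sam_lines: Iterable[str]) -> Iterator[str]:
    """Yield header lines unchanged and only WASP-passing alignments."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # SAM header lines are always kept
        elif passes_wasp(line):
            yield line
```

In practice the same selection would be applied to BAM files with a dedicated tool before duplicate removal and allele counting; the sketch only makes the tag logic explicit.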

Analysis of aRMAE in expanded T cell clones

We evaluated aRMAE using the SNP-level summary table of allele-specific read counts. We initially considered as assessable for aRMAE testing any genes with at least one heterozygous exonic (het-exonic) SNP covered at depth ≥30 reads across combined reference and alternate alleles. If a gene contained multiple het-exonic SNPs, we only considered the SNP with the highest coverage. Only synonymous variants were considered for initial analysis. Furthermore, we only considered T cell clones that had ≥500 assessable genes per clone. We then calculated allelic imbalance (AI) fractions for all SNPs across all T cell clones that passed our filters. AI refers to the deviation of a gene from expected biallelic expression (50% Ref, 50% Alt). We defined AI as the proportion of reference alleles over the total number of alleles observed at a locus; that is, AI = number of Ref counts/(number of Ref counts + number of Alt counts). Committed allele-specific expression (≥80% of counts from Ref, AI ≥ 0.80, or ≥80% of counts from Alt, AI ≤ 0.20) of a non-imprinted autosomal gene strongly indicates the presence of aRMAE. To test whether a SNP significantly deviated from the null expectation under a biallelic expression model (AI = 0.5), we performed a binomial test per SNP using the PerformBinTestAIAnalysisForConditionNPoint_knownCC function from the Qllelic52 package with QCC = 1. FDRs were calculated from the binomial test P values using the p.adjust function in R (method = “BH”) to account for multiple testing (Benjamini–Hochberg method). We defined genes with significant aRMAE as those with AI ≥0.80 or ≤0.20 and FDR < 0.05.
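The calling rule above can be sketched in Python as follows. This is an illustrative reimplementation, not the Qllelic code: a plain exact two-sided binomial test against 0.5 stands in for the Qllelic function (an assumption about what QCC = 1 amounts to), the Benjamini–Hochberg step follows the standard step-up definition, and all function names, as well as the choice to adjust P values across the genes passed in one call, are hypothetical.

```python
import math


def allelic_imbalance(ref: int, alt: int) -> float:
    """AI = Ref counts / (Ref counts + Alt counts)."""
    return ref / (ref + alt)


def binom_two_sided_half(ref: int, alt: int) -> float:
    """Exact two-sided binomial P value against a null of 0.5."""
    n, k = ref + alt, min(ref, alt)

    def log_pmf(i: int) -> float:
        # Log-space pmf avoids overflow at high read depth.
        return (math.lgamma(n + 1) - math.lgamma(i + 1)
                - math.lgamma(n - i + 1) + n * math.log(0.5))

    # The null is symmetric, so the two-sided value is twice the smaller tail.
    tail = sum(math.exp(log_pmf(i)) for i in range(k + 1))
    return min(1.0, 2.0 * tail)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_armae(snps, min_depth=30, ai_hi=0.80, ai_lo=0.20, fdr_cut=0.05):
    """snps: iterable of (gene, ref_count, alt_count) for one clone.

    Returns {gene: (AI, FDR, is_aRMAE)} using the highest-coverage SNP per gene.
    """
    best = {}
    for gene, ref, alt in snps:
        if ref + alt < min_depth:
            continue  # below the coverage filter
        if gene not in best or ref + alt > sum(best[gene]):
            best[gene] = (ref, alt)
    genes = sorted(best)
    fdrs = benjamini_hochberg([binom_two_sided_half(*best[g]) for g in genes])
    calls = {}
    for gene, fdr in zip(genes, fdrs):
        ai = allelic_imbalance(*best[gene])
        calls[gene] = (ai, fdr, (ai >= ai_hi or ai <= ai_lo) and fdr < fdr_cut)
    return calls
```

Lowering min_depth to 10 gives the relaxed coverage filter used for the secondary analysis of IEI genes. The clone-level filter (≥500 assessable genes) would be applied to the number of genes returned for each clone.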

Assessing gene expression from imprinted loci

Imprinted genes provide an internal control for assessing allelic imbalance estimations in our expanded T cell clones. Therefore, we used our system to examine the expression patterns of known imprinted genes, showing the allele count data for two examples of known imprinted genes in each donor (Fig. 2d). Using the approach described above, we illustrate that if a gene is imprinted within a donor, all clones display significant (FDR < 0.05) aRMAE and are committed to the same allele across clones (Fig. 2d).

Assessing gene expression from the X chromosome

Genes on the X chromosome provide a control for allele-specific expression in T cell clones from female donors. We identified genes on the X chromosome that are expressed in our T cell clones and classified their inactivation status as either X-chromosome inactive or X-chromosome escape on the basis of published datasets31. The allelic imbalance of inactive and escape genes was plotted, with escape genes mostly having an AI of ~0.5 and inactive genes mostly having an AI of ≤0.2 or ≥0.8 (Extended Data Fig. 6 and Supplementary Table 2).

Classifying T helper subsets

Using bulk RNA-seq data, we classified CD4 T cell clones as T helper 1 (TH1), T helper 2 (TH2), regulatory T cell (Treg) or T helper 17 (TH17) populations across multiple donors. T cell polarization was determined by assessing marker gene expression using the following gene sets: TH1: TNF, LTA, CD38, IFNG, GNLY, IL12RB1, PRF1, STAT4, TBX21; TH2: IL5, IL4, IL13, GATA3, IL10RB, IL10RA; Treg: ICOS, TGFBR1, CTLA4, TGFB3, TGFBR2, IL2RA, AHR, FOXP3; and TH17: IL21, IL22, IL17RA, IL23A, IL21R. Data shown in Extended Data Fig. 7 are expressed as z-scores of the batch-corrected log(counts per million (CPM)) values, as calculated by voom from the R package limma53. Data were plotted using GraphPad Prism 10.0.0.
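As a rough illustration of the transformation described above, the following Python sketch computes a voom-style log2 CPM value and per-gene z-scores across clones. It is not the limma implementation: batch correction is omitted, the use of the sample standard deviation is an assumption, and the function names are hypothetical. Only the gene sets are taken from the text.

```python
import math
import statistics

# Marker gene sets as listed in the text.
GENE_SETS = {
    "TH1": ["TNF", "LTA", "CD38", "IFNG", "GNLY", "IL12RB1", "PRF1", "STAT4", "TBX21"],
    "TH2": ["IL5", "IL4", "IL13", "GATA3", "IL10RB", "IL10RA"],
    "Treg": ["ICOS", "TGFBR1", "CTLA4", "TGFB3", "TGFBR2", "IL2RA", "AHR", "FOXP3"],
    "TH17": ["IL21", "IL22", "IL17RA", "IL23A", "IL21R"],
}


def log_cpm(count: float, lib_size: float) -> float:
    """voom-style log2 CPM with offsets of 0.5 (count) and 1 (library size)."""
    return math.log2((count + 0.5) / (lib_size + 1.0) * 1e6)


def z_scores(values):
    """Z-score one gene's values across clones (sample standard deviation)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return [0.0 for _ in values]  # no variation across clones
    return [(v - mean) / sd for v in values]
```

Each marker gene would be z-scored across clones separately, and the resulting rows grouped by gene set for display.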

Analysis of aRMAE in IEI genes with known incomplete penetrance

The 2022 IUIS Nosology has catalogued 485 known IEI genes; of these, 189 IEI genes were assessable given our genome-wide read count coverage cutoff (≥30 counts), while an additional 129 genes did not meet the read coverage filter. Previous allele-specific expression analysis work identified 10 reads as sufficient for assigning allele specificity35. To classify the aRMAE status of a subset of IEI genes linked to incompletely penetrant disease, we considered het-exonic SNPs with a minimum read count of 10, which expanded our search space for aRMAE, and calculated allelic imbalance for an additional set of lower-coverage genes. If a gene contained multiple het-exonic SNPs, we only considered the SNP with the highest coverage. With this method, we were able to conduct preliminary analysis of additional IEI genes for aRMAE.

Targeting epigenetic regulators with siRNAs and assessing changes in PLCG2 allelic imbalance

Healthy control T cell clones were transfected with siRNAs (10 μM) directed against JMJD3, JARID1B and DNMT1 (Qiagen) using the RNAiMAX reagent (Invitrogen) for 72 h. Cells were then collected; half of the cells from each T cell clone were used for mRNA isolation and cDNA synthesis for ddPCR of PLCG2 (see protocol in ‘Assessing aRMAE of PLCG2 in T cell clones via ddPCR’) and the other half were subjected to immunofluorescence staining. Changes in DNA methylation (5-methylcytosine), H3K27me3 and H3K4me3 were assessed in fixed, methanol-permeabilized T cells using primary antibodies against 5-methylcytosine (D3S2Z, Cell Signaling), trimethyl Histone H3 (Lys27) (C36B11, Cell Signaling) and trimethyl Histone H3 (Lys4) (C42D8, Cell Signaling). Secondary antibody goat anti-rabbit Alexa Fluor 488 (Invitrogen) and DAPI were used. Mean fluorescence intensity was acquired and analysed on a Celigo Imaging Cytometer (Nexcelom).

Assessing aRMAE of PLCG2 in T cell clones via ddPCR

T cell clones (n = 9) were expanded from PBMCs of two healthy donors as described above. Custom TaqMan SNP genotyping probes (ThermoFisher) were designed to differentiate between the reference (VIC) and alternate alleles (FAM) of rs1143688, a het-exonic SNP in PLCG2 shared by both donors. Synthetic reference and alternate oligonucleotides (each 99 bp in length) flanking the site of the SNP were designed (Sigma Aldrich) to validate the ability of the probe to discriminate the single nucleotide change. Control mixtures of oligonucleotides ranging from 100% Ref to 100% Alt (100/0, 75/25, 50/50, 25/75 and 0/100) were used to generate a standard curve for calibrating results obtained from the T cell clones. To quantify the expression of each allele from bulk PBMCs and clonal T cells, we used ddPCR (Bio-Rad). mRNA was isolated from bulk PBMCs and T cell clones, followed by cDNA synthesis (Applied Biosystems). ddPCR reactions were prepared in 20 μl volume, with TaqMan probes used at 40× concentration (0.5 μl), 10 μl of ddPCR Supermix added (Bio-Rad, 1863010), 5 μl of cDNA or blank template, and 4.5 μl H2O. Thirty-five microlitres of Droplet Generation Oil (Bio-Rad, 1863005) was added to the reaction mix, droplets were generated using the QX200 Droplet Generator (Bio-Rad) and the templates were amplified via PCR. The expression of each allele was quantified (ref-VIC, alt-FAM) using the QX200 Droplet Reader (Bio-Rad).
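The standard-curve calibration described above can be sketched in Python as follows. The text does not state how the curve was fitted, so ordinary least squares on the five control mixtures is an assumption here, and the function names fit_standard_curve and calibrate are hypothetical. The reaction volumes are taken from the text.

```python
# Expected Ref fractions of the control mixtures (100/0 ... 0/100 Ref/Alt).
EXPECTED_REF = [1.00, 0.75, 0.50, 0.25, 0.00]

# Per-reaction volumes in microlitres, as listed in the text.
REACTION_UL = {"probe_40x": 0.5, "supermix": 10.0, "template": 5.0, "water": 4.5}


def ref_fraction(vic_signal: float, fam_signal: float) -> float:
    """Observed Ref fraction: VIC (ref) over VIC + FAM (alt)."""
    return vic_signal / (vic_signal + fam_signal)


def fit_standard_curve(expected, measured):
    """Least-squares line measured = intercept + slope * expected."""
    n = len(expected)
    mx, my = sum(expected) / n, sum(measured) / n
    sxx = sum((x - mx) ** 2 for x in expected)
    sxy = sum((x - mx) * (y - my) for x, y in zip(expected, measured))
    slope = sxy / sxx
    return my - slope * mx, slope


def calibrate(measured_fraction: float, intercept: float, slope: float) -> float:
    """Invert the standard curve and clamp the result to [0, 1]."""
    return min(1.0, max(0.0, (measured_fraction - intercept) / slope))
```

For example, if the control mixtures read slightly compressed towards 50% (hypothetical values), calibrate maps a clone's measured Ref fraction back onto the expected scale before AI is interpreted.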

Validating the clonality of expanded T cells

aRMAE analysis depends on the clonal nature of the assessed cells. We used the MiXCR (MiLaboratories) pipeline to capture the TCR repertoire of the T cell clones from bulk RNA-seq data. From this, we determined the purity of the expanded T cell clones on the basis of uniquely expressed transcripts from TRBV genes. The RNA-seq reads from T cell clones were aligned against reference V, D, J and C segments, and the alignments were assembled into clonotypes. T cell clones with ≥80% expression from one (cloneType I) or two (cloneType II) TRBV genes were defined as monoclonal (Extended Data Fig. 3 and Supplementary Table 3) as up to 22% of single T cells can express multiple TCR-β chains29,30.
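The monoclonality rule can be sketched as a small classifier. This assumes that the per-clone TCR-β clonotype fractions have already been exported from the repertoire analysis as a list of numbers, and it reads "≥80% expression from two TRBV genes" as the sum of the two largest fractions; both the input layout and that reading are assumptions of this sketch, and the function name is hypothetical.

```python
def classify_clone(trb_fractions, threshold=0.80):
    """Label a T cell clone from its TCR-beta clonotype fractions.

    trb_fractions: fractions of TRB reads assigned to each clonotype (sum <= 1).
    Returns "cloneType I" if one chain reaches the threshold, "cloneType II" if
    the two largest chains together reach it, otherwise "not monoclonal".
    """
    top = sorted(trb_fractions, reverse=True)
    if top and top[0] >= threshold:
        return "cloneType I"
    if len(top) >= 2 and top[0] + top[1] >= threshold:
        return "cloneType II"
    return "not monoclonal"

# Hypothetical clones
labels = [
    classify_clone([0.95, 0.03, 0.02]),  # one dominant chain
    classify_clone([0.50, 0.45, 0.05]),  # two co-dominant chains
    classify_clone([0.40, 0.30, 0.30]),  # mixed population
]
```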

GSEA of aRMAE genes

GSEA was performed using GOrilla38 with default parameters, which relies on a hypergeometric model to test for enrichment considering two unranked gene lists. The union of all MAE genes (AI ≥ 0.8 or AI ≤ 0.2, FDR < 0.05) across clones was considered the ‘target set’, and the union of all genes detected across clones was considered the ‘background set’. The Benjamini–Hochberg adjusted FDR q-values calculated by GOrilla are reported in Extended Data Fig. 5g.
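For readers unfamiliar with the two-list test, the following is a minimal sketch of a hypergeometric enrichment P value for one annotation term, followed by Benjamini–Hochberg adjustment across terms. It is not GOrilla's code; the function names and all counts are hypothetical, and only the Python standard library is used.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k): chance of seeing at least k annotated genes in a target set.

    N: genes in the background set
    K: background genes carrying the annotation
    n: genes in the target set
    k: target genes carrying the annotation
    """
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

def benjamini_hochberg(pvalues):
    """Return BH-adjusted q-values in the same order as the input P values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues

# Hypothetical term: 40 of 4,000 background genes annotated, 8 of 200 target genes hit.
p_term = hypergeom_sf(k=8, N=4000, K=40, n=200)
q_all = benjamini_hochberg([p_term, 0.20, 0.65])
```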

Sorting Ca2+ Flux+ and Ca2+ Flux− cells from control and patients with disease caused by PLCG2 mutation

PBMCs were thawed from two healthy donors and two patients with PLCG2 mutations (III-1, IV-4) and rested overnight in complete RPMI (RPMI 1640, 10% FBS, 1% penicillin/streptomycin) at 37 °C. The next day, cells were washed twice and labelled with the ratiometric Ca2+ indicator Indo-1 AM (Invitrogen, I1223) in warm complete RPMI for 30 min at 37 °C at a concentration of 1 × 10^9 cells per ml. Cells were then washed with FACS buffer (PBS, 2% FBS) and stained with fluorescent-conjugated antibodies targeting CD3:AF700 (1:100) and CD19:APC-Cy7 (1:100), and with a live/dead dye (propidium iodide, 1:1,000). After staining, cells were washed at least twice with, and resuspended in, ice-cold Ringer solution. Cells were maintained on ice until 5 min before acquisition on the flow cytometer, at which point they were warmed to 37 °C. CD3−CD19+ cells were then acquired on the flow cytometer for 30 s to detect Ca2+ levels at the basal unstimulated state. Cells were then stimulated using a combination of 10 μg ml−1 anti-IgM F(ab′)2 (for B cells) and acquired on the flow cytometer. In addition, CD3−CD19+ cells that were Flux+ and Flux− were simultaneously sorted for 5 min into 15 ml tubes (Falcon) for downstream RNA isolation.

Quantifying PLCG2wt and PLCG2ΔEx19 allelic expression in Flux+ and Flux− patient cells

RNA was isolated from sorted patient and healthy donor CD3−CD19+ cells using the SuperScript IV Single Cell/Low Input cDNA PreAmp Kit (ThermoFisher) with a minimum input of 100 cells per cell type per manufacturer’s instructions. Custom TaqMan gene expression probes (ThermoFisher) were designed to distinguish between wild-type and mutant PLCG2 alleles. Synthetic wild-type and mutant oligonucleotides were also designed. The genotyping analysis module of RT–qPCR was used to determine allele-specific expression in the flow-sorted B cells (Flux+ and Flux−) from healthy donors and patients. The relative expression of mutant vs wild-type allele in each sample was computed using normalized end-point fluorescence (dRN). The abundance of each allele was normalized to a standard curve generated with set percentages (100/0, 75/25, 50/50, 25/75, 0/100) of synthetic wild-type and mutant oligonucleotides (Fig. 4e–g).
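One way to picture the standard-curve normalization is a straight-line calibration: the five oligonucleotide mixtures give known wild-type percentages, a line is fitted to the measured read-outs, and sample read-outs are mapped back through that line. The text does not state how the curve was fitted, so the linear least-squares fit, the clipping to 0–100% and the example measurements below are all assumptions of this sketch.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def calibrate(measured, slope, intercept):
    """Map a measured wild-type percentage back onto the known scale (0-100)."""
    value = (measured - intercept) / slope
    return min(100.0, max(0.0, value))

# Known wild-type percentages of the synthetic mixtures (100/0 ... 0/100)
known_wt = [100.0, 75.0, 50.0, 25.0, 0.0]
# Hypothetical read-outs for those mixtures (slightly compressed scale)
measured_wt = [96.0, 73.0, 50.0, 27.0, 4.0]

slope, intercept = fit_line(known_wt, measured_wt)
sample_wt = calibrate(27.0, slope, intercept)  # calibrated wild-type % for one sample
```

With these invented read-outs the fit has a slope below 1, so a sample measured at 27% wild type maps back to a somewhat lower calibrated value.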

Assessing aRMAE of JAK1 in T cell clones via ddPCR

T cell clones (n = 12) were expanded from PBMCs of two healthy donors (HD3 and HD5). Custom TaqMan SNP genotyping probes (ThermoFisher) were designed to differentiate between the reference (ref-VIC) and alternate alleles (alt-FAM) of rs3737139, a het-exonic SNP in JAK1. As described above for PLCG2, ddPCR was used to quantify the expression of reference and alternate JAK1 alleles. Synthetic oligonucleotides for the reference and alternate alleles of rs3737139 were used to validate the specificity of the TaqMan probes for distinguishing the two alleles. This system enabled the identification of JAK1 aRMAE in the 12 assessed T cell clones (Fig. 5a).

Two T cell clones from HD3 exhibited monoallelic expression of JAK1 in our bulk RNA-seq T cell clone data. To technically validate this finding, we used ddPCR to assess monoallelic expression of JAK1 in these same clones using the allele-specific probes described above for rs3737139. RNA-seq is limited by read count coverage at a locus, but ddPCR enables a targeted assessment of JAK1 reference and alternate allele expression (Extended Data Fig. 10a).

Sorting leukocytes from JAK1 GOF patients

We were referred one family with three members heterozygous for the JAK1 c.2099G>A (p.S700N) mutation for detailed immunophenotyping. The index patient (II-2) was characterized via CyTOF and shown to have bona fide JAK1 GOF disease37. PBMCs from the healthy non-carrier (II-1), healthy carrier (III-1) and affected carrier (II-2) were thawed and rested overnight in complete RPMI, washed in PBS and stained with Live-Dead:Blue (1:1,000), then washed and stained in flow buffer (PBS and 0.05% BSA) with CD3:APC (1:100), CD19:PE-Cy7 (1:100), CD56:BV421 (1:100), HLA-DR:BV711 (1:100) and CD14:FITC (1:100). T cells, B cells, monocytes and NK cells from each family member were sorted into separate 15 ml tubes (Eppendorf).

Expression of wild-type and mutant alleles in bulk PBMCs and sorted cell types from JAK1 GOF patients

To assess the effect of JAK1 allelic expression on disease penetrance, RNA was isolated from each patient’s sorted cells, and cDNA was synthesized as described above. Custom TaqMan SNP genotyping probes (ThermoFisher) were designed to distinguish between the wild-type and mutant alleles. The expression of each allele was quantified from the sorted cells using ddPCR. The percentages of each allele expressed in bulk patient PBMCs (Fig. 5d), as well as cell subtypes (Fig. 5e–h), were determined.

Sorting leukocytes from patients with a mutation in STAT1

We obtained PBMCs from two individuals with the c.1976T>C (p.I659T) mutation and from one individual each with the p.K388E, p.T385M and p.R274Q mutations in STAT1 from our collaborators J.-L.C., J.B., S.O., A.P. and S.B.-D. PBMCs from all samples were thawed and rested overnight in complete RPMI, washed in PBS and stained with Live-Dead:Blue (1:1,000), then washed and stained in flow buffer (PBS and 0.05% BSA) with CD3-APC (1:100), CD19-PE-Cy7 (1:100), CD56-BV421 (1:100), HLA-DR-BV711 (1:100), CD14-FITC (1:100) and CD16-PE (1:100). For the two carriers of the c.1976T>C (p.I659T) mutation in STAT1, T cells, B cells, NK cells, and classical, non-classical and intermediate monocytes from each family member were sorted into separate 15 ml tubes (Eppendorf). For the three individuals, each a carrier of a different STAT1 GOF mutation (p.K388E, p.T385M and p.R274Q), PBMCs were sorted into CD3+ T cells and classical, non-classical and intermediate monocytes.

Assessing aRMAE of STAT1 in patient immune cells via dPCR

Custom TaqMan SNP genotyping probes (ThermoFisher) were designed to differentiate between the reference (ref-VIC) and alternate alleles (alt-FAM) of mutations in STAT1 (p.I659T, p.K388E, p.T385M and p.R274Q). Patient cDNA was synthesized using the SuperScript IV Single Cell/Low Input cDNA PreAmp Kit (ThermoFisher). dPCR (ThermoFisher, Absolute Q) was used to quantify the expression of STAT1 mutant and wild-type alleles within the sorted cell types. Allelic imbalance of STAT1 [WT/(WT + mut)] was determined by the proportion of positive dPCRs for the VIC (WT) and FAM (mut) fluorescent channels (Fig. 6c,f).

Assessing aRMAE of CARD11 in patient naive T cells via bulk RNA-seq

For RNA-seq experiments of patients with CARD11 c.223C>T (p.R75W) and healthy controls, PBMCs were thawed and rested in complete RPMI 1640 medium overnight in a 37 °C CO2 incubator. Naive CD4+ T cells were isolated by sorting (FACSAria) for CD4+CD45RO− cells. A total of 1 × 10^5 cells were cultured in a 96-well flat-bottom plate in RPMI 1640 medium with 2.0 mM glutamine, in the presence of MACSiBead particles loaded with biotinylated antibodies against human CD2, CD3 and CD28 (Miltenyi Biotec) for 3 days, with 25 ng ml−1 recombinant human IL-2 (Peprotech). After 3 days of culture, 3 × 10^5 cells were collected and pelleted, and RNA was isolated using a miRNeasy kit (Qiagen) according to the manufacturer’s instructions, including on-column DNase I treatment. After elution, RNA was quantified with a Bioanalyzer assay using an RNA Pico chip (Agilent). For bulk RNA-seq, the Clontech Ultra Low v4 kit (Takara Bio) was used for cDNA synthesis and the Nextera XT kit (Illumina) was used for library preparation. Libraries were sequenced using Element AVITI at JP Sulzberger Columbia Genome Center with paired-end reads at 75 bp length targeting 40 million reads per sample. Reads were aligned to the human reference genome (GRCh38 with Gencode v45 annotations) using STAR, and mpileup was used to identify the reads that align to the CARD11 c.223C>T (p.R75W) locus.
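To make the final step concrete, the sketch below tallies bases at one position from the read-bases column of a samtools mpileup line, which is one simple way to obtain per-allele read counts at a variant site. The pileup string and the reference and alternate bases are invented for illustration; which genomic bases correspond to c.223C>T depends on the strand of the gene and is not asserted here. The function name is hypothetical.

```python
def count_pileup_bases(bases: str, ref: str) -> dict:
    """Count bases in the read-bases column of one samtools mpileup line.

    '.' and ',' are matches to the reference base; letters are mismatches;
    '^' plus one character marks a read start; '$' marks a read end;
    '+N...' / '-N...' describe indels and are skipped.
    """
    ref = ref.upper()
    counts = {}
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":            # read start: skip the mapping-quality character too
            i += 2
            continue
        if ch in "+-":           # indel: sign, length digits, then that many bases
            j = i + 1
            while j < len(bases) and bases[j].isdigit():
                j += 1
            i = j + int(bases[i + 1:j])
            continue
        if ch in ".,":
            counts[ref] = counts.get(ref, 0) + 1
        elif ch.upper() in "ACGTN":
            counts[ch.upper()] = counts.get(ch.upper(), 0) + 1
        # '$', '*', '<' and '>' carry no base call and are ignored
        i += 1
    return counts

# Invented pileup column at a site with reference base C and alternate base T
counts = count_pileup_bases("..,T,t^F.$", ref="C")
alt_fraction = counts.get("T", 0) / (counts.get("C", 0) + counts.get("T", 0))
```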

Extended Data

Extended Data Fig. 1 ∣. aRMAE Workflow.

a, Analysis pipeline to assess aRMAE using paired donor WES with T cell clone RNA-seq. b–e, Assessable genes in the T cell system with synonymous and missense variants combined, genome wide (b; 4,366) and IEI (d; 189), or with missense-only variants, genome wide (c; 2,155) and IEI (e; 117).

Extended Data Fig. 2 ∣. Allelic imbalance across T cell clones with ≥1 heterozygous SNPs in gene exons.

Correlation of gene allelic imbalance determined by Pearson’s coefficient (r).

Extended Data Fig. 3 ∣. T cell clonotypes.

a, Relative proportion of the expressed (single or dual) TCR beta chains in T cell clones. Data were generated from MiXCR pipeline analysis; cloneFraction = 0.80 indicates the threshold for expanded T cells originating from a single clone. b, T cell clones grouped into cloneTypes based on the expression of one (cloneType I; n = 35) or two (cloneType II; n = 12) functional TCR beta chains. c, CloneFraction of TCR beta chains from cloneType = I and cloneType = II T cell clones.

Extended Data Fig. 4 ∣. Allelic imbalances in T cells.

Allelic imbalance distributions for the T cell clones with ≥500 assessable genes at 30+ read count at the location of a het-exonic SNP.

Extended Data Fig. 5 ∣. aRMAE at increased thresholds of Allelic Imbalance.

a,b, Allelic Imbalance of genes across T cell clones with allele-specific analysis conducted using healthy donor vcfs containing only missense het-SNPs (a) or missense and synonymous het-SNPs (b). c,d, Percentage of genes exhibiting biallelic (BAE) and monoallelic (aRMAE) expression at increasing thresholds of Allelic Imbalance (AI). In c, the thresholds were set at AI ≥ 0.9 and AI ≤ 0.1, while in d, the thresholds were AI ≥ 0.95 and AI ≤ 0.05. e,f, Percentage of genes that are aRMAE or BAE at AI ≥ 0.8 and AI ≤ 0.2 for missense variants or missense and synonymous variants. g, Gene set enrichment analysis of aRMAE genes using GOrilla. The FDR adjusted q-values comparing target-set (aRMAE genes) and background set (all genes) are plotted.

Extended Data Fig. 6 ∣. Expression of X-chromosome genes and aRMAE gene correlation.

a, Allelic imbalance (WT/(WT + MUT)) of genes on the X-chromosome, separated by status of inactive or escape expression pattern. Each dot represents an X-chromosome gene detected in a clone and the blue horizontal line illustrates the mean AI of the escape group (n = 81, mean = 0.5172294) or the inactive group (n = 21, mean = 0.6471123). One-sided F-test determined significance of variance between the escape and inactive groups, p = 4.582842e-18. b, X-chromosome genes with known XCI-inactive expression pattern and their allele expression. c, X-chromosome genes with XCI-escape expression pattern and their allele expression. d, Pearson correlation coefficient of Allelic Imbalance for each aRMAE gene in relation to the number of genes compared across clones. Error bands indicate 95% CI. e, Range of Pearson correlation (R) across T cell clones (mean = 0.69). f, Two-sided Pearson correlation of aRMAE genes from a single T cell clone separated in culture for 8 weeks with separate libraries prepared and sequenced. Error bands indicate 95% CI.

Extended Data Fig. 7 ∣. T-helper cell subsets.

a, CD4+ T cell clones grouped into different subclasses of T-helper cells based on expression level of defining genes. Data are expressed as z-scores of the batch-corrected logCPM values, as calculated by voom from the R package limma. b, Percent aRMAE genes across T cell clones defined as Th1 (n = 2), Th2 (n = 3), Treg (n = 3) and Th17 (n = 2) cell subsets. c, Percent aRMAE genes across HD1 (n = 2), HD2 (n = 3), HD3 (n = 3) and HD4 (n = 2). d–g, Percent aRMAE genes across different CD4+ T cell subsets in HD1 (d; n = 2), HD2 (e; n = 3), HD3 (f; n = 3) and HD4 (g; n = 2). Statistical significance assessed using a two-tailed Student’s t-test, ns indicates not significant, error bars indicate mean ± sd.

Extended Data Fig. 8 ∣. siRNA targeting JARID1B, JMJD3 and DNMT1.

a–c, Allelic imbalance shifts of PLCG2 in clones transfected with scramble siRNA and with siRNA targeting JARID1B, JMJD3 or DNMT1 assessed using ddPCR. d, Expression levels of JARID1B (n = 4, p value = 0.0042), JMJD3 (n = 4, p < 0.0001) and DNMT1 (n = 6, p = 0.1084) quantified using ddPCR (JARID1B, JMJD3) or RT-qPCR (DNMT1) after siRNA knockdown. Significance assessed by Dunnett’s multiple comparisons test. e–g, Quantification of H3K4me3 (p value = 0.0015) (e), H3K27me3 (f) and DNA methylation (p value = 0.0409) (g) marks in control and siRNA transfected T cell clones. Significance in e–g assessed using two-sided Student’s t-test. Results in a–g are representative of two experiments.

Extended Data Fig. 9 ∣. PLCG2 Ca2+ Flux and aRMAE.

a, Stability of allele commitment assessed in a T cell clone from HD6 at Weeks 8 and 12 of T cell culture. PLCG2 allele-specific expression determined by ddPCR using TaqMan probes targeting the reference (ref) and alternate (alt) alleles of a PLCG2 het-exonic SNP (rs1143688). Results are representative of two experiments. b, Symptoms observed in patients III-11 (low Ig levels) and IV-4 (normal Ig levels) with discordant antibody deficiency. c, Proportion of CD3−CD19+ B cells that have normal Ca2+ Flux levels (Flux+) or low Ca2+ Flux (Flux−) from III-11, IV-4 and healthy donor (HD) control. d, Cytoplasmic Ca2+ levels in sorted CD3−CD19+ B cells from HD, III-11 and IV-4 derived by the ratiometric Ca2+ indicator Indo-1. Flow cytometry plots shown in e. Experiments with patient samples were conducted once.

Extended Data Fig. 10 ∣. JAK1 genotype and transcriptotype.

a, Technical validation of aRMAE findings from RNA-seq analysis using ddPCR. Expression of a JAK1 het-exonic SNP (rs3737139) in identical clones (Clone1, Clone2) from D311 was used to determine aRMAE. RNA-sequencing was conducted once, ddPCR results are representative of two experiments. b, Symptoms observed in patients III-1 (unaffected carrier) and II-2 (affected carrier) with discordant penetrance for JAK1-mediated inflammatory disease. c,d, Sanger sequencing of gDNA isolated from sorted immune cells of II-2 (c) and III-1 (d) showing the heterozygous presence of the mutated allele.

Supplementary Material

Supplementary Table 4
Supplementary Table 2
Supplementary Table 3
Supplementary Table 1

Acknowledgements

The authors thank the healthy donors, patients and their families for participating in this study. This study was funded by the National Institute of Allergy and Infectious Diseases Grants R01AI127372, R01AI151029, R01AI148963 and R24AI167802. D.P.H.v.K. was supported by the National Institute of Allergy and Infectious Diseases T32 training grant 5T32AI007512. O.S. was supported by the National Institute of Child Health and Human Development T32 training grant T32HD075735 and the National Institute of Allergy and Infectious Diseases F31 training grant F31AI174808. This research was funded in part through the NIH/NCI Cancer Center Support Grant P30CA013696 and US National Institutes of Health (NIH) grant R35 NS105078, and used the Genomics and High Throughput Screening Shared Resource. The Laboratory of Human Genetics of Infectious Diseases was funded in part by the St Giles Foundation; the Rockefeller University; Institut National de la Santé et de la Recherche Médicale (INSERM); the Imagine Institute; Paris Cité University; the National Center for Research Resources; the National Center for Advancing Translational Sciences of the NIH (UL1TR001866); NIH (R01AI095983, R01AI127564, U19AI162568); the Square Foundation, Grandir - Fonds de solidarité pour l’enfance; the SCOR Corporate Foundation for Science; the French National Research Agency (ANR) under the “Investments for the Future” programme (ANR-10-IAHU-01); the Integrative Biology of Emerging Infectious Diseases Laboratory of Excellence (ANR-10-LABX-62-IBEID); ANR MAFMACRO (ANR-22-CE92-0008); the ANRS project ECTZ170784-ANRS0073; the French Foundation for Medical Research (EQU201903007798); W. E. Ford, G. Caillaux and the General Atlantic Foundation. Figs. 1a, 3a, 4c and 6a,b,d,f,g and Extended Data Fig. 1a were created with BioRender.com.

Footnotes

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Online content

Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41586-024-08346-4.

Competing interests D.B. is the founder of Lab11 Therapeutics. The other authors declare no competing interests.

Supplementary information The online version contains supplementary material available at https://doi.org/10.1038/s41586-024-08346-4.

Data availability

The data supporting the findings of this study are available within the Article and its supplementary information. The raw sequencing datasets are available at the Gene Expression Omnibus under accession GSE279855.

Code availability

The scripts used for data analysis are available via GitHub (https://github.com/BogunovicLab/Stewart_MAE).

References

  • 1. Tangye SG et al. Human inborn errors of immunity: 2022 update on the classification from the International Union of Immunological Societies Expert Committee. J. Clin. Immunol 42, 1473–1507 (2022).
  • 2. Gruber C. & Bogunovic D. Incomplete penetrance in primary immunodeficiency: a skeleton in the closet. Hum. Genet 139, 745–757 (2020).
  • 3. Gimelbrant A, Hutchinson JN, Thompson BR & Chess A. Widespread monoallelic expression on human autosomes. Science 318, 1136–1140 (2007).
  • 4. Chess A. Monoallelic gene expression in mammals. Annu. Rev. Genet 50, 317–327 (2016).
  • 5. Nag A. et al. Chromatin signature of widespread monoallelic expression. eLife 2, e01256 (2013).
  • 6. da Rocha ST & Gendrel AV The influence of DNA methylation on monoallelic expression. Essays Biochem. 63, 663–676 (2019).
  • 7. Bousfiha AA et al. Primary immunodeficiency diseases worldwide: more common than generally thought. J. Clin. Immunol 33, 1–7 (2013).
  • 8. Wengler GS et al. High prevalence of nonsense, frame shift, and splice-site mutations in 16 patients with full-blown Wiskott–Aldrich syndrome. Blood 86, 3648–3654 (1995).
  • 9. Gámez-Díaz L. et al. The extended phenotype of LPS-responsive beige-like anchor protein (LRBA) deficiency. J. Allergy Clin. Immunol 137, 223–230 (2016).
  • 10. Fieschi C. et al. Low penetrance, broad resistance, and favorable outcome of interleukin 12 receptor β1 deficiency: medical and immunological implications. J. Exp. Med 197, 527–535 (2003).
  • 11. Schwab C. et al. Phenotype, penetrance, and treatment of 133 cytotoxic T-lymphocyte antigen 4-insufficient subjects. J. Allergy Clin. Immunol 142, 1932–1946 (2018).
  • 12. Timberlake AT et al. Two locus inheritance of non-syndromic midline craniosynostosis via rare SMAD6 and common BMP2 alleles. eLife 5, e20125 (2016).
  • 13. Spaan AN et al. Human OTULIN haploinsufficiency impairs cell-intrinsic immunity to staphylococcal α-toxin. Science 376, eabm6380 (2022).
  • 14. Israel L. et al. Human adaptive immunity rescues an inborn error of innate immunity. Cell 168, 789–800.e10 (2017).
  • 15. Castel SE et al. Modified penetrance of coding variants by cis-regulatory variation contributes to disease risk. Nat. Genet 50, 1327–1334 (2018).
  • 16. Buzby JS, Williams SA, Schaffer L, Head SR & Nugent DJ Allele-specific wild-type TP53 expression in the unaffected carrier parent of children with Li–Fraumeni syndrome. Cancer Genet. 211, 9–17 (2017).
  • 17. Lyon MF Gene action in the X-chromosome (Mus musculus L.). Nature 190, 372–373 (1961).
  • 18. Brown C. et al. A gene from the region of the human X inactivation centre is expressed exclusively from the inactive X chromosome. Nature 349, 38–44 (1991).
  • 19. Barlow DP Gametic imprinting in mammals. Science 270, 1610–1613 (1995).
  • 20. Chess A, Simon I, Cedar H, & Axel R. Allelic inactivation regulates olfactory receptor gene expression. Cell 78, 823–834 (1994).
  • 21. Gimelbrant AA, Ensminger AW, Qi P, Zucker J. & Chess A. Monoallelic expression and asynchronous replication of p120 catenin in mouse and human cells. J. Biol. Chem 280, 1354–1359 (2005).
  • 22. Davie JM, Paul WE, Mage RG, & Goldman MB Membrane-associated immunoglobulin of rabbit peripheral blood lymphocytes: allelic exclusion at the b locus. Proc. Natl Acad. Sci. USA 68, 430–434 (1971).
  • 23. Gascoigne NRJ & Alam SM Allelic exclusion of the T cell receptor α-chain: developmental regulation of a post-translational event. Semin. Immunol 11, 337–347 (1999).
  • 24. Reinius B. & Sandberg R. Random monoallelic expression of autosomal genes: stochastic transcription and allele-level regulation. Nat. Rev. Genet 16, 653–664 (2015).
  • 25. Deng Q, Ramsköld D, Reinius B. & Sandberg R. Single-cell RNA-seq reveals dynamic, random monoallelic gene expression in mammalian cells. Science 343, 193–196 (2014).
  • 26. Marion-Poll L. et al. Locus specific epigenetic modalities of random allelic expression imbalance. Nat. Commun 12, 5330 (2021).
  • 27. Nag A, Vigneau S, Savova V, Zwemer LM & Gimelbrant AA Chromatin signature identifies monoallelic gene expression across mammalian cell types. G3 5, 1713–1720 (2015).
  • 28. Xu J. et al. Landscape of monoallelic DNA accessibility in mouse embryonic stem cells and neural progenitor cells. Nat. Genet 49, 377–386 (2017).
  • 29. Zhu L. et al. scRNA-seq revealed the special TCR β & α V(D)J allelic inclusion rearrangement and the high proportion dual (or more) TCR-expressing cells. Cell Death Dis. 14, 487 (2023).
  • 30. Stubbington MJT et al. T cell fate and clonality inference from single-cell transcriptomes. Nat. Methods 13, 329–332 (2016).
  • 31. Tukiainen T. et al. Landscape of X chromosome inactivation across human tissues. Nature 550, 244–248 (2017).
  • 32. Gupta S. et al. RNA sequencing-based screen for reactivation of silenced alleles of autosomal genes. G3 12, jkab428 (2022).
  • 33. Vinogradova S, Saksena SD, Ward HN, Vigneau S. & Gimelbrant AA MaGIC: a machine learning tool set and web application for monoallelic gene inference from chromatin. BMC Bioinformatics 20, 106 (2019).
  • 34. Ombrello MJ et al. Cold urticaria, immunodeficiency, and autoimmunity related to PLCG2 deletions. N. Engl. J. Med 366, 330–338 (2012).
  • 35. Gruber CN et al. Complex autoinflammatory syndrome unveils fundamental principles of JAK1 kinase transcriptional and biochemical function. Immunity 53, 672–684.e11 (2020).
  • 36. Del Bel KL et al. JAK1 gain-of-function causes an autosomal dominant immune dysregulatory and hypereosinophilic syndrome. J. Allergy Clin. Immunol 139, 2016–2020.e5 (2017).
  • 37. Horesh ME et al. Individuals with JAK1 variants are affected by syndromic features encompassing autoimmunity, atopy, colitis, and dermatitis. J. Exp. Med 221, e20232387 (2024).
  • 38. Gourdan P. et al. Multifocal tuberculosis: a phenotype of Mendelian susceptibility to mycobacterial disease. Arch. Dis. Child 109, 673–673 (2024).
  • 39. Dupuis S. et al. Impairment of mycobacterial but not viral immunity by a germline human STAT1 mutation. Science 293, 300–303 (2001).
  • 40. Chapgier A. et al. Novel STAT1 alleles in otherwise healthy patients with mycobacterial disease. PLoS Genet. 2, e131 (2006).
  • 41. Liu L. et al. Gain-of-function human STAT1 mutations impair IL-17 immunity and underlie chronic mucocutaneous candidiasis. J. Exp. Med 208, 1635–1648 (2011).
  • 42. Okada S. et al. Human STAT1 gain-of-function heterozygous mutations: chronic mucocutaneous candidiasis and type I interferonopathy. J. Clin. Immunol 40, 1065–1081 (2020).
  • 43. Depner M. et al. The extended clinical phenotype of 26 patients with chronic mucocutaneous candidiasis due to gain-of-function mutations in STAT1. J. Clin. Immunol 36, 73–84 (2016).
  • 44. Ma CA et al. Germline hypomorphic CARD11 mutations in severe atopic disease. Nat. Genet 49, 1192–1201 (2017).
  • 45. Dorjbal B. et al. Hypomorphic caspase activation and recruitment domain 11 (CARD11) mutations associated with diverse immunologic phenotypes with or without atopic disease. J. Allergy Clin. Immunol 143, 1482–1495 (2019).
  • 46. Li SM et al. Transcriptome-wide survey of mouse CNS-derived cells reveals monoallelic expression within novel gene families. PLoS ONE 7, e31751 (2012).
  • 47. Zwemer LM et al. Autosomal monoallelic expression in the mouse. Genome Biol. 13, R10 (2012).
  • 48. Gendrel A-V et al. Developmental dynamics and disease potential of random monoallelic gene expression. Dev. Cell 28, 366–380 (2014).
  • 49. Eckersley-Maslin MA et al. Random monoallelic gene expression increases upon embryonic stem cell differentiation. Dev. Cell 28, 351–365 (2014).
  • 50. Borel C. et al. Biased allelic expression in human primary fibroblast single cells. Am. J. Hum. Genet 96, 70–80 (2015).
  • 51. Reinius B. et al. Analysis of allelic expression patterns in clonal somatic cells by single-cell RNA-seq. Nat. Genet 48, 1430–1435 (2016).
  • 52. Mendelevich A. et al. Replicate sequencing libraries are important for quantification of allelic imbalance. Nat. Commun 12, 3370 (2021).
  • 53. Ritchie ME et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47 (2015).
