Abstract
The JAK-STAT pathway is central to cytokine signaling and controls normal physiology and disease. Aberrant activation via mutations that change amino acids in proteins of the pathway can result in disease. While disease-centric databases like COSMIC catalog mutations in cancer, their prevalence in healthy populations remains underexplored. We systematically studied such mutations in the JAK-STAT genes by comparing COSMIC and the population-focused All of Us database. Our analysis revealed frequent mutations in all JAK and STAT domains, particularly among white females. We further identified three categories: (i) mutations found only in All of Us that were associated with cancer in the literature but could not be found in COSMIC, underscoring COSMIC’s limitations; (ii) mutations unique to COSMIC, whose absence from the general population underlines their potential as drivers of cancer; and (iii) mutations present in both databases, e.g., JAK2Val617Phe (V617F), which is widely recognized as a cancer driver in hematopoietic cells but has no disease associations in All of Us, raising the possibility that combinations of SNPs might be responsible for disease development. These findings illustrate the complementarity of both databases for understanding mutation impacts and underscore the need for multi-mutation analyses to uncover genetic factors underlying complex diseases and advance personalized medicine.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-025-90788-5.
Subject terms: Cancer, Genetics, Immunology, Molecular biology
Introduction
Single Nucleotide Polymorphisms (SNPs) are the most common type of genetic variation among people1. SNPs can be classified based on their DNA location and potential impact on the expression or function of one or several genes2. Within non-coding regions, SNPs can play critical roles in gene regulation by influencing elements such as promoters and enhancers3–5. Within coding regions, SNPs are divided into synonymous SNPs, which do not alter the amino acid sequence of the encoded protein but can affect translation speed, and nonsynonymous/missense SNPs, which result in an amino acid change and may affect protein folding and/or function2. SNPs in the JAK-STAT pathway, a crucial signaling cascade involved in immune responses and cancer development (Suppl. Figure 1)6–10, have been widely recognized as risk factors or causes of a multitude of diseases11.
Mutations in JAK and STAT proteins frequently result in a promiscuous activation of the JAK-STAT pathway, a characteristic feature of various hematological cancers. For example, the JAK2Val617Phe (also called V617F) mutation is common in myeloproliferative neoplasms, occurring in roughly 90–95% of polycythemia vera (PV) patients and in 50–60% of those with essential thrombocythemia and primary myelofibrosis12,13. This mutation results in a gain-of-function effect, leading to increased signaling through the JAK-STAT pathway even in the absence of cytokine stimulation, thereby promoting cell proliferation and survival14,15. Similarly, mutations in JAK1 and JAK3 have been identified in T-cell acute lymphoblastic leukemia (T-ALL), where they contribute to the hyperactivation of the JAK-STAT signaling cascade16,17. Another example is the Pseudokinase domain mutation JAK1Val666Gly, which has been shown to impair JAK3 phosphorylation and interleukin-2 (IL-2) signaling, which is critical for T-cell proliferation18. Moreover, the role of the JAK-STAT pathway extends beyond hematological malignancies. Aberrant activation of this pathway has been implicated in various solid tumors and autoimmune diseases, indicating its broader relevance in human health19,20. For instance, mutations in the STAT3 gene have been associated with increased tumor invasiveness and poor prognosis in various cancers21,22. Additionally, STAT3 mutations are particularly prevalent in T-cell neoplasms, where they contribute to oncogenesis by promoting cell proliferation and survival. For instance, mutations in the SH2 domain of STAT3 can lead to constitutive activation, resulting in enhanced transcriptional activity of target genes involved in cell growth and survival5,16,23. In T-ALL, approximately 20–30% of patients harbor mutations in STAT3 or other components of the JAK-STAT pathway, which are associated with poor prognosis and increased disease aggressiveness16,23. 
Similarly, mutations in STAT5B have been implicated in various hematological malignancies, including acute myeloid leukemia (AML) and chronic lymphocytic leukemia (CLL). The constitutive activation of STAT5B can enhance the expression of anti-apoptotic proteins, thereby promoting cell survival and contributing to the malignant phenotype24. In Waldenström’s macroglobulinemia, for example, constitutive activation of STAT5A and STAT5B has been shown to regulate immunoglobulin secretion, further emphasizing the role of these proteins in B-cell malignancies24.
This literature, along with many other published articles, shows that many missense mutations in the JAK and STAT genes are already known to cause disease. Most SNPs reported in the published literature are included in SNP databases, such as COSMIC25,26, gnomAD27, dbSNP28, and All of Us29. Each database has its unique focus and utility: COSMIC specializes in somatic mutations related to cancer, gnomAD offers a broad survey of human genetic diversity from various global populations, dbSNP provides a catalog of SNPs found in other databases, and All of Us focuses on capturing the diversity of the United States population with an emphasis on underrepresented groups.
However, the current literature lacks a systematic analysis of the frequency of missense mutations in specific domains of the JAK and STAT genes and the prevalence of these mutations across different ethnicities and sexes at birth in the general and mostly healthy population. Additionally, there is a gap in studies comparing SNPs in JAK and STAT genes between disease-specific databases like COSMIC and general population databases such as All of Us. Hence, we examine the prevalence of missense mutations in the COSMIC and All of Us databases representing the cancerous and general mostly healthy populations, respectively. We focus on the main components of this pathway: JAK1-3, TYK2, and STAT1-6 (including STAT5A and STAT5B). We first determine the general frequency of SNPs altering amino acids in these genes within the general population and compare more frequent SNPs (which we found in at least 20 individuals in All of Us) to their prevalence in the disease-centric COSMIC database. We then add what has been published about these SNPs in the literature and highlight inconsistencies between the literature and the All of Us database.
Results and discussion
Assessing the frequency of JAK and STAT missense mutations in the gene domains in the general population
Fig. 1 shows, for each domain, the percentage of amino acid positions that carry a missense mutation in at least one person in the All of Us database. Despite their importance in the immune response, some domains of JAK and STAT family members are heavily mutated (Fig. 1). The STAT genes have the following domains: (i) the N-terminal domain, (ii) the coiled-coil domain, (iii) the DNA-binding domain, (iv) the linker domain, (v) the SH2 domain, and (vi) the transactivation domain (TAD). The N-terminal domain, essential for dimerization, appears to be rarely mutated. STAT6 has the lowest mutation rate in the N-terminal domain at 4%, followed by STAT2, STAT4, STAT5A, and STAT5B, each with 7%. STAT1, with a mutation rate of 11%, displays the highest degree of variation. In the coiled-coil domain, which facilitates protein-protein interactions, STAT6 again has the lowest mutation rate at 15%, followed by STAT2 at 16%, and STAT1, STAT3, and STAT4 at 18%, indicating a relatively consistent level of mutational flexibility across these proteins. More differences are observed in the DNA-binding domain, which is responsible for interacting with gene promoters. STAT4 has the lowest mutation rate at 27%, followed by STAT1 at 29% and STAT3 and STAT5B at 33%. STAT6 shows 34%, and STAT5A has the highest rate at 37%. The linker domain, which connects functional regions, shows less variability than the DNA-binding domain. STAT5B has the lowest mutation rate at 3%, followed by STAT2 at 5%, STAT1 at 6%, STAT3 at 7%, STAT5A at 9%, and STAT6 at 10%. For the SH2 domain, which plays a crucial role in phosphorylation-dependent signaling, STAT5B shows the least mutational change at 24%, followed by STAT5A and STAT1, both at 26%, STAT6 at 29%, STAT4 at 30%, STAT3 at 32%, and STAT2 with the highest mutation rate of 34%, suggesting greater flexibility in phosphorylation engagement across the family. 
Lastly, the TAD, which regulates transcriptional activity, varies significantly across the STAT proteins. STAT1 has the lowest mutation rate at 29%, followed by STAT3 and STAT5B, both at 33%. STAT5A shows a mutation rate of 37%, while STAT6, with a rate of 42%, exhibits the most variation30,31. Overall, it seems that the SH2 and TAD domains of the STAT genes are more heavily mutated throughout the general mostly healthy population than the other domains30–33.
Fig. 1.
Distribution of mutations in JAK/STAT gene domains in the general population provided by All of Us. The JAK family (JAK1, JAK2, JAK3, and TYK2) and STAT family (STAT1–STAT6) are presented with pie charts and bar graphs that compare the proportion of mutated (red) versus non-mutated (blue) amino acid residues within each domain (i.e., N-terminal, Coiled-coil, DNA-binding, Linker, SH2, and TAD for the STAT family and FERM, SH2, Pseudokinase, Kinase, and links between those domains for the JAK family).
The JAK genes have the following domains: (i) the starter link, (ii) the FERM domain, (iii) the FERM link, (iv) the SH2 domain, (v) the SH2 link, (vi) the Pseudokinase, (vii) the Pseudokinase link, and (viii) the Kinase. The Starter link exhibits varying levels of mutation across the JAK family. JAK1 shows the lowest mutation rate at 2%, followed by TYK2 at 5% and JAK2 at 7%, while JAK3 has the highest, with 100% of residues affected. This wide variation suggests differing levels of flexibility and conservation in this domain across the JAK family. The FERM domain, critical for interacting with cytokine receptors, shows considerable variability. JAK1 displays the lowest mutation rate at 51%, followed by JAK2 at 65%. JAK3, with a mutation rate of 74%, and TYK2 at 100%, reflect a high degree of variation. The FERM link also exhibits distinct levels of mutation across the JAK family. JAK1 has the lowest mutation rate at 22%, followed by TYK2 at 47% and JAK2 at 66%. JAK3, with 100% of residues mutated, reflects the highest degree of mutational flexibility in this connecting region. In the SH2 domain, which is essential for recognizing and binding phosphorylated tyrosine residues, JAK1 shows the least variation, with a mutation rate of 40%, followed by TYK2 at 46% and JAK2 at 69%. JAK3, with 70%, exhibits the most variability in this domain, highlighting potential differences in phosphorylation-dependent signaling across the JAK family. The SH2 link domain demonstrates moderate levels of mutation across the JAK family. JAK1 and TYK2 exhibit the lowest rates at 11%, followed by JAK2 at 18%, while JAK3 shows the highest mutation rate at 100%, indicating significant flexibility in this region, particularly in JAK3. The Pseudokinase domain, which regulates the kinase activity, shows high mutation rates across the JAK family. JAK1 displays the lowest rate at 40%, while JAK2 shows 47%, and JAK3 reaches 45%. TYK2 demonstrates the highest mutation rate in this domain at 57%. 
In the Pseudokinase linker, JAK2 has the lowest mutation rate at 1%, followed by TYK2 at 18%, JAK1 at 47%, and JAK3 at 100%. Lastly, the kinase domain, which is vital for catalytic activity, exhibits substantial differences in mutation rates. JAK1 has the lowest rate at 35%, while JAK3 and TYK2 show similar rates at 39% and 40%, respectively. JAK2, however, demonstrates the highest rate at 65%, reflecting a broader range of variation in the kinase domain across the family30–33.
Variations in the JAK domains generally appear to occur more often than in the STAT domains. These data suggest that while certain domains are more prone to mutation, no domain in the JAK and STAT gene families is fully conserved in the general population. These variations may subtly influence how individuals respond to signaling and immune challenges, even in the absence of disease. The findings from the All of Us database challenge the assumption that essential domains in both the JAK and STAT families are highly conserved throughout the healthy population, since substantial mutation rates are observed across key functional domains.
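The per-domain percentages reported above count each amino acid position once if at least one participant carries a missense variant there. A minimal sketch of that calculation is given below; the function name, the domain boundaries, and the mutated positions are hypothetical placeholders for illustration and are not taken from the study or from All of Us.

```python
from typing import Dict, Iterable, Tuple


def domain_mutation_rates(
    domains: Dict[str, Tuple[int, int]],
    mutated_positions: Iterable[int],
) -> Dict[str, float]:
    """Percentage of residues per domain with at least one observed missense variant.

    `domains` maps a domain name to inclusive, 1-based (start, end) residue positions.
    `mutated_positions` lists residue positions with a variant in >= 1 individual.
    """
    # A residue counts once, no matter how many variants or carriers it has.
    mutated = set(mutated_positions)
    rates = {}
    for name, (start, end) in domains.items():
        length = end - start + 1
        hits = sum(1 for pos in mutated if start <= pos <= end)
        rates[name] = 100.0 * hits / length
    return rates


# Hypothetical toy protein with two domains (not real JAK/STAT coordinates).
toy_domains = {"N-terminal": (1, 10), "SH2": (11, 30)}
# Position 2 appears twice and position 40 lies outside both domains.
toy_positions = [2, 2, 15, 16, 29, 40]

rates = domain_mutation_rates(toy_domains, toy_positions)
print(rates)  # {'N-terminal': 10.0, 'SH2': 15.0}
```

Because the measure is position-based, a domain with many rare variants spread over its length scores higher than a domain with one very common variant; it describes how much of a domain tolerates variation, not how many people carry a variant.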
Comparing missense single nucleotide variants in the JAK/STAT genes that were identified in All of Us or COSMIC to what is known in the literature and how they are associated with disease in All of Us
We examined the All of Us database for missense mutations that occur in at least 20 individuals within the JAK and STAT gene families and evaluated whether these mutations are associated with any diseases based on the available data in All of Us (Fig. 2). Additionally, we conducted a literature review for each of these mutations. We first discuss the SNPs identified in general, focusing on their associations with sex at birth and ethnicity. Next, we investigate some SNPs that are predominant in the Asian-American population. We then explore the SNPs found in both All of Us and COSMIC and review the existing literature on these variants. Finally, we assess the SNPs that were present in All of Us but absent in COSMIC and vice versa.
Fig. 2.
Missense mutations found in more than 20 samples in All of Us or COSMIC for STAT5B, STAT6, JAK1, JAK2, and JAK3. STAT1, STAT2, STAT3, STAT4, STAT5A, and TYK2 can be found in Suppl. Figures 2–4. If the amino acid change is labeled in red color then it exists in at least 20 individuals in COSMIC but not in All of Us. If the amino acid change is labeled in black color then it exists in at least 20 individuals in All of Us. We labeled it with COSMIC and All of Us numbers, if an amino acid change exists in both databases.
Suppl. Table 1 shows that mutations in members of the JAK-STAT pathway predominantly occur in females. STAT1 mutations are most frequently found in Black and Hispanic females. STAT2 and STAT3 also show a significant representation of Black and Hispanic individuals but include a substantial number of mutations observed in white females, especially at the end of the protein sequence. On the other hand, STAT4 mutations are primarily observed in Black females. Both STAT5A and STAT5B exhibit similar trends, with mutations predominantly affecting females in White and Black populations. STAT6 mutations, while also more frequent in females, show a strong presence in White populations. For the JAK family, JAK1 and JAK2 mutations also trend toward female prevalence, particularly in White populations. JAK3 exhibits a similar pattern of female predominance across various ethnicities. TYK2 mutations follow this overarching trend, with a notable prevalence among Hispanic and White females. In the All of Us database, none of these SNPs could be significantly associated with a disease.
While the analysis highlights significant representation among Black, Hispanic, and White populations, we recognize the presence of mutations affecting Asian populations as well. A few JAK-STAT pathway mutations have been observed in Asian females and males, mostly without an associated health condition in the All of Us database. A total of 205 individuals carrying rs56118985 (JAK2Gly127Asp) have been identified in All of Us, most of them Asian females. In the literature, this missense mutation is associated with Philadelphia-negative myeloproliferative neoplasms (Ph-MPN), a group of cancers that cause the body to produce too many blood cells34. The authors highlight that rs56118985 co-occurs exclusively with rs77375493 (polycythemia vera) in their patients. Additionally, Motegi et al. characterized the variant rs201917359 (TYK2Arg231Trp) as a gain-of-function mutation in the Japanese population, which could enhance TYK2 signaling and contribute to the pathogenesis of Rheumatoid Arthritis35. In the study by Nemoto et al. (2018), the SNP rs201917359 in the TYK2 gene was linked to primary immunodeficiency in two siblings, characterized by T-cell lymphopenia36. In the COSMIC database, neither rs56118985 nor rs201917359 was identified in 20 patients or more, and none of the individuals in the All of Us database was diagnosed with any of the diseases mentioned in the literature.
Next, we investigated which variants were identified in both COSMIC and All of Us. The most prominent example is rs77375493 (polycythemia vera, JAK2Val617Phe). rs77375493 has been identified in 48,389 individuals in COSMIC and in 278 individuals in All of Us, of whom 79% are 65 years or older. This mutation has been widely described as a risk factor in the literature37–51. From the numbers in COSMIC, we can assume that it is directly related to cancer. However, on closer inspection of All of Us, we cannot find any cancer diagnosis for those with rs77375493, even though most of these individuals are 65 years or older and would be expected to have developed the disease. Similarly, we analyzed rs3213409 (JAK3Val722Ile, germline, also reported as somatic in a few solid tumors52), which is present in both the All of Us and COSMIC databases. rs3213409 has been identified in 48 individuals in COSMIC and 4449 individuals within All of Us, predominantly affecting white females. However, no disease association is recorded in the All of Us database. In contrast, the literature has linked rs3213409 to several cancer types, including acute lymphoblastic leukemia52–57. These observations could be skewed, since individuals in All of Us who carry rs77375493 or rs3213409 may have been diagnosed with the disease after submitting their data.
We also found some further variations associated with various cancers in the literature that were not present in the COSMIC database but were present in the All of Us database. rs3212723 (JAK3Pro132Thr), a variation that was identified in 10,507 individuals (predominantly Black females) in All of Us but not in COSMIC, was associated with acute megakaryoblastic leukemia58,59, head and neck cancer60, and Ameloblastoma61 in the literature. In All of Us, we can only identify that the individuals suffer from essential hypertension and chest pain in large numbers. We also identified the SNP rs139504737 (JAK2Gly571Ser) in the All of Us database, where it was found in 289 individuals, predominantly among Hispanic females. However, no corresponding disease association was noted in the database. In the literature, rs139504737 has been linked to various malignancies, including acute lymphoblastic leukemia, myeloproliferative neoplasms, and thrombocythemia, suggesting a potential role in hematological disorders despite not being identified in large numbers in COSMIC62–65. rs372254348 (JAK2Ile724Thr) is present in the All of Us database but lacks cancer associations within the database. This SNP has been documented in the literature as being associated with myeloproliferative neoplasms, including conditions such as polycythemia vera, essential thrombocythemia, and primary myelofibrosis66. We observed rs142269166 (TYK2Asn1108Ser) in the All of Us database, where it was found in 1,357 individuals, primarily among white females, without any recorded disease associations. However, this SNP has been extensively studied in the literature, linking it to various myeloproliferative neoplasms, including polycythemia vera and myelofibrosis, which have been implicated in the transformation of myeloproliferative neoplasms into acute myeloid leukemia43,67,68. 
We detected rs200077579 (JAK3Arg840Cys) in the All of Us database, where it appears without any associated cancer diagnoses, but not in COSMIC. This variant has been documented in the literature as a heterozygous mutation linked to Cytotoxic T Lymphocyte Antigen-4-Dependent Immune Dysregulation Syndrome69. rs201335603 (TYK2Gly761Val) is present in the All of Us database but lacks an associated disease diagnosis. This SNP has been implicated in the literature in various hematologic malignancies, particularly acute lymphoblastic leukemia (ALL). Specifically, germline activating mutations in the TYK2 gene, including this SNP, have been linked to increased susceptibility to ALL, indicating a potential role in oncogenesis; however, none of the 66 individuals in All of Us seem to have any of these conditions70,71. The variant rs141331848 (STAT4Thr446Ile, 316 individuals in All of Us) has been documented in the literature as potentially linked to classic Kaposi sarcoma (cKS), with evidence suggesting its role in genetic predisposition. Specifically, this SNP was found in a Finnish family with affected individuals, indicating a possible association with the disease72. We also noticed rs41316003 (JAK2Arg1063His), which has been documented in 2,042 individuals in All of Us. However, no associated disease diagnoses are present in the database. In the literature, this variant has been linked to several conditions, including myeloproliferative neoplasms and familial ischemic stroke44,73–75. In All of Us, none of these SNPs were associated with the diseases described in the literature.
We then reversed the search to investigate the SNPs that were found in COSMIC but not in All of Us. rs2081548277 (STAT3Gly618Arg) was identified in 21 samples in COSMIC but not in All of Us. rs2081548277 is associated with various hematological and lymphoid malignancies, particularly T-cell lymphomas. It is categorized as a gain-of-function (GOF) mutation that increases STAT3 activation, promoting cell proliferation and contributing to disease pathology76–78. In the context of large granular lymphocyte leukemia (LGLL), gain-of-function mutations in STAT3, including this specific variant, have been linked to aberrant cytokine signaling, specifically involving IL-6 and IL-15, and the upregulation of epigenetic regulators such as DNMT1 and EZH2. These mutations contribute to global hypermethylation, increased oxidative stress, and a proliferative advantage in affected cells76–78. The SNP rs770986654 (STAT3Asn647Ile) has been identified in 30 individuals in COSMIC but none in All of Us. In the literature, this mutation is associated with large granular lymphocyte (LGL) leukemia. rs770986654 is described as a gain-of-function mutation, leading to enhanced STAT3 activation. In the context of T-cell large granular lymphocytic leukemia, these gain-of-function mutations promote the survival and proliferation of leukemic T cells by upregulating cytokine signaling pathways, particularly involving interleukin-6 (IL-6) and interleukin-15 (IL-15). Moreover, such mutations are linked to increased activity of epigenetic regulators, contributing to the dysregulated gene expression observed in the disease78–81. The SNP rs747639500 (STAT3Asp661Tyr) is associated with large granular lymphocyte (LGL) leukemia in the literature. This mutation has been identified in COSMIC in 127 individuals and is classified as a somatic gain-of-function mutation in the STAT3 gene81–84. The SNP rs938448224 (STAT5BAsn642His) has been identified in 120 individuals in COSMIC but not in All of Us. 
rs938448224, a gain-of-function mutation of STAT5B, has been implicated in several hematological disorders, particularly chronic myeloid neoplasms and eosinophilia. The mutation enhances STAT5B signaling, promoting abnormal cell proliferation and survival by increasing cytokine signaling85–87. The SNP rs1057519721 (JAK2Arg683Gly) has been detected in 104 individuals in COSMIC, though it was not found in the All of Us database. rs1057519721, a GOF mutation in JAK2, has been implicated in hematological malignancies, particularly ALL. The mutation enhances JAK2 signaling, promoting abnormal cell proliferation and survival by dysregulating cytokine signaling pathways, contributing to leukemic progression88–92. The SNP rs121913504 (JAK3Ala572Val) has been identified in 28 individuals in COSMIC. This variant has been implicated in various hematologic malignancies, including T-cell malignancies, particularly JAK3-mutation-positive leukemia. It is a gain-of-function mutation that enhances the signaling pathway of JAK3, contributing to abnormal cell proliferation and survival through dysregulated cytokine signaling. The mutation impacts downstream epigenetic regulators and cytokine-mediated pathways, further exacerbating disease progression93–97. The SNP rs2147686240 (JAK3Ala573Val) has been identified in 28 individuals in COSMIC. It is a well-known gain-of-function mutation in JAK3, located in the Pseudokinase domain. It has been associated with various hematological malignancies, particularly natural killer/T-cell lymphoma (NKTCL). The mutation enhances JAK3-mediated STAT5 activation, leading to cytokine-independent cell growth and contributing to the pathogenesis of NKTCL by promoting uncontrolled proliferation and survival of malignant cells53,98–101. The SNP rs758959409 (JAK3Arg657Gln) was identified in 48 individuals in COSMIC but not in All of Us. 
This variant is involved in hematological malignancies, particularly in Down syndrome-related acute megakaryoblastic leukemia (AMKL) and transient myeloproliferative disorder (TMD). Mutations like this promote constitutive activation of JAK3, leading to enhanced signaling through the JAK-STAT pathway, thereby driving abnormal cell proliferation and survival in affected cells101,102. These mutations seem to be promising candidates for disease driver mutations due to their absence in the All of Us database.
Finally, we identified one SNP (rs369530676, TYK2Arg118Gln) that is exclusive to the All of Us database and is found in 34 individuals. A notable majority of individuals carrying this variant are diagnosed with Type 2 Diabetes. This SNP has not been discussed in the literature yet and might represent a suitable candidate for future experimental investigation.
Limitations and considerations
The JAK-STAT signaling pathway is known for its evolutionary conservation across species, which underscores its fundamental role in cellular communication and immune responses103,104. The analysis of missense mutations within the JAK-STAT pathway in the All of Us database offers a first glimpse into a complex landscape of genetic variation within the mostly healthy population, which we found in every domain of the JAK and STAT proteins. However, most of the identified SNPs could not be associated with any disease in the All of Us database. Our findings indicate that while certain domains of the JAK and STAT proteins exhibit higher mutation rates, no domain of any member of the JAK or STAT gene family is entirely conserved across the U.S. population. Nevertheless, these mutations could subtly affect immune responses and signaling pathways among individuals. The presence of mutations in critical functional domains raises the possibility that they may not universally lead to deleterious effects but could instead contribute to population-level genetic diversity105–107.
The analysis of SNPs within the JAK-STAT pathway in the All of Us database reveals a notable predominance of these genetic variations in white females. This observation raises important considerations regarding the implications of ethnic and sex-based disparities in genetic susceptibility to diseases associated with the JAK-STAT signaling pathway, particularly hematological malignancies and autoimmune disorders, which are also more prevalent in females108. This points to a potential association that future studies should examine. Previous studies have indicated that certain genetic variants exhibit differential frequencies across ethnic groups, which can impact disease susceptibility and treatment responses109,110.
Our investigation of missense SNPs in the JAK-STAT pathway, using both the All of Us and COSMIC databases, reveals discrepancies in mutation representation and disease associations. The SNPs can be categorized into three groups: those found in both databases, those unique to All of Us, and those unique to COSMIC. Each category provides insight into the limitations of current genetic data and its interpretation.
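The three-way grouping described here can be expressed as simple set operations on per-database carrier counts. The sketch below is an illustration only: it assumes that a variant counts as represented in a database when it is seen in at least 20 individuals (the threshold used in the Results), and the function name and data layout are hypothetical. The example counts are the ones quoted in the text.

```python
from typing import Dict, List

MIN_CARRIERS = 20  # threshold used in the Results for a "frequent" variant


def categorize_variants(
    all_of_us: Dict[str, int],
    cosmic: Dict[str, int],
    min_carriers: int = MIN_CARRIERS,
) -> Dict[str, List[str]]:
    """Group rsIDs by the database(s) in which they reach the carrier threshold."""
    # Assumption: a variant below the threshold is treated as not represented.
    in_aou = {rsid for rsid, n in all_of_us.items() if n >= min_carriers}
    in_cosmic = {rsid for rsid, n in cosmic.items() if n >= min_carriers}
    return {
        "both": sorted(in_aou & in_cosmic),
        "all_of_us_only": sorted(in_aou - in_cosmic),
        "cosmic_only": sorted(in_cosmic - in_aou),
    }


# Carrier counts as quoted in the text for four of the discussed variants.
all_of_us_counts = {"rs77375493": 278, "rs3213409": 4449, "rs3212723": 10507}
cosmic_counts = {"rs77375493": 48389, "rs3213409": 48, "rs2081548277": 21}

groups = categorize_variants(all_of_us_counts, cosmic_counts)
print(groups)
# {'both': ['rs3213409', 'rs77375493'],
#  'all_of_us_only': ['rs3212723'],
#  'cosmic_only': ['rs2081548277']}
```

Under this simplification, a variant that is present in a database but in fewer than 20 individuals is grouped with the absent ones, so the threshold choice directly affects which category a variant falls into.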
An important consideration is the concept of clonal hematopoiesis (CH), increasingly recognized as a premalignant condition in which somatic mutations—including those in genes such as JAK2, JAK3, and STAT3—may be detected in otherwise healthy individuals. The 5th edition of the World Health Organization (WHO) Classification of Haematolymphoid Tumours underscores the frequent occurrence of these mutations in CH and their clinical significance111. In particular, JAK2Val617Phe is well-established as the main driver in myeloproliferative neoplasms (MPNs). However, its detection in asymptomatic individuals may reflect clonal hematopoiesis of indeterminate potential (CHIP) and not an overt MPN. A fraction of carriers eventually develop MPN, influenced by factors such as inflammation, additional mutations112, and an increased allelic burden113. Similarly, JAK3Val722Ile can appear as a germline variant in seemingly healthy individuals yet has been reported as a somatic change in certain solid tumors. These observations could help explain why recognized cancer driver mutations appear in the general population database (All of Us).
The next category, mutations found only in All of Us, such as rs56118985 and rs139504737, raises intriguing questions. These SNPs, particularly prevalent in certain ethnic populations, are associated with malignancies in the literature but show no corresponding disease diagnoses in All of Us and were not discovered in COSMIC. This suggests either underdiagnosis or that these mutations have not yet manifested as a disease in these individuals at the time of the sampling. It emphasizes the importance of including diverse populations in genetic studies to capture variations that may not be reflected in cancer-centric databases like COSMIC that, in general, overrepresent European ancestry114,115.
Mutations like rs2081548277 and rs938448224, found only in COSMIC, highlight COSMIC's value. These cancer-associated variants are absent from the All of Us database, likely because it contains fewer people with a cancer diagnosis. These SNPs might represent important driver mutations since they cannot be found in the general population. COSMIC emphasizes disease endpoints, particularly malignancies, while All of Us includes individuals earlier in their disease progression or who may never develop cancer. These observations highlight the complementary strengths and limitations of different genetic databases. Considering both disease-focused and general population databases when selecting SNPs for investigation can therefore provide a more comprehensive understanding of their potential significance in disease.
Finally, the well-studied JAK2Val617Phe mutation is present in both databases and widely considered a driver. It shows no disease associations in All of Us, which raises the possibility that some mutations may require additional mutations for disease development. Moreover, the identification of novel mutations exclusive to the All of Us database, such as rs369530676 in TYK2, which correlates strongly with Type 2 Diabetes, suggests that the JAK-STAT pathway may play a broader role in metabolic diseases beyond its established involvement in cancer116,117.
It is also increasingly evident that most complex diseases might not be explained by a single SNP but rather by a combination of SNPs working together. Tools like NeEDL allow us to interpret the statistical significance of these combinations, revealing that with 3–7 SNPs, we can achieve a higher predictive score for complex heritable diseases like Rheumatoid Arthritis, diabetes, or Alzheimer’s disease118–120. This insight suggests that a similar approach could be applied to cancer biology, where multiple incidental SNPs might collectively serve as a root cause of cancer121–123.
Conclusion
In conclusion, this study underscores the need for integrative approaches that combine population-level data with disease-focused resources. While COSMIC provides critical insight into cancer-associated mutations, All of Us expands the scope by capturing genetic diversity across the general, mostly healthy, population and underrepresented groups. The discrepancies between disease-specific and general population databases underscore the importance of integrating genetic data from various sources to enhance our understanding of the role of these mutations in disease pathogenesis. We showed that neither database alone is sufficient to investigate the impact of genetic mutations on disease. In most cases, we believe that a single SNP is insufficient to drive the onset of complex diseases. Hence, we plan to expand our future research into multi-SNP analyses and integrate diverse data sources to improve the prediction of important genetic factors in complex diseases. By combining the strengths of multiple databases with different objectives and integrating multi-SNP analyses, we can unlock new insights into complex diseases and drive advancements in personalized genomic medicine.
Materials and methods
All of Us Data Explorer
The All of Us Research Program is a national initiative aimed at collecting and analyzing health data from a diverse cohort of participants, with a particular focus on underrepresented populations in biomedical research. For this study, we accessed genetic data (All of Us Controlled Tier Dataset v7, which includes 413,000 participants) through the All of Us Data Browser (https://databrowser.researchallofus.org/), which allows users to explore aggregated genomic data, including single nucleotide polymorphisms (SNPs) and their associations with demographic variables such as ethnicity, sex at birth, and disease diagnosis. We examined missense mutations within the JAK-STAT gene families using this browser, focusing on SNPs that alter amino acid sequences and their prevalence across different demographic groups. We also explored the associations between these variants and known diseases within the All of Us dataset. All retrieved data adhered to the ethical standards of the program, which anonymizes participant information to protect privacy. The genetic variants of interest were filtered based on their mutation type, and we analyzed the frequency of these mutations in the general population. We further stratified our analysis by sex at birth and ethnicity to investigate population-specific variations in mutation prevalence.
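As an illustration of the stratification step, the following minimal Python sketch computes the carrier frequency of a variant per demographic group from aggregated counts. The function name, the group labels, and all counts are hypothetical placeholders and are not values taken from the All of Us Data Browser.

```python
def carrier_frequency_by_group(carriers, participants):
    """Return carriers / participants for every group with at least one participant.

    Both arguments map a group label (e.g. sex at birth) to an aggregated count.
    Groups without a carrier entry are treated as having zero carriers.
    """
    return {
        group: carriers.get(group, 0) / total
        for group, total in participants.items()
        if total > 0
    }


# Hypothetical toy counts, for illustration only.
toy_participants = {"female": 1000, "male": 500}
toy_carriers = {"female": 30, "male": 10}

frequencies = carrier_frequency_by_group(toy_carriers, toy_participants)
```

Normalizing by group size in this way keeps groups of different sizes comparable, which raw carrier counts alone would not.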
We used all missense mutations that were found in at least one participant in the section “Assessing the frequency of JAK and STAT domain missense mutations in the general population”. For the closer analysis in the section “Comparing missense single nucleotide variants in the JAK/STAT genes that were identified in All of Us or COSMIC to what is known in the literature and how they are associated with disease in All of Us”, we used all missense mutations that were found in at least 20 participants, in accordance with the All of Us database policy.
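The two inclusion thresholds described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Variant` record, the `filter_missense` helper, the placeholder identifiers (`rsA` to `rsD`), and all counts are hypothetical and do not correspond to an export format of either database.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    rsid: str          # placeholder identifier, not a real rsID
    gene: str
    consequence: str   # e.g. "missense" or "synonymous"
    carriers: int      # number of participants carrying the variant


def filter_missense(variants, min_carriers):
    """Keep missense variants observed in at least `min_carriers` participants."""
    return [
        v for v in variants
        if v.consequence == "missense" and v.carriers >= min_carriers
    ]


# Hypothetical toy records, for illustration only.
TOY_VARIANTS = [
    Variant("rsA", "JAK2", "missense", 35),
    Variant("rsB", "JAK2", "synonymous", 120),
    Variant("rsC", "STAT3", "missense", 4),
    Variant("rsD", "TYK2", "missense", 20),
]

# Domain-level frequency analysis: at least one participant.
domain_level = filter_missense(TOY_VARIANTS, min_carriers=1)
# Closer per-variant analysis: at least 20 participants.
detailed = filter_missense(TOY_VARIANTS, min_carriers=20)
```

With these toy records, the synonymous variant is dropped in both cases, and the variant seen in only four participants is kept for the domain-level analysis but excluded from the closer analysis.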
COSMIC (Catalogue of Somatic Mutations in Cancer)
The Catalogue of Somatic Mutations in Cancer (COSMIC) database (https://cancer.sanger.ac.uk/cosmic) is a comprehensive resource documenting somatic mutations observed in cancer. We used COSMIC (COSMIC v100) to examine the prevalence of missense mutations in the JAK-STAT pathway, specifically focusing on mutations that have been identified in cancer samples. COSMIC provides detailed information about the mutational spectrum, tissue distribution, and associated cancers for each SNP, which allowed us to compare these findings with the data from the All of Us cohort.
The mutations were extracted using COSMIC’s online tools, which offer a range of filtering options, including mutation type and tissue of origin. For each SNP identified in COSMIC, we recorded the number of samples carrying the mutation and linked this information to disease associations as documented in the literature. For consistency with All of Us, we used only missense mutations that were found in at least 20 individuals.
Comparison between All of Us and COSMIC
After retrieving the relevant SNP data from both databases, we manually conducted a comparative analysis to identify the overlap between mutations found in both resources and mutations unique to each dataset. This work had to be done manually since, currently, no data transfer between COSMIC and All of Us is allowed. This comparison enabled us to assess the clinical relevance of mutations across both general and cancer-focused populations. For mutations present in both datasets, we further reviewed existing literature to determine their disease associations and clinical implications. We investigated the relevance of mutations found only in COSMIC to specific cancer types. Similarly, mutations exclusive to All of Us were analyzed in the context of population-specific genetic variation and disease prevalence.
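Although the comparison in this study was carried out manually, its logic corresponds to simple set operations. The sketch below is a hypothetical illustration, not the procedure that was actually used: the function name `categorize_variants` and the label `JAK2_V617F` are assumptions, and the example identifiers are only the variants named in the Discussion.

```python
def categorize_variants(all_of_us_ids, cosmic_ids):
    """Split variant identifiers into the three categories used in the comparison."""
    all_of_us = set(all_of_us_ids)
    cosmic = set(cosmic_ids)
    return {
        "all_of_us_only": sorted(all_of_us - cosmic),
        "cosmic_only": sorted(cosmic - all_of_us),
        "both": sorted(all_of_us & cosmic),
    }


# Example identifiers taken from the Discussion; "JAK2_V617F" is a label, not an rsID.
all_of_us_variants = ["rs56118985", "rs139504737", "JAK2_V617F"]
cosmic_variants = ["rs2081548277", "rs938448224", "JAK2_V617F"]

categories = categorize_variants(all_of_us_variants, cosmic_variants)
```

Each category then feeds a different follow-up: literature review of disease associations for shared variants, cancer-type relevance for COSMIC-only variants, and population-specific context for All of Us-only variants.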
Electronic supplementary material
Below is the link to the electronic supplementary material.
Acknowledgements
The authors want to thank Jakub Jankowski, Sung-Gwon Lee, Hye Kyung Lee, Priscilla A. Furth, and the members of the Laboratory of Cell & Molecular Biology (LCMB), NIDDK, NIH, for their valuable input. The figures were created with Biorender.com. Parts of the figures include icons from Flaticon.com under a paid license. The text was partly rephrased using ChatGPT version 4, Grammarly, and scite.ai under a paid license. Paperpile, under a paid license, was used to collect references in the right format. We gratefully acknowledge All of Us participants for their contributions, without whom this research would not have been possible. We also thank the National Institutes of Health’s All of Us Research Program for making available the participant data, samples, and cohort examined in this study.
Author contributions
M.H. planned the project, executed the analysis, and wrote the manuscript. L.H. planned, supervised the project, and revised the manuscript. All authors read and approved the final version of the manuscript.
Funding
Open access funding provided by the National Institutes of Health.
This work was supported by the Intramural Research Programs (IRPs) of the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK).
Data availability
This study used data from the All of Us Research Program’s Controlled Tier Dataset v7, available to authorized users on the Researcher Workbench (https://databrowser.researchallofus.org/). Data from COSMIC v100 are available at https://cancer.sanger.ac.uk/cosmic.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Shen, L. X., Basilion, J. P. & Stanton, V. P. Jr. Single-nucleotide polymorphisms can cause different structural folds of mRNA. Proc. Natl. Acad. Sci. U S A. 96, 7871–7876 (1999).
- 2. Chu, D. & Wei, L. Nonsynonymous, synonymous and nonsense mutations in human cancer-related genes undergo stronger purifying selections than expectation. BMC Cancer 19, (2019).
- 3. Peña-Martínez, E. G. & Rodríguez-Martínez, J. A. Decoding non-coding variants: Recent approaches to studying their role in gene regulation and human diseases. Front. Biosci. (Schol Ed). 16, 4 (2024).
- 4. Hoffmann, M. et al. TF-Prioritizer: A Java pipeline to prioritize condition-specific transcription factors. Gigascience 12, giad026 (2023).
- 5. Hecker, D. et al. Computational tools for inferring transcription factor activity. Proteomics e2200462 (2023).
- 6. Brooks, A. J. & Putoczki, T. JAK-STAT signalling pathway in cancer. Cancers (Basel). 12, 1971 (2020).
- 7. Hu, X., Li, J., Fu, M., Zhao, X. & Wang, W. The JAK/STAT signaling pathway: From bench to clinic. Signal. Transduct. Target. Ther. 6, 402 (2021).
- 8. Xue, C. et al. Evolving cognition of the JAK-STAT signaling pathway: Autoimmune disorders and cancer. Signal. Transduct. Target. Ther. 8, 1–24 (2023).
- 9. Hoffmann, M. et al. Blood transcriptomics analysis offers insights into variant-specific immune response to SARS-CoV-2. Sci. Rep. 14, 1–11 (2024).
- 10. Lee, H. K., Jung, O. & Hennighausen, L. JAK inhibitors dampen activation of interferon-stimulated transcription of ACE2 isoforms in human airway epithelial cells. Commun. Biol. 4, 654 (2021).
- 11. Erdogan, F. et al. Structural and mutational analysis of member-specific STAT functions. Biochim. Biophys. Acta Gen. Subj. 1866, 130058 (2022).
- 12. Rampal, R. et al. Integrated genomic analysis illustrates the central role of JAK-STAT pathway activation in myeloproliferative neoplasm pathogenesis. Blood 123, e123–e133 (2014).
- 13. Perner, F., Perner, C., Ernst, T. & Heidel, F. H. Roles of JAK2 in aging, inflammation, hematopoiesis and malignant transformation. Cells 8, 854 (2019).
- 14. Raivola, J., Haikarainen, T., Abraham, B. G. & Silvennoinen, O. Janus kinases in leukemia. Cancers (Basel). 13, 800 (2021).
- 15. Constantinescu, S. N., Leroy, E., Gryshkova, V., Pecquet, C. & Dusa, A. Activating Janus kinase pseudokinase domain mutations in myeloproliferative and other blood cancers. Biochem. Soc. Trans. 41, 1048–1054 (2013).
- 16. Girardi, T. et al. The T-cell leukemia-associated ribosomal RPL10 R98S mutation enhances JAK-STAT signaling. Leukemia 32, 809–819 (2018).
- 17. Waldmann, T. A. JAK/STAT pathway directed therapy of T-cell leukemia/lymphoma: Inspired by functional and structural genomics. Mol. Cell. Endocrinol. 451, 66–70 (2017).
- 18. Grant, A. H. et al. JAK1 pseudokinase V666G mutant dominantly impairs JAK3 phosphorylation and IL-2 signaling. Int. J. Mol. Sci. 24, (2023).
- 19. O’Shea, J. J. et al. The JAK-STAT pathway: Impact on human disease and therapeutic intervention. Annu. Rev. Med. 66, 311–328 (2015).
- 20. Łączak, M. et al. JAK and STAT gene mutations and JAK-STAT pathway activation in lympho- and myeloproliferative neoplasms. Hematol. Clin. Pract. 12, 89–104 (2022).
- 21. Deng, M. et al. A novel STAT3 gain-of-function mutation in fatal infancy-onset interstitial lung disease. Front. Immunol. 13, 866638 (2022).
- 22. Klein, K., Stoiber, D., Sexl, V. & Witalisz-Siepracka, A. Untwining anti-tumor and immunosuppressive effects of JAK inhibitors-A strategy for hematological malignancies? Cancers (Basel). 13, 2611 (2021).
- 23. Wahnschaffe, L. et al. JAK/STAT-activating genomic alterations are a hallmark of T-PLL. Cancers (Basel). 11, 1833 (2019).
- 24. Hodge, L. S. et al. Constitutive activation of STAT5A and STAT5B regulates IgM secretion in Waldenström’s macroglobulinemia. Blood 123, 1055–1058 (2014).
- 25. Bamford, S. et al. The COSMIC (catalogue of somatic mutations in Cancer) database and website. Br. J. Cancer. 91, 355–358 (2004).
- 26. Sondka, Z. et al. COSMIC: A curated database of somatic variants and clinical data for cancer. Nucleic Acids Res. 52, D1210–D1217 (2024).
- 27. Chen, S. et al. Genome aggregation database (gnomAD) consortium. Nature 625, 92–100 (2024).
- 28. Smigielski, E. M. dbSNP: A database of single nucleotide polymorphisms. Nucleic Acids Res. 28, 352–355 (2000).
- 29. The All of Us Research Program Genomics Investigators. Genomic data in the all of Us Research Program. Nature 627, 340–346 (2024).
- 30. Hu, Q. et al. JAK/STAT pathway: Extracellular signals, diseases, immunity, and therapeutic regimens. Front. Bioeng. Biotechnol. 11, 1110765 (2023).
- 31. Silver-Morse, L. & Li, W. X. JAK-STAT in heterochromatin and genome stability. JAKSTAT 2, e26090 (2013).
- 32. Puleo, D. E. et al. Identification and characterization of JAK2 pseudokinase domain small molecule binders. ACS Med. Chem. Lett. 8, 618–621 (2017).
- 33. Rah, B. et al. JAK/STAT signaling: Molecular targets, therapeutic opportunities, and limitations of targeted inhibitions in solid malignancies. Front. Pharmacol. 13, (2022).
- 34. Wang, Z. et al. Clinical laboratory characteristics and gene mutation spectrum of Ph-negative MPN patients with atypical variants of JAK2, MPL, or CALR. Cancer Med. 13, (2024).
- 35. Motegi, T. et al. Identification of rare coding variants in TYK2 protective for rheumatoid arthritis in the Japanese population and their effects on cytokine signalling. Ann. Rheum. Dis. 78, 1062–1069 (2019).
- 36. Nemoto, M. et al. Compound heterozygous TYK2 mutations underlie primary immunodeficiency with T-cell lymphopenia. Sci. Rep. 8, (2018).
- 37. Lee, T. S., Ma, W., Zhang, X., Kantarjian, H. & Albitar, M. Structural effects of clinically observed mutations in JAK2 exons 13–15: comparison with V617F and exon 12 mutations. BMC Struct. Biol. 9, 58 (2009).
- 38. Zhang, Y., Zhao, Y., Liu, Y., Zhang, M. & Zhang, J. New advances in the role of JAK2 V617F mutation in myeloproliferative neoplasms. Cancer 10.1002/cncr.35559 (2024).
- 39. Haji Paiman, N. S., Mat Nasir, N., Miptah, H. N., Saidon, N. & Abdul Monir, M. Challenges in diagnosing polycythemia Vera in primary care: A 55-year-old Malaysian woman with atypical presentation. Am. J. Case Rep. 25, (2024).
- 40. Eichstaedt, C. A. et al. Myeloproliferative diseases as possible risk factor for development of chronic thromboembolic pulmonary hypertension-A genetic study. Int. J. Mol. Sci. 21, (2020).
- 41. Carlos, J. A. E. G., Lima, K., Rego, E. M., Costa-Lotufo, L. V. & Machado-Neto, J. A. The survivin/XIAP suppressant YM155 impairs clonal growth and induces apoptosis in JAK2V617F cells. Hematol. Transfus. Cell. Ther. 10.1016/j.htct.2024.05.012 (2024).
- 42. Bourrienne, M. C. et al. Impaired fibrinolysis in JAK2V617F-related myeloproliferative neoplasms. J. Thromb. Haemost. 10.1016/j.jtha.2024.07.031 (2024).
- 43. Schulze, S. et al. Concomitant and noncanonical JAK2 and MPL mutations in JAK2V617F- and MPLW515 L-positive myelofibrosis. Genes Chromosomes Cancer. 58, 747–755 (2019).
- 44. Mambet, C. et al. Cooccurring JAK2 V617F and R1063H mutations increase JAK2 signaling and neutrophilia in myeloproliferative neoplasms. Blood 132, 2695–2699 (2018).
- 45. Pace, M. et al. Myeloid sarcoma of the breast as blast phase of JAK2-mutated (Val617Phe exon 14p) essential thrombocythemia: A case report and a systematic literature review. Pathobiology 90, 123–130 (2023).
- 46. Patchell, D., Keohane, C., O’Shea, S. & Langabeer, S. E. Incidence and impact of non-canonical JAK2 p.(Val617Phe) mutations in myeloproliferative neoplasm molecular diagnostics. J. Clin. Pathol. jcp-2023-209276 (2024).
- 47. Choi, D. C. et al. JAK2V617F impairs lymphoid differentiation in myeloproliferative neoplasms. Leukemia 10.1038/s41375-024-02388-3 (2024).
- 48. Liosi, M. E. et al. Selective Janus kinase 2 (JAK2) pseudokinase ligands with a diaminotriazole core. J. Med. Chem. 63, 5324–5340 (2020).
- 49. Veitia, R. A. & Innan, H. Pathogenic germline variants associated with myeloproliferative disorders in apparently normal individuals: Inherited or acquired genetic alterations? Clin. Genet. 101, 371–374 (2022).
- 50. Dusa, A. et al. Substitution of pseudokinase domain residue val-617 by large non-polar amino acids causes activation of JAK2. J. Biol. Chem. 283, 12941–12948 (2008).
- 51. Brooks, S. A. et al. JAK2V617I results in cytokine hypersensitivity without causing an overt myeloproliferative disorder in a mouse transduction-transplantation model. Exp. Hematol. 44, 24–29 (2016).
- 52. Elli, E. M. et al. Idiopathic erythrocytosis: A germline disease? Clin. Exp. Med. 24, (2024).
- 53. Bouchekioua, A. et al. JAK3 deregulation by activating mutations confers invasive growth advantage in extranodal nasal-type natural killer cell lymphoma. Leukemia 28, 338–348 (2014).
- 54. Xu, L. et al. Potential pitfalls of mass spectrometry to uncover mutations in childhood soft tissue sarcoma: A report from the Children’s Oncology Group. Sci. Rep. 6, (2016).
- 55. Ehrentraut, S. et al. Th17 cytokine differentiation and loss of plasticity after SOCS1 inactivation in a cutaneous T-cell lymphoma. Oncotarget 7, 34201–34216 (2016).
- 56. de Martino, M. et al. JAK3 in clear cell renal cell carcinoma: Mutational screening and clinical implications. Urol. Oncol. 31, 930–937 (2013).
- 57. Alghamdi, K. Delayed diagnosis of a hyper functioning parathyroid cyst. A case report and genetic analysis. Acta Endocrinol. (Buchar). 12, 215–218 (2016).
- 58. Riera, L. et al. Description of a novel Janus kinase 3 P132A mutation in acute megakaryoblastic leukemia and demonstration of previously reported Janus kinase 3 mutations in normal subjects. Leuk. Lymphoma. 52, 1742–1750 (2011).
- 59. Walters, D. K. et al. Activating alleles of JAK3 in acute megakaryoblastic leukemia. Cancer Cell. 10, 65–75 (2006).
- 60. Guerrero-Preston, R. et al. JAK3 variant, immune signatures, DNA methylation, and social determinants linked to survival racial disparities in head and neck cancer patients. Cancer Prev. Res. (Phila). 12, 255–270 (2019).
- 61. González-González, R. et al. Current concepts in ameloblastoma-targeted therapies in B-raf proto-oncogene serine/threonine kinase V600E mutation: Systematic review. World J. Clin. Oncol. 11, 31–42 (2020).
- 62. Lin, M. et al. JAK2 p.G571S in B-cell precursor acute lymphoblastic leukemia: A synergizing germline susceptibility. Leukemia 33, 2331–2335 (2019).
- 63. Panovska-Stavridis, I. et al. Essential thrombocythemia associated with germline JAK2 G571S variant and somatic CALR type 1 mutation. Clin. Lymphoma Myeloma Leuk. 16, e55–e57 (2016).
- 64. Bahar, B., Barton, K. & Kini, A. R. The role of the exon 13 G571S JAK2 mutation in myeloproliferative neoplasms. Leuk. Res. Rep. 6, 27–28 (2016).
- 65. Alghasham, N., Alnouri, Y., Abalkhail, H. & Khalil, S. Detection of mutations in JAK2 exons 12–15 by Sanger sequencing. Int. J. Lab. Hematol. 38, 34–41 (2016).
- 66. Puli’uvea, C. et al. Insights into the role of JAK2-I724T variant in myeloproliferative neoplasms from a unique cohort of New Zealand patients. Hematology 29, (2024).
- 67. Oliveira, Costa, A., Barreira, A., Cunha, M. & Salvador, F. Polycythemia and JAK2 variant N1108S: Cause-and-effect or coincidence? Hematol. Transfus. Cell. Ther. 10.1016/j.htct.2023.01.006 (2023).
- 68. Benton, C. B. et al. Janus kinase 2 variants associated with the transformation of myeloproliferative neoplasms into acute myeloid leukemia. Cancer 125, 1855–1866 (2019).
- 69. Sic, H. et al. An activating Janus kinase-3 mutation is associated with cytotoxic T lymphocyte antigen-4-dependent immune dysregulation syndrome. Front. Immunol. 8, (2017).
- 70. Woess, K. et al. Oncogenic TYK2 P760L kinase is effectively targeted by combinatorial TYK2, mTOR and CDK4/6 kinase blockade. Haematologica 108, 993–1005 (2022).
- 71. Waanders, E. et al. Germline activating TYK2 mutations in pediatric patients with two primary acute lymphoblastic leukemia occurrences. Leukemia 31, 821–828 (2017).
- 72. Aavikko, M. et al. Whole-genome sequencing identifies STAT4 as a putative susceptibility gene in classic Kaposi sarcoma. J. Infect. Dis. 211, 1842–1851 (2015).
- 73. Schulze, S. et al. Concomitant and noncanonical JAK2 and MPL mutations in JAK2V617F- and MPLW515 L‐positive myelofibrosis. Genes Chromosomes Cancer 58, 747–755 (2019).
- 74. Ilinca, A. et al. Whole-exome sequencing in 22 young ischemic stroke patients with familial clustering of stroke. Stroke 51, 1056–1063 (2020).
- 75. Kapralova, K. et al. Cooperation of germ line JAK2 mutations E846D and R1063H in hereditary erythrocytosis with megakaryocytic atypia. Blood 128, 1418–1423 (2016).
- 76. Kim, D. et al. STAT3 activation in large granular lymphocyte leukemia is associated with cytokine signaling and DNA hypermethylation. Leukemia 35, 3430–3443 (2021).
- 77. Ramsey, M. C. et al. Case Report: Identification of a novel STAT3 mutation in EBV-positive inflammatory follicular dendritic cell sarcoma. Front. Oncol. 13, (2023).
- 78. Kristensen, T. et al. Clinical relevance of sensitive and quantitative STAT3 mutation analysis using next-generation sequencing in T-cell large granular lymphocytic leukemia. J. Mol. Diagn. 16, 382–392 (2014).
- 79. Koskela, H. L. M. et al. Somatic STAT3 mutations in large granular lymphocytic leukemia. N Engl. J. Med. 366, 1905–1913 (2012).
- 80. Shen, M. A case report of T-LGL leukemia-associated pure red cell aplasia harboring STAT3, TNFAIP3, and KMT2D mutation. Transl Cancer Res. 12, 1054–1059 (2023).
- 81. Cheon, H. et al. Genomic landscape of TCRαβ and TCRγδ T-large granular lymphocyte leukemia. Blood 139, 3058–3072 (2022).
- 82. Olson, K. C. et al. Large granular lymphocyte leukemia serum and corresponding hematological parameters reveal unique cytokine and sphingolipid biomarkers and associations with STAT3 mutations. Cancer Med. 9, 6533–6549 (2020).
- 83. Rivero, A. et al. Clinicobiological characteristics and outcomes of patients with T-cell large granular lymphocytic leukemia and chronic lymphoproliferative disorder of natural killer cells from a single institution. Cancers (Basel). 13, 3900 (2021).
- 84. Tanahashi, T. et al. Cell size variations of large granular lymphocyte leukemia: Implication of a small cell subtype of granular lymphocyte leukemia with STAT3 mutations. Leuk. Res. 45, 8–13 (2016).
- 85. Yin, C. C. et al. STAT5B mutations in myeloid neoplasms differ by disease subtypes but characterize a subset of chronic myeloid neoplasms with eosinophilia and/or basophilia. Haematologica 109, (2023).
- 86. Hu, Z. et al. T-cell prolymphocytic leukemia with t(X;14)(q28;Q11.2): A clinicopathologic study of 15 cases. Am. J. Clin. Pathol. 159, 325–336 (2023).
- 87. Freiche, V. et al. on kieslinger A recurrent STAT5BN642H driver mutation in feline alimentary T cell lymphoma. Cancers. 13, 5238. Cancers (Basel) 14, 4593 (2022). (2021).
- 88. Tomoyasu, C. et al. Copy number abnormality of acute lymphoblastic leukemia cell lines based on their genetic subtypes. Int. J. Hematol. 108, 312–318 (2018).
- 89. Skoczen, S. et al. Genetic signature of acute lymphoblastic leukemia and netherton syndrome co-incidence—first report in the literature. Front. Oncol. 9, (2020).
- 90. Roncero, A. M. et al. Contribution of JAK2 mutations to T-cell lymphoblastic lymphoma development. Leukemia 30, 94–103 (2016).
- 91. Hassan, N. M. et al. Prognostic significance of CRLF2 overexpression and JAK2 mutation in Egyptian pediatric patients with B-precursor acute lymphoblastic leukemia. Clin. Lymphoma Myeloma Leuk. 22, e376–e385 (2022).
- 92. Carreño-Tarragona, G. et al. A typical acute lymphoblastic leukemia JAK2 variant, R683G, causes an aggressive form of familial thrombocytosis when germline. Leukemia 35, 3295–3298 (2021).
- 93. Downregulation of chloride channel ClC-2 by Janus kinase 3. J. Membr. Biol. 247, 387–393 (2014).
- 94. Warsi, J. et al. Upregulation of excitatory amino acid transporters by coexpression of Janus kinase 3. J. Membr. Biol. 247, 713–720 (2014).
- 95. Rivera-Munoz, P. et al. Partial trisomy 21 contributes to T-cell malignancies induced by JAK3-activating mutations in murine models. Blood Adv. 2, 1616–1627 (2018).
- 96. Basheer, F., Bulleeraz, V., Ngo, V. Q. T., Liongue, C. & Ward, A. C. In vivo impact of JAK3 A573V mutation revealed using zebrafish. Cell. Mol. Life Sci. 79, (2022).
- 97. Agarwal, A. et al. Functional RNAi screen targeting cytokine and growth factor receptors reveals oncorequisite role for interleukin-2 gamma receptor in JAK3-mutation-positive leukemia. Oncogene 34, 2991–2999 (2015).
- 98. Sim, S. H. et al. Novel JAK3-activating mutations in extranodal NK/T-cell lymphoma, nasal type. Am. J. Pathol. 187, 980–986 (2017).
- 99. Koo, G. C. et al. Janus kinase 3–activating mutations identified in natural killer/T-cell lymphoma. Cancer Discov. 2, 591–597 (2012).
- 100. Steven Martinez, G., Ross, A., Kirken, A. & J. & Transforming mutations of Jak3 (A573V and M511I) show differential sensitivity to selective Jak3 inhibitors. Clin. Cancer Drugs. 3, 131–137 (2016).
- 101. Bergmann, A. K. et al. Recurrent mutation of JAK3 in T-cell prolymphocytic leukemia. Genes Chromosomes Cancer. 53, 309–316 (2014).
- 102. Sato, T. et al. Functional analysis of JAK3 mutations in transient myeloproliferative disorder and acute megakaryoblastic leukaemia accompanying Down syndrome. Br. J. Haematol. 141, 681–688 (2008).
- 103. Witalisz-Siepracka, A. et al. Loss of JAK1 drives innate immune deficiency. Front. Immunol. 9, 3108 (2018).
- 104. Gonneaud, A., Turgeon, N., Boisvert, F. M., Boudreau, F. & Asselin, C. JAK-STAT pathway inhibition partially restores intestinal homeostasis in Hdac1- and Hdac2-intestinal epithelial cell-deficient mice. Cells 10, 224 (2021).
- 105. Viganò, E. et al. Somatic IL4R mutations in primary mediastinal large B-cell lymphoma lead to constitutive JAK-STAT signaling activation. Blood 131, 2036–2046 (2018).
- 106. Drennan, A. C. & Rui, L. HiJAKing the epigenome in leukemia and lymphoma. Leuk. Lymphoma. 58, 2540–2547 (2017).
- 107. Tang, S. et al. Association analyses of the JAK/STAT signaling pathway with the progression and prognosis of colon cancer. Oncol. Lett. 10.3892/ol.2018.9569 (2018).
- 108. Shah, A. A. et al. Development of a disease registry for autoimmune bullous diseases: Initial analysis of the pemphigus vulgaris subset. Acta Derm Venereol. 95, 86–90 (2015).
- 109. Yang, Y., Huang, G., Yan, X. & Qing, Z. Clinical analysis of thyroglobulin antibody and thyroid peroxidase antibody and their association with vitiligo. Indian J. Dermatol. 59, 357–360 (2014).
- 110. Sinha, S. et al. IL-13-mediated gender difference in susceptibility to autoimmune encephalomyelitis. J. Immunol. 180, 2679–2685 (2008).
- 111. Khoury, J. D. et al. The 5th edition of the World Health Organization classification of Haematolymphoid Tumours: Myeloid and histiocytic/dendritic neoplasms. Leukemia 36, 1703–1719 (2022).
- 112. Manolio, T. A. et al. Finding the missing heritability of complex diseases. Nature 461, 747–753 (2009).
- 113. Luque Paz, D., Kralovics, R. & Skoda, R. C. Genetic basis and molecular profiling in myeloproliferative neoplasms. Blood 141, 1909–1921 (2023).
- 114. Lehmann, B., Mackintosh, M., McVean, G. & Holmes, C. Optimal strategies for learning multi-ancestry polygenic scores vary across traits. Nat. Commun. 14, 4023 (2023).
- 115. Troubat, L., Fettahoglu, D., Henches, L., Aschard, H. & Julienne, H. Multi-trait GWAS for diverse ancestries: Mapping the knowledge gap. BMC Genom. 25, 375 (2024).
- 116. Liu, Y. et al. JAK/STAT signaling in diabetic kidney disease. Front. Cell. Dev. Biol. 11, 1233259 (2023).
- 117. Gurzov, E. N., Stanley, W. J., Pappas, E. G., Thomas, H. E. & Gough, D. J. The JAK/STAT pathway in obesity and diabetes. FEBS J. 283, 3002–3015 (2016).
- 118. Blumenthal, D. B., Baumbach, J., Hoffmann, M., Kacprowski, T. & List, M. A framework for modeling epistatic interaction. Bioinformatics 10.1093/bioinformatics/btaa990 (2020).
- 119. Hernández-Lorenzo, L. et al. On the limits of graph neural networks for the early diagnosis of Alzheimer’s disease. Sci. Rep. 12, 17632 (2022).
- 120. Hoffmann, M. et al. Network medicine-based epistasis detection in complex diseases: Ready for quantum computing. Nucleic Acids Res. 10.1093/nar/gkae697 (2024).
- 121. Marcus, M. W. et al. Incorporating epistasis interaction of genetic susceptibility single nucleotide polymorphisms in a lung cancer risk prediction model. Int. J. Oncol. 49, 361–370 (2016).
- 122. Rocha, J. et al. Identification of driver epistatic gene pairs combining germline and somatic mutations in cancer. Int. J. Mol. Sci. 24, 9323 (2023).
- 123. Moore, J. H. & Williams, S. M. Epistasis and its implications for personal genetics. Am. J. Hum. Genet. 85, 309–320 (2009).