Abstract
Background
Genetic factors contributing to sex-associated dimorphic brain development may also underlie gender identity–related anxiety disorders.
Aim
To establish a high-throughput whole-exome sequencing (WES) and bioinformatics pipeline for identifying rare variants in sex-dimorphic neural pathways and explore their association with gender identity–related anxiety.
Methods
Peripheral genomic DNA was collected from 23 patients (13 assigned male at birth [AMAB], 10 assigned female at birth [AFAB]) presenting with gender identity–related anxiety at Shanghai Mental Health Center between March 2020 and February 2022. WES libraries were prepared and sequenced to an average depth of 100×. Raw reads underwent stringent quality control, alignment, variant calling, and annotation against public databases (gnomAD, ClinVar). Rare (minor allele frequency [MAF] < 1%) high-confidence variants were filtered to focus on exonic, splice-site, and insertion and deletion (indel) events. Candidate genes were subjected to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment and Gene Ontology (GO) analyses to identify overrepresented neural development pathways, with particular emphasis on estrogen receptor–mediated signaling.
Outcome
A total of 479 rare, potentially pathogenic variants across 19 estrogen receptor–mediated neurodevelopmental genes were identified for further validation.
Results
After quality control, 266 265 high-confidence variants were retained: 217 757 single nucleotide variants (26.26% exonic, of which 49.76% were nonsynonymous) and 48 508 indels (4.09% exonic, 0.28% splice-site). Prioritized calls included 225 missense, 27 nonsense, and 43 frameshift variants. KEGG analysis highlighted significant enrichment in axon guidance signaling, while GO terms pointed to neuronal projection and synaptic assembly. Nineteen genes within the estrogen receptor pathway harbored rare deleterious variants, suggesting disruptions in sex hormone–driven neural differentiation.
Clinical Translation
This WES-based framework enables the identification of novel candidate loci for diagnostic panels and may inform personalized interventions for gender identity–related anxiety.
Strengths and Limitations
This study leveraged high-depth whole-exome sequencing, stringent bioinformatics filtering, and a pathway-focused approach to pinpoint rare variants in sex-dimorphic neurodevelopmental genes; however, its small sample size, lack of functional validation, potential population stratification bias, and cross-sectional design limit causal inference.
Conclusion
Our integrated WES and bioinformatics pipeline uncovers rare variants in estrogen receptor–mediated neurodevelopmental genes, providing new insights into the genetic architecture of sex-dimorphic brain development and its role in gender identity–related anxiety.
Study registration
The study was registered at the ISRCTN Registry (no. 18336816; https://www.isrctn.com/ISRCTN18336816) as an observational study.
Keywords: transgender individuals; whole exome sequencing; susceptibility loci; estrogen receptor; gene mutation
Introduction
Gender incongruence (GI) is defined as a marked and persistent incongruence between an individual’s experienced gender and their assigned sex at birth. The term was introduced in the International Classification of Diseases, 11th Revision (ICD-11), where it replaces earlier diagnostic labels such as “gender dysphoria” and is no longer categorized as a mental disorder.1-3 Instead, it is included in the chapter on conditions related to sexual health. Importantly, the minority-stress model, which attributes higher rates of depression, anxiety, and suicidal ideation among gender-diverse populations to chronic social stigma, prejudice, and discrimination, provides a critical lens for understanding the psychosocial challenges these individuals face.4
Terminology in this field is complex and varies across cultures and disciplines. Terms such as “transgender,” “nonbinary,” “gender nonconforming,” and “gender diverse” are often used to describe individuals whose gender identity or expression differs from societal expectations associated with their assigned sex at birth.5,6 It is important to distinguish between biological sex, which refers to physical and genetic characteristics, and gender, which encompasses the roles, behaviors, and identities that societies attribute to individuals.
The experiences of transgender and gender-diverse individuals are diverse and multifaceted. Some individuals identify within the traditional binary framework (male or female), while others identify outside of this binary, embracing identities such as nonbinary or gender-fluid. These identities are valid and reflect the rich diversity of human gender experiences.7,8 Research indicates that transgender and gender-diverse individuals often face significant psychosocial challenges, including higher rates of depression, anxiety, and suicidal ideation compared to the general population. For instance, a study focusing on transgender and gender-diverse youth in Australia reported that 74.6% had been diagnosed with depression and 72.2% with anxiety, 82.4% had experienced suicidal thoughts, and 48.1% had attempted suicide.9 Similarly, a study involving hospitalized transgender adolescents in the United States found that 91% had mood disorders, 65% had anxiety disorders, and 52.4% reported suicidal ideation, compared to 39.2% in their cisgender peers.10 These findings underscore the urgent need for targeted mental health support and interventions for transgender and gender-diverse populations.
Emerging research suggests that genetic, hormonal, and neurological factors may contribute to the development of gender identity, although the precise biological mechanisms remain incompletely understood. Whole-exome sequencing (WES) studies have identified rare variants in genes associated with estrogen receptor signaling pathways, which are implicated in the sexual differentiation of the brain. WES of 30 transgender individuals (13 trans men, 17 trans women) identified 21 rare variants in 19 genes linked to estrogen receptor pathways involved in brain sexual differentiation, suggesting a potential genetic basis for gender identity.11 Additionally, polymorphisms in the estrogen receptor α gene (ESR1) have been associated with female-to-male gender incongruence, suggesting a genetic component to gender identity development.12 Furthermore, epigenetic mechanisms involving estrogen receptor α (ERα) have been shown to play a role in brain sexual differentiation, indicating a complex interplay of genetic and hormonal factors.13 However, these findings are preliminary, and further research is necessary to elucidate the multifaceted biological underpinnings of gender identity.
In this study, we aim to explore potential genetic contributions to gender identity by performing WES on genomic DNA from 23 individuals experiencing gender incongruence. Our goal is to identify rare genetic variants in pathways related to brain sexual differentiation, contributing to a more comprehensive understanding of the biological aspects of gender identity.
Methods
Study design and participants
This study, conducted internationally, aimed to identify rare genetic variants associated with gender incongruence by WES. A total of 23 individuals (13 assigned male at birth and 10 assigned female at birth), all of whom self-identified as transgender and had been diagnosed with gender incongruence, were recruited from the Shanghai Mental Health Center between March 2020 and February 2022. The inclusion criteria were an age of 18 years or older and fulfillment of the diagnostic criteria for gender incongruence as outlined in the ICD-10. The exclusion criteria were known chromosomal abnormalities or endocrine disorders. All participants provided written informed consent, and the study was approved by the Ethics Committee of Shanghai Mental Health Center, Shanghai, China.
DNA extraction and quality assessment
Peripheral blood samples were collected from each participant, and genomic DNA was extracted using the QIAamp DNA Blood Mini Kit (Qiagen, Hilden, Germany) following the manufacturer’s protocol. The concentration and purity of the extracted DNA were assessed using the NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, United States), and integrity was evaluated by agarose gel electrophoresis.
Whole-exome sequencing
Whole-exome libraries were prepared using the SureSelect Human All Exon V6 kit (Agilent Technologies, Santa Clara, CA, United States). Sequencing was performed on the Illumina NovaSeq 6000 platform (Illumina, San Diego, CA, United States) to generate 150 bp paired-end reads, aiming for a minimum mean coverage depth of 100× across the targeted exonic regions.
Bioinformatics analysis
Raw sequencing data were processed through a standardized bioinformatics pipeline. First, FastQC (v0.11.9; Babraham Bioinformatics, Cambridge, United Kingdom) was used to assess the quality of raw reads. High-quality reads were then aligned to the human reference genome (GRCh38/hg38) using BWA-MEM (v0.7.17; Wellcome Trust Sanger Institute, Cambridge, United Kingdom). Variant calling was conducted using GATK HaplotypeCaller (v4.1.9.0; Broad Institute, Cambridge, MA, United States) to identify single nucleotide variants (SNVs) and insertion and deletion (indel) variants. Identified variants were annotated using ANNOVAR (2018Apr16; Wang Genomics Lab, University of Pennsylvania, Philadelphia, PA, United States), with annotation sourced from multiple public variant databases, including ExAC, the 1000 Genomes Project, and ESP6500.
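The processing steps above can be sketched as a list of command lines. This is a minimal illustration under stated assumptions, not the authors' actual scripts: the file names, thread count, read-group string, and the use of samtools for sorting and indexing are hypothetical (samtools is not named in the text), and steps such as duplicate marking are omitted because the text does not describe them. The function only builds the commands and does not run them.

```python
from pathlib import Path

def build_commands(sample, r1, r2, ref="GRCh38.fa", outdir="out", threads=8):
    """Return the per-sample command lines as argument lists (nothing is executed)."""
    out = Path(outdir)
    bam = str(out / f"{sample}.sorted.bam")
    vcf = str(out / f"{sample}.vcf.gz")
    # GATK needs read-group information; this ID/SM/PL string is a placeholder.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    return [
        # 1. Read-level quality reports
        ["fastqc", "-o", str(out), r1, r2],
        # 2. Paired-end alignment (SAM written to stdout)
        ["bwa", "mem", "-t", str(threads), "-R", read_group, ref, r1, r2],
        # 3. Coordinate-sort the alignment read from stdin (assumed helper tool)
        ["samtools", "sort", "-o", bam, "-"],
        ["samtools", "index", bam],
        # 4. SNV and indel calling
        ["gatk", "HaplotypeCaller", "-R", ref, "-I", bam, "-O", vcf],
    ]

for cmd in build_commands("S01", "S01_R1.fastq.gz", "S01_R2.fastq.gz"):
    print(" ".join(cmd))
```

In practice the second and third commands would be connected by a pipe, and the resulting VCF would then be passed to the annotation step.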
Variant filtering
To identify potentially pathogenic variants, we applied a multistep filtering process. Variants with a minor allele frequency (MAF) of less than 0.01 in ExAC, 1000 Genomes, and ESP6500 databases were retained. Only variants classified as pathogenic or likely pathogenic based on the American College of Medical Genetics and Genomics (ACMG) guidelines were considered. Furthermore, we prioritized variants with predicted high functional impact, including frameshift mutations, nonsense mutations, canonical splice-site changes, and missense variants with a Combined Annotation Dependent Depletion (CADD) score of 20 or greater (Figure 1). Variants found within segmental duplications or other high-homology genomic regions were excluded to reduce the likelihood of false-positive findings.14
Figure 1.
Workflow for filtering and functionally annotating whole-exome sequencing variants. Starting from 266 265 raw variants detected in 23 samples, three sequential filters were applied: (1) minor allele frequency < 1% in ExAC, 1000 Genomes, and ESP6500si v2; (2) ACMG class 3/4 variants (including frameshift, splicing, nonsynonymous SNVs, and CADD score ≥ 20); and (3) exclusion of variants in genomicSuperDups. This reduced the set to 1741 candidate variants. Variants absent from dbSNP147/138 (n = 479) were then categorized by type (345 nonsynonymous SNVs, 10 splicing, and 124 frameshifts) and further prioritized by in silico protein function prediction tools (Sorting Intolerant From Tolerant [SIFT], Polymorphism Phenotyping v2 [PolyPhen-2], Functional Analysis through Hidden Markov Models [FATHMM], and Likelihood Ratio Test [LRT]) and by evolutionary conservation scores (Genomic Evolutionary Rate Profiling++ [GERP++], phyloP, and phastCons), yielding 27 high-confidence nonsynonymous SNVs. Meanwhile, splicing and frameshift events were tallied (7 splicing variants, 43 deletions, and 28 insertions). Finally, KEGG and GO enrichment on the 1741 filtered variants highlighted pathways related to estrogen signaling and neuronal development.
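The frequency, functional-impact, and segmental-duplication filters described above can be illustrated with a short sketch. It assumes each annotated variant has already been parsed into a record; the field names, the consequence labels, and the toy records (GENE_A to GENE_D) are hypothetical and are not taken from the study data. The ACMG classification step is not modeled here.

```python
from dataclasses import dataclass
from typing import Optional

MAF_CUTOFF = 0.01    # rare: MAF < 1% in every database
CADD_CUTOFF = 20.0   # missense variants need CADD >= 20
HIGH_IMPACT = {"frameshift", "nonsense", "splice_site"}

@dataclass
class Variant:
    gene: str
    consequence: str                 # hypothetical labels, e.g. "missense"
    maf_exac: Optional[float]        # None means absent from the database
    maf_1000g: Optional[float]
    maf_esp6500: Optional[float]
    cadd_phred: Optional[float]
    in_segdup: bool                  # overlaps a segmental duplication

def is_rare(v: Variant) -> bool:
    freqs = (v.maf_exac, v.maf_1000g, v.maf_esp6500)
    return all(f is None or f < MAF_CUTOFF for f in freqs)

def is_high_impact(v: Variant) -> bool:
    if v.consequence in HIGH_IMPACT:
        return True
    return (v.consequence == "missense"
            and v.cadd_phred is not None
            and v.cadd_phred >= CADD_CUTOFF)

def passes_filters(v: Variant) -> bool:
    return is_rare(v) and is_high_impact(v) and not v.in_segdup

toy = [
    Variant("GENE_A", "missense", 0.002, None, 0.001, 25.1, False),  # passes
    Variant("GENE_B", "missense", 0.002, None, 0.001, 12.0, False),  # CADD too low
    Variant("GENE_C", "frameshift", 0.05, 0.04, None, None, False),  # too common
    Variant("GENE_D", "nonsense", None, None, None, None, True),     # in segdup
]
kept = [v.gene for v in toy if passes_filters(v)]
print(kept)
```

Treating a missing database entry as rare is a design choice in this sketch; a stricter pipeline might handle unobserved variants separately.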
Functional enrichment analysis
Genes harboring filtered, rare, and potentially deleterious variants were subjected to functional enrichment analysis. Gene Ontology (GO) analysis was performed using the DAVID Bioinformatics Resources (v6.8; National Institute of Allergy and Infectious Diseases, Bethesda, MD, United States) to explore enriched biological processes, molecular functions, and cellular components. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment was analyzed using the KEGG Mapper tool (Kyoto University Bioinformatics Center, Kyoto, Japan) to identify affected biological pathways. Empirical P values were calculated by comparing observed enrichment scores with distributions derived from 1000 randomly selected gene sets, matched for gene length and GC content, to control for potential biases.
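An empirical P value of this kind can be sketched as follows. The estimator shown, with one added to both numerator and denominator, is a common convention and an assumption here, since the text does not state the exact formula. The enrichment score (a simple overlap count), the gene identifiers, and the unmatched random sampling are also illustrative; matching on gene length and GC content would restrict each random draw to genes from the same length and GC bin as the gene it replaces.

```python
import random

def empirical_p(observed, null_scores):
    """Share of null scores at least as large as the observed score (add-one corrected)."""
    exceed = sum(1 for s in null_scores if s >= observed)
    return (exceed + 1) / (len(null_scores) + 1)

def overlap(gene_set, pathway):
    """Toy enrichment score: number of genes shared with the pathway."""
    return len(set(gene_set) & set(pathway))

def permutation_p(candidates, pathway, background, n_sets=1000, seed=0):
    """Compare the observed score with scores from random gene sets of equal size."""
    rng = random.Random(seed)
    observed = overlap(candidates, pathway)
    null = [overlap(rng.sample(background, len(candidates)), pathway)
            for _ in range(n_sets)]
    return observed, empirical_p(observed, null)

# Hypothetical identifiers: 100 background genes, 10 in the pathway,
# and 5 candidate genes that all fall inside the pathway.
background = [f"g{i}" for i in range(100)]
pathway = background[:10]
candidates = background[:5]
observed, p = permutation_p(candidates, pathway, background)
print(observed, p)
```

With 1000 random sets, the smallest attainable value under this estimator is 1/1001, so the resolution of the P value is bounded by the number of random sets.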
Results
This study enrolled 23 transgender individuals, comprising 13 transgender males (individuals assigned female at birth who transitioned to male) and 10 transgender females (individuals assigned male at birth who transitioned to female). WES achieved 98.36% coverage at an average read depth of 75×, revealing 266 265 genetic variants. Of these, 217 757 were SNVs, of which 26.26% were exonic; 49.76% of the exonic SNVs were nonsynonymous. The remaining 48 508 were indels, of which 4.09% were exonic and 0.28% were in splicing regions; among exonic indels, 13.77% were frameshift insertions and 19.40% were frameshift deletions (Figures 2 and 3). After screening, 1741 variants were retained for further investigation.
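The reported counts can be cross-checked with simple arithmetic. The sketch below confirms that the SNV and indel counts sum to the total and converts the reported percentages into approximate counts. It assumes that the exonic percentages are fractions of all SNVs or all indels and that the nonsynonymous percentage is a fraction of exonic SNVs; these denominators are an interpretation, so the derived counts are estimates only.

```python
total_variants = 266_265
snvs = 217_757
indels = 48_508

# The two variant classes should account for every call.
assert snvs + indels == total_variants

# Approximate counts implied by the reported percentages (assumed denominators).
exonic_snvs = round(snvs * 0.2626)
nonsynonymous_snvs = round(exonic_snvs * 0.4976)
exonic_indels = round(indels * 0.0409)

print(exonic_snvs, nonsynonymous_snvs, exonic_indels)
```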
Figure 2.
Mutation function distribution map. Distribution of target single nucleotide variants (SNVs) and insertions/deletions (indels) across genomic regions and their associated functions. The top-left pie chart shows the genomic distribution of target SNVs, highlighting their prevalence in coding regions. The top-right pie chart details the functional classification of exonic SNVs, emphasizing nonsynonymous variants as the most common. The bottom-left pie chart presents the genomic distribution of target indels, demonstrating their occurrence in coding and noncoding regions. Lastly, the bottom-right pie chart breaks down the functional types of exonic indels, with a focus on frameshift mutations. Each chart employs a distinct color scheme to enhance clarity and is accompanied by a legend for straightforward interpretation.
Figure 3.
Circles and distribution of filtered genes from whole exome sequencing. The outer ring displays the density of single nucleotide polymorphisms (SNPs), with higher-density regions indicated by more intense shading. The middle ring shows transitions and transversions, providing insight into the types of genetic variations observed. The inner ring highlights the total number of insertions and deletions (indels), with insertions and deletions represented by distinct visual patterns. The accompanying legend on the right details the total counts of SNPs and indels, as well as specific transition and transversion types. This comprehensive view aids in understanding the genomic landscape of the filtered genes.
KEGG and GO enrichment analyses were conducted on the identified variants, followed by further filtering to retain 479 variants not present in dbSNP. Subsequent protein function analysis indicated that these variants are related to estrogen and estrogen receptor signaling and to neuronal development (Figure 4).
Figure 4.

KEGG and GO pathway enrichment analysis of mutant genes affecting susceptibility, and functional enrichment analysis of genes with rare variants from whole exome sequencing. In the KEGG panel, the “axon guidance” pathway is emphasized due to its critical role in neural development. The BP panel highlights “neuron development” and “cell morphogenesis involved in neuron differentiation,” underscoring their significance in neuron specialization. The CC panel identifies “neuron projection” as vital for neuron structure and function. Lastly, the MF panel includes “metallopeptidase activity” and “cytoskeletal protein binding,” which are essential for various biological processes. Each panel is displayed as a bubble plot, where bubble size reflects the number of genes and enrichment significance is represented by gradient shading or position along the significance axis (−log10(P value)). Abbreviations: BP, biological process; CC, cellular component; MF, molecular function.
Nonsense variants
After filtering, the 27 nonsense variants initially identified by WES were retained (Table 1). Each variant was heterozygous and found in a single subject. Evolutionary conservation analysis showed that all variants lay in conserved regions; the ATP7A variant is located on the X chromosome. Furthermore, the variants in AGRN, KLKB1, ABCA1, KCNJ14, and ARFRP1 were located at transcription factor binding sites.
Table 1.
Nonsense variants called through whole exome sequencing.
| Gene.refGene | Chrom | Pos | Ref | Alts | Alt | Het (n) | Hom (n) |
|---|---|---|---|---|---|---|---|
| AGRN | chr1 | 979016 | G | A | A | 1 | 0 |
| CYP1B1 | chr2 | 38298404 | C | T | T | 1 | 0 |
| CNTNAP5 | chr2 | 124979332 | A | G | G | 1 | 0 |
| ITPR1 | chr3 | 4730250 | C | A | A | 1 | 0 |
| KLKB1# | chr4 | 187179251 | G | T | T | 1 | 0 |
| CHRNA9 | chr4 | 40351422 | C | T | T | 1 | 0 |
| SLC6A7 | chr5 | 149582254 | G | C | C | 1 | 0 |
| ARSI | chr5 | 149681654 | G | A | A | 1 | 0 |
| OPRM1 | chr6 | 154412447 | C | G | G | 1 | 0 |
| WASHC5 | chr8 | 126093990 | T | C | C | 1 | 0 |
| CTHRC1 | chr8 | 104387987 | G | A | A | 1 | 0 |
| ABCA1 | chr9 | 107568541 | C | T | T | 1 | 0 |
| NOTCH1 | chr9 | 139412284 | T | C | C | 1 | 0 |
| TSPAN14 | chr10 | 82267123 | T | C | C | 1 | 0 |
| EXT2 | chr11 | 44148477 | C | G | G | 1 | 0 |
| SLC10A2 | chr13 | 103703646 | C | A | A | 1 | 0 |
| POMT2 | chr14 | 77746787 | G | A | A | 1 | 0 |
| PRKD1 | chr14 | 30135343 | A | G | G | 1 | 0 |
| SLC7A7 | chr14 | 23243690 | A | G | G | 1 | 0 |
| ATP10A | chr15 | 25932897 | A | G | G | 1 | 0 |
| GALK2 | chr15 | 49493405 | C | A | A | 1 | 0 |
| PHKB | chr16 | 47545585 | T | A | A | 1 | 0 |
| EML2 | chr19 | 46119077 | T | A | A | 1 | 0 |
| KCNJ14^ | chr19 | 48965151 | G | A | A | 1 | 0 |
| ARFRP1 | chr20 | 62338359 | C | G | G | 1 | 0 |
| NOL12 | chr22 | 38086786 | C | T | T | 1 | 0 |
| ATP7A& | chrX | 77287057 | G | A | A | 1 | 0 |
Note: Gene.refGene, name of the gene containing the variant site according to the RefSeq Gene database; Chrom, chromosome ID; Pos, position on the chromosome, starting from 1; Ref, reference base; Alts, all bases called by the software that differ from the reference base (when there is more than one, they are separated by commas and displayed in multiple rows, with the Alt in each row being one of the Alts bases); Alt, the nonreference base for that row, usually identical to the Alts column (when there are multiple Alts, Alt is the most plausible of them); Het (n), number of heterozygous calls; Hom (n), number of homozygous calls.
Frameshift variants
Among the frameshift variants identified by WES, 43 frameshift deletions were retained after filtering (Table S1). The variants in AKAP12, ZBTB39, ZNF213, RPRD1A, and BLVRB were located at transcription factor binding sites. Additionally, AKAP12, CNOT2, ZNF626, and VSIG4 carried homozygous variants, with ZNF626 showing the highest number. TBP carried the highest total number of variants, and the VSIG4 variant is located on the X chromosome.
Frameshift insertions identified through WES were mapped to conserved elements based on the 46-way vertebrate alignment (Table S2). The variants in FOXO6, MAST2, KIAA1109, H1-6, ANKS1A, RERGL, MBTPS1, ABR, TEKT1, and DCAF15 were located at transcription factor binding sites, with CHST15 showing the highest number of frameshift insertions. Additionally, TNFAIP6, CHST15, CNOT2, and DCAF15 carried homozygous variants, with DCAF15 having the highest number.
Splice-region variants
Among the splice variants initially identified by WES, 7 were retained after filtering (Table 2). CFAP94, GRIN2C, and CD24 carried homozygous variants, with CFAP94 showing the highest number. The CD24 variant is located on the Y chromosome. SHANK3 had the highest phastConsElements46way score, indicating the strongest conservation.
Table 2.
Splicing variants called by whole exome sequencing.
| Gene.refGene | Chrom | Pos | Ref | Alts | Alt | Het (n) | Hom (n) |
|---|---|---|---|---|---|---|---|
| HNRNPR* | chr1 | 23640196 | C | CT | CT | 1 | 0 |
| GEMIN5* | chr5 | 154287380 | T | TACA | TACA | 1 | 0 |
| ZC3HAV1 | chr7 | 138764989 | GC | G | G | 1 | 0 |
| HMCN2* | chr9 | 133305073 | GTACGGGGACACCCACCCTCTGGCCACACCGCTGCAGCTGCCCCAGGGGTTACCGGATGCAGGGCCCCAGCCTGCCCTGCCTAGT | G | G | 1 | 0 |
| GLT6D1* | chr9 | 138530956 | TATTTAC | T | T | 1 | 0 |
| CFAP94* | chr12 | 25261759 | T | TAAAAAAAAAAAA, TAAAAAAAAAAAAAAAAAAAAA, TAAAAAAAAAAAAAAAAAAAA | TAAAAAAAAAAAAAAAAAAAA | 1 | 4 |
| GRIN2C* | chr17 | 72848752 | T | TG | TG | 0 | 1 |
| SHANK3*^ | chr22 | 51135989 | GTT | G | G | 1 | 0 |
| CD24 | chrY | 21154705 | C | CCAGGAAAGCCACAA, CCAGGAAAGCCACAATAGCCGTGACG | CCAGGAAAGCCACAA | 2 | 2 |
| CD24 | chrY | 21154705 | C | CCAGGAAAGCCACAA, CCAGGAAAGCCACAATAGCCGTGACG | CCAGGAAAGCCACAATAGCCGTGACG | 2 | 2 |
Missense variants
Variants identified by whole exome sequencing in three or more patients involved 11 genes (Table 3). Among these genes, the phastConsElements46way score was highest for CNOT2, indicating the strongest evolutionary conservation. The DCAF15 variant was located at a transcription factor binding site according to the UCSC tfbsConsSites database. TBP and CHST15 carried the highest numbers of variants (22 each), ZNF626 had the highest number of homozygous variants, and the CD24 variant is located on the Y chromosome.
Table 3.
Variants called by whole exome sequencing in three or more patients.
| Gene.refGene | Chrom | Pos | ExonicFunc.refGene | Ref | Alts | Het (n) | Hom (n) | N |
|---|---|---|---|---|---|---|---|---|
| TBP | chr6 | 170871054 | Frameshift deletion | AGCAGCAGCAGCAGCAGCAGCAG | ACAGCAGCAGCAGCAGCAGCAG, ACAGCAGCAG, A, ACAGCAG, ACAGCAGCAGCAG | 19 | 3 | 22 |
| CHST15 | chr10 | 125780752 | Frameshift insertion | CG | C, CGGG | 16 | 6 | 22 |
| TNFAIP6 | chr2 | 152236045 | Frameshift insertion | TA | TAA, T | 17 | 3 | 20 |
| CNOT2^ | chr12 | 70747693 | Frameshift deletion/insertion | TAA | T, TA, TAAA | 9 | 6 | 15 |
| DCAF15· | chr19 | 14070706 | Frameshift insertion | A | AGGTGGGCCCAGGGCGGGCAG, AGGTGTGCCCAGGGCGGGCAG | 6 | 8 | 14 |
| ANKLE1 | chr19 | 17397456 | Frameshift deletion/insertion | GGTGTGTGTGT | GGTGTGT, GGTGTGTGT, GGTGTGTGTGTGT, GGTGTGTGTGTGTGTGT, G | 12 | 0 | 12 |
| ZNF626 | chr19 | 20807133 | Frameshift deletion | GGCTTTGCCACATTCTTCACATTTGTAGAATTTCTCTCCAGTATGATTCTCTCATGTGTAGTAAGGATTGAGGACTGGTTGAAGGCTTTGCCACATTCTTCACATTTGTAGGGTCTCTCTCCAGTATGAATTTTCTTATGTGTAGTAAGGTTAGAGGAGCACTTAAAA | G | 0 | 12 | 12 |
| DENND4B | chr1 | 153907306 | Frameshift insertion | T | TGC, TGCTGCTGCTGCTGCTGCTGC, TGCTGCTGCTGC | 6 | 0 | 6 |
| RBM5 | chr3 | 50155887 | Frameshift insertion | TGA | T, TGAGAGA | 5 | 0 | 5 |
| CFAP94 | chr12 | 25261759 | Splicing | T | TAAAAAAAAAAAA, TAAAAAAAAAAAAAAAAAAAAA, TAAAAAAAAAAAAAAAAAAAA | 1 | 4 | 5 |
| CD24 | chrY | 21154705 | Splicing | C | CCAGGAAAGCCACAA, CCAGGAAAGCCACAATAGCCGTGACG | 2 | 2 | 4 |
| SERPINA1 | chr14 | 94845829 | Frameshift insertion | G | GAC | 3 | 0 | 3 |
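The selection of recurrent variants can be sketched as a simple threshold on carrier counts. The sketch assumes that the N column equals Het (n) plus Hom (n), which holds for every row of Table 3. The TBP, ZNF626, and SERPINA1 counts are taken from Table 3 and the GRIN2C counts from Table 2; the record type and function name are hypothetical.

```python
from collections import namedtuple

Call = namedtuple("Call", ["gene", "het", "hom"])

def recurrent(calls, min_carriers=3):
    """Keep genes whose variant was called in at least `min_carriers` individuals."""
    return [(c.gene, c.het + c.hom) for c in calls if c.het + c.hom >= min_carriers]

calls = [
    Call("TBP", 19, 3),
    Call("ZNF626", 0, 12),
    Call("SERPINA1", 3, 0),
    Call("GRIN2C", 0, 1),   # single carrier, below the threshold
]
print(recurrent(calls))
```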
Candidate gene variations are associated with neuronal development involving estrogen and estrogen receptor signaling.
Comparison of mutations in the estrogen and estrogen receptor signaling pathways identified 19 associated genes, including CYP1B1, CNOT2, AGRN, CHRNA9, OPRM1, CTHRC1, WASHC5, NOTCH1, PRKD1, KCNJ14, ATP7A, CCDC141, IFT88, FOXO6, ANKS1A, ABR, FBXO7, HNRNPR, and SHANK3 (Table 4). Among these, CNOT2 was found to be the most evolutionarily conserved. CNOT2 functions as a negative regulator of the estrogen receptor signaling pathway, whereas ATP7A, an X-linked gene, is implicated in neuronal functions.
Table 4.
Variants of candidate genes related to estrogen and estrogen receptor and neuron development.
| Relation | ExonicFunc.refGene | Gene.refGene | Chrom | Pos | Ref | Alts | Description |
|---|---|---|---|---|---|---|---|
| Estrogen and estrogen receptor | Nonsynonymous SNV | CYP1B1 | chr2 | 38298404 | C | T | The enzyme encoded by this gene localizes to the endoplasmic reticulum and metabolizes procarcinogens such as polycyclic aromatic hydrocarbons and 17 beta-estradiol. |
| Estrogen and estrogen receptor | Frameshift insertion | CNOT2^ | chr12 | 70747693 | TAA | T, TA, TAAA | The protein encoded by this gene is involved in the negative regulation of intracellular estrogen receptor signaling pathway. |
| Neuron development | Nonsynonymous SNV | AGRN | chr1 | 979016 | G | A | The encoded protein contains several laminin G, Kazal-type serine protease inhibitor, and epidermal growth factor domains. |
| Neuron development | Nonsynonymous SNV | CHRNA9 | chr4 | 40351422 | C | T | This protein is involved in cochlea hair cell development and is also expressed in the outer hair cells (OHCs) of the adult cochlea. |
| Neuron development | Nonsynonymous SNV | OPRM1 | chr6 | 154412447 | C | G | This gene encodes one of at least three opioid receptors in humans: the mu opioid receptor (MOR). The MOR is the principal target of endogenous opioid peptides and opioid analgesic agents such as beta-endorphin and enkephalins. |
| Neuron development | Nonsynonymous SNV | CTHRC1 | chr8 | 104387987 | G | A | This locus encodes a protein that may play a role in the cellular response to arterial injury through involvement in vascular remodeling. |
| Neuron development | Nonsynonymous SNV | WASHC5 | chr8 | 126093990 | T | C | This gene encodes a 134 kDa protein named strumpellin that is predicted to have multiple transmembrane domains and a spectrin-repeat-containing domain. This ubiquitously expressed gene has its highest expression in skeletal muscle. |
| Neuron development | Nonsynonymous SNV | NOTCH1 | chr9 | 139412284 | T | C | Notch signaling is an evolutionarily conserved intercellular signaling pathway that regulates interactions between physically adjacent cells through binding of Notch family receptors to their cognate ligands. |
| Neuron development | Nonsynonymous SNV | PRKD1 | chr14 | 30135343 | A | G | The protein encoded by this gene is a serine/threonine protein kinase involved in many cellular processes, including cell migration and differentiation, MAPK8/JNK1 and Ras pathway signaling, MAPK1/3 (ERK1/2) pathway signaling, cell survival, and regulation of cell shape and adhesion. |
| Neuron development | Nonsynonymous SNV | KCNJ14 | chr19 | 48965151 | G | A | This gene encodes an inward-rectifier potassium channel (potassium inwardly rectifying channel subfamily J member 14). |
| Neuron development | Nonsynonymous SNV | ATP7A& | chrX | 77287057 | G | A | This gene encodes a transmembrane protein that functions in copper transport across membranes. This protein is localized to the trans-Golgi network, where it is predicted to supply copper to copper-dependent enzymes in the secretory pathway. |
| Neuron development | Frameshift deletion | CCDC141 | chr2 | 179733903 | TTC | T | This gene is predicted to be involved in axon guidance and cell adhesion. It is also predicted to act upstream of or within centrosome localization and cerebral cortex radially oriented cell migration; to be located in the centrosome, cytoplasm, and plasma membrane; and to be active in neuron projections. |
| Neuron development | Frameshift deletion | IFT88 | chr13 | 21179185 | TACTA | T | This gene encodes a member of the tetratricopeptide repeat (TPR) family. The encoded protein is involved in cilium biogenesis. |
| Neuron development | Frameshift insertion | FOXO6 | chr1 | 41847881 | CCA | CGGGACGCCCGCCTACA, C | It is predicted to enable DNA-binding transcription factor activity, RNA polymerase II-specific and RNA polymerase II cis-regulatory region sequence-specific DNA binding activity and predicted to be involved in positive regulation of dendritic spine development and regulation of transcription by RNA polymerase II. |
| Neuron development | Frameshift insertion | ANKS1A | chr6 | 35046418 | A | AC | It is predicted to enable ephrin receptor binding activity and predicted to be involved in ephrin receptor signaling pathway, neuron remodeling, and substrate-dependent cell migration. |
| Neuron development | Frameshift insertion | ABR | chr17 | 934891 | T | TG | The protein encoded by this gene contains a GTPase-activating protein domain, a domain found in members of the Rho family of GTP-binding proteins. |
| Neuron development | Frameshift insertion | FBXO7 | chr22 | 32880039 | C | CT | The F-box proteins constitute one of the four subunits of the ubiquitin protein ligase complex called SCFs (SKP1-cullin-F-box), which function in phosphorylation-dependent ubiquitination. |
| Neuron development | Splicing | HNRNPR | chr1 | 23640196 | C | CT | This gene encodes an RNA-binding protein that is a member of the spliceosome C complex, which functions in pre-mRNA processing and transport. The encoded protein also promotes transcription at the c-fos gene. |
| Neuron development | Splicing | SHANK3 | chr22 | 51135989 | GTT | G | Shank proteins are multidomain scaffold proteins of the postsynaptic density that connect neurotransmitter receptors, ion channels, and other membrane proteins to the actin cytoskeleton and G-protein-coupled signaling pathways. |
Discussion
Although susceptibility factors remain unclear, prior studies suggest a genetic component.15 Our study investigated potential genetic influences, focusing on ER-driven neurodevelopmental pathways in brain regions such as the ventromedial nucleus of the hypothalamus (VMN), medial preoptic area (mPOA), anteroventral periventricular nucleus (AVPV), and arcuate nucleus. These pathways are more active in males due to a pubertal testosterone surge, which is converted to estradiol and promotes male-typical behavior.16,17 In contrast, limited ER activation in females leads to different developmental outcomes.16 In humans, gender-specific brain development is less understood, but parallels are seen in gender anxiety, which often worsens during adolescence. Prenatal androgen exposure, such as in females with congenital adrenal hyperplasia, is linked to more male-typical behavior and higher rates of gender anxiety.18-21 Some males with gender anxiety are later diagnosed with congenital hypogonadotropic hypogonadism, characterized by low androgen levels.17,22 Given the conservation of estrogen metabolism in mammals and the observed links between hormonal influence, neurodevelopment, and behavior, exploring related genetic variations may provide insights into the origins of gender anxiety and identity.
ER stimulation in VMN presynaptic neurons triggers glutamate release, activating postsynaptic N-methyl-D-aspartate (NMDA) receptors and downstream mitogen-activated protein (MAP) kinase pathways, leading to increased dendritic spine density, an effect linked to male-typical behavior in animal models.16 Four gene variants associated with this pathway were identified: ABR and ATP7A (involved in vesicle fusion and neuronal migration), SHANK3 (regulates NMDA receptors),23 and CHRNA9 (brain-expressed).24 Given the differing susceptibilities between transgender males and females, it is essential to analyze them separately. ER pathway activation may have region- and sex-specific effects, suggesting that the influence of genetic variants on susceptibility may depend on sex assigned at birth. One notable variant is in CD24, detected in four individuals (Table 3); it is Y-linked and associated with testicular damage.25,26
In addition to the previously mentioned candidate genes, WES identified several unique variants that do not fall within known sex-dimorphic neurodevelopmental pathways. Given that the genetic mechanisms underlying gender identity remain unclear, exploring these novel variants may uncover new neurodevelopmental pathways. These variants are not documented in the ExAC and dbSNP databases and were also absent in multiple nontransgender control samples, suggesting potential functional significance. Similarly, other studies using WES have identified rare variants not found in public databases, highlighting the effectiveness of WES in discovering novel genetic mutations.27
Genetic research often focuses on identifying variants associated with disease, but it is important to recognize that not all traits are pathological. In 2019, the World Health Organization reclassified “gender identity disorder” as “gender incongruence,” moving it from the mental disorders chapter to the sexual health chapter in the ICD-11, acknowledging that transgender identity is not a mental illness.22 When investigating genetic factors related to gender identity, it is crucial to use terminology that reflects this understanding. Terms like “pathogenic” or “mutation” may carry unintended negative connotations when applied to traits that are not disorders. Instead, researchers should treat gender identity as a multifaceted trait shaped by the interplay of genetic, environmental, and social factors, and use terminology that respects the diversity and validity of transgender experiences. The multigene threshold model posits that the cumulative effects of multiple genetic variants contribute to the development of gender identity, allowing for a spectrum of phenotypes and challenging the notion that specific genetic variations can definitively predict an individual’s gender identity. Recent research utilizing WES has identified rare variants in genes associated with sexually dimorphic brain development, particularly within estrogen receptor–activated pathways.11 These findings suggest a potential genetic contribution to gender incongruence and underscore the utility of WES in uncovering novel genetic variations.
Our study identified rare variants in 19 candidate genes potentially associated with gender development pathways in the brain. These variants were initially detected through WES and subsequently validated using Sanger sequencing. They were not present in the ExAC and dbSNP databases, or in multiple nontransgender controls, suggesting that they are not common polymorphisms. While these neurodevelopmental pathways have not yet been fully elucidated in humans as they have in animals, we propose that the genes involved in these pathways offer a viable avenue for investigating the genetic underpinnings of gender incongruence in humans.
Supplementary Material
Contributor Information
Na Liu, Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Xuhui District, Shanghai 200030, China; Department of Psychiatry, Tongji Hospital of Tongji University, Tongji University School of Medicine, Shanghai 200065, China.
Jingyi Bai, Department of Psychiatry, Tongji Hospital of Tongji University, Tongji University School of Medicine, Shanghai 200065, China.
Nan Huang, Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Xuhui District, Shanghai 200030, China.
Yi Xu, Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Xuhui District, Shanghai 200030, China.
Xiangyun Long, Department of Psychiatry, The Affiliated Brain Hospital of Guangzhou Medical University, Guangzhou 510370, China.
Xinyi Hu, Department of Psychiatry, Tongji Hospital of Tongji University, Tongji University School of Medicine, Shanghai 200065, China.
Jiaxin Wu, Department of Psychiatry, Tongji Hospital of Tongji University, Tongji University School of Medicine, Shanghai 200065, China.
Fei Liu, Department of Psychiatry, Tongji Hospital of Tongji University, Tongji University School of Medicine, Shanghai 200065, China.
Zheng Lu, Department of Psychiatry, Tongji Hospital of Tongji University, Tongji University School of Medicine, Shanghai 200065, China; Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Xuhui District, Shanghai 200030, China.
Author contributions
Na Liu: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Project administration; Resources; Software; Visualization; Writing – original draft; Writing – review & editing. Jingyi Bai: Data curation; Formal analysis; Investigation; Resources; Software; Visualization; Writing – original draft; Writing – review & editing. Nan Huang: Data curation; Formal analysis; Investigation; Resources; Software; Visualization; Writing – original draft; Writing – review & editing. Yi Xu: Data curation; Formal analysis; Investigation; Resources; Software; Visualization; Writing – original draft; Writing – review & editing. Xiangyun Long: Data curation; Formal analysis; Investigation; Resources; Software; Visualization; Writing – original draft; Writing – review & editing. Xinyi Hu: Data curation; Formal analysis; Investigation; Resources; Software; Visualization; Writing – original draft; Writing – review & editing. Jiaxin Wu: Data curation; Formal analysis; Investigation; Resources; Software; Visualization; Writing – original draft; Writing – review & editing. Fei Liu: Data curation; Formal analysis; Investigation; Resources; Software; Visualization; Writing – original draft; Writing – review & editing. Zheng Lu: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Project administration; Resources; Software; Visualization; Writing – original draft; Writing – review & editing.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Conflicts of interest
The authors have no competing interests to declare that are relevant to the content of this article.
Data availability
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Ethical approval
All the procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. The study was approved by the Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine (no. 2021-76).
Consent to participate
Informed consent was obtained from all the individual participants included in the study.
References
- 1. Reed GM, Drescher J, Krueger RB, et al. Disorders related to sexuality and gender identity in the ICD-11: revising the ICD-10 classification based on current scientific evidence, best clinical practices, and human rights considerations. World Psychiatry. 2016;15(3):205–221. 10.1002/wps.20354
- 2. Dyachenko AV, Perekhov AY, Soldatkin VA, Bukhanovskaya OA. Gender identity disorders: current medical and social paradigm and the ICD-11 innovations. Consort Psychiatr. 2021;2(2):54–61. 10.17816/cp68
- 3. Robles R, Keeley JW, Vega-Ramírez H, et al. Validity of categories related to gender identity in ICD-11 and DSM-5 among transgender individuals who seek gender-affirming medical procedures. Int J Clin Health Psychol. 2022;22(1):100281. 10.1016/j.ijchp.2021.100281
- 4. Meyer IH. Prejudice, social stress, and mental health in lesbian, gay, and bisexual populations: conceptual issues and research evidence. Psychol Bull. 2003;129(5):674–697. 10.1037/0033-2909.129.5.674
- 5. Monro S. Non-binary and genderqueer: an overview of the field. Int J Transgend. 2019;20(2-3):126–131. 10.1080/15532739.2018.1538841
- 6. Thorne N, Yip AK, Bouman WP, Marshall E, Arcelus J. The terminology of identities between, outside and beyond the gender binary- a systematic review. Int J Transgend. 2019;20(2-3):138–154. 10.1080/15532739.2019.1640654
- 7. Agapoff J. Supporting and understanding non-binary & gender diverse youth: a physician’s view. Child Adolesc Psychiatry Ment Health. 2024;18(1):105. 10.1186/s13034-024-00798-w
- 8. Richards C, Bouman WP, Seal L, Barker MJ, Nieder TO, T'Sjoen G. Non-binary or genderqueer genders. Int Rev Psychiatry. 2016;28(1):95–102. 10.3109/09540261.2015.1106446
- 9. Strauss P, Cook A, Winter S, Watson V, Wright Toussaint D, Lin A. Associations between negative life experiences and the mental health of trans and gender diverse young people in Australia: findings from Trans Pathways. Psychol Med. 2020;50(5):808–817. 10.1017/s0033291719000643
- 10. Trivedi C, Rizvi A, Mansuri Z, Jain S. Mental health outcomes and suicidality in hospitalized transgender adolescents: a propensity score-matched cross-sectional analysis of the national inpatient sample 2016-2018. J Psychiatr Res. 2024;172(000):345–350. 10.1016/j.jpsychires.2024.02.043
- 11. Theisen JG, Sundaram V, Filchak MS, et al. The use of whole exome sequencing in a cohort of transgender individuals to identify rare genetic variants. Sci Rep. 2019;9(1):20099. 10.1038/s41598-019-53500-y
- 12. Cortés-Cortés J, Fernández R, Teijeiro N, et al. Genotypes and haplotypes of the estrogen receptor α gene (ESR1) are associated with female-to-male gender dysphoria. J Sex Med. 2017;14(3):464–472. 10.1016/j.jsxm.2016.12.234
- 13. Gegenhuber B, Tollkuhn J. Epigenetic mechanisms of brain sexual differentiation. Cold Spring Harb Perspect Biol. 2022;14(11):a039099. 10.1101/cshperspect.a039099
- 14. Swaab DF, Chung WC, Kruijver FP, Hofman MA, Ishunina TA. Sexual differentiation of the human hypothalamus. Adv Exp Med Biol. 2002;511:75–100; discussion 100-105. 10.1007/978-1-4615-0621-8_6
- 15. Polderman TJC, Kreukels BPC, Irwig MS, et al. The biological contributions to gender identity and gender diversity: bringing data to the table. Behav Genet. 2018;48(2):95–108. 10.1007/s10519-018-9889-z
- 16. Wright CL, Schwarz JS, Dean SL, McCarthy MM. Cellular mechanisms of estradiol-mediated sexual differentiation of the brain. Trends Endocrinol Metab. 2010;21(9):553–561. 10.1016/j.tem.2010.05.004
- 17. Clarkson J, Herbison AE. Hypothalamic control of the male neonatal testosterone surge. Philos Trans R Soc Lond Ser B Biol Sci. 2016;371(1688):20150115. 10.1098/rstb.2015.0115
- 18. Babu R, Shah U. Gender identity disorder (GID) in adolescents and adults with differences of sex development (DSD): a systematic review and meta-analysis. J Pediatr Urol. 2021;17(1):39–47. 10.1016/j.jpurol.2020.11.017
- 19. Lee PA, Houk CP, Husmann DA. Should male gender assignment be considered in the markedly virilized patient with 46,XX and congenital adrenal hyperplasia? J Urol. 2010;184(4 Suppl):1786–1792. 10.1016/j.juro.2010.03.116
- 20. Zucker KJ. Epidemiology of gender dysphoria and transgender identity. Sex Health. 2017;14(5):404–411. 10.1071/sh17067
- 21. Foreman M, Hare L, York K, et al. Genetic link between gender dysphoria and sex hormone signaling. J Clin Endocrinol Metab. 2019;104(2):390–396. 10.1210/jc.2018-01105
- 22. Polderman TJC, Kreukels BPC, Irwig MS, et al. The biological contributions to gender identity and gender diversity: bringing data to the table. Behav Genet. 2018;48(2):95–108. 10.1007/s10519-018-9889-z
- 23. Leahy JL, Bonner-Weir S, Weir GC. Abnormal insulin secretion in a streptozocin model of diabetes. Effects of insulin treatment. Diabetes. 1985;34(7):660–666. 10.2337/diab.34.7.660
- 24. Wang L, Guo J, Xi Y, et al. Understanding the genetic domestication history of the Jianchang duck by genotyping and sequencing of genomic genes under selection. G3 (Bethesda). 2020;10(5):1469–1476. 10.1534/g3.119.400893
- 25. Hong T, Li R, Sun LL, et al. Role of the gene Phlda1 in fenvalerate-induced apoptosis and testicular damage in Sprague-Dawley rats. J Toxicol Environ Health A. 2019;82(15):870–878. 10.1080/15287394.2019.1664584
- 26. Yukselten Y, Aydos OSE, Sunguroglu A, Aydos K. Investigation of CD133 and CD24 as candidate azoospermia markers and their relationship with spermatogenesis defects. Gene. 2019;706:211–221. 10.1016/j.gene.2019.04.028
- 27. Jeroncic A, Memari Y, Ritchie GR, et al. Whole-exome sequencing in an isolated population from the Dalmatian island of Vis. Eur J Hum Genet. 2016;24(10):1479–1487. 10.1038/ejhg.2016.23