PLOS One. 2025 Sep 8;20(9):e0329472. doi: 10.1371/journal.pone.0329472

Variant screening of 18 genes in a northwest Chinese cohort comprising 83 probands diagnosed with early-onset high myopia

Yang Liu 1,#, Shao-Chi Zhang 2,#, Wen Zhang 2, Zhong-Qi Xue 3, Yi-Xuan Qin 2, Shun-Yu Piao 2, Wen-Jing Li 2, Meng-Li Ji 2, Wen-Juan Zhuang 2,*
Editor: Dror Sharon,4
PMCID: PMC12416727  PMID: 40920702

Abstract

Purpose

To investigate variants in 18 disease-causing genes associated with nonsyndromic myopia in 83 Chinese individuals diagnosed with early-onset high myopia (eo-HM).

Methods

Variants in 18 candidate genes in 83 probands with eo-HM were identified by whole-exome sequencing (WES) and assessed by multistep bioinformatics analysis.

Results

Four likely pathogenic variants were detected in 4 of the 83 probands (4.8%) with eo-HM. All four are missense variants: (NM_014452: c.443C>T) in TNFRSF21, (NM_013291: c.799C>G) in CPSF1, (NM_201269.3: c.3266A>G) in ZNF644, and (NM_001135195: c.577G>A) in SLC39A5. These variants were verified by Sanger sequencing, and all allele frequencies were less than 0.01 in the 1000G, ExAC, ESP6500, and gnomAD databases. In addition, the pathogenicity of these variants was assessed using several computational tools, including SIFT, MutationTaster, PolyPhen-2, PROVEAN, M-CAP, CADD, and DANN. It should be noted, however, that the Tyr1089Cys variant was classified as neutral by PROVEAN alone.

Conclusion

Our findings support the hypothesis that variants in TNFRSF21, CPSF1, ZNF644, and SLC39A5 contribute to eo-HM, and they expand the spectrum of eo-HM variants observed across various ethnic groups. The impact of TNFRSF21, CPSF1, ZNF644, and SLC39A5 on eo-HM remains under investigation.

Introduction

High myopia, a severe form of myopia, is a leading cause of serious ocular complications [1]. The burden of high myopia remains substantial: the number of people affected is expected to rise from 399 million in 2020 to 516 million by 2030 [2]. The situation in East Asia is even worse. The prevalence of high myopia was 21.6% in Korea and 21% in Singapore [3,4]. The proportion of high myopia in China also increased from 7.9% to 16.6% [5].

It is widely acknowledged that the onset of high myopia is influenced by both genetic and environmental factors [6]. Early-onset high myopia (eo-HM), observed in children under the age of 7 [7], is thought to be only minimally influenced by environmental factors such as lifestyle, diet, and UV light exposure, which are suggested to contribute to the onset and progression of non-syndromic myopia [8]; genetic factors are the determining factor in its development [9]. In contrast, late-onset high myopia (lo-HM), which usually occurs in children aged 7 and above [10], is closely related to environmental factors such as prolonged near work, limited outdoor activities, and inadequate exposure to natural light, all of which have been shown to contribute to the onset and progression of myopia [11]. Therefore, compared with other forms of myopia, eo-HM is more strongly associated with genetic factors, which are crucial to understanding the genetic contributions to the disease.

With the development of whole-exome sequencing (WES), 18 genes have been reported as potentially causative for non-syndromic high myopia: eleven genes with autosomal dominant inheritance, ZNF644 [12], SCO2 [13], SLC39A5 [14], CCDC111 [15], P4HA2 [16], BSG [17], CPSF1 [18], NDUFAF7 [19], TNFRSF21 [20], XYLT1 and DZIP1 [21]; four genes with recessive inheritance, LRPAP1 [22], CTSH [23], LEPREL1 [24] and LOXL3 [25]; and three X-linked genes, ARR3 [26], ARR4 [27] and OPN1LW [28]. Building on these previous investigations, we examined variants in the 18 known genes in 83 northwest Chinese families with eo-HM in order to expand the current genetic spectrum across different ethnic groups.

Materials and methods

Patient recruitment

Eighty-three patients from 83 unrelated families were enrolled in this investigation from 21 May 2020 to 30 January 2024. All patients were diagnosed with eo-HM based on a refraction of ≤ −6.00 D after mydriasis and onset of high myopia before 7 years of age, with no other ocular or systemic disease [7]. Nuclear families were recruited from the Ningxia Eye Hospital of Ningxia Hui Autonomous Region in northwest China; clinical information for the 83 patients with high myopia is shown in S1 Table. After written informed consent was obtained from each patient or their parents, genomic DNA was extracted from peripheral blood samples and detailed clinical examinations were conducted. The present investigation followed the standards defined in the Declaration of Helsinki and was approved by the institutional review board of People’s Hospital of Ningxia Hui Autonomous Region (2020-KY-GZR019).

WES and analyses

Genomic DNA was extracted from peripheral blood samples using a standard phenol–chloroform method. Whole exome sequencing (WES) was performed using the SureSelect Human All Exon V5 Kit (Agilent Technologies, USA) according to the manufacturer’s instructions, to enrich coding exons and flanking intronic regions across approximately 50 Mb of the human genome. The captured DNA libraries were then sequenced on the Illumina HiSeq 2500 platform, generating 150 bp paired-end reads with a mean coverage depth of over 100×, ensuring reliable variant detection [29].

Raw sequencing reads were processed using FastQC for quality control, and adapters and low-quality reads were removed using Trimmomatic. Clean reads were then aligned to the human reference genome (GRCh37/hg19) using BWA-MEM. Post-alignment processing, including duplicate marking, local realignment around InDels, and base quality score recalibration, was conducted using the Genome Analysis Toolkit (GATK) best practices workflow [30]. Specifically, GATK Table Recalibration was used to improve variant quality, and both single-nucleotide variants (SNVs) and insertions/deletions (InDels) were identified using the GATK Unified Genotyper.

To interpret functional consequences, variants were annotated using ANNOVAR, which classifies SNVs and InDels based on their genomic context and predicted effects on protein-coding sequences, as previously reported [31,32].

The WES data for the 18 genes from the 83 probands were filtered as follows: (1) variants in exonic and splice-site regions were extracted; (2) SNPs and indels (within 2 bp) with a minor allele frequency (MAF) of less than 0.01 in the 1000 Genomes Project (1000G) (ftp://1000genomes.ebi.ac.uk/vol1/ftp), Exome Aggregation Consortium Database (ExAC) (http://exac.broadinstitute.org/), NHLBI Exome Sequencing Project (ESP6500) (http://evs.gs.washington.edu/EVS/) or the Genome Aggregation Database (gnomAD) (https://gnomad.broadinstitute.org/) were extracted; (3) synonymous variants with no predicted influence on splice sites were eliminated; (4) missense variants predicted to be benign by tools such as Polymorphism Phenotyping v2 (PolyPhen-2) (http://genetics.bwh.harvard.edu/pph2/), Sorting Intolerant From Tolerant (SIFT) (https://sift.bii.a-star.edu.sg/), Protein Variation Effect Analyzer (PROVEAN) (http://provean.jcvi.org/genome_submit_2.php), MutationTaster2 (http://www.mutationtaster.org/) or Mendelian Clinically Applicable Pathogenicity (M-CAP) (http://bejerano.stanford.edu/mcap/) were eliminated; and (5) variants that were not heterozygous in autosomal dominant (AD) families, where the disease is typically inherited in a dominant pattern, were eliminated. Co-segregation of the variants that remained after filtering was validated by Sanger sequencing (S1 Fig). Table 1 lists the primers used for Sanger sequencing of the candidate variants.
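The five filtering steps can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the `Variant` record, its field names, and the `passes_filter` function are hypothetical. Two readings of the text are assumptions: a variant is kept only if every available database frequency is below 0.01, and a missense variant is dropped only when every tool that returns a call predicts it to be benign or neutral (consistent with the ZNF644 variant being retained despite a neutral PROVEAN call).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

MAF_CUTOFF = 0.01                                   # threshold stated in the text
FREQ_DBS = ("1000G", "ExAC", "ESP6500", "gnomAD")   # databases named in the text
KEPT_REGIONS = {"exonic", "splicing"}               # hypothetical region labels
BENIGN_CALLS = {"B", "N"}                           # benign / neutral (assumed encoding)


@dataclass
class Variant:
    """Hypothetical record for one annotated variant."""
    gene: str
    region: str                  # e.g. "exonic", "splicing", "intronic"
    effect: str                  # e.g. "missense", "synonymous"
    zygosity: str                # "het" or "hom"
    maf: Dict[str, Optional[float]] = field(default_factory=dict)  # None = not observed
    predictions: Dict[str, str] = field(default_factory=dict)      # tool -> call
    affects_splicing: bool = False


def passes_filter(v: Variant, dominant_family: bool = True) -> bool:
    # (1) keep only exonic and splice-site variants
    if v.region not in KEPT_REGIONS:
        return False
    # (2) keep only rare variants: every reported frequency must be < 0.01
    for db in FREQ_DBS:
        freq = v.maf.get(db)
        if freq is not None and freq >= MAF_CUTOFF:
            return False
    # (3) drop synonymous variants with no predicted effect on splicing
    if v.effect == "synonymous" and not v.affects_splicing:
        return False
    # (4) drop missense variants that every available tool calls benign/neutral
    if v.effect == "missense":
        calls = [c for c in v.predictions.values() if c != "NA"]
        if calls and all(c in BENIGN_CALLS for c in calls):
            return False
    # (5) in autosomal dominant families, keep heterozygous variants only
    if dominant_family and v.zygosity != "het":
        return False
    return True


# ZNF644 c.3266A>G, using the "ALL" frequencies and tool calls from Table 2.
znf644 = Variant(
    gene="ZNF644", region="exonic", effect="missense", zygosity="het",
    maf={"1000G": 0.0014, "ExAC": 0.0003, "ESP6500": None, "gnomAD": 0.0002},
    predictions={"SIFT": "D", "PolyPhen-2": "D", "PROVEAN": "N",
                 "MutationTaster": "D", "M-CAP": "D"},
)
# A made-up common variant that should be rejected at step (2).
common = Variant(gene="HYPOTHETICAL", region="exonic", effect="missense",
                 zygosity="het", maf={"gnomAD": 0.05}, predictions={"SIFT": "D"})

print(passes_filter(znf644), passes_filter(common))
```

Changing step (4) to "drop if any tool calls the variant benign" would be a stricter reading; under that rule the ZNF644 variant would not survive, so the permissive reading is used here.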

Table 1. PCR primers for Sanger sequencing in this study.

| Primer name | Forward | Reverse |
|---|---|---|
| DZIP1-HM25 | ACTCCCAGTGCTCGGTACAC | CTCTTTGCAAGAAGGTGCAG |
| TNFRSF21-HM26 | GTGAGGTGGAGCTGGAGAAG | GGACCTTTACCAGGCATGAG |
| CPSF1-HM33 | GGTACAACAGCGAGTTGACG | CCTCCTCATCCTGTTTGAGC |
| OPN1LW-HM129 | GGAAATGCCCAGTGTCTGTT | GGACCACAGAGCCTTTCCTA |
| ZNF644-HM115 | TTGTGGCTGATACATCAACGG | AACACACTTGTCAGCTCTGTGG |
| SLC39A5-HM95 | TGAGCTCAGGCAATCTACCC | CTTCCAGGATTCAGGGTGTC |

Bioinformatics analysis

All the variants detected in this study were assessed using a comprehensive bioinformatics pipeline. Raw sequencing reads were aligned to the human reference genome (GRCh37/hg19) using BWA-MEM (v0.7.17) [33]. GATK (v3.8) was employed for base quality recalibration, indel realignment, and variant calling through the UnifiedGenotyper module [34]. Variant quality was evaluated using standard hard filtering parameters recommended by the GATK workflow.

Subsequently, variants were annotated using ANNOVAR, integrating multiple public databases including 1000 Genomes, ExAC, ESP6500 and gnomAD to determine population frequency, predicted pathogenicity, and clinical relevance. Functional effects were further interpreted using in silico tools such as SIFT, PolyPhen-2 and MutationTaster.

The selection of these tools and parameters was based on their performance in previous studies and compatibility with our data quality. SIFT [35] predicts whether an amino acid substitution affects protein function on the basis of sequence homology. PolyPhen-2 [36] was used to predict the potential influence of amino acid substitutions on the structure and function of proteins. MutationTaster [37] evaluates the pathogenic potential of DNA sequence changes; it predicts the functional consequences not only of amino acid substitutions but also of short insertions and deletions (indels) and of variants spanning intron-exon boundaries. PROVEAN [38] was developed to predict the impact of protein sequence variants on protein function. M-CAP [39] can correctly dismiss 60% of rare missense variants of uncertain significance in a typical genome at 95% sensitivity. Combined Annotation Dependent Depletion (CADD) [40] (https://cadd.gs.washington.edu/snv) integrates multiple annotations, including allelic diversity and variant pathogenicity, into a single model that scores each variant site; the resulting scores, referred to as C-scores, measure how deleterious a variant is likely to be. Deleterious Annotation of genetic variants using Neural Networks (DANN) [41] (https://cbcl.ics.uci.edu/public_data/DANN/) uses a neural network trained on the same annotation features as CADD to predict damaging variants. To compare sequence conservation among the eight species, Clustal Omega [42] (https://www.ebi.ac.uk/Tools/msa/clustalo/) was employed. Furthermore, I-TASSER was used to model the three-dimensional structures of the wild-type and variant proteins [43] (https://zhanglab.ccmb.med.umich.edu/), and Swiss-PDB Viewer was used to visualize the protein structures.

Results

After filtering variants across the 18 candidate genes in the cohort of 83 individuals with eo-HM and multiple clinical features, six heterozygous variants were identified in DZIP1, TNFRSF21, CPSF1, OPN1LW, ZNF644, and SLC39A5, one in each of six unrelated families.

To validate these findings, Sanger sequencing was performed for all candidate variants in available family members. Notably, the Thr658Ala variant in DZIP1 and the Leu153Met variant in OPN1LW were excluded from further consideration because they did not co-segregate with the phenotype in the respective families, suggesting that these variants are less likely to be pathogenic. The remaining four variants were further evaluated for pathogenicity according to the American College of Medical Genetics and Genomics (ACMG) guidelines [44], taking into account population frequency, computational predictions, segregation data, and functional domain involvement. These findings are discussed in the context of their potential roles in the genetic etiology of non-syndromic high myopia. Four missense variants, (NM_014452: c.443C>T) in TNFRSF21, (NM_013291: c.799C>G) in CPSF1, (NM_201269.3: c.3266A>G) in ZNF644 and (NM_001135195: c.577G>A) in SLC39A5, were classified as likely pathogenic (Table 2). Moreover, the Gln267Glu variant in CPSF1 was located within a functional domain, whereas the remaining variants were not (Fig 1). The CADD and DANN scores were 24.8 and 0.998 for TNFRSF21, 23.2 and 0.993 for CPSF1, 24.5 and 0.998 for ZNF644, and 31 and 0.999 for SLC39A5, respectively.
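To put the CADD values in context, the sketch below converts a score into the fraction of top-ranked variants it corresponds to. It assumes the reported values are PHRED-scaled C-scores, which the article does not state explicitly but which their range (23 to 31) suggests. On that scale a score s places a variant in the top 10^(−s/10) of all possible single-nucleotide variants, so 20 corresponds to the top 1% and 30 to the top 0.1%. The function name is illustrative.

```python
def cadd_phred_to_top_fraction(phred: float) -> float:
    """Fraction of all possible SNVs ranked at or above a PHRED-scaled CADD score."""
    return 10 ** (-phred / 10)


# CADD scores reported for the four likely pathogenic variants.
scores = {"TNFRSF21": 24.8, "CPSF1": 23.2, "ZNF644": 24.5, "SLC39A5": 31.0}

top_percent = {gene: 100 * cadd_phred_to_top_fraction(s) for gene, s in scores.items()}

for gene, pct in top_percent.items():
    print(f"{gene}: CADD {scores[gene]} -> top {pct:.2f}% of possible SNVs")
```

Under this assumption all four variants rank within the top 1% of possible substitutions, and the SLC39A5 variant within the top 0.1%.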

Table 2. Summary of mutations in ZNF644, SLC39A5, CPSF1, TNFRSF21, OPN1LW and DZIP1.

| Patient ID | Gene | Inheritance | Sex | Age at onset | Age at exam | Refraction OD (D) | Refraction OS (D) | AL OD (mm) | AL OS (mm) | Chr. position | Exon | Mutation | Status | SIFT | PolyPhen2 | PROVEAN | Mutation Taster | M-CAP | CADD score | DANN score | 1000G ALL | 1000G EA | ExAC ALL | ExAC EA | ESP6500 ALL | ESP6500 EA | gnomAD ALL | gnomAD EA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 26 | TNFRSF21 | AD | F | EC | 45 | −21.00 | −21.50 | 31.21 | 31.58 | Chr6.47253985 | 2 | c.443C>T p.T148M | Het | D | D | D | D | NA | 24.8 | 0.998 | None | None | 0.001 | 0.0001 | 0.0005 | 0.0007 | 0.0015 | None |
| 33 | CPSF1 | AD | M | 3 | 11 | −6.00 | −5.75 | 26.11 | 26.16 | Chr8.145625775 | 8 | c.799C>G p.Q267E | Het | NA | P | NA | D | NA | 23.2 | 0.993 | 0.0014 | 0.002 | 0.0027 | 0.006 | 0.0005 | None | 0.0034 | 0.0055 |
| 115 | ZNF644 | AD | F | 2 | 22 | −9.75 | −9.75 | 28.39 | 28.55 | Chr1.91403464 | 4 | c.3266A>G p.Y1089C | Het | D | D | N | D | D | 24.5 | 0.998 | 0.0014 | 0.007 | 0.0003 | 0.0032 | None | None | 0.0002 | 0.0031 |
| 95 | SLC39A5 | AD | M | EC | 21 | −7.25 | −6.75 | 26.75 | 26.45 | Chr12.56628713 | 4 | c.577G>A p.D193N | Het | D | D | D | D | D | 31 | 0.999 | 0.0002 | 0.001 | 0.00007462 | 0.0003 | None | None | 0.00007319 | 0.0003 |

AD, autosomal dominant; AL, axial length; F, female; Het, heterozygous; M, male; N, neutral; OD, right eye; OS, left eye; Chr, chromosome; D, damaging; EA, East Asia; EC, early childhood; NA, not applicable; P, probably damaging.
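The cDNA and protein coordinates in Table 2 can be cross-checked with a short calculation. Under standard HGVS numbering, position 1 of the coding sequence is the A of the ATG start codon, so a coding-sequence substitution at position n falls in codon ⌈n/3⌉. The sketch below applies this to the four reported variants. The helper names are illustrative, and the parser handles only simple substitutions of the form shown in the table.

```python
import re

# One-letter to three-letter codes for the residues that occur in Table 2.
THREE_LETTER = {"T": "Thr", "M": "Met", "Q": "Gln", "E": "Glu",
                "Y": "Tyr", "C": "Cys", "D": "Asp", "N": "Asn"}


def codon_number(c_hgvs: str) -> int:
    """Codon index for a simple coding substitution such as 'c.443C>T'."""
    m = re.fullmatch(r"c\.(\d+)([ACGT])>([ACGT])", c_hgvs.replace(" ", ""))
    if m is None:
        raise ValueError(f"unsupported cDNA description: {c_hgvs!r}")
    position = int(m.group(1))
    return (position + 2) // 3          # integer form of ceil(position / 3)


def protein_position(p_hgvs: str) -> int:
    """Residue number from a one-letter description such as 'p.T148M'."""
    m = re.fullmatch(r"p\.([A-Z])(\d+)([A-Z])", p_hgvs)
    if m is None:
        raise ValueError(f"unsupported protein description: {p_hgvs!r}")
    return int(m.group(2))


def to_three_letter(p_hgvs: str) -> str:
    """Convert 'p.T148M' to 'Thr148Met' (only residues in THREE_LETTER)."""
    m = re.fullmatch(r"p\.([A-Z])(\d+)([A-Z])", p_hgvs)
    if m is None:
        raise ValueError(f"unsupported protein description: {p_hgvs!r}")
    ref, pos, alt = m.groups()
    return f"{THREE_LETTER[ref]}{pos}{THREE_LETTER[alt]}"


table2 = {
    "TNFRSF21": ("c.443C>T", "p.T148M"),
    "CPSF1": ("c.799C>G", "p.Q267E"),
    "ZNF644": ("c.3266A>G", "p.Y1089C"),
    "SLC39A5": ("c.577G>A", "p.D193N"),
}

consistent = {gene: codon_number(c) == protein_position(p)
              for gene, (c, p) in table2.items()}
print(consistent)
print([to_three_letter(p) for _, p in table2.values()])
```

All four cDNA positions map to the residue numbers given in the table, and the three-letter forms match those used in the text (Thr148Met, Gln267Glu, Tyr1089Cys, Asp193Asn).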

Fig 1. Location of the potentially pathogenic variants in TNFRSF21, CPSF1, ZNF644 and SLC39A5.

Exons of human TNFRSF21, CPSF1, ZNF644 and SLC39A5 (upper), and positions of the variants in the corresponding protein models with functional domains highlighted (lower). We identified four heterozygous variants in this study. The Gln267Glu variant in CPSF1 was located in a domain region, but the other variants were not. The TNFRSF21 domain (colored blue) plays a role in T-helper cell activation and may be involved in inflammation and immune regulation (A). SFT1 (colored green) is involved in mRNA cleavage and polyadenylation (B). C2H2 zinc fingers (colored purple) are motifs that take part in mitochondrial complex I activity (C). The ZIP domain (colored orange) contributes to zinc transport (D).

TNFRSF21 variant

One heterozygous variant in TNFRSF21 was found in family 26 (Fig 2A). The patient and the affected family member both had high myopia before the age of seven. On examination, the right eye had a refraction of −21.00 D with an axial length (AL) of 31.21 mm, and the left eye had a refraction of −21.50 D with an AL of 31.58 mm. The missense variant identified in TNFRSF21 (NM_014452: c.443C>T) was evaluated using various computational tools, including SIFT, PolyPhen-2, PROVEAN, MutationTaster2, and M-CAP. Collectively, the results of these analyses suggest that this variant is likely to have a damaging effect. The variant was not detected in the 1000 Genomes (1000G) or gnomAD East Asian databases. The Thr148 residue is highly conserved in sequence alignments across homologous species (Fig 3A), which highlights the importance of this site for protein function. Interestingly, the predicted three-dimensional structure of the protein did not show any obvious change due to the identified variant (Fig 4A).

Fig 2. Four potentially pathogenic variants detected in this study.

The black arrow indicates the patient. WT: wild type; MT: mutation. From left to right: pedigree plots, sequences from the affected individual carrying the identified variant, and wild-type sequences. Clinical information regarding the segregation of the identified variants within the families is included in S1 Table.

Fig 3. Conservation analysis revealed evolutionary conservation of the variant.

Multiple sequence alignments of the amino acid sequences from different species. The arrows indicate the locations of the variants (A-D).

Fig 4. Predicted three-dimensional structure of proteins.

Predicted structures of wild-type (left) and variant (right) proteins. Yellow shows the wild-type and variant residues; green shows the residues that interact with the wild-type residue (left) and the variant residue (right) (A-D).

CPSF1 variant

A missense variant in CPSF1 was detected in an 11-year-old adolescent (Fig 2B) who, according to his parents, had low vision in childhood. The refraction of both eyes was close to −6.00 D and the axial length of both eyes was more than 26 mm, with no other ocular or systemic disease. A novel missense variant, c.799C>G, was detected in exon 8 and was predicted to be likely pathogenic by PolyPhen-2, SIFT, PROVEAN, and MutationTaster2. The variant is located in the SFT1 superfamily domain of the mRNA cleavage and polyadenylation specificity factor, which may play a role in RNA processing and modification. In addition, Gln267 is highly conserved across various species (Fig 3B), and the predicted three-dimensional structure shows formation of a new hydrogen bond (Fig 4B). This alteration is very rare, with a population frequency of 0.01% according to gnomAD. The frequencies of this alteration in other databases are also presented in Table 2.

ZNF644 and SLC39A5 variants

The variants (NM_201269.3: c.3266A>G) in ZNF644 and (NM_001135195: c.577G>A) in SLC39A5 were detected in families 115 and 95, respectively, and were verified by family co-segregation (Fig 2C, D). Both variants occurred at residues that are highly conserved across eight species (Fig 3C, D). According to SIFT, PolyPhen-2, MutationTaster2, PROVEAN and M-CAP, both variants were predicted to be disease-causing, except that c.3266A>G in ZNF644 was predicted to be neutral by PROVEAN. Both variants were recorded as rare SNPs in ExAC. The substitutions are predicted to have a significant impact on protein folding and stability in the three-dimensional structures of ZNF644 and SLC39A5 (Fig 4C, D).

Discussion

To date, at least 18 causative genes associated with high myopia have been discovered [12–21]; however, no related studies have been conducted in populations from Northwest China. In this study, we screened 18 causative genes in 83 patients from 83 unrelated families using WES and detected four likely pathogenic variants in TNFRSF21, CPSF1, ZNF644 and SLC39A5. All of these variants affected the coding region and were absent or rare in the four databases examined (ExAC, 1000G, ESP6500 and gnomAD). PolyPhen-2, SIFT, PROVEAN, MutationTaster, and M-CAP were each used to assess the impact of the amino acid changes caused by the sequence variants. CADD combines multiple factors, such as allelic variation and variant pathogenicity, to evaluate each variant site, while DANN uses neural network algorithms to evaluate the harmfulness of variants.

Pan et al. [45] first discovered that TNFRSF21 is associated with high myopia through a large Chinese family, in which a novel missense variant, Pro146Ala, was identified; three uncommon heterozygous alterations (Pro202Leu, Glu240Ter and Ala440Gly) in TNFRSF21 were also observed on screening 220 unrelated individuals with HM in the same study. The Pro146Ala variant was found to significantly enhance the proliferation of adult retinal pigment epithelial cell line-19 cells in comparison with the wild type. Herein, we found a heterozygous missense variant, Thr148Met, in TNFRSF21. Because the Pro146Ala variant might result in HM via the regulation of the apoptosis of myopia-related cells [46], and Thr148Met lies very close to Pro146Ala, whether Thr148Met acts on myopia through the same mechanism as Pro146Ala needs to be further explored.

A total of six variants (Phe1291Ter, Val943LeufsTer65, Gln620Ter, Tyr5Ter, Asp1275Tyr and c.4146-2A>G) in CPSF1 were identified in 6 of 623 probands with eo-HM, and two of these alterations were shown to be related to retinal ganglion cells in zebrafish in a previous study [47]. Herein, we found a novel missense variant, Gln267Glu, in CPSF1, which is located in the SFT1 domain. It is postulated that this substitution could potentially impact the stability of the protein structure. Further studies of how CPSF1 variants are connected with high myopia are still needed.

ZNF644, which is located on chromosome 1p22.2 and includes 6 exons, functions as a transcription factor with C2H2 Kruppel-type zinc finger domains [48]. The protein is expressed in the human retina and retinal pigment epithelium and also appears to have a function in ocular wall development. Elongation of the eye’s axial length is a hallmark characteristic of high myopia [49]. Because ZNF644 is hypothesized to be a transcription factor that controls the expression of genes implicated in ocular development, changes in the ZNF644 protein may affect normal eye development and contribute to the axial elongation seen in high myopes [50]. In a study by Shi et al. [12], a missense variant in ZNF644 (Ser672Gly) was first identified by WES in a five-generation Han Chinese family with high myopia. In recent years, additional variants in ZNF644 linked to high myopia have been reported in both China and America [51]. Herein, we identified a novel missense variant, Tyr1089Cys, in exon 4 of the ZNF644 gene. Nevertheless, the specific mechanism of action of ZNF644 and its role in the pathogenesis of high myopia remain unclear, and more functional investigations are required.

Located at 12q13.3, SLC39A5 encodes solute carrier family 39 member 5, a zinc transporter that belongs to the ZIP family. Its primary role is to maintain zinc homeostasis [52]. Guo et al. [14] first detected a heterozygous truncating alteration in the SLC39A5 gene (c.141C>G, Tyr47Ter) and proposed that it may impair SLC39A5 function by generating a truncated protein that significantly boosts the expression of Smad1 at both the mRNA and protein levels. Smad1 is a vital transcription factor located downstream in the BMP/TGF-β transduction pathway. Several investigations have suggested a correlation between myopia and the BMP/TGF-β pathway, suggesting that interference with this pathway could cause refractive errors or potentially high myopia [53,54]. Additionally, SLC39A5 was detected at each stage of ocular development and exhibited elevated expression levels in both the sclera and retina [14]. By screening 298 families with early-onset high myopia, Jiang et al. [23] identified another missense alteration (c.1238G>C, Gly413Ala). In addition, Feng et al. [55] reported three other heterozygous missense variants in isolated cases (Arg84Trp, Pro287Leu, and Arg319Thr). Herein, we identified a novel variant in SLC39A5. The Asp193Asn variant in exon 4, located near the terminal exon, may lead to alterations in protein expression due to its effect on mRNA splicing or stability. Our findings further expand the variant spectrum of SLC39A5 in early-onset high myopia.

Although four likely pathogenic variants were identified in this study, they account for only approximately 5% of the total cohort, indicating a relatively low diagnostic yield. We further analyzed the possible reasons for this low diagnostic yield and provide insights for future research directions.
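The uncertainty around a yield estimated from 83 probands can be illustrated with a confidence interval. The sketch below computes the observed yield (4 of 83) and a 95% Wilson score interval. The choice of the Wilson method is ours for illustration; the article does not report an interval.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


solved, cohort = 4, 83            # probands with a likely pathogenic variant / total
yield_fraction = solved / cohort
low, high = wilson_interval(solved, cohort)

print(f"diagnostic yield: {100 * yield_fraction:.1f}%")
print(f"95% Wilson interval: {100 * low:.1f}% to {100 * high:.1f}%")
```

With so few positive probands the interval is wide relative to the point estimate, which is consistent with the sample-size limitation discussed in this section.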

First, the sample size may be a key factor affecting the diagnostic yield. The small sample size may have limited our ability to capture all potential pathogenic variants, as larger sample sizes are generally necessary to increase the probability of identifying additional pathogenic variants [56]. Therefore, the sample size limitation may have impacted the range of genetic variants we were able to detect in this study.

Second, the methodology used, whole exome sequencing (WES), while effective in detecting variants in protein-coding regions, has inherent limitations. WES primarily focuses on exonic regions of the genome, meaning that non-coding regions, structural variants, and deep intronic variants may not be detected. Since these types of variants may play significant roles in the genetic basis of certain diseases [57], the limitations of WES could be a contributing factor to the low diagnostic yield.

Additionally, the complexity and heterogeneity of the disease itself may also influence the diagnostic yield [58]. In some genetic diseases, there may be considerable phenotypic diversity, and the genetic background may exhibit high heterogeneity, with pathogenic variants spread across multiple genes or involving rare and previously unreported variants. Even with WES, it may not be possible to comprehensively identify all pathogenic variants. Furthermore, disease development may be influenced by environmental factors, epigenetic modifications, and gene-environment interactions, which cannot be fully captured by current genetic screening methods [59], further complicating the diagnostic process.

Lastly, the low diagnostic yield may suggest that we have not yet fully understood the genetic basis of the eo-HM, indicating the possibility of undiscovered pathogenic variants. Nonetheless, this study provides valuable insights into the genetic mechanisms of eo-HM and lays the foundation for future research. Future studies can explore larger sample sizes, improve sequencing technologies, and delve deeper into non-coding regions and structural variants to uncover additional pathogenic variants. Although the diagnostic yield in this study was lower than expected, the results reflect the limitations of current technologies and our incomplete understanding of the genetic basis of eo-HM. This highlights the need for further research in genetic studies, emphasizing the importance of expanding genetic testing technologies and increasing sample sizes to improve diagnostic rates and gain a more comprehensive understanding of eo-HM’s genetic landscape.

The limitations of our study include: (1) Unlike whole-genome analysis, which can identify variants throughout the genome, including intronic and exonic regions, WES is restricted to coding regions. Although the exome accounts for only 1% of the genome, it has been estimated that a significant proportion of known disease-causing variants, especially those in Mendelian disorders, are located within coding regions. Therefore, WES remains a widely used and efficient tool for variant discovery in such contexts [60]. (2) Although we used seven prediction tools based on different principles to enhance the accuracy of pathogenicity prediction, these tools and the databases on which they rely may be inaccurate or incomplete, which may ultimately affect the accuracy of pathogenicity prediction.

Conclusion

Overall, using WES and bioinformatics analysis, we identified 4 genetic variants associated with the development of eo-HM. To the best of our knowledge, this is the first investigation to screen variants of the known high myopia genes in a northwest Chinese population. The results revealed 4 likely pathogenic variants related to eo-HM, offering further evidence that TNFRSF21, CPSF1, ZNF644 and SLC39A5 contribute to eo-HM. Furthermore, our results broaden the variant spectrum of eo-HM across various countries, thereby offering valuable insights for future genetic investigations of HM.

Supporting information

S1 Table. Clinical information of 83 patients with high myopia.


Data Availability

All files are available from the figshare database (accession number 10.6084/m9.figshare.26713711).

Funding Statement

This work was supported by the National Natural Science Foundation of China (82460215); Yinchuan Science and Technology Plan Project (2024SF006); Ningxia Medical University School-level Scientific Research Project (XY2024058); Pre-experimental Project of the National Natural Science Foundation of China (2025GZRYSY006); and the Ningxia Natural Science Foundation (2024AAC03515). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.Zhang J, Zou H. Insights into artificial intelligence in myopia management: from a data perspective. Graefes Arch Clin Exp Ophthalmol. 2024;262(1):3–17. doi: 10.1007/s00417-023-06101-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Cottle S. Reporting civilizational collapse: Research notes from a world-in-crisis. Glob Media Commun. 2023;19(2):269–88. [Google Scholar]
  • 3.Lu J, Shi H, Yao J. Prevalence and influencing factors of myopia among primary and secondary school students in Urumqi: a cross-sectional study with interrupted time series analysis. 2023.
  • 4.Chen KS, Au Eong JTW, Au Eong K-G. Changing paradigm in the management of childhood myopia. Eye (Lond). 2024;38(6):1027–8. doi: 10.1038/s41433-023-02831-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Zhang H-M, Li B-Q, Zhu Y, Liu S-X, Wei R-H. Time trends in myopia and high myopia prevalence in young university adults in China. Int J Ophthalmol. 2023;16(10):1676–81. doi: 10.18240/ijo.2023.10.18 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Biswas S, El Kareh A, Qureshi M, Lee DMX, Sun C-H, Lam JSH, et al. The influence of the environment and lifestyle on myopia. J Physiol Anthropol. 2024;43(1):7. doi: 10.1186/s40101-024-00354-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Xiao X, Yang J, Li Y, Yang H, Zhu Y, Li L, et al. Identification of a novel frameshift variant of ARR3 related to X-linked female-limited early-onset high myopia and study on the effect of X chromosome inactivation on the myopia severity. J Clin Med. 2023;12(3):835. doi: 10.3390/jcm12030835 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.National Academies of Sciences, Engineering, and Medicine. Onset and progression of myopia. In; Myopia: causes, prevention, and treatment of an increasingly common disease. National Academies Press (US). 2024. [PubMed] [Google Scholar]
  • 9. Jiang Y, Xiao X, Sun W, Wang Y, Li S, Jia X, et al. Clinical and genetic risk factors underlying severe consequence identified in 75 families with unilateral high myopia. J Transl Med. 2024;22(1):75. doi: 10.1186/s12967-024-04886-5
  • 10. Morgan I, Rose K. How genetic is school myopia? Prog Retin Eye Res. 2005;24(1):1–38. doi: 10.1016/j.preteyeres.2004.06.004
  • 11. Lingham G, Yazar S, Lucas RM, Milne E, Hewitt AW, Hammond CJ, et al. Time spent outdoors in childhood is associated with reduced risk of myopia as an adult. Sci Rep. 2021;11(1):6337. doi: 10.1038/s41598-021-85825-y
  • 12. Shi Y, Li Y, Zhang D. Exome sequencing identifies ZNF644 mutations in high myopia. PLoS Genet. 2011;7(6):e1002084. doi: 10.1371/journal.pgen.1002084
  • 13. Zhou G, Lan C, Yang Q, Zhong W, Gu Z, Xiang X, et al. Expression of SCO1 and SCO2 after form-deprivation myopia in Guinea pigs. Eur J Ophthalmol. 2022;32(5):3050–7. doi: 10.1177/11206721211070305
  • 14. Guo H, Jin X, Zhu T. SLC39A5 mutations interfering with the BMP/TGF-β pathway in non-syndromic high myopia. J Med Genet. 2014;51(8):518–25.
  • 15. Zhao F, Wu J, Xue A, Su Y, Wang X, Lu X, et al. Exome sequencing reveals CCDC111 mutation associated with high myopia. Hum Genet. 2013;132(8):913–21. doi: 10.1007/s00439-013-1303-6
  • 16. Napolitano F, Di Iorio V, Testa F, Tirozzi A, Reccia MG, Lombardi L, et al. Autosomal-dominant myopia associated to a novel P4HA2 missense variant and defective collagen hydroxylation. Clin Genet. 2018;93(5):982–91. doi: 10.1111/cge.13217
  • 17. Jin Z-B, Wu J, Huang X-F, Feng C-Y, Cai X-B, Mao J-Y, et al. Trio-based exome sequencing arrests de novo mutations in early-onset high myopia. Proc Natl Acad Sci U S A. 2017;114(16):4219–24. doi: 10.1073/pnas.1615970114
  • 18. Zhang J, Zhang X, Zou Y, Han F. CPSF1 mediates retinal vascular dysfunction in diabetes mellitus via the MAPK/ERK pathway. Arch Physiol Biochem. 2022;128(3):708–15. doi: 10.1080/13813455.2020.1722704
  • 19. Wang B, Liu Y, Chen S, Wu Y, Lin S, Duan Y, et al. A novel potentially causative variant of NDUFAF7 revealed by mutation screening in a Chinese family with pathologic myopia. Invest Ophthalmol Vis Sci. 2017;58(10):4182–92. doi: 10.1167/iovs.16-20941
  • 20. Pan H, Wu S, Wang J, Zhu T, Li T, Wan B, et al. TNFRSF21 mutations cause high myopia. J Med Genet. 2019;56(10):671–7. doi: 10.1136/jmedgenet-2018-105684
  • 21. Lee J-K, Kim H, Park Y-M, Kim DH, Lim HT. Mutations in DZIP1 and XYLT1 are associated with nonsyndromic early onset high myopia in the Korean population. Ophthalmic Genet. 2017;38(4):395–7. doi: 10.1080/13816810.2016.1232415
  • 22. Magliyah MS, Alsulaiman SM, Nowilaty SR, Alkuraya FS, Schatz P. Rhegmatogenous retinal detachment in nonsyndromic high myopia associated with recessive mutations in LRPAP1. Ophthalmol Retina. 2020;4(1):77–83. doi: 10.1016/j.oret.2019.08.005
  • 23. Jiang D, Li J, Xiao X, Li S, Jia X, Sun W, et al. Detection of mutations in LRPAP1, CTSH, LEPREL1, ZNF644, SLC39A5, and SCO2 in 298 families with early-onset high myopia by exome sequencing. Invest Ophthalmol Vis Sci. 2014;56(1):339–45. doi: 10.1167/iovs.14-14850
  • 24. Magliyah MS, Almarek F, Nowilaty SR, Al-Abdi L, Alkuraya FS, Alowain M, et al. LEPREL1-related giant retinal tear detachments mimic the phenotype of ocular stickler syndrome. Retina. 2023;43(3):498–505. doi: 10.1097/IAE.0000000000003691
  • 25. Jiang Y, Zhou L, Wang Y, Ouyang J, Li S, Xiao X, et al. The genetic confirmation and clinical characterization of LOXL3-associated MYP28: a common type of recessive extreme high myopia. Invest Ophthalmol Vis Sci. 2023;64(3):24. doi: 10.1167/iovs.64.3.24
  • 26. Xiao X, Li S, Jia X, Guo X, Zhang Q. X-linked heterozygous mutations in ARR3 cause female-limited early onset high myopia. Mol Vis. 2016;22:1257–66.
  • 27. Yuan D, Yan T, Tang N. A novel nonsense mutation in ARR4 leads to X-linked high myopia: a genetic paradox. Authorea Preprints. 2020.
  • 28. Li J, Gao B, Guan L, Xiao X, Zhang J, Li S, et al. Unique variants in OPN1LW cause both syndromic and nonsyndromic X-linked high myopia mapped to MYP1. Invest Ophthalmol Vis Sci. 2015;56(6):4150–5. doi: 10.1167/iovs.14-16356
  • 29. Warr A, Robert C, Hume D, Archibald A, Deeb N, Watson M. Exome sequencing: current and future perspectives. G3 (Bethesda). 2015;5(8):1543–50. doi: 10.1534/g3.115.018564
  • 30. Alganmi N, Abusamra H. Evaluation of an optimized germline exomes pipeline using BWA-MEM2 and Dragen-GATK tools. PLoS One. 2023;18(8):e0288371. doi: 10.1371/journal.pone.0288371
  • 31. Liu Y, Zhang J-J, Piao S-Y, Shen R-J, Ma Y, Xue Z-Q, et al. Whole-exome sequencing in a cohort of high myopia patients in Northwest China. Front Cell Dev Biol. 2021;9:645501. doi: 10.3389/fcell.2021.645501
  • 32. Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38(16):e164.
  • 33. Pham M, Tu Y, Lv X. Accelerating BWA-MEM read mapping on GPUs. ICS. 2023;2023:155–66. doi: 10.1145/3577193.3593703
  • 34. van der Valk T, Díez-Del-Molino D, Marques-Bonet T, Guschanski K, Dalén L. Historical genomes reveal the genomic consequences of recent population decline in Eastern Gorillas. Curr Biol. 2019;29(1):165-170.e6. doi: 10.1016/j.cub.2018.11.055
  • 35. Ali S, Ali U, Qamar A. Predicting the effects of rare genetic variants on oncogenic signaling pathways: a computational analysis of HRAS protein function. Front Chem. 2023;11:1173624.
  • 36. Nasser KK, Shinawi T. Genotype-protein phenotype characterization of NOD2 and IL23R missense variants associated with inflammatory bowel disease: A paradigm from molecular modelling, dynamics, and docking simulations. Front Med (Lausanne). 2023;9:1090120. doi: 10.3389/fmed.2022.1090120
  • 37. Emadi E, Akhoundi F, Kalantar SM, Emadi-Baygi M. Predicting the most deleterious missense nsSNPs of the protein isoforms of the human HLA-G gene and in silico evaluation of their structural and functional consequences. BMC Genet. 2020;21(1):94. doi: 10.1186/s12863-020-00890-y
  • 38. Choudhury A, Mohammad T, Anjum F, Shafie A, Singh IK, Abdullaev B, et al. Comparative analysis of web-based programs for single amino acid substitutions in proteins. PLoS One. 2022;17(5):e0267084. doi: 10.1371/journal.pone.0267084
  • 39. Wang D, Li J, Wang Y, Wang E. A comparison on predicting functional impact of genomic variants. NAR Genom Bioinform. 2022;4(1):lqab122. doi: 10.1093/nargab/lqab122
  • 40. Schubach M, Maass T, Nazaretyan L, Röner S, Kircher M. CADD v1.7: using protein language models, regulatory CNNs and other nucleotide-level scores to improve genome-wide variant predictions. Nucleic Acids Res. 2024;52(D1):D1143–54. doi: 10.1093/nar/gkad989
  • 41. Ao Y-F, Pei S, Xiang C, Menke MJ, Shen L, Sun C, et al. Structure- and data-driven protein engineering of transaminases for improving activity and stereoselectivity. Angew Chem Int Ed Engl. 2023;62(23):e202301660. doi: 10.1002/anie.202301660
  • 42. Sievers F, Higgins DG. The clustal omega multiple alignment package. Multiple Sequence Alignment: Methods and Protocols. 2021;3–16.
  • 43. Zhou X, Zheng W, Li Y, Pearce R, Zhang C, Bell EW, et al. I-TASSER-MTD: a deep-learning-based platform for multi-domain protein structure and function prediction. Nat Protoc. 2022;17(10):2326–53. doi: 10.1038/s41596-022-00728-0
  • 44. Miller DT, Lee K, Gordon AS, Amendola LM, Adelman K, Bale SJ, et al. Recommendations for reporting of secondary findings in clinical exome and genome sequencing, 2021 update: a policy statement of the American College of Medical Genetics and Genomics (ACMG). Genet Med. 2021;23(8):1391–8. doi: 10.1038/s41436-021-01171-4
  • 45. Pan H, Wu S, Wang J. TNFRSF21 mutations cause high myopia. J Med Genet. 2019;56(10):671–7.
  • 46. Nikolaev A, McLaughlin T, O’Leary DDM, Tessier-Lavigne M. APP binds DR6 to trigger axon pruning and neuron death via distinct caspases. Nature. 2009;457(7232):981–9. doi: 10.1038/nature07767 [Retracted]
  • 47. Ouyang J, Sun W, Xiao X, Li S, Jia X, Zhou L, et al. CPSF1 mutations are associated with early-onset high myopia and involved in retinal ganglion cell axon projection. Hum Mol Genet. 2019;28(12):1959–70. doi: 10.1093/hmg/ddz029
  • 48. Zhang X, Xia F, Zhang X, Blumenthal RM, Cheng X. C2H2 zinc finger transcription factors associated with hemoglobinopathies. J Mol Biol. 2024;436(7):168343. doi: 10.1016/j.jmb.2023.168343
  • 49. Du Y, Meng J, He W, Qi J, Lu Y, Zhu X. Complications of high myopia: an update from clinical manifestations to underlying mechanisms. Adv Ophthalmol Pract Res. 2024;4(3):156–63. doi: 10.1016/j.aopr.2024.06.003
  • 50. González-Iglesias E, López-Vázquez A, Noval S, Nieves-Moreno M, Granados-Fernández M, Arruti N, et al. Next-generation sequencing screening of 43 families with non-syndromic early-onset high myopia: a clinical and genetic study. Int J Mol Sci. 2022;23(8):4233. doi: 10.3390/ijms23084233
  • 51. Yu X, Yuan J, Chen ZJ, Li K, Yao Y, Xing S, et al. Whole-exome sequencing among school-aged children with high myopia. JAMA Netw Open. 2023;6(12):e2345821. doi: 10.1001/jamanetworkopen.2023.45821
  • 52. Saravanan R, Balasubramanian V, Swaroop Balamurugan SS, Ezhil I, Afnaan Z, John J, et al. Zinc transporter LIV1: A promising cell surface target for triple negative breast cancer. J Cell Physiol. 2022;237(11):4132–56. doi: 10.1002/jcp.30880
  • 53. Ziegler A, Duclaux-Loras R, Revenu C, Charbit-Henrion F, Begue B, Duroure K, et al. Bi-allelic variants in IPO8 cause a connective tissue disorder associated with cardiovascular defects, skeletal abnormalities, and immune dysregulation. Am J Hum Genet. 2021;108(6):1126–37. doi: 10.1016/j.ajhg.2021.04.020
  • 54. Karouta C, Kucharski R, Hardy K. Transcriptome-based insights into gene networks controlling myopia prevention. FASEB J. 2021;35(9):1–23.
  • 55. Feng C-Y, Huang X-Q, Cheng X-W, Wu R-H, Lu F, Jin Z-B. Mutational screening of SLC39A5, LEPREL1 and LRPAP1 in a cohort of 187 high myopia patients. Sci Rep. 2017;7(1):1120. doi: 10.1038/s41598-017-01285-3
  • 56. Sun KY, Bai X, Chen S. A deep catalogue of protein-coding variation in 983,578 individuals. Nature. 2024;631(8021):583–92.
  • 57. Peña-Martínez EG, Rodríguez-Martínez JA. Decoding non-coding variants: recent approaches to studying their role in gene regulation and human diseases. Front Biosci (Scholar edition). 2024;16(1):4.
  • 58. Weisschuh N, Mazzola P, Zuleger T, Schaeferhoff K, Kühlewein L, Kortüm F, et al. Diagnostic genome sequencing improves diagnostic yield: a prospective single-centre study in 1000 patients with inherited eye diseases. J Med Genet. 2024;61(2):186–95. doi: 10.1136/jmg-2023-109470
  • 59. Alemu R, Sharew NT, Arsano YY, Ahmed M, Tekola-Ayele F, Mersha TB, et al. Multi-omics approaches for understanding gene-environment interactions in noncommunicable diseases: techniques, translation, and equity issues. Hum Genomics. 2025;19(1):8. doi: 10.1186/s40246-025-00718-9
  • 60. Barbitoff YA, Polev DE, Glotov AS, Serebryakova EA, Shcherbakova IV, Kiselev AM, et al. Systematic dissection of biases in whole-exome and whole-genome sequencing reveals major determinants of coding sequence coverage. Sci Rep. 2020;10(1):2057. doi: 10.1038/s41598-020-59026-y

Associated Data

Supplementary Materials

S1 Table. Clinical information of 83 patients with high myopia.

(XLSX)

pone.0329472.s001.xlsx (14.2KB, xlsx)

Data Availability Statement

All files are available from the figshare database (accession number 10.6084/m9.figshare.26713711).


Articles from PLOS One are provided here courtesy of PLOS
