Abstract
Importance
Wide use of genomic sequencing to diagnose disease has raised concern about the extent of genotype-phenotype correlations.
Objective
To correlate disease-associated allele frequencies with expected and reported prevalence of clinical disease.
Design, Setting, and Participants
Xeroderma pigmentosum (XP), a recessive, cancer-prone, neurocutaneous disorder, was used as a model for this study. From January 1, 2017, to May 4, 2018, the Human Gene Mutation Database and a cohort of patients at the National Institutes of Health were searched and screened to identify reported mutations associated with XP. The clinical phenotype of these patients was confirmed from reports in the literature and National Institutes of Health medical records. The genetically predicted prevalence of disease based on frequency of known pathogenic mutations was compared with the prevalence of patients clinically diagnosed with phenotypic XP. Exome sequencing of more than 200 000 alleles from the Genome Aggregation Database, the National Cancer Institute Division of Cancer Epidemiology and Genetics database of healthy controls, and an Inova Hospital Study database was used to investigate the frequencies of these mutations in the general population.
Main Outcomes and Measures
Listing of all reported mutations associated with XP, their frequencies in 3 large exome sequence databases, determination of the number of patients in the United States with XP using modeling equations, and comparison of the observed and reported numbers of patients with XP with specific mutations.
Results
A total of 156 pathogenic missense and nonsense mutations associated with XP were identified in the National Institutes of Health cohort and the Human Gene Mutation Database. The Genome Aggregation Database provided frequency data for 65 of these mutations, with a total allele frequency of 1.13%. The XPF (ERCC4) mutation, p.P379S, had an allele frequency of 0.4%, and the XPC mutation, p.P334H, had an allele frequency of 0.3%. With the Hardy-Weinberg equation, it was determined that there should be more than 8000 patients who are homozygous for these mutations in the United States. In contrast, only 3 patients with XP were reported as having the XPF mutation, and 1 patient was reported as having the XPC mutation.
Conclusions and Relevance
The findings from this study suggest that clinicians should approach large genomic databases with caution when trying to correlate the clinical implications of genetic variants with the prevalence of disease risk. Unsuspected mutations in known genes with a predisposition for skin cancer may be responsible for some of the high frequency of skin cancers in the general population.
This molecular epidemiologic study examined 3 large exome sequence databases totaling more than 200 000 alleles to correlate disease-associated allele frequencies with the expected and reported prevalence of clinical disease.
Key Points
Question
Do databases of exome sequences reliably correlate with the prevalence of individuals with defective DNA repair?
Findings
In this molecular epidemiologic study examining 3 large exome sequence databases totaling more than 200 000 alleles, unexpectedly high frequencies were found of 2 mutations associated with xeroderma pigmentosum in DNA repair genes (XPF [ERCC4] p.P379S, 0.4% and XPC p.P334H, 0.3%). These frequencies estimate the presence of more than 8000 people with xeroderma pigmentosum in the United States with these mutations, yet only 4 individuals were clinically identified in this study.
Meaning
Unsuspected mutations in known genes with a predisposition for skin cancer may be responsible for some of the high frequency of skin cancers in the general population.
Introduction
Xeroderma pigmentosum (XP) is a rare, autosomal recessive disorder associated with mutations in genes involved in the nucleotide excision repair pathway. Xeroderma pigmentosum can be caused by mutations in 7 nucleotide excision repair genes (XPA-XPG [XPA, OMIM 611153; XPB excision repair, complementing defective, in Chinese hamster, 3, ERCC3, OMIM 133510; XPC, OMIM 613208; XPD, excision repair, complementing defective, in Chinese hamster, 2, ERCC2, OMIM 126340; XPE, DNA damage-binding protein 2, DDB2, OMIM 600811; XPF, excision repair, complementing defective, in Chinese hamster, 4, ERCC4, OMIM 133520; XPG, excision repair, complementing defective, in Chinese hamster, 5, ERCC5, OMIM 133530]) and in the POLH gene (POLH XP variant, polymerase, DNA, ETA, OMIM 603968), a bypass polymerase, and is characterized by the inability to repair UV-induced and chemically induced DNA damage.1 Patients with XP are usually clinically identified owing to severe sun sensitivity after minimum sun exposure, freckle-like (lentiginous) pigmentation on sun-exposed skin before 2 years of age, and premature development of many skin cancers. Some patients (approximately 20% in the United States) also have early onset of progressive neurologic degeneration.1,2 Patients with XP younger than 20 years have a 10 000-fold increased risk of sunlight-induced skin cancer.2 The frequency of XP in western Europe is clinically estimated to be about 1 in 1 million individuals.3 Clinical diagnosis is often confirmed by DNA sequencing of the genes known to cause XP.
With the growing emphasis on using genomic sequencing to assess risk of disease, we investigated the frequency of XP using data from large genomic databases. To identify and analyze the prevalence of recessive germline mutations associated with XP, we mined 4 databases: the Human Gene Mutation Database4; the Genome Aggregation Database (gnomAD)5; an independent database from the National Institutes of Health (NIH), National Cancer Institute Division of Cancer Epidemiology and Genetics (DCEG)6,7,8; and an Inova Hospital Study database.9 We also examined our cohort of patients at the NIH with XP.2 We sought to compare the genetically estimated prevalence of disease, based on the frequency of known pathogenic mutations in heterozygous carriers from the general population, with the reported prevalence of the clinically observed and diagnosed XP phenotype. The clinical pathogenicity of the mutations associated with XP was confirmed by identifying patients in the literature who harbored these mutations, who received a diagnosis of clinical XP, and who underwent a laboratory assessment confirming reduced DNA repair.
We found high allele frequencies for several XP genetic variants. This finding suggests that there should be thousands of patients who are homozygous for missense or nonsense mutations in XPF, XPC, and other genes causing XP and should, consequently, express the clinical XP phenotype. It could also suggest that our understanding and interpretation of the variants associated with these human diseases is incorrect. Could these genetic variants be a risk factor in some patients with multiple skin cancers with subclinical features of XP?
Methods
From January 1, 2017, to May 4, 2018, we searched the Human Gene Mutation Database Professional4 and our cohort at the NIH2 to identify all germline mutations with a reported association with a clinically recognized XP phenotype. The patients at the NIH were examined under a protocol approved by the National Cancer Institute Institutional Review Board. All patients and/or parents provided written informed consent.
The Human Gene Mutation Database includes mutations from 7791 genes representing most of the published mutations responsible for human inherited diseases. The Human Gene Mutation Database does not include somatic or mitochondrial mutations. We focused exclusively on the missense and nonsense mutations to estimate the frequency of XP because those mutations cause changes in DNA that result in an amino acid change or a premature stop codon and are likely to be pathogenic. Also, a substantial portion of these mutations had allele frequencies listed in gnomAD.5 GnomAD provides allele frequencies of human genomic variants and mutations and includes 123 000 exome sequences from 8 different world subpopulations. We used gnomAD to represent the general population for our determinations.
We investigated the consistency of some of these allele frequencies by analyzing control data from a separate independent exome sequencing database from the National Cancer Institute DCEG database. Data on 998 controls from 2 cohort studies (the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial8 and the Cancer Prevention Study of the American Cancer Society6) and 1 case-control study (the Environment and Genes in Lung Cancer Etiology7) were available for the present study. In addition, we evaluated data from 681 healthy individuals in the Inova Hospital Study.9 Controls were cancer free at the time of enrollment. We used the Hardy-Weinberg equation10 (https://www.nature.com/scitable/definition/hardy-weinberg-equation-299) to make estimates for the prevalence of expected clinical disease: p² + 2pq + q² = 1, where p is the population frequency of the major allele and q is the population frequency of the minor allele. Thus, p + q = 1, and q² represents the frequency of individuals homozygous for the minor allele. Because XP is a recessive disorder requiring 2 pathogenic mutations, we focused on estimating the number of affected homozygotes (q²).
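As a minimal sketch of the Hardy-Weinberg step described above, the following Python function returns the 3 genotype frequencies for a given minor allele frequency q. The function name `hardy_weinberg` is ours, for illustration only, and the example value q = 0.004 is the rounded XPF p.P379S frequency quoted in the Abstract. The calculation assumes random mating and no selection, as the equation itself does.

```python
def hardy_weinberg(q):
    """Return (p^2, 2pq, q^2) for a minor allele frequency q (hypothetical helper)."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    p = 1.0 - q  # major allele frequency, since p + q = 1
    return p * p, 2.0 * p * q, q * q


# Rounded XPF p.P379S allele frequency from the Abstract (0.4%).
hom_major, carriers, hom_minor = hardy_weinberg(0.004)
print(f"heterozygous carriers (2pq): {carriers:.4%}")
print(f"homozygotes for the minor allele (q^2): {hom_minor:.2e}")
```

Even at an allele frequency of 0.4%, heterozygous carriers far outnumber homozygotes, which is why the analysis focuses on q² rather than on carrier counts.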
Results
Mutations Associated With XP
We identified a total of 156 missense and nonsense mutations associated with XP (Table 1; eTable 1 in the Supplement) in 8 XP genes (XPA-G and POLH) using the Human Gene Mutation Database4 and our patients at the NIH.2 The XPD (ERCC2) gene had the most mutations associated with XP (32 reported), while XPB (ERCC3) had the fewest (2 reported).
Table 1. Missense and Nonsense Mutations and Allele Frequencies Associated With Xeroderma Pigmentosum (XP) Disease.
Complementation Group (Genes) | No. of Different Mutationsᵃ | No. of Different Mutations With Frequencies Listed in gnomADᵇ | No. of Alleles Detected in gnomADᵇ | % of Alleles Detected in gnomAD (q)ᶜ | Estimated Frequency of Affected Individuals in gnomAD (q²)ᶜ |
---|---|---|---|---|---|
XPA (XPA) | 16 | 6 | 146 | 0.053 | 2.83 × 10⁻⁷ |
XPB (ERCC3) | 2 | 2 | 3 | 0.001 | 1.48 × 10⁻¹⁰ |
XPC (XPC) | 27 | 12 | 867 | 0.319 | 1.02 × 10⁻⁵ |
XPD (ERCC2) | 32 | 16 | 446 | 0.162 | 2.62 × 10⁻⁶ |
XPE (DDB2) | 10 | 0 | 0 | 0 | 0 |
XPF (ERCC4) | 12 | 6 | 1267 | 0.457 | 2.09 × 10⁻⁵ |
XPG (ERCC5) | 26 | 9 | 39 | 0.015 | 2.40 × 10⁻⁸ |
XP variant (POLH) | 31 | 14 | 330 | 0.120 | 1.44 × 10⁻⁶ |
Total | 156 | 65 | 3098 | 1.13 | 3.55 × 10⁻⁵ |
Abbreviation: gnomAD, Genome Aggregation Database.
ᵃ From the Human Gene Mutation Database and the National Institutes of Health cohort (eTable 1 in the Supplement).
ᵇ See eTable 2 in the Supplement.
ᶜ See Methods section.
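The last column of Table 1 can be approximated from the percentage column with a few lines of Python. This is a sketch under our own reading of the table, not a method stated in the source: each per-gene value is taken as the square of that gene's combined allele frequency, and the Total row appears to be close to the sum of the per-gene values rather than the square of the 1.13% total. Small differences from the printed values are expected because the percentages are rounded.

```python
# Percent of alleles detected in gnomAD (q, as a percentage), copied from Table 1.
q_percent = {
    "XPA": 0.053, "XPB": 0.001, "XPC": 0.319, "XPD": 0.162,
    "XPE": 0.0, "XPF": 0.457, "XPG": 0.015, "POLH": 0.120,
}

# q^2 per gene: frequency of individuals carrying a listed mutation
# on both alleles of the same gene (Hardy-Weinberg assumption).
q_squared = {gene: (pct / 100.0) ** 2 for gene, pct in q_percent.items()}
total_q_squared = sum(q_squared.values())

for gene, value in q_squared.items():
    print(f"{gene}: {value:.2e}")
print(f"sum of per-gene q^2: {total_q_squared:.2e}")
```

Summing within genes reflects the recessive model: 2 mutated alleles cause XP only when both fall in the same complementation group, so squaring the pooled frequency across all 8 genes would overstate the estimate.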
Mutations in Exome Databases Associated With XP
With more than 200 000 sequenced alleles,5 gnomAD provided data for 65 of the 156 mutations associated with XP (41.7%) (Table 1; eTable 2 in the Supplement), including allele count, allele frequency, and number of homozygotes harboring these genomic variations. Ninety-one of the 156 mutations (58.3%) associated with XP that were not listed in gnomAD were each presumably less frequent than 1 in 200 000. The 65 mutations associated with XP that were listed in gnomAD had a total allele frequency of 1.13% (Table 1). They included mutations in all of the known genes associated with XP except XPE (DDB2). The 2 XP genes with most frequent mutations were XPF (ERCC4), at 0.457%, and XPC, at 0.319%.
XPF (ERCC4) mutations account for only 1.6% of 189 published cases of XP.11 However, the XPF (ERCC4) gene contained 12 different mutations associated with XP. Seven of these mutations had data reported in gnomAD, including p.P379S and p.R799W, with an allele frequency of 4.05 × 10⁻³ for p.P379S and 4.48 × 10⁻⁴ for p.R799W (Table 2). In addition, gnomAD reported 4 people as homozygous for the XPF p.P379S mutation (Table 2; eTable 2 in the Supplement).
Table 2. Higher Frequency of XPF (ERCC4) and XPC Mutations in Genetic Databases Compared With Phenotypic XP Observed in the United States.
Complementation Group (Gene) | Mutation Associated With XP | Total No. of Alleles Sequenced | No. of Individuals | No. of Alleles | % of Alleles Reported in Database (q) | Estimated % of Homozygous Affected Individuals (q²)ᵃ | Total No. of Genetic Homozygotes Reported in Database | Total No. of Genetic Homozygotes Estimated in Databaseᵃ | rs No.ᵇ |
---|---|---|---|---|---|---|---|---|---|
gnomAD | |||||||||
XPF (ERCC4) | p.P379S | 276 560 | 138 280 | 1122 | 0.41 | 1.65 × 10⁻⁵ | 4 | 2.28 | rs1799802 |
XPF (ERCC4) | p.R799W | 277 034 | 138 517 | 124 | 0.04 | 2.00 × 10⁻⁷ | 0 | 0.03 | rs121913049 |
XPC | p.P334H | 274 914 | 137 457 | 838 | 0.30 | 9.29 × 10⁻⁶ | 7 | 1.28 | rs74737358 |
DCEG database (healthy controls) | |||||||||
XPF (ERCC4) | p.P379S | 1996 | 998 | 10 | 0.50 | 2.51 × 10⁻⁵ | 0 | 0.03 | rs1799802 |
XPF (ERCC4) | p.R799W | 1996 | 998 | 1 | 0.05 | 2.51 × 10⁻⁷ | 0 | 0 | rs121913049 |
Inova Hospital Study database (healthy controls) | |||||||||
XPF (ERCC4) | p.P379S | 1362 | 681 | 7ᶜ | 0.51 | 2.60 × 10⁻⁵ | 0 | 0.02 | rs1799802 |
XPF (ERCC4) | p.R799W | 1362 | 681 | 1 | 0.07 | 4.90 × 10⁻⁷ | 0 | 0 | rs121913049 |
XPC | p.P334H | 1362 | 681 | 8 | 0.59 | 3.48 × 10⁻⁵ | 0 | 0.02 | rs74737358 |
US estimations | |||||||||
XPF (ERCC4) | p.P379S | NA | 323 100 000ᵈ | NA | 0.41ᵉ | 1.65 × 10⁻⁵ | 3ᶠ | 5298ᵍ | rs1799802 |
XPF (ERCC4) | p.R799W | NA | 323 100 000ᵈ | NA | 0.04ᵉ | 2.00 × 10⁻⁷ | 11ᶠ | 66ᵍ | rs121913049 |
XPC | p.P334H | NA | 323 100 000ᵈ | NA | 0.30ᵉ | 9.29 × 10⁻⁶ | 1ᶠ | 3002ᵍ | rs74737358 |
65 XP mutationsʰ | | NA | NA | NA | 1.13ᵉ | 2.81 × 10⁻⁵ | 300 (US only)ᶠ | 9007ᵍ | NA |
Abbreviations: DCEG, Division of Cancer Epidemiology and Genetics; gnomAD, Genome Aggregation Database; NA, not available; XP, xeroderma pigmentosum.
ᵃ Calculated using Hardy-Weinberg equation (see Methods section).
ᵇ From the Single Nucleotide Polymorphism database (https://www.ncbi.nlm.nih.gov/projects/SNP/).
ᶜ Calculated from the total number of alleles sequenced and the allele frequencies reported in Inova Hospital Study database.
ᵈ In US population.
ᵉ In gnomAD.
ᶠ Phenotypic patients with XP reported in the world literature who are homozygous or heterozygous for indicated mutation.
ᵍ Total number of genetic homozygous patients with XP predicted in United States based on gnomAD.
ʰ Mutations for XP in gnomAD (Table 1; eTable 2 in the Supplement).
We analyzed XPF (ERCC4) p.P379S and p.R799W in 2 independent databases. The frequencies of these mutations in the National Cancer Institute DCEG control population of 998 clinically healthy individuals were 5.0 × 10⁻³ for p.P379S and 5.0 × 10⁻⁴ for p.R799W, which were comparable to the frequencies found in the much larger gnomAD (Table 2). Similarly, the frequencies in the Inova Hospital Study database were 5.1 × 10⁻³ for p.P379S and 7.0 × 10⁻⁴ for p.R799W. No homozygotes were reported in these smaller cohorts.
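The cross-database comparison above rests on a simple ratio: alleles carrying the mutation divided by alleles sequenced. The sketch below recomputes those ratios from the counts in Table 2; the dictionary layout and the helper name `allele_frequency` are illustrative choices of ours, not part of the study.

```python
# (database, mutation) -> (alleles carrying the mutation, total alleles sequenced),
# copied from the XPF (ERCC4) rows of Table 2.
counts = {
    ("gnomAD", "p.P379S"): (1122, 276560),
    ("DCEG", "p.P379S"): (10, 1996),
    ("Inova", "p.P379S"): (7, 1362),
    ("gnomAD", "p.R799W"): (124, 277034),
    ("DCEG", "p.R799W"): (1, 1996),
    ("Inova", "p.R799W"): (1, 1362),
}


def allele_frequency(n_mutant, n_total):
    """Fraction of sequenced alleles that carry the mutation."""
    return n_mutant / n_total


frequencies = {key: allele_frequency(*pair) for key, pair in counts.items()}
for (database, mutation), freq in frequencies.items():
    print(f"{database:7s} XPF {mutation}: {freq:.2e}")
```

In the 2 smaller cohorts the p.R799W frequency rests on a single observed allele, so those estimates are far less precise than the gnomAD value.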
One of the XPC mutations, p.P334H, had a frequency of 3.05 × 10⁻³ in gnomAD and 5.9 × 10⁻³ in the Inova Hospital Study database (Table 2; eTable 2 in the Supplement). This mutation was identified predominantly in individuals of African descent. All 7 individuals reported as homozygous for this mutation in gnomAD were of African ancestry.
Before estimating the expected number of patients, we tested the Hardy-Weinberg equation using the data reported in gnomAD and the other databases to estimate the number of homozygotes expected to be in each database. We estimated 2.28 XPF p.P379S homozygotes would be present in gnomAD, 0.03 XPF p.P379S homozygotes would be present in the DCEG database, and 0.02 XPF p.P379S homozygotes would be present in the Inova Hospital Study database. The actual reported number of homozygotes was 4 in gnomAD, 0 in the DCEG database, and 0 in the Inova Hospital Study database (Table 2). The comparability of the number of estimated homozygotes and observed homozygotes confirmed the reasonableness of using the Hardy-Weinberg equation to estimate the number of homozygotes present using the US population.
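The consistency check described above can be reproduced with a short calculation: the expected number of homozygotes in a database is q² multiplied by the number of individuals sequenced. The following sketch uses the XPF p.P379S counts from Table 2; the helper name `expected_homozygotes` is hypothetical, and the number of individuals is taken as half the number of alleles sequenced.

```python
# XPF p.P379S rows of Table 2:
# database -> (alleles with mutation, alleles sequenced, homozygotes reported)
rows = {
    "gnomAD": (1122, 276560, 4),
    "DCEG": (10, 1996, 0),
    "Inova": (7, 1362, 0),
}


def expected_homozygotes(n_mutant, n_alleles):
    """Hardy-Weinberg expectation: q^2 times the number of individuals."""
    q = n_mutant / n_alleles
    n_individuals = n_alleles / 2  # 2 alleles per person
    return q * q * n_individuals


expected = {db: expected_homozygotes(m, n) for db, (m, n, _) in rows.items()}
for db, (_, _, observed) in rows.items():
    print(f"{db}: expected {expected[db]:.2f}, reported {observed}")
```

With expectations this small, a count of 0 in the 2 control cohorts is the most likely outcome, so only gnomAD offers a meaningful comparison.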
Estimations of XP Mutation Prevalence in the US Population
We used the allele frequencies of XPF p.P379S and the reported population of the United States (3.23 × 10⁸) (http://www.worldometers.info/) to estimate the number of genetically homozygous patients expected to express the XP phenotype. We estimated that there should be approximately 5300 patients who are homozygous for the XPF p.P379S mutation in the United States (Table 2). In contrast, only 3 patients who were homozygous or compound heterozygous for the XPF p.P379S mutation were identified as affected in the world literature (Table 3).12,13,16 Using a similar approach, Fassihi et al12 noted that this XPF p.P379S mutation was present at a frequency of 0.3% in the Single Nucleotide Polymorphism database. They state, “this estimate implies that there might be a significant cohort of UVR-sensitive individuals not recognized as having XP, but homozygous for this mutation.”12(pE1242) We found a similar disparity for the second most common XP-associated mutation, XPC p.P334H, for which about 3000 people in the United States were estimated to be homozygous. Only 1 patient with this mutation was reported in the literature (Table 3).19,20,21,22
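The population-scale estimates in this section follow from multiplying q² by the US population. The sketch below repeats that arithmetic with the gnomAD allele frequencies quoted in the Results and the population figure of 323 100 000 from Table 2; variable names are ours. Because the inputs are rounded, the results land near, but not exactly on, the 5298 and 3002 printed in Table 2.

```python
US_POPULATION = 323_100_000  # population figure used in Table 2

# gnomAD allele frequencies (q) quoted in the Results.
gnomad_q = {
    "XPF p.P379S": 4.05e-3,
    "XPC p.P334H": 3.05e-3,
}

# Expected homozygotes = q^2 x population (Hardy-Weinberg assumption).
expected_us = {name: q ** 2 * US_POPULATION for name, q in gnomad_q.items()}

for name, n in expected_us.items():
    print(f"{name}: about {n:.0f} expected homozygotes in the United States")
```

This treats the gnomAD frequency as representative of the whole US population, which is a simplification given that XPC p.P334H was found predominantly in individuals of African descent.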
Table 3. Clinical Phenotypes of Patients With Most Common XPF (ERCC4) and XPC Mutations.
Patient No./Sex/Age, y | Gene | Mutation 1 | Mutation 2 | Sun Sensitivity | Abnormal Freckling | Neurologic Degeneration | Skin Cancer | DNA Repair Defect | DNA Repair Assay | Source(s) |
---|---|---|---|---|---|---|---|---|---|---|
XP72BR/M/18 | XPF (ERCC4) | p.P379S | Homozygous | Yes | Few lentigines | No | No | Yes | UDS, 35% | 12 |
XP32BR/M/23 | XPF (ERCC4) | p.P379S | p.R589W | Yes | No | No | No | Yes | UDS, 18% | 12 |
XP7NE/M/28ᵃ | XPF (ERCC4) | p.P379S | Silent | Yes | Yes | NA | No | Yes | UDS, 30% | 13 |
XP126LO/F/22 | XPF (ERCC4) | p.R799W | p.T770Pfs*46 | Yes | Yes | No | No | Yes | UDS, 13% | 14, 15 |
XP24BR/F/48 | XPF (ERCC4) | p.R799W | p.R589W | Yes | Few lentigines | Yes (adult onset) | No | Yes | UDS, 5% | 12, 16 |
XP24KY/Unknown/50 | XPF (ERCC4) | p.R799W | 537fs+7bp | Unknown | Unknown | Yes (adult onset) | Unknown | Yes | UDS, 7% | 13 |
XP48DC/F/51 (deceased) | XPF (ERCC4) | p.R799W | p.F196Qfs*20 | Yes | Yes | Yes (began at 24 y) | Yes (BCC) | Yes | HCR, reduced | 17 |
C014TA (sibling)/F/51 | XPF (ERCC4) | p.R799W | p.S459X | Yes | Yes | Yes (began at 30 y) | No | Yes | HCR, reduced | 17 |
CO107TA (sibling)/M/51 (deceased) | XPF (ERCC4) | p.R799W | p.S459X | Yes | Unknown | Yes (began at 42 y) | No | Yes | HCR, reduced | 17 |
UDP-7356/F/54 (deceased) | XPF (ERCC4) | p.R799W | p.S459X | Yes | Yes | Yes (chorea and ataxia at 34 y) | Yes (>20 BCC began at 16 y) | Yes | UDS, 21% | 18 |
UDP-3675/F/60 | XPF (ERCC4) | p.R799W | p.R589W | Yes | Yes | Yes (cognitive impairment at 46 y) | No | Yes | UDS, 26% | 18 |
XP42RO/M/62 | XPF (ERCC4) | p.R799W | Homozygous | Yes | Yes | Yes (began at 47 y) | Yes (9 at 27 y) | Yes | UDS, 20% | 13, 14 |
XP62RO/Unknown | XPF (ERCC4) | p.R799W | Homozygous | Unknown | Unknown | Yes (late onset?) | Unknown | Yes | UDS, 20% | 13 |
XP26BR/F/41ᵃ | XPF (ERCC4) | p.R799W | L291X | Yes | Yes | Mild | No | Yes | UDS, 15% | 13 |
XP1MI/F/17 (deceased) | XPC | p.P334H | Homozygous | Yes | Yes | No | Yes | Yes | UDS, 29% | 19-22 |
Abbreviations: BCC, basal cell carcinoma; HCR, plasmid host cell reactivation after UV damage; NA, not available; UDS, unscheduled DNA synthesis after UV damage.
ᵃ A. Lehmann, PhD, C. Arlett, PhD, H. Fassihi, MD, written communication, September 2018.
Phenotypes of People With XP Mutations
The 15 patients with XP reported in the literature12,13,14,15,16,17,18,19,20,21,22 with at least 1 of 3 relatively frequent mutations associated with XP, XPF p.P379S, XPC p.P334H, and XPF p.R799W, were confirmed as receiving a clinical diagnosis of XP and having reduced DNA repair (Table 3). Their clinical features varied. Thirteen of these patients were reported to have marked sun sensitivity, 11 had abnormal freckling (lentigines), and 4 had skin cancer. Of the 3 patients with XP with the XPF p.P379S mutation, an 18-year-old man (patient XP72BR) was homozygous for p.P379S. He had several episodes of severe sunburn as a child. He had few lentigines, no skin cancer, and no neurologic or ocular abnormalities. His cells had decreased DNA repair (unscheduled DNA synthesis after UV damage [UDS], 35% of normal).12 A 23-year-old man (patient XP32BR) was reported as being a compound heterozygote with XPF p.P379S and p.R589W. Since infancy, he had severe sunburns with minimal sun exposure, resulting in a diagnosis of XP at 7 years of age. He had a pale complexion and no neurologic abnormalities. His cells had decreased DNA repair (UDS, 18% of normal).12 The third patient (patient XP7NE) was reported as harboring XPF p.P379S and a silent allele.13,16 This patient was a 28-year-old man with decreased DNA repair (UDS, 30% of normal), a history of severe sunburns, numerous large freckles on the back of his hands and shoulders, and no skin cancer16 (A. Lehmann, PhD, C. Arlett, PhD, and H. Fassihi, MD, written communication, September 2018).
XPC p.P334H was the second most common mutation associated with XP. However, there was only 1 patient (patient XP1MI) reported in the literature with this mutation.19,20,21,22 She was homozygous for the mutation, born with low birth weight and microcephaly and had reduced DNA repair (UDS, 13% of normal). This patient (with cell lines CRL 1333 and GM02096) was reported as a 17-year-old African American girl who had both XP and systemic lupus erythematosus.19 She was examined at the NIH at 18 years of age, where the diagnoses of both XP and systemic lupus erythematosus were confirmed. She had abnormalities of her skin typical for XP (hypopigmentation and hyperpigmentation), multiple skin cancers, and severe photodamage to her eyes. Her IQ was previously reported as being in the range of persons with intellectual disability (mental retardation), and it was noted that her IQ may have been affected by her limited vision and other conditions. The diagnosis of systemic lupus erythematosus was confirmed, with several swollen joints making it difficult for her to write and walk and with positive blood test results for lupus anticoagulant and antinuclear antibody. This patient died at 37 years of age.
Two of the 11 patients with XP and the XPF p.R799W mutation were reported as homozygous, and 9 were compound heterozygotes.12,13,14,15,16,17,18 All had reduced DNA repair. Three of these patients had skin cancer, and 9 had adult-onset severe neurodegeneration (Table 3).
We also found 6 other mutations associated with XP with frequencies greater than 10⁻⁴ in gnomAD (eTable 2 in the Supplement). Two mutations were in the XPA gene (p.R228X,12 with a frequency of 3.44 × 10⁻⁴, and p.H244R,23 with a frequency of 1.38 × 10⁻⁴), 2 in the XPD (ERCC2) gene (p.R616P,12 with a frequency of 1.37 × 10⁻⁴ in gnomAD and a frequency of 5.0 × 10⁻⁴ in the DCEG database, and p.L461V plus p.V716-R730del,12,24,25 with a frequency of 1.13 × 10⁻³ in gnomAD and a frequency of 2.9 × 10⁻³ in the Inova Hospital Study database), and 2 in the POLH gene (for XP variant) (p.K535E,26 with a frequency of 8.43 × 10⁻⁴ in gnomAD and a frequency of 5.0 × 10⁻⁴ in the DCEG database, and p.T692A,27 with a frequency of 1.52 × 10⁻⁴ in gnomAD). These 6 mutations plus the 2 mutations in XPF (ERCC4) and the 1 mutation in the XPC gene have a combined population frequency of about 1% and account for 93% of the frequency of the 65 variant alleles associated with XP in gnomAD (eTable 2 in the Supplement).
Discussion
Extensive work has focused on the “genotype-phenotype correlation,” which usually has meant assessing phenotype and then looking for an underlying genotype. Much of our understanding of the association between clinical disease and underlying genetics has been based on this approach, and it has resulted in an assumption that certain mutations equate with specific clinical manifestations. However, to our knowledge to date, there are few data on the reverse approach: understanding the spectrum of clinical manifestations of variants and how they are associated with penetrance and expressivity, especially when those variants may cause subclinical manifestations of disease. Large genomic databases now allow for the estimation of the frequency of variant alleles. Measuring genotypes in a population and then looking for expected phenotypes have resulted in a surprising discordance: an absence of patients expected to be affected. This finding remains true even with the extensive data processing, variant calling, quality control, and filtering performed on the data in gnomAD.5
Our investigation provided unexpected information. The population estimations based on the allele frequencies of the 9 most frequent mutations associated with XP suggest that, based on prevalence of genotype, there should be substantially more cases of XP than have been identified clinically. One of the least common genes reported to cause XP, XPF (ERCC4), had mutations associated with XP with notably high frequencies in 3 independent databases (2 with verified normal populations). The frequency of the XPF p.P379S mutation in gnomAD suggests that there should be about 5300 patients in the United States who are homozygous for this mutation (Table 2). These numbers were estimated using the US population of 323 million. This estimate contrasts sharply with the only 3 patients identified in the literature as harboring at least 1 XPF (ERCC4) p.P379S mutation (Table 3). Similarly, the XPC p.P334H mutation frequency in gnomAD would estimate that about 3000 individuals in the United States are homozygous for this mutation, yet only 1 individual was reported in the literature (Table 2). The combined frequency of the 65 variants associated with XP listed in gnomAD is 1.13% (Table 1; eTable 2 in the Supplement). This frequency would imply that more than 9000 individuals are homozygous for these variants in the United States (Table 2). With the high frequencies of skin cancer in the general population, homozygous carriers of these mutations may represent a subgroup of individuals with a mild or subclinical XP phenotype. If these patients could be identified early, strict sun avoidance and surveillance measures could reduce the morbidity and mortality rates of skin cancer.
These reported clinical and laboratory observations demonstrated the clinically associated pathogenicity of these mutations. However, the clinical features of patients with these reported mutations challenge our approach to recognizing the clinical profile of XP, which currently is strongly associated with sun sensitivity and a 10 000-fold increased risk of skin cancer. Although the 14 patients with XPF (ERCC4) mutations ranged in age from 18 to 62 years, only 3 of them received a diagnosis of skin cancer. Neurologic degeneration in patients with XP with mutations in the XPA or XPD (ERCC2) genes typically has an onset in the first decade of life. In contrast, 9 of 11 patients harboring the XPF (ERCC4) p.R799W mutation had a different phenotype, with adult-onset neurologic degeneration with onset as late as 47 years of age (Table 3). Xeroderma pigmentosum should be considered in patients with sun sensitivity, premature development of many skin cancers (especially of more than 1 type), freckle-like (lentiginous) pigmentary abnormalities or poikilodermatous skin changes, and/or undiagnosed neurologic degeneration.1
Limitations
There is a marked discrepancy between the data in these genomic databases and the reported clinical prevalence of recessive XP. It is possible that we are not identifying some homozygous patients owing to a mild or varied phenotype, incomplete penetrance of mutated alleles, modifier genes, fetal loss, or late onset of clinical manifestations such as neurodegeneration. The penetrance of XP-type mutations may depend on many factors, including skin type, the rate at which tumorigenic mutations (eg, TP53) accumulate, possible suppressor mutations, compensating levels of gene expression, and levels of patients’ sun exposure. Unfortunately, the DNA samples in gnomAD were collected in a manner that does not permit obtaining clinical information on donors with variations of interest. Mutations in the XPF endonuclease have been reported to result in diverse clinical manifestations, including Cockayne syndrome and Fanconi anemia in addition to XP.28 These mutations might also cause an unidentified clinical phenotype with minimal skin involvement. In addition, we may have underestimated the number of mutations causing XP since we considered only missense and nonsense mutations in a single database of human mutations. Furthermore, we were not able to independently verify the loss of function for many of the mutations listed in the database.
Our understanding and interpretation of the variant(s) may be incorrect. It is possible that variants in other genes modify the phenotype so that the usual features of XP are not apparent. Studies of some dominantly inherited diseases have made similar observations. For autosomal-dominant Li-Fraumeni syndrome, Rana et al29 found different clinical features in TP53-positive patients who were ascertained by multigene panel testing compared with single-gene panel testing. Furthermore, a higher than expected population prevalence of potentially pathogenic germline TP53 variants in individuals unselected for cancer history was reported by de Andrade et al.30 Similarly, Kim et al31 reported that likely pathogenic variants of the DICER1 gene (associated with the dominant DICER1 cancer predisposition syndrome) are much more prevalent in the large Exome Aggregation Consortium database using Meta Support Vector Machine, Rare Exome Variant Ensemble Learner, and Combined Annotation Dependent Depletion than by considering only loss of function and previously published pathogenic variants.
Conclusions
It is important to follow up with patients who might possibly have genomic abnormalities to determine whether they will develop any clinical manifestations and what those manifestations might be.32 A new NIH precision medicine initiative study (All of Us [https://allofus.nih.gov/]) has a goal of obtaining clinical information as well as DNA sequences from 1 million volunteers of diverse genetic backgrounds and with differences in lifestyle, environment, and biology. We should approach current large genomic databases with caution when trying to estimate the clinical implications of genetic variants to determine the prevalence of disease risk. This caution is also relevant with prenatal testing and identifying disease-associated mutations that may never be expressed, even with recessive disorders.
References
- 1. DiGiovanna JJ, Kraemer KH. Shining a light on xeroderma pigmentosum. J Invest Dermatol. 2012;132(3, pt 2):785-796. doi: 10.1038/jid.2011.426
- 2. Bradford PT, Goldstein AM, Tamura D, et al. Cancer and neurologic degeneration in xeroderma pigmentosum: long term follow-up characterises the role of DNA repair. J Med Genet. 2011;48(3):168-176. doi: 10.1136/jmg.2010.083022
- 3. Kleijer WJ, Laugel V, Berneburg M, et al. Incidence of DNA repair deficiency disorders in western Europe: xeroderma pigmentosum, Cockayne syndrome and trichothiodystrophy. DNA Repair (Amst). 2008;7(5):744-750. doi: 10.1016/j.dnarep.2008.01.014
- 4. Stenson PD, Mort M, Ball EV, et al. The Human Gene Mutation Database: towards a comprehensive repository of inherited mutation data for medical research, genetic diagnosis and next-generation sequencing studies. Hum Genet. 2017;136(6):665-677. doi: 10.1007/s00439-017-1779-6
- 5. Lek M, Karczewski KJ, Minikel EV, et al; Exome Aggregation Consortium. Analysis of protein-coding genetic variation in 60,706 humans. Nature. 2016;536(7616):285-291. doi: 10.1038/nature19057
- 6. Calle EE, Rodriguez C, Jacobs EJ, et al. The American Cancer Society Cancer Prevention Study II Nutrition Cohort: rationale, study design, and baseline characteristics. Cancer. 2002;94(9):2490-2501. doi: 10.1002/cncr.101970
- 7. Landi MT, Consonni D, Rotunno M, et al. Environment And Genetics in Lung cancer Etiology (EAGLE) study: an integrative population-based case-control study of lung cancer. BMC Public Health. 2008;8:203. doi: 10.1186/1471-2458-8-203
- 8. Prorok PC, Andriole GL, Bresalier RS, et al; Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial Project Team. Design of the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial. Control Clin Trials. 2000;21(6)(suppl):273S-309S. doi: 10.1016/S0197-2456(00)00098-2
- 9. Bodian DL, McCutcheon JN, Kothiyal P, et al. Germline variation in cancer-susceptibility genes in a healthy, ancestrally diverse cohort: implications for individual genome sequencing. PLoS One. 2014;9(4):e94554. doi: 10.1371/journal.pone.0094554
- 10. Scitable. Hardy-Weinberg equation. https://www.nature.com/scitable/definition/hardy-weinberg-equation-299. Accessed October 28, 2018.
- 11. Kraemer KH, Lee MM, Scotto J. Xeroderma pigmentosum: cutaneous, ocular, and neurologic abnormalities in 830 published cases. Arch Dermatol. 1987;123(2):241-250. doi: 10.1001/archderm.1987.01660260111026
- 12. Fassihi H, Sethi M, Fawcett H, et al. Deep phenotyping of 89 xeroderma pigmentosum patients reveals unexpected heterogeneity dependent on the precise molecular defect. Proc Natl Acad Sci U S A. 2016;113(9):E1236-E1245. doi: 10.1073/pnas.1519444113
- 13. Ahmad A, Enzlin JH, Bhagwat NR, et al. Mislocalization of XPF-ERCC1 nuclease contributes to reduced DNA repair in XP-F patients. PLoS Genet. 2010;6(3):e1000871. doi: 10.1371/journal.pgen.1000871
- 14. Sijbers AM, van Voorst Vader PC, Snoek JW, Raams A, Jaspers NG, Kleijer WJ. Homozygous R788W point mutation in the XPF gene of a patient with xeroderma pigmentosum and late-onset neurologic disease. J Invest Dermatol. 1998;110(5):832-836. doi: 10.1046/j.1523-1747.1998.00171.x
- 15. Norris PG, Hawk JL, Avery JA, Giannelli F. Xeroderma pigmentosum complementation group F in a non-Japanese patient. J Am Acad Dermatol. 1988;18(5, pt 2):1185-1188. doi: 10.1016/S0190-9622(88)70121-8
- 16. Berneburg M, Clingen PH, Harcourt SA, et al. The cancer-free phenotype in trichothiodystrophy is unrelated to its repair defect. Cancer Res. 2000;60(2):431-438.
- 17. Imoto K, Boyle J, Oh KS, et al. Patients with defects in the interacting nucleotide excision repair proteins ERCC1 or XPF show xeroderma pigmentosum with late onset severe neurological degeneration. J Invest Dermatol. 2007;127(suppl 1):S92.
- 18. Shanbhag NM, Geschwind MD, DiGiovanna JJ, et al. Neurodegeneration as the presenting symptom in 2 adults with xeroderma pigmentosum complementation group F. Neurol Genet. 2018;4(3):e240. doi: 10.1212/NXG.0000000000000240
- 19. Hananian J, Cleaver JE. Xeroderma pigmentosum exhibiting neurological disorders and systemic lupus erythematosus. Clin Genet. 1980;17(1):39-45. doi: 10.1111/j.1399-0004.1980.tb00112.x
- 20. Li L, Bales ES, Peterson CA, Legerski RJ. Characterization of molecular defects in xeroderma pigmentosum group C. Nat Genet. 1993;5(4):413-417. doi: 10.1038/ng1293-413
- 21. Robbins JH. Significance of repair of human DNA: evidence from studies of xeroderma pigmentosum. J Natl Cancer Inst. 1978;61(3):645-656.
- 22. Bernardes de Jesus BM, Bjørås M, Coin F, Egly JM. Dissection of the molecular defects caused by pathogenic mutations in the DNA repair factor XPC. Mol Cell Biol. 2008;28(23):7225-7235. doi: 10.1128/MCB.00781-08
- 23. Satokata I, Tanaka K, Yuba S, Okada Y. Identification of splicing mutations of the last nucleotides of exons, a nonsense mutation, and a missense mutation of the XPAC gene as causes of group A xeroderma pigmentosum. Mutat Res. 1992;273(2):203-212. doi: 10.1016/0921-8777(92)90081-D
- 24. Zhou X, Khan SG, Tamura D, et al. Abnormal XPD-induced nuclear receptor transactivation in DNA repair disorders: trichothiodystrophy and xeroderma pigmentosum. Eur J Hum Genet. 2013;21(8):831-837. doi: 10.1038/ejhg.2012.246
- 25. Takayama K, Danks DM, Salazar EP, Cleaver JE, Weber CA. DNA repair characteristics and mutations in the ERCC2 DNA repair and transcription gene in a trichothiodystrophy patient. Hum Mutat. 1997;9(6):519-525.
- 26. Itoh T, Linn S, Kamide R, et al. Xeroderma pigmentosum variant heterozygotes show reduced levels of recovery of replicative DNA synthesis in the presence of caffeine after ultraviolet irradiation. J Invest Dermatol. 2000;115(6):981-985. doi: 10.1046/j.1523-1747.2000.00154.x
- 27. Opletalova K, Bourillon A, Yang W, et al. Correlation of phenotype/genotype in a cohort of 23 xeroderma pigmentosum–variant patients reveals 12 new disease-causing POLH mutations. Hum Mutat. 2014;35(1):117-128. doi: 10.1002/humu.22462
- 28. Kashiyama K, Nakazawa Y, Pilz DT, et al. Malfunction of nuclease ERCC1-XPF results in diverse clinical manifestations and causes Cockayne syndrome, xeroderma pigmentosum, and Fanconi anemia. Am J Hum Genet. 2013;92(5):807-819. doi: 10.1016/j.ajhg.2013.04.007
- 29. Rana HQ, Gelman R, LaDuca H, et al. Differences in TP53 mutation carrier phenotypes emerge from panel-based testing. J Natl Cancer Inst. 2018;110(8):863-870. doi: 10.1093/jnci/djy001
- 30. de Andrade KC, Mirabello L, Stewart DR, et al. Higher-than-expected population prevalence of potentially pathogenic germline TP53 variants in individuals unselected for cancer history. Hum Mutat. 2017;38(12):1723-1730. doi: 10.1002/humu.23320
- 31. Kim J, Field A, Schultz KAP, Hill DA, Stewart DR. The prevalence of DICER1 pathogenic variation in population databases. Int J Cancer. 2017;141(10):2030-2036. doi: 10.1002/ijc.30907
- 32. Biesecker LG, Mullikin JC, Facio FM, et al; NISC Comparative Sequencing Program. The ClinSeq Project: piloting large-scale genome sequencing for research in genomic medicine. Genome Res. 2009;19(9):1665-1674. doi: 10.1101/gr.092841.109