Abstract
Purpose:
Existing resources that characterise the essentiality status of genes are based on proliferation assessment in human cell lines, viability evaluation in mouse knockouts, or constraint metrics derived from human population sequencing studies. Several repositories document phenotypic annotations for rare disorders; however, there is a lack of comprehensive reporting on lethal phenotypes.
Methods:
We queried Online Mendelian Inheritance in Man for terms related to lethality and classified all Mendelian genes according to the earliest age of death recorded for the associated disorders, from prenatal death to no reports of premature death. We characterised the genes across these lethality categories, examined the evidence on viability from mouse models and explored how this information could be used for novel gene discovery.
Results:
We developed the Lethal Phenotypes Portal to showcase this curated catalogue of human essential genes. Differences in the mode of inheritance, physiological systems affected and disease class were found for genes in different lethality categories as well as discrepancies between the lethal phenotypes observed in mouse and human.
Conclusion:
We anticipate that this resource will aid clinicians in the diagnosis of early lethal conditions and assist researchers in investigating the properties that make these genes essential for human development.
Keywords: Mendelian disorders, Lethal phenotypes, Essential genes, Novel gene discovery, Lethal mouse knockouts
Introduction
Defining essentiality and sources of evidence
Essential genes are defined as those required for growth, proliferation, and survival of a cell or an organism. The classification of a gene as essential may vary depending on the level of organisation being considered, the species, the exact definition or thresholds used, e.g. a quantitative gene effect score in different cell lineages. Cellular essential genes are those required for cell proliferation while lethal genes in the mouse are defined by the International Mouse Phenotyping Consortium (IMPC) as those where homozygous knockouts die during embryonic development or soon after birth, during the pre-weaning stage 1. Similarly, in humans, lethality can be investigated prenatally, which provides information on the essential nature of genes for early organism development 2,3. From an evolutionary perspective, essential genes could be defined as those required for growth to a fertile adult, i.e. affected individuals die before reproductive age in the absence of treatment, or even genes leading to physical and intellectual phenotypes that impede reproductive success 4. The complete loss-of-function (LoF) of a gene may lead to a wide spectrum of phenotypic abnormalities ranging from: clinical infertility due to embryonic loss at the earliest stages of development, i.e. embryonic lethality before a pregnancy is clinically recognised; prenatal lethality; early-onset neurodevelopmental, metabolic and skeletal disorders that may result in neonatal, infant or childhood death; disorders associated with premature death; abnormal phenotypes that may not impact life expectancy; or even no detectable clinical phenotypes.
Moreover, gene essentiality is not an absolute or binary trait. Even when using exactly the same definition, and considering the same organism and organisational level, gene essentiality may be context or tissue specific (the gene is essential only in certain cell types) or genetic background specific, e.g. discrepancies in mouse viability have been found for up to 10% of genes when knocked out 5. Further, lethality can manifest with incomplete penetrance in the presence of the same genetic background and null allele 1. Traditionally, essentiality has been evaluated in homozygous knockouts; however, there are haplo-essential genes that cannot tolerate a LoF mutation in one or both alleles 6,7. This set of genes is not as well characterised in mouse knockouts, and the comprehensive approach of the IMPC presents an opportunity to explore this more extensively.
The current evidence on essential genes in humans comes from various sources providing insights into the phenotypic impact of a gene’s LoF at distinct levels, as recently reviewed 8 and illustrated in Figure 1. These include: i) Genes essential for cell proliferation in human cancer cell lines and human pluripotent stem cells (hPSCs) 9–11, factoring in the challenge of defining a ‘core’ set of essential genes 12; ii) Intolerance to variation metrics derived from large scale human population sequencing programmes and machine learning approaches 6,13,14. Notably, these scores provide a measure of how intolerant to (heterozygous) LoF a gene is and how likely it is to underlie single-gene disorders, but they do not inform on the nature or severity of the phenotype. The presence or deficit of homozygous protein-altering variants can also help to understand gene constraint and potential phenotypic impact 15,16; iii) Resources compiling clinical reports on single-gene disorders that enable users to conduct queries based on phenotypic criteria, such as records of early death, including the Online Mendelian Inheritance in Man 17 (OMIM), the Human Phenotype Ontology 18 (HPO) and the Monarch Initiative 19 repositories; iv) Human orthologues of essential genes in different unicellular and multicellular organisms, especially genes that are essential for mammalian organism development 7,20–22.
How can knowledge of gene essentiality inform human disease studies?
Around 20–25% of the protein coding genome is recognised to be associated with single-gene disorders 17. Recent estimates suggest the final number of Mendelian genes will be 1.5–3 times higher 23. Many patients remain without a molecular diagnosis after exome or genome sequencing, and potentially pathogenic variants in genes with no proven involvement in the condition could explain a fraction of these cases 24. Among the strategies to identify novel Mendelian genes, mouse knockout databases constitute a source of candidate genes linking phenotypic outcomes, including prenatal lethality, to LoF variation 25.
Mouse knockout lines with viability information are now available through the IMPC resource for up to one third of the protein coding genome, and Mendelian disease genes are significantly enriched for lethality in the mouse 1,5,26. However, the proportion of lethal genes is not evenly distributed across disease categories. A higher number of affected physiological systems has been found in disorders associated with mouse lethal genes 5,27,28 (Figure 2).
When we break down the set of mouse lethal genes into more granular categories, we observe that the sets of developmental lethal or late gestation lethal genes are driving the enrichment in human disease genes 5,29. Postnatal, non-lethal phenotypes in humans may occur as the result of haploinsufficiency or hypomorphic variants that lead to reduced protein function 27. Even when humans do not exhibit the extreme phenotype of lethality associated with mouse null alleles, the mouse embryonic manifestations and postnatal abnormalities in the early adult heterozygous knockout contribute to facilitating the interpretation of variants identified in humans with different molecular consequences, variable penetrance and/or expressivity and to understanding disease mechanisms 30. It is worth noting that most of the homozygous lethal mouse knockouts present a viable but abnormal phenotype in the heterozygous state (1,497/1,891, 79%; IMPC DR20.1). The actual percentage may be even higher as not all the lethal lines have completed the planned phenotyping screen for the corresponding heterozygous knockout, or if additional phenotypes were explored. Overall, the set of genes that are lethal in the mouse and not currently associated with human disease constitutes a powerful source of genes potentially linked to Mendelian phenotypes, including prenatal and neonatal death 5,26,29,31. Gene and variant prioritisation strategies leveraging this information have been successful in identifying novel neurodevelopmental disease genes 32–34.
The challenge of diagnosing lethal fetal disorders
Prenatal exome, and more recently, genome sequencing has been introduced to routine clinical care for at-risk pregnancies in which a genomic diagnosis would guide management of the fetus 35, and in the extreme case of prenatal death, to perform molecular genetic testing to determine the genetic cause of pregnancy loss or perinatal death 36. Similar to later onset phenotypes, an important proportion of pregnancy losses lack a molecular diagnosis. Microarray analysis of sporadic and recurrent pregnancy loss samples did not detect clinically significant chromosomal abnormalities in ~42% of the samples, with pregnancy losses occurring during the earlier stages of gestation being more likely due to such genomic imbalances. In the case of stillbirths, the percentage of cases with non-chromosomal abnormalities goes up to 85–90% 37. Remarkably, stillbirths were recently found to be enriched for LoF variants in genes not linked to disease, compared to undiagnosed patients with postnatal manifestations 3. The cumulative evidence suggests that pregnancy loss in euploid pregnancies can have a Mendelian or polygenic origin, indicative of the ‘essential’ nature of the implicated genes, with different biological processes associated with the developmental stage at which lethality occurs 5,29,38.
When investigating potentially pathogenic variants associated with prenatal death, one of three scenarios is possible: i) the gene is a known Mendelian gene and the lethal phenotype association has been previously described, ii) the gene is a known Mendelian gene with postnatal and/or other prenatal manifestations and the prenatal lethal phenotype constitutes a phenotypic expansion, or iii) the gene is not yet known to be associated with a single-gene disorder, and the abnormal prenatal phenotypes may be limited to fetal life and the outcome consistently involves prenatal or perinatal lethality 39. Allelic presentation needs to be factored in, including complete vs partial LoF or biallelic lethal vs monoallelic viable with other postnatal manifestations 40,41.
Many fetal demises are sporadic, and a monogenic cause may not be suspected; therefore, molecular genetic testing is either not performed or does not provide a definite diagnosis, resulting in the absence of a gene-phenotype association 41. Previous evidence supports a model where genes that are highly intolerant to LoF variation are not known to be associated with recognisable human phenotypes since the outcome is always early embryonic death 5,31,42. As a result, these genes are likely underrepresented in existing disease databases.
The lack of a fetal phenotype resource, equivalent to those assisting the molecular diagnosis of postnatal disorders, adds to the challenge, although efforts are being made in that direction. These include the creation of a Prenatal HPO working group as part of the Fetal Sequencing Consortium 43 and the submission of new fetal phenotype-genotype associations to facilitate variant curation 44.
The need for a catalogue of lethal phenotypes in humans
First, as described above, the direct evidence for human essential genes comes mainly from cell proliferation assays in human cancer cell lines. Postnatal and embryonic viability screens in mouse reveal a larger set of genes playing a fundamental role in developmental processes, suggesting that the set of human cellular essential genes will not necessarily capture those genes essential for human development beyond the earliest embryonic divisions 5.
Second, the number of molecular autopsies, sequencing studies of fetal structural anomalies, often severe and lethal, and those aimed at identifying genetic variants associated with pregnancy loss and perinatal death is increasing 3,36,41,45,46, and the number of associations with known and novel Mendelian genes is expected to increase accordingly.
Attempts have been made to identify these prenatal lethal phenotypes from the literature 47. One informatic toolkit retrieved data from different sources to generate a list of candidate genes potentially associated with unexplained infertility and prenatal or infantile mortality 26. Information from this resource combined with mouse evidence and LoF variants documented in the gnomAD database has recently been used to generate a new candidate set of genes related to human lethality and to compute heterozygous rates for pathogenic/likely pathogenic variants in those genes 48. There are several resources for single-gene disorders that enable users to perform phenotypic queries, including OMIM and the HPO repositories. A previous study based on OMIM reports 624 genes with perinatal lethal phenotypes 26. The most recent HPO release (v2024-01-11) contains 457 genes and 514 disorders with an age of death annotation, likely an underestimation 18. The information currently available from OMIM on lethal phenotypes is not captured in a comprehensive manner, mainly being described in heterogeneous free text reports.
To address these limitations, we developed the Lethal Phenotypes Portal, an online web application, to showcase a curated catalogue of Mendelian genes with lethal phenotypes identified through the OMIM knowledge base. Here, we searched OMIM using a number of terms related to lethality, then collated and curated the resulting hits to categorise genes into different lethality categories according to the earliest age of death reported using HPO age of death terms and definitions 8,18, ranging from prenatal death to death in adulthood, plus genes with no reports of early death. Next, we characterised the genes across the different lethality categories, explored how this information combined with phenotypic similarity measures and gene group annotations can be used for novel gene discovery, and examined the evidence on mouse viability. Additional visualisations available through the web tool allow users to inspect how these categories correlate with other metrics on gene essentiality.
Methods
Data collection
No identifiable patient information was included in this study.
OMIM Data
Disease-gene associations were mined using the OMIM API 17 [https://www.omim.org; Data last accessed 24/11/23] to search terms linked to lethality. Manual curation of all the hits was performed and a series of inclusion and exclusion criteria were applied to discard ambiguous reports of lethality. The list of terms and summary of the OMIM data curation process is illustrated in Figure 1. The initial queries date back to May 2020; subsequent queries and curation of hits were performed, including the inspection of 10% of previously curated entries for potential updates. This led to the reclassification of some lethality categories, mainly between L1 to L3 and L6 to LU labels. The OMIM Morbid Map was pruned to exclude provisional gene-phenotype relationships, non-diseases and drug response phenotypes. Disorders with somatic, multifactorial and digenic modes of inheritance were also excluded. As a result, the gene-phenotype associations included in the catalogue are limited to Mendelian phenotypes with molecular basis known and where the gene is classified as protein coding according to HGNC 49.
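The pruning step described above can be sketched as a simple filter. This is an illustrative Python sketch, not the authors' pipeline (their analyses were run in R). It assumes the usual OMIM Morbid Map conventions, where a leading "?" marks a provisional relationship, square brackets mark non-diseases, braces mark susceptibility phenotypes, and phenotype mapping key 3 means the molecular basis is known. The function name, the record layout and the example entries are hypothetical, and drug response phenotypes, which need their own rule, are not handled here.

```python
# Hypothetical sketch of the Morbid Map pruning; record layout and names are assumptions.
EXCLUDED_INHERITANCE = ("somatic", "multifactorial", "digenic")

def keep_association(phenotype, mapping_key, inheritance=""):
    """Return True if a gene-phenotype association passes the exclusion criteria."""
    label = phenotype.strip()
    # Assumed OMIM prefixes: '?' provisional, '[' non-disease, '{' susceptibility.
    if label.startswith(("?", "[", "{")):
        return False
    # Drop somatic, multifactorial and digenic modes of inheritance.
    if any(term in inheritance.lower() for term in EXCLUDED_INHERITANCE):
        return False
    # Keep only phenotypes whose molecular basis is known (assumed mapping key 3).
    return mapping_key == 3

# Made-up entries, for illustration only.
entries = [
    ("Example syndrome", 3, "Autosomal recessive"),
    ("?Example provisional disorder", 3, "Autosomal dominant"),
    ("[Example laboratory variant]", 3, ""),
    ("{Example susceptibility}", 3, "Multifactorial"),
    ("Example digenic disorder", 3, "Digenic recessive"),
    ("Example mapped-only disorder", 2, "Autosomal dominant"),
]
kept = [name for name, key, moi in entries if keep_association(name, key, moi)]
print(kept)
```

Only the first made-up entry passes all three checks.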
The initial hits that were not included after manual curation were labelled as ambiguous entries. Each unique disease-gene association was assigned to a ‘lethality category’ based on the earliest time point in which lethality had been documented to occur, with categories grouped by age ranges defined by the HPO age of death categories 18. The exact definitions of each lethality category set and more details on the entirety of the OMIM curation can be found in the web application. Additional information clarifying whether the evidence of a lethal phenotype comes from a proband or a family member, such as history of miscarriages in the family, is also included.
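As a minimal illustration of the 'earliest age of death' rule, the Python sketch below assigns a gene to a merged lethality category from the ages of death recorded across its disorders. The age labels are simplified stand-ins for the HPO age of death groupings, not the portal's exact L1 to LU definitions, and the gene names are placeholders.

```python
# Ordered from earliest to latest; labels are illustrative, not the exact HPO terms.
AGE_ORDER = ["prenatal", "neonatal", "infant", "childhood", "adolescence", "adulthood"]
PRE_INFANT = {"prenatal", "neonatal", "infant"}

def earliest_category(ages_of_death):
    """Return the earliest recorded age of death, or None if none was recorded."""
    recorded = [age for age in ages_of_death if age in AGE_ORDER]
    if not recorded:
        return None
    return min(recorded, key=AGE_ORDER.index)

def merged_category(ages_of_death):
    """Collapse the earliest age of death into one of three merged categories."""
    earliest = earliest_category(ages_of_death)
    if earliest is None:
        return "non-lethal"
    return "pre-infant lethal" if earliest in PRE_INFANT else "post-infant lethal"

# Placeholder genes with the ages of death curated across their disorders.
records = {
    "GENE_A": ["childhood", "neonatal"],
    "GENE_B": ["adulthood"],
    "GENE_C": [],
}
categories = {gene: merged_category(ages) for gene, ages in records.items()}
print(categories)
```

Here GENE_A is classified by its neonatal record rather than its childhood record, which is the behaviour the earliest-time-point rule implies.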
Gene Properties and Essentiality Annotations
HGNC ids and information on gene groups 49 [https://www.genenames.org/; Data accessed 19/12/23], gene-disease associations according to OMIM and their associated mode of inheritance(s) and abnormal phenotypes according to HPO annotations [https://hpo.jax.org/app/data/annotations; Data accessed 19/12/23] were retrieved for all human protein coding genes. Information on disease categories was retrieved from Genomics England PanelApp API, an open knowledgebase of virtual gene panels related to human disorders 50 [https://panelapp.genomicsengland.co.uk/api/v1/genes/; Data accessed 19/12/23].
Several additional metrics and properties of essentiality for each gene were collected: Intolerance to LoF variation gene-level metrics from gnomAD v4 6 [https://gnomad.broadinstitute.org/], Selection coefficients on fitness from RGC-ME 14, Gene viability data from mouse orthologues from the IMPC web portal 22 [https://www.mousephenotype.org/; mouse viability assessment, DR 20.1, Data accessed 15/12/23] and the MGI database 7 querying lethal phenotypes from Dickinson et al. 1 [https://www.informatics.jax.org/; Data accessed 15/12/23] and Human cell line proliferation scores according to the Cancer Dependency Map Portal’s Project Achilles 9 [DepMap 23Q4 CRISPR Gene Effect score, https://depmap.org/portal/; Data accessed 15/12/23].
Phenotypic similarity scores between Mendelian disorders were computed using PhenoDigm 51 and HPO annotations 18 [https://hpo.jax.org; Data accessed 04/07/23].
Database organisation / Web application
The gene annotations were ultimately organised into two main files within the Lethal Phenotypes Portal: OMIM curation and gene annotations. The web application was built using R programming language (v4.3.1) 52, ‘shiny’ (v1.7.5) 53 and ‘shinydashboard’ (v0.7.2) 54, which allows for the relevant information contained in the catalogue to be presented in a dashboard format. The ‘plotly’ (v4.10.2) R package 55 was used to create interactive plots and the ‘DT’ (v0.28) R package 56 was used as an interface to the DataTables JavaScript library to display the resulting datasets. Other packages used include ‘dplyr’ (v1.1.2) 57 and ‘stringr’ (v1.5.0) 58.
Data analysis
The figures in the manuscript were created using the ‘ggplot2’ 59 and ‘networkD3’ 60 R packages. All the statistical analyses, including odds ratios, Fisher tests, correlation coefficients and Mann-Whitney tests, were performed in R 52.
Gene family enrichment analysis: 1,053 out of 1,509 gene groups provided by HGNC include at least one gene present in the catalogue curated from OMIM (subset of genes associated with Mendelian phenotypes with molecular basis known). For each one of these gene groups, the proportion of genes in the catalogue was computed, and for those genes in the catalogue, the proportion of genes in each ‘merged’ lethality category: pre-infant lethal, post-infant lethal and non-lethal. Odds ratios, confidence intervals (CI) and Fisher test p-values were computed for each group to identify gene groups enriched in OMIM genes and any lethality category. Uncorrected p-values are shown.
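The enrichment test for a single gene group can be sketched as follows. This is a Python illustration, not the authors' R code: it implements a two-sided Fisher exact test directly from the hypergeometric distribution and reports the simple sample odds ratio, whereas R's fisher.test reports a conditional maximum likelihood estimate, so the odds ratios would differ slightly. The gene group counts are invented; only the totals (3,773 catalogue genes, 975 pre-infant lethal) come from the text.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is at most as likely as the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)

def sample_odds_ratio(a, b, c, d):
    """Cross-product odds ratio; infinite when an off-diagonal cell is empty."""
    return float("inf") if b * c == 0 else (a * d) / (b * c)

# Hypothetical gene group: 10 of its 12 catalogue genes are pre-infant lethal.
in_group_lethal, in_group_other = 10, 2
out_group_lethal = 975 - in_group_lethal          # remaining pre-infant lethal genes
out_group_other = 3773 - 975 - in_group_other     # remaining catalogue genes

table = (in_group_lethal, in_group_other, out_group_lethal, out_group_other)
odds_ratio = sample_odds_ratio(*table)
p_value = fisher_exact_two_sided(*table)
print(f"OR = {odds_ratio:.2f}, p = {p_value:.2e}")
```

Because one test is run per gene group and the reported p-values are uncorrected, a multiple-testing adjustment would be a natural addition when screening all groups.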
Phenotypic similarity analysis: For each OMIM disorder, the associated HPO phenotypes are retrieved and the phenotypic similarity for all disease-disease pairwise combinations is computed. Each disorder is then mapped to its associated gene(s) to obtain gene-gene scores. The distribution of phenotype similarity scores for genes in a given gene group can be compared with similarity scores for different subsets: catalogue genes belonging to the same gene group and same lethality category, catalogue genes belonging to the same gene group and a different lethality category, catalogue genes belonging to the same lethality category (regardless of gene group), catalogue genes belonging to the same gene group (regardless of lethality category), catalogue genes belonging to a different gene group, and catalogue genes belonging to a different lethality category.
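The mapping from disease-disease scores to gene-gene scores can be sketched as below. The text does not state how multiple disorder pairs for the same gene pair are combined, so taking the maximum is an assumption made here for illustration; the disorder identifiers, gene names and scores are all placeholders.

```python
from itertools import product

def gene_gene_scores(disease_scores, disease_to_genes, aggregate=max):
    """Project pairwise disease similarity scores onto unordered gene pairs."""
    collected = {}
    for (d1, d2), score in disease_scores.items():
        genes1 = disease_to_genes.get(d1, ())
        genes2 = disease_to_genes.get(d2, ())
        for g1, g2 in product(genes1, genes2):
            if g1 == g2:
                continue  # two disorders of the same gene do not form a gene pair
            pair = tuple(sorted((g1, g2)))
            collected.setdefault(pair, []).append(score)
    # Assumption: keep the best score when several disorder pairs map to one gene pair.
    return {pair: aggregate(scores) for pair, scores in collected.items()}

# Placeholder similarity scores between disorders.
disease_scores = {("D1", "D2"): 80.0, ("D1", "D3"): 35.0, ("D2", "D3"): 50.0}
disease_to_genes = {"D1": ["GENE_A"], "D2": ["GENE_B"], "D3": ["GENE_A", "GENE_C"]}

scores = gene_gene_scores(disease_scores, disease_to_genes)
print(scores)
```

Passing a different `aggregate` function, for example a mean, changes how genes with several associated disorders are summarised without altering the mapping itself.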
Results
A comprehensive resource of genes with lethal phenotypes in humans
Web application
The Lethal Phenotypes Portal is an online resource that provides users with a catalogue of human genes that are associated with documented lethal phenotypes in Mendelian disorders within OMIM. The web interface contains the full catalogue, allowing for queries and customised downloads, and a set of modules where genes in different lethality categories can be explored and compared with other sources of evidence on gene intolerance to LoF variation: constraint metrics, including LoF Observed/Expected Upper-bound Fraction (LOEUF) from gnomAD and selection against heterozygous (shet) scores inferred from 1 million genomes; gene effect scores from CRISPR cancer cell knockouts from DepMap and evidence on lethality from mouse knockout screens from the IMPC and phenotype annotations from MGI (see Methods). It shows a series of visualisations on the distribution of these metrics across different lethality categories (Figure 3a).
The OMIM queries and curation constitute the main source of evidence on lethal phenotypes in humans captured in the resource. The outline for the query strategy and subsequent curation of the lethal phenotype hits is shown in Figure 3b and explained in detail in the Methods section and web application.
After manual curation and exclusion of ambiguous entries, we found that 57% (2,133/3,773) of genes associated with human single-gene disorders catalogued in OMIM were not retrieved through the queries, suggesting no clinical records of lethality (non-lethal genes), 33% (1,239/3,773) are only associated with disorders with records of lethal phenotypes (as defined in Methods and Figure 3b), and 11% (401/3,773) are linked to both lethal and non-lethal phenotypes. With regard to lethality categories, 975 genes (59% of all lethal genes (1,640), 26% of disease genes) have records of prenatal, neonatal or infant death (pre-infant-lethal) as opposed to post-infant-lethal, where the earliest reported age of death ranges from childhood to adulthood (Figure 3b, 3c; see Methods). The distribution of genes according to lethality categories is based on the earliest age of death reported.
Characterisation of the set of lethal genes
Analysis of HPO annotations of the mode of inheritance revealed that the genes linked to early death show a depletion of autosomal dominant (AD) inheritance genes: 12% (118/975) of pre-infant-lethal genes are AD compared to 25% (165/665) of post-infant-lethal genes and 34% (719/2,133) of non-lethal genes (Figure 4a).
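The reported proportions can be reproduced from the counts above, and the size of the depletion summarised as an odds ratio. In the Python sketch below the counts are those given in the text; the Wald confidence interval on the log odds ratio is a standard approximation chosen for illustration and is not necessarily the interval method the authors used.

```python
from math import exp, log, sqrt

# (AD genes, total genes) per lethality category, as reported in the text.
counts = {
    "pre-infant lethal": (118, 975),
    "post-infant lethal": (165, 665),
    "non-lethal": (719, 2133),
}
ad_percent = {name: round(100 * ad / total) for name, (ad, total) in counts.items()}

def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Sample odds ratio for [[a, b], [c, d]] with an approximate 95% Wald interval."""
    estimate = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    return estimate, exp(log(estimate) - z * se), exp(log(estimate) + z * se)

# Pre-infant lethal vs non-lethal genes: AD vs not AD.
ad_pre, n_pre = counts["pre-infant lethal"]
ad_non, n_non = counts["non-lethal"]
or_est, ci_low, ci_high = odds_ratio_wald_ci(ad_pre, n_pre - ad_pre, ad_non, n_non - ad_non)

print(ad_percent)
print(f"OR = {or_est:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

An odds ratio below 1 with an interval that excludes 1 is what a depletion of AD genes among pre-infant lethal genes would look like under this approximation.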
Exploring the prenatal phenotypes associated with the genes in the catalogue, i.e. those abnormal phenotypes under the ‘Abnormality of prenatal development or birth’ and ‘Intrauterine growth retardation’ parent terms, we observe a consistent trend across lethality categories. The earlier the age of death, the higher the likelihood of prenatal manifestation for a gene/disorder (Figure 4b).
The number of top-level HPO terms, i.e. phenotype terms that are direct descendants of the term ‘Phenotypic abnormality’ (HP:0000118), a proxy for the number of physiological systems affected, is significantly higher for the set of lethal genes compared to other disease associated genes, reflecting the multisystemic nature of these disorders and the presence of more severe clinical manifestations leading to premature death (Figure 4c). In accordance with this observation, the percentage of genes with an abnormal phenotype mapping to any of these individual systems is higher among those genes with records of early lethality compared to post-infant-lethal genes and non-lethal genes (Figure 4d). Similar patterns are observed when PanelApp disease classes are considered instead (Figure 4e). Consistent with the results reported using mouse viability data (Figure 2d), for ‘Ophthalmological disorders’ and ‘Hearing and ear disorders’ the percentage of genes is higher among the non-lethal category.
Interestingly, up to 26% (250) of genes associated with disorders with pre-infant-lethality are also associated with other disorders with no records of lethal phenotypes. For 66 of these genes (26%), differences in allelic requirement could explain the differences in the severity of the phenotypes, since all the lethal forms are autosomal recessive (AR), and the associated non-lethal disorders are AD (Supplementary File 1). Examples include ACTL6B or ALG8 where, in addition, mouse mutants support this allelic model where the homozygous knockout is embryonic lethal and the heterozygous knockout shows phenotypic abnormalities mimicking the phenotypes of the associated disorders (Figure 5a).
A further analysis of the causal variants would be needed to explore which other factors may help explain the spectrum of phenotypic severity, e.g. distinct location of variants (different protein domains); degree of functional impact (null alleles vs hypomorphic alleles); qualitative variation in functional impact, such as LoF vs gain-of-function (GoF) 61. It is worth noting that GoF variation is more common in de novo/dominant disorders 62. The same variant can even be associated with different disorders or variations of a phenotypic spectrum, indicating other mechanisms need to be involved, including gene-environment interactions, genomic imprinting, stochastic forces or genetic modifiers 61,63. Other factors that could explain variable penetrance comprise digenic/oligogenic inheritance, gene expression levels, age or gender 64. This once again reflects the challenges of trying to classify genes into binary categories, i.e. essential/lethal vs non-essential/non-lethal, and the need to build allelic-phenotypic series including prenatal phenotypes and age of death.
Finally, we investigated HGNC gene families/groups that are significantly enriched for both OMIM disease genes and one of the lethality categories (see details in Methods) and highlight two of them in Figure 5b. The ‘Glycoside hydrolases’ group was significantly enriched for pre-infant lethal genes and the ‘Beta-gamma crystallins’ group was enriched for non-lethal genes. Genes in the same gene group/family not currently associated with Mendelian phenotypes are suggested as candidates: PGGHG and ENGASE, and CRYBA2 (currently with a phenotype association reported as provisional in OMIM), CRYGA, and CRYGN respectively. The complete gene list is available in Supplementary File 2. This finding is supported by the analysis of phenotypic similarity scores for the genes in these two groups, where genes within the same gene group and with the same lethality category showed higher similarities compared to other genes in different groups and lethality categories (Figure 5c). Phenotype-driven variant prioritisation algorithms, like Exomiser 65, are already used to identify diagnostic variants. These could potentially be expanded to detect variants in novel disease genes where the gene belongs to the same gene group and lethality category as a known disease gene with associated phenotypes similar to those of the patient under investigation.
How well does essentiality correlate between organisms?
Using the set of genes associated with prenatal and neonatal lethality in humans and combining it with viability data from the mouse orthologues, we can look at the overlap between the sets of lethal genes in the two species. Out of 438 pre-infant lethal genes in humans with IMPC mouse orthologue data on viability, 322 are also lethal in the mouse, while 116 genes are mouse viable, which implies a discrepancy of 26% between the two organisms in terms of gene essentiality. This percentage is slightly lower if we include only pre- and perinatal death (22%). By contrast, among AR disease genes with no records of premature death, the corresponding mouse orthologue is lethal for 256 of 586 (44%). Some of the hypothesised reasons behind these discrepancies are highlighted in Figure 6. These range from differences in the type of genetic variants and mechanisms (LoF vs non-LoF) to variable transcriptional and functional compensation mechanisms. For the disease genes in the other two categories, post-infant-lethal and AD disease genes with no records of lethal phenotypes, the differences in lethality could be more easily explained, i.e. deficit of homozygous variants leading to embryonic lethality in humans.
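The cross-species comparison amounts to intersecting the human pre-infant lethal gene set with mouse viability calls and counting the disagreements. The Python sketch below shows the idea on placeholder genes, then recomputes the reported discrepancy from the counts in the text. The two-label viability coding ("lethal" or "viable") is a simplifying assumption, and the function name is hypothetical.

```python
def cross_species_summary(human_lethal_genes, mouse_viability):
    """Count mouse-lethal and mouse-viable orthologues among human lethal genes.

    mouse_viability maps a gene symbol to "lethal" or "viable";
    genes without mouse viability data are left out of the comparison.
    """
    assessed = [gene for gene in human_lethal_genes if gene in mouse_viability]
    lethal = sum(1 for gene in assessed if mouse_viability[gene] == "lethal")
    viable = sum(1 for gene in assessed if mouse_viability[gene] == "viable")
    discordant_pct = 100 * viable / len(assessed) if assessed else float("nan")
    return {
        "assessed": len(assessed),
        "mouse_lethal": lethal,
        "mouse_viable": viable,
        "discordant_pct": discordant_pct,
    }

# Placeholder data: GENE_E has no mouse viability call and is skipped.
human_pre_infant_lethal = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
mouse_calls = {"GENE_A": "lethal", "GENE_B": "lethal", "GENE_C": "viable", "GENE_D": "lethal"}
toy = cross_species_summary(human_pre_infant_lethal, mouse_calls)

# Counts reported in the text: 322 mouse lethal and 116 mouse viable out of 438.
reported_discordant_pct = round(100 * 116 / (322 + 116))
print(toy, reported_discordant_pct)
```

Restricting the comparison to genes with viability data in both species matters, since the denominator of 438 already excludes pre-infant lethal genes without an IMPC viability call.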
Of the total number of genes with a lethal phenotype in knockout mice and a one-to-one human orthologue (IMPC and MGI combined, 5,064), up to 54% have no phenotype associations reported in humans to date according to OMIM. Given the strong and consistent evidence of the association of lethal genes in the mouse and disease genes in humans, these 2,721 genes represent a substantial source of potential candidates for Mendelian disorders, including prenatal conditions. This information, combined with other sources of gene essentiality as displayed in the web application can assist the prioritisation of variants in novel genes.
Discussion
Relevance of the resource
Essential genes and Mendelian (lethal) genes are not two independent concepts. The experimental evidence we have on essential genes comes mainly from cell proliferation assays and model organism viability studies. The current sources of human lethal phenotypes consist of single-gene disorders repositories, since the disruption of a gene function leading to embryonic/prenatal lethality can be interpreted as the most severe manifestation of these disorders. However, these phenotypes, and their associated genes, are not comprehensively captured in current databases. Here we queried and curated lethal phenotypes described in the OMIM catalogue to categorise human disease genes, using HPO terms, according to the earliest reported age of death. We integrated this data with metrics on gene constraint inferred from human population sequencing data, cell and mouse essentiality status, and provide a number of user-interactive visual and analytical features. In making this resource openly available to the public, we hope that this application will be used as a tool to aid clinical geneticists in diagnosing early lethal conditions, allowing better informed pre- and perinatal counselling and family planning. Additionally, it will assist researchers investigating what makes these genes so essential for human development, and at what stage.
We also describe a characterisation of the set of lethal and non-lethal genes in humans, highlighting potential strategies for novel Mendelian gene discovery. It is unlikely that we have identified most of the genes associated with rare disorders 23, let alone all the monogenic forms of embryonic loss (before a pregnancy is recognised) and fetal death, which, once again, may be considered an extreme manifestation of some Mendelian conditions. Information on lethality category, gene group annotations, and phenotypic similarity between undiagnosed patients and known disorders or among patients, could be integrated for this purpose. Strategies for variant prioritisation leveraging information from other members of the gene group have previously been successfully implemented 66. When assessing the significance of potentially pathogenic variants in unknown disease genes, evidence of lethality in mice in combination with intolerance to variation metrics has independently been used to prioritise candidate variants in potential novel genes 5,36,48.
What have we learned about lethal phenotypes in humans?
Analysis of the phenotype annotations of the genes, categorised by earliest age of death for their associated disorders, revealed several correlations. First, the proportion of AD disease genes is significantly lower for pre-infant lethal genes. Second, the number of physiological systems affected is significantly higher when we compare lethal vs non-lethal genes. Third, there is a significant correlation between lethality category and presence of prenatal abnormalities. In terms of disease categories, metabolic, dysmorphic and congenital abnormality syndromes and skeletal disorders are more frequent among pre-infant lethal genes compared to post-infant lethal and non-lethal genes. The classification used for this analysis based on PanelApp disease categories 50 presents potential bias due to how broad some groups are compared to others, as well as potential overlaps. It is also important to mention that, prenatally, organ system anomalies are what can be most effectively assessed and compared, and these prenatal phenotypes could be related to different categories. For example, fetal hydrops, a unique, severe and often lethal prenatal phenotype, can be associated with the cardiovascular, immunological or haematological category. More granular associations can also be found when we explore the exact embryonic stage at which the mouse embryo dies 29. It is important to emphasise that while mouse embryonic stage refers to any developmental stage before birth, in humans there is an embryonic and a fetal stage.
Comparison between mouse and human
Comparing the set of essential genes in different species is a particularly challenging task. Even within the same species, several factors may affect the essentiality assessment: the specific developmental stage at which viability is evaluated, the exact measure of viability (survival vs fitness), the assessment approach (inference, e.g. absence of biallelic complete LoF mutations in population sequence data vs observation, e.g. molecular autopsy), environmental conditions and genetic background 5,67.
The mouse is the most extensively used model organism in the study of human disease, particularly in the context of single-gene disorders, allowing us to explore how genetic variants impact the phenotype 68,69. However, the ability of mouse models to capture human phenotypes is not without limitations 70. When comparing mouse and human lethal genes, several considerations need to be taken into account: 1) RNA expression profiles, including differences in developmental gene expression, may reflect physiological differences between the two organisms 71,72, a view supported by evidence that evolutionary divergence in regulatory networks contributes to phenotypic differences 73,74; 2) molecular function and biological processes are likely to remain constant between species, while physiological relevance may differ 75; 3) consequently, we may want to consider essential functions instead of essential genes, and approaches based on projection over functional modules, such as pathways or networks, have already been implemented 67; 4) even within the same species, we may observe variability due to genetic background; for example, differences in lethality were found for up to 10% of mouse knockouts with data on viability 5. Additionally, lethality might show incomplete penetrance even with the same mutation and genetic background. Taking all these factors into account, two observations indicate concordant cross-species phenotypic effects with a certain degree of variability: the 75% overlap between pre-infant lethal and mouse lethal genes, and the fact that disease categories in which prenatal and neonatal lethal phenotypes are more frequently reported in humans show a higher percentage of genes with an orthologous mouse knockout that is also lethal at embryonic and pre-weaning developmental stages. The percentage of genes with discordant phenotypes between the two species is consistent with previous findings 75.
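An overlap of this kind can be computed with simple set operations once the comparison is restricted to human lethal genes whose mouse orthologue has viability data. The following Python sketch assumes one-to-one orthologue mapping and uses placeholder gene identifiers; the function and argument names are hypothetical.

```python
def cross_species_concordance(human_lethal, mouse_lethal, mouse_assessed):
    """Fraction of human lethal genes whose mouse knockout is also lethal.

    Only genes with mouse viability data are comparable; the rest are
    excluded rather than counted as discordant.
    """
    comparable = human_lethal & mouse_assessed
    if not comparable:
        raise ValueError("no human lethal genes have mouse viability data")
    concordant = comparable & mouse_lethal
    discordant = comparable - mouse_lethal
    return len(concordant) / len(comparable), discordant


if __name__ == "__main__":
    # Placeholder identifiers, not real annotations.
    human = {"G1", "G2", "G3", "G4", "G5"}
    assessed = {"G1", "G2", "G3", "G4", "G9"}
    lethal_in_mouse = {"G1", "G2", "G3", "G9"}
    fraction, discordant = cross_species_concordance(human, lethal_in_mouse, assessed)
    print(f"{fraction:.0%} concordant; discordant genes: {sorted(discordant)}")
```

Returning the discordant set alongside the fraction makes it straightforward to inspect which genes differ between species, for example to check whether genetic background or incomplete penetrance could explain the discrepancy.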
Overall, while some differences in essentiality between mouse and human are to be expected, the set of mouse lethal genes remains, to date, our most valuable source of information on genes essential for mammalian development. The evidence for an association between lethal genes in the mouse and disease genes in humans is strong and consistent 1,5,76. Consequently, the set of mouse lethal genes with no existing human evidence constitutes a powerful source of potential candidates for Mendelian disorders, including those with prenatal and neonatal lethal phenotypes.
Challenges and limitations
One of the main limitations is the potential for false negatives when determining the lethality category associated with each human disease gene. First, the queries may have failed to detect all the records of death described in OMIM. Second, and more likely to lead to misclassification of genes, the curation is limited to the information captured in OMIM and does not include the original sources. A gene classified as non-lethal is one for which we were not able to retrieve any record of lethality in OMIM, and thus the size of the non-lethal category may be overestimated. Third, there is a manual curation component that is prone to human error and/or biased interpretation. Establishing the link with lethality is not always straightforward from the information available in this resource. For example, the cause and age of death may be ambiguous, or the reported death may refer to siblings or other family members with variable evidence of being affected by the same disorder as the proband.
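The first of these limitations can be illustrated with a toy term-based query. The Python sketch below assigns a free-text clinical description to the earliest lethality category whose search terms match. The category labels and term lists are hypothetical and far simpler than the queries and manual curation used for the catalogue; the example only shows how a description that reports a death in different wording goes undetected.

```python
import re

# Hypothetical search terms, ordered from earliest to latest age of death.
CATEGORY_TERMS = [
    ("prenatal", re.compile(r"stillb|fetal death|fetal demise|in utero death", re.I)),
    ("neonatal", re.compile(r"neonatal death|died in the neonatal period", re.I)),
    ("infant", re.compile(r"death in infancy|died in infancy", re.I)),
    ("childhood", re.compile(r"death in childhood|died in childhood", re.I)),
]


def earliest_lethality(text):
    """Return the earliest category whose terms appear in the text.

    "no record found" means only that no term matched; it is not
    evidence that the disorder is non-lethal.
    """
    for category, pattern in CATEGORY_TERMS:
        if pattern.search(text):
            return category
    return "no record found"


if __name__ == "__main__":
    examples = [
        "Two sibs died in infancy; one pregnancy ended in fetal demise.",
        "The proband did not survive beyond the first week of life.",
    ]
    for description in examples:
        print(earliest_lethality(description), "<-", description)
```

The second description reports an early death but matches none of the terms, which is the kind of false negative that inflates the non-lethal category.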
Allele type is an important caveat. Clearly, we do not have a full view of genome-wide nullizygosity in humans, only inferences from constraint and heterogeneous reports of disease-associated lethality, the latter of which are likely to be affected by the nature of the reported alleles and the specific mechanisms involved. It is worth noting that, for some of the gene-disease associations captured in the catalogue, the mechanism may not necessarily be one of LoF, as gain-of-function (GoF) mutations can also lead to lethal phenotypes 77,78. Similarly, within the set of AD disease-associated genes we may find both de novo and inherited monoallelic variants. Lastly, recent molecular autopsy studies are not necessarily captured in OMIM, since there is a lag between the publication of a novel gene-disease association and its capture by this repository. In the same manner, brief reports that expand a phenotype to include prenatal lethality may not necessarily be reflected.
Further plans
The increasing number of cases undergoing prenatal and neonatal sequencing and molecular autopsies will reveal novel Mendelian genes as well as expand the phenotypic spectrum of known disease genes. Complementing this resource with a literature review of these studies, in which predicted pathogenic variants in novel genes are identified, might add a set of candidate genes potentially associated with prenatal and neonatal lethal phenotypes 36,47. For those cases where early death constitutes a novel phenotype in a known disease gene, we will focus on generating allelic-phenotypic series and establishing correlations between variants, gene and protein features, and the clinical manifestations observed in patients.
Other categories of essential genes could be incorporated in this catalogue. A comprehensive resource of genes and variants associated with infertility in different model organisms and humans has recently been published 79. Infertility is an emerging public health issue, as 10–20% of couples are infertile worldwide and global fertility rates are falling 80. Approximately 30–40% of infertility cases are of unknown aetiology 81,82. The contribution of essential/lethal genes to clinical infertility due to recurrent embryonic loss at the very earliest stages, before a pregnancy is recognised, is currently underappreciated, but we would expect them to explain a proportion of these cases, e.g. genes affecting zygotic and early cleavage stage survival 83,84. Records of lethality for some of these genes included in the catalogue are often heterogeneous and/or ambiguous and should be interpreted with caution.
Similarly, it would be useful to create specific categories for maternal effect genes (MEGs) associated with clinical infertility due to early embryonic loss as well as infertility due to abnormal oocyte development. Miscarriage constitutes another recognised phenotype for some MEGs, in particular due to recurrent molar pregnancy 85. An additional category could include those genes where LoF may be associated with other phenotypes linked to reduced reproductive success 86. The information curated for this study is currently being incorporated into the HPO resource. Overall, this catalogue represents one more step towards eventually categorising all human genes across the full spectrum of intolerance to variation: from genes with pathogenic variants leading to early embryonic death to genes with rare, genuine, homozygous LoF variants found in healthy adult individuals.
Acknowledgments
This research utilised Queen Mary’s Apocrita HPC facility, supported by QMUL Research-IT: http://doi.org/10.5281/zenodo.438045. We are grateful to present and past members of the QMUL ITS Research team (Tom Bradford, Giles Greenway, Iain Barrass) for their help with the development and deployment of the Shiny app.
Funding statement
This work was supported by National Institutes of Health Grants UM1HG006370 (P.C., D.S.), R01HD055651 and P50HD103555 (I.B.V.d.V.), UM1OD0023222 (S.A.M.), 1F32HD112084 (M.D.), 5U24HG011449-03 (P.N.R.) and 5R01HD103805-03 (D.S., P.N.R.).
Footnotes
Ethics declaration
The present study used only openly available human and mouse data that were originally located at Online Mendelian Inheritance in Man (https://www.omim.org/) and International Mouse Phenotyping Consortium (https://www.mousephenotype.org/) respectively.
Conflict of interest
The authors declare no conflict of interest.
Publisher's Disclaimer: This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Data availability
The web interface is available for data exploration at: https://lethalphenotypes.research.its.qmul.ac.uk. Supplementary Files accompanying this manuscript can be found in the following repository: https://zenodo.org/records/10419108
References
- 1. Dickinson ME et al. High-throughput discovery of novel developmental phenotypes. Nature 537, 508-+ (2016).
- 2. Shamseldin HE et al. Identification of embryonic lethal genes in humans by autozygosity mapping and exome sequencing in consanguineous families. Genome Biology 16 (2015).
- 3. Stanley KE et al. Causal Genetic Variants in Stillbirth. New England Journal of Medicine 383, 1107–1116 (2020).
- 4. Gao ZY, Waggoner D, Stephens M, Ober C. & Przeworski M. An Estimate of the Average Number of Recessive Lethal Mutations Carried by Humans. Genetics 199, 1243–1254 (2015).
- 5. Cacheiro P. et al. Human and mouse essentiality screens as a resource for disease gene discovery. Nature Communications 11 (2020).
- 6. Karczewski KJ et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434-+ (2020).
- 7. Blake JA et al. Mouse Genome Database (MGD): Knowledgebase for mouse-human comparative biology. Nucleic Acids Research 49, D981–D987 (2021).
- 8. Cacheiro P. & Smedley D. Essential genes: a cross-species perspective. Mammalian Genome 34, 357–363 (2023).
- 9. Tsherniak A. et al. Defining a Cancer Dependency Map. Cell 170, 564-+ (2017).
- 10. Mair B. et al. Essential Gene Profiles for Human Pluripotent Stem Cells Identify Uncharacterized Genes and Substrate Dependencies. Cell Reports 27, 599-+ (2019).
- 11. Yilmaz A, Peretz M, Aharony A, Sagi I. & Benvenisty N. Defining essential genes for human pluripotent stem cells by CRISPR-Cas9 screening in haploid cells. Nat Cell Biol 20, 610–619 (2018).
- 12. Sharma S, Dincer C, Weidemuller P, Wright GJ & Petsalaki E. CEN-tools: an integrative platform to identify the contexts of essential genes. Molecular Systems Biology 16 (2020).
- 13. Zeng T, Spence JP, Mostafavi H. & Pritchard JK. Bayesian estimation of gene constraint from an evolutionary model with gene features. bioRxiv, 2023.05.19.541520 (2023).
- 14. Sun KY et al. A deep catalog of protein-coding variation in 985,830 individuals. bioRxiv, 2023.05.09.539329 (2023).
- 15. Narasimhan VM et al. Health and population effects of rare gene knockouts in adult humans with related parents. Science 352, 474–7 (2016).
- 16. Oddsson A. et al. Deficit of homozygosity among 1.52 million individuals and genetic causes of recessive lethality. Nature Communications 14, 3453 (2023).
- 17. Amberger JS, Bocchini CA, Scott AF & Hamosh A. OMIM.org: leveraging knowledge across phenotype-gene relationships. Nucleic Acids Research 47, D1038–D1043 (2019).
- 18. Kohler S. et al. The Human Phenotype Ontology in 2021. Nucleic Acids Research 49, D1207–D1217 (2021).
- 19. Putman TE et al. The Monarch Initiative in 2024: an analytic platform integrating phenotypes, genes and diseases across species. Nucleic Acids Res 52, D938–D949 (2024).
- 20. Luo H, Lin Y, Gao F, Zhang CT & Zhang R. DEG 10, an update of the database of essential genes that includes both protein-coding genes and noncoding genomic elements. Nucleic Acids Research 42, D574–D580 (2014).
- 21. Gurumayum S. et al. OGEE v3: Online GEne Essentiality database with increased coverage of organisms and human cell lines. Nucleic Acids Research 49, D998–D1003 (2021).
- 22. Groza T. et al. The International Mouse Phenotyping Consortium: comprehensive knockout phenotyping underpinning the study of human disease. Nucleic Acids Research 51, D1038–D1045 (2023).
- 23. Bamshad MJ, Nickerson DA & Chong JX. Mendelian Gene Discovery: Fast and Furious with No End in Sight. American Journal of Human Genetics 105, 448–455 (2019).
- 24. Smedley D. et al. 100,000 Genomes Pilot on Rare-Disease Diagnosis in Health Care - Preliminary Report. New England Journal of Medicine 385, 1868–1880 (2021).
- 25. Seaby EG, Rehm HL & O’Donnell-Luria A. Strategies to Uplift Novel Mendelian Gene Discovery for Improved Clinical Outcomes. Frontiers in Genetics 12 (2021).
- 26. Dawes R, Lek M. & Cooper ST. Gene discovery informatics toolkit defines candidate genes for unexplained infertility and prenatal or infantile mortality. Npj Genomic Medicine 4 (2019).
- 27. Dickerson JE, Zhu A, Robertson DL & Hentges KE. Defining the Role of Essential Genes in Human Disease. Plos One 6 (2011).
- 28. Hentges KE. Essential Genes and Human Genetic Disease. in Encyclopedia of Life Sciences.
- 29. Cacheiro P. et al. Mendelian gene identification through mouse embryo viability screening. Genome Medicine 14 (2022).
- 30. Turgeon B. & Meloche S. Interpreting Neonatal Lethal Phenotypes in Mouse Mutants: Insights Into Gene Function and Human Diseases. Physiological Reviews 89, 1–26 (2009).
- 31. Spataro N, Rodriguez JA, Navarro A. & Bosch E. Properties of human disease genes and the role of genes linked to Mendelian disorders in complex disease aetiology. Human Molecular Genetics 26, 489–500 (2017).
- 32. Cousin MA et al. Pathogenic SPTBN1 variants cause an autosomal dominant neurodevelopmental syndrome. Nature Genetics 53, 1006-+ (2021).
- 33. Vetro A. et al. Stretch-activated ion channel TMEM63B associates with developmental and epileptic encephalopathies and progressive neurodegeneration. Am J Hum Genet 110, 1356–1376 (2023).
- 34. Dhindsa RS et al. Genome-wide prediction of dominant and recessive neurodevelopmental disorder risk genes. bioRxiv, 2022.11.21.517436 (2022).
- 35. https://www.genomicseducation.hee.nhs.uk/genotes/knowledge-hub/r21-rapid-prenatal-exome-sequencing/.
- 36. Byrne AB et al. Genomic autopsy to identify underlying causes of pregnancy loss and perinatal death. Nat Med 29, 180–189 (2023).
- 37. Finley J. et al. The genomic basis of sporadic and recurrent pregnancy loss: a comprehensive in-depth analysis of 24,900 miscarriages. Reprod Biomed Online 45, 125–134 (2022).
- 38. Robbins SM, Thimm MA, Valle D. & Jelin AC. Genetic diagnosis in first or second trimester pregnancy loss using exome sequencing: a systematic review of human essential genes. J Assist Reprod Genet 36, 1539–1548 (2019).
- 39. Meier N. et al. Exome sequencing of fetal anomaly syndromes: novel phenotype-genotype discoveries. Eur J Hum Genet 27, 730–737 (2019).
- 40. Alkuraya FS. Natural human knockouts and the era of genotype to phenotype. Genome Med 7, 48 (2015).
- 41. Filges I. & Friedman JM. Exome sequencing for gene discovery in lethal fetal disorders--harnessing the value of extreme phenotypes. Prenat Diagn 35, 1005–9 (2015).
- 42. Pengelly RJ, Vergara-Lope A, Alyousfi D, Jabalameli MR & Collins A. Understanding the disease genome: gene essentiality and the interplay of selection, recombination and mutation. Brief Bioinform 20, 267–273 (2019).
- 43. Dhombres F. et al. Prenatal phenotyping: A community effort to enhance the Human Phenotype Ontology. Am J Med Genet C Semin Med Genet 190, 231–242 (2022).
- 44. Chitty LS & Van den Veyver IB. Facilitating variant curation sharing for fetal precision genomics: A new venture for prenatal diagnosis. Prenat Diagn 42, 1479–1480 (2022).
- 45. Yates CL et al. Whole-exome sequencing on deceased fetuses with ultrasound anomalies: expanding our knowledge of genetic disease during fetal development. Genetics in Medicine 19, 1171–1178 (2017).
- 46. Lord J. et al. Prenatal exome sequencing analysis in fetal structural anomalies detected by ultrasonography (PAGE): a cohort study. Lancet 393, 747–757 (2019).
- 47. Colley E. et al. Potential genetic causes of miscarriage in euploid pregnancies: a systematic review. Hum Reprod Update 25, 452–472 (2019).
- 48. Aminbeidokhti M. et al. Preconception Genetic Carrier Screening for Miscarriage Risk Assessment: A Bioinformatic Approach to Identifying Candidate Lethal Genes and Variants. medRxiv, 2023.05.25.23290518 (2023).
- 49. Tweedie S. et al. Genenames.org: the HGNC and VGNC resources in 2021. Nucleic Acids Res 49, D939–D946 (2021).
- 50. Martin AR et al. PanelApp crowdsources expert knowledge to establish consensus diagnostic gene panels. Nat Genet 51, 1560–1565 (2019).
- 51. Smedley D. et al. PhenoDigm: analyzing curated annotations to associate animal models with human diseases. Database (Oxford) 2013, bat025 (2013).
- 52. R Core Team. R: A Language and Environment for Statistical Computing. (2023).
- 53. Chang W. et al. shiny: Web Application Framework for R. (2023).
- 54. Chang W. & Borges Ribeiro B. shinydashboard: Create Dashboards with ‘Shiny’. (2021).
- 55. Sievert C. Interactive Web-Based Data Visualization with R, plotly, and shiny. (2020).
- 56. Xie Y, Cheng J. & Tan X. DT: A Wrapper of the JavaScript Library ‘DataTables’. (2023).
- 57. Wickham H, François R, Henry L, Müller K, Vaughan D. dplyr: A Grammar of Data Manipulation. R package version 1.1.2. (2023).
- 58. Wickham H. stringr: Simple, Consistent Wrappers for Common String Operations. R package version 1.5.0. (2022).
- 59. Wickham H. ggplot2: Elegant Graphics for Data Analysis. (2016).
- 60. Allaire JJ, Gandrud C, Russell K. & Yetman CJ. networkD3: D3 JavaScript Network Graphs from R. (2017).
- 61. Zhu X, Need AC, Petrovski S. & Goldstein DB. One gene, many neuropsychiatric disorders: lessons from Mendelian diseases. Nat Neurosci 17, 773–81 (2014).
- 62. Carvill GL, Matheny T, Hesselberth J. & Demarest S. Haploinsufficiency, Dominant Negative, and Gain-of-Function Mechanisms in Epilepsy: Matching Therapeutic Approach to the Pathophysiology. Neurotherapeutics 18, 1500–1514 (2021).
- 63. Katsanis N. The continuum of causality in human genetic disorders. Genome Biol 17, 233 (2016).
- 64. Cooper DN, Krawczak M, Polychronakos C, Tyler-Smith C. & Kehrer-Sawatzki H. Where genotype is not predictive of phenotype: towards an understanding of the molecular basis of reduced penetrance in human inherited disease. Human Genetics 132, 1077–1130 (2013).
- 65. Smedley D. et al. Next-generation diagnostics and disease-gene discovery with the Exomiser. Nature Protocols 10, 2004–2015 (2015).
- 66. Lal D. et al. Gene family information facilitates variant interpretation and identification of disease-associated genes in neurodevelopmental disorders. Genome Medicine 12 (2020).
- 67. Gerdes S. et al. Essential genes on metabolic maps. Current Opinion in Biotechnology 17, 448–456 (2006).
- 68. Brown SDM. Advances in mouse genetics for the study of human disease. Human Molecular Genetics 30, R274–R284 (2021).
- 69. Justice MJ & Dhillon P. Using the mouse to model human disease: increasing validity and reproducibility. Disease Models & Mechanisms 9, 101–103 (2016).
- 70. Cacheiro P. et al. New models for human disease from the International Mouse Phenotyping Consortium. Mammalian Genome 30, 143–150 (2019).
- 71. Lin S. et al. Comparison of the transcriptional landscapes between human and mouse tissues. Proceedings of the National Academy of Sciences of the United States of America 111, 17224–17229 (2014).
- 72. Cardoso-Moreira M. et al. Developmental Gene Expression Differences between Humans and Mammalian Models. Cell Reports 33 (2020).
- 73. Ha D. et al. Evolutionary rewiring of regulatory networks contributes to phenotypic differences between human and mouse orthologous genes. Nucleic Acids Research 50, 1849–1863 (2022).
- 74. Han SK, Kim D, Lee H, Kim I. & Kim S. Divergence of Noncoding Regulatory Elements Explains Gene-Phenotype Differences between Human and Mouse Orthologous Genes. Molecular Biology and Evolution 35, 1653–1667 (2018).
- 75. Liao BY & Zhang J. Null mutations in human and mouse orthologs frequently result in different phenotypes. Proc Natl Acad Sci U S A 105, 6987–92 (2008).
- 76. Georgi B, Voight BF & Bucan M. From mouse to human: evolutionary genomics analysis of human orthologs of essential genes. PLoS Genet 9, e1003484 (2013).
- 77. Kida Y. et al. Lethal Interstitial Lung Disease Associated with a Gain-of-Function Mutation in IFIH1. J Clin Immunol 43, 1143–1146 (2023).
- 78. Toubiana J. et al. Heterozygous STAT1 gain-of-function mutations underlie an unexpectedly broad clinical phenotype. Blood 127, 3154–64 (2016).
- 79. Wu J. et al. IDDB: a comprehensive resource featuring genes, variants and characteristics associated with infertility. Nucleic Acids Res 49, D1218–D1224 (2021).
- 80. WHO. Sexual and Reproductive Health and Research. Infertility Prevalence Estimates, 1990–2021 (2023).
- 81. Thonneau P. et al. Incidence and main causes of infertility in a resident population (1,850,000) of three French regions (1988–1989). Hum Reprod 6, 811–6 (1991).
- 82. Babul-Hirji R, Hirji R. & Chitayat D. Genetic counselling for infertile men of known and unknown etiology. Transl Androl Urol 10, 1479–1485 (2021).
- 83. Hakim RB, Gray RH & Zacur H. Infertility and early pregnancy loss. Am J Obstet Gynecol 172, 1510–7 (1995).
- 84. Macklon NS, Geraedts JP & Fauser BC. Conception to ongoing pregnancy: the ‘black box’ of early pregnancy loss. Hum Reprod Update 8, 333–43 (2002).
- 85. Mitchell LE. Maternal effect genes: Update and review of evidence for a link with birth defects. HGG Adv 3, 100067 (2022).
- 86. Gardner EJ et al. Reduced reproductive success is associated with selective constraint on human genes. Nature 603, 858-+ (2022).