Abstract
Making a specific diagnosis in neurodevelopmental disorders is traditionally based on recognizing clinical features of a distinct syndrome, which guides testing of its possible genetic etiologies. Scalable frameworks for genomic diagnostics, however, have struggled to integrate meaningful measurements of clinical phenotypic features. While standardization has enabled generation and interpretation of genomic data for clinical diagnostics at unprecedented scale, making the equivalent breakthrough for clinical data has proven challenging. However, increasingly clinical features are being recorded using controlled dictionaries with machine readable formats such as the Human Phenotype Ontology (HPO), which greatly facilitates their use in the diagnostic space. Improving the tractability of large-scale clinical information will present new opportunities to inform genomic research and diagnostics from a clinical perspective. Here, we describe novel approaches for computational phenotyping to harmonize clinical features, improve data translation through revising domain-specific dictionaries, quantify phenotypic features, and determine clinical relatedness. We demonstrate how these concepts can be applied to longitudinal phenotypic information, which represents a critical element of developmental disorders and pediatric conditions. Finally, we expand our discussion to clinical data derived from electronic medical records (EMR), a largely untapped resource of deep clinical information with distinct strengths and weaknesses.
Keywords: Epilepsy, Human Phenotype Ontology, Genetics, Genomics, Big data, Electronic Medical Records, Electronic Health Records
Introduction
The critical role of deep, reliable clinical information in interpreting genetic findings is beyond doubt. The traditional approach to define disease entities in clinical medicine is based on recognizing shared clinical features in individuals, ranging from constellations of symptoms and signs to laboratory values, imaging abnormalities, and physiological findings. Furthermore, clinical presentations are not static but dynamic, with diseases differing by their clinical trajectories. In the history of neurogenetics, delineating phenotypic features in detail and identifying such distinctive patterns has also been the route to identifying monogenic etiologies. A classic example of this approach is the initial discovery of Rett Syndrome. Andreas Rett recognized that two girls in his pediatric practice frequently exhibited unusual handwashing motions and reasoned that this stereotypy, in combination with clinical features such as regression and lack of speech development, may represent a novel clinical syndrome (Hagberg, Aicardi, Dias, & Ramos, 1983; Rett, 1966). This insight eventually led to the delineation of Rett Syndrome and the identification of MECP2 as the causative genetic etiology (Amir et al., 1999).
Sometimes it is the age at which a clinical feature manifests in the context of typical developmental or aging trajectories, rather than the feature itself, that is sufficiently distinctive to identify a monogenic disorder. For example, Genetic Epilepsy with Febrile Seizures Plus is a mendelian disorder most frequently attributable to variants in SCN1A, which encodes a voltage-gated sodium channel (Escayg et al., 2000). In this case, the occurrence of febrile seizures, which occur in up to 5% of preschool children, outside of this usual age range serves as a strong indicator.
Sometimes it is the characteristic combination of multiple phenotypic features, and their sequence, that has led to successful gene discovery. Dravet Syndrome, initially described by Charlotte Dravet as Severe Myoclonic Epilepsy in Infancy, represents the most prominent example in pediatric epileptology. Also caused by pathogenic variants in SCN1A (Claes et al., 2001), Dravet Syndrome is characterized by typical development in the first six months of life, followed by febrile seizures, often febrile status epilepticus, and hemiclonic seizures (Dravet, 2011). In isolation, few of these features would necessarily raise high suspicion for Dravet Syndrome, emphasizing how it is often the overall clinical “gestalt” of a genetic disease rather than a single tell-tale clinical feature that is distinctive. This is especially the case in neurogenetics, where clinical manifestations are complex combinations of features that may be interwoven with childhood development or aging.
The history-taking and examination skills of a master clinician consist of correctly eliciting and interpreting clinical features in context, ideally on a background of encyclopedic clinical knowledge. However, what exactly does it mean to recognize clinical features and to have the quintessential epiphany of recognizing a typical genetic syndrome, particularly when individuals do not manifest all classical features and many present with atypical clinical features? Trainee clinicians can be taught Bayesian reasoning, yet in clinical practice the formal application of this to rare diseases is challenging. First, beyond exclusion of the more frequently encountered conditions such as Trisomy 21, 22q Deletion Syndrome, Fragile X Syndrome, and Angelman Syndrome, the baseline prior probability (prevalence of the disease), marginal likelihoods (prevalence of each clinical feature in the general population), and conditional likelihoods (the proportion of individuals with the disease who have each clinical feature) are often insufficiently determined for differential diagnoses to be compared accurately. Take the diagnosis of tuberous sclerosis as a relatively common and well-studied example, in which diagnosis enables genetically guided surveillance and treatment. Current estimates of the incidence of tuberous sclerosis range from 1:6,000 to 1:10,000 live births and approximately 5% of cases cannot yet be explained by variants in TSC1 or TSC2 (Northrup et al., 2021). If you have identified an elliptical hypopigmented skin lesion that you think is an ash leaf spot, how much more likely does this make it that the 2-year-old child before you carries a diagnostic variant in TSC1 or TSC2, and how much should you modify your prediction based on the absence of developmental delay but the presence of seizures at this age? In practice, the cognitive processes that contribute to clinical pattern recognition for genetic diagnosis are not well-defined but heuristic.
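For orientation, the three quantities named above combine through Bayes' theorem. The symbols D (the disease is present) and F (a given clinical feature is present) are introduced here purely for illustration and are not notation used elsewhere in this review.

```latex
% P(D)        : prior probability (prevalence of the disease)
% P(F)        : marginal likelihood (prevalence of the feature in the general population)
% P(F \mid D) : conditional likelihood (proportion of individuals with the disease who have the feature)
P(D \mid F) = \frac{P(F \mid D)\, P(D)}{P(F)}
```

The posterior probability on the left can only be as reliable as the three estimates on the right, which is why poorly determined prevalences and feature frequencies limit formal Bayesian comparison of rare differential diagnoses.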
What are the informative clinical features, how informative are they, and when is a combination of shared clinical features distinctive? Formalizing these concepts for big-data computational approaches in the age of high-throughput genomics and clinical data repositories would allow us to generate the reference data and frameworks needed to incorporate detailed phenotypic perspectives into personalized genomic medicine.
Within this review, we will revisit these concepts and discuss how they can be described or approximated using computational phenotyping approaches, which analyze clinical data statistically at scale while retaining as much phenotypic information as possible to empower discovery and maximize interpretability for translation to the care of individual patients. The pressing need to develop these analytic approaches is reflected in the current paucity of phenotypic resolution of high-throughput genomic studies, which limits the clinical interpretability of advances for personalized genomic medicine. While the last two decades have led to paradigm-changing approaches in genomic sequencing that can be deployed at large scale with sophisticated quality control pipelines that empower interpretation, clinical phenotyping often remains a manual, non-scalable task. Accordingly, in the past, most genomic datasets had little or no relevant phenotypic information attached, often limited to diagnostic codes for disorders, for example, International Classification of Diseases identifiers (World Health Organization, 1992), or to descriptive names (sometimes with a subtype number) such as those referred to as ‘phenotypes’ and ‘phenotypic series’ in OMIM (Bodenreider, 2004). However, the last five years have seen a renewed interest in integrating detailed clinical data into genomic approaches, aiming to provide phenotypes at the resolution of single clinical features (such as ‘focal-onset motor seizures’) rather than disease labels (such as ‘epilepsy’). This has led to novel gene discoveries using large biorepositories. For example, the discovery of recessive genetic etiologies such as KIAA0586, HACE1, PRMT7, or MMP2 in the Deciphering Developmental Disorders project was achieved through a statistical framework combining genomic and phenotypic evidence (Akawi et al., 2015).
Even regarding established disease genes, questions remain about the full clinical spectrum, genotype-specific phenotypic associations and treatment responses, and whether ‘lumping or splitting’ would be more meaningful for biological understanding and, more importantly, for clinical care. Large research repositories and routine clinical records that link detailed clinical and genomic data are likely to contain many answers. Yet if we are to exploit the opportunities these resources offer, we must address the challenge of making phenotypic data — typically highly dimensional and heterogeneous — tractable, without compromising its content.
Given the need for approaches to fill the widening gap between the increasing amount of genomic information and the limited bandwidth of traditional clinical studies, novel frameworks have been developed to overcome the phenotyping bottleneck. The Human Phenotype Ontology (HPO), a controlled dictionary of more than 16,000 clinical concepts, is the most widely adopted model used to capture people’s clinical features in a format that is both human and machine readable (Core Human Phenotype Ontology/Monarch Initiative Team; Kohler et al., 2014; Kohler et al., 2021; Kohler et al., 2017; Robinson et al., 2008). The HPO is adapted and used by diagnostic laboratories, the NIH Undiagnosed Disease Network (Gahl et al., 2016), the 100,000 Genomes Project (Turro et al., 2020) and NHS England Genomic Medicine Service (NHS Health Education England, 2020), and a large number of other initiatives around the globe. The HPO has emerged as the lingua franca for phenotypic features and is appealing through its simplicity, which encourages clinical domain experts to engage in its revision and expansion, with updated versions released several times per year. Our team has already mapped clinical data from major epilepsy genetics projects, including the Epi4K project, the EuroEPINOMICS project, and the Epi25 project, as well as from our local EMR, into HPO annotations, demonstrating the versatility of this framework to harmonize clinical information from heterogeneous data sources (Crawford et al., 2021; Galer et al., 2020; Ganesan et al., 2020; Helbig et al., 2019; D. G. Lewis-Smith, S; Galer, PD; Krause, R; Thomas, RH; Helbig, I; Epi25 Collaborative, 2021; Xian et al., 2021). In summary, the HPO provides a generally accepted framework that can be used to harmonize clinical data and has emerged as one of the major vehicles to make large-scale phenotypic data available for analytic approaches.
In the following sections, we discuss insights from the emerging fields of computational phenotyping and EMR genomics, identifying possibilities and pitfalls as well as highlighting novel paradigms that exploit these approaches. We cover data harmonization, information content, phenotypic depth, the value of annotating negative (explicitly absent) phenotypes, and concepts relevant to the interpretation of longitudinal phenotypic data when studying clinical trajectories and outcomes.
Biomedical ontologies capture the relationship between clinical annotations
Mapping clinical information to structured dictionaries such as the HPO makes clinical information available for computational analyses harmonized in both human- (e.g., “Seizure”) and machine-readable (e.g., [HP:0001250]) form. While human-readable names may evolve with clinical preference, their machine-readable identifiers should correspond to consistent concepts, future-proofing data so that they can be reinterpreted using subsequent versions of the HPO. The HPO not only represents clinical features as a dictionary, but also arranges these concepts in a directed acyclic graph that provides information about is_a relationships between phenotypic features (Figure 1). These can be used to compare individuals by their annotations, as illustrated in Figure 2A using three hypothetical individuals, Anna, Benjamin, and Charlotte, each with a single phenotype. Anna’s annotation of “Focal aware seizure” [HP:0002349] is closely related to Benjamin’s annotation of “Focal impaired awareness seizure” [HP:0002384] because is_a relationships show that both are types of “Focal-onset seizure” [HP:0007359]. While Charlotte’s annotation of “Generalized myoclonic seizure” [HP:0002123] is related to these (all are types of “Seizure” [HP:0001250]), Anna and Benjamin appear more similar to each other clinically than either is to Charlotte because they share a more conceptually precise phenotypic concept with each other (“Focal-onset seizure” [HP:0007359]) than either does with Charlotte (“Seizure” [HP:0001250]). Similarly, this concept can be applied to comparison of individuals annotated with sets of multiple HPO terms (Figure 2B).
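The comparison of Anna, Benjamin, and Charlotte can be sketched in a few lines of Python. This is a minimal illustration, not the HPO's own tooling: the `IS_A` dictionary is a hand-written toy fragment (intermediate terms are omitted and “Seizure” is treated as the root), and the function names and the use of path depth as a stand-in for conceptual precision are assumptions made for this example.

```python
# Toy fragment of the seizure branch: child -> list of parents (is_a edges).
# Hand-written for illustration; the real HPO contains intermediate terms.
IS_A = {
    "HP:0002349": ["HP:0007359"],  # Focal aware seizure -> Focal-onset seizure
    "HP:0002384": ["HP:0007359"],  # Focal impaired awareness seizure -> Focal-onset seizure
    "HP:0007359": ["HP:0001250"],  # Focal-onset seizure -> Seizure
    "HP:0002123": ["HP:0001250"],  # Generalized myoclonic seizure -> Seizure (simplified)
    "HP:0001250": [],              # Seizure (root of this toy graph only)
}


def ancestors(term, is_a=IS_A):
    """Return the term itself plus every term reachable via is_a edges."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(is_a.get(current, []))
    return seen


def depth(term, is_a=IS_A):
    """Longest is_a path from the term to a root: a crude proxy for precision."""
    parents = is_a.get(term, [])
    return 0 if not parents else 1 + max(depth(p, is_a) for p in parents)


def most_specific_shared(term_a, term_b, is_a=IS_A):
    """Deepest concept implied by both annotations."""
    shared = ancestors(term_a, is_a) & ancestors(term_b, is_a)
    # The term id is a tie-breaker so that the result is deterministic.
    return max(shared, key=lambda t: (depth(t, is_a), t))


anna, benjamin, charlotte = "HP:0002349", "HP:0002384", "HP:0002123"
print(most_specific_shared(anna, benjamin))   # HP:0007359 (Focal-onset seizure)
print(most_specific_shared(anna, charlotte))  # HP:0001250 (Seizure)
```

In practice, the specificity of a shared concept is usually weighted by how rarely it is annotated rather than by its depth in the graph.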
Figure 1.

A visual representation of the complexity of Human Phenotype Ontology version 1.7.13 released 2021-10-10, focusing on seizure and related neurological phenotypes. This version contains 16,290 terms and 20,529 is_a relationships.
Figure 2.

Interpretation using the is_a relationships of the HPO. (A) A simplified example of the HPO, comparing three individuals, each annotated with a single phenotypic term. (B) Individuals can be compared according to sets of HPO annotations. Here node color indicates the individual to whom they have been annotated, with nodes in green representing phenotypic descriptors applicable to both individuals. (C) Translation of raw clinical data typically results in precise phenotypic annotations which should be propagated following is_a relationships to infer the presence of less conceptually specific phenotypic concepts, otherwise the frequency of the latter will be underestimated.
Inference allows for harmonization across the depth of all phenotypes
When raw clinical data are translated into HPO annotations, they become harmonized in that equivalent concepts are unified using a controlled vocabulary. For example, synonyms “generalized tonic-clonic seizure”, “grand mal”, and “generalized convulsion” might be unified under the HPO concept of “Bilateral tonic-clonic seizure with generalized onset” [HP:0025190]. This reduces arbitrary heterogeneity in the data. However, a further step of harmonization, this time across phenotypic breadth, allows better estimation of the true frequency of a clinical feature in the cohort. For example, generic phenotypic descriptors represented by higher-level HPO terms such as “Abnormality of digestive system morphology” [HP:0025033] or “Abnormality of the nervous system” [HP:0000707] are unlikely to be translated from raw data because clinicians would tend to document the clinical features more precisely than this. However, many specific clinical terms imply the presence of higher-level concepts via the is_a relationships of the HPO. These implied terms are “ancestors” of the terms translated from the text, which we call “base terms”. For instance, the applicability of the two high-level HPO terms above to an individual should be inferred from the presence of annotations “Pyloric stenosis” [HP:0002021] (narrowing of the outflow from the stomach) and “Epileptic spasm” [HP:0011097] (a type of motor seizure most frequently occurring in clusters during infancy) respectively, even if these high-level terms were not present among those translated from the raw data. We refer to the process of inferring higher-level phenotypic terms (and adding these to an individual’s annotations that have been translated from the raw data) as propagation.
Accordingly, sets of patients’ HPO annotations are referred to as base annotations if they are those translated from raw data and propagated annotations if these base terms have been supplemented by all applicable terms inferred according to the is_a relationships of the HPO (Figure 2C).
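Propagation amounts to taking the union of each base term with all of its ancestors. The sketch below assumes a hand-written, simplified chain of is_a edges (intermediate terms between “Epileptic spasm” and “Seizure” are omitted); the `propagate` function name is hypothetical.

```python
# Simplified, hand-written is_a edges (child -> parents) for illustration only.
IS_A = {
    "HP:0011097": ["HP:0001250"],  # Epileptic spasm -> Seizure (intermediates omitted)
    "HP:0001250": ["HP:0012638"],  # Seizure -> Abnormal nervous system physiology
    "HP:0012638": ["HP:0000707"],  # -> Abnormality of the nervous system
    "HP:0000707": ["HP:0000118"],  # -> Phenotypic abnormality
    "HP:0000118": ["HP:0000001"],  # -> All
    "HP:0000001": [],              # root
}


def propagate(base_terms, is_a=IS_A):
    """Return the base terms plus every ancestor implied by is_a relationships."""
    propagated = set()
    stack = list(base_terms)
    while stack:
        term = stack.pop()
        if term not in propagated:
            propagated.add(term)
            stack.extend(is_a.get(term, []))
    return propagated


base_annotations = {"HP:0011097"}  # translated directly from the raw data
propagated_annotations = propagate(base_annotations)
print(sorted(propagated_annotations))
```

Because the visited set is checked before a term is expanded, the traversal also terminates correctly when a term has several parents that converge on the same ancestor, as is common in a directed acyclic graph.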
Estimates of the frequency of more generic phenotypic concepts in a cohort are more accurate and internally consistent after propagation. For example, in a study of 413 people with SCN2A-related disorders, translation of raw phenotypic data sources assigned “Neurodevelopmental abnormality” [HP:0012759] to only 0.2% of the cohort, whereas counterintuitively “Intellectual disability” [HP:0001249] (a type of neurodevelopmental abnormality) was annotated to 7%. However, after propagation, the frequency of “Neurodevelopmental abnormality” [HP:0012759] increased to 63% and the frequency of “Intellectual disability” [HP:0001249] to 46% as subtypes of intellectual disability were present among the base annotations of many participants. This is now consistent with the logical intuition that a generic phenotypic description should not be applicable to fewer individuals than one of its subtypes, and precise descriptions should be more informative than generic descriptions of the same phenotypic feature.
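The effect of propagation on term frequencies can be reproduced on a toy cohort. The four individuals below are invented for illustration and are unrelated to the SCN2A data; the is_a edges are simplified and the function names are hypothetical.

```python
from collections import Counter

# Simplified is_a edges (child -> parents); hand-written for this example.
IS_A = {
    "HP:0010864": ["HP:0001249"],  # Intellectual disability, severe -> Intellectual disability
    "HP:0001256": ["HP:0001249"],  # Intellectual disability, mild -> Intellectual disability
    "HP:0001249": ["HP:0012759"],  # Intellectual disability -> Neurodevelopmental abnormality
    "HP:0012759": [],              # treated as the root of this toy graph
}


def propagate(base_terms, is_a=IS_A):
    """Base terms plus all ancestors reachable through is_a edges."""
    propagated, stack = set(), list(base_terms)
    while stack:
        term = stack.pop()
        if term not in propagated:
            propagated.add(term)
            stack.extend(is_a.get(term, []))
    return propagated


def frequencies(cohort):
    """Fraction of individuals annotated with each term."""
    counts = Counter(term for terms in cohort.values() for term in terms)
    return {term: count / len(cohort) for term, count in counts.items()}


# Invented base annotations: most individuals carry only a precise term.
base_cohort = {
    "P1": {"HP:0010864"},
    "P2": {"HP:0001256"},
    "P3": {"HP:0001249"},
    "P4": {"HP:0012759"},
}
propagated_cohort = {pid: propagate(terms) for pid, terms in base_cohort.items()}

base_freq = frequencies(base_cohort)
prop_freq = frequencies(propagated_cohort)
for term in IS_A:
    print(term, base_freq.get(term, 0.0), prop_freq[term])
```

Before propagation every term appears in a quarter of the toy cohort; afterwards the generic terms are at least as frequent as each of their subtypes, which is the internal consistency described above.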
Clinical phenotypic annotations translated from detailed raw data are sparse
While all clinical features can be linked through such an approach (with the ultimate root concept “All” [HP:0000001]), some aspects of clinical information require dedicated consideration. Inherent features of clinical datasets typically become obvious once translation has been performed and some of the properties in clinical data are often surprising and counterintuitive.
First, a frequently underappreciated feature of large clinical datasets is sparseness, where many phenotypic terms are only assigned to a small number of individuals. For example, in a study of 846 people with developmental and epileptic encephalopathies, we found that despite propagation, 679 of 1,616 terms were annotated to only one individual and only 331 of 1,616 terms were annotated to >10% of the cohort (Galer et al., 2020). Second, the available clinical information per individual varies significantly within and between datasets. When comparing seizure descriptions regarding 791 individuals from three cohorts, we found a median of 8 clinical annotations per individual after propagation (D. Lewis-Smith, Galer, et al., 2021). However, the number of annotations per individual after propagation ranged from 1 to 30, emphasizing that available clinical information per individual is not uniform across cohorts, but widely distributed. Third, availability of standardized clinical information allows for operational definitions of concepts such as phenotypic depth. However, the number of clinical annotations is only one method of quantifying the available information and needs to be assessed alongside other measures. For example, we have used information content per clinical annotation, calculated as −log2 of the frequency with which the given term is annotated in the cohort after propagation and measured in bits. Consequently, just as the frequency with which terms are annotated should increase monotonically as one follows the is_a relationships towards the root of the HPO, the information content should decrease monotonically. The result of this is that conceptually specific phenotypic features tend to have a low frequency and high information content (they are more informative), while more generic phenotypic features have a high frequency with correspondingly low information content (less informative).
This captures how clinicians typically not only consider the breadth of a constellation of clinical phenotypes but also weigh each element by its individual specificity.
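The information content described above can be computed directly from propagated annotations. In this sketch the four-person cohort is invented, the annotations are assumed to be propagated already, and `information_content` is a hypothetical helper name.

```python
import math
from collections import Counter

# Invented cohort whose annotations are assumed to be already propagated.
propagated_cohort = {
    "P1": {"HP:0001250", "HP:0007359", "HP:0002349"},  # Seizure, Focal-onset, Focal aware
    "P2": {"HP:0001250", "HP:0007359", "HP:0002384"},  # Seizure, Focal-onset, Focal impaired awareness
    "P3": {"HP:0001250", "HP:0002123"},                # Seizure, Generalized myoclonic
    "P4": {"HP:0001250"},                              # Seizure only
}


def information_content(cohort):
    """Information content in bits: -log2(frequency) of each term in the cohort."""
    n = len(cohort)
    counts = Counter(term for terms in cohort.values() for term in terms)
    # log2(n / count) equals -log2(count / n) and avoids printing -0.0 for ubiquitous terms.
    return {term: math.log2(n / count) for term, count in counts.items()}


ic = information_content(propagated_cohort)
for term, bits in sorted(ic.items(), key=lambda item: item[1]):
    print(f"{term}: {bits:.2f} bits")
```

A term annotated to everyone carries 0 bits, a term annotated to half the cohort carries 1 bit, and a term seen in a single individual out of four carries 2 bits, so information content rises as one moves away from the root.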
The amount of information per individual can be summed across all their propagated phenotypic annotations, providing a measure of the total information available as HPO annotations about that person’s clinical features. Consequently, the total information per individual quantifies how much information is available in the data to distinguish them within the context of the overall cohort being studied (which dictates the information content of each annotation). This can be useful for assessing the amount of information available for phenotypic analysis or to establish how much variation there is between cohorts recruited from different centers in a multicenter study when deciding where to focus efforts on further data collection. The number of clinical descriptors that can be inferred about an individual depends on three factors: (1) the complexity of their real-world clinical features, (2) how thoroughly these have been annotated to HPO terms, and (3) how precisely these phenotypic concepts are represented in the HPO, which limits the precision of annotation. Hence, a large amount of information may be available about an individual because they have one or a few rare phenotypic annotations or because they have many common ones, particularly if these common features span multiple anatomical and functional systems. This models the distinctiveness of classical multisystemic genetic syndromes for which a single feature is seldom pathognomonic.
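Summing information content per individual, and then summarizing by recruiting center, might look as follows. The cohort, the center labels "A" and "B", and the function names are all invented for this sketch; annotations are assumed to be propagated already.

```python
import math
from collections import Counter, defaultdict
from statistics import median

# Invented cohort: each individual has a recruiting center and propagated terms.
cohort = {
    "P1": {"center": "A", "terms": {"HP:0001250", "HP:0007359", "HP:0002349"}},
    "P2": {"center": "A", "terms": {"HP:0001250", "HP:0007359", "HP:0002384"}},
    "P3": {"center": "B", "terms": {"HP:0001250", "HP:0002123"}},
    "P4": {"center": "B", "terms": {"HP:0001250"}},
}


def information_content(cohort):
    """Bits per term, -log2(frequency), with frequency taken over the whole cohort."""
    n = len(cohort)
    counts = Counter(t for person in cohort.values() for t in person["terms"])
    return {term: math.log2(n / count) for term, count in counts.items()}


def total_information(cohort):
    """Sum of information content across each individual's propagated terms."""
    ic = information_content(cohort)
    return {pid: sum(ic[t] for t in person["terms"]) for pid, person in cohort.items()}


totals = total_information(cohort)

# Median total information per center, to compare phenotypic depth between sites.
by_center = defaultdict(list)
for pid, person in cohort.items():
    by_center[person["center"]].append(totals[pid])
median_by_center = {center: median(values) for center, values in by_center.items()}

print(totals)
print(median_by_center)
```

Because each term's information content depends on its frequency in the cohort under study, totals computed in one cohort are not directly comparable with totals computed in another unless a shared reference frequency is used.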
Domain-specific biomedical ontologies can be improved through expert curation
The HPO is a relatively simple ontology where phenotypic concepts are connected through a single type of relationship, is_a. Therefore, in comparison to other more complex frameworks such as the Systematized Nomenclature of Medicine - Clinical Terms (SNOMED-CT) or the Unified Medical Language System (UMLS), revision of entire branches of the HPO is feasible and can be accomplished by domain-specific expert panels (Kohler et al., 2021). In 2018–2020, we revised the seizure subontology of the HPO on behalf of the Genetics Commission of the International League Against Epilepsy (D. Lewis-Smith, Galer, et al., 2021). This revision, led by a dedicated Task Force, was aligned with the most recent seizure classifications, allowing us to measure the impact of revising the dictionary on the amount and quality of data captured. The number of concepts by which seizures can be described increased five-fold, despite removing several redundant terms. Importantly, the number of is_a relationships between phenotypic terms increased seven-fold, as a result of our efforts to harmonize concepts within and between formal classifications so that the HPO can be used to combine data from sources using different classifications with minimal loss of information or arbitrary distinction. Application of the seizure subontology of the HPO (using versions from before and after our revision) to data from a cohort of 791 individuals demonstrated that the number of annotations according to which the cohort’s seizures could be analyzed and the amount of information about each participant’s seizures both increased by 40% following the revision (Figure 3) (D. Lewis-Smith, Galer, et al., 2021). Moreover, we concluded that for the 86 participants for whom the revised seizure subontology appeared to capture less information than its predecessor, much of this lost information would be considered inaccurate according to current epilepsy classifications.
Figure 3.

The number of HPO annotations relating to seizure descriptors that were assigned to 791 individuals from the merger of three different research cohorts before (HPO release date 2017-12-12) and after (HPO release date 2020-12-12) expert revision. Reproduced and adapted under CC-BY from D. Lewis-Smith, Galer, et al. (2021).
Furthermore, our findings address a more general question: does a new framework such as revised nomenclature or classification hold value for real-world clinical care or research? Translating this question into a comparison of the amount of information captured through two competing frameworks allows this to be answered in a data-driven manner. In this instance, integration of recent classifications to build a harmonized model of clinical domain knowledge suggests that this is the case.
The challenge of representing important absent phenotypes
In some disorders, it is the absence of particular clinical features that is characteristic, and potentially important for clinical reasons. With regard to the two SCN1A-related epilepsies discussed in the introduction, in people with Dravet Syndrome, developmental delay typically occurs after seizures begin, usually progressing to some degree of intellectual disability, whereas in those with Febrile Seizures Plus, developmental delay is unusual. When phenotypes are coded as HPO annotations or even lists of clinical features, the relevance of absent clinical information is not immediately clear. For example, if an individual’s propagated annotations do not include the term “Intellectual disability” [HP:0001249], is it safe to assume that this phenotypic feature is truly absent (that the individual did not have intellectual disability of any sort)? Alternatively, might it be present but not annotated because the clinician did not consider it relevant at the time of data collection, so it was not documented in the raw data, or because some manual or automated data capture from the raw data missed it? In formal systems of logic used for knowledge representation, a distinction is made between “open world” and “closed world” assumptions.
The more conservative assumption, used in the majority of HPO-based phenotypic studies, is open world, meaning that the absence of an annotation is not interpreted as the corresponding phenotype being absent. In this case, only those phenotypes that are explicitly recorded as being present are used for analysis. Consequently, one cannot distinguish between absence of evidence of the presence of a phenotype and explicit absence of the phenotype itself.
If a closed world assumption is made, the lack of phenotypic annotation represents the true absence of the feature. However, with regard to HPO annotations, a closed world assumption can typically only be made in a few scenarios. For example, in a study of individuals with developmental disorders, clinicians were required to complete a standardized data set recording each feature as explicitly present or absent, thus the absence of an HPO term among an individual’s annotations could be interpreted as meaning that the phenotype was actively sought and not found at the time of assessment (Andrews et al., 2015). The more common scenario is that phenotypic data are provided in a less standardized, typically opportunistic manner, and limited conclusions can be drawn from the lack of assigned clinical features, especially in datasets with relatively shallow phenotypic annotations. The Phenopacket format has been developed recently as part of the Global Alliance for Genomics and Health (GA4GH), and its PhenotypicFeature element allows clinical features to be recorded as being present or “excluded” using an ontology such as the HPO as a controlled dictionary (Jacobsen et al., 2021).
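As a rough illustration of how present and excluded features might sit side by side, the fragment below builds a Phenopacket-like record as a plain Python dictionary. The field names (`phenotypicFeatures`, `type`, `excluded`) are an assumption based on the JSON representation of the Phenopacket Schema and should be checked against the schema version in use; the record is deliberately incomplete (for example, it has no metadata block) and would not validate as a full phenopacket. The individual and the helper `split_features` are hypothetical.

```python
import json

# Hypothetical, incomplete Phenopacket-like fragment; field names are assumed
# from the schema's JSON representation and are not validated here.
phenopacket_fragment = {
    "id": "example-individual-1",  # invented identifier
    "phenotypicFeatures": [
        # Explicitly present feature: no "excluded" flag.
        {"type": {"id": "HP:0001250", "label": "Seizure"}},
        # Explicitly absent feature: flagged as excluded.
        {"type": {"id": "HP:0001249", "label": "Intellectual disability"}, "excluded": True},
    ],
}


def split_features(packet):
    """Separate explicitly present from explicitly excluded HPO term ids."""
    present, excluded = [], []
    for feature in packet.get("phenotypicFeatures", []):
        target = excluded if feature.get("excluded", False) else present
        target.append(feature["type"]["id"])
    return present, excluded


present, excluded = split_features(phenopacket_fragment)
print(json.dumps(phenopacket_fragment, indent=2))
print("present:", present, "excluded:", excluded)
```

Any term that appears in neither list remains unknown under an open world assumption.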
We were able to evaluate the value of including explicitly absent phenotypes in a study of 413 individuals with SCN2A-related neurodevelopmental disorders, which are associated with a range of developmental outcomes that are clinically important to capture (Crawford et al., 2021). After annotating individuals with positive HPO terms (explicitly present), we duplicated the HPO, modifying each term by adding “No” as a prefix to each human-readable name, and changing “HP:” machine-readable labels to “NP:”. We use this modified HPO as a notation of convenience for coding clinical features that are absent: “negative phenotypes”. We then assigned negative phenotypes when clinical reports unequivocally reported that the feature was absent. In this way, we were able to give 260 of 413 individuals a total of 475 negative base annotations. This enabled us to confirm that not only was the presence of “Autism” [HP:0000717] associated with protein-truncating rather than missense variants in SCN2A, but also that the explicit absence of autism (“No Autism” [NP:0000717]) was associated with missense variants. We found the converse with regard to seizures. Yet addition of explicit negative annotations generated the power to detect a more nuanced relationship between seizures and missense variants. In contrast to the positive association of missense variants with “Seizure” [HP:0001250], when discriminating by domain, missense variants in the S5–6 pore loop were associated with “No Seizure” [NP:0001250], suggesting that carriers of missense variants in this domain may manifest phenotypes more typical of protein-truncating variants.
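The negative-phenotype notation is a simple relabeling, which can be sketched as below. The function names are hypothetical, and the two-term dictionary stands in for the full ontology.

```python
def to_negative(hp_id, label):
    """Map a positive HPO term to its negative counterpart (HP: -> NP:, 'No ' prefix)."""
    if not hp_id.startswith("HP:"):
        raise ValueError(f"expected an identifier starting with 'HP:', got {hp_id!r}")
    return "NP:" + hp_id[len("HP:"):], "No " + label


def duplicate_as_negative(terms):
    """Build the negative copy of a dictionary of {HP id: human-readable name}."""
    return dict(to_negative(hp_id, label) for hp_id, label in terms.items())


# Two terms standing in for the whole ontology.
positive_terms = {"HP:0000717": "Autism", "HP:0001250": "Seizure"}
negative_terms = duplicate_as_negative(positive_terms)
print(negative_terms)
```

Keeping the numeric part of the identifier unchanged means that a negative term can always be traced back to the positive concept it negates.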
We were also able to assess the phenotyping gap at the resolution of single HPO terms, meaning the proportion of individuals in the cohort for which the evidence was insufficient to assert the presence or absence of the given clinical feature under an open world assumption (Figure 4). After propagation, 1,064 of 1,268 HPO concepts were annotated exclusively as present or absent in this cohort. Other than phenotypes that were ubiquitous in this cohort (“All” [HP:0000001], “Phenotypic abnormality” [HP:0000118], “Abnormality of the nervous system” [HP:0000707] and “Abnormal nervous system physiology” [HP:0012638]), these have very large phenotyping gaps (>50%) so for 1,059 annotations, we do not know if the phenotype was present. 204 HPO concepts were annotated as being present in at least one individual and absent in another. For example, the presence or absence of “Delayed speech and language development” [HP:0000750] was only documented in 23% of individuals (present in 10% and absent in 13%), leaving us blind to the language development of the majority of people with SCN2A disorders without further raw data. The median phenotypic gap across these 204 phenotypic concepts was 80%. This suggests that in general we should strongly favor an open world assumption where data are missing. However, depending on how raw data were collected (which may in turn depend on academic focus or clinical context), there are circumstances in which one might hypothesize that a closed world assumption may be reasonable. In this study, the presence of seizures might be expected to be reported whenever present, as enquiring about the occurrence of seizures would be expected as part of routine clinical assessment of patients presenting with neurodevelopmental concerns. Thus, in the absence of the explicit presence of seizures it might be reasonable to assume that they were absent. 
Regardless, we found that only 3% of the cohort were annotated with neither “Seizure” [HP:0001250] nor “No Seizure” [NP:0001250], indicating that authors of previous SCN2A case series and users of our EMR (local cohort) recognized the importance of assessing the occurrence of seizures and reporting their absence explicitly. Consequently, even in this case, the potential benefit of making a closed world assumption to fill missing data is minimal.
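The phenotyping gap for a single term is the fraction of the cohort with neither a positive nor a negative annotation for it. The sketch below uses an invented cohort of 100 individuals constructed to mirror the proportions quoted above for “Delayed speech and language development” [HP:0000750]; it is not the study data, and `phenotyping_gap` is a hypothetical function name.

```python
def phenotyping_gap(cohort, hp_id):
    """Fractions of a cohort with a term present, explicitly absent, or unknown.

    `cohort` maps individual ids to sets of propagated annotations, in which
    positive terms use 'HP:' ids and negative terms use 'NP:' ids.
    """
    np_id = "NP:" + hp_id.split(":", 1)[1]
    n = len(cohort)
    present = sum(hp_id in terms for terms in cohort.values())
    absent = sum(np_id in terms for terms in cohort.values())
    unknown = sum(hp_id not in terms and np_id not in terms for terms in cohort.values())
    return {"present": present / n, "absent": absent / n, "gap": unknown / n}


# Invented cohort: 10 with the feature, 13 explicitly without it, 77 undocumented.
cohort = {}
for i in range(100):
    if i < 10:
        cohort[f"P{i}"] = {"HP:0000750"}
    elif i < 23:
        cohort[f"P{i}"] = {"NP:0000750"}
    else:
        cohort[f"P{i}"] = set()

result = phenotyping_gap(cohort, "HP:0000750")
print(result)
```

Counting the undocumented individuals directly, rather than subtracting the documented fractions from one, keeps the calculation transparent if an individual were ever annotated inconsistently with both the positive and the negative term.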
Figure 4.

The proportion of individuals with SCN2A-related disorders with particular phenotypes using data from Crawford et al. (2021). (A) The percentage of individuals coded as having (blue) or not having (red) a selection of 204 phenotypes coded as present and absent in this cohort after propagation, as well as the phenotypic gap: the percentage of individuals in whom we cannot annotate the presence or absence of the phenotype. The six HPO concepts with the smallest phenotyping gap and a representative selection of the remainder are shown. (B) The distribution of the phenotypic gap for all 204 phenotypes that were coded as present in at least one and absent in at least one member of the cohort, ranked by phenotyping gap with only a selection labelled for clarity.
A particular technical matter needs consideration when using negative HPO annotations. Whereas propagation of positive phenotypes follows the is_a relationships to conceptually more generic, ancestral terms all the way to the HPO root node “All” [HP:0000001], the opposite is true for negative phenotypes, which require “downward propagation” so that the absence of generic concepts is used to infer the absence of their specific subtypes. For example, “No Global developmental delay” [NP:0001263] implies that there is “No Mild global developmental delay” [NP:0011342] and “No Severe global developmental delay” [NP:0011344]. Downward propagation, however, results in an unreasonably large number of inferred negative HPO annotations that could be analyzed but which are redundant for clinical interpretation. For example, in the SCN2A study downward propagation from “No Morphological abnormality of the central nervous system” [NP:0002011] resulted in a total of 695 negative annotations, corresponding to all descendants of this high-level term. In order to curb the number of negative annotations while allowing for meaningful data harmonization, we developed a pruning method that eliminates terms that add no further information beyond their parent term.
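The two directions of propagation can be sketched on a toy fragment of the ontology. The is_a links below are deliberately simplified (the real HPO has intermediate terms that are omitted here), and the pruning rule shown, which keeps only those negative terms that have no negative parent, is one plausible reading of the approach described above rather than the exact published method. All function names are hypothetical.

```python
# Simplified toy fragment of is_a links (child -> parents). Intermediate
# terms of the real HPO are omitted for brevity.
PARENTS = {
    "HP:0000118": ["HP:0000001"],  # Phenotypic abnormality -> All
    "HP:0000707": ["HP:0000118"],  # Abnormality of the nervous system
    "HP:0012638": ["HP:0000707"],  # Abnormal nervous system physiology
    "HP:0001263": ["HP:0012638"],  # Global developmental delay
    "HP:0011342": ["HP:0001263"],  # Mild global developmental delay
    "HP:0011344": ["HP:0001263"],  # Severe global developmental delay
}


def ancestors(term):
    """All terms reachable by following is_a links upward."""
    found, stack = set(), list(PARENTS.get(term, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(PARENTS.get(current, []))
    return found


def descendants(term):
    """All terms that have `term` among their ancestors."""
    return {child for child in PARENTS if term in ancestors(child)}


def propagate_positive(terms):
    """Presence of a term implies presence of every ancestor."""
    result = set(terms)
    for term in terms:
        result |= ancestors(term)
    return result


def propagate_negative(terms):
    """Absence of a term implies absence of every descendant."""
    result = set(terms)
    for term in terms:
        result |= descendants(term)
    return result


def prune_negative(negative_terms):
    """Assumed pruning rule: drop a negative term if a parent is also negative."""
    return {
        term for term in negative_terms
        if not any(parent in negative_terms for parent in PARENTS.get(term, []))
    }


positive = propagate_positive({"HP:0011342"})
negative = propagate_negative({"HP:0001263"})
print(sorted(positive))
print(sorted(negative), "->", sorted(prune_negative(negative)))
```

In this toy example the three inferred negative terms collapse back to the single most generic one, which carries the same information for clinical interpretation.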
While the phenotyping gap might be expected to be large for clinical features that are rare, this can also be the case for phenotypes that are common in the general population and overlooked in the EMR. For example, stuttering (also known as stammering) is common. However, Pruett et al. (2021) found it to be under-reported in EMR billing code data from the Vanderbilt University Medical Center, given the 1–3% prevalence in the general population. We too have found “Stuttering” [HP:0025268] to be conspicuous by its scarcity in the HPO annotations of several neurodevelopmental cohorts. To overcome the challenge of identifying people who stammer from EMR data, Pruett et al. trained a supervised classifier to use sets of Phecodes to predict stammering in their cohort. Phecodes are a representation of clinical phenotypes that can be used to identify cases and controls based on the mapping of International Classification of Diseases codes present in individuals’ EMR (Bastarache, 2021). The classifier enabled identification of 9,239 individuals predicted to have stammering from their research biorepository, empowering a common variant genome-wide association study that identified two loci (Shaw et al., 2021).
A further related consideration should be borne in mind if our interest is in knowing whether a genetic variant is associated with the presence or absence of a phenotype that is dynamic. If the phenotype is age-dependent, the age of the patient at the time of the clinical assessment will matter. For example, if none of a hypothetical cohort has “Delayed speech and language development” [HP:0000750] annotated, we might be more confident that their genetic variants are not positively associated with language delay if the entire cohort was assessed up to a minimum age of 10 years old than if half of the cohort were neonates in whom this phenotype cannot be assessed. This is a lesser concern for congenital phenotypes such as “Primary microcephaly” [HP:0011451] but greater for phenotypes that may not manifest until late adulthood such as “Parkinsonism” [HP:0001300]. One should also be careful before asserting that a phenotype that is transient and hard to capture is truly absent without caveats. Such scenarios may include the assessment in an adult of early life phenotypes that resolve in childhood (for example, many adults do not know for certain that they did not have febrile seizures as a child and their parents may no longer be available to confirm this). Furthermore, some dynamic phenotypes are not apparent to the patient or an onlooker without technical assistance and might be missed despite a complete medical history. For example, not all patients with seizures may have had electroencephalography (EEG), an investigation necessary to be able to assert that they have an “EEG abnormality” [HP:0002353]. Even were this clinical investigation performed and no abnormality found, this may not be sufficient to conclude that the patient would never have had an abnormality were continuous EEG feasible. Given the dynamic and stochastic nature of cerebral electrophysiology, some EEG abnormalities may have been missed due to sampling.
In summary, inclusion of explicitly absent phenotypic annotations allows additional information to be captured without the risk of assuming that a phenotype is absent, which is relevant for phenotypic features where both presence and absence are meaningful when interpreting clinical data. In the future, the explicit presence, absence, or uncertain status of core phenotypes of particular interest for a disease domain (for example, “Neurodevelopmental delay” [HP:0012758] and “Seizure” [HP:0001250] in neurodevelopmental disorders) may become established within reporting standards. However, negative HPO annotations play by their own rules and require distinct data harmonization techniques and critical interpretation given their conceptual difference from explicitly positive assigned phenotypic annotations.
Some phenotypes, particularly those based on laboratory assays, are quantitative and consequently require binning into ordinal categories that can be represented in the HPO, for example “Hyperphenylalaninemia” [HP:0004923] and “Hypophenylalaninemia” [HP:0500141]. Additionally, one could use negative annotations such as “No abnormal circulating phenylalanine concentration” [NP:0010893] to represent normal values. However, information within bins is lost and the location of thresholds must be decided to partition the range of values optimally for the relevant phenotype, and for some phenotypes such as serum testosterone concentration, these need to vary by age and sex. While ordinal annotations (low, normal, or high values) may be sufficient for gross cross-sectional phenotypic analysis of groups of individuals, they may be insufficiently powerful for longitudinal analysis of dynamic phenotypes or identifying relationships at high resolution, such as the 1.3–3.1 point reduction in intelligence quotient per 100 μmol/l increase in serum phenylalanine concentration in phenylketonuria (Waisbren et al., 2007). The Phenopacket format allows for documentation of quantitative measurements using standards such as the Logical Observation Identifiers Names and Codes (LOINC; Vreeman, McDonald, & Huff, 2010) with time stamps that could be used for quantitative analyses (Jacobsen et al., 2021).
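The loss of information caused by binning can be made concrete with a short sketch. The thresholds used here are illustrative placeholders only and must not be read as clinical reference limits; in practice they would be chosen per assay, age, and sex. The function name is hypothetical.

```python
def bin_phenylalanine(value_umol_l, lower=35.0, upper=120.0):
    """Map a serum phenylalanine concentration to an ordinal annotation.

    `lower` and `upper` are placeholder thresholds for illustration,
    not clinical reference limits.
    """
    if value_umol_l < lower:
        return "HP:0500141"  # Hypophenylalaninemia
    if value_umol_l > upper:
        return "HP:0004923"  # Hyperphenylalaninemia
    return "NP:0010893"      # No abnormal circulating phenylalanine concentration


# Hypothetical measurements in umol/l.
for value in (20.0, 60.0, 400.0, 1200.0):
    print(value, bin_phenylalanine(value))
```

Values of 400 and 1,200 μmol/l receive the same annotation, so any dose-dependent relationship within the high bin can no longer be recovered from the ordinal data alone.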
Longitudinal phenotype data captures age-specific clinical features
Understanding the temporal relationship of clinical features is critical, not only for diagnosis but also for prognosis and treatment decision-making. However, other than for more frequently encountered examples such as Trisomy 21 or 22q Deletion Syndrome, the phenotypic trajectories of neurodevelopmental disorders are largely unexplored at scale because of the efforts required to perform longitudinal studies in rare diseases and the lack of conceptual frameworks to analyze longitudinal, often piecemeal phenotypic data.
We have previously examined longitudinal phenotypic data by extracting features from EMR (Ganesan et al., 2020; D. Lewis-Smith, Ganesan, et al., 2021; Xian et al., 2021). The widespread adoption of EMR mandated by the American Recovery and Reinvestment Act of 2009 is an unprecedented opportunity to leverage clinical data generated as a byproduct of healthcare for genomic research. Large national and international initiatives have started to systematically link biorepositories to EMR data, including the NHGRI-funded eMERGE consortium (eMerge, 2007) and the Geisinger MyCode Community Healthcare Initiative (Geisinger Healthcare System, 2021). These combined resources have been used to identify the protective effect of ANGPTL4 for coronary artery disease and type 2 diabetes (Dewey et al., 2016; Gusarova et al., 2018), and a reduced risk of chronic liver disease in patients with variants in HSD17B13 (Abul-Husn et al., 2018). The systematic use of EMR data allowed these studies to include up to 80,000 cases and 500,000 controls. Extraction of detailed clinical phenotypes from EMR data for genetic studies requires robust phenotyping algorithms, whether these are to be used for genotype-phenotype analyses or for inference or validation of affectation state regarding a specific disease (sometimes called a ‘phenotype’) or trait in a genetic association study. As of 2018, more than 60 phenotyping algorithms were available, demonstrating that methods based on EMR data can be successfully developed and applied for de-identified data across institutions (Kirby et al., 2016). Similarly, national biobank projects such as the UK Biobank periodically update the phenotypic data available to researchers from hospital and primary care EMR codes, supplementing study-specific questionnaires and investigations, and both the UK Biobank core team and collaborating researchers contribute algorithms for identification of participants with a wide range of phenotypes (UK Biobank, 2007).
However, few algorithms specifically interrogate longitudinal clinical information.
In our longitudinal EMR-based study of children with genetic epilepsies, we developed an approach to analyze longitudinal clinical data taking advantage of HPO-based data harmonization and time stamps from the EMR (Ganesan et al., 2020; D. Lewis-Smith, Ganesan, et al., 2021). With clinical features translated into HPO annotations, we binned the annotations into 3-month age epochs using each participant’s date of birth and the date of the clinical encounter recording the phenotype. We were able to map longitudinal phenotype data from 62,104 clinical encounters pertaining to 658 individuals across a cumulative 3,251 patient-years, generating 286,085 age-specific annotations based on a repertoire of 528 unique HPO terms. These data allowed us to map how phenotypic features were associated with distinct genetic etiologies over time, such as the frequency of “Status Epilepticus” [HP:0002133] in 29 individuals with disease-causing variants in SCN1A compared to the remainder of the patient cohort (Figure 5). The approach developed in this study demonstrates how longitudinal phenotypic data can be harmonized and mapped to provide information about natural history and prognosis.
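The age binning step can be sketched as follows, assuming that each encounter is available as a date together with the set of HPO terms recorded at that encounter. Age is counted in completed calendar months, which is one of several reasonable conventions and is an assumption of this sketch; the function names and the example dates are hypothetical.

```python
from datetime import date


def age_in_months(dob, encounter):
    """Completed calendar months between date of birth and encounter."""
    months = (encounter.year - dob.year) * 12 + (encounter.month - dob.month)
    if encounter.day < dob.day:
        months -= 1  # the current month has not yet been completed
    return months


def epoch_index(dob, encounter, width_months=3):
    """Zero-based index of the age epoch containing the encounter."""
    return age_in_months(dob, encounter) // width_months


def bin_annotations(dob, encounters, width_months=3):
    """Group HPO terms by age epoch.

    `encounters` is a list of (date, set of HPO IDs) tuples.
    Returns a dict mapping epoch index -> set of HPO IDs seen in that epoch.
    """
    bins = {}
    for when, terms in encounters:
        bins.setdefault(epoch_index(dob, when, width_months), set()).update(terms)
    return bins


# Hypothetical individual and encounters (not real data).
dob = date(2015, 1, 15)
encounters = [
    (date(2015, 3, 10), {"HP:0002373"}),  # age 1 month  -> epoch 0
    (date(2015, 4, 20), {"HP:0002133"}),  # age 3 months -> epoch 1
    (date(2015, 5, 2), {"HP:0002373"}),   # age 3 months -> epoch 1
]
print(bin_annotations(dob, encounters))
```

Changing `width_months` allows the effect of different bin widths to be compared on the same data.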
Figure 5.

The longitudinal interrogation of HPO annotations from the EMR of patients with genetic epilepsies. (A–F) Stacked bar charts demonstrating how the number of patients with the given phenotype recorded and without the phenotype recorded varies with age. For example, Febrile seizures [HP:0002373] is coded most frequently in children aged 1–7 years of age but only in a minority of individuals with clinical encounters over this age range. (G) How status epilepticus is particularly common in children under the age of 5 years with diagnostic SCN1A variants compared to those without. Reproduced and adapted under CC-BY from Ganesan et al. (2020).
Several key considerations are relevant with respect to longitudinal mapping of phenotypic data. First, selection of time interval bin width is often arbitrary. While we compared the effect of various age bins, the use of constant time intervals across the entire age range can be reasonably questioned and the use of logarithmic scales has been suggested to better reflect human biology over time. Second, application of longitudinal phenotypic data emphasizes the concept of EMR usage. EMR usage refers to the interval between the first and most recent EMR encounter, reflecting the time frame where clinical information on an individual can be retrieved and analyzed; it is the chronological window to which we are limited when observing a person via their EMR. The effect of EMR usage is understudied but likely to play an important role in the interpretation of longitudinal clinical data. For example, in the EMR cohort discussed above, no more than 250 of 658 individuals had EMR usage at any given age. In fact, for some disease groups as defined by genetic etiology, groups were temporally separated in the EMR data; individuals within these groups never overlapped chronologically, restricting our ability to make age-matched phenotypic comparisons. For example, all individuals with PRRT2-related diseases only had EMR usage up to age 5 years and those with GRIN2A-related diseases did not have any data before age 6 years so the clinical features of these two disorders were not directly compared (Ganesan et al., 2020; D. Lewis-Smith, Ganesan, et al., 2021). Future studies could compare the ages at which HPO terms can be inferred from the EMR to mitigate non-overlapping EMR usage. 
Third, individuals have often been cared for at multiple institutions (they may not have reached our center for a specialist assessment until they had had phenotypes for some time or they may be discharged to a local healthcare provider for follow up) and unless EMR data are visible across institutional boundaries (for example national and regional services and networks), a comprehensive perspective of each individual’s dynamic phenotype requires collection of data from each institution within data protection regulations. Companies such as Ciitizen are now helping patient advocacy groups with such efforts (Ciitizen Corporation). In summary, EMR usage is a critically relevant phenomenon in real-world EMR data with consequences for data analysis that may be underestimated.
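EMR usage and its consequences for age-matched comparisons can be sketched as below. The sketch assumes that the ages at each encounter are known in years, and it treats the coverage of a group as one contiguous span from the earliest first encounter to the latest last encounter, which is a simplification. The function names and intervals are hypothetical and only loosely mirror the PRRT2 and GRIN2A example.

```python
def usage_interval(encounter_ages):
    """EMR usage of one individual: (age at first, age at last encounter)."""
    return min(encounter_ages), max(encounter_ages)


def n_with_usage_at(age, intervals):
    """Number of individuals whose EMR usage covers the given age."""
    return sum(1 for first, last in intervals if first <= age <= last)


def group_overlap(intervals_a, intervals_b):
    """Age range covered by both groups, or None if they never overlap.

    Each group is summarized by the span from its earliest first encounter
    to its latest last encounter (a simplifying assumption).
    """
    start = max(min(f for f, _ in intervals_a), min(f for f, _ in intervals_b))
    end = min(max(l for _, l in intervals_a), max(l for _, l in intervals_b))
    return (start, end) if start <= end else None


# Hypothetical usage intervals in years (not real data).
group_early = [(0.5, 4.8), (1.0, 3.2)]
group_late = [(6.1, 12.0), (7.5, 9.0)]

print(group_overlap(group_early, group_late))
print(n_with_usage_at(2.0, group_early + group_late))
```

When `group_overlap` returns None, no age-matched comparison between the two groups is possible from the EMR data alone.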
Common Data Elements can be combined with biomedical ontology data
In pediatric epileptology, initiatives such as the Epilepsy Foundation-championed Epilepsy Learning Health System and the Pediatric Epilepsy Learning Health System are developing standardized methods to assess disease severity using Common Data Elements (CDE) that can be directly integrated into the EMR across different healthcare networks, facilitating not only local but harmonized national quality evaluation and improvement (Grinspan et al., 2021). In a study by Fitzgerald and collaborators, we reported the first analysis of seizure frequency using an EMR-implemented CDE form to document disease severity for 1,696 patient encounters involving 1,038 patients (Fitzgerald et al., 2021). This allowed us to describe how seizure frequency varied as the COVID-19 pandemic necessitated transition from predominantly face-to-face to telemedicine encounters, and by combining clinical data with routine demographic data we were able to assess racial and socioeconomic healthcare disparities.
We have recently combined our HPO- and CDE-based methodologies to analyze how clinical features including both phenotypes and treatment responses vary in a single genetically defined disorder (Xian et al., 2021). Within our international study of 534 people with STXBP1-related disorders, we reconstructed the clinical features of 62 individuals recruited from our center across 4,433 monthly intervals, demonstrating a dynamic seizure pattern in the first year of life that was previously underappreciated. Combined with prescriptions for anti-seizure medication (ASM) extracted from the EMR, this allowed us to outline the medication landscape of STXBP1-related disorders and assess the comparative effectiveness of various ASM. This method enables reconstruction of disease histories and can easily be adapted to other conditions. Figure 6 demonstrates a comparable analysis in 13 individuals with SCN8A-related epilepsies followed in our center, showing a gradual increase of seizure frequency in infancy compared to those with STXBP1-related disorders (Xian et al., 2021) and differences in their responses to specific treatments.
Figure 6.

The distribution of seizure frequencies and prescription of various antiseizure treatments with age as well as the odds ratios of achieving a reduction in seizure frequency or maintaining seizure freedom for a selection of medications and the ketogenic diet in patients at our center with (A) SCN8A-related and (B) STXBP1-related disorders. Seizures tend to become more common over the first year of life among people with SCN8A-related disorders, and those taking oxcarbazepine (a sodium channel blocker) are most likely to experience an improvement in seizure frequency. Regarding STXBP1, seizures tend to become less common over the first year of life, typically responding well to the ketogenic diet and clobazam. However, those requiring antiseizure medication into adulthood commonly take levetiracetam rather than alternative treatments. Panel B was reproduced and adapted under CC-BY from Xian et al. (2021).
Digitization of rare disease cohorts creates phenotypic references for genomic diagnostics
As many neurodevelopmental disorders include a broad phenotypic range and various known and as yet unidentified subgroups, it is desirable to describe the entire phenotypic landscape of a disorder or genetic etiology. In 2020, we undertook this for SCN2A, making it the first of the established monogenic neurodevelopmental disorder genes to have its phenotypic landscape represented digitally, i.e., the clinical features of all individuals reported in the literature supplemented by those followed at our center were systematically coded in HPO terms and analyzed jointly, akin to a GWAS mega-analysis (Crawford et al., 2021). This combined data set included 10,860 HPO annotations in 413 individuals, allowing us to examine the genotype-phenotype relationships of a gene-defined disorder quantitatively in unprecedented detail. By adapting the semantic similarity approach from natural language analytics to clinical phenotypes, we found that in the case of eight recurrent variants, people sharing the same variant were more similar clinically than expected by chance. We used principal component analysis to identify the distinct phenotypic dimensions that contribute to clinical constellations seen in SCN2A-related disorders. This study was followed by our study of STXBP1-related disorders (Xian et al., 2021) and preceded by a joint analysis of all prior major multicenter genetic studies of developmental and epileptic encephalopathies with a total of 31,742 HPO terms in 846 individuals (Galer et al., 2020). Given that all three studies use the HPO as a joint phenotyping framework, information can easily be combined and jointly analyzed (D. Lewis-Smith, Galer, et al., 2021) or combined with even larger datasets, such as the Deciphering Developmental Disorders study that currently provides more than 600,000 harmonized clinical annotations in more than 9,000 individuals with genomic data (Firth & Wright, 2011). 
Accordingly, these harmonized datasets not only provide unprecedented insight into the range of phenotypic features in neurodevelopmental disorders, but also provide a critical reference framework for future diagnostic approaches in genomic medicine, particularly where data collected prior to any potential bias from knowledge of the genetic diagnosis can be identified. One can hope that in the future phenotypic data will increasingly be made available in the GA4GH Phenopackets format, which not only includes present and explicitly absent phenotypes but also allows each PhenotypicFeature to be annotated with elements that modify it, such as indicators of severity and chronological data including the age of onset and resolution (Jacobsen et al., 2021). A universally accepted format capturing such phenotypic richness would empower reuse in future analyses of valuable data that might already have been published.
Distinct genetic etiologies have recognizable phenotypic constellations
A harmonized genomic and clinical dataset enables a joint analysis of clinical and genomic data. In 2019, we used harmonized clinical and genetic data to identify AP2M1 as a novel genetic etiology through a method referred to as phenotypic similarity analysis (Helbig et al., 2019). Algorithms assessing phenotypic similarity aim to formalize an intuitive concept in clinical practice: that those individuals whose overall constellations of clinical features are remarkably similar are likely to share the same etiology (Figure 7). Historically, such reasoning has led to the discovery of the underlying cause of many monogenic neurodevelopmental disorders, such as Rett Syndrome and Dravet Syndrome. Yet these discoveries depended on a single expert clinician being fortunate enough to assess sufficient patients with these disorders to recognize that these patients had characteristic patterns of clinical features. Phenotypic similarity algorithms aim to replicate this discovery process through a formal comparison of phenotypic features across an entire cohort. Once such algorithms can be successfully deployed on adequate and harmonized data, the distribution of similarities for individuals grouped by shared genetic features compared to a Monte Carlo generated empirical null distribution can identify novel genetic etiologies based on phenotypic evidence. We demonstrated that individuals with de novo variants in AP2M1 were more similar than expected by chance in a cohort with developmental and epileptic encephalopathies, emphasizing that such a framework can be generalized to larger datasets, even if these cohorts have broadly similar clinical disorders. 
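The comparison against a Monte Carlo generated null distribution can be sketched as follows, assuming that a symmetric matrix of pairwise phenotypic similarity scores for the whole cohort has already been computed. The add-one correction in the empirical p-value is a common convention adopted here as an assumption, and the function names and toy similarity matrix are hypothetical.

```python
import random
from itertools import combinations
from statistics import median


def group_median(sim, members):
    """Median pairwise similarity among the given individuals."""
    return median(sim[i][j] for i, j in combinations(sorted(members), 2))


def empirical_p(sim, members, n_draws=10000, seed=0):
    """Empirical p-value for a group being at least this similar by chance.

    Random groups of the same size are drawn from the whole cohort and
    their median similarity is compared with the observed median.
    """
    rng = random.Random(seed)
    observed = group_median(sim, members)
    n, k = len(sim), len(members)
    hits = sum(
        group_median(sim, rng.sample(range(n), k)) >= observed
        for _ in range(n_draws)
    )
    return (hits + 1) / (n_draws + 1)  # add-one correction (assumed convention)


# Toy cohort of six individuals: only individuals 0 and 1 resemble each other.
n = 6
sim = [[1.0] * n for _ in range(n)]
sim[0][1] = sim[1][0] = 9.0

print(group_median(sim, [0, 1]))
print(empirical_p(sim, [0, 1], n_draws=2000, seed=1))
```

In a cohort this small the p-value cannot fall much below 1 in 15, the chance of drawing the same pair at random, which illustrates why adequate cohort size matters for this kind of analysis.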
In our subsequent study combining data from three cohorts with developmental and epileptic encephalopathies, we used the same approach to show that the HPO data available were sufficient to demonstrate that 11/41 genetic etiologies (including prominent genes such as SCN1A, STXBP1, and KCNB1) had distinctive phenotypic patterns (Galer et al., 2020). In this same article, we introduced novel visualizations to assess the contribution of harmonized phenotypic terms referred to as ‘phenograms’. For example, the phenogram of those with de novo SCN1A variants showed a strong pattern consistent with the features observed in Dravet Syndrome, even though none of these individuals had been given this diagnosis clinically (Figure 8). This example demonstrates the power of computational phenotyping approaches: known disease entities can be reconstructed from clinical information sufficiently sparse to preclude manual diagnosis in clinic. Additionally, such approaches can be adapted using binned longitudinal EMR data to map the ages at which individuals sharing the same genetic etiology are more or less clinically homogeneous (D. Lewis-Smith, Ganesan, et al., 2021).
Figure 7.

An example of phenotypic similarity analysis using the simmax algorithm. (A) The annotations of two individuals are compared to each other. For each pairwise comparison of a phenotype from P1 and P2 the most informative common ancestor (MICA) is found. The MICA is the term that is an ancestor of the two terms being compared with the highest information content (IC). The similarity of the two terms being compared is defined as the IC of their MICA. Once this has been completed for all pairwise comparisons of phenotypes, the overall similarity of P1 and P2 is calculated as the sum of highest similarity of each of P1’s annotations and each of P2’s annotations. The denominator of 2 helps with comparison of the similarity score calculated using this algorithm to those obtained using other similarity algorithms. (B) The median similarity of individuals grouped according to a genetic feature such as de novo variants in AP2M1 is compared to the null distribution of median similarity scores generated by Monte Carlo simulation for groups of the same number of individuals (in this case n = 2) to yield an empirical p-value, estimating the probability of having observed a similarity this great due to chance in this cohort. Panel B created using data from Helbig et al. (2019).
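The simmax calculation described in this caption can be sketched on a toy ontology. The term labels, is_a links, and information content values below are invented for illustration; in practice the IC of a term would be derived from its frequency in a reference cohort after propagation. The function names are hypothetical.

```python
# Toy ontology (child -> parents) and illustrative IC values.
PARENTS = {
    "root": [],
    "A": ["root"], "B": ["root"],
    "A1": ["A"], "A2": ["A"], "B1": ["B"],
}
IC = {"root": 0.0, "A": 1.0, "B": 1.5, "A1": 3.0, "A2": 2.5, "B1": 4.0}


def ancestors_inclusive(term):
    """The term itself plus every term reachable via is_a links."""
    found, stack = {term}, list(PARENTS[term])
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(PARENTS[current])
    return found


def mica_ic(term_1, term_2):
    """IC of the most informative common ancestor of two terms."""
    common = ancestors_inclusive(term_1) & ancestors_inclusive(term_2)
    return max(IC[term] for term in common)


def simmax(p1, p2):
    """Sum of best matches in both directions, divided by 2."""
    best_for_p1 = sum(max(mica_ic(a, b) for b in p2) for a in p1)
    best_for_p2 = sum(max(mica_ic(a, b) for a in p1) for b in p2)
    return (best_for_p1 + best_for_p2) / 2


# Two hypothetical individuals sharing one specific and one related term.
print(simmax(["A1", "B1"], ["A2", "B1"]))  # prints 5.0
```

Here the shared term B1 contributes its full IC of 4.0 in each direction, whereas A1 and A2 only match through their common parent A with an IC of 1.0, giving (5.0 + 5.0) / 2 = 5.0.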
Figure 8.

HPO-based visualizations demonstrate the clinical features associated with de novo variants in SCN1A in published cohorts with developmental and epileptic encephalopathies. (A) The frequency of annotation of HPO terms in carriers of SCN1A de novo variants versus non-carriers regardless of age. (B) The same data presented to demonstrate the conceptual relationships between associated features within the structure of the HPO. p-values were calculated using Fisher’s exact test. Reproduced and adapted under CC-BY from D. Lewis-Smith, Galer, et al. (2021) using data from Galer et al. (2020).
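The per-term comparison of carriers and non-carriers can be sketched with a small two-sided Fisher's exact test written with the standard library only. It sums the probabilities of all tables that are no more likely than the observed one, which is one common definition of the two-sided p-value and is stated here as an assumption about the convention used. The counts in the example are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: carriers with the term      b: carriers without the term
    c: non-carriers with the term  d: non-carriers without the term
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x carriers having the term.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )


# Hypothetical counts: 8 of 10 carriers versus 20 of 200 non-carriers.
print(fisher_exact_two_sided(8, 2, 20, 180))
```

When many HPO terms are tested in this way, the resulting p-values would usually need correction for multiple comparisons.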
How can computational assessment of phenotypic features aid variant interpretation?
Quantitative phenotyping based on widely accepted models such as the HPO can provide a framework through which to operationalize concepts such as phenotypic overlap at scale, including measures of significance and uncertainty. We previously demonstrated that existing HPO annotations from relatively modestly sized patient cohorts enable prediction of underlying genetic etiologies with high positive predictive values. For example, the combination of “Febrile Seizures” [HP:0002373], “Generalized Tonic-Clonic Seizures” [HP:0002069], “Infantile Onset” [HP:0003593], “EEG with Spike-Wave Complexes” [HP:0010850], and “Focal Motor Seizures” [HP:0011153] has a positive predictive value of 0.9 for SCN1A de novo variants in children with unexplained yet suspected genetic developmental and epileptic encephalopathies (Galer et al., 2020). Thus, even if individual terms appear non-specific, specific combinations allow for sufficiently powered predictions. Rare disease syndromes from Orphanet (Orphanet, 1997) are now being annotated with HPO terms, each with an estimate of the frequency with which it is observed in the disorder through the HPO – ORDO Ontology Module, which integrates the Orphanet Rare Disease Ontology with the HPO (Kohler et al., 2019). Similar efforts are being made with monogenic disorders in OMIM (McKusick-Nathans Institute of Genetic Medicine, 2020) and sets of HPO terms are available on the HPO website (Core Human Phenotype Ontology/Monarch Initiative Team). The value of these resources as reference data that can be compared to the features of a particular individual during variant prioritization will increase as manually curated gene-specific phenomic results become available from international efforts to collate clinical data from large cohorts such as the digitization of the clinical features of SCN2A and STXBP1 discussed above. 
For example, Genomics England now provides researchers with the frequency of HPO phenotypes among 100,000 Genomes Project participants grouped by gene and variant classification (Turro et al., 2020). The development of algorithms optimized for comparison of an individual’s HPO annotations to a reference set continues (Robinson et al., 2020). At the time of writing, the means by which such frameworks can be meaningfully integrated into ACMG/AMP criteria is unclear given that phenotypic features are often gene-specific rather than variant-specific and detailed phenotypic considerations contribute only supporting evidence (Richards et al., 2015). If validated, increased weighting of phenotypic evidence may be particularly helpful in the molecular diagnosis of people without both parents available for demonstration of de novo status (a not infrequent scenario for clinicians caring for adults with neurodevelopmental disorders), particularly if this unlocks access to precision treatments. Even if phenomic tools provide evidence outside of formal variant interpretation criteria, they are likely to reduce lag to genetic diagnosis if successfully implemented within clinical decision support and learning health systems.
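The positive predictive value of a combination of HPO terms, as discussed above for SCN1A, can be sketched as the fraction of individuals carrying every term in the combination who have the genetic etiology of interest. The sketch assumes that each individual's terms have already been propagated; the function name and the toy cohort are hypothetical and do not reproduce the published analysis.

```python
def positive_predictive_value(cohort, required_terms, gene):
    """PPV of a term combination for a given genetic etiology.

    `cohort` is a list of (set of propagated HPO IDs, gene or None) tuples.
    Returns None if nobody in the cohort has the full combination.
    """
    matches = [g for terms, g in cohort if required_terms <= terms]
    if not matches:
        return None
    return sum(1 for g in matches if g == gene) / len(matches)


# Hypothetical cohort (not real data).
toy_cohort = [
    ({"HP:0002373", "HP:0002069", "HP:0003593"}, "SCN1A"),
    ({"HP:0002373", "HP:0002069"}, "SCN1A"),
    ({"HP:0002373", "HP:0002069"}, None),   # unsolved individual
    ({"HP:0002373"}, "SCN1A"),               # lacks the full combination
    ({"HP:0002069"}, "STXBP1"),
]

combo = {"HP:0002373", "HP:0002069"}
print(positive_predictive_value(toy_cohort, combo, "SCN1A"))
```

Three individuals have both terms and two of them carry SCN1A variants, so the combination has a PPV of 2/3 in this toy cohort, whereas either term alone would be less specific.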
Acknowledgements
This research was funded in whole, or in part, by the Wellcome Trust [203914/Z/16/Z] supporting D.L.S. I.H. was supported by The Hartwell Foundation (Individual Biomedical Research Award), the National Institute for Neurological Disorders and Stroke (K02 NS112600), the Eunice Kennedy Shriver National Institute of Child Health and Human Development through the Intellectual and Developmental Disabilities Research Center (IDDRC) at Children’s Hospital of Philadelphia and the University of Pennsylvania (U54 HD086984), and by the German Research Foundation (HE5415/3-1, HE5415/5-1, HE5415/6-1, HE5415/7-1). Research reported in this publication was also supported by the National Center for Advancing Translational Sciences of the National Institutes of Health (UL1TR001878), by the Institute for Translational Medicine and Therapeutics (ITMAT) at the Perelman School of Medicine of the University of Pennsylvania, and by Children’s Hospital of Philadelphia through the Epilepsy NeuroGenetics Initiative (ENGIN).
Footnotes
Disclosures
I.H. serves on the Scientific Advisory Board of Biogen. R.H.T. has received honoraria and meeting support from Arvelle, Bial, Eisai, GW Pharma, LivaNova, Novartis, Sanofi, UCB Pharma, UNEEG and Zogenix. The other authors report no competing interests.
Data availability statement
Requests for access to de-identified data depicted in panel B of Figure 6 should be made to the corresponding author. Other data presented in this review are taken from published sources, to which readers should refer for data availability statements.
References
- Abul-Husn NS, Cheng X, Li AH, Xin Y, Schurmann C, Stevis P, . . . Dewey FE (2018). A Protein-Truncating HSD17B13 Variant and Protection from Chronic Liver Disease. N Engl J Med, 378(12), 1096–1106. doi: 10.1056/NEJMoa1712191 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Akawi N, McRae J, Ansari M, Balasubramanian M, Blyth M, Brady AF, . . . study, D. D. D. (2015). Discovery of four recessive developmental disorders using probabilistic genotype and phenotype matching among 4,125 families. Nat Genet, 47(11), 1363–1369. doi: 10.1038/ng.3410 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Amir RE, Van den Veyver IB, Wan M, Tran CQ, Francke U, & Zoghbi HY (1999). Rett syndrome is caused by mutations in X-linked MECP2, encoding methyl-CpG-binding protein 2. Nat Genet, 23(2), 185–188. doi: 10.1038/13810 [DOI] [PubMed] [Google Scholar]
- Andrews T, Meader S, Vulto-van Silfhout A, Taylor A, Steinberg J, Hehir-Kwa J, . . . Webber C (2015). Gene networks underlying convergent and pleiotropic phenotypes in a large and systematically-phenotyped cohort with heterogeneous developmental disorders. PLoS Genet, 11(3), e1005012. doi: 10.1371/journal.pgen.1005012 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bastarache L (2021). Using Phecodes for Research with the Electronic Health Record: From PheWAS to PheRS. Annual Review of Biomedical Data Science, 4(1), 1–19. doi: 10.1146/annurev-biodatasci-122320-112352 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bodenreider O (2004). The Unified Medical Language System (UMLS): integrating biomedical terminology. Nucleic Acids Res, 32(Database issue), D267–270. doi: 10.1093/nar/gkh061 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Claes L, Del-Favero J, Ceulemans B, Lagae L, Van Broeckhoven C, & De Jonghe P (2001). De novo mutations in the sodium-channel gene SCN1A cause severe myoclonic epilepsy of infancy. Am J Hum Genet, 68(6), 1327–1332. doi: 10.1086/320609 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Core Human Phenotype Ontology/Monarch Initiative Team. The Human Phenotype Ontology. Retrieved from https://hpo.jax.org/app/
- Crawford K, Xian J, Helbig KL, Galer PD, Parthasarathy S, Lewis-Smith D, . . . Helbig I (2021). Computational analysis of 10,860 phenotypic annotations in individuals with SCN2A-related disorders. Genet Med. doi: 10.1038/s41436-021-01120-1
- Dewey FE, Gusarova V, O’Dushlaine C, Gottesman O, Trejos J, Hunt C, . . . Shuldiner AR (2016). Inactivating Variants in ANGPTL4 and Risk of Coronary Artery Disease. N Engl J Med, 374(12), 1123–1133. doi: 10.1056/NEJMoa1510926
- Dravet C (2011). The core Dravet syndrome phenotype. Epilepsia, 52 Suppl 2, 3–9. doi: 10.1111/j.1528-1167.2011.02994.x
- eMerge. (2007). Retrieved from https://emerge-network.org
- Escayg A, MacDonald BT, Meisler MH, Baulac S, Huberfeld G, An-Gourfinkel I, . . . Malafosse A (2000). Mutations of SCN1A, encoding a neuronal sodium channel, in two families with GEFS+2. Nat Genet, 24(4), 343–345. doi: 10.1038/74159
- Firth HV, & Wright CF (2011). The Deciphering Developmental Disorders (DDD) study. Dev Med Child Neurol, 53(8), 702–703. doi: 10.1111/j.1469-8749.2011.04032.x
- Fitzgerald MP, Kaufman MC, Massey SL, Fridinger S, Prelack M, Ellis C, . . . Helbig I (2021). Assessing seizure burden in pediatric epilepsy using an electronic medical record-based tool through a common data element approach. Epilepsia, 62(7), 1617–1628. doi: 10.1111/epi.16934
- Gahl WA, Mulvihill JJ, Toro C, Markello TC, Wise AL, Ramoni RB, . . . UDN (2016). The NIH Undiagnosed Diseases Program and Network: Applications to modern medicine. Mol Genet Metab, 117(4), 393–400. doi: 10.1016/j.ymgme.2016.01.007
- Galer PD, Ganesan S, Lewis-Smith D, McKeown SE, Pendziwiat M, Helbig KL, . . . Helbig I (2020). Semantic similarity analysis reveals robust gene-disease relationships in developmental and epileptic encephalopathies. Am J Hum Genet, 107(4), 683–697. doi: 10.1016/j.ajhg.2020.08.003
- Ganesan S, Galer PD, Helbig KL, McKeown SE, O’Brien M, Gonzalez AK, . . . Helbig I (2020). A longitudinal footprint of genetic epilepsies using automated electronic medical record interpretation. Genet Med, 22, 2060–2070. doi: 10.1038/s41436-020-0923-1
- Geisinger Healthcare System. (2021). MyCode Community Healthcare Initiative. Retrieved from https://www.geisinger.org/precision-health/mycode
- Grinspan ZM, Patel AD, Shellhaas RA, Berg AT, Axeen ET, Bolton J, . . . Pediatric Epilepsy Learning Healthcare System (2021). Design and implementation of electronic health record common data elements for pediatric epilepsy: Foundations for a learning health care system. Epilepsia, 62(1), 198–216. doi: 10.1111/epi.16733
- Gusarova V, O’Dushlaine C, Teslovich TM, Benotti PN, Mirshahi T, Gottesman O, . . . Gromada J (2018). Genetic inactivation of ANGPTL4 improves glucose homeostasis and is associated with reduced risk of diabetes. Nat Commun, 9(1), 2252. doi: 10.1038/s41467-018-04611-z
- Hagberg B, Aicardi J, Dias K, & Ramos O (1983). A progressive syndrome of autism, dementia, ataxia, and loss of purposeful hand use in girls: Rett’s syndrome: report of 35 cases. Ann Neurol, 14(4), 471–479. doi: 10.1002/ana.410140412
- Helbig I, Lopez-Hernandez T, Shor O, Galer P, Ganesan S, Pendziwiat M, . . . Consortium, G. (2019). A recurrent missense variant in AP2M1 impairs clathrin-mediated endocytosis and causes developmental and epileptic encephalopathy. Am J Hum Genet, 104(6), 1060–1072. doi: 10.1016/j.ajhg.2019.04.001
- Jacobsen JOB, Baudis M, Baynam GS, Beckmann JS, Beltran S, Callahan TJ, . . . Robinson PN (2021). The GA4GH Phenopacket schema: A computable representation of clinical data for precision medicine. medRxiv, 2021.11.27.21266944. doi: 10.1101/2021.11.27.21266944
- Kirby JC, Speltz P, Rasmussen LV, Basford M, Gottesman O, Peissig PL, . . . Denny JC (2016). PheKB: a catalog and workflow for creating electronic phenotype algorithms for transportability. Journal of the American Medical Informatics Association: JAMIA, 23(6), 1046–1052. doi: 10.1093/jamia/ocv202
- Kohler S, Carmody L, Vasilevsky N, Jacobsen JOB, Danis D, Gourdine JP, . . . Robinson PN (2019). Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources. Nucleic Acids Res, 47(D1), D1018–D1027. doi: 10.1093/nar/gky1105
- Kohler S, Doelken SC, Mungall CJ, Bauer S, Firth HV, Bailleul-Forestier I, . . . Robinson PN (2014). The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data. Nucleic Acids Res, 42(Database issue), D966–974. doi: 10.1093/nar/gkt1026
- Kohler S, Gargano M, Matentzoglu N, Carmody LC, Lewis-Smith D, Vasilevsky NA, . . . Robinson PN (2021). The Human Phenotype Ontology in 2021. Nucleic Acids Res, 49(D1), D1207–D1217. doi: 10.1093/nar/gkaa1043
- Kohler S, Vasilevsky NA, Engelstad M, Foster E, McMurry J, Ayme S, . . . Robinson PN (2017). The Human Phenotype Ontology in 2017. Nucleic Acids Res, 45(D1), D865–D876. doi: 10.1093/nar/gkw1039
- Lewis-Smith D, Galer PD, Balagura G, Kearney H, Ganesan S, Cosico M, . . . Helbig I (2021). Modeling seizures in the Human Phenotype Ontology according to contemporary ILAE concepts makes big phenotypic data tractable. Epilepsia, 62(6), 1293–1305. doi: 10.1111/epi.16908
- Lewis-Smith D, Ganesan S, Galer PD, Helbig KL, McKeown SE, O’Brien M, . . . Helbig I (2021). Phenotypic homogeneity in childhood epilepsies evolves in gene-specific patterns across 3251 patient-years of clinical data. Eur J Hum Genet. doi: 10.1038/s41431-021-00908-8
- Lewis-Smith D, Ganesan S, Galer PD, Krause R, Thomas RH, Helbig I, & Epi25 Collaborative (2021). An Expanded Genotype-Phenotype Analysis of 11,500 People with Epilepsy. Paper presented at the American Epilepsy Society Annual Meeting, Chicago. https://cms.aesnet.org/abstractslisting/an-expanded-genotype-phenotype-analysis-of-11-500-people-with-epilepsy
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University (Baltimore, MD). (2020). Online Mendelian Inheritance in Man, OMIM®. Retrieved from https://omim.org/
- NHS Health Education England. (2020). Requesting whole genome sequencing: information for clinicians. Retrieved from https://www.genomicseducation.hee.nhs.uk/supporting-the-nhs-genomic-medicine-service/requesting-whole-genome-sequencing-information-for-clinicians/
- Northrup H, Aronow ME, Bebin EM, Bissler J, Darling TN, de Vries PJ, . . . Krueger DA (2021). Updated International Tuberous Sclerosis Complex Diagnostic Criteria and Surveillance and Management Recommendations. Pediatr Neurol, 123, 50–66. doi: 10.1016/j.pediatrneurol.2021.07.011
- Orphanet. (1997). The Orphanet database. Retrieved from www.orpha.net
- Pruett DG, Shaw DM, Chen HH, Petty LE, Polikowsky HG, Kraft SJ, . . . Below JE (2021). Identifying developmental stuttering and associated comorbidities in electronic health records and creating a phenome risk classifier. J Fluency Disord, 68, 105847. doi: 10.1016/j.jfludis.2021.105847
- Rett A (1966). [On an unusual brain atrophy syndrome in hyperammonemia in childhood]. Wien Med Wochenschr, 116(37), 723–726. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/5300597
- Richards S, Aziz N, Bale S, Bick D, Das S, Gastier-Foster J, . . . Rehm HL (2015). Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med, 17(5), 405–424. doi: 10.1038/gim.2015.30
- Robinson PN, Kohler S, Bauer S, Seelow D, Horn D, & Mundlos S (2008). The Human Phenotype Ontology: a tool for annotating and analyzing human hereditary disease. Am J Hum Genet, 83(5), 610–615. doi: 10.1016/j.ajhg.2008.09.017
- Robinson PN, Ravanmehr V, Jacobsen JOB, Danis D, Zhang XA, Carmody LC, . . . Smedley D (2020). Interpretable Clinical Genomics with a Likelihood Ratio Paradigm. Am J Hum Genet, 107(3), 403–417. doi: 10.1016/j.ajhg.2020.06.021
- Shaw DM, Polikowsky HP, Pruett DG, Chen HH, Petty LE, Viljoen KZ, . . . Below JE (2021). Phenome risk classification enables phenotypic imputation and gene discovery in developmental stuttering. Am J Hum Genet, 108(12), 2271–2283. doi: 10.1016/j.ajhg.2021.11.004
- Turro E, Astle WJ, Megy K, Graf S, Greene D, Shamardina O, . . . Ouwehand WH (2020). Whole-genome sequencing of patients with rare diseases in a national health system. Nature, 583(7814), 96–102. doi: 10.1038/s41586-020-2434-2
- UK Biobank. (2007). Retrieved from www.ukbiobank.ac.uk
- Vreeman DJ, McDonald CJ, & Huff SM (2010). LOINC® - A Universal Catalog of Individual Clinical Observations and Uniform Representation of Enumerated Collections. Int J Funct Inform Personal Med, 3(4), 273–291. doi: 10.1504/ijfipm.2010.040211
- Waisbren SE, Noel K, Fahrbach K, Cella C, Frame D, Dorenbaum A, & Levy H (2007). Phenylalanine blood levels and clinical outcomes in phenylketonuria: A systematic literature review and meta-analysis. Molecular Genetics and Metabolism, 92(1), 63–70. doi: 10.1016/j.ymgme.2007.05.006
- World Health Organization. (1992). The ICD-10 classification of mental and behavioural disorders: clinical descriptions and diagnostic guidelines. Geneva: World Health Organization.
- Xian J, Parthasarathy S, McKeown S, Balagura G, Fitch E, Helbig K, . . . Helbig I (2021). Assessing the landscape of STXBP1-related disorders in 534 individuals. Brain. doi: 10.1093/brain/awab327
Data Availability Statement
Requests for access to de-identified data depicted in panel b of figure 6 should be made to the corresponding author. Other data presented in this review are taken from published sources, to which readers should refer for data availability statements.
