Skip to main content
iMeta logoLink to iMeta
. 2024 Jan 2;3(1):e157. doi: 10.1002/imt2.157

Recent advances in the characterization of essential genes and development of a database of essential genes

Ya‐Ting Liang 1, Hao Luo 1, Yan Lin 1, Feng Gao 1,2,3,
PMCID: PMC10989110  PMID: 38868518

Abstract

Over the past few decades, there has been a significant interest in the study of essential genes, which are crucial for the survival of an organism under specific environmental conditions and thus have practical applications in the fields of synthetic biology and medicine. An increasing amount of experimental data on essential genes has been obtained with the continuous development of technological methods. Meanwhile, various computational prediction methods, related databases and web servers have emerged accordingly. To facilitate the study of essential genes, we have established a database of essential genes (DEG), which has become popular with continuous updates to facilitate essential gene feature analysis and prediction, drug and vaccine development, as well as artificial genome design and construction. In this article, we summarized the studies of essential genes, overviewed the relevant databases, and discussed their practical applications. Furthermore, we provided an overview of the main applications of DEG and conducted comprehensive analyses based on its latest version. However, it should be noted that the essential gene is a dynamic concept instead of a binary one, which presents both opportunities and challenges for their future development.

Keywords: database of essential genes, drug and vaccine design, essential gene, gene essentiality prediction, genome design


Essential genes are indispensable for the survival and development of organisms, and have been successfully identified using various experimental methods across diverse organisms. DEG, a database of essential genes is dedicated to collecting experimental results on essential genes and has become one of the most commonly used tools for studying essential genes, where the information on essential genes for a specific species can be quickly accessed, with a broad range of applications, such as essential gene feature analysis and prediction, drug and vaccine design, as well as artificial genome design and construction. The definition of essential genes is environment‐specific rather than simple binary, which should not be overlooked in the study of essential genes.

graphic file with name IMT2-3-e157-g002.jpg

Highlights

  • Essential genes are crucial for organism survival and development, which have been identified using various experimental methods across different organisms, providing valuable resources for studying their characteristics and developing predictive algorithms.

  • The database of essential genes (DEG) collects experimental results on essential genes and serves as a widely used resource for studying them. It enables quick access to essential gene information for specific species and has applications in essential gene analysis, prediction, drug and vaccine design, and artificial genome construction.

  • The definition of essential genes is context‐dependent, not a simple binary classification. Integrating different experimental results of essential genes is challenging but offers opportunities to explore gene interaction mechanisms.

INTRODUCTION

Essential genes are necessary for the survival of an organism under specific environmental conditions and play crucial roles in fundamental biological and genetic processes [1]. The gene essentiality is important in both theoretical and applied research [2, 3]. Therefore, it is vital to identify essential genes as this information can be applied in the fields of life sciences, pharmacology, and synthetic biology. However, gene essentiality is conditionally dependent on various factors, such as growth conditions, developmental stages, and genetic environment [4]. Different experimental conditions for the same organism may yield different results, which can change over the course of evolution [5]. Gene essentiality is a quantitative trait that must be evaluated continuously. Therefore, the effects of the essential genes under specific conditions should not be overlooked.

In general, the prediction of essential genes relies primarily on laboratory approaches. In 1999, the first set of essential genes in living organisms was determined using global transposon mutagenesis in Mycoplasma genitalium [6]. Subsequently, extensive research has been conducted on essential genes in a wide range of organisms. The development of experimental techniques for identifying essential genes has facilitated the collection of essential gene data. The most commonly used experimental methods for predicting essential genes include single‐gene knockout [7], RNA interference [8], antisense RNA [9], transposon mutagenesis [6], and clustered regularly interspaced short palindromic repeat‐associated protein (CRISPR)‐Cas9 [10, 11]. In addition to experimental methods, computational prediction methods have emerged to facilitate the identification of essential genes. The most commonly used computational methods include comparative genomics‐based, constraint‐based, and machine‐learning‐based prediction methods, which provide important references for future research on essential genes.

In recent years, studies on essential genes have been conducted in various fields. For example, understanding the functions of essential genes is essential for discovering the minimal core components of cells, which are crucial in synthetic biology [12]. Research on essential genes can accelerate the development of microorganisms with desired phenotypic traits [13], and facilitate the development of drugs and vaccines. Essential bacterial proteins can serve as potential drug targets for new antibiotics because their indispensable role in bacterial life renders them vulnerable to disruption [14]. Certain essential proteins conserved in specific species are considered promising candidates for broad‐spectrum drug targets, whereas others specific to a particular bacterium are viable candidates for species‐specific targets. Moreover, the essentiality of novel genes can be predicted by constructing models based on existing essential gene data.

With the increasing amount of essential gene data, databases, and online services based on experimental or predicted essential gene data have emerged, providing convenient and reliable references for essential gene‐related research. The database of essential genes (DEG) was established in 2004 and has been continuously updated with the accumulation of experimental data [15, 16, 17, 18]. This database compiles essential gene data obtained from a diverse range of experimental methods on a whole‐genome scale. To date, DEG has become an important reference for essential gene‐related research, with the database cited more than 1100 times based on Web of Science data.

In this review, we provide a brief overview of the technological advances in essential gene prediction, relevant databases, online services, as well as the applications of essential genes, and emphasize that the concept of gene essentiality is environmentally specific. In addition, we discuss the significant applications of DEG in providing researchers with reliable references. Furthermore, we propose potential directions for expanding the use of DEG.

The definition of essential genes

Essential genes are indispensable for the survival of an organism and are, therefore, considered the foundation of life. Nonessential genes refer to the genes that have been experimentally determined to be dispensable for the growth of an organism, rather than the genes other than the essential ones. However, the estimated proportions of essential genes vary significantly among species and studies [19]. Diverse studies within the same species have shown that genes may be essential in one strain but not in another [5, 20] or that genes may be essential for one growth condition but not for others [21, 22], suggesting that essentiality is not an intrinsic property of genes. The reasons for such variations are diverse, including differences in experimental methods, conditions, and even the influence of experimental errors. Hence, the term “essential gene” is highly dependent on context. Only when the environment for the survival of an organism is precisely defined can a gene be classified as essential. The dependence of a cell on specific genes/gene products is influenced by its external environment and genetic context, including the presence or absence of other genes/gene products [23]. As a result, the essentiality of a gene can vary depending on these factors and change with each deletion. These genes are referred to as “conditionally essential genes” [24]. For example, some genes have been identified as “protective essential” as they can become dispensable with the removal of another gene, subsequently rendering them nonessential. This is typically because the former encodes a protective function against the toxic effects of the latter gene [25, 26, 27]. Conversely, the loss of a second gene may render a nonessential gene essential (synthetic lethality) [28]. Synthetic lethality was originally discovered in studies involving fruit fly and yeast, where individual inactivation of either of the two genes had minimal impact on cell viability. However, simultaneous disruptions in the expression of two genes or multiple independent genes, including mutations, overexpression, and gene suppression, can lead to cell death [29, 30]. Furthermore, gene products can form complexes in which nonessential genes contribute to essential functions. For instance, among the protein‐coding genes in budding yeast involved in reproduction, five groups of nonessential genes that potentially encode proteins with fundamental functions have been identified [31]. Nutritional conditions can also influence essential genes, as mutant strains carrying inactivated nonessential genes may exhibit minimal or negligible effects on cell phenotypes under optimal conditions and severe impairment or loss of viability under suboptimal conditions [32, 33]. However, evidence suggests that many nonessential genes grown in nutrient‐rich media are important for adaptability to alternative growth conditions [34]. Additionally, different experimental methods may yield different outcomes. For example, CRISPR‐based approaches have identified more essential genes in human cell lines compared to RNA interference (RNAi)‐based methods [35, 36]. Considering these factors, essentiality may be a quantitative characteristic rather than a simple essential/nonessential classification, necessitating standardized quantitative approaches [37]. Research indicates that dispensable essential genes often exhibit characteristics akin to those of nonessential genes. This is attributed to the abundance of genes with paralogous counterparts compared to other essential genes, lower levels of co‐expression, and the near absence of genes encoding protein complex components. This outcome explains the distinction between essential dispensable and permanently essential genes. Moreover, based on these features, a random forest prediction model has been developed for the identification of conditionally essential genes [38].

Recent studies have found that bacteria and yeast can adapt through genomic changes and adaptive mutations that restore cellular function in response to the inactivation of conditionally essential genes [39, 40]. These studies suggest that, despite the loss of essential gene function, the essentiality of genes depends on the cell's capacity to acquire mutations and restore proliferation. This indicates that essentiality is no longer solely a gene attribute, but rather a cellular attribute, attributing essentiality to cellular pathways rather than the genes themselves [41]. These findings present new research directions and challenges for defining and studying essential genes.

The identification of essential genes

The identification of essential genes is a long‐standing question in the field of molecular biology. Currently, there are two main approaches for determining essential genes: experimental and computational. Experimental methods can provide specific results for essential genes under different experimental conditions (Figure 1). However, they can be expensive and labor‐intensive. As a result, various computational methods have been proposed to predict essential genes using computers with the aim of alleviating these limitations.

Figure 1.

Figure 1

Milestones of technological breakthroughs in gene essentiality. It illustrates some of the important technological advancements and their initial applications in essential gene research. The update of different versions of the database of essential genes (DEG) and the data volume obtained through different methods in each version of DEG is also indicated.

Experimental‐based methods

Identification of essential genomes has a long history in the field of molecular biology. In 1951, Horowitz and Leupold proposed that some major components of proteins may be essential for life [42]. Before the “genomics era” early mutagenesis techniques involved the induction of random mutations in the genome of organisms using chemical or physical agents [43]. Essential genes were predicted by inducing random chemical mutagenesis and analyzing progeny survival. For example, studies have shown that approximately 50%, 15%, and 12% of the genomes of Drosophila melanogaster, Caenorhabditis elegans, and Saccharomyces cerevisiae are essential [44, 45, 46, 47]. Nonetheless, the experimental techniques at that time were insufficient to identify specific essential genes. In the late 20th century, transposon techniques [48, 49], single‐gene knockout [50], antisense RNA [51], and RNA interfaces [52] were developed, providing more possibilities for research focused on the gene level. In 1995, the first complete genome sequences of Haemophilus influenzae and M. genitalium were obtained by using high‐throughput [53, 54]. The complete genome sequence of an organism is a prerequisite for acquiring a complete list of genes. The complete genomes of model organisms such as Escherichia coli, Bacillus subtilis, and S. cerevisiae were published in the subsequent 2 years [55, 56]. The first experiment to identify the essential gene set of an organism was conducted in 1999, which confirmed the minimal gene set required for the survival of M. genitalium using whole‐genome transposon mutagenesis [6]. The principle of transposon insertion for detecting essential genes is based on the random insertion of transposons and their effects on gene disruption. By analyzing the insertion site and expression of transposon‐disrupted genes, essential genes and their effects on growth can be determined. Subsequently, methods have been developed to identify essential genes by targeting single‐gene knockout [7], RNAi [8], and antisense RNA [9], thereby enabling researchers to explore the essentiality of genes in various organisms. Single‐gene knockout experiments involve the removal of one or more genes to observe phenotypic changes. For genome‐wide studies, this process must be repeated several times, which necessitates comprehensive genome annotation. RNAi technology identifies essential genes by introducing double‐stranded RNA molecules that are complementary to the messenger RNA (mRNA) of a target gene, leading to mRNA degradation and inhibition of protein expression. Antisense RNA technology identifies essential genes by introducing single‐stranded RNA molecules that are complementary to the mRNA of the target gene, leading to the inhibition of gene expression.

In the early stages of essential gene research, the focus was primarily on microorganisms. Efforts to target gene disruption in animals were hindered by the inherent inefficiency of homologous recombination in animal tissue cultures [57]. The first breakthrough was the discovery that homologous recombination is more effective in embryonic stem cells derived from mouse blastocysts, leading to the development of the first knockout mouse in 1989 [58]. In 2003, the first genome‐wide RNAi screen was conducted on the nematode C. elegans to systematically define mutant phenotypes [59]. Subsequently, this technology has been rapidly applied to mammalian cells, and several groups have generated RNAi libraries encompassing human and mouse genomes for gene importance screening [60, 61]. RNAi has become the primary method for studying essential genes in mammals. However, off‐target effects and incomplete gene function limitations cannot be ignored.

The advent of next‐generation sequencing technologies has enabled the rapid acquisition of whole‐genome sequences of various species. Convenient access to sequencing data has facilitated the introduction of other methodologies for handling genetic variations. Transposon sequencing (Tn‐seq) is an emerging technology that combines transposon mutagenesis with high‐throughput sequencing and utilizes a high‐density transposon insertion library constructed in the target organism to perform functional analysis of its whole genome through high‐throughput sequencing [62]. Tn‐seq techniques such as TraDIS [63], INSeq [64], HITS [65], and the Tn‐seq Circle [66] have been extensively used for essential gene detection in microorganisms. The application of Tn‐seq has enhanced the understanding of essential genes by incorporating essential genomic elements, including noncoding RNAs, rather than focusing solely on protein‐coding genes. In addition, Tn‐seq can be used to screen essential genes under various experimental conditions, both in vitro and in vivo, and is not just limited to culture conditions [67]. Tn‐seq significantly increases the number of essential genes identified under different conditions.

Whole‐genome essentiality screens have elucidated the molecular basis of several biological processes. However, RNAi‐based screens are often confounded by off‐target effects and gene knockdown rather than complete loss of function, limiting our knowledge of essential genes in human cells [68]. However, the emergence of the CRISPR/Cas9 system has revolutionized mammalian cell genome editing [69, 70]. At its core, the programmable DNA endonuclease is composed of two components: the Cas9 protein derived from Streptococcus pyogenes (or similar proteins from other species), and a single guide RNA that guides endonuclease activity to the target DNA sequence [71]. CRISPR induces sequence‐specific DNA double‐strand breaks, leading to frameshift insertions/deletions that result in the complete loss of protein function [72]. This technology has made cost‐effective and efficient gene editing possible in yeast, plants, and animals with a significant impact on human cell editing [73, 74, 75]. Additionally, the catalytically dead version of Cas9 (dCas9) can be employed to target specific DNA sequences with single guide RNA, which is known as CRISPR interference (CRISPRi), or to activate gene expression, which is referred to as CRISPR activation (CRISPRa) [76]. Cas9 and dCas9 have been used to map the composition of important components in human cell lines [77]. In 2015, three papers simultaneously reported the genome‐wide identification of essential genes in different human cell types [78, 79, 80]. In addition, whole exome sequencing is another breakthrough that captures and enriches the DNA of exonic regions of the entire genome using either sequence capture or targeted amplification techniques [80]. Compared to whole‐genome resequencing, whole‐exome sequencing mainly targets gene sequences in exonic regions with deeper coverage and higher data accuracy to quickly identify essential human genes in vivo [81].

Computational‐based methods

Considering the complexity, high cost, labor, and time intensiveness of experimental methods, computational approaches have been deployed as a supplement to experimental techniques to minimize the resources required for essentiality analysis. As the number of essential genes identified through experiments increases, reliable references are being provided to develop methods for essential gene prediction. In general, three approaches are employed to determine gene essentiality: comparative genomics‐based, constraint‐based, and machine‐learning‐based methods.

Initially, essential gene prediction was conducted using comparative genomics‐based methods that rely on homology [82]. Homology mapping occurs between genes that are duplicated within an organism (paralogs) or between related genes in two or more different organisms (orthologs) that are products of speciation. Homology mapping involves comparison of the sequences of two organisms and determination of their sequence similarity based on a defined percentage identity threshold. This method has been used to predict core genes in bacterial species such as Mycoplasma spp., Corynebacterium spp., Plasmodium falciparum, and Brucella spp. [83, 84, 85, 86].One major challenge is the significant impact of evolutionary distance on the results of comparative genomic analysis. Although essential genes are often highly evolutionarily conserved, especially in bacteria, conserved genes across species are not always essential [87], which render homology‐based methods less effective in predicting essentiality. Previous large‐scale analyses have shown that only a small number of genes are conserved across the tree of life, implying that many essential genes are species‐specific [88].

Constraint‐based prediction methods utilize genome‐scale metabolic networks to elucidate the biology of the metabolic pathways within an organism. This approach relies on constraint‐based modeling techniques applied to reconstructed metabolic networks based on genome sequencing and annotation to study the structure, function, and interactions of the network [89]. Flux balance analysis (FBA) is the most widely used constraint‐based method for analyzing metabolic network properties. It predicts the fluxes of metabolites under steady‐state conditions by applying mass balance constraints to stoichiometric models [90]. Using FBA to predict essential genes involves simulating gene knockouts and assessing their impact on a network [91]. Constraint‐based models have been constructed in organisms from all three domains of life and have facilitated the study of gene essentiality [92]. The FBA has a low computational cost because it does not require kinetic parameters. However, FBA has notable limitations. First, it can only predict the importance of metabolic genes [93]. In addition, unlike the ability to combine steady‐state and dynamic analyses, FBA requires enzyme kinetic data to evaluate the activity of genome‐scale metabolic reactions under transient conditions [94]. Finally, FBA often requires enzymatic reactions to address the limitations in metabolic models, which may sometimes be inconsistent with experimental data. This depends on empirical modeling, and in some cases, parameter prediction can be challenging [95].

Currently, the most widely used essential gene prediction method is based on the construction of machine learning algorithms, using features derived from the analysis of experimental results on essential genes. Generally, the features of essential genes can be divided into two categories: sequence‐ and context‐related (Table 1).

Table 1.

Summary of characteristics of essential genes.

Category Feature Description References
Sequence features Expression level Studies have shown that essential genes are expressed at higher levels than nonessential genes. [9697]
Codon bias Essential genes are more likely to use optimal codons. Some characteristics of codons can serve as parameters for assessing the ideal codon usage, which plays a critical role in ensuring precise translation of highly expressed genes. [9899]
Protein size Larger proteins have more biological functions and are more conserved, it is believed that larger proteins tend to be enriched in essential proteins. [100]
Gene position There is a higher proportion of essential genes located in operons and the leading strand. [101]
Subcellular location Essential proteins are found predominantly in the cytoplasm, although significantly greater proportions of nonessential proteins are situated in other cellular areas. [100102]
Hurst exponent Essential genes have a lower average Hurst exponent. [103]
Context‐related features Evolutionary relationships Essential genes evolve more slowly than nonessential genes. [104105]
Protein function Essential genes are hubs of protein–protein interaction networks, which are more commonly involved in fundamental categories according to gene ontology and Kyoto Encyclopedia of Genes and Genomes pathway. [106107]
Protein‐protein interactions Essential genes tend to have higher connectivity in protein‐protein interaction networks. [108109]
Domain properties Protein essentiality is to be conserved through the function of protein domains or domain combinations. [110111]

Research on the characteristics of essential genes provides a reference for predicting essential genes using machine learning [112]. Typically, the development of predictive models in machine learning involves the following steps: feature selection (different features of essential genes), construction of training and testing datasets (essential/nonessential gene data), selection and design of machine learning algorithms, and evaluation of model predictive performance [113]. Various studies have used genomic and protein features to develop and train classifiers for predicting gene essentiality. Over the years, several algorithmic models based on genomic features have been successfully developed for the computational identification of essential genes. For example, Fan et al. developed an SCP algorithm that combines essential gene protein–protein interaction (PPI) and subcellular localization. This method integrates an improved PageRank algorithm based on gene expression data with weighted subcellular localization and the Pearson correlation coefficient of weighted subcellular localization [114]. Additionally, predictive models have emerged for essential genes in the noncoding regions. Zhang et al. developed an iEssLnc model using metapath‐guided random walks, which was the first estimation model for the essentiality of lncRNA genes [115]. It can be inferred that “good” data and efficient machine learning techniques are required for accurate prediction. Supervised, semi‐supervised, unsupervised, and reinforcement learning are commonly used machine learning techniques [116, 117]. However, gene essentiality prediction is often modeled as a classification problem under supervised learning.

Deep learning is a subset of machine learning in artificial intelligence, in which networks can learn from unstructured or unlabeled data in an unsupervised manner. Recently, deep learning has been used to predict essential genes. For example, Deeply Essential is a deep neural network that uses only sequence information to predict essential genes [118]. Compared with previous approaches using clustering and undersampled data sets, this model achieves higher sensitivity and accuracy [119]. Another deep learning model for essentiality prediction employs a different approach using a framework that automatically learns biological features without prior information. This network utilizes information on gene expression, subcellular localization, and PPI networks to learn topological features [120]. However, the two main drawbacks of applying deep learning to gene essentiality prediction are as follows: (i) deep neural networks require large amounts of data for training to outperform traditional machine learning algorithms and (ii) tuning hyperparameters in deep learning models is complex.

While the application of machine learning methods is convenient, it also faces challenges such as the difficulty of measuring the quality of predictions and the inability to generalize across specific experimental contexts. Furthermore, considering that the definition of essential genes is context‐specific, caution should be exercised in defining the criteria for training ML models for essentiality prediction, depending on the purpose of the study. Moreover, the selection of features and combinations may affect the predictive performance, and there is no definitive method for selecting appropriate features for different organisms [121]. Finally, for understudied species, the choice to study within‐species data is limited by the small number of known essential genes, whereas the use of cross‐species data may result in decreased accuracy.

Related databases and web servers

Online databases of essential genes have been created for various species based on experimental data and computational models (Table 2). Researchers can use data from these databases to study the intrinsic characteristics of essential genes/proteins and uncover features closely associated with essentiality. In addition to the experimentally derived databases of essential genes, databases storing predicted essential gene data have also been established, along with some open‐access programs for conducting essential gene predictions.

Table 2.

Description of essential gene databases and web servers.

Resource Description URL References
OGEE v3 Database that provides essential and nonessential genes with factors known to contribute to gene essentiality https://v3.ogee.info [122]
EGGs Database that provides essential genes for visualization and analysis on a subsystem diagram http://www.nmpdr.org/FIG/eggs.cgi [123]
DEG 15.0 Database of essential genomic regions based on experimental data with embedded BLAST tools http://www.essentialgene.org [18]
pDEG Database of predicted essential genes in Mycoplasma species, which integrates biased distribution information of essential genes http://tubic.org/pdeg [124]
CEG 2.0 Database based on clusters of directly homologous essential genes, which was originally derived from DEG and OGEE http://cefg.uestc.cn/ceg [125]
NetGenes Database that contains predicted essential genes for more than 2700 bacteria using features derived from PPI networks https://rbc-dsai-iitm.github.io/NetGenes [126]
ePATH Database of predicted essential genes for more than 4000 organisms based on experimental data and functional KEGG orthologs https://www.pubapps.vcu.edu/epath [127]
CEG_Match Web server developed based on the CEG database, which is a gene essentiality prediction tool based on its function http://cefg.uestc.cn/ceg [125]
Geptop 2.0 Web server that identifies essential genes in prokaryotes using ortholog definitions, incorporating best hits and weighting phylogenetic distances from DEG or OGEE http://cefg.uestc.cn/geptop [128]
ZCURVE 3.0 Web server used to predict genes in bacterial or archaeal genomes, which is developed based on the Z‐curve theory http://guolab.whu.edu.cn/ZCURVE/ [129]
EGP Web server that predicts essential genes in bacteria using SVM, incorporating 16 independent feature sets for model construction http://cefg.uestc.edu.cn:9999/egp [130]

Abbreviations: DEG, database of essential genes; KEGG, Kyoto Encyclopedia of Genes and Genomes; PPI, protein–protein interactions.

The practical applications of essential genes

Applications in the field of synthetic biology

Essential genes are responsible for maintaining normal physiological and metabolic processes in cells, making them essential for constructing cells with high stability or specific functions [23]. Therefore, these genes provide a theoretical basis for relevant research such as genome design. Minimal genomes are currently one of the best proofs of concept in synthetic biology. A minimal genome refers to the genome required to maintain the most basic life activities of an organism; it has been a crucial research aim in the field of synthetic biology [15]. Establishing a minimal, universal set of genes required to sustain life can considerably enhance our understanding of life at the most basic level and also has practical applications in production. As cornerstones of synthetic biology, the essential genes could serve as references for constructing a minimal genome. Notably, the concepts of essential genes and minimal genomes are not completely equivalent. The essential genes represent the set of genes required for successful reproduction, whereas the minimal genome represents genes necessary to maintain cell viability [131]. In practical, the identification and investigation of essential genes are typically employed to infer the composition of the minimal genome. However, computational modeling of bacterial metabolic networks has demonstrated that a minimal genome requires a greater number of genes than all essential genes combined [132]. For constructing a minimal genome, the top‐down approach and bottom‐up approach have been proposed [133, 134, 135] The top‐down approach reduces the genome size by deleting randomly selected or unidentified genomic segments. Deletions can be accomplished through various experimental approaches, such as plasmid‐ and linear DNA‐based methods, as well as the utilization of site‐specific recombinases, transposons, and the CRISPR/Cas system [133, 136]. Through a top‐down approach of progressively removing nonessential genes and functional elements, several examples of genome minimization have been achieved [3, 137, 138]. Additionally, a top‐down genome deletion algorithm called MinGenome was proposed, which starts with the longest possible deletions and consecutively eliminates metabolic and regulatory genes. To avoid lethal or growth‐defective deletions, this algorithm retains essential genes and synthetic‐lethal pairs by imposing constraints on biomass production [139]. The main advantage of the top‐down approach is that it begins with an operable genome, allowing any detrimental effects caused by deletions to be remedied by reverting to the latest nondeleted state. However, this method is time‐consuming and may lead to unforeseen dead ends, because the genetic landscape is altered at each step of the process, influencing the importance of other genes.

The bottom‐up approach connects gene fragments to specific functions. Advancements in DNA synthesis, sequencing technologies, and genome transplantation have enabled the synthesis of long DNA sequences with complex gene compositions. The primary method utilized for this is polymerase chain reaction technology, which enables the assembly of overlapping pools of short oligonucleotides and provides a technical foundation for bottom‐up approaches [140]. Based on several rounds of rational design and random mutagenesis of gradually reduced genomes, an approximate minimal bacterial genome, JCVI‐syn3.0, which contained 98 additional genes compared to the initially predicted set of essential genes in individuals, was constructed. This observation can be attributed to the fact that nonessential genes in the original genome become essential or quasi‐essential because of synthetic lethality during genome reduction [141]. Breuer et al. utilized the accumulated metabolic data from mycoplasmas and other bacteria, and applied it to a flux balance analysis model to establish a computational model of the JCVI‐syn3.0 metabolic network, which could better predict essential and quasi‐essential genes in JCVI‐syn3.0 [142]. During the bottom‐up design process, it is necessary to elucidate the complete genetic information for each gene and its interactions within the entire genetic background. Owing to a limited understanding of the principles of genome design and the complexity of the target organism, even for bacteria with smaller genomes, numerous possible genome configurations pose challenges for this approach [143].

As mentioned above, one of the major difficulties in constructing a minimal genome is elucidating the interactions between genes within the entire genetic background, which poses significant challenges for the identification of essential genes and genome construction in the early stages. However, computer‐assisted methods can accelerate the generation of large‐scale data corresponding to genetic content and cellular functionality by characterizing cells with various types of genome modifications, and thus has the potential to broaden and deepen our understanding of the entire cellular system and contribute to the production of industrially valuable biological systems.

Applications in the field of medicine

The traditional drug and vaccine discovery methods are resource‐intensive and time‐consuming. Subtractive genomics and reverse vaccinology have recently been classified as powerful approaches for identifying drug and vaccine candidates [144, 145, 146]. These approaches have streamlined drug development by eliminating the need for costly and time‐consuming trial‐and‐error experiments. The identification of potential targets is the first step in drug and vaccine discovery. Considering that the absence or inhibition of essential genes can have lethal effects on microorganisms, essential genes can be used as screening criteria for drug and vaccine targets in subtractive genomics and reverse vaccinology [14]. In addition, in the field of cancer treatment, the concept of synthetic lethality has been extended to paired genes, where one gene is inactivated by a deletion or mutation, and pharmacological inhibition of another gene leads to cancer cell death, while normal cells are not affected. In the most direct application, a targeted therapy could be determined to kill cancer cells lacking specific tumor suppressor genes but retain normal cells [24]. Moreover, the genes essential in cancer cell lines but nonessential in human tissues can reveal oncogenic drivers, paralog expression patterns, and chromosomal structures associated with the corresponding cancer types [147]. A recent analysis of genome‐scale CRISPR screening in a large group of cancer cell lines provided evidence that there may be hundreds of effective drug targets, but the vast majority is context‐specific [148]. Additionally, some essential genes have been associated with human diseases. For example, by population analysis‐based essential gene identification, genes intolerant to loss‐of‐function mutations are primarily essential in humans and have been found to play a role in human meiotic recombination, potentially contributing to the occurrence of certain diseases [149]. Furthermore, information on gene essentiality has been utilized to prioritize potential pathogenic variants of unknown disease‐related genes in human sequencing studies [150].

However, most of the existing essential genes have been identified in vitro, and significant differences between the genes essential for in vitro and those in vivo growth may be observed. An innovative solution is to perform screening in unconventional growth media such as acetate diversion, targeting Pseudomonas aeruginosa involved in lung infections. The screening workflow prioritizes compounds that exhibit activity in nutrient‐limited media [151]. Disease screening in host models is another approach to discovering in vivo targets. For example, high‐content microscopic screening of macrophages infected with M. tuberculosis identified a series of lead compounds targeting cytochrome c [152]. Although systematic investigations of genes have revealed numerous potential targets, a clear understanding of their functions is often inherent in the target validation process, which poses challenges in the identification of drug targets. Furthermore, the continuous evolution of drug resistance is inevitable, and further understanding of the resistance mechanisms is required. Finally, it must be acknowledged that gene essentiality is an evolvable trait, and genes associated with maximal importance may serve as promising drug targets. This poses difficulties in the search for drug targets but also brings opportunities. Therefore, more investigation is needed to fully define the importance of genes in different contexts and timescales and to further explore the genetic background of essential genes.

Introduction of the DEG database

With the growing amount of essential gene data, there is an urgent need to organize them into a database to facilitate the use of these data. Thus, we established DEG in 2004 and updated it continuously with the accumulation of experimental data [15, 16, 17, 18]. This database compiles genome‐wide essential gene data obtained from a diverse range of experimental methods. The latest version of DEG 15.0, which encompasses bacteria, eukaryotes, and archaea, was released in 2021 (https://tubic.org/deg). Overall, there are 78, 35, and two essential gene sets for bacteria eukaryotes, and archaea determined by different experimental methods were stored in DEG 15.0 (Figure 2A,B), and displayed the specific information for each group of essential genes (Figure 2C). Moreover, DEG also comprises analysis modules associated with essential genes, particularly for those in bacteria, which contain the following functions: (i) The distribution of essential genes on the leading or lagging strands. (ii) The subcellular localization distribution of essential genes (Figure 2E). (iii) Orthologous groups, EC number [153], KEGG pathway [154], and GO [155] information on the essential genes (Figure 2F,G). (iv) A customized BLAST search tool allows users to conduct species‐ and experiment‐specific searches to identify essential genes in annotated or unannotated genomes (Figure 2D) [156]. In addition to essential genes, this database also collects experimental results for a subset of nonessential genes, as well as basic genetic elements other than protein‐coding genes, such as noncoding RNAs and replication origins [157, 158, 159, 160].

Figure 2.

Figure 2

Screenshots and examples of user cases in the database of essential genes (DEG). (A) Homepage of the DEG database. Interfaces to other modules are supplied and the links to experimental results information for multiple essential genes in different species. (B) The summary page of all data within DEG. (C) A screenshot of the relevant information for an experimental result of a specific strain, including strain information, cultivation conditions, and reference citations. (D) The BLAST search interface for the corresponding strain. (E) The distribution of essential genes on the leading strand/lagging strand in the corresponding strain. (F) The Kyoto Encyclopedia of Genes and Genomes analysis results for essential genes in the corresponding strain. (G) The Gene Ontology analysis results for essential genes in the corresponding strain.

Applications of the DEG database

DEG has become an important reference for research related to essential genes, and a “golden set” of the most popular databases featured in NAR database issues [161]. Information on essential genes for specific species can be quickly accessed using DEG [2]. Currently, the applications of DEG primarily focus on the following four areas: artificial genome design and construction, drug and vaccine design, essential gene feature analysis, and the prediction of essential genes (Figure 3).

Figure 3.

Figure 3

Main applications of the database of essential genes (DEG). The applications of DEG primarily focus on the following four areas: artificial genome design and construction, drug and vaccine design, essential gene feature analysis, and the prediction of essential genes.

Studies using essential data from DEG for genome design are typically top‐down approaches because nonessential regions of the genome are often considered candidates for deletion. In the top‐down approach, essential/nonessential gene data from the database are usually used to identify the location of the deletion in the genome to increase its stability [162, 163, 164]. Generally, regions containing nonessential genes are targeted for deletion, whereas regions containing essential genes are typically conserved. In addition, transporter genes, insertion sequences (ISs), toxin‐antitoxin pairs, and other functional elements have also been designed for deletion [165], depending on the research purpose. For example, Liu et al. selectively deleted genes from Burkholderiales strains to optimize their genome structure and growth rate [166]. This study used nonessential gene data from DEG to facilitate the identification of highly conserved genome‐reduced strains, which enhanced the predictability of genome engineering strategies and improved the efficiency and stability of strain production. In fact, the data from DEG can provide valuable references for bottom‐up genome construction. Therefore, DEG has been recommended as a reference resource for studying the minimal gene set required for bacterial survival [167, 168, 169].

Using a computational strategy to identify potential drug targets rather than culturing the entire microbe can significantly reduce time and cost. Gene information from DEG was used to identify essential proteins of pathogenic bacteria. To date, this method has been applied for the determination of potential drug targets in pathogens, including Yersinia pseudotuberculosis [170], E. coli [171], P. aeruginosa [172], Shigella sonnei [173], Enterococcus faecium [174], Streptococcus pneumoniae [175] and Corynebacterium pseudotuberculosis [176]. In addition, this process is applicable to vaccine design, known as reverse vaccinology [177]. Reverse vaccinology can significantly accelerate vaccine development as it can reduce the need for extensive empirical testing of individual antigens. Using this strategy, potential vaccine targets for Helicobacter pylori [178], Brucellosis [179], Salmonella typhi [180], and Staphylococcus aureus [181] have been identified. Furthermore, the vaccinology design pipelines, VacSol and PanRV, have been augmented by a process that employs DEG data to detect essential genes [182, 183].

DEG provides valuable data for the analysis of essential gene features, thus contributing to the field of essential gene characterization. The inclusion of nonessential gene data in the database enables comparative studies between the two categories. Numerous studies have already leveraged the essential gene features provided by DEG. For example, Luo et al. conducted an evolutionary conservation analysis of bacterial genomes and found that essential genes evolved more slowly than nonessential genes [104]. Essential genes with important functions such as translation, transcription, and replication are more critical and widespread in the leading strand than other functional subtypes of essential genes [101]. Surprisingly, if a specific functional domain of a protein is present in multiple organisms, its likelihood of essentiality increases [111]. Moreover, the metabolic networks of many microorganisms have been reconstructed based on essential genes and, to some extent, managed using automated systems such as the SEED [184] and BiGG [185] models. Based on data from DEG and other sources, Magnúsdóttir et al. systematically analyzed metabolic interactions within the gut microbiota as well as the influence of external factors on these interactions [186].

In the prediction of essential genes using machine learning methods, DEG provides an ideal training data set. For example, Guo et al. used a support vector machine to predict the integrity of human genes based on lambda‐interval Z‐curve features derived from nucleotide sequence data [187]. Zeng et al. constructed a deep learning‐based framework for essential gene prediction using PPI networks, gene expression data, and subcellular location information [120]. DeepHE predicted essential human genes by integrating features from both sequence data and PPI networks [188]. Shi et al. proposed a method called iEsGene‐CSMOTE for the machine learning method based on Support Vector Machines for identifying essential genes, which introduced a clustering‐based synthetic minority oversampling technique, that is, CSMOTE, to overcome the issue of imbalanced data [189]. In addition to utilizing essential gene features, Zhou et al. proposed an algorithm based on image recognition to predict essential genes, which utilized a convolutional spiking neural network with a learning rule called R‐STDP. The frequency codon group recognition images of essential and nonessential genes from DEG were used to predict essential genes [190].

Analysis based on the DEG database

Comparison between essential and nonessential genes

Because DEG 15.0 contains a significant quantity of essential gene and nonessential gene datasets of bacteria that have undergone genome‐wide gene essentiality screening, it presents an opportunity to explore the correlation between genome size and the number of essential genes. Certain essential genes, such as those involved in replication, transcription, and translation, encode fundamental cellular functions that are mandatory for all genomes irrespective of their size [191]. Therefore, we performed statistical analysis of the data in DEG that contained both essential and nonessential genes. Linear regression results showed a positive correlation between the number of nonessential genes and genome size, while the number of essential genes remained essentially constant (Figure 4A). Regardless of the genome size and experimental conditions, no bacterial species had more than 1000 essential genes. The percentage of essential genes, relative to the total number of genes, decreased as the total number of genes increased (Figure 4B). This result indicates that the number of essential genes remained relatively constant and did not change with genome length, whereas the number of nonessential genes was positively correlated with genome length.

Figure 4.

Figure 4

Relevant analysis results based on essential genes in the database of essential genes (DEG). (A) Numbers of essential and nonessential genes against genome length. (B) The percentage of essential genes against the total gene number. (C) The relationship between the number of core‐essential genes and total essential genes. (D) Distribution of essential genes under the pan‐genomic framework. The x‐axis represents different species. The essential genes existing in the unique, accessory, and core gene sets are highlighted in pink, green, and blue, respectively. The y‐axis represents the proportion of different types of essential genes. Those with multiple experimental results are labeled with a sequential number appended to the strain name, which was generated by ImageGP [192].

Subsequently, we performed a statistical analysis of the COG annotation results between essential and nonessential genes in DEG. The proportions of each COG category for essential and nonessential genes are represented in blue and orange, respectively. Specifically, there were more essential than nonessential genes in categories C (energy production and conversion), F (nucleotide transport and metabolism), H (coenzyme transport and metabolism), J (translation, ribosomal structure, and biogenesis), K (transcription), and O (posttranslational modification, protein turnover, and chaperones). However, in categories G (carbohydrate transport and metabolism), N (cell motility), M (cell wall, Membrane, Envelope biogenesis), and U (intracellular trafficking, secretion, and vesicular transport), the number of nonessential genes was higher than that of the essential genes (Figure S1). In conclusion, the COG functional enrichment analysis suggests that essential genes tend to cluster in fundamental life processes, whereas nonessential genes play diverse roles and functions in environmental adaptation and substance synthesis.

The distribution of essential genes in pan‐genomic classification

The concept of “pan‐genomics” was coined by Tettelin in 2008, which has allowed for the study of genome at the species level. “Pan‐genome” refers to a collection of all genes of a species, which can be categorized as “core genes” shared by all strains, “accessory genes” shared by two or more strains, and “unique genes” specific to certain strains [193]. Constructing a pan‐genome enables the identification of the diversity and composition of a bacterial species beyond the limitations of a single reference genome. The core genes in pan‐genomic classification are those shared by all individuals within a species and often perform important functions that encode basic biological and phenotypic relevance [194]. Considering the importance of essential and core genes in organisms, pan‐genome classification has been incorporated into present essential gene research. For example, by employing manual curation and a predictive model, Saxena et al. have successfully identified key essential gene sets of Oleidesulfovibrio alaskensis G20 using the sulfate‐reducing bacterium model and classified these genes through pan‐genome analysis. The results indicated that most essential genes belonged to the core gene category, whereas essential genes in other categories may be environment‐specific [195]. Furthermore, Wu et al. proposed a strategy to reduce the genome size of B. subtilis while simultaneously ensuring the retention of core and essential genes [196].

However, there is a lack of large‐scale pan‐genomic classification, specifically for essential genes across different species, to estimate the degree of overlap. Therefore, we propose a classification method for essential genes using a pan‐genomic framework. We downloaded the complete bacterial strains available on NCBI for the species included in DEG. To ensure the accuracy of pan‐genome analysis, 17 species with more than 70 complete genomes were selected for subsequent analysis. First, we filtered out sequences containing more than 1% N nucleotides. Next, we excluded strains with Average nucleotide identity (ANI) values <95% for the same organism. ANI refers to the average base similarity between homologous segments of two microbial genomes. ANI values between organisms belonging to the same species are typically ≥95% [197]. Ultimately, we obtained 5900 strains related to 17 bacterial species for pan‐genome analysis. We conducted a pan‐genome analysis and obtained results for the pan‐genomes (including core, accessory, and unique genes) of 17 species. By comparing the results with the essential gene data for the corresponding strains in DEG using BLAST, we determined the distribution of essential genes in the pan‐genomic classification (Figure 4D and Table S1).

The results showed that the majority of essential genes in bacteria belonged to core genes, except for E. coli O157:H7. This suggests that the essential genes and core genes that play important roles are largely overlapped; these types of genes are typically more conserved, which are hereinafter referred to as core‐essential genes. Linear regression analysis revealed a significant correlation between the number of core‐essential genes and the total number of essential genes (Figure 4C). However, some essential genes did not belong to the core genes, which may be related to the adaptation of the strains to specific growth conditions. Therefore, this classification would provide new insights into the function of essential genes. Moreover, the integration of pan‐genomic analysis with essential gene studies provides a new perspective for the development of vaccines and drugs. Because the core‐essential genes dictate the biological characteristics of pathogens to a greater extent, identifying core‐essential genes could be beneficial for devising efficacious broad‐spectrum drugs. By contrast, essential genes specific to certain pathogens may serve as potential targets for strain‐specific drugs.

CONCLUSION AND FUTURE PERSPECTIVES

Technological advancements have facilitated large‐scale whole‐genome screening, providing valuable insights into the molecular basis of the essential genes involved in many biological processes. This reveals the complex and multifaceted nature of essential genes, opening up possibilities for their applications in synthetic biology and medicine. The generation of massive amounts of data has also led to the emergence of related online services such as databases and tools, which serve as references for future research. However, as discussed in this review, the essential gene is not an absolute concept, which depends on specific environments and contexts. Owing to the complexity of molecular interactions, comprehensively studying various aspects of essential genes using computational approaches, particularly machine learning methods, has proven to be promising. In the near future, the characterization of cells with diverse rational designs of genomes or genes will accelerate the generation of big data on the correlation between genomic content and cellular functionality. This will ultimately broaden our understanding of the entire cellular system and contribute to the application of essential genes in various fields. Furthermore, we are committed to continuously updating DEG. In addition to gathering experimental data on essential genes, we will extend our focus to the latest research findings and emerging trends pertaining to essential genes. We have a comprehensive plan to enhance and refine the database in the following areas. First, we aim to evaluate the degree of essentiality more effectively. In upcoming versions of the database, we will emphasize context dependence by annotating conditionally essential genes identified through experiments. Their significance was further stratified based on the likelihood of their existence under different experimental conditions. In addition, we will explore the provision of cross‐species links for corresponding essential genes, facilitating easy access to information on the shared presence of essential genes across diverse strains or species. Moreover, we will also add a pan‐genomic analysis module for essential genes, which will categorize essential genes into core, accessory, and essential categories in a pan‐genomic framework, to better study specific genes at the species level. Finally, we intended to augment the database with more additional features related to essentiality, including PPI data, expression profiles, and potential drug targets. Furthermore, suitable visualizations were employed to present the obtained results in an intuitive manner. Based on this series of improvements, we hope DEG could provide a more comprehensive and valuable reference for research on essential genes.

AUTHOR CONTRIBUTIONS

Ya‐Ting Liang drafted the manuscript. Ya‐Ting Liang performed the major part of the data analyses. Feng Gao, Hao Luo, and Yan Lin conceived this study and reviewed, edited and refined the manuscript. Feng Gao supervised this project. All authors have read the final manuscript and approved it for publication.

CONFLICT OF INTEREST STATEMENT

The authors declare no conflict of interest.

Supporting information

Figure S1: The proportion of essential and non‐essential gene categories in COG classification.

IMT2-3-e157-s001.docx (84.6KB, docx)

Table S1: Distribution of essential genes under the pan‐genome classification.

IMT2-3-e157-s002.xlsx (19.5KB, xlsx)

ACKNOWLEDGMENTS

This work has been supported by the National Key Research and Development Program of China [Grant number 2018YFA0903700]; and the National Natural Science Foundation of China [Grant numbers 32270692, 31801104, and 31571358]. The authors would like to thank Prof. Chun‐Ting Zhang and Dr. Ren Zhang for their invaluable assistance and inspiring discussions.

Liang, Ya‐Ting , Luo Hao, Lin Yan, and Gao Feng. 2024. “Recent Advances in the Characterization of Essential Genes and Development of a Database of Essential Genes.” iMeta 3, e157. 10.1002/imt2.157

DATA AVAILABILITY STATEMENT

The DEG database is available at https://tubic.org/deg. Supporting Information materials (figures, tables, graphical abstract, slides, videos, Chinese translated version, and update materials) may be found in the online DOI or iMeta Science https://www.imeta.science/. The data that supports the findings of this study are available in the supplementary material of this article.

REFERENCES

  • 1. Dowell, Robin D. , Ryan Owen, Jansen An, Cheung Doris, Agarwala Sudeep, Danford Timothy, Bernstein Douglas A., et al. 2010. “Genotype to Phenotype: A Complex Problem.” Science 328: 469. 10.1126/science.1189015 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. Bartha, István , di Iulio Julia, Venter J. Craig, and Telenti Amalio. 2018. “Human Gene Essentiality.” Nature Reviews Genetics 19: 51–62. 10.1038/nrg.2017.75 [DOI] [PubMed] [Google Scholar]
  • 3. Koonin, Eugene V . 2000. “How Many Genes Can Make a Cell: The Minimal‐Gene‐Set Concept.” Annual Review of Genomics and Human Genetics 1: 99–116. 10.1146/annurev.genom.1.1.99 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4. D'Elia, Michael A. , Pereira Mark P., and Brown Eric D.. 2009. “Are Essential Genes Really Essential.” Trends in Microbiology 17: 433–438. 10.1016/j.tim.2009.08.005 [DOI] [PubMed] [Google Scholar]
  • 5. Sharma, Shukriti , Markham Philip F., and Browning Glenn F.. 2014. “Genes Found Essential in Other Mycoplasmas are Dispensable in Mycoplasma bovis .” PLOS ONE 9: e97100. 10.1371/journal.pone.0097100 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6. Hutchison, Clyde A. , Peterson Scott N., Gill Steven R., Cline Robin T., White Owen, Fraser Claire M., Smith Hamilton O., and Craig Venter J.. 1999. “Global Transposon Mutagenesis and a Minimal Mycoplasma Genome.” Science 286: 2165–2169. 10.1126/science.286.5447.2165 [DOI] [PubMed] [Google Scholar]
  • 7. Kea, Kobayashi , Ehrlich Stanislav, Albertini Alessandra, Amati Giorgio, Andersen Klaus K., Arnaud Monteil, Asai Kiyoshi, et al. 2003. “Essential Bacillus subtilis genes.” Proceedings of the National Academy of Sciences 100: 4678–4683. 10.1073/pnas.0730515100 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Elbashir, Sayda M. , Harborth Jens, Lendeckel Winfried, Yalcin Abdullah, Weber Klaus, and Tuschl Thomas. 2001. “Duplexes of 21‐Nucleotide RNAs Mediate RNA Interference in Cultured Mammalian Cells.” Nature 411: 494–498. 10.1038/35078107 [DOI] [PubMed] [Google Scholar]
  • 9. Ji, Yinduo , Zhang Barbara, Van Stephanie F., Horn U., Warren Patrick, Woodnutt Gary, Burnham Martin K. R., and Rosenberg Martin. 2001. “Identification of Critical Staphylococcal Genes Using Conditional Phenotypes Generated by Antisense RNA.” Science 293: 2266–2269. 10.1126/science.1063566 [DOI] [PubMed] [Google Scholar]
  • 10. Shalem, Ophir , Sanjana Neville E., Hartenian Ella, Shi Xi, Scott David A., Mikkelsen Tarjei S., Heckl Dirk, et al. 2014. “Genome‐Scale CRISPR‐Cas9 Knockout Screening in Human Cells.” Science 343: 84–87. 10.1126/science.1247005 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Wang, Tim , Wei Jenny J., Sabatini David M., and Lander Eric S.. 2014. “Genetic Screens in Human Cells Using the CRISPR‐Cas9 System.” Science 343: 80–84. 10.1126/science.1246981 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12. Jewett, Michael C. , and Forster Anthony C. 2010. “Update on Designing and Building Minimal Cells.” Current Opinion in Biotechnology 21: 697–703. 10.1016/j.copbio.2010.06.008 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13. Gibson, Daniel G. , Glass John I., Lartigue Carole, Noskov Vladimir N., Chuang Ray‐Yuan, Algire Mikkel A., Benders Gwynedd A., et al. 2010. “Creation of a Bacterial Cell Controlled by a Chemically Synthesized Genome.” Science 329: 52–56. 10.1126/science.1190719 [DOI] [PubMed] [Google Scholar]
  • 14. Yan, Fangfang , and Gao Feng. 2020. “A Systematic Strategy for the Investigation of Vaccines and Drugs Targeting Bacteria.” Computational and Structural Biotechnology Journal 18: 1525–1538. 10.1016/j.csbj.2020.06.008 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15. Zhang, Ren , Ou Hong‐Yu, and Zhang Chun‐Ting. 2004. “DEG: A Database of Essential Genes.” Nucleic Acids Research 32: D271–D272. 10.1093/nar/gkh024 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. Zhang, Ren , and Lin Yan. 2009. “DEG 5.0, a Database of Essential Genes in Both Prokaryotes and Eukaryotes.” Nucleic Acids Research 37: D455–D458. 10.1093/nar/gkn858 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Luo, Hao , Lin Yan, Gao Feng, Zhang Chun‐Ting, and Zhang Ren. 2014. “DEG 10, an Update of the Database of Essential Genes That Includes Both Protein‐Coding Genes and Noncoding Genomic Elements.” Nucleic Acids Research 42: D574–D580. 10.1093/nar/gkt1131 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18. Luo, Hao , Lin Yan, Liu Tao, Lai Fei‐Liao, Zhang Chun‐Ting, Gao Feng, and Zhang Ren. 2021. “DEG 15, an Update of the Database of Essential Genes That Includes Built‐In Analysis Tools.” Nucleic Acids Research 49: D677–D686. 10.1093/nar/gkaa917 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19. Zhan, Tianzuo , and Boutros Michael. 2016. “Towards a Compendium of Essential Genes From Model Organisms to Synthetic Lethality in Cancer Cells.” Critical Reviews in Biochemistry and Molecular Biology 51: 74–85. 10.3109/10409238.2015.1117053 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20. Ryan, Colm J. , Krogan Nevan J., Cunningham Pádraig, and Cagney Gerard. 2013. “All or Nothing: Protein Complexes Flip Essentiality Between Distantly Related Eukaryotes.” Genome Biology and Evolution 5: 1049–1059. 10.1093/gbe/evt074 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21. Baba, Tomoya , Ara Takeshi, Hasegawa Miki, Takai Yuki, Okumura Yoshiko, Baba Miki, Datsenko Kirill A., Tomita Masaru, Wanner Barry L., and Mori Hirotada. 2006. “Construction of Escherichia coli K‐12 In‐Frame, Single‐Gene Knockout Mutants: The Keio Collection.” Molecular Systems Biology 2: 0008. 10.1038/msb4100050 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22. Gerdes, Svetlana , Edwards Robert, Kubal Michael, Fonstein Michael, Stevens Rick, Osterman Andrei. 2006. “Essential Genes on Metabolic Maps.” Current Opinion in Biotechnology 17: 448–456. 10.1016/j.copbio.2006.08.006 [DOI] [PubMed] [Google Scholar]
  • 23. Rancati, Giulia , Moffat Jason, Typas Athanasios, and Pavelka Norman. 2018. “Emerging and Evolving Concepts in Gene Essentiality.” Nature Reviews Genetics 19: 34–49. 10.1038/nrg.2017.74 [DOI] [PubMed] [Google Scholar]
  • 24. Wang, Tim , Birsoy Kıvanç, Hughes Nicholas W., Krupczak Kevin M., Post Yorick, Wei Jenny J., Lander Eric S., and Sabatini David M.. 2015. “Identification and Characterization of Essential Genes in the Human Genome.” Science 350: 1096–1101. 10.1126/science.aac7041 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25. Qian, Wenfeng , Ma Di, Xiao Che, Wang Zhi, and Zhang Jianzhi. 2012. “The Genomic Landscape and Evolutionary Resolution of Antagonistic Pleiotropy in Yeast.” Cell Reports 2: 1399–1410. 10.1016/j.celrep.2012.09.017 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26. Commichau, Fabian M. , Pietack Nico, and Stülke Jörg. 2013. “Essential Genes in Bacillus subtilis: A Re‐Evaluation After Ten Years.” Molecular BioSystems 9: 1068–1075. 10.1039/C3MB25595F [DOI] [PubMed] [Google Scholar]
  • 27. Chen, Piaopiao , Wang Dandan, Chen Han, Zhou Zhenzhen, and He Xionglei. 2016. “The Nonessentiality of Essential Genes in Yeast Provides Therapeutic Insights Into a Human Disease.” Genome Research 26: 1355–1362. 10.1101/gr.205955.116 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28. Dobzhansky, Th . 1946. “Genetics of Natural Populations. XIII. Recombination and Variability in Populations of Drosophila pseudoobscura .” Genetics 31: 269–290. 10.1093/genetics/31.3.269 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29. Glass, John I. , Merryman Chuck, Wise Kim S., Hutchison Clyde A., and Smith Hamilton O.. 2017. “Minimal Cells—Real and Imagined.” Cold Spring Harbor Perspectives in Biology 9: a023861. 10.1101/cshperspect.a023861 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30. Kaiser, Chris A. , and Schekman Randy. 1990. “Distinct Sets of SEC Genes Govern Transport Vesicle Formation and Fusion Early in the Secretory Pathway.” Cell 61: 723–733. 10.1016/0092-8674(90)90483-U [DOI] [PubMed] [Google Scholar]
  • 31. Glass, John I. , Assad‐Garcia Nacyra, Alperovich Nina, Yooseph Shibu, Lewis Matthew R., Maruf Mahir, Hutchison Clyde A., Smith Hamilton O., and Venter J. Craig. 2006. “Essential Genes of a Minimal Bacterium.” Proceedings of the National Academy of Sciences 103: 425–430. 10.1073/pnas.0510013103 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32. Hillenmeyer, Maureen E. , Fung Eula, Wildenhain Jan, Pierce Sarah E., Hoon Shawn, Lee William, Proctor Michael, et al. 2008. “The Chemical Genomic Portrait of Yeast: Uncovering a Phenotype for All Genes.” Science 320: 362–365. 10.1126/science.1150021 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33. Nichols, Robert J. , Sen Saunak, Choo Yoe Jin, Beltrao Pedro, Zietek Matylda, Chaba Rachna, Lee Sueyoung, et al. 2011. “Phenotypic Landscape of a Bacterial Cell.” Cell 144: 143–156. 10.1016/j.cell.2010.11.052 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34. Papp, Balázs , Pál Csaba, and Hurst Laurence D.. 2004. “Metabolic Network Analysis of the Causes and Evolution of Enzyme Dispensability in Yeast.” Nature 429: 661–664. 10.1038/nature02636 [DOI] [PubMed] [Google Scholar]
  • 35. Morgens, David W. , Deans Richard M., Li Amy, and Bassik Michael C.. 2016. “Systematic Comparison of CRISPR/Cas9 and RNAi Screens for Essential Genes.” Nature Biotechnology 34: 634–636. 10.1038/nbt.3567 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36. Bergmiller, Tobias , Ackermann Martin, and Silander Olin K.. 2012. “Patterns of Evolutionary Conservation of Essential Genes Correlate With Their Compensability.” PLOS Genetics 8: e1002803. 10.1371/journal.pgen.1002803 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37. Liu, Gaowen , Yong Mei Yun Jacy, Yurieva Marina, Srinivasan Kandhadayar Gopalan, Liu Jaron, Lim John Soon Yew, Poidinger Michael, et al. 2015. “Gene Essentiality is a Quantitative Property Linked to Cellular Evolvability.” Cell 163: 1388–1399. 10.1016/j.cell.2015.10.069 [DOI] [PubMed] [Google Scholar]
  • 38. Van Leeuwen, Jolanda , Pons Carles, Tan Guihong, Wang Jason Zi, Hou Jing, Weile Jochen, Gebbia Marinella, et al. 2020. “Systematic Analysis of Bypass Suppression of Essential Genes.” Molecular Systems Biology 16: e9828. 10.15252/msb.20209828 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39. Taylor, Tiffany B. , Mulley Geraldine, Dills Alexander H., Alsohim Abdullah S., McGuffin Liam J., Studholme David J., Silby Mark W., Brockhurst Michael A., Johnson Louise J., and Jackson Robert W.. 2015. “Evolutionary Resurrection of Flagellar Motility Via Rewiring of the Nitrogen Regulation System.” Science 347: 1014–1017. 10.1126/science.1259145 [DOI] [PubMed] [Google Scholar]
  • 40. Li, Jun , Wang Hai‐Tao, Wang Wei‐Tao, Zhang Xiao‐Ran, Suo Fang, Ren Jing‐Yi, Bi Ying, et al. 2019. “Systematic Analysis Reveals the Prevalence and Principles of Bypassable Gene Essentiality.” Nature Communications 10: 1002. 10.1038/s41467-019-08928-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41. Larrimore, Katherine E. , and Rancati Giulia. 2019. “The Conditional Nature of Gene Essentiality.” Current Opinion in Genetics & Development 58‐59: 55–61. 10.1016/j.gde.2019.07.015 [DOI] [PubMed] [Google Scholar]
  • 42. Horowitz, Norman H. , and Leupold Urs. 1951. “Some Recent Studies Bearing on the One Gene‐One Enzyme Hypothesis.” Cold Spring Harbor Symposia on Quantitative Biology 16: 65–74. 10.1101/SQB.1951.016.01.006 [DOI] [PubMed] [Google Scholar]
  • 43. Kodym, Andrea , and Afza Rownak. 2003. Physical and Chemical Mutagenesis. New Jersey, USA: Humana Press. 10.1385/1-59259-413-1:189 [DOI] [PubMed] [Google Scholar]
  • 44. Burke, H. Judd , Shen Max W., and Kaufman Thomas C.. 1972. “The Anatomy and Function of a Segment of the X Chromosome of Drosophila melanogaster .” Genetics 71: 139–156. 10.1093/genetics/71.1.139 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45. Lefevre, George . 1974. “The One Band‐One Gene Hypothesis: Evidence from a Cytogenetic Analysis of Mutant and Nonmutant Rearrangement Breakpoints in Drosophila melanogaster .” Cold Spring Harbor Symposia on Quantitative Biology 38: 591–599. 10.1101/SQB.1974.038.01.063 [DOI] [PubMed] [Google Scholar]
  • 46. Brenner, Sydney . 1974. “The Genetics of Caenorhabditis elegans .” Genetics 77: 71–94. 10.1093/genetics/77.1.71 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47. Goebl, Mark G. , and Petes Thomas D.. 1986. “Most of the Yeast Genomic Sequences are Not Essential for Cell Growth and Division.” Cell 46: 983–992. 10.1016/0092-8674(86)90697-5 [DOI] [PubMed] [Google Scholar]
  • 48. McClintock, Barbara . 1950. “The Origin and Behavior of Mutable Loci in Maize.” Proceedings of the National Academy of Sciences 36: 344–355. 10.1073/pnas.36.6.344 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49. Mardis, Elaine R . 2011. “A Decade's Perspective on DNA Sequencing Technology.” Nature 470: 198–203. 10.1038/nature09796 [DOI] [PubMed] [Google Scholar]
  • 50. Smithies, Oliver , Gregg Ronald G., Boggs Sallie S., Koralewski Michael A., and Kucherlapati Raju S.. 1985. “Insertion of DNA Sequences Into the Human Chromosomal β‐globin Locus by Homologous Recombination.” Nature 317: 230–234. 10.1038/317230a0 [DOI] [PubMed] [Google Scholar]
  • 51. Zamecnik, Paul C. , and Martin L. Stephenson. 1978. “Inhibition of Rous Sarcoma Virus Replication and Cell Transformation by a Specific Oligodeoxynucleotide.” Proceedings of the National Academy of Sciences 75: 280–284. 10.1073/pnas.75.1.280 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52. Fire, Andrew , Xu SiQun, Montgomery Mary K., Kostas Steven A., Driver Samuel E., and Mello Craig C.. 1998. “Potent and Specific Genetic Interference by Double‐Stranded RNA in Caenorhabditis elegans .” Nature 391: 806–811. 10.1038/35888 [DOI] [PubMed] [Google Scholar]
  • 53. Fleischmann, Robert D. , Adams Mark D., White Owen, Clayton Rebecca A., Kirkness Ewen F., Kerlavage Anthony R., Bult Carol J., et al. 1995. “Whole‐Genome Random Sequencing and Assembly of Haemophilus influenzae Rd.” Science 269: 496–512. 10.1126/science.7542800 [DOI] [PubMed] [Google Scholar]
  • 54. Fraser, Claire M. , Gocayne Jeannine D., White Owen, Adams Mark D., Clayton Rebecca A., Fleischmann Robert D., Bult Carol J., et al. 1995. “The Minimal Gene Complement of Mycoplasma genitalium .” Science 270: 397–404. 10.1126/science.270.5235.397 [DOI] [PubMed] [Google Scholar]
  • 55. Goffeau, André , Barrell Berg G., Bussey Howard, Davis Randall W., Dujon Bernard, Feldmann Heinz, Galibert Francis, et al. 1996. “Life With 6000 Genes.” Science 274: 546–567. 10.1126/science.274.5287.546 [DOI] [PubMed] [Google Scholar]
  • 56. Blattner, Frederick R. , Plunkett Guy, Bloch Craig A., Perna Nicole T., Burland Valerie, Riley Monica, Collado‐Vides Julio, et al. 1997. “The Complete Genome Sequence of Escherichia coli K‐12.” Science 277: 1453–1462. 10.1126/science.277.5331.1453 [DOI] [PubMed] [Google Scholar]
  • 57. Vasquez, Karen M. , Marburger Kathleen, Intody Zsofia, and Wilson John H.. 2001. “Manipulating the Mammalian Genome by Homologous Recombination.” Proceedings of the National Academy of Sciences 98: 8403–8410. 10.1073/pnas.111009698 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58. Zijlstra, Maarten , Li En, Sajjadi Fereydoun, Subramani Suresh, and Jaenisch Rudolf. 1989. “Germ‐Line Transmission of a Disrupted β2 Microglobulin Gene Produced by Homologous Recombination in Embryonic Stem Cells.” Nature 342: 435–438. 10.1038/342435a0 [DOI] [PubMed] [Google Scholar]
  • 59. Kamath, Ravi S. , Fraser Andrew G., Dong Yan, Poulin Gino, Durbin Richard, Gotta Monica, Kanapin Alexander, et al. 2003. “Systematic Functional Analysis of the Caenorhabditis elegans Genome Using RNAi.” Nature 421: 231–237. 10.1038/nature01278 [DOI] [PubMed] [Google Scholar]
  • 60. Meister, Gunter , and Tuschl Thomas. 2004. “Mechanisms of Gene Silencing by Double‐Stranded RNA.” Nature 431: 343–349. 10.1038/nature02873 [DOI] [PubMed] [Google Scholar]
  • 61. Moffat, Jason , and Sabatini David M.. 2006. “Building Mammalian Signalling Pathways With RNAi Screens.” Nature Reviews Molecular Cell Biology 7: 177–187. 10.1038/nrm1860 [DOI] [PubMed] [Google Scholar]
  • 62. Barquist, Lars , Boinett Christine J., and Cain Amy K.. 2013. “Approaches to Querying Bacterial Genomes With Transposon‐Insertion Sequencing.” RNA Biology 10: 1161–1169. 10.4161/rna.24765 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63. Reeburgh, William S . 2007. “Oceanic Methane Biogeochemistry.” Chemical Reviews 107: 486–513. 10.1021/cr050362v [DOI] [PubMed] [Google Scholar]
  • 64. Ringwald, Martin , Iyer Vivek, Mason Jeremy C., Stone Kevin R., Tadepally Hamsa D., Kadin James A., Bult Carol J., et al. 2011. “The IKMC Web Portal: A Central Point of Entry to Data and Resources From the International Knockout Mouse Consortium.” Nucleic Acids Research 39: D849–D855. 10.1093/nar/gkq879 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65. Dolgin, Elie . 2011. “Mouse Library Set to be Knockout.” Nature 474: 262–263. 10.1038/474262a [DOI] [PubMed] [Google Scholar]
  • 66. Giaever, Guri , Chu Angela M., Ni Li, Connelly Carla, Riles Linda, Véronneau Steeve, Dow Sally, et al. 2002. “Functional Profiling of the Saccharomyces cerevisiae Genome.” Nature 418: 387–391. 10.1038/nature00935 [DOI] [PubMed] [Google Scholar]
  • 67. Van Opijnen, Tim , Lazinski David W., and Camilli Andrew. 2014. “Genome‐Wide Fitness and Genetic Interactions Determined by Tn‐seq, a High‐Throughput Massively Parallel Sequencing Method for Microorganisms.” Current Protocols in Molecular Biology 106: 7.16.11‐17.16.24 10.1002/0471142727.mb0716s106 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68. Bartoszewski, Rafal , and Sikorski Aleksander F.. 2019. “Editorial Focus: Understanding Off‐Target Effects as the Key to Successful RNAi Therapy.” Cellular & Molecular Biology Letters 24: 69. 10.1186/s11658-019-0196-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69. Cong, Le , Ran F. Ann, Cox David Daniel, Lin Shuailiang, Barretto Robert, Habib Naomi, Hsu Patrick D., et al. 2013. “Multiplex Genome Engineering Using CRISPR/Cas Systems.” Science 339: 819–823. 10.1126/science.1231143 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70. Wang, Haoyi , Yang Hui, Shivalila Chikdu S., Dawlaty Meelad M., Cheng Albert W., Zhang Feng, and Jaenisch Rudolf. 2013. “One‐Step Generation of Mice Carrying Mutations in Multiple Genes by CRISPR/Cas‐Mediated Genome Engineering.” Cell 153: 910–918. 10.1016/j.cell.2013.04.025 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71. Barrangou, Rodolphe , Fremaux Christophe, Deveau Hélène, Richards Melissa, Boyaval Patrick, Moineau Sylvain, Romero Dennis A., and Horvath Philippe. 2007. “CRISPR Provides Acquired Resistance Against Viruses in Prokaryotes.” Science 315: 1709–1712. 10.1126/science.1138140 [DOI] [PubMed] [Google Scholar]
  • 72. Jinek, Martin , East Alexandra, Cheng Aaron, Lin Steven, Ma Enbo, and Doudna Jennifer. 2013. “RNA‐Programmed Genome Editing in Human Cells.” eLife 2: e00471. 10.7554/eLife.00471 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73. Hsu, Patrick D. , Lander Eric S., and Zhang Feng. 2014. “Development and Applications of CRISPR‐Cas9 for Genome Engineering.” Cell 157: 1262–1278. 10.1016/j.cell.2014.05.010 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74. Sander, Jeffry D. , and Joung J. Keith. 2014. “CRISPR‐Cas Systems for Editing, Regulating and Targeting Genomes.” Nature Biotechnology 32: 347–355. 10.1038/nbt.2842 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75. Mali, Prashant , Yang Luhan, Esvelt Kevin M., Aach John, Guell Marc, DiCarlo James E., Norville Julie E., and Church George M.. 2013. “RNA‐Guided Human Genome Engineering Via Cas9.” Science 339: 823–826. 10.1126/science.1232033 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76. Gilbert, Luke A. , Larson Matthew H., Morsut Leonardo, Liu Zairan, Brar Gloria A., Torres Sandra E., Stern‐Ginossar Noam, et al. 2013. “CRISPR‐Mediated Modular RNA‐Guided Regulation of Transcription in Eukaryotes.” Cell 154: 442–451. 10.1016/j.cell.2013.06.044 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77. Gilbert, Luke A. , Horlbeck Max A., Adamson Britt, Villalta Jacqueline E., Chen Yuwen, Whitehead Evan H., Guimaraes Carla, Panning Barbara, et al. 2014. “Genome‐Scale CRISPR‐Mediated Control of Gene Repression and Activation.” Cell 159: 647–661. 10.1016/j.cell.2014.09.029 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78. Alerasool, Nader , Segal Dmitri, Lee Hunsang, and Taipale Mikko. 2020. “An Efficient KRAB Domain for CRISPRi Applications in Human Cells.” Nature Methods 17: 1093–1096. 10.1038/s41592-020-0966-x [DOI] [PubMed] [Google Scholar]
  • 79. Blomen, Vincent A. , Májek Peter, Jae Lucas T., Bigenzahn Johannes W., Nieuwenhuis Joppe, Staring Jacqueline, Sacco Roberto, et al. 2015. “Gene Essentiality and Synthetic Lethality in Haploid Human Cells.” Science 350: 1092–1096. 10.1126/science.aac7557 [DOI] [PubMed] [Google Scholar]
  • 80. Hart, Traver , Chandrashekhar Megha, Aregger Michael, Steinhart Zachary, Brown Kevin R., MacLeod Graham, Mis Monika, et al. 2015. “High‐Resolution CRISPR Screens Reveal Fitness Genes and Genotype‐Specific Cancer Liabilities.” Cell 163: 1515–1526. 10.1016/j.cell.2015.11.015 [DOI] [PubMed] [Google Scholar]
  • 81. Ng, Sarah B. , Turner Emily H., Robertson Peggy D., Flygare Steven D., Bigham Abigail W., Lee Choli, Shaffer Tristan, et al. 2009. “Targeted Capture and Massively Parallel Sequencing of 12 Human Exomes.” Nature 461: 272–276. 10.1038/nature08250 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82. Lek, Monkol , Karczewski Konrad J., Minikel Eric V., Samocha Kaitlin E., Banks Eric, Fennell Timothy, O'Donnell‐Luria Anne H., et al. 2016. “Analysis of Protein‐Coding Genetic Variation in 60,706 Humans.” Nature 536: 285–291. 10.1038/nature19057 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83. Fagen, Jennie R. , Leonard Michael T., McCullough Connor M., Edirisinghe Janaka N., Henry Christopher S., Davis Michael J., and Triplett Eric W.. 2014. “Comparative Genomics of Cultured and Uncultured Strains Suggests Genes Essential for Free‐Living Growth of Liberibacter.” PLOS ONE 9: e84469. 10.1371/journal.pone.0084469 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84. Rout, Subhashree , Warhurst David Charles, Suar Mrutyunjay, and Mahapatra Rajani Kanta. 2015. “In Silico Comparative Genomics Analysis of Plasmodium falciparum for the Identification Of Putative Essential Genes and Therapeutic Candidates.” Journal of Microbiological Methods 109: 1–8. 10.1016/j.mimet.2014.11.016 [DOI] [PubMed] [Google Scholar]
  • 85. Yang, Xiaowen , Li Yajie, Zang Juan, Li Yexia, Bie Pengfei, Lu Yanli, and Wu Qingmin. 2016. “Analysis of Pan‐Genome to Identify the Core Genes and Essential Genes of Brucella spp.” Molecular Genetics and Genomics 291: 905–912. 10.1007/s00438-015-1154-z [DOI] [PubMed] [Google Scholar]
  • 86. Liu, Wei , Fang Liurong, Mao, Li , Li Sha, Shaohua, Guo , Luo Rui, Feng Zhixin, et al. 2012. “Comparative Genomics of Mycoplasma: Analysis of Conserved Essential Genes and Diversity of the Pan‐Genome.” PLoS One 7(4): e35698. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87. Peng, Chong , Lin Yan, Luo Hao, and Gao Feng. 2017. “A Comprehensive Overview of Online Resources to Identify and Predict Bacterial Essential Genes.” Frontiers in Microbiology 8: 2331. 10.3389/fmicb.2017.02331 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 88. Koonin, Eugene V . 2003. “Comparative Genomics, Minimal Gene‐Sets and the Last Universal Common Ancestor.” Nature Reviews Microbiology 1: 127–136. 10.1038/nrmicro751 [DOI] [PubMed] [Google Scholar]
  • 89. Thiele, Ines , and Palsson Bernhard Ø. 2010. “A Protocol for Generating a High‐Quality Genome‐Scale Metabolic Reconstruction.” Nature Protocols 5: 93–121. 10.1038/nprot.2009.203 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90. Raman, Karthik , and Chandra Nagasuma. 2009. “Flux Balance Analysis of Biological Systems: Applications and Challenges.” Briefings in Bioinformatics 10: 435–449. 10.1093/bib/bbp011 [DOI] [PubMed] [Google Scholar]
  • 91. Orth, Jeffrey D. , Thiele Ines, and Palsson Bernhard Ø. 2010. “What is Flux Balance Analysis?” Nature Biotechnology 28: 245–248. 10.1038/nbt.1614 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92. Basler, Georg . 2015. Computational Prediction of Essential Metabolic Genes Using Constraint‐Based Approaches. New York, USA: Springer. 10.1007/978-1-4939-2398-4_12 [DOI] [PubMed] [Google Scholar]
  • 93. Joyce, Andrew R. , and Bernhard Ø Palsson. 2008. Predicting Gene Essentiality Using Genome‐Scale in Silico Models. New Jersey, USA: Humana Press. 10.1007/978-1-59745-321-9_30 [DOI] [PubMed] [Google Scholar]
  • 94. Mahadevan, Radhakrishnan , Edwards Jeremy S., and Doyle Francis J.. 2002. “Dynamic Flux Balance Analysis of Diauxic Growth in Escherichia coli .” Biophysical Journal 83: 1331–1340. 10.1016/S0006-3495(02)73903-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 95. Zomorrodi, Ali R. , Suthers Patrick F., Ranganathan Sridhar, and Maranas Costas D.. 2012. “Mathematical Optimization Applications in Metabolic Networks.” Metabolic Engineering 14: 672–686. 10.1016/j.ymben.2012.09.005 [DOI] [PubMed] [Google Scholar]
  • 96. Ishihama, Yasushi , Schmidt Thorsten, Rappsilber Juri, Mann Matthias, Hartl F. Ulrich, Kerner Michael J., and Frishman Dmitrij. 2008. “Protein Abundance Profiling of the Escherichia coli Cytosol.” BMC Genomics 9: 102. 10.1186/1471-2164-9-102 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97. Lloyd, John P. , Seddon Alexander E., Moghe Gaurav D., Simenc Matthew C., and Shiu Shin‐Han. 2015. “Characteristics of Plant Essential Genes Allow for Within‐ and Between‐Species Prediction of Lethal Mutant Phenotypes.” The Plant Cell 27: 2133–2147. 10.1105/tpc.15.00051 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 98. Hershberg, Ruth , and Petrov Dmitri A.. 2008. “Selection on Codon Bias.” Annual Review of Genetics 42: 287–299. 10.1146/annurev.genet.42.110807.091442 [DOI] [PubMed] [Google Scholar]
  • 99. Wang, Jianxin , Peng Wei, and Wu Fang‐Xiang. 2013. “Computational Approaches to Predicting Essential Proteins: A Survey.” Proteomics Clinical Applications 7: 181–192. 10.1002/prca.201200068 [DOI] [PubMed] [Google Scholar]
  • 100. Gustafson, Adam M. , Snitkin Evan S., Parker Stephen C. J., DeLisi Charles, and Kasif Simon. 2006. “Towards the Identification of Essential Genes Using Targeted Genome Sequencing and Comparative Analysis.” BMC Genomics 7: 265. 10.1186/1471-2164-7-265 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 101. Liu, Tao , Luo Hao, and Gao Feng. 2021. “Position Preference of Essential Genes in Prokaryotic Operons.” PLOS ONE 16: e0250380. 10.1371/journal.pone.0250380 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 102. Peng, Chong , and Gao Feng. 2014. “Protein Localization Analysis of Essential Genes in Prokaryotes.” Scientific Reports 4: 6001. 10.1038/srep06001 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 103. Zhou, Qian , and Yu Yong‐ming. 2014. “Comparative Analysis of Bacterial Essential and Nonessential Genes With Hurst Exponent Based on Chaos Game Representation.” Chaos, Solitons & Fractals 69: 209–216. 10.1016/j.chaos.2014.10.003 [DOI] [Google Scholar]
  • 104. Luo, Hao , Gao Feng, and Lin Yan. 2015. “Evolutionary Conservation Analysis Between the Essential and Nonessential Genes in Bacterial Genomes.” Scientific Reports 5: 13210. 10.1038/srep13210 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 105. Jansen, Ronald , Greenbaum Dov, and Gerstein Mark. 2002. “Relating Whole‐Genome Expression Data With Protein‐Protein Interactions.” Genome Research 12: 37–46. 10.1101/gr.205602 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 106. Gao, Feng , and Zhang Randy Ren. 2011. “Enzymes are Enriched in Bacterial Essential Genes.” PLOS ONE 6: e21683. 10.1371/journal.pone.0021683 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 107. Lin, Yan , Gao Feng, and Zhang Chun‐Ting. 2010. “Functionality of Essential Genes Drives Gene Strand‐Bias in Bacterial Genomes.” Biochemical and Biophysical Research Communications 396: 472–476. 10.1016/j.bbrc.2010.04.119 [DOI] [PubMed] [Google Scholar]
  • 108. Chen, Hebing , Zhang Zhuo, Jiang Shuai, Li Ruijiang, Li Wanying, Zhao Chenghui, Hong Hao, Huang Xin, Li Hao, and Bo Xiaochen. 2020. “New Insights on Human Essential Genes Based on Integrated Analysis and the Construction of the HEGIAP Web‐Based Platform.” Briefings in Bioinformatics 21: 1397–1410. 10.1093/bib/bbz072 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 109. Costanzo, Michael , Kuzmin Elena, van Leeuwen Jolanda, Mair Barbara, Moffat Jason, Boone Charles, and Andrews Brenda. 2019. “Global Genetic Networks and the Genotype‐to‐Phenotype Relationship.” Cell 177: 85–100. 10.1016/j.cell.2019.01.033 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 110. Beacon, Tasnim H. , Delcuve Geneviève P., López Camila, Nardocci Gino, Kovalchuk Igor, van Wijnen Andre J., and Davie James R.. 2021. “The Dynamic Broad Epigenetic (H3K4me3, H3K27ac) Domain as a Mark of Essential Genes.” Clinical Epigenetics 13: 138. 10.1186/s13148-021-01126-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 111. Goodacre, Norman F. , Dietlind L. Gerloff, and Uetz Peter. 2013. “Protein Domains of Unknown Function are Essential in Bacteria.” mBio 5: 00744–13. 10.1128/mBio.00744-13 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 112. Baştanlar, Yalin , and Özuysal Mustafa. 2014. Introduction to Machine Learning. New Jersey, USA: Humana Press. 10.1007/978-1-62703-748-8_7 [DOI] [PubMed] [Google Scholar]
  • 113. Evers, Bastiaan , Jastrzebski Katarzyna, Heijmans Jeroen P. M., Grernrum Wipawadee, Beijersbergen Roderick L., and Bernards Rene. 2016. “CRISPR Knockout Screening Outperforms shRNA and CRISPRi in Identifying Essential Genes.” Nature Biotechnology 34: 631–633. 10.1038/nbt.3536 [DOI] [PubMed] [Google Scholar]
  • 114. Fan, Yetian , Tang Xiwei, Hu Xiaohua, Wu Wei, and Ping Qing. 2017. “Prediction of Essential Proteins Based on Subcellular Localization and Gene Expression Correlation.” BMC Bioinformatics 18: 470. 10.1186/s12859-017-1876-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 115. Zhang, Ying‐Ying , Liang De‐Min, and Du Pu‐Feng. 2023. “iEssLnc: Quantitative Estimation of lncRNA Gene Essentialities With Meta‐Path‐Guided Random Walks on the lncRNA‐protein Interaction Network.” Briefings in Bioinformatics 24: bbad097. 10.1093/bib/bbad097 [DOI] [PubMed] [Google Scholar]
  • 116. Chen, Xiang‐Rong , Cui You‐Zhi, Li Bing‐Zhi, and Yuan Ying‐Jin. 2023. “Genome Engineering on Size Reduction and Complexity Simplification: A Review.” Journal of Advanced Research. In press. 10.1016/j.jare.2023.07.006 [DOI] [PubMed] [Google Scholar]
  • 117. Liu, Xiao , Wang Bao‐Jin, Xu Luo, Tang Hong‐Ling, and Xu Guo‐Qing. 2017. “Selection of Key Sequence‐Based Features for Prediction of Essential Genes in 31 Diverse Bacterial Species.” PLOS ONE 12: e0174638. 10.1371/journal.pone.0174638 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 118. Hasan, Md Abid , and Lonardi Stefano. 2020. “DeeplyEssential: A Deep Neural Network for Predicting Essential Genes in Microbes.” BMC Bioinformatics 21: 367. 10.1186/s12859-020-03688-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 119. Mobegi, Fredrick M. , Zomer Aldert, de Jonge Marien I., and van Hijum Sacha A. F. T.. 2017. “Advances and Perspectives in Computational Prediction of Microbial Gene Essentiality.” Briefings in Functional Genomics 16: 70–79. 10.1093/bfgp/elv063 [DOI] [PubMed] [Google Scholar]
  • 120. Zeng, Min , Li Min, Fei Zhihui, Wu Fangxiang, Li Yaohang, Pan Yi, and Wang Jianxin. 2019. “A Deep Learning Framework for Identifying Essential Proteins by Integrating Multiple Types of Biological Information.” IEEE/ACM Transactions on Computational Biology and Bioinformatics 18: 296–305. 10.1109/TCBB.2019.2897679 [DOI] [PubMed] [Google Scholar]
  • 121. Aromolaran, Olufemi , Aromolaran Damilare, Isewon Itunuoluwa, and Oyelade Jelili. 2021. “Machine Learning Approach to Gene Essentiality Prediction: A Review.” Briefings in Bioinformatics 22: bbab128. 10.1093/bib/bbab128 [DOI] [PubMed] [Google Scholar]
  • 122. Gurumayum, Sanathoi , Jiang Puzi, Hao Xiaowen, Campos Tulio L., Young Neil D., Korhonen Pasi K., Gasser Robin B., et al. 2021. “OGEE v3: Online GEne Essentiality Database With Increased Coverage of Organisms and Human Cell Lines.” Nucleic Acids Research 49: D998–D1003. 10.1093/nar/gkaa884 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 123. Wattam, Alice R. , Davis James J., Assaf Rida, Boisvert Sébastien, Brettin Thomas, Bun Christopher, Conrad Neal, et al. 2017. “Improvements to PATRIC, the All‐Bacterial Bioinformatics Database and Analysis Resource Center.” Nucleic Acids Research 45: D535–D542. 10.1093/nar/gkw1017 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 124. Lin, Yan , and Zhang Randy Ren. 2011. “Putative Essential and Core‐Essential Genes in Mycoplasma Genomes.” Scientific Reports 1: 53. 10.1038/srep00053 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 125. Liu, Shuo , Wang Shu‐Xuan, Liu Wei, Wang Chen, Zhang Fa‐Zhan, Ye Yuan‐Nong, Wu Candy‐S., Zheng Wen‐Xin, Rao Nini, and Guo Feng‐Biao. 2020. “CEG 2.0: An Updated Database of Clusters of Essential Genes Including Eukaryotic Organisms.” Database 2020: baaa112. 10.1093/database/baaa112 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 126. Senthamizhan, Vimaladhasan , Ravindran Balaraman, and Raman Karthik. 2021. “NetGenes: A Database of Essential Genes Predicted Using Features From Interaction Networks.” Frontiers in Genetics 12: 722198. 10.3389/fgene.2021.722198 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 127. Kong, Xiangzhen , Zhu Bin, Stone Victoria N., Ge Xiuchun, El‐Rami Fadi E., Donghai Huangfu, and Xu Ping. 2019. “ePath: An Online Database Towards Comprehensive Essential Gene Annotation for Prokaryotes.” Scientific Reports 9: 12949. 10.1038/s41598-019-49098-w [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 128. Wen, Qing‐Feng , Liu Shuo, Dong Chuan, Guo Hai‐Xia, Gao Yi‐Zhou, and Guo Feng‐Biao. 2019. “Geptop 2.0: An Updated, More Precise, and Faster Geptop Server for Identification of Prokaryotic Essential Genes.” Frontiers in Microbiology 10: 1236. 10.3389/fmicb.2019.01236 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 129. Hua, Zhi‐Gang , Lin Yan, Yuan Ya‐Zhou, Yang De‐Chang, Wei Wen, and Guo Feng‐Biao. 2015. “ZCURVE 3.0: Identify Prokaryotic Genes With Higher Accuracy As Well As Automatically and Accurately Select Essential Genes.” Nucleic Acids Research 43: W85–W90. 10.1093/nar/gkv491 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 130. Ning, Lu‐Wen , Lin Hao, Ding Hui, Huang Jian, Rao Ni‐Ni, and Guo Feng‐Biao. 2014. “Predicting Bacterial Essential Genes Using Only Sequence Composition Information.” Genetics and Molecular Research 13: 4564–4572. 10.4238/2014.June.17.8 [DOI] [PubMed] [Google Scholar]
  • 131. Juhas, Mario . 2015. “On the Road To Synthetic Life: The Minimal Cell and Genome‐Scale Engineering.” Critical Reviews in Biotechnology 3: 416–423. 10.3109/07388551.2014.989423 [DOI] [PubMed] [Google Scholar]
  • 132. Mushegian, Arcady . 1999. “The Minimal Genome Concept.” Current Opinion in Genetics & Development 5: 709–714. 10.1016/S0959-437X(99)00023-4 [DOI] [PubMed] [Google Scholar]
  • 133. Chi, Haotian , Wang Xiaoli, Shao Yue, Qin Ying, Deng Zixin, Wang Lianrong, and Chen Shi. 2019. “Engineering and Modification Of Microbial Chassis for Systems and Synthetic Biology.” Synthetic and Systems Biotechnology 4: 25–33. 10.1016/j.synbio.2018.12.001 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 134. Ikeda, Haruo , Shin‐ya Kazuo, and Omura Satoshi. 2014. “Genome Mining of the Streptomyces avermitilis Genome and Development of Genome‐Minimized Hosts for Heterologous Expression Of Biosynthetic Gene Clusters.” Journal of Industrial Microbiology and Biotechnology 41: 233–250. 10.1007/s10295-013-1327-x [DOI] [PubMed] [Google Scholar]
  • 135. Michalik, Stephan , Reder Alexander, Richts Björn, Faßhauer Patrick, Mäder Ulrike, Pedreira Tiago, Poehlein Anja, et al. 2021. “The Bacillus subtilis Minimal Genome Compendium.” ACS Synthetic Biology 10: 2767–2771. 10.1021/acssynbio.1c00339 [DOI] [PubMed] [Google Scholar]
  • 136. Leprince, Audrey , van Passel Mark W. J., and dos Santos Vitor A. P. Martins. 2012. “Streamlining Genomes: Toward the Generation Of Simplified and Stabilized Microbial Systems.” Current Opinion in Biotechnology 23: 651–658. 10.1016/j.copbio.2012.05.001 [DOI] [PubMed] [Google Scholar]
  • 137. Pal, Csaba , Papp Balazs, Lercher Martin J., Csermely Peter, Oliver Stephen G., and Hurst Laurence D.. 2006. “Chance and Necessity in the Evolution of Minimal Metabolic Networks.” Nature 440: 667–670. 10.1038/nature04568 [DOI] [PubMed] [Google Scholar]
  • 138. Rees‐Garbutt, Joshua , Chalkley Oliver, Landon Sophie, Purcell Oliver, Marucci Lucia, and Grierson Claire. 2020. “Designing Minimal Genomes Using Whole‐Cell Models.” Nature Communications 11: 836. 10.1038/s41467-020-14545-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 139. Wang, Lin , and Maranas Costas D.. 2018. “MinGenome: An In Silico Top‐Down Approach for the Synthesis of Minimized Genomes.” ACS Synthetic Biology 7: 462–473. 10.1021/acssynbio.7b00296 [DOI] [PubMed] [Google Scholar]
  • 140. Moya, Andrés , Gil Rosario, Latorre Amparo, Peretó Juli, Pilar Garcillán‐Barcia Maria, and De La Cruz Fernando. 2009. “Toward Minimal Bacterial Cells: Evolution vs. Design.” FEMS Microbiology Reviews 33: 225–235. 10.1111/j.1574-6976.2008.00151.x [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 141. Hutchison, Clyde A. , Chuang Ray‐Yuan, Noskov Vladimir N., Assad‐Garcia Nacyra, Deerinck Thomas J., Ellisman Mark H., Gill John, et al. 2016. “Design and Synthesis of a Minimal Bacterial Genome.” Science 351: aad6253. 10.1126/science.aad6253 [DOI] [PubMed] [Google Scholar]
  • 142. Breuer, Marian , Earnest Tyler M., Merryman Chuck, Wise Kim S., Sun Lijie, Lynott Michaela R., Hutchison Clyde A., et al. 2019. “Essential Metabolism for a Minimal Cell.” eLife 8: e36842. 10.7554/eLife.36842 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 143. Matteau, Dominick , Lachance Jean‐Christophe, Grenier Frédéric, Gauthier Samuel, Daubenspeck James M., Dybvig Kevin, Garneau Daniel, Knight Thomas F., Jacques Pierre‐Étienne, and Rodrigue Sébastien. 2020. “Integrative Characterization of the Near‐Minimal Bacterium Mesoplasma Florum.” Molecular Systems Biology 16: e9844. 10.15252/msb.20209844 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 144. Rappuoli, Rino . 2000. “Reverse Vaccinology.” Current Opinion in Microbiology 3: 445–450. [DOI] [PubMed] [Google Scholar]
  • 145. Clatworthy, Anne E. , Pierson Emily, and Hung Deborah T.. 2007. “Targeting Virulence: A New Paradigm for Antimicrobial Therapy.” Nature Chemical Biology 3: 541–548. [DOI] [PubMed] [Google Scholar]
  • 146. Huang, Alan , Garraway Levi A., Ashworth Alan, and Weber Barbara. 2020. “Synthetic Lethality as an Engine for Cancer Drug Target Discovery.” Nature Reviews Drug Discovery 19: 23–38. [DOI] [PubMed] [Google Scholar]
  • 147. Behan, Fiona M. , Iorio Francesco, Picco Gabriele, Gonçalves Emanuel, Beaver Charlotte M., Migliardi Giorgia, Santos Rita, et al. 2019. “Prioritization of Cancer Therapeutic Targets Using CRISPR–Cas9 Screens.” Nature 568: 511–516. 10.1038/s41586-019-1103-9 [DOI] [PubMed] [Google Scholar]
  • 148. Narasimhan, Vagheesh M. , Hunt Karen A., Mason Dan, Baker Christopher L., Karczewski Konrad J., Barnes Michael R., Barnett Anthony H., et al. 2016. “Health and Population Effects of Rare Gene Knockouts in Adult Humans With Related Parents.” Science 352: 474–477. 10.1126/science.aac8624 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 149. Shohat, Shahar , and Shifman Sagiv. 2019. “Genes Essential for Embryonic Stem Cells are Associated With Neurodevelopmental Disorders.” Genome Research 29: 1910–1918. 10.1101/gr.250019.119 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 150. Cacheiro, Pilar , Muñoz‐Fuentes Violeta, Murray Stephen A., Dickinson Mary E., Bucan Maja, Nutter Lauryl M. J., Peterson Kevin A., et al. 2020. “Human and Mouse Essentiality Screens as a Resource for Disease Gene Discovery.” Nature Communications 11: 655. 10.1038/s41467-020-14284-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 151. Fahnoe, Kelly C. , Flanagan Mark E., Gibson Glenn, Shanmugasundaram Veerabahu, Che Ye, and Tomaras Andrew P.. 2012. “Non‐Traditional Antibacterial Screening Approaches for the Identification of Novel Inhibitors of the Glyoxylate Shunt in Gram‐Negative Pathogens.” PLOS One 7: e51732. 10.1371/journal.pone.0051732 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 152. Pethe, Kevin , Bifani Pablo, Jang Jichan, Kang Sunhee, Park Seijin, Ahn Sujin, Jiricek Jan, et al. 2013. “Discovery of Q203, a Potent Clinical Candidate for the Treatment of Tuberculosis.” Nature Medicine 19: 1157–1160. 10.1038/nm.3262 [DOI] [PubMed] [Google Scholar]
  • 153. Jeske, Lisa , Placzek Sandra, Schomburg Ida, Chang Antje, and Schomburg Dietmar. 2019. “BRENDA in 2019: A European ELIXIR Core Data Resource.” Nucleic Acids Research 47: D542–D549. 10.1093/nar/gky1048 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 154. Kanehisa, Minoru , Sato Yoko, Furumichi Miho, Morishima Kanae, and Tanabe Mao. 2019. “New Approach for Understanding Genome Variations in KEGG.” Nucleic Acids Research 47: D590–D595. 10.1093/nar/gky962 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 155. The Gene Ontology Consortium . 2019. “The Gene Ontology Resource: 20 Years and Still GOing Strong.” Nucleic Acids Research 47: D330–D338. 10.1093/nar/gky1055 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 156. Ye, Jian , McGinnis Scott, and Madden Thomas L.. 2006. “BLAST: Improvements for Better Sequence Analysis.” Nucleic Acids Research 34: W6–W9. 10.1093/nar/gkl164 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 157. Dong, Mei‐Jing , Luo Hao, and Gao Feng. 2023. “DoriC 12.0: An Updated Database of Replication Origins in Both Complete and Draft Prokaryotic Genomes.” Nucleic Acids Research 51: D117–D120. 10.1093/nar/gkac964 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 158. Luo, Hao , and Gao Feng. 2019. “DoriC 10.0: An Updated Database of Replication Origins in Prokaryotic Genomes Including Chromosomes and Plasmids.” Nucleic Acids Research 47: D74–D77. 10.1093/nar/gky1014 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 159. Gao, Feng , Luo Hao, and Zhang Chun‐Ting. 2013. “DoriC 5.0: An Updated Database of Oric Regions in Both Bacterial and Archaeal Genomes.” Nucleic Acids Research 41: D90–D93. 10.1093/nar/gks990 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 160. Gao, Feng , and Zhang Chun‐Ting. 2007. “DoriC: A Database of oriC Regions in Bacterial Genomes.” Bioinformatics 23: 1866–1867. 10.1093/bioinformatics/btm255 [DOI] [PubMed] [Google Scholar]
  • 161. Galperin, Michael Y. , Rigden Daniel J., and Fernández‐Suárez Xosé M.. 2015. “The 2015 Nucleic Acids Research Database Issue and Molecular Biology Database Collection.” Nucleic Acids Research 43: D1–D5. 10.1093/nar/gku1241 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 162. Hao, Tong , Han Binbin, Ma Hongwu, Fu Jing, Wang Hui, Wang Zhiwen, Tang Bincai, Chen Tao, and Zhao Xueming. 2013. “In Silico Metabolic Engineering of Bacillus subtilis for Improved Production of Riboflavin, Egl‐237, (R,R)−2,3‐butanediol and Isobutanol.” Molecular BioSystems 9: 2034–2044. 10.1039/c3mb25568a [DOI] [PubMed] [Google Scholar]
  • 163. Bu, Qing‐Ting , Yu Pin, Wang Jue, Li Zi‐Yue, Chen Xin‐Ai, Mao Xu‐Ming, and Li Yong‐Quan. 2019. “Rational Construction of Genome‐Reduced and High‐Efficient Industrial Streptomyces Chassis Based on Multiple Comparative Genomic Approaches.” Microbial Cell Factories 18: 16. 10.1186/s12934-019-1055-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 164. Fan, Xu , Zhang Yiting, Zhao Fengjie, Liu Yujie, Zhao Yuxin, Wang Shufang, Liu Ruihua, and Yang Chao. 2020. “Genome Reduction Enhances Production of Polyhydroxyalkanoate and Alginate Oligosaccharide in Pseudomonas mendocina .” International Journal of Biological Macromolecules 163: 2023–2031. 10.1016/j.ijbiomac.2020.09.067 [DOI] [PubMed] [Google Scholar]
  • 165. Zhang, Fang , Huo Kaiyue, Song Xingyi, Quan Yufen, Wang Shufang, Zhang Zhiling, Gao Weixia, and Yang Chao. 2020. “Engineering of a Genome‐Reduced Strain Bacillus amyloliquefaciens for Enhancing Surfactin Production.” Microbial Cell Factories 19: 223. 10.1186/s12934-020-01485-z [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 166. Liu, Jiaqi , Zhou Haibo, Yang Zhiyu, Wang Xue, Chen Hanna, Zhong Lin, Zheng Wentao, et al. 2021. “Rational Construction of Genome‐Reduced Burkholderiales Chassis Facilitates Efficient Heterologous Production of Natural Products From Proteobacteria.” Nature Communications 12: 4347. 10.1038/s41467-021-24645-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 167. Xu, Xin , Meier Felix, Blount Benjamin A., Pretorius Isak S., Ellis Tom, Paulsen Ian T., and Williams Thomas C.. 2023. “Trimming the Genomic Fat: Minimising and Re‐Functionalising Genomes Using Synthetic Biology.” Nature Communications 14: 1984. 10.1038/s41467-023-37748-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 168. Zhou, Jianuan , Ma Hongmei, and Lianhui Zhang. 2023. “Mechanisms of Virulence Reprogramming in Bacterial Pathogens.” Annual Review of Microbiology 77: 561–581. 10.1146/annurev-micro-032521-025954 [DOI] [PubMed] [Google Scholar]
  • 169. Ma, Shuai , Su Tianyuan, Xuemei Lu, and Qi Qingsheng. 2023. “Bacterial Genome Reduction for Optimal Chassis of Synthetic Biology: A Review.” Critical Reviews in Biotechnology, 1–14. 10.1080/07388551.2023.2208285 [DOI] [PubMed] [Google Scholar]
  • 170. Basharat, Zarrin , Jahanzaib Muhammad, Yasmin Azra, and Khan Ishtiaq Ahmad. 2021. “Pan‐Genomics, Drug Candidate Mining and ADMET Profiling of Natural Product Inhibitors Screened Against Yersinia pseudotuberculosis .” Genomics 113: 238–244. 10.1016/j.ygeno.2020.12.015 [DOI] [PubMed] [Google Scholar]
  • 171. Mondal, Shakhinur Islam , Ferdous Sabiha, Akter Arzuba, Mahmud Zabed, Karim Nurul, Islam Md. Muzahidul, Jewel Nurnabi Azad, and Afrin Tanzila. 2015. “Identification of Potential Drug Targets by Subtractive Genome Analysis of Escherichia coli O157:H7: An In‐Silico Approach.” Advances and Applications in Bioinformatics and Chemistry 8: 49–63. 10.2147/AABC.S88522 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 172. Rahman, Atikur , Sarker Md. Takim, Islam Md Ashiqul, Hossain Mohammad Uzzal, Hasan Mahmudul, and Susmi Tasmina Ferdous. 2023. “Targeting Essential Hypothetical Proteins of Pseudomonas aeruginosa PAO1 for Mining of Novel Therapeutics: An In Silico Approach.” BioMed Research International 2023: 1–28. 10.1155/2023/1787485 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 173. Basharat, Zarrin , Jahanzaib Muhammad, and Rahman Noor. 2021. “Therapeutic Target Identification Via Differential Genome Analysis of Antibiotic Resistant Shigella sonnei and Inhibitor Evaluation Against a Selected Drug Target.” Infection, Genetics and Evolution 94: 105004. 10.1016/j.meegid.2021.105004 [DOI] [PubMed] [Google Scholar]
  • 174. Khan, Kanwal , Jalal Khurshid, and Uddin Reaz. 2023. “Pangenome Profiling of Novel Drug Target Against Vancomycin‐Resistant Enterococcus faecium .” Journal of Biomolecular Structure and Dynamics. In press. 10.1080/07391102.2023.2191134 [DOI] [PubMed] [Google Scholar]
  • 175. Khan, Kanwal , Jalal Khurshid, Khan Ajmal, Al‐Harrasi Ahmed, and Uddin Reaz. 2022. “Comparative Metabolic Pathways Analysis and Subtractive Genomics Profiling to Prioritize Potential Drug Targets Against Streptococcus pneumoniae .” Frontiers in Microbiology 12: 796363. 10.3389/fmicb.2021.796363 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 176. Araújo, Carlos Leonardo , Alves Jorianne, Nogueira Wylerson, Pereira Lino César, Gomide Anne Cybelle, Ramos Rommel, Azevedo Vasco, Silva Artur, and Folador Adriana. 2019. “Prediction of New Vaccine Targets in the Core Genome of Corynebacterium Pseudotuberculosis Through Omics Approaches and Reverse Vaccinology.” Gene 702: 36–45. 10.1016/j.gene.2019.03.049 [DOI] [PubMed] [Google Scholar]
  • 177. Rodríguez‐Alvarez, Mauricio , Palomec‐Nava Iliana D., Mendoza‐Hernández Guillermo, and López‐Vidal Yolanda. 2010. “The Secretome of a Recombinant BCG Substrain Reveals Differences in Hypothetical Proteins.” Vaccine 28: 3997–4001. 10.1016/j.vaccine.2010.01.064 [DOI] [PubMed] [Google Scholar]
  • 178. Rahman, Noor , Ajmal Amar, Ali Fawad, and Rastrelli Luca. 2020. “Core Proteome Mediated Therapeutic Target Mining and Multi‐Epitope Vaccine Design for Helicobacter pylori .” Genomics 112: 3473–3483. 10.1016/j.ygeno.2020.06.026 [DOI] [PubMed] [Google Scholar]
  • 179. Khan, Kanwal , Alhar Munirah Sulaiman Othman, Abbas Muhammad Naseer, Abbas Syed Qamar, Kazi Mohsin, Khan Saeed Ahmad, Sadiq Abdul, Hassan Syed Shams ul, Bungau Simona, and Jalal Khurshid. 2022. “Integrated Bioinformatics‐Based Subtractive Genomics Approach to Decipher the Therapeutic Drug Target and Its Possible Intervention Against Brucellosis.” Bioengineering 9: 633. 10.3390/bioengineering9110633 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 180. Khan, Kanwal , Jalal Khurshid, and Uddin Reaz. 2022. “An Integrated in Silico Based Subtractive Genomics and Reverse Vaccinology Approach for the Identification of Novel Vaccine Candidate and Chimeric Vaccine Against XDR Salmonella typhi H58.” Genomics 114: 110301. 10.1016/j.ygeno.2022.110301 [DOI] [PubMed] [Google Scholar]
  • 181. Tahir ul Qamar, Muhammad , Ahmad Sajjad, Fatima Israr, Ahmad Faisal, Shahid Farah, Naz Anam, Abbasi Sumra Wajid, et al. 2021. “Designing Multi‐Epitope Vaccine Against Staphylococcus aureus by Employing Subtractive Proteomics, Reverse Vaccinology and Immuno‐Informatics Approaches.” Computers in Biology and Medicine 132: 104389. 10.1016/j.compbiomed.2021.104389 [DOI] [PubMed] [Google Scholar]
  • 182. Aparicio‐Puerta, Ernesto , Lebrón Ricardo, Rueda Antonio, Gómez‐Martín Cristina, Giannoukakos Stavros, Jaspez David, Medina José María, et al. 2019. “sRNAbench and sRNAtoolbox 2019: Intuitive Fast Small RNA Profiling and Differential Expression.” Nucleic Acids Research 47: W530–W535. 10.1093/nar/gkz415 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 183. Naz, Kanwal , Naz Anam, Ashraf Shifa Tariq, Rizwan Muhammad, Ahmad Jamil, Baumbach Jan, and Ali Amjad. 2019. “PanRV: Pangenome‐Reverse Vaccinology Approach for Identifications of Potential Vaccine Candidates in Microbial Pangenome.” BMC Bioinformatics 20: 123. 10.1186/s12859-019-2713-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 184. Overbeek, Ross , Olson Robert, Pusch Gordon D., Olsen Gary J., Davis James J., Disz Terry, Edwards Robert A., et al. 2014. “The SEED and the Rapid Annotation of Microbial Genomes Using Subsystems Technology (RAST).” Nucleic Acids Research 42: D206–D214. 10.1093/nar/gkt1226 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 185. King, Zachary A. , Lu Justin, Dräger Andreas, Miller Philip, Federowicz Stephen, Lerman Joshua A., Ebrahim Ali, Palsson Bernhard O., and Lewis Nathan E.. 2016. “BiGG Models: A Platform for Integrating, Standardizing and Sharing Genome‐Scale Models.” Nucleic Acids Research 44: D515–D522. 10.1093/nar/gkv1049 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 186. Magnúsdóttir, Stefanía , Heinken Almut, Kutt Laura, Ravcheev Dmitry A., Bauer Eugen, Noronha Alberto, Greenhalgh Kacy, et al. 2017. “Generation of Genome‐Scale Metabolic Reconstructions for 773 Members of the Human Gut Microbiota.” Nature Biotechnology 35: 81–89. 10.1038/nbt.3703 [DOI] [PubMed] [Google Scholar]
  • 187. Guo, Feng‐Biao , Dong Chuan, Hua Hong‐Li, Liu Shuo, Luo Hao, Zhang Hong‐Wan, Jin Yan‐Ting, and Zhang Kai‐Yue. 2017. “Accurate Prediction of Human Essential Genes Using Only Nucleotide Composition and Association Information.” Bioinformatics 33: 1758–1764. 10.1093/bioinformatics/btx055 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 188. Zhang, Xue , Xiao Wangxin, and Xiao Weijia. 2020. “DeepHE: Accurately Predicting Human Essential Genes Based on Deep Learning.” PLOS Computational Biology 16: e1008229. 10.1371/journal.pcbi.1008229 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 189. Shi, Hua , Wu Chenjin, Bai Tao, Chen Jiahai, Li Yan, and Wu Hao. 2023. “Identify Essential Genes Based on Clustering Based Synthetic Minority Oversampling Technique.” Computers in Biology and Medicine 153: 106523. 10.1016/j.compbiomed.2022.106523 [DOI] [PubMed] [Google Scholar]
  • 190. Zhou, Qian , Qi Saibing, and Ren Cong. 2021. “Gene Essentiality Prediction Based on Chaos Game Representation and Spiking Neural Networks.” Chaos, Solitons & Fractals 144: 110649. 10.1016/j.chaos.2021.110649 [DOI] [Google Scholar]
  • 191. Gao, Feng , Luo Hao, Zhang Chun‐Ting, and Zhang Ren. 2015. Gene Essentiality Analysis Based on DEG 10, an Updated Database of Essential Genes. New York, USA: Springer. 10.1007/978-1-4939-2398-4_14 [DOI] [PubMed] [Google Scholar]
  • 192. Chen, Tong , Liu Yong‐Xin, and Huang Luqi. 2022. “ImageGP: An Easy‐to‐Use Data Visualization Web Server for Scientific Researchers.” iMeta 1: e5. 10.1002/imt2.5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 193. Tettelin, Hervé , Masignani Vega, Cieslewicz Michael J., Donati Claudio, Medini Duccio, Ward Naomi L., Angiuoli Samuel V., et al. 2005. “Genome Analysis of Multiple Pathogenic Isolates of Streptococcus agalactiae: Implications for the Microbial “Pan‐Genome”.” Proceedings of the National Academy of Sciences 102: 13950–13955. 10.1073/pnas.0506758102 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 194. Medini, Duccio , Donati Claudio, Tettelin Hervé, Masignani Vega, and Rappuoli Rino. 2005. “The Microbial Pan‐Genome.” Current Opinion in Genetics & Development 15: 589–594. 10.1016/j.gde.2005.09.006 [DOI] [PubMed] [Google Scholar]
  • 195. Saxena, Priya , Rauniyar Shailabh, Thakur Payal, Singh Ram Nageena, Bomgni Alain, Alaba Mathew O., Tripathi Abhilash Kumar, Gnimpieba Etienne Z., Carol Lushbough, and Sani Rajesh K.. 2023. “Integration of Text Mining and Biological Network Analysis: Identification of Essential Genes in Sulfate‐Reducing Bacteria.” Frontiers in Microbiology 14: 1086021. 10.3389/fmicb.2023.1086021 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 196. Wu, Hao , Wang Dan, and Gao Feng. 2021. “Toward a High‐Quality Pan‐Genome Landscape of Bacillus subtilis by Removal of Confounding Strains.” Briefings in Bioinformatics 22: 1951–1971. 10.1093/bib/bbaa013 [DOI] [PubMed] [Google Scholar]
  • 197. Konstantinidis, Konstantinos T. , and Tiedje James M.. 2005. “Genomic Insights That Advance the Species Definition for Prokaryotes.” Proceedings of the National Academy of Sciences 102: 2567–2572. 10.1073/pnas.0409727102 [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Figure S1: The proportion of essential and non‐essential gene categories in COG classification.

IMT2-3-e157-s001.docx (84.6KB, docx)

Table S1: Distribution of essential genes under the pan‐genome classification.

IMT2-3-e157-s002.xlsx (19.5KB, xlsx)

Data Availability Statement

The DEG database is available at https://tubic.org/deg. Supporting Information materials (figures, tables, graphical abstract, slides, videos, Chinese translated version, and update materials) may be found in the online DOI or iMeta Science https://www.imeta.science/. The data that supports the findings of this study are available in the supplementary material of this article.


Articles from Imeta are provided here courtesy of Wiley

RESOURCES