Nucleic Acids Research. 2011 Nov 9;40(Database issue):D901–D906. doi: 10.1093/nar/gkr986

OGEE: an online gene essentiality database

Wei-Hua Chen 1, Pablo Minguez 1, Martin J Lercher 2, Peer Bork 1,3,*
PMCID: PMC3245054  PMID: 22075992

Abstract

OGEE is an Online GEne Essentiality database. Its main purpose is to enhance our understanding of the essentiality of genes. This is achieved by collecting not only experimentally tested essential and non-essential genes, but also associated gene features such as expression profiles, duplication status, conservation across species, evolutionary origins and involvement in embryonic development. We focus on large-scale experiments and complement our data with text-mining results. Genes are organized into data sets according to their sources. Genes with variable essentiality status across data sets are tagged as conditionally essential, highlighting the complex interplay between gene functions and environments. Linked tools allow the user to compare gene essentiality among different gene groups, or compare features of essential genes to non-essential genes, and visualize the results. OGEE is freely available at http://ogeedb.embl.de.

INTRODUCTION

Large-scale efforts to link genotypes to phenotypes are among the most important and challenging tasks in the post-omics era. Essential genes, whose removal results in inviability or infertility, are of particular interest because of their theoretical and practical applications, for example, in studying the robustness of a biological system (1), defining a minimal set of genes for a free-living organism (2) and identifying effective drug targets (3).

Essentiality often depends on the environment (4), especially for bacterial genes, or for eukaryotic genes that were tested in cell lines. For example, genes coding for proteins involved in the biosynthesis of amino acids, nucleic acids and vitamins are essential for cell survival in minimal media, but not in rich media where the corresponding metabolites are supplied (4). However, so far the concept of ‘conditional essentiality’ has not been widely adopted by existing essential gene databases.

Gene essentiality does not only depend on individual gene functions, but can also be affected by global factors. Duplicated genes are typically less essential than the genomic average because they often overlap in gene function and expression profile; genes forming hubs in PPI networks (those connected to many direct neighbors) are more often essential (5); and genes involved in development and tissue differentiation in higher eukaryotes are also more likely to be essential (6). However, given the complex nature of biological systems, gene essentiality is often affected by multiple factors simultaneously; studying one factor at a time may generate conflicting results among species. For example, in a biased data set, mouse duplicates and singletons were reported to be equally essential (7), which disagreed with theoretical expectations and experimental findings in yeast (8). Experimental biases could only partially explain the contradiction (6). In a previous study, we showed that considering both the duplication status of genes and their evolutionary origins could solve the discrepancies (Chen, W.-H., Trachana, K., Lercher, M.J., and Bork, P., unpublished data).

Our understanding of gene essentiality is still limited. Progress can be enhanced by collecting the following information into a central database: (i) tested essential and non-essential genes, allowing comparisons between the two groups; (ii) essentiality information obtained from large-scale studies, facilitating genome-wide analyses, as well as more precise information from small-scale studies, more suited for gene-centered biological research; (iii) additional gene features that are either known or hypothesized to influence gene essentiality. Ideally, such a database should come with a set of tools that allow the user to systematically explore and analyze the raw data.

Existing essential gene databases either include data for only a specific species (9) or contain only essential genes (10). This provided the motivation to develop OGEE, an online gene essentiality database that combines points (i)–(iii) with a set of tools for large-scale data analysis. This should make OGEE useful to both biologists and bioinformaticians.

DATA GENERATION

Collection and organization of genes tested for essentiality

We collected 91 436 protein-coding genes from 8 eukaryotic and 16 prokaryotic organisms tested for essentiality in genome-wide studies (2,3,9,11–37). For data sets in which both essential and non-essential genes are publicly available, the genomic proportion of essential genes (PE) ranges from ∼2% [data set 347 (11) of Drosophila melanogaster] to 66.04% [Aspergillus fumigatus Af293, data set 361 (3)] in eukaryotes and from 5.46% [Bacillus subtilis 168, data set 352 (17)] to 80% [Mycoplasma genitalium G37, data set 357 (2)] in prokaryotes. In eukaryotes, overall PE appears to be most strongly influenced by organism complexity and by the methods employed for testing, in particular by the experimental conditions surveyed. Gene knockout techniques [data sets 349 (14) and 350 (37) of Mus musculus and Saccharomyces cerevisiae, respectively] generate higher PE than siRNA-based methods [data sets 348 (12) and 347 (11) of Homo sapiens and D. melanogaster, respectively]. Multi-cellular organisms have higher PE than single-celled eukaryotes (M. musculus versus S. cerevisiae) when similar techniques are used. Cell lines yield lower PE than in vivo tests of the same multi-cellular organism [data sets 347 (11) and 363 (25) of D. melanogaster]. In prokaryotes, overall PE is affected by details of the survey technology as well as by genome size and life style (free living versus parasitic).
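
As a minimal sketch of how PE is obtained for a single data set, assume each tested gene carries a simple essential or non-essential label. The function name, the "E"/"NE" encoding and the toy labels below are illustrative assumptions, not part of OGEE.

```python
def proportion_essential(statuses):
    """Return PE: the fraction of tested genes labelled essential.

    `statuses` is an iterable of strings, "E" (essential) or "NE"
    (non-essential); this encoding is an assumption for illustration.
    """
    statuses = list(statuses)
    if not statuses:
        raise ValueError("PE is undefined for an empty data set")
    return sum(1 for s in statuses if s == "E") / len(statuses)


# Toy data set: 2 essential genes out of 8 tested.
toy = ["E", "NE", "NE", "E", "NE", "NE", "NE", "NE"]
print(f"PE = {proportion_essential(toy):.2%}")  # PE = 25.00%
```

Only genes that were actually tested enter the denominator, which is why PE can be compared only across data sets that report both essential and non-essential genes.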

In addition to the collection of large-scale data, we also employed text-mining to obtain 3543 genes from 38 species that were tested in small-scale studies. We applied a customized text-mining pipeline based on the one used for data collection by the STRING database (38). We searched for a set of terms related to essentiality (Supplementary Table S1) in PubMed abstracts (as of February 2011), manually checked the results and removed some false positives. We divided the identified genes into essential and non-essential genes according to their associated terms. Due to a strong reporting bias, most genes identified in this way were essential. Of the 3543 genes, 3168 (89.4%) overlapped with those tested in genome-wide studies. Please note that, although substantial efforts have been made to improve the quality of the text-mining data, a significant fraction of false-positive results may remain; please use these data with caution.

We organized genes in each organism into distinct data sets according to the data source; a gene can have multiple entries within a data set or in different data sets. Two entries of a gene would be included in two distinct data sets if the gene was tested in a large-scale study as well as in a small-scale study; if a gene was tested by several small-scale studies, multiple entries of this gene would be included in the text-mining data set, with each entry corresponding to a distinct PubMed record. A gene was marked as ‘conditionally essential’ if multiple entries for this gene exist in OGEE but essentiality status varies among entries (see, e.g. the essentiality status of gene ‘FBgn0001112’ in Figure 1 and the supporting evidence in Figure 2).
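
The tagging rule described above can be sketched as follows. This is an illustration under assumed names, not OGEE's implementation: the tuple layout, the "E"/"NE" labels and the gene and data set identifiers are all hypothetical.

```python
from collections import defaultdict


def summarize_essentiality(entries):
    """Collapse per-entry calls into one status per gene.

    `entries` is an iterable of (gene_id, dataset_id, status) tuples with
    status "E" or "NE" (hypothetical encoding). A gene whose entries
    disagree is tagged "conditional"; otherwise it keeps the single
    status observed across all of its entries.
    """
    seen = defaultdict(set)
    for gene_id, _dataset_id, status in entries:
        seen[gene_id].add(status)
    return {
        gene: next(iter(statuses)) if len(statuses) == 1 else "conditional"
        for gene, statuses in seen.items()
    }


entries = [
    ("geneA", "large_scale_1", "E"),
    ("geneA", "text_mining", "NE"),   # disagrees with the entry above
    ("geneB", "large_scale_1", "NE"),
    ("geneB", "text_mining", "NE"),
    ("geneC", "large_scale_1", "E"),  # single entry, status kept as-is
]
print(summarize_essentiality(entries))
# {'geneA': 'conditional', 'geneB': 'NE', 'geneC': 'E'}
```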

Figure 1. Interface of the ‘Browse’ module.

Figure 2. Extra gene features shown in a popup window. This window appears when the user clicks a locus ID in the ‘Browse’ or ‘Search’ modules.

Collection of gene features influencing gene essentiality

We collected several gene features that are known to influence gene essentiality, encompassing duplication status, connectivity in protein–protein interaction (PPI) networks (defined as the number of direct neighbors) (5) and evolutionary origins of genes (defined as the age of the evolutionarily most distant species group where homologs can be found (39); see the web Q&As for more details).
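
Connectivity as defined here (the number of direct neighbors) corresponds to node degree in an undirected network. The sketch below assumes the PPI network is given as a list of protein pairs; the protein names are placeholders, and ignoring self-interactions and duplicate pairs is our assumption rather than something stated for OGEE.

```python
from collections import defaultdict


def connectivity(edges):
    """Count distinct direct neighbors per protein in an undirected network."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # assumption: self-interactions do not count
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(nbrs) for node, nbrs in neighbors.items()}


# ("P3", "P1") repeats ("P1", "P3") and is therefore counted only once.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P3", "P1")]
print(connectivity(edges))
# {'P1': 3, 'P2': 2, 'P3': 2, 'P4': 1}
```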

We also collected several extra features that might influence gene essentiality, including the number of homologous genes (family size) in the same genome, and the earliest expression stage during embryonic development [for multi-cellular organisms only; data was obtained from the NCBI UniGene database (40)]. It is known that duplicates are often less essential than singletons. This may be due to a range of factors, including the ability of duplicates to provide a functional backup for each other, lower expression abundances of duplicates (41,42), or a lower duplicability of the genes in certain important functional classes (43). It is thus conceivable that duplicates in large gene families are even less likely to be essential than duplicates in smaller families. In multicellular organisms, embryonic development is a tightly regulated chain of events. Disruption of genes expressed earlier may affect all subsequent events, thereby causing more severe phenotypes in the host. Both gene family size and earliest expression in development are indeed correlated with PE in mouse (Figure 3A and B).

Figure 3. Screenshots taken from the ‘Analyze’ module. With integrated tools, the user can easily explore and analyze the collected data, including the visualization of results. Shown here are the results of the following analyses: (A) the proportion of essential genes (PE) as a function of family size (number of homologous genes within the genome) in mouse, (B) PE as a function of the earliest expression stage during mouse development, (C) the effects of gene duplication status and involvement in development on gene essentiality in Caenorhabditis elegans and (D) the effects of gene connectivity and involvement in development on gene essentiality in C. elegans.

USAGE OF OGEE

The functionalities of OGEE have been divided into six different modules (tabs): ‘Summary’, ‘Browse’, ‘Search’, ‘Analyze’, ‘Download’ and ‘Q&As’. We provide inline help messages displayed as ‘tooltips’ within each module; we also provide detailed help contents and answers to frequently asked questions in ‘Q&As’. Below, we introduce several of the most interesting features of OGEE.

Viewing details of individual genes

In the ‘Browse’ and ‘Search’ modules, only some gene features, such as essentiality, duplication status and data sources, are displayed by default (Figure 1). To view more details of an individual gene, the user can mouse over or click its locus name; a popup window containing all available information for that gene will appear. As shown in Figure 2, the extra information includes the gene description, the type of evidence for gene essentiality with links to the original data sources, involvement in development, evolutionary origin (phyletic age), connectivity in the PPI network, and nucleotide and protein sequences. Links to other databases, including Gene Ontology (44), EGGNOG2 (45), NCBI taxonomy and NCBI BLAST (40), are also integrated (Figure 2). For example, if the gene of interest is involved in development, the corresponding GO IDs and terms are shown; clicking a GO ID redirects the user to the corresponding page at the Gene Ontology website. Similarly, clicking the organism name redirects the user to the corresponding NCBI taxonomy page, and clicking an NCBI BLAST link opens the NCBI BLAST website in a new window.

The popup window also features in-site data integration. For example, if a query gene has orthologs in other species collected by OGEE, not only the corresponding orthologs [based on EGGNOG2 (45)], but also their essentiality status will be shown (Figure 2). This way, the conservation of a gene as well as the conservation of its essentiality across species can be checked easily.

Analyzing collected gene features using linked tools

One of the most interesting features of OGEE is that users can analyze the data systematically and visualize the results with integrated tools from the ‘Analyze’ module. With ‘Analyze’, the user can divide genes into distinct groups according to one of the available features, calculate the proportion of essential genes (PE) in each group and then plot the results as either a bar chart or a line plot. To illustrate this feature, Figures 3A and B show average mouse PE values as functions of gene family size and the earliest expression stage during development, respectively; both factors affect PE values globally.

Users can also investigate two gene features simultaneously to study their effects individually or in combination. For example, the user can first divide genes into developmental and non-developmental genes, and then further divide each group into duplicates and singletons (Figure 3C). Similarly, one could first divide genes according to their connectivity in the PPI network and then according to their involvement in development (Figure 3D).
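
A two-feature analysis of this kind amounts to computing PE within each combination of the two groupings. The following sketch shows the idea on toy records; the field names ("developmental", "duplicated", "essential") and the data are hypothetical, and this is not the code behind the ‘Analyze’ module.

```python
from collections import defaultdict


def pe_by_two_features(genes):
    """Return PE for each (development, duplication) combination.

    `genes` is an iterable of dicts with boolean values under the
    hypothetical keys "developmental", "duplicated" and "essential".
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [n_essential, n_total]
    for g in genes:
        key = (
            "developmental" if g["developmental"] else "non-developmental",
            "duplicate" if g["duplicated"] else "singleton",
        )
        counts[key][0] += int(g["essential"])
        counts[key][1] += 1
    return {key: ess / total for key, (ess, total) in counts.items()}


def make(dev, dup, n_essential, n_total):
    """Build toy records: the first `n_essential` of `n_total` are essential."""
    return [
        {"developmental": dev, "duplicated": dup, "essential": i < n_essential}
        for i in range(n_total)
    ]


toy = (
    make(True, False, 3, 4)     # developmental singletons
    + make(True, True, 1, 2)    # developmental duplicates
    + make(False, False, 1, 4)  # non-developmental singletons
    + make(False, True, 0, 2)   # non-developmental duplicates
)
for group, pe in pe_by_two_features(toy).items():
    print(group, pe)
```

Keeping the two features as a joint key, rather than averaging over one of them, is what allows their individual and combined effects to be separated.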

By default, predefined breaks by which genes can be divided into distinct groups and matching labels are used. However, if desired, the user can change the default settings by providing customized breaks and labels.
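
To make the role of breaks and labels concrete, the sketch below bins genes by a numeric feature (here, family size) and reports PE per bin. The break semantics (each break is the inclusive lower bound of the next group), the labels and the toy values are assumptions chosen for illustration; OGEE's own defaults may be defined differently.

```python
from bisect import bisect_right


def pe_by_group(genes, breaks, labels):
    """Return {label: PE} for genes binned by a numeric feature.

    `genes` is an iterable of (feature_value, is_essential) pairs.
    `breaks` is an ascending list of k cut points that defines k + 1
    groups; a value v falls into group bisect_right(breaks, v), so each
    break is the inclusive lower bound of the following group.
    Groups without genes get None instead of a PE value.
    """
    if len(labels) != len(breaks) + 1:
        raise ValueError("need exactly one more label than breaks")
    essential = [0] * len(labels)
    total = [0] * len(labels)
    for value, is_essential in genes:
        i = bisect_right(breaks, value)
        total[i] += 1
        essential[i] += int(is_essential)
    return {
        label: (essential[i] / total[i] if total[i] else None)
        for i, label in enumerate(labels)
    }


# Toy (family_size, is_essential) pairs.
genes = [
    (1, True), (1, True), (1, False), (1, False),
    (2, False), (3, True), (3, False), (4, False),
    (6, False), (9, False),
]
print(pe_by_group(genes, breaks=[2, 5], labels=["1", "2-4", "5+"]))
# {'1': 0.5, '2-4': 0.25, '5+': 0.0}
```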

Open access to all data contained in OGEE

Our data are freely accessible to all academic users. We provide an SQL-dump file of the whole database as well as several selected data sections as tab-delimited flat files in the ‘Download’ module. Users can also download individual gene essentiality data sets for a selected species in ‘Browse’ and raw data used in data analysis in ‘Analyze’.
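
Tab-delimited flat files of this kind can be read with standard tooling. The sketch below uses Python's csv module on an in-memory sample; the column names and values are hypothetical, since the layout of the actual download files is not described here and should be checked against the files themselves.

```python
import csv
import io

# Hypothetical excerpt of a tab-delimited download; the real OGEE column
# names and values may differ.
SAMPLE = (
    "locus\tdataset\tessential\n"
    "geneA\t101\tE\n"
    "geneB\t101\tNE\n"
    "geneC\t102\tE\n"
)


def load_essentiality(handle):
    """Parse a tab-delimited essentiality table into a list of dicts."""
    reader = csv.DictReader(handle, delimiter="\t")
    return list(reader)


rows = load_essentiality(io.StringIO(SAMPLE))
essential = [r["locus"] for r in rows if r["essential"] == "E"]
print(essential)  # ['geneA', 'geneC']
```

For a real file, the same function can be given a handle from open(path, newline="") in place of the StringIO object.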

CONCLUSIONS

OGEE introduces several novel features compared with existing gene essentiality databases. For example, (i) OGEE provides both essential and non-essential genes from large-scale as well as small-scale studies; (ii) OGEE introduces ‘conditional essentiality’ to reflect the complexity of biological systems and the interplay between gene functions and environments; (iii) OGEE lists a variety of gene features known or suspected to influence gene essentiality; and (iv) OGEE provides a set of online tools to explore and analyze the data and to visualize the results. We thus believe that OGEE should be highly useful to biologists and bioinformaticians studying gene essentiality, whether focusing on individual genes or on genome-wide analyses.

FUTURE DIRECTIONS

Future development of OGEE will include the incorporation of essential non-coding genes, and the possibility for users to submit additional essentiality data.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online: Supplementary Table S1. Key words used to search for essential and non-essential genes in PubMed abstracts.

FUNDING

Funding for open access charge: BMBF (Bundesministerium für Bildung und Forschung) MedSys grant #0315450C to Peer Bork.

Conflict of interest statement. None declared.

REFERENCES

1. Keller PJ, Knop M. Evolution of mutational robustness in the yeast genome: a link to essential genes and meiotic recombination hotspots. PLoS Genet. 2009;5:e1000533. doi: 10.1371/journal.pgen.1000533.
2. Glass JI, Assad-Garcia N, Alperovich N, Yooseph S, Lewis MR, Maruf M, Hutchison CA, III, Smith HO, Venter JC. Essential genes of a minimal bacterium. Proc. Natl Acad. Sci. USA. 2006;103:425–430. doi: 10.1073/pnas.0510013103.
3. Hu W, Sillaots S, Lemieux S, Davison J, Kauffman S, Breton A, Linteau A, Xin C, Bowman J, Becker J, et al. Essential gene identification and drug target prioritization in Aspergillus fumigatus. PLoS Pathog. 2007;3:e24. doi: 10.1371/journal.ppat.0030024.
4. D'Elia MA, Pereira MP, Brown ED. Are essential genes really essential? Trends Microbiol. 2009;17:433–438. doi: 10.1016/j.tim.2009.08.005.
5. Jeong H, Mason SP, Barabasi AL, Oltvai ZN. Lethality and centrality in protein networks. Nature. 2001;411:41–42. doi: 10.1038/35075138.
6. Makino T, Hokamp K, McLysaght A. The complex relationship of gene duplication and essentiality. Trends Genet. 2009;25:152–155. doi: 10.1016/j.tig.2009.03.001.
7. Liao B-Y, Zhang J. Mouse duplicate genes are as essential as singletons. Trends Genet. 2007;23:378–381. doi: 10.1016/j.tig.2007.05.006.
8. Gu Z, Steinmetz LM, Gu X, Scharfe C, Davis RW, Li WH. Role of duplicate genes in genetic robustness against null mutations. Nature. 2003;421:63–66. doi: 10.1038/nature01198.
9. Hashimoto M, Ichimura T, Mizoguchi H, Tanaka K, Fujimitsu K, Keyamura K, Ote T, Yamakawa T, Yamazaki Y, Mori H, et al. Cell size and nucleoid organization of engineered Escherichia coli cells with a reduced genome. Mol. Microbiol. 2005;55:137–149. doi: 10.1111/j.1365-2958.2004.04386.x.
10. Zhang R, Lin Y. DEG 5.0, a database of essential genes in both prokaryotes and eukaryotes. Nucleic Acids Res. 2009;37:D455–D458. doi: 10.1093/nar/gkn858.
11. Boutros M, Kiger AA, Armknecht S, Kerr K, Hild M, Koch B, Haas SA, Paro R, Perrimon N. Genome-wide RNAi analysis of growth and viability in Drosophila cells. Science. 2004;303:832–835. doi: 10.1126/science.1091266.
12. Silva JM, Marran K, Parker JS, Silva J, Golding M, Schlabach MR, Elledge SJ, Hannon GJ, Chang K. Profiling essential genes in human mammary cells by multiplex RNAi screening. Science. 2008;319:617–620. doi: 10.1126/science.1149185.
13. Kamath RS, Fraser AG, Dong Y, Poulin G, Durbin R, Gotta M, Kanapin A, Le Bot N, Moreno S, Sohrmann M, et al. Systematic functional analysis of the Caenorhabditis elegans genome using RNAi. Nature. 2003;421:231–237. doi: 10.1038/nature01278.
14. Blake JA, Bult CJ, Kadin JA, Richardson JE, Eppig JT, Mouse Genome Database G. The Mouse Genome Database (MGD): premier model organism resource for mammalian genomics and genetics. Nucleic Acids Res. 2011;39:D842–D848. doi: 10.1093/nar/gkq1008.
15. Dwight SS, Harris MA, Dolinski K, Ball CA, Binkley G, Christie KR, Fisk DG, Issel-Tarver L, Schroeder M, Sherlock G, et al. Saccharomyces Genome Database (SGD) provides secondary gene annotation using the Gene Ontology (GO). Nucleic Acids Res. 2002;30:69–72. doi: 10.1093/nar/30.1.69.
16. de Berardinis V, Vallenet D, Castelli V, Besnard M, Pinet A, Cruaud C, Samair S, Lechaplais C, Gyapay G, Richez C, et al. A complete collection of single-gene deletion mutants of Acinetobacter baylyi ADP1. Mol. Syst. Biol. 2008;4:174. doi: 10.1038/msb.2008.10.
17. Kobayashi K, Ehrlich SD, Albertini A, Amati G, Andersen KK, Arnaud M, Asai K, Ashikaga S, Aymerich S, Bessieres P, et al. Essential Bacillus subtilis genes. Proc. Natl Acad. Sci. USA. 2003;100:4678–4683. doi: 10.1073/pnas.0730515100.
18. Langridge GC, Phan MD, Turner DJ, Perkins TT, Parts L, Haase J, Charles I, Maskell DJ, Peters SE, Dougan G, et al. Simultaneous assay of every Salmonella Typhi gene using one million transposon mutants. Genome Res. 2009;19:2308–2316. doi: 10.1101/gr.097097.109.
19. Thanassi JA, Hartman-Neumann SL, Dougherty TJ, Dougherty BA, Pucci MJ. Identification of 113 conserved essential genes using a high-throughput gene disruption system in Streptococcus pneumoniae. Nucleic Acids Res. 2002;30:3152–3162. doi: 10.1093/nar/gkf418.
20. Chaudhuri RR, Allen AG, Owen PJ, Shalom G, Stone K, Harrison M, Burgis TA, Lockyer M, Garcia-Lara J, Foster SJ, et al. Comprehensive identification of essential Staphylococcus aureus genes using Transposon-Mediated Differential Hybridisation (TMDH). BMC Genomics. 2009;10:291. doi: 10.1186/1471-2164-10-291.
21. French CT, Lao P, Loraine AE, Matthews BT, Yu H, Dybvig K. Large-scale transposon mutagenesis of Mycoplasma pulmonis. Mol. Microbiol. 2008;69:67–76. doi: 10.1111/j.1365-2958.2008.06262.x.
22. Salama NR, Shepherd B, Falkow S. Global transposon mutagenesis and essential gene analysis of Helicobacter pylori. J. Bacteriol. 2004;186:7926–7935. doi: 10.1128/JB.186.23.7926-7935.2004.
23. Trepod CM, Mott JE. Elucidation of essential and nonessential genes in the Haemophilus influenzae Rd cell wall biosynthetic pathway by targeted gene disruption. Antimicrob. Agents Chemother. 2005;49:824–826. doi: 10.1128/AAC.49.2.824-826.2005.
24. Gallagher LA, Ramage E, Jacobs MA, Kaul R, Brittnacher M, Manoil C. A comprehensive transposon mutant library of Francisella novicida, a bioweapon surrogate. Proc. Natl Acad. Sci. USA. 2007;104:1009–1014. doi: 10.1073/pnas.0606713104.
25. Chen S, Zhang YE, Long M. New genes in Drosophila quickly become essential. Science. 2010;330:1682–1685. doi: 10.1126/science.1196380.
26. Meinke D, Muralla R, Sweeney C, Dickerman A. Identifying essential genes in Arabidopsis thaliana. Trends Plant Sci. 2008;13:483–491. doi: 10.1016/j.tplants.2008.06.003.
27. Amsterdam A, Nissen RM, Sun Z, Swindell EC, Farrington S, Hopkins N. Identification of 315 genes essential for early zebrafish development. Proc. Natl Acad. Sci. USA. 2004;101:12792–12797. doi: 10.1073/pnas.0403929101.
28. Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M, Wanner BL, Mori H. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol. Syst. Biol. 2006;2:2006.0008. doi: 10.1038/msb4100050.
29. Gerdes SY, Scholle MD, Campbell JW, Balazsi G, Ravasz E, Daugherty MD, Somera AL, Kyrpides NC, Anderson I, Gelfand MS, et al. Experimental determination and system level analysis of essential genes in Escherichia coli MG1655. J. Bacteriol. 2003;185:5673–5684. doi: 10.1128/JB.185.19.5673-5684.2003.
30. Kraemer PS, Mitchell A, Pelletier MR, Gallagher LA, Wasnick M, Rohmer L, Brittnacher MJ, Manoil C, Skerett SJ, Salama NR. Genome-wide screen in Francisella novicida for genes required for pulmonary and systemic infection in mice. Infect. Immun. 2009;77:232–244. doi: 10.1128/IAI.00978-08.
31. Akerley BJ, Rubin EJ, Novick VL, Amaya K, Judson N, Mekalanos JJ. A genome-scale analysis for identification of genes required for growth or survival of Haemophilus influenzae. Proc. Natl Acad. Sci. USA. 2002;99:966–971. doi: 10.1073/pnas.012602299.
32. Chalker AF, Minehart HW, Hughes NJ, Koretke KK, Lonetto MA, Brinkman KK, Warren PV, Lupas A, Stanhope MJ, Brown JR, et al. Systematic identification of selective essential genes in Helicobacter pylori by genome prioritization and allelic replacement mutagenesis. J. Bacteriol. 2001;183:1259–1268. doi: 10.1128/JB.183.4.1259-1268.2001.
33. Sassetti CM, Rubin EJ. Genetic requirements for mycobacterial survival during infection. Proc. Natl Acad. Sci. USA. 2003;100:12989–12994. doi: 10.1073/pnas.2134250100.
34. Liberati NT, Urbach JM, Miyata S, Lee DG, Drenkard E, Wu G, Villanueva J, Wei T, Ausubel FM. An ordered, nonredundant library of Pseudomonas aeruginosa strain PA14 transposon insertion mutants. Proc. Natl Acad. Sci. USA. 2006;103:2833–2838. doi: 10.1073/pnas.0511100103.
35. Knuth K, Niesalla H, Hueck CJ, Fuchs TM. Large-scale identification of essential Salmonella genes by trapping lethal insertions. Mol. Microbiol. 2004;51:1729–1744. doi: 10.1046/j.1365-2958.2003.03944.x.
36. Lamichhane G, Zignol M, Blades NJ, Geiman DE, Dougherty A, Grosset J, Broman KW, Bishai WR. A postgenomic method for predicting essential genes at subsaturation levels of mutagenesis: application to Mycobacterium tuberculosis. Proc. Natl Acad. Sci. USA. 2003;100:7213–7218. doi: 10.1073/pnas.1231432100.
37. Cherry JM, Ball C, Weng S, Juvik G, Schmidt R, Adler C, Dunn B, Dwight S, Riles L, Mortimer RK, et al. Genetic and physical maps of Saccharomyces cerevisiae. Nature. 1997;387:67–73.
38. Szklarczyk D, Franceschini A, Kuhn M, Simonovic M, Roth A, Minguez P, Doerks T, Stark M, Muller J, Bork P, et al. The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res. 2011;39:D561–D568. doi: 10.1093/nar/gkq973.
39. Wolf YI, Novichkov PS, Karev GP, Koonin EV, Lipman DJ. Inaugural article: the universal distribution of evolutionary rates of genes and distinct characteristics of eukaryotic genes of different apparent ages. Proc. Natl Acad. Sci. USA. 2009;106:7273–7280. doi: 10.1073/pnas.0901808106.
40. Sayers EW, Barrett T, Benson DA, Bolton E, Bryant SH, Canese K, Chetvernin V, Church DM, DiCuccio M, Federhen S, et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2011;39:D38–D51. doi: 10.1093/nar/gkq1172.
41. Qian W, Liao B-Y, Chang AY-F, Zhang J. Maintenance of duplicate genes and their functional redundancy by reduced expression. Trends Genet. 2010;26:425–430. doi: 10.1016/j.tig.2010.07.002.
42. Schrimpf SP, Weiss M, Reiter L, Ahrens CH, Jovanovic M, Malmstrom J, Brunner E, Mohanty S, Lercher MJ, Hunziker PE, et al. Comparative functional analysis of the Caenorhabditis elegans and Drosophila melanogaster proteomes. PLoS Biol. 2009;7:e48. doi: 10.1371/journal.pbio.1000048.
43. He X, Zhang J. Higher duplicability of less important genes in yeast genomes. Mol. Biol. Evol. 2006;23:144–151. doi: 10.1093/molbev/msj015.
44. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 2000;25:25–29. doi: 10.1038/75556.
45. Muller J, Szklarczyk D, Julien P, Letunic I, Roth A, Kuhn M, Powell S, von Mering C, Doerks T, Jensen LJ, et al. eggNOG v2.0: extending the evolutionary genealogy of genes with enhanced non-supervised orthologous groups, species and functional annotations. Nucleic Acids Res. 2010;38:D190–D195. doi: 10.1093/nar/gkp951.

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
