Figure - PMC

2010 Sep 22;39(Database issue):D1095–D1102. doi: 10.1093/nar/gkq811

© The Author(s) 2010. Published by Oxford University Press.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Figure 1. Flowchart of the GreenPhylDB analyses. The input is a multi-FASTA file containing complete plant proteomes. In the first step, automatic clustering aggregates proteins into previously defined families; sequences that cannot be placed in any cluster are classified as orphans. The sequences in each cluster are then analyzed to annotate the clusters with cross-references (e.g. UniProtKB, PubMed, InterPro, MEME motifs, KEGG pathway data). Based on this information, clusters are manually curated to identify gene families. Finally, gene family sequences are analyzed with a phylogeny-based pipeline to infer ortholog relationships. The procedure can be repeated for each newly released genome using a lighter procedure, which ensures cumulative and safe growth of the database. The data are stored in the database and can be accessed through dedicated visualization tools, including a gene tree viewer, a gene family browser and ortholog extraction tools.
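The first step of the flowchart (assigning proteins to previously defined families and flagging the rest as orphans) can be illustrated with a minimal Python sketch. This is not the GreenPhylDB implementation: the caption does not name the clustering method, so the sketch substitutes a toy k-mer Jaccard similarity against one representative sequence per family. The function names, the family identifiers `FAM_A` and `FAM_B`, the sequences and the `threshold` value are all hypothetical and chosen for illustration only.

```python
from io import StringIO


def parse_fasta(handle):
    """Yield (identifier, sequence) pairs from a multi-FASTA stream."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first token of the header as the identifier.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def kmers(seq, k=3):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def assign_to_families(proteins, representatives, threshold=0.5, k=3):
    """Place each protein in its best-matching family, or mark it an orphan.

    Toy stand-in for the clustering step: similarity is k-mer Jaccard
    against a single representative per family (an assumption, not the
    method used by GreenPhylDB).
    """
    rep_kmers = {fam: kmers(seq, k) for fam, seq in representatives.items()}
    clusters = {fam: [] for fam in representatives}
    orphans = []
    for name, seq in proteins:
        seq_kmers = kmers(seq, k)
        best_fam, best_score = None, 0.0
        for fam, fam_kmers in rep_kmers.items():
            score = jaccard(seq_kmers, fam_kmers)
            if score > best_score:
                best_fam, best_score = fam, score
        if best_fam is not None and best_score >= threshold:
            clusters[best_fam].append(name)
        else:
            orphans.append(name)
    return clusters, orphans


# Hypothetical family representatives and a tiny multi-FASTA input.
representatives = {
    "FAM_A": "MKTAYIAKQRQISFVKSHFSRQ",
    "FAM_B": "MSDNELLQKLAEEGGWTPLHLA",
}
fasta_text = (
    ">p1 toy protein\nMKTAYIAKQRQISFVKSHFSRQ\n"
    ">p2\nMSDNELLQKLAEEGGWTPLHLA\n"
    ">p3\nWWWWWWWWCCCCCCCC\n"
)

clusters, orphans = assign_to_families(
    parse_fasta(StringIO(fasta_text)), representatives
)
print(clusters)  # {'FAM_A': ['p1'], 'FAM_B': ['p2']}
print(orphans)   # ['p3']
```

In a real setting the similarity function would be replaced by a proper sequence comparison method, and the resulting clusters would feed the annotation, curation and phylogenetic steps described in the caption.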