2005 Jun 27;33(Web Server issue):W284–W288. doi: 10.1093/nar/gki418

Table 1.

The databases used by the FFAS03 server

| Database | Source of data | Preparation |
| --- | --- | --- |
| NR85S (sequences) | NCBI, SEED | Protein sequences from the NCBI NR database and predicted open reading frames from unfinished bacterial genomes (kindly provided by Ross Overbeek), clustered at 85% sequence identity with the CD-HIT program (15). Regions of low complexity are masked with SEG (16). |
| PDB (profiles) | Protein Data Bank | FFAS profiles of all unique proteins (clustered at the 99% identity level) from the PDB (17), including prereleased entries. |
| PFAM (profiles) | PFAM website | FFAS profiles of all PFAM (18) domains longer than 25 residues. |
| COG (profiles) | NCBI | FFAS profiles of all domains from the COG database (19) longer than 25 residues. |
| SCOP (profiles) | SCOP–ASTRAL website | FFAS profiles of SCOP domain sequences with <40% sequence identity to each other. SCOP protein sequences clustered at 40% sequence identity were downloaded from the ASTRAL website (20). |
| JCSG (profiles) | JCSG website | FFAS profiles of all sequences of active targets of the Joint Center for Structural Genomics (21). |
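
The PFAM and COG rows above keep only domains longer than 25 residues. A minimal sketch of such a length filter is shown below. It is an illustration only, not the FFAS03 pipeline: the FASTA input format, the function names `parse_fasta` and `filter_by_length`, and the example sequences are all assumptions introduced here.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Threshold taken from Table 1: keep domains longer than 25 residues.
MIN_LENGTH = 25


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines (hypothetical helper)."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(
    records: Iterable[Tuple[str, str]], min_length: int = MIN_LENGTH
) -> Dict[str, str]:
    """Keep records whose sequence is strictly longer than min_length residues."""
    return {name: seq for name, seq in records if len(seq) > min_length}


# Made-up example records, for illustration only.
example = [
    ">domain_short",
    "MKTAYIAKQR",
    ">domain_long",
    "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
    ">domain_exactly_25",
    "A" * 25,
]

kept = filter_by_length(parse_fasta(example))
print(sorted(kept))
```

The comparison is strict, so a domain of exactly 25 residues is dropped, which matches the "longer than 25 residues" wording in the table.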