Author manuscript; available in PMC: 2016 Aug 1.
Published in final edited form as: Biochim Biophys Acta. 2015 Apr 18;1854(8):1019–1037. doi: 10.1016/j.bbapap.2015.04.015

Table 3.

Sequence annotations included as node attributes in SSNs produced by EFI-EST.

| Node Attribute | Description |
| --- | --- |
| ACC¹ | UniProt accession(s) |
| Uniprot_ID | UniProt ID(s) |
| GN | gene name(s) |
| GI | GI numbers |
| STATUS | reviewed – manually annotated, in Swiss-Prot; unreviewed – automatically annotated, in TrEMBL |
| Description | protein name(s)/annotation(s) in UniProtKB |
| SwissProt_Description | protein name(s)/annotation(s) in UniProtKB for SwissProt reviewed entries |
| IPRO | InterPro family(ies) |
| PFAM | Pfam family(ies) |
| PDB | Protein Data Bank entry |
| CAZY | Carbohydrate-Active enZYmes (CAZy) family name(s) |
| EC | EC number(s) |
| GO | Gene Ontology classification(s) |
| Sequence_Length | number(s) of amino acid residues |
| Domain | domain of life to which the organism(s) belong(s) |
| PHYLUM | Phylogenetic phylum/phyla of the organism(s) |
| CLASS | Phylogenetic class(es) of the organism(s) |
| ORDER | Phylogenetic order(s) of the organism(s) |
| FAMILY | Phylogenetic family(ies) of the organism(s) |
| GENUS | Phylogenetic genus/genera of the organism(s) |
| SPECIES | Phylogenetic species of the organism(s) |
| Organism | organism genus/genera and species |
| Taxonomy_ID | NCBI taxonomy identifier(s) |
| HMP_Body_Site | location(s) of organism(s) in/on the body, if human microbiome organism |
| HMP_Oxygen | oxygen requirement(s), if human microbiome organism |
| EFI_ID | Enzyme Function Initiative database ID(s) |
| GDNA | availability of gDNA(s) at EFI Protein Core |
| Shared name | Full network – UniProt accession; Rep Node network – UniProt accession for the longest sequence in the representative node |
| name | UniProt accession for the longest sequence in the representative node |
| Cluster Size¹ | number of proteins represented by the representative node |

¹ Representative node SSNs only
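As an illustration of how the attributes in Table 3 might be used once an SSN is loaded, the following is a minimal Python sketch that filters and tallies nodes by attribute. It assumes each node is already available as a dictionary keyed by the attribute names in the table; how the network file is parsed is not covered here. The helper names (`as_list`, `nodes_with`, `tally`), the accessions, and all attribute values are hypothetical placeholders, not output from EFI-EST. Because a representative node can carry several values for one attribute, single values and lists are handled uniformly.

```python
from collections import Counter


def as_list(value):
    """Normalize an attribute value to a list (a rep node may hold several values)."""
    if value is None:
        return []
    if isinstance(value, (list, tuple, set)):
        return list(value)
    return [value]


def nodes_with(nodes, attribute, wanted):
    """Return the nodes for which any value of `attribute` equals `wanted`."""
    return [n for n in nodes if wanted in as_list(n.get(attribute))]


def tally(nodes, attribute):
    """Count how many nodes carry each distinct value of `attribute`."""
    counts = Counter()
    for n in nodes:
        # set() so a node is counted once per value, even if the value repeats
        for v in set(as_list(n.get(attribute))):
            counts[v] += 1
    return counts


# Placeholder nodes: accessions and values are invented for illustration only.
nodes = [
    {"Shared name": "ACC_A", "STATUS": "reviewed",
     "PHYLUM": "Proteobacteria", "Sequence_Length": 310},
    {"Shared name": "ACC_B", "STATUS": "unreviewed",
     "PHYLUM": ["Proteobacteria", "Firmicutes"], "Cluster Size": 2},
    {"Shared name": "ACC_C", "STATUS": "unreviewed", "PHYLUM": "Firmicutes"},
]

if __name__ == "__main__":
    reviewed = nodes_with(nodes, "STATUS", "reviewed")
    print([n["Shared name"] for n in reviewed])
    print(dict(tally(nodes, "PHYLUM")))
```

The same two helpers work for any attribute in the table, for example selecting nodes with a given PFAM family or tallying nodes by HMP_Body_Site; the exact strings stored in each attribute should be checked against the actual network file.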