2009 Aug 27;10(Suppl 8):S4. doi: 10.1186/1471-2105-10-S8-S4

Table 3.

Biological categories for the interpretation of functional annotations. The interpretation of extracted annotations is based on the automatic assignment of semantic labels to the arguments of a PAS. Because a comprehensive ontology is not available, two categorisation schemes are tested in this study. The first scheme (MAN) was designed from an analysis of relevant MEDLINE sentences for residue annotation (bottom-up approach). Alternatively, the categories in the feature table of UniProtKB (FEAT) can be reused (top-down approach). Both categorisation schemes reflect concepts of biological interest. However, the bottom-up approach has the advantage that its categories are data-driven, whereas in a top-down approach some listed categories may have no examples in natural language text, and other categories may be missing from the scheme.

| MAN category | MAN definition | FEAT category | FEAT definition |
| --- | --- | --- | --- |
| STR_COMP | Structure component. Class denoting concepts that represent pieces and parts of the protein structure. | DOMAIN | Extent of a domain, which is defined as a specific combination of secondary structures organised into a characteristic three-dimensional structure or fold. |
| | | MOTIF | Short (up to 20 amino acids) sequence motif of biological interest. |
| | | TOPO_DOM | Topological domain. |
| | | CHAIN | Extent of a polypeptide chain in the mature protein. |
| | | TRANSMEM | Extent of a transmembrane region. |
| | | COILED | Extent of a coiled-coil region. |
| CHEM_MOD | Chemical modification. Class denoting changes to the protein sequence and the chemical composition. | VARIANT | Authors report that sequence variants exist. |
| | | MOD_RES | Posttranslational modification of a residue. |
| | | PEPTIDE | Extent of a released active peptide. |
| | | VAR_SEQ | Description of sequence variants produced by alternative splicing, alternative promoter usage, alternative initiation and ribosomal frameshifting. |
| | | LIPID | Covalent binding of a lipid moiety. |
| | | CARBOHYD | Glycosylation site. |
| STR_MOD | Structural modification. Class denoting the changes to the protein structure without changes to the chemical composition. | REGION | Extent of a region of interest in the sequence. |
| | | SITE | Any interesting single amino-acid site on the sequence, that is not defined by another feature key. |
| BINDING | Binding type. Class denoting different physico-chemical forces leading to a bond formation between a protein structure component and a chemical entity. | BINDING | Binding site for any chemical group (co-enzyme, prosthetic group, etc.). |
| | | METAL | Binding site for a metal ion. |
| | | DISULFID | Disulfide bond. |
| | | CROSSLNK | Posttranslationally formed amino acid bonds. |
| | | DNA_BIND | Extent of a DNA-binding region. |
| | | NP_BIND | Extent of a nucleotide phosphate-binding region. |
| | | ZN_FING | Extent of a zinc finger region. |
| | | CA_BIND | Extent of a calcium-binding region. |
| ENZ_ACT | Enzymatic activity. Types of enzymatic reactions as a subpart to protein functions. | ACT_SITE | Amino acid(s) involved in the activity of an enzyme. |
| CELL | Cellular phenotype. Class denoting different cellular phenotypes that can be affected by structural or compositional changes of a protein. | N/A | |
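
The correspondence between the two schemes in the table above can be written down as a simple lookup structure. The following is a minimal sketch in Python, assuming that each FEAT feature key belongs to the MAN category in whose row group it appears; that alignment is read off the table layout and is not stated separately in the caption. The names `MAN_TO_FEAT`, `build_feat_to_man` and `man_category_for` are hypothetical and chosen only for illustration.

```python
# Sketch only: mapping transcribed from the table layout (an assumption).
# Each MAN category is listed with the FEAT feature keys in its row group.
MAN_TO_FEAT = {
    "STR_COMP": ["DOMAIN", "MOTIF", "TOPO_DOM", "CHAIN", "TRANSMEM", "COILED"],
    "CHEM_MOD": ["VARIANT", "MOD_RES", "PEPTIDE", "VAR_SEQ", "LIPID", "CARBOHYD"],
    "STR_MOD": ["REGION", "SITE"],
    "BINDING": ["BINDING", "METAL", "DISULFID", "CROSSLNK",
                "DNA_BIND", "NP_BIND", "ZN_FING", "CA_BIND"],
    "ENZ_ACT": ["ACT_SITE"],
    "CELL": [],  # listed as N/A: no FEAT counterpart in the table
}


def build_feat_to_man(man_to_feat):
    """Invert the mapping so a FEAT key can be resolved to its MAN category."""
    inverse = {}
    for man_category, feat_keys in man_to_feat.items():
        for feat_key in feat_keys:
            if feat_key in inverse:
                # A FEAT key listed under two MAN categories would be ambiguous.
                raise ValueError(f"duplicate FEAT key: {feat_key}")
            inverse[feat_key] = man_category
    return inverse


FEAT_TO_MAN = build_feat_to_man(MAN_TO_FEAT)


def man_category_for(feat_key):
    """Return the MAN category for a FEAT key, or None if the key is not in the table."""
    return FEAT_TO_MAN.get(feat_key)


if __name__ == "__main__":
    print(man_category_for("METAL"))
    print(man_category_for("ACT_SITE"))
```

Note that `BINDING` is both a MAN category and a FEAT feature key, so the two namespaces are kept in separate dictionaries. A key that does not appear in the table simply resolves to `None`.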