2015 Jan 5;6:1. doi: 10.1186/2041-1480-6-1

Table 1.

Extraction patterns and contextual cues for databases

| Database | Patterns | Contextual cues |
| --- | --- | --- |
| ENA | [A-Z][0-9]{5}; [A-Z]{2}[0-9]{6}; [A-Z]{3}[0-9]{5}; [A-Z]{4}[0-9]{8,10}; [A-Z]{5}[0-9]{7} | genbank, gen, ddbj, embl |
| UniProt | [A-N,R-Z][0-9][A-Z][A-Z, 0-9][A-Z, 0-9][0-9]; [O,P,Q][0-9][A-Z, 0-9][A-Z, 0-9][A-Z, 0-9][0-9] | swissprot, sprot, uniprot |
| PDBe | [0-9][A-Z, 0-9]{3} | pdb |
| InterPro | IPR[0-9]{6} | interpro |
| Pfam | PF(AM)?[0-9]{5} | hmm, family, pfam |
| ArrayExpress | E-[A-Z]{4}-[0-9]+ | arrayexpress |
| OMIM | [0-9]{6} | omim |
| Ensembl | ENS[A-Z]*G[0-9]{11}+ | ensembl |
| RefSeq | (AC\|AP\|NC\|NG\|NM\|NP\|NR\|NT\|NW\|NZ\|XM\|XP\|XR\|YP\|ZP\|NS)_([A-Z]{4})*[0-9]{6,9}(?:[.][0-9]+)? | refseq |
| RefSNP | RS[0-9]{5,9} | snp |
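
The table pairs each pattern with contextual cues, which suggests that a pattern match is only accepted when a cue for the same database occurs nearby. The following is a minimal Python sketch of that idea, not the authors' implementation. The names `PATTERNS`, `CUES`, and `find_accessions` are hypothetical, only a subset of the databases is included, and three assumptions are made that the table does not state: commas and spaces inside character classes are treated as list separators and dropped, the trailing `+` after `{11}` in the Ensembl pattern is omitted, and "nearby" is taken to mean "anywhere in the same sentence".

```python
import re

# Patterns transcribed from Table 1 for a subset of databases.
# Assumption: commas/spaces inside character classes are separators, so
# "[A-N,R-Z]" becomes "[A-NR-Z]"; the "+" after "{11}" (Ensembl) is dropped.
PATTERNS = {
    "UniProt": [
        r"[A-NR-Z][0-9][A-Z][A-Z0-9][A-Z0-9][0-9]",
        r"[OPQ][0-9][A-Z0-9][A-Z0-9][A-Z0-9][0-9]",
    ],
    "PDBe": [r"[0-9][A-Z0-9]{3}"],
    "InterPro": [r"IPR[0-9]{6}"],
    "Pfam": [r"PF(?:AM)?[0-9]{5}"],
    "OMIM": [r"[0-9]{6}"],
    "Ensembl": [r"ENS[A-Z]*G[0-9]{11}"],
    "RefSNP": [r"RS[0-9]{5,9}"],
}

# Contextual cues from Table 1, matched case-insensitively as substrings.
CUES = {
    "UniProt": ["swissprot", "sprot", "uniprot"],
    "PDBe": ["pdb"],
    "InterPro": ["interpro"],
    "Pfam": ["hmm", "family", "pfam"],
    "OMIM": ["omim"],
    "Ensembl": ["ensembl"],
    "RefSNP": ["snp"],
}


def find_accessions(sentence):
    """Return sorted (database, accession) pairs found in one sentence.

    A pattern match is kept only if at least one cue for the same database
    appears somewhere in the sentence (assumed cue window: the sentence).
    """
    lowered = sentence.lower()
    hits = set()
    for db, patterns in PATTERNS.items():
        if not any(cue in lowered for cue in CUES[db]):
            continue  # no contextual cue, so ignore this database
        for pattern in patterns:
            # Word boundaries keep the pattern from matching inside longer tokens.
            for match in re.finditer(rf"\b(?:{pattern})\b", sentence):
                hits.add((db, match.group(0)))
    return sorted(hits)


print(find_accessions("UniProt entry P12345 contains domain IPR000001."))
```

In this sketch the example sentence yields only the UniProt accession: `IPR000001` fits the InterPro pattern, but the sentence contains no `interpro` cue. The cue check matters most for short, generic patterns such as PDBe and OMIM, which would otherwise match years and other plain numbers; even with a cue present, a four-digit year in the same sentence as "PDB" would still be reported, so further filtering would be needed in practice.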