2015 Feb 21;16:57. doi: 10.1186/s12859-015-0487-2

Table 1.

Ambiguity of headwords for gene/protein names in SemCat

Gene/protein | Ambiguity | Headwords
Yes | No | gene, protein, kinase, receptor, transporter, pseudogene, enzyme, peptide, polypeptide, glycoprotein, lipoprotein, symporter, antiporter, collagen, polyprotein, cotransporter, crystallin, lectin, globin, tubulin, oncogene, phosphoprotein, ferredoxin, opsin, antibody, porin, flavoprotein, homeobox, actin, adhesin, isoenzyme, integrin, lysozyme, chaperonin, globulin, ribonucleoprotein, immunoglobulin, isozyme, cadherin, transcript, myosin, apoprotein, cyclin, autoantigen, hemoglobin, spectrin, cytochrome, flagellin, tropomyosin, kinesin, adaptin, keratin, peroxiredoxin, pilin, chemokine, casein, catenin, ferritin, enkephalin, histone, giardin, interferon, albumin, trypsin, glutaredoxin, metallothionein, cyclophilin, proteolipid, mucin, vasopressin, proteoglycan
Ambiguous | Low | -ase (i.e. terms ending in “ase”), regulator, antigen, isoform, inhibitor, repressor, hormone, toxin, ras, carrier, suppressor, ligand, translocator, phosphate, thioredoxin, neurotoxin
Ambiguous | High | Greek letters (e.g. alpha, beta, ...), Roman numerals, short strings (e.g. psi, orf, ib, ...), precursor, subunit, homolog, chain, factor, component, family, product, channel, activator, system, variant, chaperone, superfamily, molecule, pump, exchanger, element, sequence, resistance, construct, allergen, exporter, transducer, sensor, finger, modulator, effector, antiterminator, fusion, defective, antagonist, locus, wing, acid, receiver, para, cofactor, spot, tail, pigment, class, coma, exon, interactor, coactivator
Ambiguous | Rarely used | content, percentage, gain, frame, length, ratio, response, yield, defect, fiber, resistant
No | No | region, domain, complex, form, fragment, binding, weight, transport, member, cell, containing, fluid, related, associated, syndrome, putative, biosynthesis, repeat, activity, segment, preparation, smear, subfamily, dependent, terminus, substrate, determinant, site, level, motif, specific, subtype, mrna, dna, synthesis, fibroblast, cdna, cluster, assembly, membrane, mutant, transmembrane, virus, terminal, group, hybrid, flip, urine, function, number, periplasmic, yield, rich, plasmid, rate, metabolism, fold

For each term, either the last word or the word preceding a preposition was taken as the headword. An annotator judged whether each headword uniquely indicates a gene/protein name and how ambiguous it is.
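
The headword rule in the table note can be illustrated with a minimal sketch. It assumes whitespace tokenization, lowercasing, and a small hypothetical preposition list; the source does not specify any of these, and the function name `headword` is chosen here only for illustration. When a term contains a preposition, the sketch takes the word before the first one; otherwise it takes the last word.

```python
# Hypothetical preposition list; the source does not say which prepositions were used.
PREPOSITIONS = {"of", "for", "in", "to", "with", "from", "by", "on", "at"}


def headword(term: str) -> str:
    """Return the headword of a term: the word before the first
    preposition if there is one, otherwise the last word.

    Assumes whitespace tokenization and lowercasing (not stated in the source).
    Returns an empty string for an empty term.
    """
    tokens = term.lower().split()
    if not tokens:
        return ""
    # Skip index 0: a leading preposition has no word before it.
    for i in range(1, len(tokens)):
        if tokens[i] in PREPOSITIONS:
            return tokens[i - 1]
    return tokens[-1]


if __name__ == "__main__":
    for example in ["tumor necrosis factor receptor", "inhibitor of apoptosis"]:
        print(example, "->", headword(example))
```

Under these assumptions, "tumor necrosis factor receptor" yields the headword "receptor" (listed in the table as unambiguous), while "inhibitor of apoptosis" yields "inhibitor" (listed as low ambiguity).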