Table 1.
Gene/protein | Ambiguity | Headwords |
---|---|---|
Yes | No | gene, protein, kinase, receptor, transporter, pseudogene, enzyme, peptide, polypeptide, glycoprotein, lipoprotein, symporter, antiporter, collagen, polyprotein, cotransporter, crystallin, lectin, globin, tubulin, oncogene, phosphoprotein, ferredoxin, opsin, antibody, porin, flavoprotein, homeobox, actin, adhesin, isoenzyme, integrin, lysozyme, chaperonin, globulin, ribonucleoprotein, immunoglobulin, isozyme, cadherin, transcript, myosin, apoprotein, cyclin, autoantigen, hemoglobin, spectrin, cytochrome, flagellin, tropomyosin, kinesin, adaptin, keratin, peroxiredoxin, pilin, chemokine, casein, catenin, ferritin, enkephalin, histone, giardin, interferon, albumin, trypsin, glutaredoxin, metallothionein, cyclophilin, proteolipid, mucin, vasopressin, proteoglycan |
Ambiguous | Low | -ase (i.e. terms ending in “ase”), regulator, antigen, isoform, inhibitor, repressor, hormone, toxin, ras, carrier, suppressor, ligand, translocator, phosphate, thioredoxin, neurotoxin |
Ambiguous | High | Greek letters (e.g. alpha, beta, ...), Roman numerals, short strings (e.g. psi, orf, ib, ...), precursor, subunit, homolog, chain, factor, component, family, product, channel, activator, system, variant, chaperone, superfamily, molecule, pump, exchanger, element, sequence, resistance, construct, allergen, exporter, transducer, sensor, finger, modulator, effector, antiterminator, fusion, defective, antagonist, locus, wing, acid, receiver, para, cofactor, spot, tail, pigment, class, coma, exon, interactor, coactivator |
Ambiguous | Rarely used | content, percentage, gain, frame, length, ratio, response, yield, defect, fiber, resistant |
No | No | region, domain, complex, form, fragment, binding, weight, transport, member, cell, containing, fluid, related, associated, syndrome, putative, biosynthesis, repeat, activity, segment, preparation, smear, subfamily, dependent, terminus, substrate, determinant, site, level, motif, specific, subtype, mrna, dna, synthesis, fibroblast, cdna, cluster, assembly, membrane, mutant, transmembrane, virus, terminal, group, hybrid, flip, urine, function, number, periplasmic, yield, rich, plasmid, rate, metabolism, fold |
For each term, the headword was taken to be either the last word or the word immediately before a preposition. An annotator judged whether each headword uniquely indicates a gene/protein name and how ambiguous it is.
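The headword rule described in the legend can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `headword`, the `PREPOSITIONS` set, the whitespace tokenization, and the choice of the first preposition when a term contains several are all assumptions, since the source does not specify them.

```python
# Hypothetical preposition list; the source does not say which prepositions were used.
PREPOSITIONS = {"of", "for", "in", "to", "with", "from", "by", "on", "at"}


def headword(term: str) -> str:
    """Return the word before the first preposition, or else the last word.

    Assumes lowercase matching and whitespace tokenization.
    """
    words = term.lower().split()
    if not words:
        return ""
    for i, word in enumerate(words):
        # A preposition in first position has no preceding word, so skip it.
        if word in PREPOSITIONS and i > 0:
            return words[i - 1]
    return words[-1]


if __name__ == "__main__":
    for term in ("tumor necrosis factor", "inhibitor of apoptosis", "protein kinase"):
        print(term, "->", headword(term))
```

Under these assumptions, "inhibitor of apoptosis" yields the headword "inhibitor" (Ambiguous, Low in the table), while "tumor necrosis factor" yields "factor" (Ambiguous, High).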