Nucleic Acids Res. 2001 Apr 15;29(8):1750–1764. doi: 10.1093/nar/29.8.1750

Table 1. All the attributes ranked by PartsList.

A
Category, Symbol, Definition of symbol, Attributes in category, Reference
Genome Occurrence G(x) Number of times a particular PART occurs in genome x. (These are based on PSI-BLAST comparisons between PDB and the genomes with an e-value cutoff in these comparisons of 0.0001.) 20 (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)
Expression L(e) Average expression level of a particular PART. This is the average expression level over all genes that contain this PART. 8 (46)
  C(e) PART composition of the yeast transcriptome in expression level experiment e. This refers to the fraction of the mRNA population with this PART as opposed to all other parts. (This is only applicable to expression experiments, such as SAGE and GeneChips, that measure absolute mRNA levels in copies per cell.) 8 (46)
  E(e) Transcriptome enrichment compared to genome in experiment e. [Transcriptome enrichment is defined as the percentage difference between the PART composition of the transcriptome and that of the genome. In symbols: E(e) = [C(e) – G(scer)] / G(scer).] 8 (46)
  F(r) Expression level fluctuation in experiment r. [This is the standard deviation in the expression ratio measurement R(i,t) over a timecourse, for example, <(R(i,t) – <R(i,t)>)^2>, where one averages over all times t and genes i that have a particular PART.] 7 (67)
Alignments V(f) The number of aligned pairs in pair-set f. 2 (39)
  U(f) RMS deviation in Cα atoms averaged over all alignments in pair-set f 2 (39)
  R(f) Similar to U(f) for pair-set f but only the best fitting half of the atoms are included in the calculation 2 (39)
  S(f) Average percentage identity between pairs of aligned proteins in pair-set f 2 (39)
  P(f) Average sequence P value for pair-set f 2 (39)
  Q(f) Average structural P value for pair-set f 2 (39)
Compositions N(p) The number of structures associated with a particular PART in dataset p. 2  
  B(a,p) Composition of amino acid a in a particular PART where one averages over all structures in dataset p associated with the PART 40  
Motion M(s,d) The maximum value of statistic s derived from surveying set of motions d in the Macromolecular Motions Database for a particular PART, where s is only calculated from the entries in the database that are associated with the PART. 7 (56,57)
  A(s,d) Similar to M(s,d) but now we take the average instead of the maximum. 7 (56,57)
Interaction I(y,c) For a given PART, the number of types of protein–protein interactions in interaction dataset y subject to the restriction c regarding whether or not the proteins are on the same chain. The number of interaction types is the number of distinctly different PARTs that interact with a given PART. 24 (51,68)
  J(y,c) For a given PART, the total number of types of interactions in interaction dataset y subject to the restriction c regarding whether or not the proteins are on the same chain. Here we show all interactions observed, not just the number of distinct PART–PART interactions tabulated in I(y,c). 24 (52,68)
Transposon T(b) The sensitivity of the cell to a transposon inserted into genes containing a particular PART under growth condition b. The sensitivity is indicated by the negative logarithm of a P value, which measures the degree to which the observations for one particular gene could have resulted from wild-type cells that randomly change their phenotype. 20 (58)
Miscellaneous X(q) Various miscellaneous ranks. 5  
Total 182

B
Attributes, Value, Description, Reference
Genome x = aful Archaeoglobus fulgidus (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)  
  mjan Methanococcus jannaschii (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)  
  mthe Methanobacterium thermoautotrophicum (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)  
  phor Pyrococcus horikoshii (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)  
  scer Saccharomyces cerevisiae (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)  
  cele Caenorhabditis elegans (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)  
  aaeo Aquifex aeolicus (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)  
  syne Synechocystis sp. (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)  
  ecol Escherichia coli (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)  
  bsub Bacillus subtilis (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)  
  mtub Mycobacterium tuberculosis (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)  
  hinf Haemophilus influenzae Rd (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)  
  hpyl Helicobacter pylori (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)  
  mgen Mycoplasma genitalium (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)  
  mpne Mycoplasma pneumoniae (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)  
  bbur Borrelia burgdorferi (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)  
  tpal Treponema pallidum (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)  
  ctra Chlamydia trachomatis (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)  
  cpne Chlamydia pneumoniae (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)  
  rpro Rickettsia prowazekii (H.Hegyi, J.Lin and M.Gerstein, manuscript submitted; 19,35)  
Absolute Expression Experiment e = vegsam GeneChip mRNA expression analysis of 6200 yeast ORFs under vegetative growth conditions. (48)  
  vegyou GeneChip mRNA expression analysis of 5455 yeast ORFs under vegetative growth conditions. (49)  
  sage mRNA expression analysis of 3788 yeast ORFs determined by SAGE. (43)  
  matea GeneChip mRNA expression analysis of yeast mating type a strain grown on glucose. (50)  
  mateal GeneChip mRNA expression analysis of yeast mating type α strain grown on glucose. (50)  
  gal GeneChip mRNA expression analysis of yeast mating type a strain grown on galactose. (50)  
  heat GeneChip mRNA analysis of yeast mating type a strain grown on glucose at 30°C before a 39°C heat shock. (50)  
  ref Reference transcriptome. This is a scaling and merging of the above experiments. (46)  
Microarray Experiment r = cdc28 cDNA microarray genome-wide characterization of mRNA transcript levels for CDC28 synchronized yeast cells during the cell cycle. (69)  
  cdc15 cDNA microarray genome-wide characterization of mRNA transcript levels for CDC15 synchronized yeast cells during the cell cycle. (69)  
  alpha Analysis using cDNA microarrays of yeast mRNA levels after synchronization of the cell cycle via α-factor arrest. (69)  
  diaux Genome-wide cDNA microarray analysis of the temporal program of yeast mRNA expression accompanying the metabolic shift from fermentation to respiration. (70)  
  spor cDNA microarray genome-wide analysis to assay changes in gene expression during sporulation. (71)  
  heatec cDNA microarray experiment and analysis on 4290 E.coli ORFs after exposure of the bacteria to heat shock. (72)  
  deve Analysis of genome wide changes during successive larval stages using cDNA microarrays of ∼12 000 C.elegans ORFs. (73)  
Pair-set f = all All pairs within a PART included in the calculations in Wilson et al. (For example, for fold rankings this would be the total number of pairs within a fold.) (39)  
  foldonly A subset of the pair-set ‘all’ that only includes pairs between structures that are in the same PART but different sub-PART. (If PART is fold, then sub-PART is superfamily; if PART is superfamily, then sub-PART is family.) (39)  
Amino acid a = Ala, Cys, Asp, Glu, Phe, Gly, His, Ile, Lys, Leu, Met, Asn, Pro, Gln, Arg, Ser, Thr, Val, Trp, Tyr. (31)  
Dataset p = pdb100 All structures within the fold (as defined by SCOP pdb100d). (31)  
  pdb40 Similar to pdb100 but now using a version of the PDB clustered at 40% similarity (as defined by SCOP pdb40d). (31)  
Interaction type y = pdball Interactions for a PART are computed with all other PARTS in the PDB based on the distances between atoms in the coordinate files. Five or more contacts between atoms separated by <5 Å were considered a valid PART–PART contact. (9,51,55)  
  pdba A subset of ‘pdball’. Interactions for a PART are computed just with all-α proteins (SCOP class 1) in the PDB. (9,51,55)  
  pdbb Similar to ‘pdba’ but now just with all-β proteins (SCOP class 2). (9,51,55)  
  pdbab Similar to ‘pdba’ but now just with mixed helix-sheet proteins (SCOP classes 3 and 4). (9,51,55)  
  scerall Interactions for a PART are computed with all other PARTS based on the yeast two-hybrid experimental data. In particular, interactions between structural domains in the yeast genome were obtained by assigning protein structures to the yeast proteins. Structural domains contained within the same ORF that were within 30 amino acids were assumed to interact in an intramolecular fashion. To derive intermolecular interactions, we combined three sets of protein–protein interactions: (i) the MIPS web pages on complexes and pairwise interactions (February 2000) (9), (ii) the global yeast two-hybrid experiments by Uetz et al. (51) and (iii) large-scale yeast two-hybrid experiments by Ito et al. (52). Out of all these pairwise interactions known for yeast ORFs, there is a limited set in which both partners are completely covered by one structural domain (to within 100 residues). (9,51,55)  
  scera A subset of ‘scerall’. Interactions for a PART are computed just with all-α proteins (SCOP class 1) in the yeast experiment. (9,51,55)  
  scerb Similar to ‘scera’ but now just with all-β proteins (SCOP class 2). (9,51,55)  
  scerab Similar to ‘scera’ but now just with mixed helix-sheet proteins (SCOP classes 3 and 4). (9,51,55)  
Interaction restriction c = inter The interaction must occur between PARTS in different chains. (9,51,55)  
  intra The interaction must occur between PARTS in the same chain. (9,51,55)  
  none The union of ‘inter’ and ‘intra’. Interactions can occur in PARTS on the same or different chains. (9,51,55)  
Motion statistic s = nresidue Number of residues. (56,57)  
  maxcadev Maximal displacement of a Cα atom, in Å, of any residue during the motion (after fitting on the first core). (56,57)  
  rmsoverall Overall RMS of two structures after they are superimposed by a sieve-fit technique. Note that these values are larger than the traditionally used RMS. (56,57)  
  nhinges Number of hinges involved in the motion. (56,57)  
  kappa The rotation (in degrees) around the screw axis necessary to superimpose two domains of motion. (56,57)  
  transe Transition energy of the motion (maximum energy less minimum energy over the motion) (in kcal/mol). (56,57)  
  deltae Absolute value of energy difference between the ‘starting’ and ‘ending’ conformations of a motion (in kcal/mol). (56,57)  
Motion dataset d = goldstd List of approximately 220 ‘gold-standard’ manually curated motions. (56,57)  
  auto List of approximately 4000 conformationally different proteins based on analyzing the SCOP database for similar proteins with large conformational differences (as measured by RMS) but close sequence similarity. (56,57)  
Transposon condition b = caff YPD + 8 mM caffeine. (58)  
  cyss Cycloheximide hypersensitivity: YPD + 0.08 µg/ml cycloheximide at 30°C. (58)  
  wr White/red colour on YPD. (58)  
  ypg YPGlycerol. (58)  
  calcs Calcofluor hypersensitivity: YPD + 12 µg/ml calcofluor at 30°C. (58)  
  hyg YPD + 46 µg/ml hygromycin at 30°C. (58)  
  sds YPD + 0.003% SDS. (58)  
  bens Benomyl hypersensitivity: YPD + 10 µg/ml benomyl. (58)  
  bcip YPD + 5-bromo-4-chloro-3-indolyl phosphate at 37°C. (58)  
  mb YPD + 0.001% methylene blue at 30°C. (58)  
  benr Benomyl resistance: YPD + 20 µg/ml benomyl. (58)  
  ypd37 YPD at 37°C. (58)  
  egta YPD + 2 mM EGTA. (58)  
  mms YPD + 0.008% MMS. (58)  
  hu YPD + 75 mM hydroxyurea. (58)  
  ypd11 YPD at 11°C. (58)  
  calcr Calcofluor resistance: YPD + 0.3 µg/ml calcofluor at 30°C. (58)  
  cycr Cycloheximide resistance: YPD + 0.3 µg/ml cycloheximide. (58)  
  hhig Hyperhaploid invasive growth mutants. (58)  
  nacl YPD + 0.9 M NaCl. (58)  
Misc. quantities q = pseu Number of pseudogenes in worm genome matching a particular PART. (59)  
  func Total number of functions associated with this PART. (In this survey all non-enzyme functions were lumped into a single category.) (60)  
  enz Total number of enzymatic functions associated with this PART. (60)  
  size Average length of a PART in the pdb40d clustering of the PDB.    
  age The year in which the first structure belonging to the PART was determined.    
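
As a worked illustration of two definitions from part A of the table, the following minimal Python sketch computes the transcriptome enrichment E(e) = [C(e) – G(scer)] / G(scer) and an expression fluctuation as a standard deviation of ratio measurements. The function names and all numbers are hypothetical, chosen only for illustration, and are not taken from PartsList. The sketch also assumes that the genome term is expressed as a composition fraction so that it is directly comparable to C(e), and that the fluctuation uses the population (not sample) standard deviation.

```python
import math


def enrichment(transcriptome_fraction, genome_fraction):
    """Relative difference [C(e) - G(scer)] / G(scer).

    Both arguments are assumed to be composition fractions of one PART.
    """
    if genome_fraction <= 0:
        raise ValueError("genome fraction must be positive")
    return (transcriptome_fraction - genome_fraction) / genome_fraction


def fluctuation(ratios):
    """Population standard deviation of expression ratios R(i,t).

    `ratios` pools the measurements over all times t and genes i
    that contain the PART of interest.
    """
    if not ratios:
        raise ValueError("need at least one ratio measurement")
    mean = sum(ratios) / len(ratios)
    variance = sum((r - mean) ** 2 for r in ratios) / len(ratios)
    return math.sqrt(variance)


# Hypothetical values: a PART making up 6% of the transcriptome
# but only 4% of the genome is enriched by about one half.
example_enrichment = enrichment(0.06, 0.04)

# Hypothetical expression ratios for one PART over a timecourse.
example_fluctuation = fluctuation([1.0, 2.0, 3.0, 4.0])
```

A positive enrichment means the PART is over-represented in the mRNA population relative to the genome; a negative value means it is under-represented.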

The formalism for specifying an attribute has two parts: an overall category, denoted by a single uppercase symbol, and some parameter choices, which are denoted by lower-case arguments to the first symbol. Some examples for folds will suffice to make this clear: G(aful) is genome occurrence of a particular fold in A.fulgidus; M(nhinges,goldstd) is the maximum value of the number of hinges statistic from surveying a set of motions in the gold-standard subset of the Macromolecular Motions Database, where this statistic is only calculated for the entries in the motions database that are associated with a particular fold; and I(pdball,inter) is the number of distinct types of protein-protein interactions found in a survey of the PDB, subject to the restriction that the interactions must be between folds on different chains.
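
The notation described above (an uppercase category symbol followed by lower-case parameter choices) can be read mechanically. The sketch below is one possible way to parse and validate such strings in Python. It is not PartsList code: the `ALLOWED` dictionary and the `parse_attribute` function are hypothetical names, and the dictionary covers only a small subset of the values listed in part B of the table.

```python
import re

# Hypothetical subset of Table 1B: for each category symbol, one set of
# permitted values per parameter position.
ALLOWED = {
    "G": [{"aful", "scer", "ecol"}],
    "M": [{"nhinges", "kappa"}, {"goldstd", "auto"}],
    "I": [{"pdball", "scerall"}, {"inter", "intra", "none"}],
}

# One uppercase symbol, then a parenthesized, comma-separated argument list.
_PATTERN = re.compile(r"^([A-Z])\(([^()]*)\)$")


def parse_attribute(text):
    """Split a string such as 'M(nhinges,goldstd)' into (symbol, args)."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not an attribute specification: {text!r}")
    symbol = match.group(1)
    args = tuple(a.strip() for a in match.group(2).split(","))
    slots = ALLOWED.get(symbol)
    if slots is None:
        raise ValueError(f"unknown category symbol: {symbol!r}")
    if len(args) != len(slots):
        raise ValueError(f"{symbol} takes {len(slots)} parameter(s)")
    for arg, allowed in zip(args, slots):
        if arg not in allowed:
            raise ValueError(f"unknown parameter value: {arg!r}")
    return symbol, args


# The three examples given in the text.
parsed = [
    parse_attribute(s)
    for s in ("G(aful)", "M(nhinges,goldstd)", "I(pdball,inter)")
]
```

Keeping the permitted values in a plain dictionary means new categories or parameter values can be added without touching the parsing logic.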