1997 May 27;94(11):5831–5836. doi: 10.1073/pnas.94.11.5831

Table 1.

Previously undetected domains and motifs in human positionally cloned disease gene products

Columns:
  1. Disease gene product/Segmentation/Signal peptide*
  2. GenBank/OMIM
  3. Previously unknown domains or motifs
  4. Method(s) of detection and significance; additional evidence†
  5. Predicted function/activity
  6. Other domains, motifs, and functions
  7. Ortholog candidate‡ and paralogs in C. elegans (C), yeast (Y), bacteria (B); matches with ESTs (E)§
Autosomal chronic granulomatosis disease protein/390:202-170-18
  GenBank/OMIM: M55067/306400
  Previously unknown domains or motifs: Two conserved domains: (i) in several phosphatidylinositol 3-kinases (U52192; Z69660); (ii) in yeast BEM1 and SCD2 proteins (not detected previously)
  Method(s) of detection and significance; additional evidence: blast2 P = 3 × 10⁻⁶ (SCD2_SCHPO)
  Predicted function/activity: Intracellular signal transduction
  Other domains, motifs, and functions: SH3 domains; NADPH oxidase activator
  Ortholog candidate and paralogs; EST matches:
    C: ortholog not found; paralogs of SH3 domains only (e.g. CE01784) and of BEM1-SCD2 domain only (CE05832)
    Y: no ortholog; paralogs: BEM1 (Bem/Scd2- and SH3 domains are swapped as compared to human disease gene), several proteins with SH3 domain only, e.g. YHL002w
    B: not found
    E: Human +; mostly SH3 domains
Barth syndrome (tafazzin)/292: Signal peptide 32-33
  GenBank/OMIM: X92762/302060
  Previously unknown domains or motifs: A conserved motif in bacterial RadC
  Method(s) of detection and significance; additional evidence: blast2 P = 0.31 (E. coli RadC, gi|78795); HMM built from the alignment of tafazzin with nematode and yeast homologs: 14 bits with RADC_BACSU compared to 11.5 bits for the first false positive; conservation of putative active Glu and His
  Predicted function/activity: Hydrolytic enzyme; RadC is involved in DNA repair but such a role is unlikely for tafazzin given the presence of a signal peptide
  Other domains, motifs, and functions: Predicted secreted protein
  Ortholog candidate and paralogs; EST matches:
    C: CE03830
    Y: P9659.5
    B: ortholog not found; paralogs: RadC family
    E: Human +
Hereditary nonpolyposis colon cancer (mutL homolog)/756: 330-78-348
  GenBank/OMIM: U07418/120436
  Previously unknown domains or motifs: Putative ATP-binding domain shared with HSP90, signal transduction His kinases, and type II topoisomerases (Fig. 2A)
  Method(s) of detection and significance; additional evidence: blast2 P = 0.055 (CHVG_AGRTU); psiblast P ≈ 10⁻⁴ (numerous histidine kinases, HSP90, and topoisomerases); MoST
  Predicted function/activity: ATPase, possibly with autophosphorylating activity
  Other domains, motifs, and functions: Protein involved in mismatch repair
  Ortholog candidate and paralogs; EST matches:
    C: ortholog not found
    Y: MLH1_YEAST, PMS1_YEAST
    B: MutL/HexB family
    E: Human + Mouse + C. elegans +
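Ocular albinism/424: 104-320
  GenBank/OMIM: Z48804/300500
  Previously unknown domains or motifs: 7TM receptor (previously thought to contain 6 transmembrane segments)
  Method(s) of detection and significance; additional evidence: blast2 P = 0.006 (VIPS_HUMAN), P = 0.01 (CAR1_DICDI, cAMP receptor); PHDhtm (transmembrane helix prediction)
  Predicted function/activity: Putative G-protein-coupled receptor
  Other domains, motifs, and functions: None
  Ortholog candidate and paralogs; EST matches:
    C: ortholog not found; weak similarity to CE03862
    Y: no ortholog or paralogs
    B: ortholog not found; limited similarity to YSCS_YERPE
    E: Human +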
Obesity factor (leptin)/167:96-71 Signal peptide 21-22
  GenBank/OMIM: U18915/164160
  Previously unknown domains or motifs: C-terminal motif conserved in inositol-phosphate synthases
  Method(s) of detection and significance; additional evidence: blastP P = 0.91 (INO1_YEAST); MoST: motif from inositol-phosphate synthases detects leptins without false positives, r = 0.001 (P ≈ 0.003)
  Predicted function/activity: Possible involvement in inositol signaling
  Other domains, motifs, and functions: Secreted protein; helical cytokine
  Ortholog candidate and paralogs; EST matches:
    C: ortholog not found; conserved motif in inositol phosphate synthase (C47D12.9, gi e225658)
    Y: no ortholog; conserved motif in inositol-phosphate synthase (YJL153c)
    B: ortholog or paralogs not found
    E: Human +
Spinal muscular atrophy/294: 171-95-28
  GenBank/OMIM: U18423/253300
  Previously unknown domains or motifs: 2× repeat also found in Drosophila TUD (10× repeat) and HLS proteins, and in human p100 transcriptional coactivator
  Method(s) of detection and significance; additional evidence: blast2 P = 0.012 (TUD_DROME); MoST: tudor repeat motif detects the spinal muscular atrophy protein without false positives, r = 0.01
  Predicted function/activity: Repeat motif may be involved in regulatory interactions
  Other domains, motifs, and functions: None
  Ortholog candidate and paralogs; EST matches:
    C: ortholog not found; CE02626 (ortholog of p100)
    Y: no ortholog or paralogs
    B: ortholog or paralogs not found
    E: Human + Mouse +
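Spinocerebellar ataxia type-1 protein
  Previously unknown domains or motifs: Domain conserved in rat HBP1 transcription regulator
  Method(s) of detection and significance; additional evidence: blast2 P = 10⁻⁵ (HBP1, gi|1488627)
  Predicted function/activity: Role in transcription regulation?
  Other domains, motifs, and functions: None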
Thomsen disease/988: 118-85-413-177-88-107
  GenBank/OMIM: Z25884/160800
  Previously unknown domains or motifs: A domain, in the cytoplasmic portion, also found in inosine-5′-monophosphate dehydrogenases (IMPDH) as a separate, noncatalytic subdomain in their known three-dimensional structure, in cystathionine β synthases, AMP-regulated protein kinases, and several other enzymes and uncharacterized proteins
  Method(s) of detection and significance; additional evidence: psiblast P = 2 × 10⁻⁵ (hypothetical Methanococcus jannaschii protein, gi|1591551)
  Predicted function/activity: Unknown; effector binding?
  Other domains, motifs, and functions: Voltage-gated chloride channel
  Ortholog candidate and paralogs; EST matches:
    C: orthologous family of chloride channels, including CE01212; IMPDH-associated domain in several enzymes
    Y: YJR040w, the orthologous candidate, is a putative chloride channel with modified IMPDH-associated domain
    B: ortholog not found; YADQ_ECOLI is a putative chloride channel without IMPDH-associated domain
    E: Human +
Werner syndrome/1432: 274-47-57-148-528-106-272
  GenBank/OMIM: L76937/277700
  Previously unknown domains or motifs: N-terminal nuclease domain related to PM-SCL autoantigen, bacterial RNase D, and 3′-5′ proofreading exonuclease domain of bacterial DNA polymerase I (Fig. 2B)
  Method(s) of detection and significance; additional evidence: blast2 P = 6 × 10⁻⁵ (RNase D, Synechocystis sp., gi|1001530); psiblast P = 5 × 10⁻⁵ (DPO1_HAEIN, DNA polymerase I from H. influenzae)
  Predicted function/activity: Nuclease-helicase involved in repair
  Other domains, motifs, and functions: RecQ-like helicase domain
  Ortholog candidate and paralogs; EST matches:
    C: ortholog not found; both domains present, but in separate large proteins (highest similarity to YO63_CAEEL and YMR1_CAEEL)
    Y: no ortholog; RecQ-like helicases (e.g. SGS1_YEAST)
    B: ortholog not found; RecQ helicases and RNase D
    E: Human + plants + (helicase domain only)
* The total length (number of amino acid residues) of the protein and the lengths of the predicted globular and nonglobular (underlined) domains are indicated; the position of the cleavage site for predicted signal peptides is shown; the GenBank accession number and the NCBI ID are indicated for each disease gene product.
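The segmentation notation in the first column (for example 390:202-170-18) reads as a total length followed by consecutive domain lengths, so the domain lengths should sum to the total. The following is a minimal sketch of that check, not part of the original table. The function names `parse_segmentation` and `is_consistent` are hypothetical, and the parser assumes any "Signal peptide" annotation has already been stripped from the entry.

```python
def parse_segmentation(entry: str) -> tuple[int, list[int]]:
    """Split 'TOTAL:LEN1-LEN2-...' into the total length and the domain lengths.

    Assumption: the entry contains only the numeric segmentation,
    with no trailing 'Signal peptide ...' annotation.
    """
    total_text, _, segments_text = entry.partition(":")
    total = int(total_text.strip())
    segments = [int(part) for part in segments_text.split("-") if part.strip()]
    return total, segments


def is_consistent(entry: str) -> bool:
    """Return True when the domain lengths add up to the stated total length."""
    total, segments = parse_segmentation(entry)
    return sum(segments) == total


if __name__ == "__main__":
    # Segmentation strings copied from the table rows above.
    for entry in ["390:202-170-18", "756: 330-78-348", "988: 118-85-413-177-88-107"]:
        total, segments = parse_segmentation(entry)
        print(entry, "->", total, segments, is_consistent(entry))
```

Rows that give only a signal-peptide position (such as the tafazzin entry) carry no domain lengths and would need to be skipped before calling these helpers.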

† The SWISS-PROT name (with underline) or the NCBI ID is indicated for homologs.

‡ Orthologs are shown in bold type.

§ + indicates 1–20 homologous expressed sequence tags (ESTs) from the given taxon; ++ indicates >20 ESTs.

The Fanconi anemia gene encodes a protein with similar functions and domain structure (FACC_HUMAN); it is more likely than the Thomsen disease gene product to be the ortholog of YJR040w. The conserved domain, designated CBS (after cystathionine β synthase), has recently been described independently (30).