Table 1.
Disease gene product/Segmentation/Signal peptide* GenBank/OMIM | Previously unknown domains or motifs | Method(s) of detection and significance; additional evidence† | Predicted function/activity | Other domains, motifs, and functions | Ortholog candidate‡ and paralogs in C. elegans (C), yeast (Y), bacteria (B); matches with ESTs (E)§ |
---|---|---|---|---|---|
Autosomal chronic granulomatosis disease protein/390:202-170-18 M55067/306400 | Two conserved domains: i. in several phosphatidylinositol 3-kinases (U52192; Z69660); ii. in yeast BEM1 and SCD2 proteins (not detected previously) | blast2 P = 3 × 10⁻⁶ (SCD2_SCHPO) | Intracellular signal transduction | SH3 domains; NADPH oxidase activator | C: ortholog not found; paralogs of SH3 domains only (e.g. CE01784) and of BEM1-SCD2 domain only (CE05832) Y: no ortholog; paralogs: BEM1 (Bem/Scd2- and SH3 domains are swapped as compared to human disease gene), several proteins with SH3 domain only, e.g. YHL002w B: not found E: Human +; mostly SH3 domains |
Barth syndrome (tafazzin)/292: Signal peptide 32-33 X92762/302060 | A conserved motif in bacterial RadC | blast2 P = 0.31 (E. coli RadC, gi\|78795); HMM built from the alignment of tafazzin with nematode and yeast homologs: 14 bits with RADC_BACSU compared to 11.5 bits for first false positive; conservation of putative active Glu and His | Hydrolytic enzyme; RadC is involved in DNA repair but such a role is unlikely for tafazzin given the presence of a signal peptide | Predicted secreted protein | C: CE03830 Y: P9659.5 B: ortholog not found; paralogs: RadC family E: Human + |
Hereditary nonpolyposis colon cancer (mutL homolog)/756: 330-78-348 U07418/120436 | Putative ATP-binding domain shared with HSP90, signal transduction His kinases, and type II topoisomerases (Fig. 2A) | blast2 P = 0.055 (CHVG_AGRTU); psiblast P ≈ 10⁻⁴ (numerous histidine kinases, HSP90, and topoisomerases); MoST | ATPase, possibly with autophosphorylating activity | Protein involved in mismatch repair | C: ortholog not found Y: MLH1_YEAST, PMS1_YEAST B: MutL/HexB family E: Human + Mouse + C. elegans + |
Ocular albinism/424: 104-320 Z48804/300500 | 7TM receptor (previously thought to contain 6 transmembrane segments) | blast2 P = 0.006 (VIPS_HUMAN) P = 0.01 (CAR1_DICDI, cAMP receptor) PHDhtm (transmembrane helix prediction) | Putative G-protein-coupled receptor | None | C: ortholog not found; weak similarity to CE03862 Y: no ortholog or paralogs B: ortholog not found; limited similarity to YSCS_YERPE E: human + |
Obesity factor (leptin)/167:96-71 Signal peptide 21-22 U18915/164160 | C-terminal motif conserved in inositol-phosphate synthases | blastP P = 0.91 (INO1_YEAST); MoST: motif from inositol-phosphate synthases detects leptins without false positives, r = 0.001 (P ≈ 0.003) | Possible involvement in inositol signaling | Secreted protein; helical cytokine | C: ortholog not found; conserved motif in inositol-phosphate synthase (C47D12.9, gi e225658) Y: no ortholog; conserved motif in inositol-phosphate synthase (YJL153c) B: ortholog or paralogs not found E: Human + |
Spinal muscular atrophy/294: 171-95-28 U18423/253300 | 2× repeat also found in Drosophila TUD (10× repeat) and HLS proteins, and in human p100 transcriptional coactivator | blast2 P = 0.012 (TUD_DROME) MoST—tudor repeat motif detects the spinal muscular atrophy protein without false positives, r = 0.01 | Repeat motif may be involved in regulatory interactions | None | C: ortholog not found; CE02626 (ortholog of p100) Y: no ortholog or paralogs B: ortholog or paralogs not found E: human + mouse + |
Spinocerebellar ataxia type-1 protein | Domain conserved in rat HBP1 transcription regulator | blast2 P = 10⁻⁵ (HBP1, gi\|1488627) | Role in transcription regulation? | — | None |
Thomsen disease¶/988: 118-85-413-177-88-107 Z25884/160800 | A domain, in the cytoplasmic portion, also found in inosine-5′-monophosphate dehydrogenases (IMPDH) as a separate, noncatalytic subdomain in their known three-dimensional structure, in cystathionine β synthases, AMP-regulated protein kinases, and several other enzymes and uncharacterized proteins | psiblast P = 2 × 10⁻⁵ (hypothetical Methanococcus jannaschii protein, gi\|1591551) | Unknown; effector binding? | Voltage-gated chloride channel | C: orthologous family of chloride channels, including CE01212; IMPDH-associated domain in several enzymes Y: YJR040w¶, the orthologous candidate, is a putative chloride channel with modified IMPDH-associated domain B: ortholog not found; YADQ_ECOLI is a putative chloride channel without IMPDH-associated domain E: Human + |
Werner syndrome/1432: 274-47-57-148-528-106-272 L76937/277700 | N-terminal nuclease domain related to PM-SCL autoantigen, bacterial RNase D, and 3′-5′ proofreading exonuclease domain of bacterial DNA polymerase I (Fig. 2B) | blast2 P = 6 × 10⁻⁵ (RNase D, Synechocystis sp., gi\|1001530); psiblast P = 5 × 10⁻⁵ (DPO1_HAEIN, DNA polymerase I from H. influenzae) | Nuclease-helicase involved in repair | RecQ-like helicase domain | C: ortholog not found; both domains present, but in separate large proteins (highest similarity to YO63_CAEEL and YMR1_CAEEL) Y: no ortholog; RecQ-like helicases (e.g. SGS1_YEAST) B: ortholog not found; RecQ helicases and RNase D E: Human + plants + (helicase domain only) |
*The total length (number of amino acid residues) of the protein and the lengths of the predicted globular and nonglobular (underlined) domains are indicated; the position of the cleavage site for predicted signal peptides is shown; the GenBank accession number and the NCBI ID are indicated for each disease gene product.
† The SWISS-PROT name (with underline) or the NCBI ID is indicated for homologs.
‡ Orthologs are shown by bold type.
§ + indicates 1–20 homologous expressed sequence tags (ESTs) from the given taxon; ++ indicates >20 ESTs.
¶ Fanconi anemia gene codes for a protein with similar functions and domain structure (FACC_HUMAN); it is more likely to be the ortholog of YJR040w than the Thomsen disease gene product; the conserved domain, designated CBS (after cystathionine β synthase), has been recently described independently (30).
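
The segmentation notation in the first column can be read mechanically: the number before the colon is the total protein length, and the hyphen-separated numbers after it are the lengths of the successive predicted domains, which should add up to that total (for example, 202 + 170 + 18 = 390). The following is a minimal Python sketch of such a consistency check; the helper names `parse_segmentation` and `is_consistent` are hypothetical and are not part of the original analysis.

```python
def parse_segmentation(field: str) -> tuple[int, list[int]]:
    """Split a 'TOTAL:len1-len2-...' field into (total, [len1, len2, ...]).

    Hypothetical helper for illustration; whitespace around numbers is tolerated.
    """
    total_text, _, segments_text = field.partition(":")
    total = int(total_text)
    segments = [int(part) for part in segments_text.split("-")]
    return total, segments


def is_consistent(field: str) -> bool:
    """Return True when the domain lengths sum to the stated total length."""
    total, segments = parse_segmentation(field)
    return sum(segments) == total


if __name__ == "__main__":
    # Segmentation fields copied from rows of Table 1.
    for field in ("390:202-170-18", "756: 330-78-348", "988: 118-85-413-177-88-107"):
        print(field, is_consistent(field))
```

Rows that list only a signal peptide cleavage site instead of domain lengths (such as the tafazzin row) do not follow this pattern and would need separate handling.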