Yamato et al. 10.1073/pnas.0609054104.
Fig. 4. Schematic illustration of the M. polymorpha Y chromosome and alignment of the sequenced PAC clones. In the 470-kb contig of YR1, PAC clones pMM4G7 and pMM2D3 were previously sequenced (black bars).
Fig. 5. Alignment of repetitive elements in M19B4.1-3 and human ODF3 (UniProt accession no. Q96PU9). Amino acid residues identical to the first element of M19B4.1-3 are indicated by :, and sequence gaps are indicated by -. Regions conserved among the elements are boxed.
Fig. 6. Insertions of mitochondrial DNA into the Y chromosome. Horizontal bars represent the collective sequence of YR1 (except the 2.2-kb BamHI repeats), the mitochondrial sequence (GenBank accession no. NC_001660), and the combined sequence of contig A and contig B in YR2, respectively, as indicated. The size of the mitochondrial sequence is exaggerated for better presentation. The bars for YR1 and YR2 are shown with tiling and representative clones. Each similarity pair with a BLASTN E value < 10^-20 is connected by a line whose color reflects percent identity: red, 95% or higher; magenta, 90% or higher; green, < 90%.
Fig. 7. Pairwise alignment of the deduced amino acid sequences of the M104E4.1 and F62B12.1 genes. Amino acid residues of F62B12.1 (GenBank accession no. AB272580) that are identical to those of M104E4.1 (GenBank accession no. AB272579) are indicated by colons. Alignment gaps are indicated by dashes. The boxed region shows the LOV domain. Sites of introns are indicated by arrowheads, and their sizes are given in nucleotides above or below the arrowheads, respectively.
Table 3. Statistics of the M. polymorpha Y chromosome
Statistic | YR1 | YR2 |
No. of PAC clones sequenced | 28* | 58 |
Length, bp | 3,200,899 | 5,998,135 |
GC content, % | 45 | 43 |
No. of elements (occurrence per 100 kb) | | |
Gene† | 9 | 55 (0.9) |
Pseudogene§ | 3 | 48 (0.8) |
EST homolog¶ | 0 | 20 (0.3) |
Transposable element|| | 10** | 507 (8.5) |
Mitochondrial DNA†† | 10** | 90 (1.5) |
Plastid DNA†† | 0 | 3 |
*Including previously sequenced PAC clones, pMM4G7 and pMM2D3.
†Sequences with BLASTX similarity (E value < 10^-5) to known amino acid sequences (excluding transposable elements) or tagged by M. polymorpha EST(s).
‡The number of gene families is given, not the actual copy number.
§Sequences with BLASTX similarity (E value < 10^-5) to known amino acid sequences (excluding transposable elements), containing in-frame stop codons or frame-shifts.
¶Sequences similar to M. polymorpha EST(s) but not identical.
Table 4. Summary of sequenced clones
Clone | Insert size, kb* | Sequence obtained, bp† | Phase‡ | No. of contigs | Accession no. | Note |
YR1 | | | | | | |
470-kb Contig | | | | | | |
pMM24-58G10 | 100 | 191,821 | 1 | 91 | AP009097 | |
pMM2D3 | 90 | 83,249 | 1 | 6 | AF542555 - AF542560 | Ref. 1 |
pMM23-101C6 | 125 | 171,791 | 1 | 84 | AP009098 | |
pMM23-360C7 | 139 | 273,343 | 1 | 147 | AP009099 | |
pMM23-118G8 | 124 | 183,849 | 1 | 89 | AP009100 | |
pMM23-300E6 | 100 | 150,988 | 1 | 80 | AP009101 | |
pMM4G7 | 35 | 30,156 | 1 | 2 | AB062742, AB062743 | Ref. 2 |
Representative | | | | | | |
pMM8H2 | 114 | 131,890 | 1 | 56 | AP009102 | |
pMM23-205B1 | 131 | 157,789 | 1 | 36 | AP009103 | |
pMM23-205E6 | 80 | 87,633 | 1 | 39 | AP009104 | |
pMM23-348H6 | 140 | 150,661 | 1 | 84 | AP009105 | |
pMM23-108B3 | 70 | 69,315 | 1 | 43 | AP009106 | |
pMM23-145D11 | 115 | 177,070 | 1 | 94 | AP009107 | |
pMM23-287F10 | 120 | 126,991 | 1 | 64 | AP009108 | |
pMM23-372H9 | 120 | 192,643 | 1 | 24 | AP009109 | |
pMM23-123D8 | 125 | 145,717 | 1 | 79 | AP009110 | |
pMM23-437F9 | 135 | 114,822 | 1 | 65 | AP009111 | |
pMM16A2 | 70 | 121,291 | 1 | 66 | AP009112 | |
pMM23-198A1 | 125 | 87,433 | 1 | 35 | AP009113 | |
pMM23-144H6 | 110 | 71,658 | 1 | 26 | AP009114 | |
pMM23-77F9 | 70 | 54,469 | 1 | 19 | AP009115 | |
pMM23-200H8 | 100 | 48,897 | 1 | 18 | AP009116 | |
pMM23-277H2 | 150 | 93,343 | 1 | 32 | AP009117 | |
pMM23-863F12 | 70 | 38,375 | 1 | 15 | AP009118 | |
pMM23-63B5 | 120 | 84,022 | 1 | 41 | AP009119 | |
pMM23-173B4 | 80 | 110,944 | 1 | 42 | AP009120 | |
pMM23-291C10 | 45 | 22,655 | 1 | 9 | AP009121 | |
pMM23-70G5 | 35 | 28,084 | 1 | 14 | AP009122 | |
YR2 | | | | | | |
Contig-A | | | | | AP009095 | |
pMM23-431A8 | 110 | 90,102 | 1 | 13 | | Problematic repeats |
pMM23-338F12 | 140 | 148,010 | 2 | 1 | | |
pMM23-354E2 | 130 | 128,026 | 2 | 1 | | |
pMM23-477F3 | 140 | 132,018 | 2 | 1 | | |
pMM23-480H6 | 130 | 135,906 | 2 | 1 | | |
pMM23-284C9 | 140 | 136,311 | 2 | 1 | | |
pMM19B4 | 150 | 141,379 | 2 | 2 | | |
pMM23-90G5 | 125 | 104,980 | 2 | 7 | | |
pMM23-355D5 | 140 | 139,361 | 2 | 1 | | |
pMM23-350E4 | 150 | 159,948 | 2 | 1 | | |
pMM23-166C5 | 125 | 123,886 | 2 | 1 | | |
pMM23-97D8 | 115 | 108,933 | 2 | 1 | | |
pMM23-165G8 | 120 | 122,404 | 2 | 1 | | |
pMM24-26B6 | 100 | 103,594 | 2 | 1 | | |
pMM23-627E2 | 140 | 145,557 | 2 | 2 | | |
pMM23-636F5 | 130 | 118,877 | 1 | 8 | | Problematic repeats |
pMM23-729D7 | 150 | 127,651 | 2 | 4 | | Problematic repeats |
pMM24-38C9 | 110 | 125,205 | 1 | 3 | | Extensive deletion |
pMM23-547D3 | 110 | 80,900 | 2 | 2 | | |
pMM23-666C5 | 100 | 108,374 | 2 | 1 | | |
pMM23-217D8 | 120 | 117,581 | 2 | 1 | | |
pMM23-354D1 | 110 | 105,780 | 2 | 1 | | |
pMM23-212C1 | 115 | 132,531 | 2 | 1 | | |
pMM23-507A1 | 105 | 105,648 | 2 | 1 | | |
pMM23-589F12 | 140 | 121,325 | 1 | 26 | | Problematic repeats |
pMM23-513H5 | 115 | 148,559 | 2 | 1 | | |
pMM23-104E4 | 140 | 133,635 | 3 | 1 | | |
pMM23-179A5 | 150 | 132,013 | 1 | 13 | | Problematic repeats |
pMM23-46E1 | 170 | 177,142 | 2 | 1 | | |
pMM23-414C2 | 85 | 110,328 | 2 | 3 | | |
pMM23-222E12 | 100 | 95,727 | 1 | 7 | | |
pMM23-657H8 | 110 | 101,118 | 1 | 4 | | |
pMM23-537D7 | 90 | 97,830 | 1 | 9 | | |
Contig-B | | | | | AP009096 | |
pMM23-359F1 | 130 | 149,396 | 2 | 1 | | |
pMM24-95E6 | 90 | 96,919 | 2 | 1 | | |
pMM23-635B8 | 130 | 140,583 | 2 | 1 | | |
pMM23-408G1 | 135 | 135,387 | 3 | 1 | | |
pMM23-265H6 | 130 | 123,253 | 3 | 1 | | |
pMM23-420F5 | 135 | 132,477 | 3 | 1 | | |
bridgeM420F5-M286B9 | - | 3,354 | - | - | | |
pMM23-286B9 | 150 | 152,281 | 3 | 1 | | |
pMM23-578C3 | 130 | 128,465 | 3 | 1 | | |
pMM23-88B7 | 125 | 123,297 | 3 | 1 | | |
gap | | (70 kb) | | | | |
pMM23-47H4 | 150 | | | | | Not sequenced due to extensive deletion during culture |
pMM23-169B8 | 70 | 66,859 | 2 | 1 | | |
bridgeM169B8-M402H5 | - | 5,158 | - | - | | |
pMM23-402H5 | 130 | 122,996 | 2 | 1 | | |
pMM23-508E5 | 160 | 152,005 | 2 | 1 | | |
pMM23-435B3 | 90 | 97,507 | 2 | 2 | | |
pMM23-526D2 | 180 | 185,531 | 2 | 1 | | |
pMM23-54C8 | 105 | 116,519 | 2 | 1 | | |
pMM23-516B5 | 125 | 135,413 | 2 | 5 | | |
pMM23-468B3 | 75 | 72,725 | 2 | 2 | | |
pMM23-530A10 | 145 | 146,690 | 1 | 5 | | Problematic repeats |
pMM24-22H6 | 155 | 106,302 | 1 | 5 | | Problematic repeats |
pMM23-313C1 | 115 | 121,246 | 2 | 1 | | |
gap | | (4 kb) | | | | |
pMM23-546H7 | 110 | 126,876 | 1 | 5 | | Problematic repeats |
pMM23-41H12 | 120 | 122,191 | 2 | 1 | | |
pMM53F7 | 120 | 38,431 | 1 | 6 | | Problematic repeats |
pMM23-45F5 | 85 | 75,384 | 2 | 1 | | |
bridgeM45F5-M632E5 | - | 3,587 | - | - | | |
pMM23-632E5 | 125 | 119,747 | 1 | 4 | | Problematic repeats |
*Estimated by pulsed-field gel electrophoresis.
†Total length of contigs after assembly. Sequences of singlets are also included for YR1 clones.
‡Definition by Venter et al. (3).
1. Ishizaki K, Shimizu-Ueda Y, Okada S, Yamamoto M, Fujisawa M, Yamato KT, Fukuzawa H, Ohyama K (2002) Nucleic Acids Res 30: 4675-4681.
2. Okada S, Sone T, Fujisawa M, Nakayama S, Takenaka M, Ishizaki K, Kono K, Shimizu-Ueda Y, Hanajiri T, Yamato KT, et al. (2001) Proc Natl Acad Sci USA 98: 9454-9459.
3. Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA, et al. (2001) Science 291: 1304-1351.
Table 5. Genes identified on the M. polymorpha Y chromosome
Name | UniProt accession | UniProt annotation | Score | E value | Organism | Similarity to InterPro domains | Presence in female | Expression in sexual organ | Expression in thallus | ESTs |
YR1 | | | | | | | | | | |
M2D3.1 | Q9SB62 | Putative alliin lyase | 998 | 7e-97 | Arabidopsis thaliana | Alliinase EGF-like | + | + | + | |
M2D3.2 | Q6L491 | Putative AP2 domain protein | 413 | 8e-35 | Oryza sativa | Transcriptional factor B3 | + | + | + | |
M2D3.3 | Q9C950 | Hypothetical protein T7P1.17 | 247 | 3e-18 | Arabidopsis thaliana | | + | - | - | |
M2D3.4 | Q8S6P9 | Hypothetical protein OJ1004_F02.2 | 223 | 9e-16 | Oryza sativa | | - | + | + | |
M2D3.5 (ORF162) | Q5Z8R1 | Putative VIP2 protein | 174 | 2e-10 | Oryza sativa | Ring-finger | - | + | - | |
M2D3.6 | Q658A2 | Transcriptional co-repressor-like | 620 | 5e-57 | Oryza sativa | Paired amphipathic helix | + | + | + | |
M8H2.3 | Q7XIM6 | Glyoxalase | 235 | 2e-16 | Oryza sativa | Glyoxalase/extradiol ring-cleavage dioxygenase | + | + | + | |
M205B1.4 | GLH4_CAEEL | Putative ATP-dependent RNA helicase | 167 | 1e-7 | Caenorhabditis elegans | Zn-finger, CCHC type | + | + | - | |
M123D8.8 | Q7EZ84 | Putative calcium-transporting ATPase 8 | 107 | 8e-6 | Oryza sativa | Calcium ATPase, transduction domain A | + | + | + | |
YR2, Contig-A | | | | | | | | | | |
M338F12.2 | Q9SU70 | Hypothetical protein | 135 | 7e-31 | Arabidopsis thaliana | Ring-finger, TonB box | - | + | + | |
M338F12.1 | Q94G52 | Calcium-dependent protein kinase (Fragment) | 784 | 0.0 | Funaria hygrometrica | Serine/threonine protein kinase, Calcium-binding EF hand | - | + | + | rlwa06f07 |
M338F12.3 | Q9FZ45 | F6I1.14 protein (Hypothetical protein) | 490 | 1e-137 | Arabidopsis thaliana | | + | + | + | rlwb31d23 |
M480H6.1 | Q9U4M4 | 7138.7 | 176 | 4e-8 | Leishmania major | | - | + | - | |
M19B4.1-1 | Q7QSN7 | GLP_327_32944_31097 | 226 | 6e-58 | Giardia lamblia ATCC 50803 | | - | - | - | |
M19B4.1-2 | | | | | | | - | - | - | |
M19B4.1-3 | | | | | | | - | + | - | rlwb09e07, rlwb25b12 |
M19B4.2 | Q5B778 | Hypothetical protein | 173 | 2e-11 | Aspergillus nidulans | Leucine-rich repeat | - | + | - | |
M355D5.3 | | M. polymorpha EST, rlwb44o19 | | | | | + | + | + | rlwb44o19 |
M355D5.1 | O82677 | Retinoblastoma-related protein | 108 | 1e-23 | Chenopodium rubrum | | + | + | - | |
M355D5.4 | | M. polymorpha EST, rlwb16g05 | | | | | - | + | - | rlwb16g05 |
M355D5.2 | M3K1_ARATH | Mitogen-activated protein kinase kinase kinase 1 | 84 | 4e-16 | Arabidopsis thaliana | Serine/threonine protein kinase | - | + | - | |
M350E4.4 | Q9LFV5 | Hypothetical protein (Fimbriata-like) | 145 | 9e-34 | Arabidopsis thaliana | Cyclin-like F-box, Galactose oxidase | - | + | - | |
M166C5.1 | SFR1_ARATH | Pre-mRNA splicing factor SF2 (SR1 protein) | 248 | 3e-65 | Arabidopsis thaliana | RNA-binding region RNP-1 | - | + | + | PTA2.1754.C1, PTA2.2367.C1, rlwb48e01 |
M166C5.2 | Q8W555 | Putative RNA-binding protein | 380 | 1e-104 | Arabidopsis thaliana | RNA-binding region RNP-1 | - | + | + | PTA2.842.C1 |
M166C5.5 | Q7XR01 | OSJNBa0015K02.12 protein | 482 | 1e-135 | Oryza sativa | ABC1 family (ly-rich, unrelated to ABC transporters.) | - | + | - | rlwb48j21 |
M97D8.2 | Q8H6S5 | CTV.2 | 188 | 2e-9 | Poncirus trifoliata | | + | + | + | |
M26B6.2 | Q9FEA1 | Anthocyanin 1 | 183 | 6e-9 | Petunia hybrida | | - | + | + | |
M636F5.4 | Q8L7G1 | Hypothetical protein At4g25330 | 190 | 1e-9 | Arabidopsis thaliana | | + | + | + | |
M547D3.2 | Q9FYC5 | Hypothetical protein | 498 | 1e-139 | Arabidopsis thaliana | Serine/threonine protein kinase | + | + | + | rlwa21f14, PTA2.1527.C1 |
M547D3.1 | Q9FJH0 | GTP-binding protein, ras-like (At5g60860) | 368 | 1e-101 | Arabidopsis thaliana | Ras GTPase | - | + | + | PTA2.1434.C1, PTA2.1434.C2 |
M666C5.4 | Q9FZJ2 | F17L21.22 | 180 | 1e-8 | Arabidopsis thaliana | | + | + | + | |
M666C5.1 | Q9GYB0 | Possible CG15429 protein | 168 | 6e-41 | Leishmania major | Cytochrome b5 | - | + | - | rlwb01o21, rlwb16a08 |
M217D8.1 | Q8LJ68 | Protein phosphatase 2C-like protein | 244 | 7e-64 | Oryza sativa | Protein phosphatase 2C-like | - | + | + | rlwa05l09, PTA2.1601.C1 |
M217D8.2 | O23057 | BAC IG005I10 | 991 | 0.0 | Arabidopsis thaliana | | - | + | + | rlwb23f14, PTA2.3279.C1 |
M217D8.4 | | M. polymorpha EST, rlwb18g14 | | | | | - | + | + | rlwb18g14 |
M217D8.3 | | M. polymorpha EST, PTA2.2137.C1 | | | | | - | + | + | PTA2.2137.C1 |
M354D1.2 | O82317 | At2g25800 protein | 82 | 9e-28 | Arabidopsis thaliana | | - | + | + | |
M354D1.4 | O22792 | At2g33420 protein | 180 | 1e-8 | Arabidopsis thaliana | | + | + | - | |
M354D1.5 | O65224 | F7N22.7 protein | 156 | 9e-6 | Arabidopsis thaliana | | + | + | - | |
M212C1.4 | P54674 | Phosphatidylinositol 3-kinase 2 | 163 | 1e-6 | Dictyostelium discoideum | Phosphatidylinositol 3- and 4-kinase | - | + | - | |
M104E4.1 | Q9ST26 | Nonphototrophic hypocotyl 1a | 192 | 2e-47 | Oryza sativa | PAC motif | - | + | + | rlwb06g04 |
M537D7.2-1 | Q761Z7 | BRI1-KD interacting protein 117 (Fragment) | 193 | 4e-10 | Oryza sativa | Zn-knuckle | + | + | + | |
M537D7.2-2 | Q761Z7 | BRI1-KD interacting protein 117 (Fragment) | 193 | 4e-10 | Oryza sativa | Zn-knuckle | + | + | - | |
YR2, Contig-B | | | | | | | | | | |
M359F1.1 | Q9FKL6 | Calmodulin-binding protein | 553 | 1e-156 | Arabidopsis thaliana | | - | + | + | |
M95E6.1 | HSF3_ARATH | Heat shock factor protein 3 | 376 | 1e-103 | Arabidopsis thaliana | Heat shock factor (HSF)-type (DNA-binding) | - | + | + | rlwb39a18 |
M408G1.2 | BSL2_ARATH | Serine/threonine protein phosphatase | 1419 | 0.0 | Arabidopsis thaliana | Serine/threonine protein phosphatase (BSU1 type), Kelch repeat, Metallophosphoesterase | - | + | + | |
M286B9.1 | Q9FIG5 | Gb|AAF18661.1 | 200 | 2e-50 | Arabidopsis thaliana | | - | + | + | rlwb43f22 |
M420F5.1 | Q9FIG5 | Putative C2 domain-containing protein | 159 | 3e-6 | Oryza sativa | | - | + | + | |
M286B9.2 | Q41102 | Phaseolin G-box binding protein PG2 (Fragment) | 169 | 3e-41 | Phaseolus vulgaris | | - | + | + | M01F020 |
M578C3.1 | Q9LJ30 | EST AU082118(E20525) corresponds to a region of the predicted gene | 770 | 0.0 | Oryza sativa | ARM repeat, Importin-beta N-terminal domain | - | + | + | |
M88B7.1 | Q94B66 | ZIM-like 1 protein | 256 | 1e-67 | Arabidopsis thaliana | ZIM motif, CCT motif, GATA-type Zn-finger | - | + | + | |
M402H5.1 | Q8RWB1 | Hypothetical protein (At5g37370) | 421 | 1e-117 | Arabidopsis thaliana | PRP38 family | - | + | + | |
M402H5.5 | Q84YI1 | Polyphenol oxidase (EC 1.10.3.1) | 90 | 2e-17 | Trifolium pratense | Tyrosinase, Di-copper centre-containing domain | + | + | + | |
M402H5.6 | | M. polymorpha EST, rlwb26I22 | | | | | + | + | + | rlwb26I22 |
M508E5.1 | PACG_MOUSE | Parkin coregulated gene protein homolog | 269 | 2e-71 | Mus musculus | ARM repeat | - | + | - | PTA2.3045.C1 |
M526D2.2 | Q7T0Y4 | MGC68877 protein | 192 | 4e-10 | Xenopus laevis | | - | + | - | |
M468B3.1 | U7I1_HUMAN | Ubiquitin conjugating enzyme 7 interacting protein 1 | 69 | 2e-11 | Homo sapiens | | - | + | - | PTA2.1947.C1 |
M530A10.2 | Q9FX33 | Hypothetical protein T9L24.41 (Hypothetical protein At1g73380) | 187 | 2e-9 | Arabidopsis thaliana | | + | + | + | |
M530A10.3 | ALF1_PEA | Fructose-bisphosphate aldolase, cytoplasmic isozyme 1 | 545 | 1e-154 | Pisum sativum | Fructose-bisphosphate aldolase class-I | - | + | - | M01D005 |
M22H6.1 | Q7RIA9 | Immediate early protein homolog (Fragment) | 82 | 2e-15 | Plasmodium yoelii yoelii | | + | + | + | |
M313C1.2 | O04892 | Cytochrome P450 monooxygenase (Fragment) | 166 | 5e-7 | Nicotiana tabacum | | + | + | + | |
M41H12.2 | Q83GU1 | Hypothetical protein | 158 | 4e-6 | Tropheryma whipplei | | + | + | - | |
M41H12.1 | Q8LHF0 | NADH-dependent glutamate synthase | 217 | 5e-56 | Oryza sativa | Adrenodoxin reductase, Pyridine nucleotide-disulphide oxidoreductase (class-II) | - | - | - | |
M45F5.1 | Q8VZD4 | At1g28320/F3H9_2 | 115 | 2e-25 | Arabidopsis thaliana | Trypsin-like serine proteases | + | + | + | |
Table 6. EST homologs
EST homolog | EST |
Contig-A | |
M431A8.1 | rlwb11l09 |
M354E2.1 | rlwb32a18 |
M350E4.5 | PTA2.2869.C1 |
M627E2.1 | rlwa36n09 |
M636F5.2 | rlwa22o10 |
M666C5.2 | PTA2.1055.C1 |
M354D1.1 | PTA2.1712.C2 |
M212C1.3 | rlwb48d02 |
M513H5.3 | PTA2.989.C1 |
M222E12.2 | rlwb44o19 |
M657H8.1 | rlwb48e16 |
M537D7.1 | rlwb19i13 |
Contig-B | |
M286B9.3 | PTA2.3014.C1 |
M313C1.1 | PTA2.2164.C1 |
M359F1.3 | rlwa04i13 |
M402H5.7 | rlwb01b07 |
M408G1.4 | rlwa19e23 |
M508E5.2 | PTA2.1386.C1 |
M516B5.1 | M01I026 |
M546H7.1 | rlwb44o19 |
Table 7. Putative spermatogenic genes identified on the Y chromosome of M. polymorpha
Marchantia gene | Marchantia chromosome | Human homolog (UniGene ID) | Human chromosome | Mouse homolog (UniGene ID) | Mouse chromosome | Chlamydomonas homolog (JGI ID) | Description | Testis specific/biased expression in human and mouse* |
M19B4.1 | Y | ODF3 (Hs.350949) | 11 | shippo1 (Mm.56404) | 7 | C_490060 | Outer dense fiber of sperm tails 3 | Yes |
M508E5.1 | Y | PACRG (Hs.25791) | 6 | Pacrg (Mm.18889) | 17 | C_20334 | PARK2 co-regulated | Yes |
M666C5.1 | Y | FLJ32499 (Hs.27475) | 17 | Gm740 (Mm.371762) | 11 | C_220104 | Hypothetical protein with cytochrome b5 domain | No |
M480H6.1 | Y | DKFZp434I099 (Hs.513635) | 16 | Gm770 (Mm.277112) | 8 | Not detected | Hypothetical protein | Yes |
M526D2.2 | Y | NYD-SP28 (Hs.393714) | 12 | Gm1040 (Mm.256588) | 5 | C_1160041 | Hypothetical protein | Yes |
M468B3.1 | Y | TRIAD3 (Hs.487458) | 7 | MGI:1344349 (Mm.362087) | 5 | Not detected | Ubiquitin conjugating enzyme 7 interacting protein 1 | No |
*Based on UniGene's EST Profile Viewer at National Center for Biotechnology Information.
Table 8. Primers used for degenerate PCR and X-linkage analysis
Gene | Degenerate primer target sequence | Degenerate primer name | Degenerate primer sequence* | X-linkage primer name | X-linkage primer sequence |
M338F12.1 | PPFWAET | M338F12.1DF1 | CCNCCNTTYTGGGCNGARAC | M338F12.1FF1 | GAAGGCTTTGCGGATCATAG |
| GTDWRKA | M338F12.1DR1 | GCYTTNCKCCARTCNGTNCC | M338F12.1FR1 | TATCATTCGGGCCTAAGTCG |
M547D3.1 | TIGVEFAT | M547D3.1DF1 | ACNATHGGNGTNGARTTYGCNAC | M547D3.1FF1 | GCATCAATGTGGACAGCAAA |
| TFENVERW | M547D3.1DR1 | CCANCKYTCNACRTTYTCRAANGT | M547D3.1FR1 | TCATATACCAAGAGAGCTCCTACG |
M408G1.2 | AAEAEAI | M408G1.2DF2 | GCNGCNGARGCNGARGCNAT | M408G1.2FF2 | TGTTGTAGCTGCGGAGTCTG |
| ECVMDGFE | M408G1.2DR3 | CRAANCCRTCCATNACRCAYTC | M408G1.2FR2 | CTGCGGTATCACAAAGCTCA |
M88B7.1 | PPEKVQAV | M88B7.1DF1 | CCNCCNGARAARGTNCARGCNGT | M88B7.1FF1 | ACTTCCAGCCCGCATGAATA |
| GLMWANKG | M88B7.1DR1 | CCYTTRTTNGCCCACATNARNCC | M88B7.1FR1 | TCCGAACAGTGTATCGAATTTTT |
M402H5.1 | MKLTVKQM | M402H5.1DF2 | ATGAARYTTACNGTNAARCARATG | M402H5.1FF1 | CACGAATTCCCGTACCTGTT |
| FGQRAPHR | M402H5.1DR2 | CGATGAGGAGCACGTTGTCCRAA | M402H5.1FR1 | AAACCGACAATGCAGCTTTC |
*N, A+C+G+T; H, A+C+T; R, A+G; Y, C+T; K, G+T.
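The degeneracy codes in the footnote to SI Table 8 determine how many distinct oligonucleotides each degenerate primer represents. The following is a minimal sketch of that count, not part of the original analysis; the function name `degeneracy` and the dictionary `IUPAC` are illustrative names introduced here, and only the codes listed in the footnote are included.

```python
from math import prod

# Bases represented by each symbol: the four unambiguous bases plus the
# degenerate codes defined in the footnote to SI Table 8 (N, H, R, Y, K).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT",
    "H": "ACT", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides encoded by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


# Two primers copied from SI Table 8.
primers = {
    "M338F12.1DF1": "CCNCCNTTYTGGGCNGARAC",
    "M402H5.1DR2": "CGATGAGGAGCACGTTGTCCRAA",
}

for name, seq in primers.items():
    print(name, len(seq), degeneracy(seq))
```

For M338F12.1DF1, three N positions, one Y, and one R give 4 x 4 x 4 x 2 x 2 = 256 sequence variants, whereas M402H5.1DR2 contains a single R and therefore represents two variants.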
Table 9. Primers used for mapping Contig-A and Contig-B
Contig | Forward primer name | Forward primer sequence (5'->3') | Reverse primer name | Reverse primer sequence (5'->3') | Product length, bp | Template |
Contig-A | CA-L02F | TTCCCAGGACTCATTCAAGC | CA-L02R | GAAAACCGCAAGAACAAGGA | 4,000 | pMM23-431A8 | |
| CA-L03F | TTCTCCACCGTTTCTGTTCA | CA-L03R | ATGGGTAACTGTTGCGCTTG | 4,011 | pMM23-338F12 | |
| CA-L05F | AAGCCGTAGAAAGGAGATAAGGA | CA-L05R | ACTTTGCATGAAAGCGGAAT | 4,009 | pMM23-354E2 | |
| CA-L07F | GATCCCTGATTTTTGCGTGT | CA-L07R | TCGAAAGCAACAATTTGACG | 3,002 | pMM23-338F12 | |
| CA-L08F | TCCAGGGGTATTGCTACAGG | CA-L08R | CCGAAGACCAAAACAACCTC | 3,001 | pMM23-338F12 | |
| CA-L09F | CCATGTACTTTTACCCCGTCA | CA-L09R | GGAGGAAACGTACCAAATCG | 3,004 | pMM23-338F12 | |
| CA-L10F | ATTCGCGCCTATGTTGAGTT | CA-L10R | TGAGGAAAAGTACGGATCACAA | 2,000 | pMM23-477F3 | |
| CA-L11F | CATTTCTCCTCCCCTAGCAA | CA-L11R | ATTCTTGGGCCTTGGATTCT | 2,000 | pMM23-338F12 | |
| CA-L12F | TGGACTGCATTCGATTTTGA | CA-L12R | GCGGCGTACAGAAGTACCTG | 2,000 | pMM23-338F12 | |
Contig-B | CB-L01F | GCCTTTAGCAAGTGCCTACG | CB-L01R | TCGCATGAAAGTCAGAGGTG | 6,002 | pMM23-408G1 | |
| CB-L03F | CCTGCGAATTCCAAGTTCAT | CB-L03R | CTAGCGCGAGTTACGGTGAT | 4,001 | pMM24-95E6 | |
| CB-L04F | ATTATTGAGCCGCCAATGTC | CB-L04R | TGTAGACTGCGCCACAAACT | 3,002 | pMM23-408G1 | |
| CB-L05F | TGTGAAAGTGGCATACGAGAA | CB-L05R | CACAAAAGCTTTCAATGACACA | 3,000 | pMM23-359F1 | |
| CB-L06F | AACCACGAGGTTCGTGAGAG | CB-L06R | GGATATCGGTGGCTGACTGT | 3,002 | pMM23-359F1 | |
| CB-L07F | GCAGTGCTTGCGAACTCTTA | CB-L07R | AAAGCTGGTTGAACGTAGCC | 3,000 | pMM23-408G1 | |
| CB-L08F | TTATCACACCAAGTGTCGCAAT | CB-L08R | TGATAGCATCAATCATGCAAGG | 3,000 | pMM24-95E6 | |
| CB-L10F | CACGCACACATGGTAATTGA | CB-L10R | TCAATGCCTTTTCATCTGCTT | 3,000 | pMM23-359F1 | |
| CB-L11F | TTTATCGTTCCCTTCTTGTGG | CB-L11R | CTTCGACGGTGTGAGTGAAA | 2,002 | pMM23-408G1 | |
| CB-L12F | TTTGCTTGTCCAAGTTGCAG | CB-L12R | TTGCCTCTAAAGCCCACAAC | 2,012 | pMM23-635B8 |
SI Text
The Sequence of YR1
The accumulation of the 2.4-kb BamHI repeat family in YR1 prevented construction of its physical map by chromosome walking and forced us to exploit alternative strategies. First, a total of 429 PAC clones were isolated by colony hybridization using the 2.4-kb BamHI repeat as a probe. One of these clones, pMM4G7, was initially sequenced (1). Using pMM4G7 as a starting clone, we obtained a contig map of 470 kb (SI Fig. 4). The map, however, could not be extended further because of the extensively repetitive nature of YR1. Therefore, to investigate the net sequence of YR1 beyond this initial 470-kb contig, a set of clones that are derived from YR1 but do not overlap each other was selected as follows.
The 429 Y-chromosomal clones were first clustered into groups by a fingerprinting method to identify clones that cover different portions of YR1. Assembly of a group was verified by comparing restriction profiles of its member clones in the same gel, and one representative clone was selected from each group. Consequently, 271 of the 429 clones were assembled into 22 groups by fingerprinting with either BamHI or DraI. The remaining 158 clones were rejected because of limited insert lengths, possible chimerism, and similarities to the 470-kb contig (SI Fig. 4).
In one of the 22 groups, all 29 member clones consisted predominantly of one of the Y chromosome-specific repeat units, the 2.2-kb BamHI repeat, which differs from the 2.4-kb BamHI repeat by a few missing small reiterated sequence motifs (2). Because the sum of the insert sizes of these 29 clones was ~1.9 Mb and the male genomic library had ~7-fold coverage (2), the 2.2-kb BamHI repeat clusters presumably account for ~270 kb of YR1.
The other groups provided 21 representative clones with a total insert size of 2.1 Mb. The 470-kb contig, the 21 representative clones (2.1 Mb), and the clusters of the 2.2-kb BamHI repeat (270 kb) collectively cover 2.8 Mb of YR1, which is only 70% of its estimated physical size of 4 Mb (SI Fig. 4). This apparent shortfall in sequence coverage can be largely explained by the convergence of highly conserved repeats into a smaller number of representative sequences, which suggests that the sequence coverage for YR1 is much higher than it appears. Formally, however, we cannot rigorously exclude the possibility that entirely unrelated sequences that escaped our screening are also present in the YR1 region.
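The size estimates in the two preceding paragraphs follow from simple arithmetic, sketched below using only the figures quoted in the text. The variable names are illustrative, and rounding the repeat-cluster estimate to the nearest 10 kb is an assumption made here to match the quoted value of 270 kb.

```python
# All figures are taken from the text; sizes are in kb.
summed_insert_kb = 1900   # total insert size of the 29 repeat clones (~1.9 Mb)
library_coverage = 7      # approximate redundancy of the male genomic library

# Genomic span of the 2.2-kb BamHI repeat clusters:
# summed insert size divided by library redundancy.
repeat_span_kb = summed_insert_kb / library_coverage

contig_kb = 470           # initial contig started from pMM4G7
representative_kb = 2100  # 21 representative clones
yr1_estimated_kb = 4000   # estimated physical size of YR1

# Assumption: round the repeat span to the nearest 10 kb (270 kb in the text).
covered_kb = contig_kb + representative_kb + round(repeat_span_kb, -1)
fraction_covered = covered_kb / yr1_estimated_kb

print(f"repeat clusters: ~{repeat_span_kb:.0f} kb")
print(f"covered: {covered_kb:.0f} kb ({fraction_covered:.0%} of YR1)")
```

The sketch gives about 271 kb for the repeat clusters and 2,840 kb (71%) in total, in line with the rounded values of 270 kb, 2.8 Mb, and 70% given above.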
To gather the sequence of YR1, five clones of the 470-kb contig and the 21 representative clones were shotgun sequenced (SI Fig. 4, Table 1, SI Table 3, and SI Table 4). Our efforts to sequence the YR1 representative clones initially focused on acquisition of phase 1 sequences of each PAC clone, i.e., unordered assemblies of sequence contigs (3).
The average number of sequence contigs generated by the sequence assembler was 50 per PAC clone. The sum of the contigs was often inconsistent with the sizes estimated by gel electrophoresis, reflecting misassemblies caused by the multitude of repeats (SI Table 4).
Physical Mapping and Sequencing of YR2
In addition to the six Y-linked markers previously isolated by RDA (4), four more Y-linked RDA markers were obtained. Because these 10 markers showed no similarity to any PAC clone of YR1 by BLAST analyses and gave no products in the YR1 PCR assays, we concluded that all these RDA markers belong to YR2. With these markers we initiated chromosome walking and constructed two contigs, contig A (3.5 Mb) and contig B (2.6 Mb). The combined coverage of the two contigs is ~6 Mb and is thus consistent with the cytologically estimated size of YR2 of 6 Mb (1), suggesting that contig A and contig B together cover most of YR2. None of the member clones of contig A or contig B appear in the 429 PAC clones of YR1.
Contig A is an assembly of 245 PAC clones, whereas contig B consists of 168 PAC clones and one PCR-amplified fragment (which covers a gap in the genomic PACs). Four and six of the 10 Y-linked markers were assigned to contig A and contig B, respectively. During chromosome walking, 137 of >300 primer pairs turned out to be male-specific, confirming that contig A and contig B represent segments of the Y chromosome and simultaneously indicating that YR2 is a composite of regions specific to the male genome and of sequences shared with other chromosomes. The two contigs could not be extended further for the following reasons. One of the terminal regions of contig A (the end terminating with pMM23-431A8 in SI Fig. 4) shows similarity to retroelements and is also highly repetitive on autosomes. The other terminal regions of contig A and contig B are unique Y-linked sequences, but no further sequences were present in the male genomic library.
Final assembly status for each PAC clone is summarized in SI Table 4. One of the 26 clones in contig B, pMM23-47H4, could not be sequenced because its DNA was highly unstable and randomly lost segments of its insert, leaving a ~70-kb unsequenced gap. The overall gene organization of YR2 is illustrated in Fig. 2.
Acquisition of ESTs
To facilitate gene identification in the Y chromosome, 32,277 5' ESTs were newly generated from thalli and sexual organs of male plants, in addition to the previously collected 1,163 nonredundant EST sequences (5). These new ESTs were assembled to generate 4,074 clusters and 6,409 singlets. All ESTs were clustered and resulted in 10,483 nonredundant sequences (SI Data Set 1). In database searches with this set of nonredundant tags, 6,370 (61%) showed BLASTX similarity (E value of 1 × 10^-5 or lower) to amino acid sequences in the public databases. When the 10,483 ESTs were compared to the Y genomic sequences, none of the ESTs tagged YR1 sequences, whereas 31 tagged YR2 sequences (SI Table 5). Five of the 31 tagged sequences do not show similarity to sequences in the public databases. An additional 20 ESTs aligned to portions of the YR2 sequences and were classified as EST homologs because of obvious discrepancies in their alignments (SI Table 6).
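The E-value criterion used for the BLASTX searches can be illustrated with a short sketch. It assumes, purely for illustration, that hits are available in the standard 12-column BLAST tabular format (query ID in column 1, E value in column 11); the text does not state which output format was used, and the function name and the example rows below are hypothetical.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-5  # threshold quoted in the text


def queries_with_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return the query IDs that have at least one hit at or below the cutoff.

    Assumes 12-column BLAST tabular output; column 11 holds the E value.
    """
    hits = set()
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query_id, e_value = row[0], float(row[10])
        if e_value <= cutoff:
            hits.add(query_id)
    return hits


# Hypothetical rows; identifiers and values are invented for illustration.
example = "\n".join([
    "est_0001\tsubj_A\t45.0\t200\t110\t0\t1\t600\t10\t209\t7e-97\t350",
    "est_0001\tsubj_B\t30.0\t150\t105\t0\t1\t450\t5\t154\t2e-3\t40.0",
    "est_0002\tsubj_C\t28.0\t90\t65\t0\t10\t279\t1\t90\t0.5\t30.0",
])

hits = queries_with_hits(example)
print(sorted(hits))

# Share of nonredundant ESTs with a significant hit, from the counts in the text.
percent_with_hit = round(100 * 6370 / 10483)
print(percent_with_hit)
```

In this toy input only est_0001 passes the threshold, and the counts from the text (6,370 of 10,483) reproduce the quoted 61%.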
About 39% of the M. polymorpha ESTs do not show similarity to known sequences registered in the public databases. Assuming that these 5' tags contain at least part of bona fide coding sequences, this suggests that 39% of the coding sequences in M. polymorpha are not detectable by the BLAST approach. This assumption in turn suggests that the analogous similarity search with the genomic sequences of the Y chromosome may have left a similar proportion of the genes undetected. This extrapolation would then raise the total number of genes in YR2 to ~80, including the 48 genes for which similarities were positively identified by the BLAST search against all genes in other organisms.
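The extrapolation above amounts to dividing the number of detected genes by the fraction of coding sequences assumed to be detectable by similarity search. A minimal sketch using the figures from the text (variable names are illustrative):

```python
genes_with_similarity = 48          # YR2 genes identified by BLAST similarity
fraction_without_similarity = 0.39  # share of ESTs lacking database similarity

# If only 61% of coding sequences are detectable by similarity search, the
# detected genes represent that share of the true total.
estimated_total = genes_with_similarity / (1 - fraction_without_similarity)
print(round(estimated_total))
```

This gives about 79 genes, which matches the rounded estimate of ~80.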
Genes Potentially Involved in Male Reproductive Functions
The 13 YR2 genes that are present only in the male genome and in addition are expressed in sexual organs but not in thalli are: M19B4.1, M508E5.1, M666C5.1, M480H6.1, M526D2.2, M468B3.1, M355D5.1, M19B4.2, M355D5.2, M350E4.4, M166C5.5, M212C1.4, and M530A10.3. Six of these genes encode proteins for which homologs are found in animals but not in angiosperms as listed in SI Table 7 with the annotations for the homologous genes.
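The selection criterion described above (absent from the female genome, expressed in sexual organs, not expressed in thalli) can be expressed as a simple filter over the presence and expression columns of SI Table 5. The sketch below uses only a handful of rows transcribed from that table; the `Gene` type and the function name are illustrative and are not part of the original analysis.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    in_female: bool        # "Presence in female" column
    in_sexual_organ: bool  # "Expression: sexual organ" column
    in_thallus: bool       # "Expression: thallus" column


# A few rows transcribed from SI Table 5 ("+" -> True, "-" -> False).
genes = [
    Gene("M19B4.2", False, True, False),
    Gene("M508E5.1", False, True, False),
    Gene("M468B3.1", False, True, False),
    Gene("M104E4.1", False, True, True),
    Gene("M41H12.1", False, False, False),
    Gene("M45F5.1", True, True, True),
]


def male_reproductive_candidates(rows):
    """Genes absent from the female genome and expressed only in sexual organs."""
    return [
        g.name
        for g in rows
        if not g.in_female and g.in_sexual_organ and not g.in_thallus
    ]


candidates = male_reproductive_candidates(genes)
print(candidates)
```

Of these six example rows, M19B4.2, M508E5.1, and M468B3.1 satisfy all three conditions.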
The animal counterparts of the M19B4.1 and M508E5.1 genes have been shown to play a role in spermatogenesis. Mouse and human homologs of M19B4.1, Shippo1 and ODF3, respectively, are localized in sperm flagella (6). The homolog in the flagellated green alga Chlamydomonas reinhardtii has also been assigned to the flagellar machinery (7), suggesting that the M19B4.1 protein is one of the components of sperm flagella. Three copies of the M19B4.1 gene are found in a region of ~140 kb. M19B4.1-3 encodes 24 units of a reiterated motif that is also conserved in the mouse and human homologs (SI Fig. 5), while the others, M19B4.1-1 and -2, encode only the C-terminal 12 units of M19B4.1-3, and thus are presumably truncated copies of M19B4.1-3.
The M508E5.1 gene codes for a protein similar to the mouse Parkin-coregulated gene (Pacrg; ref. 8), which is present in mature sperm and required for spermiogenesis in the mouse. Homologs of Pacrg are found in a number of metazoa and flagellated protozoa, suggesting that Pacrg and its homologs, including M508E5.1, are required for flagella formation.
Four more genes have animal counterparts whose functions are as yet unknown (SI Table 7). Their distribution among other organisms and their expression patterns suggest that at least three of them participate in male functions. M666C5.1, for example, has homologs in sperm-producing animals and flagellated protozoa, indicating that this group of genes plays some role in flagellar formation and thus in spermatogenesis. The mouse and human homologs of the M480H6.1 and M526D2.2 genes, annotated as hypothetical proteins, are preferentially expressed in testis (UniGene's EST Profile Viewer at the National Center for Biotechnology Information; ref. 9), suggesting that these proteins are also involved in spermatogenesis. Because homologs of these genes were not detected in the currently available C. reinhardtii sequences (Chlamydomonas reinhardtii v2.0, DOE Joint Genome Institute), they might participate in male functions other than flagella formation. It is unclear whether the mouse and human homologs of M468B3.1 have male-fertility functions, because their expression is not testis-specific and no homolog has been found in C. reinhardtii thus far. It should also be noted that all but one (M468B3.1) of the six putative spermatogenic genes are detected in the draft genome sequences of the moss Physcomitrella patens, which, like M. polymorpha, produces flagellated sperm (data not shown).
Repeats, Transposable Elements, and Insertion of Organellar DNAs
Among a variety of putative transposable elements, one class contains a DNA methyltransferase domain (DNMT) associated with the polyprotein typical of retrotransposons, and was designated DNMT-containing repetitive element (DRE). The DNMT of DRE shows the highest similarity to that of the mammalian DNA methyltransferase 3A (DNMT3a), which is required for imprinting of germ cells (10). DRE is also detected in the female genome and is expressed in both thalli and sexual organs (data not shown). There are only a few reports to date on DNA methyltransferases associated with transposable elements (11, 12). The activity of transposable elements is often affected by the degree of methylation (13), but it remains unclear at present whether the DNA methyltransferase domain of DRE is indeed beneficial for its transposition.
Insertions of mitochondrial DNA (14) were detected in both YR1 and YR2 (SI Table 3 and SI Fig. 6). In striking contrast, no insertion of plastid DNA (15) is detected in YR1, and only very few events are found in YR2. No such bias is observed in the completely sequenced genomes of two flowering plant species, Arabidopsis thaliana and Oryza sativa. Because the sequence information on M. polymorpha autosomes is limited, it is presently unclear whether this difference is a characteristic of the Y chromosome or of the species M. polymorpha.
In the YR1 sequences, a portion of the mitochondrial DNA, positions 136,796-136,911 (NC_001660; a fragment containing intron 4 of the cox1 gene), appears most frequently with ~94% identity, suggesting that the insertion event occurred before the amplification of these YR1 sequences. Another mitochondrial sequence spanning positions 176,138-176,302 (a fragment of ORF277 or ccmB) was found in fewer YR1 representative clones. This fragment has higher similarity (98%) to the original mitochondrial sequence, suggesting that this insertion event occurred later than the more highly amplified insertion of the cox1 fragment. Concerning the evolution of the Y chromosome, these observations imply that the YR1 domain has been increasing its repetitive nature progressively, inadvertently coamplifying the mitochondrial insertions. In contrast, no such bias in relative abundance or in the degree of divergence is observed for insertions in YR2. The different stages of divergence of the mitochondrial sequences suggest that these insertion events occurred at various time points during the evolution of the Y chromosome.
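The relative-timing argument above can be illustrated with a minimal Python sketch. It assumes that the reported positions are 1-based inclusive coordinates on NC_001660 and that lower identity to the mitochondrial source indicates an earlier insertion; the record layout and function names are hypothetical and are not part of the original analysis.

```python
# Hypothetical records: (label, start, end, percent identity to NC_001660).
# Coordinates and identities are the values quoted in the text; 94.0 stands
# in for the approximate "~94%" figure.
insertions = [
    ("cox1 intron 4 fragment", 136_796, 136_911, 94.0),
    ("ORF277/ccmB fragment", 176_138, 176_302, 98.0),
]


def fragment_length(start, end):
    """Length of a range, assuming 1-based inclusive coordinates."""
    return end - start + 1


def order_by_relative_age(records):
    """Sort oldest first, assuming lower identity means an earlier insertion."""
    return sorted(records, key=lambda record: record[3])


for label, start, end, identity in order_by_relative_age(insertions):
    print(f"{label}: {fragment_length(start, end)} bp, {identity}% identity")
```

Under these assumptions the cox1 fragment is ordered before the ORF277/ccmB fragment, in line with the interpretation given in the text. Identity alone gives only a relative ordering, not an absolute date.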
Comparison of the Structure of the M104E4.1 Gene and Its X-Linked Homolog, the F62B12.1 Gene
Using one of the X-linked RDA markers, rbf73 (4), an X-linked PAC clone, pMF28-62B12, was isolated from the female genomic library (2) and sequenced (GenBank accession no. AB272581). A similarity search detected a gene containing a LOV (Light, Oxygen, and Voltage) domain (16), F62B12.1. The cDNA sequences of the F62B12.1 and M104E4.1 genes were determined by RACE. The deduced amino acid sequences of the X- and Y-linked genes show an overall similarity of 44.5%, with two conserved regions of higher similarity: 91.0% for the first 64 amino acid residues and 83.7% for the LOV domain (SI Fig. 7). The four intron-insertion sites in the coding sequences are identical between the two genes.
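As an illustration of how a percentage of this kind can be derived from a pairwise alignment, the following Python sketch computes percent identity over aligned, ungapped columns. The scoring scheme behind the similarity values reported above is not stated here, so this is an assumed, simplified measure; the function name and the toy sequences are hypothetical and are not taken from M104E4.1 or F62B12.1.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical residues over columns where neither sequence has a gap.

    Both inputs must be rows of the same alignment (equal length), with '-'
    marking alignment gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == "-" or residue_b == "-":
            continue  # skip gap columns
        compared += 1
        if residue_a == residue_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy alignment: 5 ungapped columns, 4 of them identical.
print(percent_identity("MKT-AYL", "MRTGA-L"))
```

Restricting the two input strings to a sub-region of the alignment, such as the first 64 residues or the LOV domain, would give a region-specific value in the same way.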
Materials and Methods
DNA Fingerprinting.
DNAs of PAC clones derived from YR1 were digested with either BamHI or DraI and fractionated by gel electrophoresis, and the resulting gel images were digitized for band identification by the software Image (17). Restriction fragments <850 bp and >48.5 kb were ignored in the band identification process. Band identification data were collected and transferred to the software FPC for automated grouping. Both Image and FPC were downloaded from www.sanger.ac.uk/Software (18). The parameters for grouping of the BamHI- and DraI-digested PAC clones using FPC were: tolerance of 3 and 4; cutoff of 9 × 10⁻¹ and 1 × 10⁻⁶, respectively. Assemblies of PAC DNA fingerprints were visually inspected and manually edited.
Chromosome Walking.
Contig maps for YR2 were constructed by a combination of landmark content mapping and restriction digestion fingerprinting. Y-linked RDA markers (4) and PAC-end sequences were used to screen the gridded array of the male genomic PAC library (2). DNAs of isolated PAC clones were subjected to restriction digestion fingerprinting with BamHI and NotI to establish the extent of overlap among them.
1. Okada S, Sone T, Fujisawa M, Nakayama S, Takenaka M, Ishizaki K, Kono K, Shimizu-Ueda Y, Hanajiri T, Yamato KT, et al. (2001) Proc Natl Acad Sci USA 98:9454-9459.
2. Okada S, Fujisawa M, Sone T, Nakayama S, Nishiyama R, Takenaka M, Yamaoka S, Sakaida M, Kono K, Takahama M, et al. (2000) Plant J 24:421-428.
3. Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA, et al. (2001) Science 291:1304-1351.
4. Fujisawa M, Hayashi K, Nishio T, Bando T, Okada S, Yamato KT, Fukuzawa H, Ohyama K (2001) Genetics 159:981-985.
5. Nagai J, Yamato KT, Sakaida M, Yoda H, Fukuzawa H, Ohyama K (1999) DNA Res 6:1-11.
6. Egydio de Carvalho C, Tanaka H, Iguchi N, Ventela S, Nojima H, Nishimune Y (2002) Biol Reprod 66:785-795.
7. Li JB, Gerdes JM, Haycraft CJ, Fan Y, Teslovich TM, May-Simera H, Li H, Blacque OE, Li L, Leitch CC, et al. (2004) Cell 117:541-552.
8. Lorenzetti D, Bishop CE, Justice MJ (2004) Proc Natl Acad Sci USA 101:8402-8407.
9. Wheeler DL, Barrett T, Benson DA, Bryant SH, Canese K, Church DM, DiCuccio M, Edgar R, Federhen S, Helmberg W, et al. (2005) Nucleic Acids Res 33:D39-D45.
10. Kaneda M, Okano M, Hata K, Sado T, Tsujimoto N, Li E, Sasaki H (2004) Nature 429:900-903.
11. Lyko F, Whittaker AJ, Orr-Weaver TL, Jaenisch R (2000) Mech Dev 95:215-217.
12. Hsu MY, Inouye M, Inouye S (1990) Proc Natl Acad Sci USA 87:9454-9458.
13. Okamoto H, Hirochika H (2001) Trends Plant Sci 6:527-534.
14. Oda K, Yamato K, Ohta E, Nakamura Y, Takemura M, Nozato N, Akashi K, Kanegae T, Ogura Y, Kohchi T, et al. (1992) J Mol Biol 223:1-7.
15. Ohyama K, Fukuzawa H, Kohchi T, Shirai H, Sano T, Sano S, Umesono K, Shiki K, Takeuchi M, Chang Z, et al. (1986) Nature 322:572-574.
16. Crosson S, Rajagopal S, Moffat K (2003) Biochemistry 42:2-10.
17. Sulston J, Mallett F, Durbin R, Horsnell T (1989) Comput Appl Biosci 5:101-106.
18. Soderlund C, Longden I, Mott R (1997) Comput Appl Biosci 13:523-535.