Abstract
The moss Physcomitrella patens is a model for the study of plant cell biology and, by virtue of its basal position in land plant phylogeny, for comparative analysis of the evolution of plant gene function and development. It is ideally suited for ‘reverse genetic’ analysis by virtue of its outstanding ability to undertake targeted transgene integration by homologous recombination. However, gene identification through mutagenesis and map-based cloning has hitherto not been possible, due to the lack of a genetic linkage map. Using molecular markers [amplified fragment length polymorphisms (AFLP) and simple sequence repeats (SSR)] we have generated genetic linkage maps for Physcomitrella. One hundred and seventy-nine gene-specific SSR markers were mapped in 46 linkage groups, and 1574 polymorphic AFLP markers were identified. Integrating the SSR- and AFLP-based maps generated 31 linkage groups comprising 1420 markers. Anchorage of the integrated linkage map with gene-specific SSR markers coupled with computational prediction of AFLP loci has enabled its correspondence with the newly sequenced Physcomitrella genome. The generation of a linkage map densely populated with molecular markers and anchored to the genome sequence now provides a resource for forward genetic interrogation of the organism and for the development of a pipeline for the map-based cloning of Physcomitrella genes. This will radically enhance the potential of Physcomitrella for determining how gene function has evolved for the acquisition of complex developmental strategies within the plant kingdom.
Keywords: Physcomitrella patens, linkage map, AFLPs, SSRs, genome sequence
Introduction
The moss Physcomitrella patens is the first non-flowering land plant for which an annotated genome sequence has been determined (Rensing et al., 2008, http://genome.jgi-psf.org/Phypa1_1/Phypa1_1.home.html). A member of the bryophytes, the first group of plants to diverge from the modern land plant lineage, Physcomitrella retains many features typical of ancestral land plants (Kenrick and Crane, 1997; Rensing et al., 2008). As such, it represents a fruitful source for the comparative analysis of gene function within the plant kingdom. An anatomically simple plant, Physcomitrella is used as a model organism for the study of plant cell biology, particularly differentiation at the cellular level and growth responses to environmental stimuli by single cells (Cove, 2005; Reski, 1998). A remarkable feature of Physcomitrella is its ability to incorporate transforming DNA at targeted sites within the genome by homologous recombination (Hofmann et al., 1999; Schaefer and Zryd, 1997; Strepp et al., 1998). Targeting vectors containing sequences as short as a few hundred base pairs are preferentially integrated at the cognate loci with frequencies similar to those observed in yeast (Kamisugi et al., 2005, 2006). This has enabled the rapid construction and analysis of mutant strains containing precisely engineered allele replacements and gene knockouts for the functional analysis of candidate genes identified in the genome sequence by ‘reverse genetics’.
Reverse genetics permits the analysis of genes whose function may be postulated by virtue of their homology with genes previously identified in other organisms. However, the majority (63%) of predicted P. patens genes currently lack evidence of their function and homology (Rensing et al., 2008). Therefore, the full potential of Physcomitrella will only be realised once a ‘forward genetic’ analysis of the organism is possible. Forward genetic approaches – the mutagenic interrogation of an organism to uncover genes responsible for specific phenotypes – enables novel genetic functions to be identified without a priori assumptions. It is an approach that is both powerful and philosophically satisfying, and in Physcomitrella mutagenesis is facilitated by the haploid nature of the dominant (gametophyte) stage of the life cycle.
Numerous Physcomitrella mutants have been isolated and described, with phenotypes associated with metabolism, cellular differentiation and hormone responses, and cellular-level growth responses such as gravi- and phototropisms (Abel et al., 1989; Ashton and Cove, 1977; Ashton et al., 1979a,b; Engel, 1968; Jenkins and Cove, 1983; Jenkins et al., 1986; Knight et al., 1991; Wang et al., 1981). However, successful identification of the genes identified through a random mutagenic approach requires that a means exists whereby a researcher may move rapidly from an identified mutant phenotype to the underlying mutated DNA sequence. In most widely adopted model organisms, this requirement is satisfied by the existence of a completed genome sequence underpinned by a well-marked genetic linkage map populated by sequence-anchored genetic markers. This enables the rapid map-based cloning of genes responsible for mutant phenotypes (Jander et al., 2002; Lukowitz et al., 2000).
Remarkably, although genetic analysis of Physcomitrella mutants was established 40 years ago (Engel, 1968), no genetic linkage map for this species has previously been established. Moreover, cytogenetic studies have historically shown disagreement as to the number of chromosomes (which are small and difficult to visualise), but the current consensus is that for the internationally used laboratory strain ‘Gransden2004’n = 1x = 27 (Bryan, 1957; Reski et al., 1994). In order to enable the integration of forward genetic approaches with the unrivalled capacity for reverse genetics that characterises Physcomitrella, we have constructed a high-density linkage map populated with molecular markers that can be directly integrated with the underlying sequence of the genome.
Results
Establishment of mapping populations
Our laboratories embarked on the identification of two classes of molecular genetic marker: amplified fragment length polymorphisms (AFLPs) (Vos et al., 1995) and expressed sequence tag (EST)-derived simple sequence repeats (SSRs) (Tautz and Renz, 1984). The former provide a means of simultaneously determining the allelic status of hundreds of independent genetic loci without any prior knowledge of their DNA sequence, and are thus well suited to the rapid acquisition of mapped loci (Van Os et al., 2006). The latter generate a framework of gene sequence-linked markers that provides a means by which the AFLP-based linkage map can be anchored to the genome sequence.
A number of independently isolated and geographically diverse Physcomitrella accessions were first screened for their relative genetic diversity, using gene-linked SSR markers identified following a computational analysis of the large Physcomitrella EST database, as well as by analysis of ribosomal DNA internal transcribed spacer sequences (von Stackelberg et al., 2006). The genetic diversity of a collection of 21 worldwide Physcomitrella accessions and two related species in the Funariaceae was evaluated by testing 64 informative SSR markers, as well as by comparing ribosomal DNA internal transcribed spacer sequences (von Stackelberg et al., 2006, and manuscript in preparation; http://www.cosmoss.org/ecomap.content). This analysis identified strains that could contribute the genetically distinct parental genotypes necessary for the generation of a mapping population. Because the object of the programme was to provide linkage between the genetic and physical maps, we chose the sequenced genotype (Gransden2004).
The French accession Villersexel K3 exhibited the greatest genetic distance from Gransden among all tested European lines. The ability to hybridise and produce viable progeny with a nicB5/ylo6 male-sterile Gransden mutant (Ashton and Cove, 1977) was evaluated for the Villersexel strain and also for two genetically divergent Japanese and Australian accessions. However, crossing experiments with the accessions from Japan (P. patens ssp. californica) and Australia (P. patens ssp. readeri) failed to produce reasonable numbers of viable progeny. Consequently the Villersexel K3 strain was selected as the most suitable choice for the second parent.
Mapping populations were established in both Leeds and Freiburg. In Leeds, we used a Gransden2004 parent that had been transgenically marked by the targeted incorporation of a single copy of an nptII selection cassette to the PpLea-1 (AY870926) locus (Kamisugi and Cuming, 2005). When crossed with the Villersexel accession, hybrid sporophytes were identified through the 1:1 segregation of the G418 resistance phenotype in their resulting spores. In Freiburg, the Villersexel parent was crossed as father with the nicB5/ylo6 male-sterile Gransden mutant as mother. Successful crosses were identified by the appearance of segregating parental SSR alleles in F1 individuals.
Like all bryophytes, the dominant vegetative form of Physcomitrella is the haploid gametophyte. Gametes are produced by mitosis, and their fusion (the outcome of fertilization of a single egg cell within an archegonium by a spermatozoid) results in the formation of a diploid sporophyte that remains dependent on the gametophyte for its nourishment and development. Within the sporophyte, meiotic spore mother cells generate the haploid spores (typically 2000–5000 per spore capsule) (Reski, 1998). Recombination during these meiotic divisions therefore generates a population of segregating, recombinant haploid F1 progeny. The mapping lines used corresponded to the individual F1 spore-derived plants that were subsequently propagated vegetatively. Being haploid and self-fertile, these lines represent a population of plants in which the original recombination junctions are fixed in a single generation: equivalent to recombinant inbred lines in a diploid species.
SSR analysis
The SSR identification (Freiburg) was carried out as previously described (von Stackelberg et al., 2006). In total, 3723 microsatellites were identified in a non-redundant Physcomitrella EST database and their utility as molecular markers was tested. F1 progeny selected from six independent spore capsules resulting from Villersexel K3 × Gransden crosses were chosen as the mapping population. Initially, 1238 microsatellite loci were examined for polymorphisms by PCR. Subsequently, the allelic status of 256 SSR markers was scored in a mapping population of 94 F1 recombinant haploid lines. Computational linkage analysis and map construction was performed with MapMaker (Lander et al., 1987). We identified 46 linkage groups (S1–S46) containing two or more SSR markers at a logarithm of the odds (LOD) threshold of 6 (Figure 1). The largest number of markers per linkage group was 10, with a map length of 46.4 cM, while there were 20 linkage groups containing only two SSR markers with an average length of 8.2 cM. In total 179 markers were mapped to a total map length (McDaniel et al., 2007) of 1168–1401 cM and an average map length of 25–30 cM per linkage group. Details of these SSRs are listed in (Appendix S1). The average marker density of the maps is one every 4.3 cM. However, these values must be treated cautiously, due to the relatively small number of markers mapped, the calculated map length representing a 50% underestimate relative to the theoretical expectation of 2700 cM (based on an assumption of 27 chromosomes (Bryan, 1957; Reski et al., 1994) with one crossover per chromosome arm).
AFLP analysis
We determined the utility of AFLP markers (Leeds) to discriminate between the Gransden and Villersexel genotypes. We used genomic DNA digested with EcoRI and MseI and EcoRI and TaqI, respectively, for adapter ligation and amplification with primers containing two selective nucleotides each. In total, we tested 224 primer combinations, before selecting a subset of 39 primer pairs for genotyping a mapping population. High-resolution electrophoresis of amplicons on a capillary sequencer enabled the identification of approximately 10 000 amplified loci (about 250 amplicons per primer pair), of which 1574 were polymorphic (Table 1).
Table 1.
Primers (E-FAM) | Loci | Primers (E-HEX) | Loci | Primers (E-NED) | Loci |
---|---|---|---|---|---|
EGA_MGA | 48 | ECA_MCG | 40 | ECG_MCG | 12 |
EGA_MGC | 38 | ECA_MCC | 39 | ECG_MCC | 16 |
EAC_MCC | 47 | ECA_MGG | 56 | ECG_MTA | 38 |
EAC_MGA | 55 | EAG_MTA | 36 | ECG_MGA | 20 |
EAC_MTC | 89 | ECA_MCT | 59 | ECG_MTT | 32 |
EAC_MCG | 38 | EAG_MGG | 40 | ECG_MGC | 26 |
EAC_MGG | 67 | EAG_MCT | 47 | ECG_MGT | 21 |
EAC_MGC | 66 | EAG_MCA | 46 | ECG_MCA | 21 |
EAC_MGT | 43 | EAG_MTT | 46 | ECG_MGG | 26 |
EAC_TCC | 40 | EAG_TTA | 22 | ECG_TGT | 46 |
EAC_TAG | 61 | EAG_TCA | 39 | ECG_TAC | 25 |
EAC_TGA | 39 | EAG_TGC | 32 | ECG_TCA | 19 |
EAC_TTC | 60 | EAG_TAA | 32 | ECG_TGA | 47 |
Primer combinations are indicated by restriction enzyme site (E = EcoRI, M = MseI, T = TaqI) and selective dinucleotide. The Eco-primers were labelled with FAM, HEX and NED as indicated. The number of polymorphic loci identified with each primer combination is indicated.
The allelic status of each polymorphic locus was scored in a mapping population of 188 F1 recombinant haploid lines and entered into JoinMap3.0 (Van Ooijen and Voorrips, 2001) for map construction. We initially obtained 26 linkage groups containing 20 or more markers at a LOD threshold of 10, comprising a total of 1220 mapped loci in a total map length of 4401 cM (not shown). This map was then integrated with the SSR-based map to enable anchorage with the genome sequence.
Map integration
The AFLP loci are anonymous. However, because each SSR is derived from an expressed gene, each SSR marker can be identified within the genomic sequence scaffolds. The mapped SSR loci were found on 94 sequence scaffolds. These provided a starting point for integration between the two linkage maps and anchorage to the annotated genome sequence. For map integration we used primers corresponding to a single ‘signature’ SSR within each SSR linkage group to genotype the respective loci within the Leeds mapping lines, thereby placing these SSR loci within the AFLP linkage map. This resulted in an integrated map comprising 31 linkage groups (LG1–LG31), with an overall map length of 4410–4418 cM. Three pairs of linkage groups that were separate at LOD = 10 coalesced at lower LOD thresholds. (LG5 and LG16 coalesced at LOD = 7, but with insufficient linkage between the loci on the two groups to calculate a combined map. LG7 and LG30 and LG20 and LG23 coalesced at LOD = 8. The combined maps for these LGs are appended at the end of Figure S1. This would generate 28 linkage groups more closely approaching n = 27. The integrated map includes three linkage groups that contain fewer than 20 loci, but that include at least one signature SSR locus (Figure 2). We placed 1420 markers in this map, including 42 SSR markers. The largest linkage group comprised 143 markers with an aggregate linkage score of 367 cM. The estimated genome coverage is high, with 99.8% of the genome predicted to be within 10 cM of a mapped marker.
Anchorage with the genome sequence
Integration of the maps provided the opportunity to convert AFLP markers into sequence characterised amplified length polymorphisms (SCALPs). The AFLPs comprise sequence-uncharacterised bands on gels. To be useful, it is essential to identify the underlying polymorphic DNA sequence. Typically, this is achieved by physical excision of a band from a gel, reamplification and sequence analysis (Dussle et al., 2002). However, this is not easily achieved when bands are resolved by capillary sequencers. Alternatively, the sequences may be predicted by computational analysis of the sequenced genome. As polymorphic fragments, AFLPs occur (by definition) once in the genome sequence. They have ends defined by restriction sites plus the selective nucleotides, and are of a defined length. It is theoretically possible to identify such fragments within a well-characterised genome sequence, with a high level of confidence (Peters et al., 2001). However, the first-draft Physcomitrella sequence assembly is still incomplete. Currently comprising 2106 scaffolds, additional ambiguity is introduced by stretches of undetermined sequences, of indeterminate length, within some contigs, represented as lengths of ‘NNNNNNNNNNN’ comprising about 5% of the genome sequence assembly. This does not permit unambiguous identification of AFLP marker sequences within the whole genome sequence assembly. Nevertheless, definitive association of an AFLP with a sequence scaffold provides a much improved likelihood of correct assignation. Integration of the AFLP-based map with a sequence-anchored SSR-based map enables such associations to be made.
We developed a Perl program, ‘SCALPHunter’, to propose candidate sequences within the Physcomitrella genome assembly based on these criteria. Because accurate determination of AFLP fragment lengths carries with it uncertainties (variable accuracy of fragment length determination in different sequencing eletropherograms, addition of terminal bases to some PCR products by Taq polymerase), SCALPHunter identifies all possible length variants within the range ±4 bp. The sequence scaffolds containing mapped SSRs were therefore interrogated by SCALPHunter to identify candidate AFLP sequences corresponding to the Gransden-specific markers with which the mapped SSRs co-segregated. In many cases, this identified several candidate genomic loci corresponding to a cluster of AFLP markers, linked to the SSR locus, enabling an initial integration between genetic linkage and underlying DNA sequence. The candidate AFLP sequences identified are listed in Appendix S2, and their map locations are illustrated in Figure S1.
Verification of a SCALPHunter prediction
The confidence with which SCALPs may be identified requires experimental verification. The predicted anchorage between the sequence and AFLP markers was therefore tested. As a proof of concept, we undertook the mapping of a Physcomitrella gene. The G418-resistant phenotype in the Gransden parent used to construct the Leeds mapping lines results from the targeted integration of a selection cassette into the PpLEA-1 locus. This locus is present in sequence scaffold 12 of the v1.1 genome assembly. Segregation of the G418-resistant phenotype indicated that the gene was located at 106 cM on linkage group 11 of the integrated map (Figure 2). Coincidentally, SSR_253 defines a GT-dinucleotide repeat polymorphism within the 5′-untranslated region (UTR) of this gene, mapped in SSR-linkage group S34 (Figure 1). Linkage group 11 contains two additional signature SSR markers, SSR_765 and SSR_925, in S39 and S25, respectively, and 50 mapped AFLP markers. Of these, 29 derive from the Gransden genotype and are thus potentially identifiable by SCALPHunter. The three SSR-linkage groups contain 11 SSR markers, which correspond to sequences located on ten genome sequence scaffolds (scaffolds 12, 71, 92, 100, 121, 162, 225, 251, 266 and 275, respectively). Interrogation of these sequence scaffolds by SCALPHunter proposed candidate sequences for 21 of the 29 the Gransden-specific AFLP loci within the integrated LG11 (Figure 3).
The predicted sequences of candidate AFLP loci can be used to design primers with an extended selective sequence to amplify fragments from the parental genotypes. These fragments can be resolved electrophoretically to ascertain their genotype-specific nature, and sequenced to confirm their identity. We selected the locus EACTGA508G/EACTGA510V, at 115 cM on linkage group 11 (Figure 3), since this was represented by an allelic pair of fragments that differ by two base pairs and we anticipated that both the Gransden and Villersexel alleles would be amplifiable using AFLP-specific primers.
SCALPHunter identifies this locus as commencing at an EcoRI site that is the same as that identified for the closely linked locus EACMGT136G (118 cM). We designed +6 primers (Eco-ACAATT/Taq-GAGGAG) to amplify the allelic pair. Amplification generated a fragment of about 500 bp from each parental genotype (Figure 4a). Digestion with MseI revealed an additional MseI site in the fragment amplified from the Gransden genotype, the existence of which generates the Gransden-specific AFLP, EACMGT136G (Figure 4a,b). Sequence analysis of the genotype-specific amplified fragments revealed the causes of these polymorphisms (Figure 4c). A mutation A113G/G113V abolishes the MseI site in the Villersexel genotype, whilst a 2-bp indel in the stretch of T-residues commencing at position 124 accounts for the 2-bp difference in fragment length between the Villersexel and Gransden alleles. Three additional SNPs were also observed in this fragment: G160G/A162V, A377G/G379V and T470G/C472V.
Discussion
Genetic linkage maps are inevitably ‘works in progress’, their accuracy dependent on the resolution with which alleles may be scored and by the sizes of mapping populations. Their reliability increases as more loci are added, and only a comprehensive integration with a completely determined genome sequence will provide a true relationship between genetic and physical distance. We have initiated such an integration for P. patens. The incorporation of further sequence-anchored loci will refine this.
The overall integrated map length of 4410–4418 cM is almost certainly an overestimate. The molecular characterisation of two loci on LG11 (Figure 4) supports this view: the AFLP loci, EACTGA508G and EACMGT136G, share a common EcoRI site but are separated by 3 cM in the derived map. Because AFLPs are dominant markers, the allelic status of a locus being characterised by either the presence or absence of a band, map inflation is an inherent problem due to the difficulties of unambiguously scoring each locus (van Os et al., 2005). ‘Allele drop-out’– the failure to amplify a single fragment – in a small mapping population can introduce significant error: increasing the number of co-dominant markers in future iterations of the map should resolve this.
In many cases, the order in which candidate AFLP sequences occur in the genome sequence scaffolds is consistent with the order of these markers in the linkage map, but there are also several instances where marker position appears to be inverted, or intercalated markers are observed. This might arise either through incorrect assembly of the sequence scaffolds, through the identification of an incorrect sequence by SCALPHunter or through incorrect scoring of segregation. This last cause most likely results in ambiguities in marker order over short linkage and sequence distances [for example, the intercalation of EAGTGC349G between EACTGA508G and EACMGT136G in LG11 (Figure 4)]. Where a marker appears to be inverted relative to others by a long interval (e.g. EAGMCA596G in LG11 relative to other candidate loci identified in sequence scaffold 12, or EACMGC421G in the same linkage group, relative to the other candidate loci in sequence scaffold 251 (Figure 4)], then the cause is most likely a misidentification by SCALPHunter. Such misidentifications will be inevitable, due to the length variation (±4 bp) built in to SCALPHunter predictions.
The addition of further sequence-anchored markers will also be necessary to refine the accuracy of prediction of AFLP loci by SCALPHunter. Currently, only 94 of the 2106 sequence scaffolds comprising the Physcomitrella version 1.1 sequence assembly are represented by mapped markers. These comprise about 143 Mbp, approximately 30% of the estimated complete genome sequence (Rensing et al., 2008). Future development of the map will be based on additional SSR loci and single nucleotide polymorphisms (SNPs) located on unanchored sequence scaffolds. The ESTs generated in support of the first-draft genome sequence assembly include a catalogue of about 120 000 ESTs derived from the Villersexel parental genotype to provide a resource for the rapid identification of suitable SNP loci for future genotyping.
At present, the ease with which determination of gene function can be undertaken through homologous-recombination-mediated gene targeting recommends Physcomitrella as a powerful tool for ‘reverse genetic’ analysis of plant gene function. However, in order to identify novel genes essential for the normal development of Physcomitrella, which may be characteristic of the earliest land plants, and which may be expected to inform our understanding of land plant evolution, it is essential to be able to undertake ‘forward’ genetic analysis: the identification of genes from their mutant phenotypes. The availability of a well-marked genetic linkage map directly related to the underlying genome sequence will provide a platform from which the map-based cloning of genes identified through mutagenesis can be undertaken.
Experimental procedures
Plant material
Physcomitrella patens (Hedw.) B.S.G. plants were vegetatively propagated on agar medium as previously described [Knight et al., 2002; Bierfreund et al., 2003]. The ‘Gransden’ isolate was derived from a single spore collected by H. L. K. Whitehouse near Gransden Wood, Cambridgeshire, UK in 1962. The ‘Villersexel K3’ strain was collected from Villersexel, Villers la Ville, Haute Saône, France. Crosses between the two genotypes used the Villersexel strain as the male parent and defined mutant strains of the Gransden strain as the female parent. In Leeds, a transgenic strain of the Gransden isolate was constructed by transformation with an nptII cassette targeted to the PpLEA-1 locus (Kamisugi and Cuming, 2005), conferring resistance to G418. This strain was confirmed as containing a single copy of the selectable marker at this locus by Southern blot analysis (Kamisugi et al., 2005). In Freiburg, the Gransden parent was a self-sterile nicotinic acid-requiring auxotrophic mutant nicB5/ylo6 (Ashton and Cove, 1977). Four to five plantlets of each parent were inoculated in a mixed stand adjacent to one another on solid medium and grown under inductive conditions for sporophyte development (Ashton and Cove, 1977). Mature spore capsules from the Gransden parent were surface-sterilised and the spores were liberated by gently crushing in sterile water. Spores were germinated and the progeny screened for G-418 resistance (Leeds) or SSR segregation (Freiburg). A 1:1 segregation indicated that the progeny derived from a hybrid sporophyte. Individual plants (188 Leeds; 94 Freiburg) were maintained as mapping populations by vegetative propagation.
DNA isolation and analysis
Genomic DNA was isolated as described previously for AFLP and SSR analysis (AFLP, Knight et al., 2002; SSR, von Stackelberg et al., 2006). The SSR analysis was performed as previously described (von Stackelberg et al., 2006). The primers used and the designation of the individual SSRs are listed in Table S1. For AFLP analysis, we followed the protocols established by Myburg et al. (2001). DNA (100 ng) was digested with EcoRI and MseI, or with EcoRI and TaqI, prior to ligation with adapter oligonucleotides (Myburg et al., 2001). DNA was pre-amplified using primers containing an additional selective nucleotide (+1 primers: EcoRI, GACTGCGTACCAATTCN; MseI, GATGAGTCCTGAGTAAN; TaqI, GATGAGTCCTGAGCGAN). The PCR comprised 28 cycles of 15 sec at 94°C, 30 sec at 60°C, 1 min + 1 sec per cycle at 72°C followed by 1 cycle of 2 min at 72°C. Pre-amplified DNA (diluted 50-fold) was selectively amplified using the same primer sequences but containing two selective nucleotides (+2 primers). The EcoRI primer was labelled with one of three fluorescent dyes: FAM, HEX (Operon Biotechnologies, https://www.operon.com/) and NED (Applied Biosystems, http://www.appliedbiosystems.com/). The PCR was performed with 13 cycles of 10 sec at 94°C, 30 sec at 65 −0.7°C per cycle, 1 min at 72°C followed by 25 cycles of 10 sec at 94°C, 30 sec at 56°C, 1 min + 1 sec per cycle at 72°C. Following amplification, DNA was recovered by ethanol precipitation and redissolved in water. For electrophoretic analysis, three differently labelled sets of amplification products were multiplexed by mixing in the ratio FAM:HEX:NED = 1:2:2 supplemented with 0.5 μl of ROX-labelled size markers (MapMarker500, Cambio, http://www.cambio.co.uk/) and applied to an ABI 3130 Genetic Analyzer (Applied Biosystems). Chromatograms were analysed using the Applied Biosystems ‘Genemapper’ software, with the individual polymorphic peaks being scored manually and recorded in an Excel spreadsheet.
Linkage analysis
The allelic status of 256 SSR markers was scored in a mapping population of 94 F1 recombinant haploid lines (1 × 96-well plates, including DNA from the parental genotypes in two wells). Computational linkage analysis and map construction was performed with MapMaker (Lander et al.,1987) using Kosambi's mapping function for distance estimation, a LOD threshold of 6 and a maximal distance of 30 cM. The AFLPs were scored in a mapping population of 188 F1 recombinant haploid lines (2 × 96-well plates, each including DNA from the parental genotypes in two wells). Linkage analysis for the ALFPs was performed using the mapping software JoinMap 3.0 (Van Ooijen and Voorrips, 2001) with a LOD threshold of 8–10, a recombination threshold of 0.3, a ripple value of 1, a jump threshold of 5 and Kosambi's mapping function. Markers exhibiting segregation significantly different from a 1:1 ratio were excluded following χ2 testing. The same mapping parameters were used following scoring of signature SSR loci within the AFLP linkage mapping population, to generate an integrated map.
Map length calculation
We followed the approach of McDaniel et al. (2007) in calculating map lengths using two methods: that of Fishman et al. (2001) in which total map length L = Σ[(linkage group length) + 2(linkage group length/no. markers)] and Chakravarti et al. (1991), in which L = Σ[(linkage group length) × (no. markers + 1)/(no. markers − 1)]. Similarly, genome coverage (c) was calculated as c = 1 − exp(2dn/L) where d is distance in cM, n is the number of markers and L is map length.
SCALPHunter
SCALPHunter is a Perl script that, given a list of AFLPs, identifies the possible candidate sequences within a given set of scaffolds. It can run 800 AFLP sequences across the whole Physcomitrella genome in about 30 min (AMD dual core 4400+ 2 GB RAM). The basic process is as follows:
SCALPHunter reads in a list of codes describing AFLPs (length after subtraction of primers and end sequence), which provides the information for pattern matching against the scaffolds. The forward- and reverse-complement regular expressions are precompiled and placed upon a hash (using precompiled regular expressions greatly speeds up the process of searching for the existence of a list of AFLPs amongst a set of scaffolds). In addition, the regular expressions include a user-defined length variation (e.g. ±4 bp) to allow for the variable accuracy of fragment length determination.
Each scaffold is searched in turn using all of the precompiled AFLP regular expressions and all matches are stored.
The complete set of matches is sorted by size (allowing for variation of ±4 bp) and output in three different forms. Firstly, a set of matches-per-scaffold, showing all of the AFLPs found to match a scaffold given the pattern/length-variation criteria. In addition, a set of matches-per-AFLP, showing all of the scaffolds that an AFLP is found to match, and the corresponding sequences within those scaffolds. Finally, a list of the matches-per-length-variation, showing all of the AFLPs that matched scaffolds for a given sequence-length variation.
To identify candidate AFLPs within a scaffold identified as carrying an SSR, the matches-per-scaffold file was interrogated with each AFLP within the corresponding linkage group. The sequence of each candidate was obtained from the matches-per-AFLP file and its position was located within the scaffold by Blast search of the Physcomitrella genome.
Acknowledgments
ACC and YK are grateful to the Gatsby Charitable Foundation and the University of Leeds Strategic Research Fund for their initial support of this work, and to the UK Biotechnology and Biological Sciences Research Council (BBSRC) for their ongoing support. MC is the recipient of a BBSRC studentship. SAR, MvS, DL and RR thank the DFG, the BMBF (FRISYS) and the Wissenschaftliche Gesellschaft for funding. MvS would like to thank E. Esch for helpful comments. We are grateful to Professor David Cove for providing the nicB5/ylo6 mutant strain, and to Dr Stuart McDaniel for useful suggestions.
Supporting Information
Additional Supporting Information may be found in the online version of this article:
Figure S1. The anchored Physcomitrella patens linkage map.
Appendix S1. Candidate amplified fragment length polymorphism sequences.
Appendix S2. Segregation data.
Table S1. Simple sequence repeat marker information.
Please note: Wiley-Blackwell are not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.
References
- Abel WO, Knebel W, Koop HU, Marienfeld JR, Quader H, Reski R, Schnepf E, Spörlin B. A cytokinin-sensitive mutant of the moss, Physcomitrella patens, defective in chloroplast division. Protoplasma. 1989;152:1–13. [Google Scholar]
- Ashton NW, Cove DJ. The isolation and preliminary characterisation of auxotrophic and analogue resistant mutants of the moss, Physcomitrella patens. Mol. Gen. Genet. 1977;154:87–95. [Google Scholar]
- Ashton NW, Cove DJ, Featherstone DR. The isolation and physiological analysis of mutants of the moss, Physcomitrella patens, which over-produce gametophores. Planta. 1979a;144:437–442. doi: 10.1007/BF00380119. [DOI] [PubMed] [Google Scholar]
- Ashton NW, Grimsley NH, Cove DJ. Analysis of gametophytic development in the moss, Physcomitrella patens, using auxin and cytokinin resistant mutants. Planta. 1979b;144:427–435. doi: 10.1007/BF00380118. [DOI] [PubMed] [Google Scholar]
- Bierfreund N, Reski R, Decker EL. Use of an inducible reporter gene system for the analysis of auxin distribution in the moss Physcomitrella patens. Plant Cell Rep. 2003;21:1143–1152. doi: 10.1007/s00299-003-0646-1. [DOI] [PubMed] [Google Scholar]
- Bryan VS. Cytotaxonomic studies in the Ephemeraceae and Funariaceae. Bryologist. 1957;60:103–126. [Google Scholar]
- Chakravarti A, Lasher LK, Reefer JE. A maximum-likelihood method for estimating genome length using genetic linkage data. Genetics. 1991;128:175–182. doi: 10.1093/genetics/128.1.175. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cove DJ. The moss Physcomitrella patens. Annu. Rev. Genet. 2005;39:339–358. doi: 10.1146/annurev.genet.39.073003.110214. [DOI] [PubMed] [Google Scholar]
- Dussle CM, Quint M, Xu M, Melchinger A, Lübberstedt T. Conversion of AFLP fragments tightly linked to SCMV genes Scmv1 and Scmv2 into simple PCR-based markers. Theor. App. Genet. 2002;105:1190–1195. doi: 10.1007/s00122-002-0964-7. [DOI] [PubMed] [Google Scholar]
- Engel PP. The induction of biochemical and morphological mutants in the moss Physcomitrella patens. Am. J. Bot. 1968;55:438–446. [Google Scholar]
- Fishman L, Kelly AJ, Morgan E, Willis JH. A genetic map in the Mimulus guttatus species complex reveals transmission ratio distortion due to heterospecific interactions. Genetics. 2001;159:1701–1716. doi: 10.1093/genetics/159.4.1701. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hofmann A, Codon A, Ivascu C, Russo VEA, Knight CD, Cove DJ, Schaefer DG, Chakhparonian M, Zr JP. A specific member of the Cab multigene family can be efficiently targeted and disrupted in the moss, Physcomitrella patens. Mol. Gen. Genet. 1999;261:92–99. doi: 10.1007/s004380050945. [DOI] [PubMed] [Google Scholar]
- Jander G, Norris SR, Rounsley SD, Bush DF, Levin IM, Last RL. Arabidopsis map-based cloning in the post-genome era. Plant Physiol. 2002;129:440–450. doi: 10.1104/pp.003533. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jenkins GI, Cove DJ. Phototropism and polarotropism of primary chloronemata of the moss Physcomitrella patens: responses of mutant strains. Planta. 1983;159:432–438. doi: 10.1007/BF00392079. [DOI] [PubMed] [Google Scholar]
- Jenkins G, Courtice GRM, Cove DJ. Gravitropic responses of wild-type and mutant strains of the moss Physcomitrella patens. Plant Cell Envir. 1986;9:637–644. doi: 10.1111/j.1365-3040.1986.tb01621.x. [DOI] [PubMed] [Google Scholar]
- Kamisugi Y, Cuming AC. The evolution of the Abscisic acid-response in land plants: comparative analysis of group 1 LEA gene expression in moss and cereals. Plant Mol. Biol. 2005;59:723–737. doi: 10.1007/s11103-005-0909-z. [DOI] [PubMed] [Google Scholar]
- Kamisugi Y, Cuming AC, Cove DJ. Parameters determining the efficiency of gene targeting in the moss Physcomitrella patens. Nucleic Acids Res. 2005;33:e173. doi: 10.1093/nar/gni172. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kamisugi Y, Schlink K, Rensing SA, Schween G, von Stackelberg M, Cuming AC, Reski R, Cove DJ. The mechanism of gene targeting in Physcomitrella patens: homologous recombination, concatenation and multiple integration. Nucleic Acids Res. 2006;34:6205–6214. doi: 10.1093/nar/gkl832. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kenrick R, Crane PR. The origin and early evolution of plants on land. Nature. 1997;389:33–39. [Google Scholar]
- Knight CD, Futers TS, Cove DJ. Genetic analysis of a mutant class of Physcomitrella patens in which the polarity of gravitropism is reversed. Mol. Gen. Genet. 1991;230:12–16. doi: 10.1007/BF00290643. [DOI] [PubMed] [Google Scholar]
- Knight CD, Cove DJ, Cuming AC, Quatrano RS. Moss Gene Technology. In: Gilmartin PM, Bowler C, editors. Molecular Plant Biology. Vol. 2. Oxford: Oxford University Press; 2002. pp. 285–299. [Google Scholar]
- Lander ES, Green P, Abrahamson J, Barlow A, Daly MJ, Lincoln SE, Newburg L. MAPMAKER: An interactive computer package for constructing primary genetic linkage maps of experimental and natural populations. Genomics. 1987;1:174–181. doi: 10.1016/0888-7543(87)90010-3. [DOI] [PubMed] [Google Scholar]
- Lukowitz W, Gillmor CS, Scheible WR. Positional cloning in Arabidopsis. Why it feels good to have a genome initiative working for you. Plant Physiol. 2000;123:795–805. doi: 10.1104/pp.123.3.795. [DOI] [PMC free article] [PubMed] [Google Scholar]
- McDaniel S, Willis JH, Shaw AJ. A linkage map reveals a complex basis for segregation distortion in an interpopulatiion cross in the moss Ceratodon purpureus. Genetics. 2007;176:2489–2500. doi: 10.1534/genetics.107.075424. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Myburg AA, Remington DL, O’Malley DM, Sederoff RR, Whetten RW. High-throughput AFLP analysis using infrared dye-labeled primers and an automated DNA sequencer. Biotechniques. 2001;30:348–357. doi: 10.2144/01302tt04. [DOI] [PubMed] [Google Scholar]
- van Os H, Stam P, Visser RGF, Van Eck HJ. RECORD: a novel method for ordering loci on a genetic linkage map. Theor. Appl. Genet. 2005;112:30–40. doi: 10.1007/s00122-005-0097-x. [DOI] [PubMed] [Google Scholar]
- Peters JL, Constandt H, Neyt P, Cnops G, Zethof J, Zabeau M, Gerats T. A physical amplified fragment-length polymorphism map of Arabidopsis. Plant Physiol. 2001;127:1579–1589. [PMC free article] [PubMed] [Google Scholar]
- Rensing SA, Lang D, Zimmer AD, et al. The Physcomitrella genome reveals insights into the conquest of land by plants. Science. 2008;319:64–69. doi: 10.1126/science.1150646. [DOI] [PubMed] [Google Scholar]
- Reski R. Development, genetics and molecular biology of mosses. Botanica Acta. 1998;111:1–15. [Google Scholar]
- Reski R, Faust M, Wang XH, Wehe M, Abel WO. Genome analysis of the moss Physcomitrella patens (Hedw) BSG. Mol. Gen. Genet. 1994;244:352–359. doi: 10.1007/BF00286686. [DOI] [PubMed] [Google Scholar]
- Schaefer DG, Zr JP. Efficient gene targeting in the moss Physcomitrella patens. Plant J. 1997;11:1195–1206. doi: 10.1046/j.1365-313x.1997.11061195.x. [DOI] [PubMed] [Google Scholar]
- von Stackelberg M, Rensing SA, Reski R. Identification of genic moss SSR markers and a comparative search in twenty-four algal and plant gene indices reveal species-specific rather than group-specific characteristics of microsatellites. BMC Plant Biol. 2006;6:9. doi: 10.1186/1471-2229-6-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Strepp R, Scholz S, Kruse S, Speth V, Reski R. Plant nuclear gene knockout reveals a role in plastid division for the homolog of the bacterial cell division protein FtsZ, an ancestral tubulin. Proc. Natl Acad. Sci. USA. 1998;95:4368–4373. doi: 10.1073/pnas.95.8.4368. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Tautz D, Renz M. Simple sequences are ubiquitous repetitive components of eukaryotic genomes. Nucleic Acids Res. 1984;12:4127–4138. doi: 10.1093/nar/12.10.4127. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Van Ooijen JW, Voorrips RE. JoinMap® 3.0, Software for the Calculation of Genetic Linkage Maps. Wageningen: The Netherlands: Plant Research International; 2001. [Google Scholar]
- Van Os H, Andrzejeweski S, Bakker E, et al. Construction of a 10,000-marker ultradense genetic recombination map of potato: providing a framework for accelerated gene isolation and a genomewide physical map. Genetics. 2006;173:1075–1087. doi: 10.1534/genetics.106.055871. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Vos P, Hogers R, Bleeker M, et al. AFLP: a new technique for DNA fingerprinting. Nucleic Acids Res. 1995;23:4407–4414. doi: 10.1093/nar/23.21.4407. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wang TL, Beutelmann P, Cove DJ. Cytokinin biosynthesis in mutants of the moss Physcomitrella patens. Plant Physiol. 1981;168:739–744. doi: 10.1104/pp.68.3.739. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.